igor_suhorukov December 26, 2017 at 02:12

How changing two lines of code can take several days

Does anyone still believe that a developer's work can be measured in lines of code? Let's debunk this age-old myth, red eyes and all.

Is it difficult to change two lines of code?

The hero of this story is the open source H2 database project: a popular relational database for tests that ships with a web console for SQL and even contains an analog of LevelDB / Berkeley DB Java Edition / SQLite 3. It is an excellent project; I have used it many times in my own work without any problems. Until I tried to use it alongside the Redshift JDBC driver.

AWS has a database called Redshift, a fork dating back to PostgreSQL 8.0.2. Its rival greenplum-db appeared in roughly the same decade... Although this database has a massively parallel architecture and the other "goodies" of a column-oriented DBMS, working with it you cannot shake the feeling that you are in a museum of computer history. I realized this feeling was justified when I discovered that the driver for modern PostgreSQL 9.6 and the fossil Redshift driver, built on the PostgreSQL 8.x wire protocol driver, conflict within one application.

I found that the H2 web console was using the PG wire protocol 8.x driver when connecting to PostgreSQL 9.6. That saddened me, and I set out to understand how it could happen. Debugging led me to the line that obtains the connection:

 DriverManager.getConnection(url, prop);

Everything seems to follow the specification, given that neither JNDI nor javax.sql.DataSource is involved.

Let's go deeper into DriverManager. Everything there is familiar and expected. In its static initialization block, ServiceLoader is used to load the implementations of java.sql.Driver that declare themselves through a META-INF/services/java.sql.Driver entry in their jar. This made Class.forName(driver) unnecessary long ago, so all modern drivers load without this fossil call. Nothing new for me here.
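To see which drivers this mechanism would pick up in a given application, a sketch like the following can help. It uses only the standard ServiceLoader API; the class name ListJdbcDrivers is made up for illustration, and the result depends entirely on which jars are on the classpath.

```java
import java.sql.Driver;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ListJdbcDrivers {

    // Collects the class names of every java.sql.Driver implementation that is
    // advertised through a META-INF/services/java.sql.Driver entry.
    static List<String> discoveredDrivers() {
        List<String> names = new ArrayList<>();
        for (Driver driver : ServiceLoader.load(Driver.class)) {
            names.add(driver.getClass().getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The list may well be empty if no driver jar is on the classpath.
        discoveredDrivers().forEach(System.out::println);
    }
}
```

If both a PostgreSQL driver and a Redshift driver show up in this list, both will be asked about every connection URL.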

When a connection is requested, the drivers are tried in the order in which they were registered in the registeredDrivers field. DriverManager calls driver.connect(url, info) on each of them in turn. As soon as a driver returns a connection object, that object is returned from the method. The first driver in the chain to handle the connection URL wins!

The driver itself decides whether it can handle the subprotocol in a jdbc:<subprotocol>: URL. Unfortunately for me, the Redshift JDBC driver handled not only its own "redshift" subprotocol but also "postgresql", and did so with ancient code from the mid-2000s. Clearly, the connection request will never reach the PostgreSQL 9.6 driver.
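The "first driver wins" behavior is easy to reproduce without any real database. The sketch below registers two stub drivers with the real DriverManager: a "legacy" one that greedily claims two URL prefixes and a "modern" one that claims only its own. Everything here is hypothetical and exists only for illustration: the StubDriver class, the jdbc:legacy: and jdbc:modern: prefixes, and the fake Connection built with a dynamic proxy. It is not the code of either real driver.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

public class FirstDriverWins {

    // Hypothetical stub driver: "handles" any URL starting with one of its prefixes.
    static class StubDriver implements Driver {
        private final String name;
        private final String[] prefixes;

        StubDriver(String name, String... prefixes) {
            this.name = name;
            this.prefixes = prefixes;
        }

        @Override
        public boolean acceptsURL(String url) {
            for (String prefix : prefixes) {
                if (url.startsWith(prefix)) {
                    return true;
                }
            }
            return false;
        }

        @Override
        public Connection connect(String url, Properties info) {
            if (!acceptsURL(url)) {
                return null; // JDBC contract: null means "not my URL, try the next driver"
            }
            // Fake connection whose only meaningful method is toString().
            return (Connection) Proxy.newProxyInstance(
                    FirstDriverWins.class.getClassLoader(),
                    new Class<?>[] {Connection.class},
                    (proxy, method, args) -> "toString".equals(method.getName()) ? name : null);
        }

        @Override
        public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
            return new DriverPropertyInfo[0];
        }

        @Override public int getMajorVersion() { return 1; }
        @Override public int getMinorVersion() { return 0; }
        @Override public boolean jdbcCompliant() { return false; }

        @Override
        public Logger getParentLogger() throws SQLFeatureNotSupportedException {
            throw new SQLFeatureNotSupportedException("stub driver");
        }
    }

    // Registers the greedy driver first, then asks DriverManager for a connection
    // and reports which stub driver actually produced it.
    static String winnerFor(String url) {
        Driver legacy = new StubDriver("legacy", "jdbc:legacy:", "jdbc:modern:");
        Driver modern = new StubDriver("modern", "jdbc:modern:");
        try {
            DriverManager.registerDriver(legacy);
            DriverManager.registerDriver(modern);
            try {
                return DriverManager.getConnection(url, new Properties()).toString();
            } finally {
                DriverManager.deregisterDriver(legacy);
                DriverManager.deregisterDriver(modern);
            }
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(winnerFor("jdbc:modern://localhost/test")); // prints "legacy"
    }
}
```

Because the greedy stub is registered first and accepts the jdbc:modern: prefix too, the "modern" stub is never consulted, which mirrors what happened between the Redshift and PostgreSQL drivers.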

One plus in karma to the Redshift JDBC developers: at least the classes of the ancient PG implementation were tucked away in a separate package and did not end up conflicting with org.postgresql.Driver in jar hell. I tried a "fresher" driver, but it did not work inside a Spring Boot executable jar, because its dependencies are packaged like a nesting doll: dependency jars inside the driver jar.

Meanwhile, unlike the H2 console, the HikariCP connection pool correctly instantiates the new PostgreSQL driver. When the user specifies driverClass, HikariCP uses that class directly and does not rely on DriverManager. This works even in the hell created by the Redshift JDBC driver. The cause was found quickly, and it became clear how to solve the problem.
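The general idea of bypassing DriverManager when the driver class is known can be sketched as follows. This is neither HikariCP's code nor the actual H2 patch, only a minimal illustration under the assumption that the named class has a public no-argument constructor; the ExplicitDriver class and its method names are made up.

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.SQLException;
import java.util.Properties;

public class ExplicitDriver {

    // Instantiates the driver class chosen by the user instead of letting
    // DriverManager pick the first registered driver that accepts the URL.
    static Driver instantiate(String driverClassName) {
        try {
            // asSubclass throws ClassCastException if the class is not a java.sql.Driver.
            Class<? extends Driver> type =
                    Class.forName(driverClassName).asSubclass(Driver.class);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load driver " + driverClassName, e);
        }
    }

    // Connects through exactly this driver; no other registered driver is consulted.
    static Connection connect(String driverClassName, String url, Properties info)
            throws SQLException {
        Connection connection = instantiate(driverClassName).connect(url, info);
        if (connection == null) {
            // A driver returns null when the URL is not one it handles.
            throw new SQLException(driverClassName + " does not accept URL " + url);
        }
        return connection;
    }
}
```

With the PostgreSQL driver jar on the classpath, a call such as ExplicitDriver.connect("org.postgresql.Driver", url, props) would go straight to that driver, no matter which other drivers claim the "postgresql" subprotocol.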

The patch was created over the weekend and sent as a pull request to the project repository; I filed a bug report at the same time. Then came the correspondence, in which I had to justify the change to the second most active contributor in the h2database repository. I addressed all of his requests and comments on the pull request, and the changes were accepted into the main project code. Two lines of changes and the Redshift driver cost me a lot of free time. But by then I was hooked, and it had become a matter of principle: to survive in a world where a fossil protocol overrides the modern one. I thank him for his time and for delving into this problem. I believe that meticulous review of pull requests in a popular open source project is good for quality. Almost two days passed before the two lines fixing the bug landed in the project.

Another pull request of mine, adding new functionality to schemaspy, has been hanging for more than a week. That one is my fault: it was developed on Linux but did not work on Windows. I regret not testing it right away.

Share your own stories of a few lines of code eating up your time. Do you have any intriguing tales, perhaps in the detective genre?
