Software rot
In the book “The Age of Em: Work, Love, and Life when Robots Rule the Earth,” Robin Hanson briefly discusses software rot:
Software is initially designed for one set of tasks, tools, and situations, but it slowly changes to cope with a constant stream of new tasks, tools, and situations. Such software grows more complex and fragile, and useful changes become harder and harder to make (Lehman and Belady, 1985)¹. Eventually it is better to start over and write new subsystems, and sometimes entirely new systems, from scratch.
I am sure this is true. As a rule, competently adapting software to new conditions takes more time and effort than writing new software from scratch. Programmers do not like to admit it, but the evidence is plain. There are several well-known examples among open-source projects.
Multiprocess Firefox
Mozilla Firefox originally ran everything in a single process. With the release of Google Chrome it became clear that a multi-process model improves both security and performance, and Mozilla's developers soon began planning how to bring multiprocessing to Firefox. That was in 2007.
Nearly ten years later, Mozilla finally shipped multiprocess Firefox to a mass audience. The delay was not for lack of desire: Mozilla has talented and motivated developers. Yet Chrome was written from scratch in far less time than it took Firefox to change. There are two main reasons:
- Turning a single-process architecture into a multi-process one involves a great many small changes. Some function calls have to be replaced with interprocess communication, shared state has to be wrapped in mutexes, and caches and local databases must support concurrent access (see the sketch after this list).
- Firefox had to maintain compatibility with existing extensions (or force their developers to update them). Chrome designed its extension API from scratch, with no such constraint.
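To make the first point concrete, here is a minimal Python sketch of the kind of mechanical change involved: an ordinary function call becomes a round trip over a pipe to a child process. The names (render_page, content_process) are invented for illustration and have nothing to do with Firefox's actual code.

```python
from multiprocessing import Pipe, Process
from multiprocessing.connection import Connection

# In a single-process design this would be a direct function call.
def render_page(url: str) -> str:
    return f"<rendered {url}>"

def content_process(conn: Connection) -> None:
    """Child process: serve render requests arriving over the pipe."""
    while True:
        url = conn.recv()
        if url is None:              # shutdown sentinel
            break
        conn.send(render_page(url))

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    child = Process(target=content_process, args=(child_end,))
    child.start()

    # What used to be `render_page(url)` is now a round trip over IPC.
    parent_end.send("https://example.com")
    print(parent_end.recv())         # "<rendered https://example.com>"

    parent_end.send(None)            # tell the child to exit
    child.join()
```

Multiply this by every call that crosses the new process boundary, plus error handling for a peer that can now crash independently, and the scale of the rework becomes clear.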
Worse still, the two constraints contradict each other: the internal architecture has to be rebuilt while the public APIs stay almost unchanged. No wonder it took Mozilla ten years.
Event-driven Apache
Initially, Apache httpd used a “one process per connection” model: a process listened on port 80, then called accept() and fork(). The child process performed read() and write() on the socket; when the request was finished, it closed the socket with close() and called exit(). This architecture is simple, easy to implement on many platforms, and... that's about it. It is absolutely terrible for performance, especially with long-lived connections. In fairness, it was 1995. Apache soon switched to a threaded model, which improved performance, but it still could not handle 10,000 simultaneous connections: a “one thread per connection” architecture needs 1,000 threads to serve 1,000 concurrent connections, and every thread has its own stack and state and must be scheduled and dispatched by the operating system's scheduler. That takes time.
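For illustration, here is roughly what that loop looks like, as a minimal Python sketch (Unix-only, since it relies on os.fork(); a stand-in for the original C, not Apache's actual code):

```python
import os
import signal
import socket

signal.signal(signal.SIGCHLD, signal.SIG_IGN)    # let the kernel reap children

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 8080))            # port 80 needs root; 8080 for the demo
listener.listen(16)

while True:
    conn, _ = listener.accept()                  # accept()
    if os.fork() == 0:                           # fork(): child takes over
        listener.close()
        conn.recv(4096)                          # read() the request
        conn.sendall(b"HTTP/1.0 200 OK\r\n"      # write() the response
                     b"Content-Length: 2\r\n\r\nok")
        conn.close()                             # close()
        os._exit(0)                              # exit()
    conn.close()                                 # parent drops its copy
```

Every connection costs an entire process, so a few hundred slow clients leave the machine forking and scheduling instead of serving.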
In contrast, Nginx used the reactor pattern from the start. This let it handle far more simultaneous connections and made it immune to Slowloris attacks.
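The idea of the reactor pattern is that a single thread multiplexes many non-blocking sockets through one readiness loop. Here is a minimal sketch using Python's selectors module (purely illustrative; nginx's actual event loop looks nothing like this):

```python
import selectors
import socket

sel = selectors.DefaultSelector()

def on_accept(listener: socket.socket) -> None:
    conn, _ = listener.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, on_readable)

def on_readable(conn: socket.socket) -> None:
    data = conn.recv(4096)
    if data:
        # A real reactor would also wait for writability; for a tiny
        # response the kernel's send buffer is more than enough.
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")
    sel.unregister(conn)
    conn.close()

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 8080))
listener.listen(128)
listener.setblocking(False)
sel.register(listener, selectors.EVENT_READ, on_accept)

while True:
    # One blocking call watches every socket; ready ones get dispatched.
    for key, _ in sel.select():
        key.data(key.fileobj)
```

An idle Slowloris connection here costs one registered socket and nothing more, whereas in a thread-per-connection server it pins down an entire thread.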
Nginx was released in 2007, and its performance advantage was obvious. Several years before Nginx came out, Apache's developers had already begun redesigning httpd for better performance: the event multi-processing module (MPM) shipped with Apache 2.2 in 2005. But compatibility problems surfaced; worst of all, it broke popular modules such as mod_php. These problems were not resolved until 2012, when Apache 2.4 shipped with the event MPM by default. Although it performed much better than the earlier prefork and worker MPMs, it still could not match Nginx. Instead of a reactor, Apache used separate thread pools for accepting connections and for processing requests; the architecture is roughly equivalent to putting a load balancer or reverse proxy in front of a worker-MPM httpd².
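In sketch form, that split looks something like the following: one thread only accepts, a pool does the work. This is a loose Python illustration of the idea, not Apache's actual design (see footnote 2).

```python
import socket
from concurrent.futures import ThreadPoolExecutor

def handle(conn: socket.socket) -> None:
    with conn:
        conn.recv(4096)                          # read the request
        conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok")

listener = socket.socket()
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 8080))
listener.listen(128)

with ThreadPoolExecutor(max_workers=8) as workers:
    while True:
        conn, _ = listener.accept()    # this thread only accepts...
        workers.submit(handle, conn)   # ...the pool handles the requests
```

Slow clients no longer block the accept loop, but each in-flight request still occupies a worker thread, which is why this never quite caught up with a true reactor.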
CPython GIL
Python is a good programming language: expressive, easy to learn (for a programming language, anyway), and supported on many platforms. But for the past two decades the most popular Python implementation has had a serious problem: it cannot easily take advantage of multiple processor cores.
The reason is the global interpreter lock, or GIL. From the Python wiki:
In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple threads from executing Python bytecode at once. This lock is necessary mainly because CPython's memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees it enforces.)
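The effect is easy to demonstrate. In the sketch below (illustrative; exact timings vary by machine), a CPU-bound countdown gains nothing from two threads but speeds up with two processes, each of which has its own interpreter and its own GIL:

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

N = 20_000_000

def countdown(n: int) -> None:
    # Pure-Python CPU-bound loop: holds the GIL the entire time.
    while n:
        n -= 1

def timed(label: str, fn) -> None:
    start = time.perf_counter()
    fn()
    print(f"{label}: {time.perf_counter() - start:.2f}s")

def split_across(pool_cls) -> None:
    with pool_cls(max_workers=2) as pool:
        list(pool.map(countdown, [N // 2, N // 2]))

if __name__ == "__main__":
    timed("one thread   ", lambda: countdown(N))
    # Two threads are no faster: the GIL serializes the bytecode.
    timed("two threads  ", lambda: split_across(ThreadPoolExecutor))
    # Two processes scale: each interpreter has its own GIL.
    timed("two processes", lambda: split_across(ProcessPoolExecutor))
```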
At first the GIL was not a big problem. When Python was created, multi-core systems were rare, and the GIL was easy to implement and simple to reason about. But today multi-core processors run even in wristwatches, and the GIL is an obvious, glaring defect in an otherwise pleasant language. Despite CPython's popularity, despite its talented developers, despite sponsors such as Google, Microsoft, and Intel, there is not even a plan to fix the GIL.
Conclusion
Even with talented engineers, plenty of money, and a clear plan, mature software is extremely hard to change. I tried to find cases that refute software rot, and they do not seem to exist. Robin Hanson asked for counterexamples, but nobody offered anything convincing. There are plenty of old software projects, but they never had to adapt much. I would genuinely like to find good counterexamples, because without them the outlook for software looks too gloomy.
Additional materials
- Overcoming Bias: why does software rot?
- Surprise: software rots!
- Wikipedia: software rot
1. Quoted from “Program Evolution: Processes of Software Change”. The work is older than I am, and I could not find it online; I bought a physical copy and struggled through it. The terminology is dated, but the conclusions are not very surprising. ↑
2. Anyone who knows httpd's internals will object to this comparison, but here we sacrifice accuracy for brevity. I apologize. ↑