PatientZero May 17, 2019 at 07:59

The faster you forget OOP, the better for you and your programs.

Translation

Object-oriented programming is an exceptionally bad idea which could only have originated in California.

- Edsger Wybe Dijkstra

Perhaps this is just my impression, but object-oriented programming seems to be the default, most widespread software design paradigm. It is what students are usually taught, what online tutorials explain and, for some reason, what people apply spontaneously even when they never set out to.

I know how attractive it is and how wonderful the idea looks on the surface. It took me many years to break its spell, and now I understand how terrible it is and why. Because of this, I firmly believe that people should be aware of what is wrong with OOP and know what they can use instead.

Many people have previously discussed OOP issues, and at the end of this post I will list my favorite articles and videos. But first, I want to share my own view.

Data is more important than code

At its core, all software is designed to manipulate data in order to achieve a certain result. The result determines how data is structured, and the data structure determines the necessary code.

This point is very important, so I will repeat it: goal -> data architecture -> code. This order must never be changed! When designing a program, always start by figuring out the goal you want to achieve, then at least roughly sketch the data architecture: the data structures and infrastructure needed to achieve that goal efficiently. Only after that should you write the code that works with this architecture. If the goal changes over time, change the data architecture first, and then the code.

In my experience, the most serious problem with OOP is that it encourages ignoring the data model architecture and mindlessly applying the pattern of storing everything in objects, which promise some vague benefits. If something looks like it could be a class, it goes into a class. Do I have a Customer? It goes into class Customer. Do I have a rendering context? It goes into class RenderingContext.

Instead of building a good data architecture, the developer's attention shifts towards inventing “good” classes, the relationships between them, taxonomies, inheritance hierarchies, and so on. This is not just a futile exercise. At its core, it is actively harmful.

Encouraging complexity

When the data architecture is designed explicitly, the result is usually the minimal set of data structures needed to serve the software's purpose. When you think in terms of abstract classes and objects, there is no upper limit on how grandiose and complex the abstractions can become. Just look at FizzBuzz Enterprise Edition: such a simple task can take that many lines of code only because OOP always has room for one more abstraction.

OOP advocates will say that keeping abstractions in check is a matter of developer skill. Maybe. But in practice, OOP programs keep growing and never shrink, because OOP encourages it.

Graphs everywhere

Because OOP requires scattering information across many small encapsulated objects, the number of references to those objects also explodes. OOP requires either passing long argument lists everywhere or storing references to related objects directly for quick access.

Your class Customer has a reference to class Order, and vice versa. class OrderManager holds references to every Order, and therefore indirectly to every Customer. Everything tends to point to everything else, because over time more and more places in the code need to reach a related object.

You needed a banana, but you got a gorilla holding a banana and the whole jungle.

OOP projects usually look less like well-designed data stores and more like huge spaghetti graphs of objects pointing at each other, with methods that take long argument lists. Once you start designing Context objects just to cut down the number of arguments passed back and forth, you know you are writing real Enterprise-level OOP code.

Cross-cutting concerns

The vast majority of essential code does not operate on a single object but implements cross-cutting concerns. Example: when class Player hits class Monster with a hits() method, where should the data actually change? The Monster's hp should decrease by the Player's attackPower; the Player's xp should increase by the Monster's level if the Monster is killed. Should this happen in Player.hits(Monster m) or in Monster.isHitBy(Player p)? What if class Weapon also has to be taken into account? Do we pass it as an argument to isHitBy, or does Player have a currentWeapon() getter?

Even this simplified example with just three interacting classes is already turning into a typical OOP nightmare. A simple data transformation becomes a tangle of clumsy, intertwined methods calling each other, and the only reason is the OOP dogma of encapsulation. Add a little inheritance to the mix and you get a good picture of what stereotypical Enterprise-level software looks like.
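For contrast, here is a minimal sketch of the same hit written as one plain function over plain data records. It is only an illustration, not code from any real game: the function name resolve_hit and the weapon_bonus parameter are hypothetical, and the field names are Python-style renderings of hp, attackPower, xp and level from the example above.

```python
from dataclasses import dataclass


@dataclass
class Player:
    attack_power: int
    xp: int = 0


@dataclass
class Monster:
    hp: int
    level: int


def resolve_hit(player: Player, monster: Monster, weapon_bonus: int = 0) -> bool:
    """Apply one hit and return True if this hit killed the monster.

    weapon_bonus is a hypothetical stand-in for whatever a Weapon contributes.
    """
    if monster.hp <= 0:
        return False  # already dead, nothing changes
    damage = player.attack_power + weapon_bonus
    monster.hp = max(0, monster.hp - damage)
    killed = monster.hp == 0
    if killed:
        player.xp += monster.level  # xp grows by the monster's level on a kill
    return killed


player = Player(attack_power=7)
monster = Monster(hp=10, level=3)
first = resolve_hit(player, monster)                   # hp goes 10 -> 3
second = resolve_hit(player, monster, weapon_bonus=2)  # hp goes 3 -> 0
```

The whole transformation lives in one place, so the question of whether it belongs to Player or to Monster never comes up.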

Schizophrenic encapsulation of objects

Let's take a look at the definition of encapsulation:

Encapsulation is an OOP concept that binds together data and the functions that manipulate it, and that keeps both safe from outside interference and misuse. Data encapsulation led to the important OOP concept of data hiding.

The intention is good, but in practice encapsulation at the granularity of an object or class often leads to code that tries to separate everything from everything else (even from itself). This produces a huge amount of boilerplate: getters, setters, numerous constructors, odd methods, all trying to protect us from mistakes that are unlikely to happen at such a small scale. A metaphor: I put a padlock on my left pocket so that my right hand cannot take anything out of it.

Don't get me wrong - enforcing constraints, especially on ADTs, is usually a good idea. But in OOP, with all these cross-references between objects, encapsulation often achieves nothing useful, and constraints scattered across many classes are hard to keep track of.

In my opinion, classes and objects are too fine-grained, and the right level for isolation, APIs and so on is the “module” / “component” / “library”. In my experience, it is OOP code bases (Java / Scala) that most often lack modules / libraries. Developers focus on building fences around each class, without giving much thought to which groups of classes together form a separate, reusable, coherent logical unit.

You can look at the same data in different ways

OOP requires organizing data rigidly: splitting it into many logical objects, which fixes the data architecture as a graph of objects with associated behavior (methods). However, it is often useful to have several different logical views for manipulating the same data.

If the program's data is stored, for example, in a tabular, data-oriented form, you can build two or more modules that each work with the same data structure in a different way. If the data is split into objects with methods, that is no longer possible.

This is also the main cause of the object-relational impedance mismatch. Although a relational data structure is not always the best choice, it is usually flexible enough to be worked with in different ways, using different paradigms. The rigidity of OOP data organization, by contrast, makes it incompatible with any other data architecture.
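A small sketch of what "the same table, two views" can look like. The row layout (order_id, customer_id, amount_cents) and both function names are made up for this illustration.

```python
# One plain table of rows: (order_id, customer_id, amount_cents).
# The layout is an assumption made for this example.
orders = [
    (1, 10, 2500),
    (2, 11, 1200),
    (3, 10, 800),
]


def total_per_customer(rows):
    """Reporting view: aggregate the table by customer."""
    totals = {}
    for _order_id, customer_id, amount in rows:
        totals[customer_id] = totals.get(customer_id, 0) + amount
    return totals


def largest_order(rows):
    """Auditing view: scan the very same table for the biggest order id."""
    return max(rows, key=lambda row: row[2])[0]
```

Neither function owns the data, so a third view can be added later without touching the first two.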

Low performance

The combination of data scattered across many small objects, heavy use of indirection and pointers, and the lack of a proper data architecture leads to poor runtime performance. Enough said.

What approach should be used instead of OOP?

I don’t think there is a “silver bullet”, so I’ll just describe how it usually works today in my code.

The first thing I do is study the data. I analyze what comes in and what goes out, the data format, and the volume. I work out how the data should be stored at runtime and how it should be persisted: which operations must be supported and how fast (throughput, latency), and so on.

Usually, if the data is of any significant size, my design ends up close to a database. That is, I will have some object, for example a DataStore, with an API that exposes all the operations needed to query and store data. The data itself lives in ADT / PoD structures, and any links between data records are represented as IDs (a number, a UUID, or a deterministic hash). Internally, this usually closely resembles a relational database or is actually backed by one: Vectors or HashMaps store the bulk of the data by index or ID, other structures serve as the “indexes” needed for fast lookups, and so on. Other data structures, such as LRU caches, also live here.
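To make that description concrete, here is a rough sketch of such a store. Everything in it is an assumption for illustration: the record fields, the method names, and the use of Python dicts in place of the Vector / HashMap containers mentioned above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    id: int
    name: str


@dataclass(frozen=True)
class Order:
    id: int
    customer_id: int  # a link by ID, not an object reference
    total_cents: int


class DataStore:
    """In-memory store: primary tables keyed by ID plus one secondary index."""

    def __init__(self):
        self._customers = {}                          # customer id -> Customer
        self._orders = {}                             # order id -> Order
        self._orders_by_customer = defaultdict(list)  # customer id -> [order id]

    def add_customer(self, customer):
        self._customers[customer.id] = customer

    def add_order(self, order):
        if order.customer_id not in self._customers:
            raise KeyError(f"unknown customer {order.customer_id}")
        if order.id in self._orders:
            raise ValueError(f"duplicate order {order.id}")
        self._orders[order.id] = order
        self._orders_by_customer[order.customer_id].append(order.id)

    def orders_of(self, customer_id):
        """Look up a customer's orders through the index, in insertion order."""
        order_ids = self._orders_by_customer.get(customer_id, [])
        return [self._orders[order_id] for order_id in order_ids]


store = DataStore()
store.add_customer(Customer(id=1, name="Ada"))
store.add_order(Order(id=100, customer_id=1, total_cents=2500))
store.add_order(Order(id=101, customer_id=1, total_cents=800))
```

The records are immutable plain data, and the only place that knows how they relate to each other is the store itself.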

The main part of the program logic receives a reference to such a DataStore and performs the necessary operations on it. For concurrency and multithreading, I usually connect the different logical components through message passing, actor-style. Examples of actors: a stdin reader, an input handler, a trust manager, the game state, and so on. Such "actors" can be implemented as worker pools, pipeline stages, etc. When needed, they can have their own DataStore or share one with other "actors".
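A minimal sketch of two such actors connected by message queues, using only Python's standard threading and queue modules. The actor names, the STOP sentinel and the upper-casing "processing" step are placeholders invented for this example.

```python
import queue
import threading

STOP = object()  # sentinel message telling the next actor to shut down


def line_reader(lines, outbox):
    """Actor 1: forwards each input line as a message, then signals the end."""
    for line in lines:
        outbox.put(line)
    outbox.put(STOP)


def input_handler(inbox, outbox):
    """Actor 2: normalizes every message it receives and passes it on."""
    while True:
        message = inbox.get()
        if message is STOP:
            outbox.put(STOP)
            return
        outbox.put(message.strip().upper())


def run_pipeline(lines):
    raw, processed = queue.Queue(), queue.Queue()
    actors = [
        threading.Thread(target=line_reader, args=(lines, raw)),
        threading.Thread(target=input_handler, args=(raw, processed)),
    ]
    for actor in actors:
        actor.start()
    results = []
    while True:
        message = processed.get()
        if message is STOP:
            break
        results.append(message)
    for actor in actors:
        actor.join()
    return results
```

The actors share nothing except the queues, so each one can be started on its own and fed a scripted sequence of messages.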

This architecture gives me convenient testing points: a DataStore can have different implementations via polymorphism, and actors that exchange messages can be instantiated separately and driven with test message sequences.

The main idea is this: just because my software operates in a domain that has concepts such as customers and orders, it does not have to contain a class Customer with methods attached to it. On the contrary: the concept of a Customer is just a set of data in tabular form in one or more DataStores, and the “business logic” code manipulates that data directly.
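One possible shape for such a testing point, sketched with typing.Protocol. The BalanceStore interface, the InMemoryStore test double and the charge function are all hypothetical names chosen for this example.

```python
from typing import Protocol


class BalanceStore(Protocol):
    """The narrow slice of a store's API that the logic below depends on."""

    def get_balance(self, customer_id: int) -> int: ...

    def set_balance(self, customer_id: int, cents: int) -> None: ...


class InMemoryStore:
    """Test implementation; a production one could wrap a real database."""

    def __init__(self, balances):
        self._balances = dict(balances)

    def get_balance(self, customer_id):
        return self._balances[customer_id]

    def set_balance(self, customer_id, cents):
        self._balances[customer_id] = cents


def charge(store: BalanceStore, customer_id: int, cents: int) -> bool:
    """Business logic: debit the customer if the balance covers the amount."""
    balance = store.get_balance(customer_id)
    if cents > balance:
        return False
    store.set_balance(customer_id, balance - cents)
    return True


test_store = InMemoryStore({1: 500})
ok = charge(test_store, 1, 200)         # succeeds, balance becomes 300
rejected = charge(test_store, 1, 1000)  # refused, balance stays 300
```

Here polymorphism is applied at the boundary of the store, not to every domain concept.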

Additional reading

Like so much in software design, criticism of OOP is not an easy topic. Perhaps I could not clearly convey my point of view and / or convince you. But if you are interested, here are a few more links:

Two videos by Brian Will in which he makes excellent arguments against OOP: Object-Oriented Programming is Bad and Object-Oriented Programming is Garbage: 3800 SLOC example
Stoyan Nikolov's CppCon 2018 talk “OOP Is Dead, Long Live Data-oriented Design”, in which he carefully analyzes an example OOP code base and points out its problems.
Arguments Against OOP at wiki.c2.com - a list of standard arguments against OOP.
Lawrence Krubner's article “Object Oriented Programming is an expensive disaster which must end” - a long post that looks deeply at many ideas.
Quora: Is OOP in C++ slower than C? If so, is the difference significant?
