maslyaev July 1, 2019 at 01:25

ORM: why this problem has no solution, and why something still has to be done about it

Modern information technology amazes with its power, overwhelms with the opportunities it opens up, and humbles with its technical perfection, yet there is one ridiculous spot where IT breaks its teeth again and again. Show the user data from the database, take input from them, put it back into the database, show the result. Input fields, buttons, checkboxes, labels: what could be so prohibitively complicated about them that we have to build puzzle-like constructions such as frameworks on top of templating engines on top of frameworks on top of transpilers? And why, despite all the enormous effort, do the toy examples in the tutorials come out easy and pleasant, while as soon as the toolkit meets real-life tasks... how to put it mildly... implementation complexity grows strongly non-linearly with the complexity of the task? It would be understandable if this were something truly mind-bending, on the level of theoretical physics or space technology, but no: buttons and checkboxes. Why has this nonsense poisoned the lives of people and teams for decades?

There are probably many reasons, as always. Most likely all of them deserve consideration in one way or another, but here and now we will talk about the object-relational mapping (ORM) problem, which is always present in some form behind these “buttons and checkboxes”.

What do we know about ORM

Storing data in relational tables is not exactly a simple matter, but on the whole both the idea and the ways of applying it are quite clear and well researched.
With object-oriented programming things are not quite as good (there are several competing approaches), but on the whole its implementation and application are also more or less clear.
Both are about data, its storage and its processing. That is, essentially about the same thing.
It seems logical to us that there should be a simple, understandable, convenient, predictable and universal bridge between these two worlds.
And every time we find such a bridge easily, but unfortunately its simplicity, comprehensibility, convenience, predictability and universality do not survive beyond simple tutorial examples.
Everyone suffers: the developers, who have to do a lot of extremely boring work; the users, who have to fight clumsy software; the business, whose needs suddenly turn out to be unbearably slow and expensive to implement; and the industry as a whole.
I have seen many different ORMs, but I have not seen a good one. That is, one that does not turn into a burden and a Procrustean bed once you go beyond simple examples.

Why is everything so bizarre

The conceptual basis of the theory and practice of relational databases is predicate calculus, that is, a branch of mathematics. OOP has no comparable foundation. One can try to formulate the basic idea of OOP like this: since the world consists of objects, it is convenient to model this world by creating objects inside a software system. There are two errors here at once. First, the world itself does not consist of objects and never has. Second, excuse me, but why should a program have to simulate the world? That is, we have a conceptually incorrect premise, perfectly complemented by a meaningless problem statement.

Any ORM is an attempt to stretch a clean, uniform correspondence between what is essentially a branch of mathematics and a loose set of diverse practices based on considerations of convenience, on historically established approaches, and often also on legends, opinions of authorities, and plain misconceptions. In vitro it can be made to work, but in vivo it is doomed to look pathetic and bring more grief than joy.

On the inevitability of object orientation

Nevertheless, the need for object orientation in our software is an inevitable reality. This inevitability rests primarily on the fact that operating with objects is the essence and foundation of everything we do. The world itself does not consist of objects, but in order to understand something in this world and do something with it, we ourselves declare its parts to be objects, give them names, try to understand their behavior, and apply effort to them to obtain the results we want. This is how we function; it is impossible to abandon this, and there is no need to. Everything is an object, not because it really is one, but because we cannot do otherwise. Whatever can in no way be an object lies completely outside the limits of our ability to comprehend and cannot serve as the subject of our efforts.

Even if a program is written without OOP techniques, it inevitably contains objects (in the broad sense of the word) that the developer manipulates to solve the problem: variables, data types, operators, algorithms, syntactic constructions, functions, modules. From the user's point of view, the program is also a set of objects the user interacts with: buttons, labels, input fields, pages, sites, and the system as a whole.

What we store in our databases

As mentioned above, relational databases are based on predicate calculus. A predicate is a formulated fact, in our case one stored on a medium. Just in case, let me note that “relational database” is not only, and not so much, about relationships between tables via foreign keys. In proper terminology, relations are what we simply call tables. That is, even if your database has only one table with two columns (for example, name and phone number), you already have a relational database that establishes a relation between two sets, in this case the set of names and the set of phone numbers. A relational database does not store objects; it stores facts. Stored facts, of course, have a subject (“what is this fact about?”), and when we try to teach the system to answer this question, entities appear, that is, objects with which the facts are associated. If you work correctly, the structure of the database is born out of a series of answers to the question “what kinds of facts do we intend to store?”, and only at the next stage do we get something resembling objects, which give the facts their subject. You can, of course, design “from the objects”, but I would not recommend doing so except in lab assignments for courses not directly related to database design. The danger of severe architectural miscalculations is too great. Besides, it is inconvenient, to say the least.
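The single-table example can be sketched in a few lines of Python with the standard sqlite3 module. This is only an illustration under assumptions: the table name name_phone, the column names and the sample rows are made up here and do not come from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One relation. Each row is a stored fact:
# "this name can be reached at this phone number".
conn.execute(
    """
    CREATE TABLE name_phone (
        name  TEXT NOT NULL,
        phone TEXT NOT NULL,
        PRIMARY KEY (name, phone)
    )
    """
)

# Hypothetical sample facts (invented for the illustration).
facts = [
    ("Alice", "555-0100"),
    ("Alice", "555-0199"),
    ("Bob", "555-0199"),
]
conn.executemany("INSERT INTO name_phone (name, phone) VALUES (?, ?)", facts)

# "What is this fact about?" can be asked from either side of the relation.
numbers_of_alice = [
    row[0]
    for row in conn.execute(
        "SELECT phone FROM name_phone WHERE name = ? ORDER BY phone", ("Alice",)
    )
]
names_on_shared_line = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM name_phone WHERE phone = ? ORDER BY name", ("555-0199",)
    )
]
```

Nothing in the table says whether the “object” is the person or the phone; that only appears once a question is asked of the facts.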

A small digression about object databases. A very simple thought: if we are tired of problems with ORM, then maybe we should do something about the part that is the “R”? Let our database be not a tough and demanding relational monster, but something trendy and hip, specially tailored for storing objects. Some kind of schemaless NoSQL database, for example. But in the end it suddenly turns out that ORMs for NoSQL are even more awkward and unnatural than the good old SQL ones. After all, we can have a schemaless DBMS and enjoy operating it, but schemaless data does not exist in nature. There is nothing more helpless, irresponsible and immoral than an ORM for schemaless databases.

Good ORM

A good ORM is an absent ORM. Seriously. Look at any of your ORM systems and honestly try to answer the question for yourself: what are the benefits of this monster? What is the reason for using it, other than unfulfilled promises of happiness and prejudices that have been discredited many times over? Of course, it has some handy, useful things, but what are they worth against the background of the architectural deformations it introduces and the performance problems that constantly arise out of the blue?

As a rule, the “low-level” database API is simple, convenient, complete and consistent. You can usually count all of its features on your fingers. Learning them is hardly a big task.

I'll try to sketch out a set of architectural principles that allow you to map objects to a database without using an ORM:

We store facts, we operate on objects. Remember that the database stores facts, and the object models involved in data processing are projections of sets of facts from different points of view. For the example above with names and phone numbers, we can have the Abonent (subscriber) class, for which several numbers can be stored, and also the PhoneNumber class, for which several subscribers can be set (do not forget that besides personal mobile numbers there are still home and office phones in some places). And there is just one table in the database, the one that defines the many-to-many relationship between the set of names and the set of phone numbers. Just two different projections. This approach, by the way, scales fine to much more complex cases, allowing you to have such useful classes in the system as, for example, "average sales for a given period according to a given combination of criteria."
Facts are projected into objects and back through a problem-oriented API, without relying on a solution that claims to be universal. If you do not yet know how to design convenient APIs, learn how. And, importantly, train yourself to document them right away.
Order above all. If you use a classic DBMS with a rigid data schema, then this schema in itself brings order to working with the data. An additional schema encoded in the structure of the objects is simply redundant. If you use a schemaless DBMS, then of course you will have to make some effort to ensure that your database does not turn into a mountain of garbage because different developers have different ideas about what is stored in it.
DBMS independence (if necessary). If the most interesting property of ORM for you is DBMS independence, then use dedicated tools such as ODBC, JDBC or ADO, or, if that is not possible, build your own abstraction layer. It is not as difficult as it might seem at first. You do not need support for absolutely every capability of absolutely every DBMS for your task, right?
Do not forget about the additional aspects of working with data, such as access control (which can be arbitrarily complex), monitoring, replication, and more. But here I have good news for you: since in your favorite ORM these additional aspects are implemented on the principle “you can't please everyone, so eat what you're given” anyway, giving up a dubious service that you have to fight more than cooperate with will ultimately prove to be a strategically correct decision.
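As a rough illustration of the first two principles, here is a minimal Python sketch of a problem-oriented API over a single table of facts, with the Abonent and PhoneNumber projections from the example. Everything apart from those two class names is an assumption made for the illustration: the table name name_phone, the function names create_schema, record_contact, load_abonent and load_phone_number, and the sample data. It uses the standard sqlite3 module; INSERT OR IGNORE and the conn.execute shortcut are SQLite-specific conveniences, not something every DBMS offers.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Abonent:
    """Projection 1: a subscriber and all numbers recorded for them."""
    name: str
    phones: tuple


@dataclass(frozen=True)
class PhoneNumber:
    """Projection 2: a number and everyone who can be reached at it."""
    phone: str
    names: tuple


def create_schema(conn):
    # The rigid schema is what keeps the stored facts in order.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS name_phone ("
        " name TEXT NOT NULL, phone TEXT NOT NULL,"
        " PRIMARY KEY (name, phone))"
    )


def record_contact(conn, name, phone):
    """Store the fact 'name can be reached at phone'; repeats are ignored."""
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT OR IGNORE INTO name_phone (name, phone) VALUES (?, ?)",
            (name, phone),
        )


def load_abonent(conn, name):
    """Project the facts onto the subscriber's point of view."""
    rows = conn.execute(
        "SELECT phone FROM name_phone WHERE name = ? ORDER BY phone", (name,)
    ).fetchall()
    return Abonent(name, tuple(r[0] for r in rows))


def load_phone_number(conn, phone):
    """Project the same facts onto the phone number's point of view."""
    rows = conn.execute(
        "SELECT name FROM name_phone WHERE phone = ? ORDER BY name", (phone,)
    ).fetchall()
    return PhoneNumber(phone, tuple(r[0] for r in rows))


conn = sqlite3.connect(":memory:")
create_schema(conn)
record_contact(conn, "Alice", "555-0100")
record_contact(conn, "Alice", "555-0199")
record_contact(conn, "Bob", "555-0199")
record_contact(conn, "Bob", "555-0199")  # same fact again, stored once

alice = load_abonent(conn, "Alice")
office = load_phone_number(conn, "555-0199")
```

Both classes are read-only views built from the same rows, and writes go through record_contact rather than through either class, so neither projection has to pretend that it owns the data.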

Summary

ORM is a very sore subject for our industry. In an era when cloud artificial intelligence plows the quantum blockchain, the vast majority of the workforce is busy bolting business logic and user interfaces onto databases. Millions of lines of terrible code, nails hammered in with microscopes, pain and despair accompany this creative process everywhere. One of the roots of this sad state of affairs is the extremely persistent misconception that a universal ORM is possible in principle. It is not, and there is a fundamental reason for this that cannot be eliminated. Realizing this is the first step out of the nightmare. There is a way out, there are alternatives, but first you need to realize this, feel it, and learn to keep it in focus.

P.S. I sincerely apologize to those colleagues in the profession who have invested a lot of time and effort in creating the numerous universal, best-in-the-world ORMs. I am really sorry.
