groaner May 27, 2013 at 12:41

When polymorphism fails


Most OOP fans are polymorphism fans too. Many good books (Fowler's Refactoring, for one) go to extremes and say that if you use run-time type checks (such as Java's instanceof operator), then you are most likely a terrible monster at heart. The kind that scares young children with switch statements.

Generally speaking, I agree that using instanceof and its analogues is usually the result of insufficient OOP design skills. Polymorphism is better than type checks. It makes the code more flexible and easier to understand. However, there is at least one common case where you definitely cannot use polymorphism. Moreover, this case is so widespread that it can already be considered a pattern. I would love to use polymorphism in it, honestly. And if you know how to do that, tell me. But I do not think it is possible. At least not in static languages like Java or C++.

Definition of polymorphism

In case you are not familiar with OOP terminology, I'll explain what we are talking about. Polymorphism is a grandiose name for the concept of late binding. Late binding, in turn, is a pretentious name (you will notice a pattern here if you dig deeper) for deferring the decision about which particular method gets called until run time. The match between the object and the message (that is, the method) is therefore made while the application is running.

In performance-oriented programming languages such as C++, Java, or OCaml, methods are assigned numbers, and each class gets a table of its methods that is indexed by those numbers at run time. In languages that prefer flexibility and dynamism, the lookup is done not by number but by hashed method name. Otherwise, the two approaches are almost identical.
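To make the name-based flavor of lookup concrete, here is a toy model of it in Ruby. It is only a sketch: ToyClass, define, lookup, and send_message are names I made up for illustration, and no real interpreter is implemented this way.

```ruby
# Toy model of name-keyed method dispatch. All names here are hypothetical.
class ToyClass
  attr_reader :name, :parent, :methods_table

  def initialize(name, parent = nil)
    @name = name
    @parent = parent
    @methods_table = {} # method name (Symbol) => callable body
  end

  # Register a method body under a name in this class's table.
  def define(method_name, &body)
    @methods_table[method_name] = body
  end

  # Walk up the parent chain until some class has an entry for the name.
  def lookup(method_name)
    klass = self
    while klass
      found = klass.methods_table[method_name]
      return found if found
      klass = klass.parent
    end
    raise NoMethodError, "#{name} does not understand #{method_name}"
  end
end

# "Sending a message" is a table lookup followed by a call.
def send_message(klass, method_name, *args)
  klass.lookup(method_name).call(*args)
end

animal = ToyClass.new("Animal")
animal.define(:legs) { 4 }
animal.define(:sound) { "..." }

dog = ToyClass.new("Dog", animal)
dog.define(:sound) { "Woof" }

puts send_message(dog, :sound) # Dog's own entry is found first
puts send_message(dog, :legs)  # not in Dog's table, so Animal's entry is used
```

The numbered-table approach differs mainly in that the key is a small integer fixed at compile time, so the lookup becomes an array index instead of a hash probe.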

Virtual methods alone do not produce polymorphism. It comes into play only when a class has several subclasses, each of which implements its own special version of the polymorphic method. In the most banal textbook example, this is illustrated with a zoo in which every animal handles the message неприятноПахнуть() (roughly, "smellBad") differently. (Though actually the textbooks lie: everything smells pretty darn similar, and the only difference is in magnitude. In my humble opinion, of course. True, I still have not decided whose magnitude is greatest, the hippopotamus's or the giraffe's. Ask me about it later.)

Polymorphism in action

As an example, let's look at the classic problem of evaluating a mathematical expression, which often comes up in interviews. As far as I know, it was first used by Ron Braunstein at Amazon. The task is quite demanding and lets you check many important skills: OOP design, recursion, binary trees, polymorphism paired with dynamic typing, general programming ability, and even (if you want to make the task as hard as possible) parsing theory.

So, thinking the task over, the candidate at some point realizes that if you use only binary operations, such as "+", "-", "*", "/", then the arithmetic expression can be represented as a binary tree. All leaves of the tree are numbers, and all intermediate nodes are operations. The expression is evaluated by traversing the tree. If the candidate cannot arrive at this on their own, you can drop a gentle hint. Or, if things are really bad, say it outright. Even then, the task remains interesting.

The first half of it, which some people (whose names I will take to my grave, but whose initials are Willy Lewis) consider a Prerequisite for Those Who Want to Call Themselves a Developer and Work at Amazon, is actually quite hard. The question is how to get from a string containing an arithmetic expression, such as "2 + (2)", to an expression tree. And that is a serious question.

We are interested in the second half of the problem: suppose you are solving it as a pair, and your partner (we will call him Willy) is responsible for converting the string into a tree. You just need to decide which classes Willy will build his tree from. You can choose any language. The main thing is not to forget to choose, because Willy may opt for assembler. And, if he is in a bad mood, assembler for a processor that was discontinued long ago.

You will be amazed how many candidates are stumped by this stage.

It seems I have already given away the correct answer, but here is how the solutions rank anyway. The Standard Bad Solution uses switch or case statements (or, at worst, good old cascading ifs). The Slightly Improved Solution uses a table of function pointers. And finally, the Probably Best Solution uses polymorphism. Try implementing each of them at your leisure. It's fun!
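For the impatient, here is roughly what the first two rungs of that ladder might look like in Ruby. This is a sketch under my own assumptions: a leaf is a plain number, an inner node is a three-element array [op, left, right], and the names eval_with_case, eval_with_table, and OPERATIONS are invented for the example.

```ruby
# Assumed node shape: a leaf is a Numeric, an inner node is [op, left, right].

# Standard Bad Solution: one case statement that must know every operator.
def eval_with_case(node)
  return node if node.is_a?(Numeric)

  op, left, right = node
  l = eval_with_case(left)
  r = eval_with_case(right)
  case op
  when :+ then l + r
  when :- then l - r
  when :* then l * r
  when :/ then l / r
  else raise ArgumentError, "unknown operator: #{op.inspect}"
  end
end

# Slightly Improved Solution: a table of lambdas, Ruby's stand-in for
# a table of function pointers.
OPERATIONS = {
  :+ => ->(l, r) { l + r },
  :- => ->(l, r) { l - r },
  :* => ->(l, r) { l * r },
  :/ => ->(l, r) { l / r }
}.freeze

def eval_with_table(node)
  return node if node.is_a?(Numeric)

  op, left, right = node
  operation = OPERATIONS.fetch(op) do
    raise ArgumentError, "unknown operator: #{op.inspect}"
  end
  operation.call(eval_with_table(left), eval_with_table(right))
end

tree = [:+, 2, [:*, 3, 4]] # 2 + (3 * 4)
puts eval_with_case(tree)
puts eval_with_table(tree)
```

With the table, supporting a new operator means adding one entry to the hash. The evaluator itself stays the same, but somebody still has to own and edit that one central table.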

Ironically (as you will see later), the polymorphic solution is ideal for an extensible system. If you want to add new functions without recompiling everything from top to bottom and, in particular, without adding more and more cases to your Giant 500-Case Switch Statement, then you simply have to use polymorphism.
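Here is one way the polymorphic version could look, again as a Ruby sketch. The class names (NumberNode, BinaryNode, AddNode, and so on) are my own, not anything the interview prescribes.

```ruby
# Hypothetical node classes: every node knows how to evaluate itself.
class NumberNode
  def initialize(value)
    @value = value
  end

  def evaluate
    @value
  end
end

# Shared plumbing for binary operators; subclasses supply only apply.
class BinaryNode
  def initialize(left, right)
    @left = left
    @right = right
  end

  def evaluate
    apply(@left.evaluate, @right.evaluate)
  end

  def apply(_left_value, _right_value)
    raise NotImplementedError, "#{self.class} must implement apply"
  end
end

class AddNode < BinaryNode
  def apply(l, r)
    l + r
  end
end

class MultiplyNode < BinaryNode
  def apply(l, r)
    l * r
  end
end

# Added later, in a separate file if you like: nothing above is touched.
class PowerNode < BinaryNode
  def apply(l, r)
    l**r
  end
end

tree = AddNode.new(NumberNode.new(2),
                   MultiplyNode.new(NumberNode.new(3), NumberNode.new(4)))
puts tree.evaluate # 2 + (3 * 4)
```

Note where the new behavior lives: PowerNode carries its own logic, so there is no central case statement or table to reopen.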

Three polymorphic cheers for polymorphism

So polymorphism does seem useful, one way or another. Perhaps its most successful application is the polymorphic print operator. If you program in Java, Python, Ruby, or any other "real" object-oriented language, then you probably take it for granted. You ask the object to print itself and, by golly, it does. Each object reports exactly as much about its internal state as you need to know. This is very useful for debugging, tracing, logging, and possibly even documentation.
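In Ruby, for instance, polymorphic print boils down to overriding to_s (and getting inspect for free). A small sketch, with Monster and Orc as made-up classes:

```ruby
# Hypothetical classes illustrating polymorphic print via to_s.
class Monster
  def initialize(name, hit_points)
    @name = name
    @hit_points = hit_points
  end

  # puts and string interpolation call to_s on whatever they are given.
  def to_s
    "#{self.class.name} #{@name} (#{@hit_points} HP)"
  end
end

class Orc < Monster
  def initialize(name, hit_points, weapon)
    super(name, hit_points)
    @weapon = weapon
  end

  # Extend the parent's description instead of replacing it.
  def to_s
    "#{super}, armed with #{@weapon}"
  end
end

monsters = [Monster.new("Blob", 5), Orc.new("Gruk", 30, "an axe")]
monsters.each { |m| puts m } # each object prints itself its own way
p monsters.first             # default inspect: class name plus instance variables
```

The caller never asks what kind of monster it is holding. It just says "print yourself" and the right to_s runs.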

If you use a mutilated fake of an OOP language, such as C++ or Perl, onto which object orientation has been bolted like a pair of $2,500 rims onto a 1978 Subaru Legacy, then you are probably stuck in a debugger. Or Data::Dumper. Or something else like that. In short, sucks to be you!

(A rhetorical question: why do we choose C++ or Perl? These are the two most terrible languages in the world! We might just as well use Pascal or Cobol, isn't that obvious?)

By the way, polymorphic print is the main reason I haven't written about OCaml lately. For reasons I have not yet fully understood, but which are definitely on the list of The Most Insane Language Designers' Motives, OCaml has no polymorphic print. Therefore, you cannot dump arbitrary objects to the console for debugging. I'm trying to believe that this was needed to achieve legendary performance, superior even to C++. Because any other reason would be a monstrous insult to usability. Well, at least they have a debugger that can run the program backwards in time. That will definitely come in handy more than once.

So, we all love polymorphism. It is an alternative to micromanagement. You ask objects to do something without saying how, and they obediently comply. And then spend the rest of the day watching Strong Bad videos online. Oh, those silly objects! It's impossible not to love them!

But polymorphism, like all worthy heroes, has a Dark Side. Not as dark as Anakin Skywalker's, of course, but still.

The paradox of polymorphism

Using polymorphism involves a rarely spoken but very important condition: you must be able to change the code in the future. At least in statically typed languages such as Java and C++, when you add a polymorphic method you need to recompile all the classes that implement it. That, in turn, means you need access to their source code. And the ability to modify it.

There is a certain class of systems for which this is not feasible - the so-called extensible systems.

Suppose you are designing a hypothetical system that lets users add their own code. This is no trivial task for many reasons, including the need to protect against unauthorized access, ensure thread safety, and much, much more. But such systems exist! For example, there are online games that let players make changes without access to the original source code. In fact, most online multiplayer games are moving in this direction: company management has realized that users can and will create excellent content themselves. So games open up their APIs and let players extend the program by creating their own monsters, their own spells, and so on down the list.

Something tells me that web services are in the same boat as online games.

Every time you create such an extensible system, you have to work three times as hard: you must design the internal APIs and classes so that end users can modify them.

A good example is Java Swing. Every extensible system runs into the Inventor's Paradox. You can read more about the paradox elsewhere; I'll just give the gist: you cannot predict in advance what changes users will want to make. You can do anything, even expose every line of code as a separate virtual function, but users will inevitably run into something they want to modify and cannot. It is a real tragedy, and there is no elegant solution. Swing tries to cope by providing lots of hooks. But that makes its API terribly cumbersome and hard to learn.

The essence of the problem

To make the conversation more concrete, let's go back to the online game example. Suppose you have perfectly designed and published all the APIs and classes for creating and managing spells, monsters, and other game objects. Suppose you have a large collection of monsters. I'm sure you can imagine it if you try.

Suppose now that one of the players wants to create a small pet named the Evaluation Elf. This is, of course, a contrived example, much like the ones used in proofs about the halting problem, but a similar situation is quite possible in real life.

Let the Evaluation Elf's sole purpose in life be to announce whether or not he likes other monsters. He sits on your shoulder, and every time you meet, say, an Orc, he shrieks bloodthirstily: "I hate orcs!!! Aaaaaaaa!!!" (By the way, this is exactly how I feel about C++.)

A polymorphic solution to this problem is simple: go through each of your 150 monsters and add a method ненавидитЛиМеняОценочныйЭльф() (roughly, "doesTheEvaluationElfHateMe") to it.

Heck! It even sounds insanely stupid. But this is a truly polymorphic approach, isn't it? If there is a group of similar objects (in our case, monsters), and all of them must react in different ways to the same situation, then you add a virtual method to them and implement it differently for different objects. Right?

Obviously, this approach will not work in our case, and even if it could (and it cannot, because the user who wrote this little elf has no access to the source code), it would definitely smack of Bad Design. There is no reason to add such a specific method to every monster in the game. What if it later turns out that the Evaluation Elf infringes someone's copyright and has to be removed? You would have to restore everything to its original state by deleting the method from all 150 classes.

As far as I know (and I don't claim the laurels of a good designer, I just want to find the right answer), the correct solution is run-time type checking. The code will look something like this:

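// Identifier glosses: нравитсяЛиОнЭльфу = "doesTheElfLikeIt",
// Монстр = Monster, Орк = Orc, Эльф = Elf.
public boolean нравитсяЛиОнЭльфу(Монстр mon)
{
    if (mon instanceof Орк) { return false; }
    if (mon instanceof Эльф) { return true; }
    // ... <repeat 150 times>
}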

Of course, you could go full OOP freak and create 150 helper classes for the Evaluation Elf, one per monster. That still would not get at the root of the problem, which is that the differing behavior belongs not to the callee but to the caller.

In some high-level languages the problem is solved a little more elegantly (I stress: only a little). Ruby, for example, lets you add methods to other classes. Even library ones. Even if you do not have the source code. For example, you can put the following code in the Evaluation Elf file:

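# Identifier glosses: Орк = Orc, Тролль = Troll, ЭльфийскаяДева = ElvenMaiden,
# нравлюсьЛиЯЭльфу = "doesTheElfLikeMe".
class Орк
  def нравлюсьЛиЯЭльфу; return false; end
end
class Тролль
  def нравлюсьЛиЯЭльфу; return false; end
end
class ЭльфийскаяДева
  def нравлюсьЛиЯЭльфу; return true; end
end
# ...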

Ruby will load all the classes listed, if they are not already loaded, and add your method to each of them. This is a very cool feature, generally speaking.

But this approach has both pros and cons. How does it work? In Ruby (as in most other high-level languages), methods are just entries in a hash table that belongs to the class. And then you come along and add your own entry to the hash table of each of the Monster subclasses. The benefits:

all the code of the Evaluation Elf is contained in its file;
code is not added to classes until the elf file is loaded;
not only the elf, but anyone else in the system can ask the monster whether the elf likes it or not.

The disadvantage is that you have to provide default behavior for the case when the elf does not recognize a monster because it was added to the game after the elf was written. If someone comes up with a Gremlin, your elf will hang, shouting something like "Damn it, what is that?!", until you update its code by adding gremlins to it.
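One way to supply that default, sketched in Ruby, is to put a fallback method on the common base class in the same elf file. Everything here is hypothetical: the class names, the method name liked_by_elf?, and above all the choice of "no" as the default answer.

```ruby
# Stand-ins for the game's own classes, which the elf's author cannot edit.
class Monster; end
class Orc < Monster; end
class ElvenMaiden < Monster; end
class Gremlin < Monster; end # added to the game after the elf was written

# --- contents of the hypothetical elf file start here ---

# Fallback for every monster, including ones the elf has never heard of.
# Answering "no" by default is my assumption, not a rule of the game.
class Monster
  def liked_by_elf?
    false
  end
end

# Reopening an existing class adds the method without touching its source.
class Orc
  def liked_by_elf?
    false
  end
end

class ElvenMaiden
  def liked_by_elf?
    true
  end
end

[Orc.new, ElvenMaiden.new, Gremlin.new].each do |monster|
  puts "#{monster.class}: #{monster.liked_by_elf?}"
end
```

Gremlin gets an answer through ordinary inheritance, so the elf no longer chokes on newcomers. It is still an uninformed answer, of course.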

I think that if you could somehow enumerate all the classes in the system and check whether they are descendants of Monster, then it could all be done in a few lines of code. In Ruby, I bet it's possible ... but only for classes that are already loaded. For classes still on disk, it will not work! Surely you could work around that, but besides the disk there is also the network ...
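For what it's worth, here is a sketch of that enumeration using ObjectSpace, which is part of core Ruby. The class names and the helper loaded_monster_classes are made up, and the caveat above stands: only classes that are already loaded show up.

```ruby
# Stand-ins for classes the game has already loaded.
class Monster; end
class Orc < Monster; end
class Troll < Monster; end
class CaveTroll < Troll; end
class Spell; end # not a monster, so it must not be picked up

# Every currently loaded class that inherits from Monster, directly or not.
# `klass < Monster` is true only for strict descendants; it is false for
# Monster itself and nil for unrelated classes.
def loaded_monster_classes
  ObjectSpace.each_object(Class).select do |klass|
    klass < Monster && !klass.singleton_class?
  end
end

puts loaded_monster_classes.map(&:name).sort
```

Note that ObjectSpace behaves differently across Ruby implementations, so treat this as something to try on standard Ruby rather than a portable guarantee.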

However, the need for default behavior is not the worst of it. There are much more serious disadvantages. Thread safety, say. It seriously bothers me: I don't think Ruby's thread-safety semantics are clearly defined for this case. Will there be class-level synchronization? What happens to threads that hold instances of the pre-elf class? I do not know enough Japanese to understand the specifications or the source code.

But the real problem, the thing that really bothers me, is that the code starts to spread across every class in the system. It smacks of an encapsulation violation.

It's actually worse. It smells of Bad Design. We end up with an observer who makes judgments, and we attach the code for those judgments to the objects being observed. It is as if I walked around my floor and handed each colleague a personal badge with the words: "Please don't lose this. I need it to tell whether I like you or not." The real world does not work that way, and OOP is supposed to model the real world.

Polymorphism revisited

Well, now that I have formulated my thought so clearly, polymorphism no longer seems like a silver bullet. Even in non-extensible systems, if you want to make some choice based on the type of an object, it does not always make sense to push that choice into the object itself.

As a more practical and down-to-earth example, take authentication. Let me ask you: if you were developing an access control system, would you create a virtual method имеюЛиЯПравоДоступа() (roughly, "doIHaveAccess") and force all interested parties to implement it? In other words, would you post a security guard at the entrance who asks each visitor whether they are allowed into the building?

No way! You would have to add run-time verification code:

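// Identifier glosses: запретитьВходВЗдание = "denyEntryToBuilding", Субъект = Subject,
// неИмеетБейджа = "hasNoBadge", подозрительноВыглядит = "looksSuspicious",
// вооруженАвтоматом = "isArmedWithAssaultRifle".
public boolean запретитьВходВЗдание(Субъект s)
{
    return (s.неИмеетБейджа() || s.подозрительноВыглядит() || s.вооруженАвтоматом());
}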

But wait: nowhere is the class checked directly. I did not write, say, s instanceof НосительАвтомата (roughly, "RifleCarrier"). What is going on here?

Well, the "type" of an object is, in essence, the combination of its class (which is fixed and unchanging) and its properties (which may be fixed or may change at run time). This is a topic for another discussion, but it seems to me that type is determined more by properties than by classes, precisely because of the inherent inflexibility of the latter. But in "traditional" languages such as C++ and Java, that approach would make code reuse somewhat harder, because there is no syntactic support for delegation. (If that suddenly seemed not to make sense, all is well: I'm already finishing my third glass of wine, on the way to the penultimate one. So let's leave this topic for another post.)

At the same time, I hope I have managed to express the main idea clearly: polymorphism makes sense only when the polymorphic behavior actually belongs to the object. If it is the behavior of the subject (the one doing the calling), then dynamic type checking is preferable.
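To tie this back to the elf: here is a Ruby sketch in which the judgment stays entirely with the subject. All class names are hypothetical, and the "dislike strangers" default is my assumption.

```ruby
# Stand-ins for the game's classes; none of them is modified below.
class Monster; end
class Orc < Monster; end
class Troll < Monster; end
class ElvenMaiden < Monster; end
class Gremlin < Monster; end

# The opinion lives in the one who holds it.
class EvaluationElf
  # case/when compares with Class#===, the same test as is_a?.
  def likes?(monster)
    case monster
    when Orc, Troll then false
    when ElvenMaiden then true
    else false # assumed default for monsters the elf does not recognize
    end
  end
end

elf = EvaluationElf.new
[Orc.new, ElvenMaiden.new, Gremlin.new].each do |monster|
  puts "#{monster.class}: #{elf.likes?(monster)}"
end
```

Deleting the elf now means deleting one class. No monster ever learns that it was being judged.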

Summing up

So, I hope you learned something useful from today's post. I certainly did. For example, I found out that the Google search engine really is smart enough to fix a misspelled "Anakin Skywalker" by asking, "Did you mean: Anakin Skywalker?" What nerve! It's not as if they own the copyright.

I also found out that the ideal length of a blog entry is exactly two glasses of wine. Go any further and you start ranting almost incoherently. And your typing speed goes to hell.

Good luck.

Original - When Polymorphism Fails. Steve Yegge. Stevey's Drunken Blog Rants
