maxstroy June 29, 2017 at 07:28

How to confuse analytics. Part one

- The army learned to combine space and time.
- How?
- Very simple! The ensign gives the task: “Today we will dig from the fence to lunch.”

In this article I begin a series about the confusions that regularly creep into information models without any critical analysis.

In a previous article, I defined type and attribute. Let me recall the definitions:

A type is the selection of a heap (subset) from a heap (set), together with giving the objects of that heap a unique name: a noun.
An attribute divides a heap (set) into heaps (subsets) and gives the objects of these heaps different adjectives.

Those definitions of type and attribute were based on analysis: we divided a heap into parts. In effect, we built types by analysis. Now I will show how types and attributes can be built by synthesis.

You can take the set of boats, the set of inflatable mattresses, and the set of seaplanes, combine them into one set, and get the set of watercraft. The set of watercraft is thus obtained by synthesizing several sets.
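The synthesis described above can be sketched with Python's built-in sets. This is only an illustration: the object identifiers and the name `combined` are hypothetical and do not come from the original article.

```python
# Hypothetical accounting objects; the identifiers are illustrative only.
boats = {"rowboat-1", "rowboat-2"}
inflatable_mattresses = {"mattress-1"}
seaplanes = {"seaplane-1", "seaplane-2"}

# Synthesis: the new set is built as the union of the source sets.
combined = boats | inflatable_mattresses | seaplanes

# Analysis goes the other way: a subset is isolated from the larger set.
boats_again = {item for item in combined if item.startswith("rowboat")}
```

Here `combined` is defined by what was put into it, while `boats_again` is defined by a selection rule applied to an existing set.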

Earlier, I argued that the type “object” is not derivable, where derivability means isolating a subset from a superset. However, the type “object” can be synthesized. For this there must be an expert who decides whether each new accounting object belongs to the type or not. Synthesis can be done in the following ways:

by listing all accounting objects that belong to this type;
by listing all types of accounting objects that belong to this type.

You can do the same with an attribute: take several heaps, assign a value to each, combine them, and call the result an attribute.

Let us generalize to arbitrary operations on sets:

A type is a common property of the objects of a set and a common name for these objects. The membership of the set may be determined in any convenient way.
An attribute is a set of subsets, each of which corresponds to a unique attribute value.
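A minimal sketch of these two definitions, assuming hypothetical apple objects and a helper named `build_attribute` (neither appears in the original article). The type is a set given by listing its members, and the attribute is a family of subsets keyed by value. In this sketch the subsets happen to be disjoint, which the definition above does not strictly require.

```python
from collections import defaultdict

# Hypothetical accounting objects with an observed color.
observed_color = {
    "apple-1": "red",
    "apple-2": "green",
    "apple-3": "red",
}

# The type "apple" modeled as a set whose members are simply listed.
apples = frozenset(observed_color)


def build_attribute(members, value_of):
    """Return {value: subset}: one subset of `members` per attribute value."""
    subsets = defaultdict(set)
    for member in members:
        subsets[value_of(member)].add(member)
    return {value: frozenset(subset) for value, subset in subsets.items()}


# The attribute "color" is a set of subsets, one per value.
color = build_attribute(apples, observed_color.get)

# Together the subsets cover the whole set of apples.
covered = frozenset().union(*color.values())
```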

You may say that operating with heaps is too primitive. After all, we can work with abstractions such as graphs, logics, and algebras. However, introducing new abstractions without a deep understanding of what stands behind them quickly leads to confusion. Mathematicians using their tools do not make logical errors, for two reasons: they follow a strict discipline of reasoning, instilled in them by studying the foundations of mathematics, and, when they create models to solve specific problems, they do not burden themselves with joining these models to one another. We, in contrast, when building models of different subject areas, often lack the necessary mathematical apparatus (for example, we do not know what a Jordan measure is or what scaling is), yet we take on the hardest problems of joining different models. For instance, you can easily come across models in which color is expressed by wavelength in one, by the frequency of the electromagnetic wave in another, and by words like “red” in a third. Or “Collie” may be an attribute value of the “dog” class in one model and a separate class of objects in another. Of course, you can build a transformation from one model to another. But how do you do it, and what stands behind it?
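To make the color example concrete, here is a hedged sketch of transformations between the three models. The wavelength and frequency models are linked by the relation wavelength × frequency = speed of light in vacuum. The band edges used for the word model are approximate conventional values and are an assumption of this sketch, as are all function names.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s in vacuum, exact by definition of the metre


def wavelength_to_frequency(wavelength_m):
    """Model 1 -> model 2: a one-to-one, reversible transformation."""
    return SPEED_OF_LIGHT / wavelength_m


def frequency_to_wavelength(frequency_hz):
    """Model 2 -> model 1: the inverse of the function above."""
    return SPEED_OF_LIGHT / frequency_hz


# Approximate band edges in nanometres (an assumption of this sketch).
COLOR_BANDS_NM = [
    (380, 450, "violet"),
    (450, 495, "blue"),
    (495, 570, "green"),
    (570, 590, "yellow"),
    (590, 620, "orange"),
    (620, 750, "red"),
]


def wavelength_to_word(wavelength_m):
    """Model 1 -> model 3: many wavelengths map to one word."""
    nanometres = wavelength_m * 1e9
    for low, high, word in COLOR_BANDS_NM:
        if low <= nanometres < high:
            return word
    return None  # outside the visible range: the word model has no value
```

The first two functions lose nothing, while `wavelength_to_word` maps a whole interval to one word, so there is no function that goes back from “red” to a single wavelength. The transformation exists, but the models are not equivalent.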

To avoid getting confused, you can use a simple method: rely on what we can all do at the genetic level, namely divide heaps into parts and put heaps together. That is why my story about types and attributes, naive as it may have seemed, had an important goal: to demonstrate how to build models on this foundation. I tried to introduce the concepts of type and attribute the way mathematicians would have introduced them, but in a language that a practitioner can understand.

To understand what kind of confusion I'm talking about, we need to ask simple questions:

Why does multiplying 3 apples by 4 baskets give 12 apples, and not 12 baskets or 12 apple-baskets?
Why do 3 amperes multiplied by 4 volts give 12 watts, and not 12 amperes or 12 volts?
Why do 3 meters times 4 meters give 12 square meters?
Why does Pascal walk into a bar and see 100,000 pascals there?
Why do the model of a linear transformation of space and the model of a change of basis look exactly the same, as a matrix, although the physical meaning of the two operations is completely different?
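One common way to make the first three questions concrete is to carry the units along with the numbers. The sketch below is an illustration under that assumption, not the article's own answer; the `Quantity` class and the helper `q` are hypothetical names. It shows that the apple example only yields apples when the first factor is read as “3 apples per basket”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    units: tuple  # sorted tuple of (unit_name, exponent) pairs

    def __mul__(self, other):
        # Multiply the numbers and add the exponents of matching units.
        exponents = dict(self.units)
        for name, exponent in other.units:
            exponents[name] = exponents.get(name, 0) + exponent
        units = tuple(sorted((n, e) for n, e in exponents.items() if e != 0))
        return Quantity(self.value * other.value, units)


def q(value, **units):
    """Build a quantity, e.g. q(3, apple=1, basket=-1) for 3 apples per basket."""
    return Quantity(value, tuple(sorted(units.items())))


area = q(3, m=1) * q(4, m=1)                        # square meters
apples = q(3, apple=1, basket=-1) * q(4, basket=1)  # baskets cancel out
power = q(3, A=1) * q(4, V=1)                       # volt-ampere, named the watt
```

The number 12 is the same in all three results; what differs is the model attached to it.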

The answer is simple: mathematics is nothing more than a modeling tool. But the question is: a tool for modeling what? An accounting object, a type of accounting objects, a class to which an accounting object belongs, or a model of an accounting object? Using numbers, formulas, graphs, and so on, it is easy to get confused and fail to give an exact answer to this simple question. When models of accounting objects, models of types, and models of models appear on the same canvas, working with that canvas requires great care. My experience suggests that in this case errors are guaranteed.

Let's start with the model of an accounting object. An accounting object is something that the subject has singled out as part of the real world but has not yet classified and named, for example, a four-dimensional space-time volume. This accounting object then goes through a modeling stage in the subject's mind, by means of a metamodel. A metamodel is a person’s view of how to model. For example, we all share the idea that there are objects, actions performed by objects, types of objects, types of actions, and properties. This metamodel was noticed and well described by Aristotle. Only one question remained unanswered: if there is an attribute “height” for trees and an attribute “height” for buildings, is this the same attribute or two different ones? We don’t know what Aristotle thought on this subject, but we know

The metamodel that Aristotle noticed, and on which our mind operates, has severe limitations. The reason is that the language in which we express our models balances on the border between completeness and brevity. If the language is complete, statements in it are so heavy that a person using it would not survive in the real world. If the language is brief, it is a narrow, specialized, contextual language that does not apply to all occasions. A universal language lies between these extremes; therefore it is neither complete nor brief.

Mathematicians ran into the limitations of the metamodel on which we build models in our minds and tried to develop their own metamodel, free of these shortcomings. This metamodel consists of objects and sets. Its advantage is that it lets you build consistent and extensible models. Its disadvantage is that models built with it translate poorly into ordinary human language and are rather cumbersome.

If there were a modeling standard built on sets and objects, it would be possible to build models in which types are modeled by sets and attributes by sets of sets, and the model itself would be extensible without limit. But so far there is no such standard. Perhaps the reason is that models built on this metamodel would be quite bulky, since a separate class of objects would have to be created for each attribute value. It is also possible that, on the way to such a standard, we are held back by the habit of thinking in types and attributes. One way or another, there is no such standard yet, and that is bad, at least from the standpoint of modeling theory.

In practice, existing modeling standards try to sit on two chairs at once, modeling types, sets, and attributes simultaneously.

Thus, in OOP, “classes” are used to model types. A type is modeled by the set of attributes built into the class. You can also specify the name for objects of this type. But OOP provides nothing more for describing a type.

To model attributes in OOP, attributes of these “classes” are created. If you need to model attributes common to several types of objects, you have to create an empty “superclass” whose “subclasses” model these types of objects. Clearly, this cannot be done without multiple inheritance.
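A minimal sketch of this pattern in Python, reusing the tree and building “height” example from earlier in the article. The class names `HasHeight` and `HasAge` are hypothetical. Each superclass exists only to carry one shared attribute, and a type that needs both has to inherit from both.

```python
class HasHeight:
    """Superclass that exists only to carry the shared attribute 'height'."""

    def __init__(self, height_m, **kwargs):
        super().__init__(**kwargs)
        self.height_m = height_m


class HasAge:
    """Superclass that exists only to carry the shared attribute 'age'."""

    def __init__(self, age_years, **kwargs):
        super().__init__(**kwargs)
        self.age_years = age_years


class Tree(HasHeight, HasAge):
    pass


class Building(HasHeight, HasAge):
    pass


oak = Tree(height_m=20.0, age_years=150)
tower = Building(height_m=30.0, age_years=10)
```

In this model “height” is one attribute for both trees and buildings only because both classes inherit from the same superclass. Two separate `height_m` fields declared independently in `Tree` and `Building` would be unrelated as far as the language is concerned.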

Suppose that in OOP you also need to model other properties of a type, for example, the distinguishing features of objects of this type. For this you can use a trick: create a “class” of types whose “instances” are object types. Two objects then appear in the model, both modeling one type: the “class” of objects and the associated “instance” of the “class” of types. OOP has no standard mechanism for maintaining the integrity of such a model, so its integrity depends on the care of the programmer.
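The trick might look roughly like this in Python. All names (`ObjectType`, `Cup`, `Mug`, `registry`) are hypothetical, and the link between the two halves of the model is an ordinary attribute that the programmer must keep up to date by hand.

```python
class Cup:
    """The 'class' of objects: models the type through its attributes."""

    def __init__(self, volume_ml):
        self.volume_ml = volume_ml


class ObjectType:
    """A 'class' of types: each instance describes one object type."""

    def __init__(self, name, distinguishing_feature, modeled_by):
        self.name = name
        self.distinguishing_feature = distinguishing_feature
        self.modeled_by = modeled_by  # link to the class, maintained by hand


cup_type = ObjectType("cup", "has a handle and holds liquid", Cup)
registry = {t.modeled_by: t for t in [cup_type]}


# Nothing in the language stops the two halves from drifting apart:
class Mug(Cup):
    pass  # a new type of objects, but no ObjectType instance describes it
```

After `Mug` is added, the model contains a type with no description, and no language mechanism reports it. Keeping `registry` complete is left to the programmer.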

If you need to model a set, OOP offers various means: lists, collections, and so on. But these model only the membership of the set. The set itself, like operations on sets, cannot be modeled by standard OOP means. To create a model of a set, you have to create a “class” of sets whose “instance” models the set. The collection and the object associated with it then jointly model one object: the set. And again, the integrity of the model is in the hands of the programmer.
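A hedged sketch of such a “class” of sets. The name `ModeledSet` and its fields are assumptions made for illustration. The instance carries what is known about the set as a whole, such as its name and membership criterion, while the inner collection carries only the membership.

```python
class ModeledSet:
    """An instance models one set; `members` models only its membership."""

    def __init__(self, name, criterion, members=()):
        self.name = name
        self.criterion = criterion   # how membership was decided
        self.members = set(members)  # the collection: membership only

    def union(self, other, name):
        """An operation on sets must produce a new set model, not just a collection."""
        criterion = f"({self.criterion}) or ({other.criterion})"
        return ModeledSet(name, criterion, self.members | other.members)


glass_cups = ModeledSet("glass cups", "made of glass", {"cup-1", "cup-2"})
tea_cups = ModeledSet("tea cups", "used for tea", {"cup-2", "cup-3"})
all_cups = glass_cups.union(tea_cups, "cups")
```

If code changes `all_cups.members` directly, the stored `criterion` silently stops describing the set, which is the integrity problem mentioned above.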

The confusion of types and objects

Programmers often think that modeling a type of objects and modeling an object are one and the same thing. For example, a class in OOP is for some reason stubbornly given the name of an object, say “cup”, rather than of a type of objects, while the object itself is stubbornly called an “instance of the cup” rather than a “cup”. Because of this, standards are born with collisions already built into them.

Have you ever heard that a process can be modeled with a directed graph? Here is an example. A directed graph may or may not have cycles. When it comes to a sequence of operations, the graph modeling that sequence clearly cannot have cycles, since time always moves forward and never backward. A graph that satisfies this requirement is called a network graph. But a model built in BPMN notation can contain loops. It follows that in BPMN notation we model not a sequence of operations but something else. Personally, this seems obvious to me, but few programmers understand it. Numerous examples can be found in the discussions of my articles.
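The distinction can be checked mechanically. The sketch below tests whether a directed graph, given as a list of edges, is free of cycles; it uses Kahn's algorithm, which is a choice of this sketch and not something the article prescribes. The operation names are hypothetical.

```python
from collections import deque


def is_acyclic(edges):
    """Return True if the directed graph given as (source, target) pairs has no cycles."""
    nodes = {node for edge in edges for node in edge}
    indegree = {node: 0 for node in nodes}
    successors = {node: [] for node in nodes}
    for source, target in edges:
        successors[source].append(target)
        indegree[target] += 1

    # Repeatedly remove nodes that have no remaining incoming edges.
    ready = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while ready:
        node = ready.popleft()
        removed += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    # Nodes that could never be removed lie on a cycle or behind one.
    return removed == len(nodes)


# A sequence of operations performed once, in time order.
executed_sequence = [("draft", "review"), ("review", "publish")]

# A process description with a loop back from review to draft.
process_with_loop = executed_sequence + [("review", "draft")]
```

`is_acyclic(executed_sequence)` holds, while `is_acyclic(process_with_loop)` does not. The second graph can describe which operation may follow which, but it cannot be a record of operations ordered in time.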

The confusion of types and sets

Another confusion arises because programmers often make no distinction between types of objects and sets of objects. For example, there is an opinion that a class in OOP is a model of a set. But alas, it is a model of an object type, not of a set. Because of this confusion, when you need to model a set of structural elements consisting of objects and the connections between them, programmers give up. It is difficult for them to imagine how objects of different types can fall into one set.
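As a small illustration, here is one set that holds objects of different types together with a connection between them. The classes `Pump`, `Pipe`, and `Connection` and the tags are hypothetical examples, not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pump:
    tag: str


@dataclass(frozen=True)
class Pipe:
    tag: str


@dataclass(frozen=True)
class Connection:
    source: object
    target: object


pump = Pump("P-1")
pipe = Pipe("L-1")
link = Connection(pump, pipe)

# One set of structural elements; membership does not depend on the class.
structural_elements = {pump, pipe, link}
types_present = {type(element).__name__ for element in structural_elements}
```

The set `structural_elements` is defined by what was placed in it, while each of the three classes is defined by the attributes its objects carry. No single class corresponds to this set.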

These were examples of confusions that are widespread but do not go beyond the circle of programmers. There are others that are far more widespread. We will talk about them in the following articles.
