Matryoshka C. The layer system of the program language

Try to imagine chemistry without Mendeleev's periodic system (1869). How many elements had to be kept in mind, and in arbitrary order at that... (There were 60 back then.)

To do so, just think about one or several programming languages at once. The same feelings, the same creative mess.

Now we can feel what the chemists of the nineteenth century felt when they were offered all their knowledge, and a little more on top, in a single periodic table.


The book “Matryoshka C. The Layer System of the Program Language” presents all the units of the C language at a glance. This makes it possible to put them in order, correct outdated information, and even clarify the very concept of a program.

Today, information about programming needs to be systematized even more than the chemical elements did 150 years ago.

The first need is teaching. Mendeleev began creating his system when he faced the question of which element to begin his lectures with: O, H, N, He, Au... And he had it easier: he taught chemistry to the best, the students of St. Petersburg University. Programming, meanwhile, is already taught at school and will soon be taught in kindergarten.

The second need is a scientific approach. With the help of the periodic system, new elements were discovered and information about the old ones was corrected. It helped in creating the model of the atom (1911). And so on.

The third need is clarification of the concept of a program.

Modern programming still has one foot stuck in the 1950s. Back then programs were simple, while machines and machine languages were complex, so everything revolved around machines and languages.

Now the opposite is true: programs are complex and primary, languages are simple and secondary. This is called the applied approach, and everyone seems to be familiar with it. Yet students and developers are still being assured that nothing has changed.

Which brings us back to the first lecture of Privatdozent Mendeleev. What should the freshmen be told? Where is the truth? That is the question.

The book “Matryoshka C. The Layer System of the Program Language” answers it. Moreover, it is addressed not only to students but also to trained programmers, since it is they, that is, we, who must search for the truth and turn the worldview around.

The following is a summary of the book.

1. Introduction

In 1969 the C language was created; it became the fundamental programming language and has remained so for 50 years. Why? First of all, because C is an applied language that gave the program a human look instead of a machine one. This achievement was consolidated by the languages of the C family: C++, JavaScript, PHP, Java, C#, and others. Second, it is a short and beautiful language.

However, the C language is usually mixed up with machine assembly language, which complicates and distorts its perception. The other extreme is to load the language with some “philosophy”: procedural, object, functional, compiled, interpreted, typed, and so on. This adds emotion but does not help to describe the language any better.

The truth is in the middle, and for the C language it lies strictly midway between the philosophical and the machine views.

The C language is not independent: it is subordinate to ordinary written speech, and at the same time it governs assembly language. This position is described by the speech model of the program, according to which the program is divided into three successively subordinate forms: speech, code, and command. The C language is responsible for the second, the code form.

Once the place of the language in the program is determined, information about it can be put in order. This is done by the layer system of the program language, which presents the C language in the spirit of Mendeleev's system, on a single page.

The system is built with regard to the commonality of applied languages, which arises from their subordination to speech. One set of Matryoshka C units makes it possible to describe and compare different languages, producing a whole series of Matryoshkas: C++, PHP, JavaScript, C#, MySQL, Python, and so on. It is fitting and right that different languages are described by the units of the fundamental language.

2. CHAPTER 1. The speech model of the program. Clear C

The first chapter presents the speech model of the program, which reflects the applied approach. According to it, the program has three obvious successive forms:

  1. speech - the direct speech of a programmer solving a problem,
  2. code - the encoding of the solution in mathematical form in C (or any other language),
  3. command - the immediate machine commands.

The speech model explains why C is a simple and understandable language. C is built in the image and likeness of human speech familiar to us.

The first form of the program is the direct speech of the programmer. Speech follows the course of human thinking. Novice programmers write programs with the help of speech: first in Russian, then translating the actions step by step into the language of code. And it was precisely on this model that the C language was created.

The programmer's conclusions, expressed in speech, are converted into a coded, numerical form. This transformation should be called reflection, since speech and code have the same nature (reflection, birth, kind). This is quite obvious if we compare the speech form (on the left) and the code form (on the right) of the program.
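As a minimal sketch of such a comparison (the task and the name sum_to are invented here for illustration and are not taken from the book), the speech form can be written as comments next to the C code that reflects it:

```c
/* Speech: "Add up all the whole numbers from 1 to n." */
int sum_to(int n)
{
    int total = 0;                 /* "At first the total is zero."       */
    for (int i = 1; i <= n; i++)   /* "For every number from 1 to n..."   */
        total = total + i;         /* "...add that number to the total."  */
    return total;                  /* "The total is the answer."          */
}
```

Calling sum_to(4) walks through 1, 2, 3, 4 and returns 10. Each line of code has a phrase of speech beside it, which is the point of the comparison.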


It is curious that reflection occurs very simply - with just two kinds of expressions.

However, the current description of the C language (dating from 1978) lacks a sufficient set of names for describing the language in general or the task of reflection in particular. We are therefore forced to be creative and introduce these names ourselves.

The choice of words must be accurate and clear. This called for a special approach, which boils down to the strict use of one's native language. For the English that would be English, but we are not English. So we use what we have and try to speak Russian.

Reflection is performed by two kinds of expressions:

  1. Calculation (Vch) - reflects a change in the properties of an object. A property of an object is expressed by a number, so an action on the property is an action on the number: an operation.
  2. Submission (Pch) - reflects a change in the order of actions. The prototype of a Pch is the complex sentence of ordinary speech, which is why most kinds of Pch begin with the subordinating conjunctions “if”, “else”, “while”, “for”. The other kinds of Pch complement them.
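The two kinds can be told apart even in a tiny function. The sketch below is only an illustration: the function name halvings and the task are assumptions made here, and the labels in the comments follow the terms introduced above.

```c
/* Counts how many times n can be halved before it reaches 1.
   Returns -1 when n is less than 1. */
int halvings(int n)
{
    int count = 0;              /* declaration                          */
    if (n < 1)                  /* submission: "if"                     */
        return -1;
    while (n > 1) {             /* submission: "while"                  */
        n = n / 2;              /* calculation: changes the property n  */
        count = count + 1;      /* calculation                          */
    }
    return count;
}
```

The calculations change numbers; the submissions decide whether and how many times those calculations run.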

By the way, can you believe that the description of C has no name for the calculation expression, and that such expressions are simply called “expressions”? After that, the absence of a name and a grouping for the submission kind, and the general scarcity of names, definitions, and generalizations, will no longer come as a surprise. The reason is that the famous K&R (“The C Programming Language”, Kernighan and Ritchie, 1978) is not a description of the language but a guide to using it.

Yet one would like to have a description of the language. That is what the layer system of the program language offers.

3. CHAPTER 2. Layer system. Short C

Any description should be accurate and as short as possible. In the case of a program language, a head-on description is difficult.

Here we have a program. It consists of modules. Modules consist of subprograms and collections. Subprograms consist of individual expressions: declarations, calculations, and submissions. There are as many as ten kinds of submissions. Submissions attach sublevels and subprograms. There are also several kinds of declarations. Declarations, however, occur not only in subprograms and sublevels but also in modules and collections. And most expressions consist of words, which are so hard to describe that they are usually just given in two lists, of initial and derived words, which one gets to know throughout the study and use of the language. Add to this the punctuation marks and a number of other expressions.

From such an account it is not easy to tell what rests on what.

A direct hierarchical approach to describing the language turns out to be overly complex. The search for a roundabout path leads to a description of the language based on its speech nature and its command side. Thus the layer system was born. It partly coincides with the periodic system, which is also a layer system: as it turned out 42 years after its publication (1869), the periodicity of the system is connected with electron layers (1911, the Bohr-Rutherford model of the atom). The layer and periodic systems are also alike in placing all their units in a table on a single page.

The description of the language units is concise (only 10 kinds of expressions and 8 kinds of other units), as well as meaningful and visual, although unusual at first acquaintance.

The language units are divided into 6 levels:

  1. orders - the rows of the table,
  2. divisions - special groups of genera (parts of the first row),
  3. genera (kinds) - the cells (the main level of division),
  4. subspecies - dividers of species (a rare level),
  5. species - unit formulas at the bottom of a cell or given separately,
  6. samples - the units themselves (words only).

Samples of words are described by a dictionary - a separate subsystem composed of the same six levels.

The speech component of the C language is fairly obvious, although it still deserves a description. The command part of the language, on the other hand, is connected precisely with controlling compilation, during which the third form of the program, the command form, is created. Here we come to the most exciting side of C: its beauty.

4. NEXT CHAPTERS. Beautiful C

The C language underlies modern programming. Why? First, because it follows speech most consistently. Second, because it elegantly got around the limitations of machine number processing.

What exactly did C offer? The image and the layer.

The word “image” is a translation of the English word “type”, which goes back to the Greek “typos”, meaning “prototype”. In Russian the word “type” does not convey the cornerstone of the concept; moreover, it gets mixed up with the auxiliary meaning of “kind”.

Initially the image solved a purely machine task of computation, and later it became the launch pad for the birth of object languages.

The layer immediately solved several problems - both machine and applied. Therefore, the review will begin with a single-tasking image and move on to the multi-tasking layer.

One of the unpleasant features of historical programming is that most concepts, including the basic ones, are given without definitions. “The programming language (insert name) has integer and floating-point types of numbers...” and off they go. There is supposedly no need to define what a “type” (image) is, because the authors themselves do not fully understand it and pass over it in silence “for the sake of clarity”. If pinned to the wall, they will give a vague and useless definition. Foreign words help them hide: Russian authors hide behind English ones (type), English authors behind French (subroutine), Greek (polymorphism), Latin (encapsulation), or combinations of them (ad-hoc polymorphism).

But that is not our lot. Our choice is definitions with the visor raised, in plain Russian.


An image is a predictive name of a quantity that defines 1) the proper properties of a quantity and 2) the selection of operations for a quantity.

The word “type” corresponds to the first part of the definition: “the proper properties of a quantity”. But the main meaning of the image lies in the second part: “the selection of operations for a quantity”.

The starting point for introducing an image in the C language is the usual calculation, for example, the addition operation.

Paper mathematics, handwritten or printed, makes little distinction between kinds of numbers and usually treats them all as real. The operations that process them are therefore unambiguous.

Machine mathematics strictly separates numbers into integer and fractional ones. The different kinds of numbers are stored differently in memory and processed by different processor instructions. For example, the addition of integers and the addition of fractional numbers are two different instructions, corresponding to two different processor units. And there is no instruction at all for adding an integer argument to a fractional one.

Applied mathematics, that is, the C language, separates the kinds of numbers but unites the operations: addition of integer and/or fractional numbers is written with a single action sign.
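A small sketch of this point, with function names invented here for illustration: the same sign + is written in both functions, and the image of the arguments decides which addition is meant. Which machine instruction the compiler actually emits depends on the target processor, so the comments describe only the language level.

```c
/* Both arguments have the image int: integer addition is selected. */
int add_whole(int a, int b)
{
    return a + b;
}

/* Both arguments have the image double: fractional addition is selected,
   although the sign in the source text is exactly the same. */
double add_fractional(double a, double b)
{
    return a + b;
}
```

For instance, add_whole(2, 3) gives 5 and add_fractional(0.5, 0.25) gives 0.75.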

A clear definition of the concept of the image makes it possible to speak definitely about two other concepts: quantity and operation.

Quantity and operation

A quantity is a number being processed.

An operation is the processing of the values of the initial quantities (the arguments) to obtain a resulting number (the total).

Quantity and operation are interconnected. Every operation is a quantity, since it has a numerical total. And every quantity is the result of moving a value into or out of a processor register, that is, the result of an operation. Despite this interconnection, what matters is that they can be described separately, even if one word is repeated in different parts of the dictionary, as happens in Matryoshka C.

The machine approach divided all the numbers a programmer uses into commands and data. Formerly both were numbers; commands, for example, were written as numerical codes. In applied languages, however, commands stopped being numbers and became words and action signs. Only “data” remained numbers, but it is absurd to keep calling them that, because in the transition from the machine point of view to the mathematical one, numbers are quantities, which are divided into initial ones (the given) and final ones (the sought). “The sought given” would sound silly.

Commands, too, are divided into two kinds of actions: mathematical and auxiliary. The mathematical actions are operations. We will turn to the auxiliary ones later.

In the C language, the mathematical operations that are unambiguous, or single, both on paper and in the machine become multiple almost without exception.

Multiple operations are several operations with the same name but with different images of the arguments and with different, though similar in meaning, actions.

Integer arguments get the integer operation, fractional arguments the fractional one. The difference is especially evident in division, where the expression 1/2 gives a total of 0, not 0.5. Such notation does not follow the rules of paper mathematics, but the C language does not try to follow them (unlike Fortran); it plays by its own applied rules.

When integers and fractional numbers are mixed, the only correct solution comes into play: casting of the argument values, an accompanying conversion of values from one image to another. Indeed, adding an integer and a fractional number gives a fractional result, so the image of the operation selects the conversion of the integer argument into a fractional value.
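The division example and the mixed case can be checked directly. The following is a minimal sketch; the function names are hypothetical and chosen here only to label the three situations.

```c
/* Both arguments are integers: integer division, the total is 0. */
int whole_half(void)
{
    return 1 / 2;
}

/* Both arguments are fractional: fractional division, the total is 0.5. */
double fractional_half(void)
{
    return 1.0 / 2.0;
}

/* Mixed arguments: the integer 1 is converted to 1.0 first,
   then fractional division is performed, the total is 0.5. */
double mixed_half(void)
{
    return 1 / 2.0;
}

/* Mixed addition: i is converted to double before the addition. */
double mixed_sum(int i, double d)
{
    return i + d;
}
```

The conversion in the last two functions is never written out by the programmer; it accompanies the operation implicitly.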

A number of operations remain single rather than multiple. Such operations are defined for only one image of the arguments: the remainder of division for integer arguments, the bitwise operations for natural (unsigned integer) numbers. Matryoshka C marks the multiplicity of operations with signs (# ^) indicating the images for which the operation is defined. This is an important but previously overlooked property of every operation.
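A short sketch of two such single operations follows. The function names are made up for this example, and the bitwise operation is shown on an unsigned argument, in line with the text above.

```c
/* The remainder sign % accepts only integer arguments.
   An expression such as 7.5 % 2.0 is rejected by the compiler. */
int remainder_whole(int a, int b)
{
    return a % b;
}

/* Bitwise AND on an unsigned integer: keeps the four lowest bits. */
unsigned low_nibble(unsigned x)
{
    return x & 0xFu;
}
```

Here remainder_whole(7, 3) gives 1, and low_nibble(0xAB) gives 0xB.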

All functions are arbitrary single operations. The exception is the operators: bracketless functions built into the language (the initial operations).


Assistance is an accompanying action.

If we take the operation to be the main action, then we can single out two accompanying actions that support the operation and differ from it. These are 1) variable management and 2) submission. Such an action is called assistance.

Here we need to digress and say a few words about Russian translations of programming textbooks. To record actions, the K&R text introduced a new word, statement (expression), in an attempt to split the concept of the machine command into distinct actions: 1) operation, 2) declaration, and 3) submission (called “control structures”). Russian translators buried this attempt by replacing “expression” with the word “operator”, which:

  1. became a synonym of the machine word “command”,
  2. turned out to be a synonym of the phrase “action sign”,
  3. and also acquired an unlimited number of additional meanings, that is, turned into something like the English article “uh...”.

Let us consider the accompanying actions, or assistances.

Variable management

Variable management (UP) - creating and deleting the cells of variables.
UP happens implicitly when a variable is declared, although the declaration is written for a different reason: to indicate the image of the quantity. Only one kind of variable, the dynamically allocated one, is managed explicitly, with the malloc() and free() functions.
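Both kinds of variable management can be seen side by side in a sketch like the one below. The function sum_squares and its task are assumptions made for illustration; malloc() and free() are the standard library functions named in the text.

```c
#include <stdlib.h>

/* Returns 0*0 + 1*1 + ... + (n-1)*(n-1), or -1 if memory is not available. */
int sum_squares(int n)
{
    int total = 0;      /* implicit: the cell appears with the declaration */
    int *squares;

    if (n <= 0)
        return 0;

    squares = malloc((size_t)n * sizeof *squares);  /* explicit creation */
    if (squares == NULL)
        return -1;

    for (int i = 0; i < n; i++) {
        squares[i] = i * i;
        total = total + squares[i];
    }

    free(squares);      /* explicit deletion */
    return total;       /* the cell of total disappears implicitly on return */
}
```

Nothing in the text of the function creates or deletes total, whereas the cells behind squares exist exactly between the malloc() and free() calls.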

It should be noted that implicit actions are more convenient to write, since nothing needs to be written at all, but harder to understand: they are harder to keep track of and to interpret.


Submission - switching the sections of a layer on and off.

The C language offered a way of controlling the order of actions that differs from assembly language, an applied way: submission. It reflects and develops the complex sentence of speech, with an explicit division into the main part (the submission expression itself) and the subordinate part (the sections of a sublevel or subprogram).

Both declaration and submission are built entirely on the concept of the layer.


A layer is a bounded, single-level, selective set of expressions.

The layer, explicitly and implicitly, took on several tasks at once:

  1. putting the program in order,
  2. restricting the visibility of names (implicitly),
  3. managing variables (memory cells) (implicitly),
  4. defining the subordinate sections for submission,
  5. defining functions, collections, and more.
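Tasks 2 and 3 from the list can be illustrated with a sketch, assuming that a layer corresponds to what C textbooks usually call a block in braces. That correspondence, like the function name layer_demo, is an assumption of this example and not a statement from the book.

```c
int layer_demo(void)
{
    int x = 1;              /* x of the outer layer */
    int inner_copy;

    {                       /* a nested layer begins */
        int x = 2;          /* a new cell; the outer x is not visible here */
        inner_copy = x;     /* takes the inner value, 2 */
    }                       /* the inner x is deleted, the outer x is visible again */

    return x * 10 + inner_copy;   /* 1 * 10 + 2 */
}
```

The function returns 12: the outer x kept its value 1 while the inner layer worked with its own x. Neither the hiding of the name nor the creation and deletion of the inner cell is written anywhere explicitly.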

Machine languages had no concept of a layer, so it did not appear in K&R; and since it was not there, introducing it in later books would have been heresy and freethinking. As a result, the concept of the layer never appeared at all, although it is extremely useful and fairly obvious.

Without the layer it is impossible to explain briefly and clearly many of the actions and rules of a program. For example, why goto, simple as three kopecks, is bad, while the intricate while is good. One can only curse helplessly, as Dijkstra did: “the quality of programmers is a decreasing function of the density of go to statements in the programs they produce.” In short, only goats use goto. The level of justification: godlike. True, things are not so bad if your books do not have to explain anything at all, but, as we have said, that is not our lot.
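One possible way to see the difference is to write the same loop twice. This is only a sketch with invented function names; the reading of the braces as a layer is our interpretation of the argument above.

```c
/* With goto: the repeated part has no boundaries of its own.
   The label and the jump are just two points in the text. */
int count_goto(int n)
{
    int i = 0;
again:
    if (i < n) {
        i = i + 1;
        goto again;
    }
    return i;
}

/* With while: the repeated part is a separate section in braces,
   subordinate to the "while" expression that governs it. */
int count_while(int n)
{
    int i = 0;
    while (i < n) {
        i = i + 1;
    }
    return i;
}
```

Both functions return the same total, but only in the second one is it visible at a glance which expressions are repeated and where the repetition ends.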

By the way, one may suppose that Dennis Ritchie left goto in precisely as a key for finding some unnamed concept, because the goto expression is neither necessary nor beautiful. What was needed was a simple and clear explanation of the new principles of the language, which Ritchie himself did not care to give and which rest precisely on the concept of the layer.


Deviation - changing the usual properties of a new name.

The most important deviation is connected precisely with the layer properties of the program and is described by the single word “static”, which has a different meaning in each kind of layer.
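Two of these meanings are easy to show in standard C. In the sketch below the names step and next_id are hypothetical; the comments describe the behavior defined by the language for each position of the word.

```c
/* At the level of the file: static hides the name from other files
   of the program. The cell itself lives for the whole run anyway. */
static int step = 1;

int next_id(void)
{
    /* Inside a function: static makes the cell outlive the call.
       It is created once and keeps its value between calls. */
    static int id = 0;

    id = id + step;
    return id;
}
```

The first call of next_id() returns 1, the second returns 2, because id is not created anew on each call. Without the word static inside the function, every call would return 1.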

5. THE LAST CHAPTER. The commonality of applied languages

Applied languages are image-based languages (they have an image, they are “typed”). They rest on the explicit or implicit use of the image. And here again a contradiction shows itself: an explicit image is easier to understand but less convenient, and vice versa.


(The table has not yet been laid out as text, so it is given as a picture.)

After C, applied languages developed toward ever greater use of the image. The most important one for understanding highly image-based languages is C's direct descendant, the C++ language. It develops the idea of an arbitrary selection of operations for quantities and implements it on the basis of a C expression, the collection, which receives a new name: the object. However, C++ is not as concise and expressive as C, because it is overloaded with new kinds of collections and the rules that come with them. Speaking of “overload”, let us talk about it.

Overload and polymorphism

The word “overload” is an outdated term of the machine approach that denotes the creation of multiple operations.

Machine (system) programmers may well have been annoyed by the multiplicity of operations: “What does this sign (+) mean: addition of integers, addition of fractional numbers, or even an offset?! In our day nobody wrote like that!” Hence the negative connotation of the chosen word (“excess”, “fatigue”). For the application programmer, multiple operations are the cornerstone, the main achievement and legacy of the C language, so natural that they often go unnoticed.

In C++, multiplicity was extended not only to the initial operations but also to functions, both standalone ones and those grouped into classes, that is, methods. Along with multiple methods came the possibility of redefining them in extended classes, which was vaguely named “polymorphism”. The combination of polymorphism and overload produced an explosive mixture, which burst apart into two polymorphisms: “true” and “ad-hoc”. One can make sense of this only in spite of the names assigned. The road to hell is paved with foreign names.

A declaration of the “overload” kind is better expressed by the word add-declaration: the addition of a declaration of a function with the same name but with arguments of a different image.

A declaration of the “polymorphism” kind is better called a redeclaration: an overriding declaration, in a new extension layer, of a function with the same name and with arguments of the same image.

Then it is easy to see that same-named methods with arguments of different images are add-declared, while those with arguments of the same image are redeclared.
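The first of the two, a same-named function for arguments of a different image, can be roughly approximated even in C. The sketch below relies on the C11 _Generic selection; it is not how C++ does it, plain C has no such declarations for functions, and the names area, area_int, and area_double are invented for this example. Redeclaration in an extension layer has no direct counterpart in C and is not shown.

```c
/* Two functions with the same meaning for arguments of different images. */
int area_int(int side)
{
    return side * side;
}

double area_double(double side)
{
    return side * side;
}

/* One name for both: the image of the argument selects the function. */
#define area(x) _Generic((x), int: area_int, double: area_double)(x)
```

With these definitions area(3) selects area_int and gives 9, while area(1.5) selects area_double and gives 2.25. The selection by image is the same mechanism the language itself applies to the sign +.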

Russian words rule.


A look at the concepts of highly image-based languages confirms how important it is to define the fundamental concepts clearly. With C correctly described, learning the highly image-based languages will be easy and pleasant.

This is especially important for the implicit highly image-based languages (PHP, JavaScript). In them the importance of objects (composite images) is even higher than in C++, but the very concept of the image becomes implicit and elusive. In terms of convenience they have become simpler, but in terms of understanding, more complicated.

Therefore, you should start learning programming languages ​​with the C language and move on in the order of appearance of the C family languages.

The same goes for language descriptions. Other languages have the same set of genera of units as the C language, or a smaller one. The number of species and samples can differ in either direction: C++ has more species than C, JavaScript has fewer.

The MySQL language deserves special mention. It would seem to have nothing in common with C, yet Matryoshka describes it perfectly, and getting to know it becomes faster and easier. That matters, given its importance for the web, the canteen of modern programming. Where MySQL goes, the other SQLs follow. And all the Fortrans, Pascals, and Pythons will be described by Matryoshka too, as soon as we get around to them.

So, great things await us - an applied description of the C language and a single description of the languages ​​following it. “Our goals are clear, the tasks are defined. To work, comrades! (Stormy, prolonged applause, turning into a standing ovation. Everyone stands up.) ”

Your opinions will be heard with great attention, and your help in creating the Matryoshka website will be greatly appreciated. More complete information about the book is on the site, cunningly hidden in Matryoshka C.
