Creating a programming language. Part 0

Good day, dear Habr users! I won't ramble; I'll only explain what prompted me to write this article and, in fact, to develop my own programming language.

The thing is that I have been programming for a long time and know several programming languages. Whatever their differences, I manage to write convoluted constructions in any of them (even in Python my code is sometimes so tangled that I cannot understand what I was smoking when I wrote it). Since my code contradicts every canon of good code, I started to wonder how compilers and interpreters manage to understand it.

So let me answer the inevitable questions right away: “Why is this necessary?! Reinventing the wheel again? Nothing better to do?” I am doing this to satisfy my own curiosity, and so that readers who are as curious as I am get an idea of how it all works.

Now to the theory of programming languages. Let's see what everyone's favorite Wikipedia has to say about this:

A programming language is a formal sign system designed to record computer programs. A programming language defines a set of lexical, syntactic, and semantic rules that determine the appearance of a program and the actions that an executor (usually a computer) performs under its control.

That much is clear: nothing complicated, and we all know what it is.

What needs to be done


1. Lexical analyzer (lexer). This module splits the human-readable source text into a stream of tokens and checks that the lexical constructions are ones our programming language allows.
2. Parser. This module checks that the token stream follows the grammar of the language and builds a structured representation of the program, which will subsequently be executed or translated into machine code.
3. An optimizer usually comes next, but since our project is more of a toy than a large undertaking, I will skip it. From here our paths diverge:
3.1. Translator. This module translates the output of the parser into machine code. This approach is used in compilers.
3.2. Executor. This module executes the commands in the output of the parser directly. This approach is used in interpreters.
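
As an illustration of the front end of this pipeline, here is a minimal sketch of turning source text into a stream of tokens. The token names (INT, STR, IDENT, OP) and the `tokenize` function are assumptions made up for this example; they only cover the small syntax used later in this article and are not the final design.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # token category, e.g. "INT", "IDENT", "OP"
    text: str  # the exact source text of the token


# Hypothetical token set, chosen only to cover the sample syntax of this article.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),       # single-line comment, dropped
    ("SKIP",    r"\s+"),           # whitespace, dropped
    ("INT",     r"\d+"),           # integer literal
    ("STR",     r"'[^'\n]*'"),     # string literal in single quotes
    ("IDENT",   r"[A-Za-z_]\w*"),  # names and keywords such as int, str, print
    ("OP",      r"[=+();]"),       # operators and punctuation
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens one by one; raise SyntaxError on an unknown character."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup not in ("COMMENT", "SKIP"):
            yield Token(match.lastgroup, match.group())
```

For example, `list(tokenize("b=a+a+10; # sum"))` yields eight tokens: the names and operators of the statement, the integer `10`, and the closing semicolon, with the comment dropped.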

I am inclined toward something in between an interpreter and a compiler: a programming language that is translated into bytecode for a virtual machine, which also has to be written.
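
To give a feel for what "bytecode for a virtual machine" might mean here, below is a minimal sketch of a stack-based virtual machine. Everything in it is an assumption for illustration: the opcodes (PUSH, LOAD, STORE, ADD, PRINT), the tuple encoding of instructions, and the `run` function are hypothetical and not the instruction set this series will necessarily end up with.

```python
from typing import Any, Dict, List, Tuple

Instruction = Tuple[str, Any]  # (opcode, argument); argument is None when unused


def run(program: List[Instruction]) -> List[str]:
    """Execute hypothetical stack-machine bytecode and return the printed lines."""
    stack: List[Any] = []
    variables: Dict[str, Any] = {}
    output: List[str] = []
    for opcode, arg in program:
        if opcode == "PUSH":        # push a constant onto the stack
            stack.append(arg)
        elif opcode == "LOAD":      # push the value of a variable
            stack.append(variables[arg])
        elif opcode == "STORE":     # pop a value into a variable
            variables[arg] = stack.pop()
        elif opcode == "ADD":       # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == "PRINT":     # pop a value and record it as output
            output.append(str(stack.pop()))
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return output


# Hand-written bytecode for:  int a = 10;  b = a + a + 10;  print(b);
program = [
    ("PUSH", 10), ("STORE", "a"),
    ("LOAD", "a"), ("LOAD", "a"), ("ADD", None),
    ("PUSH", 10), ("ADD", None),
    ("STORE", "b"),
    ("LOAD", "b"), ("PRINT", None),
]
print(run(program))  # ['30']
```

In this sketch the translator's job would be to produce a list like `program` from the source text, and the virtual machine's job would be to execute it.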

A bit about the implementation


1. The translator will be implemented in Python. Why Python? Because I know it better than any other language. In addition, its dynamic typing (there are no type declarations) should reduce the number of variables needed when writing the code.
2. Python will also be used to implement the virtual machine.
3. PyInstaller will be used to build the project, since it can pack everything into a single file. At the moment it can produce builds for both Linux and Windows without any trouble (each build is made on its own platform, since PyInstaller is not a cross-compiler).

Now to practice


I propose that we set ourselves a minimum goal; once it is reached, we will consider the task conditionally complete and need not go any further. To that end, let's define a minimal language syntax:

1. There are single-line comments that start with a hash sign (#) and continue to the end of the line.
2. There are two data types (integer and string).
3. It is possible to print information to the screen.
4. It is possible to read values from the keyboard.

Let's write a simple program in our new language, taking into account the rules that we just formulated:

	int a = 10;
	int b;
	str c;
	str name;
	c = 'Hello!!!';
	b = a + a + 10;
	print(a);
	print(b);
	print(c);
	input(name);
	c = 'Hello ' + name;
	print(c);

Well, that is really all: a simple program that demonstrates the capabilities of the newly invented language. I think this is a good place to stop.

In the next part, we will start building our reinvented wheel, which will be able to execute the code above.
