oberon87 March 10, 2015 at 19:17

Oberon Interpreter on Go

From the sandbox

Introduction

History moves in a spiral. Go authors, having embarked on the shoulders of the reputable Google corporation, were able to bring their ideal programming language to life, as Niklaus Wirth once brought Oberon into this world, fighting with its own unwavering and authoritative tendency to the emerging trend of complication and extensive development of mainstream languages.

Today, the language of Oberon is forgotten, but the case of Wirth manifested itself in the Go language completely unexpectedly. In some aspects of the implementation, the similarity of approaches is not in doubt. Of course, each language is unique in its own way and there is no need to make another language from one language. Time does not turn back. But at different times, people sought to preserve technologies that were a thing of the past. What are carefully stored emulators of the Spectrum and MSX, museums of old computers in different countries. Emulation, as one of the ways to stop the moment, has firmly established itself.

It is generally accepted that the Go language found itself in the niche of network services. But we will go the other way.

Task

History of the language Oberon has several variants of the language. N. Wirth created each edition for a specific project, its languages were always created for a task based on its goals. Oberon-2 was created as a variant of Oberon with a developed mechanism of data types, realizing the concept of OOP. The students of N. Wirth decided that it was Oberon-2 that was suitable for industrial programming. They supplemented the language with features, expanded the type system to compatible with the increasingly popular Java, implemented the idea of true modularity and released the BlackBox product with the language Component Pascal (KP) inside.

As time passed, corporations from across the ocean entered the market more successfully and the BlackBox went into the shadows, remaining the object of research by a small group of enthusiasts, including from Russia and the CIS countries.

One of the principles of KP is modularity, the ability to change the implementation of a component without changing its interface (as applied to objects, this is called pimpl). And so I thought, how does a particular framework differ from component implementation. And the description of the language is His interface ... Simply put, it occurred to me to implement an interpreter with the functions of the framework. That is, with support for modularity, as described in the language post.

Investigation

A feature of the BlackBox framework is the dynamic linking of modules in the process memory. As you know, the process of creating a program from source code consists of several stages. In different implementations of various compilers, these stages can be hidden, skipped, and primitivized. Linking is the reduction of machine code to a form acceptable for direct execution by the processor. For example, in Go, the link is static, at the compilation stage. And in BlackBox it’s dynamic, at the stage of loading the module’s body into memory. An interesting feature of this solution is the independence of the code from the operating system, when the same machine code is executed in different operating systems without recompilation, provided that the architecture is the same (x86, for example).

However, the BlackBox compiler provides another possibility - changing the code generator for a specific architecture. This result is achieved by using the AST structure inside the compiler - an abstract syntax tree for platform-independent preservation of the structure of programs and data. It was possible to find out that the BlackBox was originally written for the Mac + PowerPC platform, and only then rewritten for Wintel. At the same time, the compiler remained almost unchanged. The removable backend is not surprising today, but in the 90s it was unusual and new.

Thus, these two features allowed me to create my backend. Without much effort, I translated the AST into XML format, more precisely, in graphml, with the aim of a beautiful visualization of the interpretation process. The plans were huge.

Decision

For implementation, I chose the Go language, although I could choose Java, C #, Dart, and others. Why? There are several reasons, the main one - information that Go by its nature is similar to Oberon, cuts off all unnecessary from the process of expression of the subject area. To take even the exceptions, these light legalized supposedly harmless analogues of the GOTO operator.

Great reason to learn Go.

The project I called simply: Framework .

A brief excursion into the description of the language - and into battle. Deserializing xml into structures using tags is great. Data hiding in packages is excellent, we apply our favorite pimpl. Duck typing with interfaces was confusing at first, but after a couple of bumps, understanding came. The main trap here is comparing opposites with the same set of methods. This set can be formed historically, and even an experienced developer may find that the type A switch tips to process type B and naturally falls.

So, AST is loaded into memory, the main nodes are described, interpretation rules are known. As it turned out, there are many types of nodes. There are even more data types. One of the main principles that I adopted was the principle of let it fall, in any incomprehensible situation, the program ends the work. Of course, the home project allowed such liberties.

Interpreting algorithms is interesting. KP language AST consists of statements, expressions and pointers to objects (statement, expression, designator). Operators follow each other, change objects, change the program flow depending on subordinate nodes (expressions or pointers). In this case, there are two subordinate nodes - left and right. There are also additional nodes, such as link and object.

First, let's choose a data crawl scheme - I suggested that you don’t need to invent anything, everything has already been invented, so I used the stack to process nodes in depth and non-recursively traverse nodes from one expression to another. Each node in the stack corresponds to a frame in which the result of processing the child nodes and instructions from the parent nodes are stored. The specific actions for a particular node are the main part of the interpretation of the program. A program is executed while statements that control the flow of execution add one or more of the following statements to the stack for execution.

As soon as the stack is empty or the result of the iteration returns an error code, the program stops. Nothing complicated.
To store data in a high-level Go language environment, you can describe various types of data, including primitive ones. In this, I saw another advantage of the Go language for the task of implementing the interpreter.

A complete list of the rules by which the AST tree is compiled can be seen here .

Example

I will briefly describe the process of interpreting the nodes of a simple program.

MODULE XevDemo2;
VAR
	i: INTEGER;
	PROCEDURE Init;
	BEGIN
		i:=1;
	END Init;
	PROCEDURE Stop;
	BEGIN
		i:=0;
	END Stop;
BEGIN
	Init
CLOSE
	Stop
END XevDemo2.

The program tree is loaded into memory. The EnterNode node for entering the program (BEGIN) is detected, and the interpreter is aiming at it.
The interpreter pushes the current node (EnterNode) onto the stack. EnterNode gives the command to place the variable i, then pushes the CallNode node onto the stack, that is, a call to the Init procedure.
The current CallNode node pushes the EnterNode node of the Init procedure onto the stack.
The EnterNode node does not host data because there are no local variables in the procedure. The next to the execution stack is the AssignNode assignment node.
The assignment node must calculate the pointer to the variable object and the value on the right. To do this, it pushes the VariableNode node onto the stack.
VariableNode returns the address of its object in the frame data.
An assignment node pushes a constant node onto the stack. Although it would be possible to perform optimization for a constant, but to understand the process, I made the processing of all expression nodes uniform.
The constant node simply puts its value on the frame.
The assignment node takes the results of the left and right parts and updates the data in the variable's memory.
Since there is no operator behind the assignment node, the frame is removed from the stack, control passes to the EnterNode node.
The EnterNode node, upon shutdown, clears the procedure data, since it is not needed and returns control. Its frame is also removed from the stack.
Next, the CLOSE section works similarly to the BEGIN section. In multi-module systems, the CLOSE section has the important function of shutting down the module as a whole.
The interpretation process ends.

Results and Prospects

At the moment, the interpreter supports a full set of KP language operators. The type system, inheritance, extension of methods and so on are also supported, I did not mention them so as not to clutter up the article.

There are separate problems in managing complex data structures, such as an array of record arrays and so on. But this is already a matter of time - carefully check the operation algorithms and correct errors.

The initial plans to create specialized DSL from the language of the CP had to be postponed for different occasions. It turned out that even the simple task of interpreting the KP language, which is not the most difficult from the point of view of features, encounters the problem of the complexity of the subject area, when a bunch of implementation nuances must exist and work together. Here it is worth shaking hands with the authors of the JVM, you can only imagine what tasks they solved in the process of developing the environment, especially in the early stages.

I added the Go language to the treasury of knowledge, a bunch of materials about programming languages, interpretation, modifiable semantics of the language, and so on. At the end of the work, I realized that the abstract syntax tree in its current form can be applicable not only to KP, but also to similar imperative languages. After completing the main work, I learned that the Lua interpreter for Go was released.

In a sense, history again moved in a spiral, it turned out that the KP language already in the 90s was an extract of the basic ideas of imperative programming, which, with a fair wind, could radically change everything that we are dealing with now.

One can only reflect on how, like the butterfly effect, jokes about “Pascal for learning” and “ridiculous syntax” changed the future of the industry for decades to come.

Tags: