KonstantinSolomatov August 3, 2009 at 18:55

How JetBrains MPS allows you to achieve greater use of DSLs (domain specific languages)

DSLs (domain-specific languages) have been known to programmers for a long time. Despite this, they are rarely used in real systems. This article examines what DSLs are and why they are not widely used, and describes how JetBrains MPS solves the problems that prevent their widespread use.

So what is a DSL? A DSL is a language created for solving problems in a specific subject area. Most DSLs are declarative languages with a narrow scope. Examples include SQL, regular expressions, XPath, Prolog, and Excel formulas. Unfortunately, this is where the list of commonly used DSLs ends. The main advantage of such languages is that, because their constructs are close to the subject area, code written in them is very clear and concise. Moreover, you do not have to be a programmer to edit such code: a person who understands the subject area can write it by relying on that knowledge.

In theory, everything looks simple: we take a subject area, say, accounting, write a language for it, and either give it to domain experts or use it ourselves to describe the subject area concisely. Unfortunately, this approach is not widespread. Let's see why.
Why are DSLs not widespread?

Let's look at the reasons why DSLs are not widely used. The first reason is that authors of such languages focus on closed languages for a single specific problem, rather than extending existing general-purpose programming languages such as Java, PHP, or JavaScript. There is a reason for this: ambiguities can appear when such extensions are combined with each other. Another reason is the difficulty of creating the infrastructure needed to implement a language and to work with it comfortably.

Most of the efforts in the DSL community are focused on working with closed languages. In many cases, it would be more useful to add new constructs to existing languages, such as Java. Imagine that you can use language extensions the same way you use libraries now. With this approach, developers will be able to simultaneously use the expressive power of the DSL and the versatility of languages such as Java, which is impossible when using existing technologies.

To use extensions the same way we use libraries now, the languages must be compatible with each other. Suppose we add one extension to our language, for example support for money (a money type and literals like $10 or 100r), and another extension that adds mathematical notation (sums, products, etc.). Compatibility means that we can use them together, even if they were created by different authors.

Unfortunately, all popular general-purpose programming languages are based on text grammars. These grammars have one unpleasant property: they can be ambiguous, meaning that the same line may have several interpretations. Moreover, if we add new constructs to Java with extension A and make the resulting grammar unambiguous, and then do the same with extension B, the grammar of Java with both extensions may still turn out to be ambiguous.

Let's look at an example. Suppose two companies decide to add string interpolation to Java (string interpolation lets you write expressions inside string literals). The first company uses this syntax:
"{2 + 3}"
The second uses this one:
"${2 + 3}"
If we use both extensions at the same time and write the following program:
"Account balance is ${account.getBalance()}",
then its interpretation is ambiguous. Is $ part of the interpolation syntax or part of the string literal? The example is somewhat artificial, but it illustrates the general problem of ambiguity, which arises whenever different constructs have similar syntax.
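
The ambiguity can be reproduced with a small sketch. The Python below is only an illustration: the two functions and the `env` lookup table are hypothetical stand-ins for the two companies' extensions, not real Java or MPS features, and variable names are looked up instead of evaluating full expressions. Both rules are applied to the same literal and give different results.

```python
import re

def interpolate_braces(text, env):
    """Hypothetical extension A: {name} is replaced by env[name]."""
    return re.sub(r"\{(\w+)\}", lambda m: str(env[m.group(1)]), text)

def interpolate_dollar(text, env):
    """Hypothetical extension B: ${name} is replaced by env[name]."""
    return re.sub(r"\$\{(\w+)\}", lambda m: str(env[m.group(1)]), text)

literal = "Account balance is ${balance}"
env = {"balance": 100}

# Under rule A the '$' is ordinary text and survives in the result.
as_a = interpolate_braces(literal, env)
# Under rule B the '$' is part of the syntax and is consumed.
as_b = interpolate_dollar(literal, env)

print(as_a)
print(as_b)
```

Each rule is unambiguous on its own; the conflict appears only when both are active on the same text and nothing says which one should win.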

To be highly productive, developers need intelligent development tools. Having become used to intelligent editors such as IntelliJ IDEA or Eclipse, developers find it hard to go back to editing text in plain editors. Plain text editors do not highlight errors, do not provide contextual help, do not show menus of available options, and do not support refactoring. There are frameworks for creating intelligent editors, for example the IntelliJ IDEA Language API, Xtext, and Oslo, but none of them supports extensible languages properly. Even if we do not need extensibility, creating language support with these tools requires good knowledge of programming languages and takes a lot of time.

Let's summarize: people are building the wrong kind of DSL. To increase productivity, existing general-purpose programming languages need to be extended. Creating such extensions is difficult because widespread technologies do not support compatibility of extensions with each other.
How JetBrains MPS solves these problems

Let us now look at how MPS solves these problems. To keep extensions compatible with each other, MPS does not treat programs as text. Instead, MPS stores them as a syntax tree, and editing happens directly on that tree, with no intermediate text. This approach also significantly simplifies the creation of an IDE, since the constant presence of a syntax tree makes it easy to implement error highlighting, code completion, context hints, and so on.

MPS solves the ambiguity problem in a radical way: if there is no text grammar, there is no ambiguity either. This does not mean that MPS does without grammars, however. Instead of a concrete syntax, a language definition in MPS specifies an abstract syntax (the structure of the syntax tree). If you are familiar with XML, you probably know XML Schema, which resembles the way syntax is described in MPS.
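
To make "abstract syntax" more concrete, here is a minimal sketch in Python. It is not the MPS structure language: the `Concept` and `Node` classes, the concept names, and the `check` function are all hypothetical. It only shows the idea of declaring which children a node may have and validating a tree against that declaration, much as an XML Schema validates a document.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Concept:
    """Hypothetical declaration of one kind of tree node."""
    name: str
    extends: Optional[str] = None
    children: dict = field(default_factory=dict)  # role -> required concept

@dataclass
class Node:
    """A node of the program tree; `concept` names its declaration."""
    concept: str
    children: dict = field(default_factory=dict)  # role -> Node
    value: object = None

CONCEPTS = {c.name: c for c in [
    Concept("Expression"),
    Concept("IntLiteral", extends="Expression"),
    Concept("Plus", extends="Expression",
            children={"left": "Expression", "right": "Expression"}),
]}

def is_a(concept, expected):
    """True if `concept` is `expected` or extends it, directly or not."""
    while concept is not None:
        if concept == expected:
            return True
        concept = CONCEPTS[concept].extends
    return False

def check(node):
    """Validate that every declared role holds a node of an allowed concept."""
    for role, expected in CONCEPTS[node.concept].children.items():
        child = node.children.get(role)
        if child is None or not is_a(child.concept, expected):
            return False
        if not check(child):
            return False
    return True

tree = Node("Plus", {"left": Node("IntLiteral", value=2),
                     "right": Node("IntLiteral", value=3)})
broken = Node("Plus", {"left": Node("IntLiteral", value=2)})  # no right operand

print(check(tree), check(broken))
```

Note that nothing here says how `Plus` is written on screen; the declaration constrains only the shape of the tree.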

Since ambiguities are not possible with this approach, languages can easily be combined with each other. You can extend the syntax of a language with new constructs. You can embed code in one language inside code in another, or even embed code in a general-purpose programming language inside a relatively closed DSL. This means that languages are compatible with each other, which allows languages and their parts to be reused, something that traditional technologies make possible only with great difficulty. At JetBrains, we experimented a lot with such reuse. The MPS distribution includes a large number of Java extensions:

collections language, which makes it easier to work with collections
dates language, which adds date support directly in Java
math language, which allows you to write sums, products, and other mathematical constructs
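
The compatibility claim can be sketched as follows. This is a hypothetical Python illustration, not MPS code: the money and math "extensions" are simply two sets of node classes written independently against a shared `Expr` base. Because the program is a tree of typed nodes rather than text, nothing has to be parsed, so the two sets can be mixed without either knowing about the other.

```python
from dataclasses import dataclass
from decimal import Decimal

class Expr:
    """Base of all expression nodes (stands in for the host language)."""
    def evaluate(self):
        raise NotImplementedError

@dataclass
class Plus(Expr):
    left: Expr
    right: Expr
    def evaluate(self):
        return self.left.evaluate() + self.right.evaluate()

# --- hypothetical money extension, written by author A ---
@dataclass(frozen=True)
class Money:
    amount: Decimal
    currency: str
    def __add__(self, other):
        if not isinstance(other, Money) or other.currency != self.currency:
            raise TypeError("can only add money of the same currency")
        return Money(self.amount + other.amount, self.currency)

@dataclass
class MoneyLiteral(Expr):
    amount: str
    currency: str
    def evaluate(self):
        return Money(Decimal(self.amount), self.currency)

# --- hypothetical math extension, written by author B ---
@dataclass
class Sum(Expr):
    terms: list  # assumed to contain at least one term
    def evaluate(self):
        values = [t.evaluate() for t in self.terms]
        total = values[0]
        for v in values[1:]:
            total = total + v
        return total

# One tree that mixes nodes from the host language and both extensions.
program = Sum([MoneyLiteral("10", "USD"),
               Plus(MoneyLiteral("2.50", "USD"), MoneyLiteral("0.50", "USD"))])
result = program.evaluate()
print(result)
```

The sketch covers only the structural side. A real language workbench also has to combine editors, type systems, and generators, which this example does not attempt.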

Most of the languages that we use to define languages are themselves extensible and embed Java. For example, one of the constructs of the type system language, an inference rule, looks like an ordinary DSL. At the same time, inside such a rule you can write Java extended with constructs specific to type systems.

Since we got rid of the text representation, we cannot use a regular text editor. To work with code, we use a special projectional editor. For each node of the syntax tree, it creates a projection: a part of the screen with which the user can interact. When developing MPS, great effort went into making this editor behave as much like a text editor as possible. For example, if you type 1 + 2 + 3 in MPS, you get the same syntax tree that parsing this line in Java would produce. Of course, a projectional editor differs from a text editor: some things are possible in one and impossible in the other, and vice versa. Despite this, you can get used to these differences without losing productivity.
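
A projection can be thought of as a function from a tree node to something displayable. The Python sketch below is hypothetical and far simpler than the MPS editor (it is read-only, for one thing). It builds the left-associative tree for 1 + 2 + 3 and renders that one tree in two different ways.

```python
from dataclasses import dataclass

@dataclass
class Num:
    value: int

@dataclass
class Plus:
    left: object
    right: object

def project_text(node):
    """Render the tree as familiar infix text."""
    if isinstance(node, Num):
        return str(node.value)
    return f"{project_text(node.left)} + {project_text(node.right)}"

def project_outline(node, depth=0):
    """Render the same tree as an indented outline, one node per line."""
    pad = "  " * depth
    if isinstance(node, Num):
        return [f"{pad}Num {node.value}"]
    return ([f"{pad}Plus"]
            + project_outline(node.left, depth + 1)
            + project_outline(node.right, depth + 1))

# (1 + 2) + 3: the tree a Java parser would build for "1 + 2 + 3".
tree = Plus(Plus(Num(1), Num(2)), Num(3))

print(project_text(tree))
print("\n".join(project_outline(tree)))
```

The tree, not either rendering, is the stored program. The infix rendering is deliberately lossy here (a right-nested tree would print the same way), which is harmless only because nothing ever parses it back.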

Supporting smart editing is much simpler when you work with the syntax tree directly. Moreover, in many places the MPS IDE provides intelligent features without any effort from the language author. Features such as code completion, find usages, and rename work automatically. When IntelliJ IDEA was developed, intelligent editing support was implemented for many languages, and each implementation required a lot of effort: several months per language. With MPS, similar capabilities can be implemented in a matter of days. This is possible because languages are developed using special languages that configure the existing language infrastructure. MPS is not just an editor: you can create a full-fledged IDE with it.
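
One reason such features come cheaply on a tree can be shown with a small sketch. The Python below is a hypothetical illustration, not a description of how MPS implements them: a reference node holds a link to its declaration node instead of a name, so renaming is a single assignment and finding usages is a walk over the tree comparing node identity.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class VarDecl:
    name: str

@dataclass(eq=False)
class VarRef:
    target: VarDecl  # a link to the declaration node, not a copy of its name

    @property
    def name(self):
        # The displayed name is always read from the declaration.
        return self.target.name

@dataclass(eq=False)
class Block:
    statements: list = field(default_factory=list)

def find_usages(root, decl):
    """Return every reference in `root` that points at `decl`."""
    return [s for s in root.statements
            if isinstance(s, VarRef) and s.target is decl]

total = VarDecl("total")
other = VarDecl("other")
block = Block([total, other, VarRef(total), VarRef(other), VarRef(total)])

usages = find_usages(block, total)
total.name = "sum"  # rename: no text search, no risk of touching `other`

print(len(usages), [u.name for u in usages])
```

In a text-based tool the same operations first require parsing and name resolution; here the resolved links are part of the stored program.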

Inside JetBrains, we use MPS to develop commercial projects. Our new bug tracking system, codenamed Charisma, is built entirely with MPS, and this is just the beginning.
Conclusion

Two problems hinder the widespread use of DSLs: the inability to reuse them in systems based on text grammars, and the difficulty of creating intelligent tools for working with them. MPS solves both problems by working with the syntax tree directly, without an intermediate textual representation, and by providing the infrastructure for creating intelligent tools for such languages.

MPS 1.0 was released in July. Most of the code is available under the Apache 2.0 license (with the exception of the JetBrains IDE Framework, whose license allows you to use it in MPS-based products without buying any license from JetBrains).

You can download MPS from here: www.jetbrains.com/mps and start creating languages today.

P.S. We are looking for a senior developer for the project. Details are in the vacancies listed in my profile.
