Introduction to DSL. Part 1 - Design and Coding Issues

    For several decades, the challenge has been to find a repeatable, predictable process or methodology that would improve the productivity, quality and reliability of software development. Some tried to systematize and formalize this apparently unpredictable process. Others applied project management and software engineering methods to it. Still others believed that without constant monitoring by the customer, software development gets out of control, which increases both time and financial costs.
    Computer science as a scientific discipline offers a technology for developing reliable software. It is based on structured programming, combined with testing and with verification by proof-based programming methods, which allow the correctness of algorithms to be analyzed systematically and programs to be developed without algorithmic errors.
    This methodology is aimed at solving problems on computers. It resembles the approach to developing algorithms and programs that domestic students and programmers use at programming olympiads, and it relies on testing and on the structured pseudocode that IBM has used for documenting programs since the 1970s.
    The structured design methodology can be applied with various programming languages and tools to develop reliable programs for any purpose.
    However, the classical approach to development gives rise to the following problems:

    1. Lack of transparency. At any given time, it is difficult to say what state the project is in and what percentage of it is complete. This problem arises when the structure (or architecture) of the future software product is insufficiently planned, which most often results from insufficient project funding or from the low qualifications of the developers.
    2. Lack of control. Without an accurate assessment of the development process, schedules slip and budgets are exceeded. It is difficult to estimate how much work has been completed and how much remains. This problem arises when a project that is more than half complete continues to be developed with additional funding but without an assessment of its degree of completion.
    3. Lack of monitoring. When the development of a project cannot be monitored, its progress cannot be tracked in real time. With suitable tools, project managers can make decisions based on real-time data. This problem arises when the cost of training managers to use such tools is comparable to the cost of developing the program itself.
    4. Uncontrolled changes. The customer constantly has new ideas about the software being developed. Changes can significantly affect the architecture of the project, so it is important to evaluate proposed changes and implement only the approved ones, controlling this process with software tools. This problem arises when the end user is unwilling to use certain software environments. For example, when a client-server system is being created, the customer may impose requirements not only on the operating system of the client computers but also on that of the server.
    5. Lack of reliability. The most difficult process is finding and fixing errors in computer programs. Since the number of errors in a program is not known in advance, the duration of debugging is not known in advance either, and there is no guarantee that the program is free of errors. It should be noted that a proof-based approach to software design makes it possible to detect errors in a program before it is executed. Professor Wirth, in developing Pascal and Oberon, sought through the rigor of their syntax to make the completeness and correctness of programs written in these languages mathematically provable. Donald Knuth made a particularly significant contribution to the discipline of programming. His multi-volume “The Art of Computer Programming” is essential reading for every serious programmer.
    6. Lack of guarantees of program quality and reliability, because the absence of errors in software products cannot be ensured before the programs are formally delivered to customers.

    To solve the problems discussed above, it is proposed to introduce the following innovations:
    1. Systematic reuse. The most important approach is to identify product families whose components vary. Product lines are then developed on the basis of these families. Products designed as members of a family reuse requirements, architecture, frameworks, components, texts, etc.
    2. Assembly automation. It facilitates the assembly of independently developed components. Automating assembly brings a number of innovations:
      • platform-independent protocols;
      • self-description (reduces architectural mismatches by means of contracts and specifications);
      • deferred encapsulation (reduces architectural mismatches by weaving adaptations into published components);
      • architecture-driven development (the software architecture makes it possible to predict the operational qualities of the software).

    3. Model-driven development (MDD). This approach proposes using the model as source code rather than as documentation. For this, the model must be precise, and the modeling language must be precise and designed for a specific purpose. A modeling language is a system created for specifying model-based programs. It raises the level of abstraction and expresses the implementation in the vocabulary of the domain.
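    To make the idea of "the model as source code" more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `MODEL` dictionary, the `Invoice` entity and the helper functions are hypothetical and do not come from any particular MDD tool. The model is the only thing a person edits; the class is generated from it.

```python
# Hypothetical model: the entity name and field types are illustrative only.
MODEL = {
    "entity": "Invoice",
    "fields": [("number", "str"), ("amount", "float"), ("paid", "bool")],
}


def generate_class_source(model: dict) -> str:
    """Translate the model into Python source code for a dataclass."""
    lines = [
        "from dataclasses import dataclass",
        "",
        "",
        "@dataclass",
        f"class {model['entity']}:",
    ]
    for name, type_name in model["fields"]:
        lines.append(f"    {name}: {type_name}")
    return "\n".join(lines) + "\n"


def build_class(model: dict) -> type:
    """Execute the generated source and return the resulting class."""
    namespace: dict = {}
    exec(generate_class_source(model), namespace)
    return namespace[model["entity"]]


Invoice = build_class(MODEL)
invoice = Invoice(number="A-1", amount=99.5, paid=False)
print(generate_class_source(MODEL))
print(invoice)
```

    Adding a field to `MODEL` changes the generated class without anyone touching the implementation code, which is the property the approach relies on. A real generator would also validate the model before emitting code.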

    Technologies for modeling domain knowledge

    Application Programming Interface (API) - a group of system services focused on solving common problems.
    Component technologies - a set of software modules with a standardized interface, focused on solving common problems.
    Architectural patterns - design decisions that describe the architecture of a software system based on some concept.
    GoF patterns - design solutions that describe aspects of implementing a software system in order to solve specific programming problems.
    XML-based languages - a structured description of data and of the mechanisms for transforming it.
    SQL - a structured query language for DBMSs.
    Ontology - a representation of a field of knowledge or a part of the real world, used for the semantic analysis of texts.
    A domain-specific language (DSL) models the concepts identified in a particular domain. A well-designed DSL is a powerful modeling language that has a higher degree of unambiguity than a general-purpose modeling language.
    A DSL is a programming language designed specifically to solve a certain range of problems, in contrast to general-purpose programming languages. There are three main types of DSL:
    • internal DSL: a DSL written in the main language of the software application;
    • external DSL: a DSL written in a language different from the main language of the software application;
    • language workbench: an integrated development environment for creating DSLs.

    Figure: the three main types of DSL.

    An external DSL uses constructs that are separate from the syntax of the main language and often resemble natural language. It requires its own compiler, interpreter or postprocessor, so, unlike an internal DSL, it is processed in a separate translation step. External DSLs often have their own custom syntax, but in many cases they borrow the syntax of another language, such as XML tags, as a general-purpose alternative. Unix systems have traditionally used the style of "little languages". Early examples of external DSLs include regular expressions, SQL and awk, as well as the XML configuration formats used in frameworks such as Struts and Hibernate. The biggest advantage of an external DSL is that it can be written in whatever way the developer wishes. In other words, you can express the subject area in the simplest, most readable and most editable form. The format of such a DSL is limited only by the ability to create a translator that can read the DSL file and produce executable code in the main language of the application. This is also the main drawback of external DSLs: a translator has to be created.
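    As a rough illustration of what such a translator involves, here is a small sketch in Python. The rule syntax (`discount N% when total over M`), the `DiscountRule` class and the function names are all invented for this example. The DSL text is kept in a string for brevity, whereas in practice it would live in its own file.

```python
import re
from dataclasses import dataclass

# Hypothetical DSL text with an illustrative, made-up syntax.
RULES_TEXT = """
# pricing rules
discount 10% when total over 100
discount 5% when total over 50
"""

RULE_PATTERN = re.compile(r"^discount (\d+)% when total over (\d+)$")


@dataclass(frozen=True)
class DiscountRule:
    percent: int
    threshold: int


def parse_rules(text: str) -> list[DiscountRule]:
    """Translate DSL text into rule objects, rejecting unknown lines."""
    rules = []
    for line_number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = RULE_PATTERN.match(line)
        if match is None:
            raise SyntaxError(f"line {line_number}: cannot parse {line!r}")
        rules.append(DiscountRule(int(match.group(1)), int(match.group(2))))
    return rules


def best_discount(rules: list[DiscountRule], total: float) -> int:
    """Return the largest discount percent whose threshold is exceeded."""
    applicable = [rule.percent for rule in rules if total > rule.threshold]
    return max(applicable, default=0)


rules = parse_rules(RULES_TEXT)
print(best_discount(rules, 120))
```

    Even for a language this small, the translator needs tokenization (here a single regular expression), error reporting and a target representation, which is the cost the paragraph above refers to.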
    An internal DSL uses a subset of the syntactic constructs of a general-purpose programming language to express certain aspects of an application in a form close to natural language. It does not require a third-party compiler: it is executed together with the main program code written in the general-purpose language. Internal DSLs are used in some general-purpose languages to extend the capabilities of programs, but they are limited to a substantially restricted subset of constructs. Lisp and Ruby are classic examples of host languages for internal DSLs.
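    For comparison, an internal DSL can be sketched with nothing more than ordinary method calls in the host language. The example below uses Python and method chaining (a fluent interface); the `Query` class and the `from_table` function are hypothetical names chosen for illustration, not part of any real library.

```python
class Query:
    """Expression builder: each method returns self so calls can be chained."""

    def __init__(self, table: str) -> None:
        self._table = table
        self._columns: list[str] = []
        self._conditions: list[str] = []

    def select(self, *columns: str) -> "Query":
        self._columns.extend(columns)
        return self

    def where(self, condition: str) -> "Query":
        self._conditions.append(condition)
        return self

    def to_sql(self) -> str:
        columns = ", ".join(self._columns) or "*"
        sql = f"SELECT {columns} FROM {self._table}"
        if self._conditions:
            sql += " WHERE " + " AND ".join(self._conditions)
        return sql


def from_table(name: str) -> Query:
    """Entry point of the DSL; the name mimics the domain vocabulary."""
    return Query(name)


sql = (
    from_table("orders")
    .select("id", "total")
    .where("total > 100")
    .where("status = 'open'")
    .to_sql()
)
print(sql)
```

    No separate translator is needed, because the host interpreter runs the chain directly. The price is that the notation can only be as flexible as the host language's syntax allows.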
    A language workbench is an integrated development environment (IDE) for creating DSLs. It provides editors and generators for defining the abstract syntax of a language, similar to modern IDEs for program development.
    In general, DSLs, supplemented by metaprogramming techniques, are an effective means of automating software development and are now widely used in information technology.
    The generalized algorithm for developing a new DSL is as follows:
    1. Define the syntax in terms of the implementation language.
    2. Use DSL patterns to implement the new DSL.
    3. Use metaprogramming tools to implement the DSL within the source language.
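    The three steps above can be sketched as follows, assuming Python as the implementation language. The syntax is defined through Python's comparison and `&` / `|` operators (step 1), conditions are assembled with an expression-builder pattern (step 2), and operator overloading together with `__getattr__` serves as the metaprogramming hook (step 3). The names `Condition`, `Field` and `order` are hypothetical and chosen only for this example.

```python
import operator


class Condition:
    """A predicate over a record (a dict) that composes with & and |."""

    def __init__(self, check, text: str) -> None:
        self._check = check
        self.text = text

    def __call__(self, record: dict) -> bool:
        return self._check(record)

    def __and__(self, other: "Condition") -> "Condition":
        return Condition(lambda r: self(r) and other(r),
                         f"({self.text} and {other.text})")

    def __or__(self, other: "Condition") -> "Condition":
        return Condition(lambda r: self(r) or other(r),
                         f"({self.text} or {other.text})")


class Field:
    """Stands for a record field; comparisons build Condition objects."""

    def __init__(self, name: str) -> None:
        self.name = name

    def _compare(self, op, symbol: str, value) -> Condition:
        return Condition(lambda r: op(r[self.name], value),
                         f"{self.name} {symbol} {value!r}")

    def __gt__(self, value) -> Condition:
        return self._compare(operator.gt, ">", value)

    def __lt__(self, value) -> Condition:
        return self._compare(operator.lt, "<", value)

    def __eq__(self, value) -> Condition:  # type: ignore[override]
        return self._compare(operator.eq, "==", value)


class _Fields:
    """Metaprogramming hook: any attribute access yields a Field."""

    def __getattr__(self, name: str) -> Field:
        return Field(name)


order = _Fields()

# A rule written in the DSL; the parentheses are required by Python's
# operator precedence, since & binds tighter than comparisons.
rule = (order.total > 100) & (order.status == "open")
print(rule.text)
print(rule({"total": 150, "status": "open"}))
```

    The required parentheses are a small example of the limits of an internal DSL: the notation has to fit the precedence rules of the host language.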
    Using a DSL has several advantages:
    • at the design stage, a DSL makes it possible to express solutions in terms of the subject area, so that specialists in that area can create and modify DSL programs;
    • when designing for a similar subject area, an existing DSL can be reused;
    • a DSL solves a domain problem at an appropriate level of abstraction, which allows subject matter experts to understand and verify DSL programs;
    • programs written in a DSL are concise, and because they use domain terms they remain easy to read later;
    • reliability, efficiency and maintainability improve: operations are easier to carry out at the model level, so they are more efficient and less error-prone than the same operations at the code level;
    • a DSL allows optimization and validation at the level of abstraction that corresponds to the domain;
    • a domain description at one level of abstraction can later be converted to a lower, more detailed level, so the model can be refined at different stages of development.

    The disadvantages of use include the following:
    • the cost of design, implementation and maintenance is quite high;
    • the need for user training;
    • the scope of a DSL is quite difficult to determine;
    • the difficulty of maintaining a balance between the constructs used in DSL and the constructs of a general-purpose programming language.

    This concludes the first (introductory) part.
    Thank you to everyone who read to the end. I would like to hear your opinions about the problems of design in general, and about domain-specific languages in particular as one way to address the difficulties that arise.
