Design and architecture in FP. Introduction and Part 1

Published on February 10, 2014

Design and architecture in FP. Introduction and Part 1

  • Tutorial

Introduction


In the world of functional programming, there is one big gap, namely, the topic of the high-level design of large applications is almost not covered. I decided for myself to study this issue. Are there any significant differences in application design in the FP world from that in the imperative world? What is a canonical FP code? What are the development idioms? Does it make sense to talk about design patterns when applied to FP ? These and other important issues are often erupt there , here , but for the time being I do not know a single book, similar to the Gang of Four book. Someone has probably repeated my research, but all the better: similar results will confirm the correctness, others will indicate a place in the theory that needs to be improved.

About Material

The basis of these articles is experience in developing the game "The Amoeba World". The language used is Haskell. I originally made the game as part of the Ludum Dare contest # 27, but the goal was to prototype the infrastructure for another game, a larger one, whose name is Big Space. “The Amoeba World” is just a “cat”, on which I explore approaches in a “functional” game dev. Competitively-oriented game has become a good launching pad for further thought.

The code for "The Amoeba World" is here . The Big Space documentation is here . By the way, a long time ago I was already doing something similar. It was the “Haskell Quest Tutorial", as part of which I, um, also developed the game. If you can call it that. But it was a very simple low-level material for studying the Haskell language, but here we will talk about completely different matters. Accordingly, it is assumed that the reader is sufficiently familiar with the FP and the Haskell programming language.

Plan

At least the following topics should be addressed further:

  1. Part 1 . FP application architecture. Downstream design in FP. Fighting software complexity.
  2. Part 2 . Upstream design in FP. Idea is the foundation of good design. Antipatterns in Haskell.
  3. Part 3 . Properties and laws. Scenarios Inversion of Control in Haskell.
  4. Part 4. FRP


Also, this series of articles can be found as a single document here .

Part one


FP application architecture. Downstream design in FP. Fighting software complexity.

Bit of theory

The software development process is well-established along and across. We have been using standardized techniques in our practice for a long time - these are various methodologies, and software design techniques, and modeling tools. For example, UML is very famous - an object-oriented (for the most part) design language. It is used for end-to-end development, starting with requirements and ending, if you're lucky, with code generation. What about design patterns? Today it is necessary to know them in order to solve typical problems in the OO code. And there is only one question - how applicable is this body of knowledge to the functional paradigm? Can I come up with my own modeling language? Are patterns necessary for functional languages?

It seems that the initial stages are the collection and analysis of requirements, determining the functionality of the program, writing business scenarios, will always be present in one way or another, no matter what software is discussed. Imagine that we already know the requirements, and skip the first steps for now, focusing on more engineering ones.

And the first truly engineering phase is the development of software architecture, or, equivalently, the architectural design of software. Here we solve the issues of the high-level structure and functioning of the program in order to support all the non-functional requirements collected earlier. Choosing a programming paradigm is also an architectural issue. Choosing an FP, we almost completely abandon UML in view of its object nature. So, diagrams of classes, objects and sequences practically lose their meaning. There are no classes or objects in Haskell, no encapsulated mutability, but only a data transformation process. The sequence diagram needed to describe the interactions of these same objects could somehow be used for chains of functions, but the thing is that the chains themselves will be more readable and understandable. All other diagrams, however, quite applicable. In a large program on the FP, there are also subsystems or components, which means that the diagrams of components, packages, and communications remain in order. The state diagram is universal: processes and state machines are very common. It could be applied even in other areas, not only in software development. Finally, the use case diagram has little to do with software design; It links business requirements to system requirements during the analysis phase. the use case diagram has little to do with software design; It links business requirements to system requirements during the analysis phase. the use case diagram has little to do with software design; It links business requirements to system requirements during the analysis phase.

But if you look closely at the “applicable” diagrams, you can come to the conclusion that they will not help much in the high-level design of the code (which is located below the architecture design), or even they can do harm, prompting imperative thinking. For example, thinking about component architecture, we recall Inversion of Control . IoC solves one of the main problems of software development - it helps to deal with complexity by highlighting abstractions. But what if you take it into service? And here we recall that there is one way to implement it - Dependency Injection. It can probably be adapted to our needs - in the following parts we will look at several different solutions, such as: Existential Types, module-level abstractions, monadic abstraction, and monadic state injection. But in fact, having first-class functions, we can completely forget about DI, since this is a non-idiomatic approach , since it leads (explicitly or implicitly) to a state in the FP program. And this is not always good. Although, in fairness, I must admit that IoC is also present in FI.

And what else, if not IoC, helps in the fight against software design complexity? From SICP, we know that there are three important techniques:

  1. abstractions for hiding implementation details (“black box”);
  2. public interfaces of interaction between independent systems;
  3. domain-specific languages ​​(DSL, Domain Specific Languages ).


We can assume that the first two methods are known to us. Interfaces, abstraction, hiding details are all about one thing, and IoC is an example. Due to the dominance of OOP languages, we have developed strong associative relations “abstraction” => “inheritance” and “interface” => “OOP interface”. In fact, this is only part of the truth. We too narrow the terms “interface” and “abstraction” to OOP concepts and thereby cut off “inappropriate” paradigms. However, the ideas underlying the first two methods are more general, they are valid for all worlds, and knowledge of other approaches can in no way be superfluous.

Regarding the third technique, the following can be said. In view of its nature, FY have the power to write all sorts of subject-oriented languages ​​- internal, external. Firstly, because the binding code for parsers and translators turns out to be short, understandable, and easily modifiable. Secondly, the syntax of FP languages ​​allows you to create embedded DSL without overloading the main code. Subject-oriented languages ​​can drastically reduce program complexity and reduce errors. A side (or maybe the main) effect of the DSL implementation is a clearer understanding of the subject area. Code in a subject-oriented language is much better suited to formalizing requirements than low-level approaches.

However, it should be recognized: specialized languages ​​are greatly underestimated in real life. There is a myth that it is difficult, difficult and expensive to support. Its reason is that an imperative programmer needs to get out of his comfort zone and imagine a different syntax, semantics, and maybe even a different paradigm to create an external DSL. Without knowing this, he will be able to come up with DSL only within his own framework, which will automatically lead him to the current code, and then to the question “Why then DSL?”. But that is not all. How to implement DSL? Dominant OOP languages ​​(with rare exceptions) do not offer an elegant solution compared to functional ones; Indeed, significant efforts are required in the traditional approach to externalDSL did not increase risks. To remove risks, you need to study other paradigms and pull up the theory; as a result, the complexity of implementing an external DSL in a familiar language carries over to the very idea of ​​DSL. For some reason, this error is considered a strong argument against ... And it is easily broken by the fact that in addition to the external DSL, there are also internal ones (built-in, embedded DSL, eDSL). For an external DSL, inexpressible with the grammar of the current language, at least a parser and a translator are needed, which leads to an additional serving code. But innerDSL is within the grammar of the current language, which means that no parsers or translators are needed. However, you still need to come up with a different code organization; and again we came to the conclusion that we cannot do without a broad programming horizons. And this is a natural requirement for a modern programmer. Well, Martin Fowler to help us.

What else can be said about architecture as applied to FP? An important point is that in AF side effects are undesirable. But in any large program it is necessary to work with the outside world; that is, there should be some mechanisms to control side effects. And they are. The notorious Haskell monads are a great option, but it's more of a lower design tool than an element of the overall architecture. So, the code for communicating with an external server can be implemented as part of the IO monad, which will not differ much from imperative. The second solution - the code can be declared on DSL. In such a DSL there are bricks with side effects, there are bricks with pure behavior, but all of them are just a declaration, therefore, all code will be clean and less error prone. Probably, it will be understandable, flexible, combinable and manageable, in fact - a constructor. The execution of this code can be assigned to a state machine working on a thin layer of a monadic IO code. And please, we got a very good architectural solution, thanks to which we also reduced the complexity of formalizing our business processes.

To combat side effects, there is another architectural solution known as reactive programming . When applied to functional languages, it's worth talking about FRP, that is, about Functional Reactive Programming. This concept is mathematical, so it fits well on FP. The essence of FRP is to propagate changes across the data model. Each element of such a model is a value that depends on other values ​​using the necessary formulas. Thus, a “reactive model” is a tree where leaf values ​​can change in time, exciting a wave by recounting higher values. Sources of values ​​can be any functions, including those with side effects. The model code will be declarative and composable.

In general, in functional programming, the code written incombinatorial style as a constructor. The idea is simple: an arbitrarily large, self-applicable system is made up of small bricks of the same type. The better the constructor is designed, the more powerful and expressive the code. Usually combinatorial code is also some eDSL, which significantly reduces the complexity of development. Many Haskell libraries use this principle, for example: Parsec, Netwire, HaXml, Lenses ... The idea of ​​monadic parser combinators was so successful that Parsec became known outside of Haskell. Its ports exist in F #, Erlang, Java, and other languages. It is curious that Parsec implements as many as three ideas: combinators, DSL and monads, and all this is organically connected in a single coherent API.

Now it’s easy to single out another very powerful technique for dealing with complexity (author’s name):

4. Domain-specific combinators.

DSC is an eDSL whose elements are combinators. The most convenient DSCs are obtained in functional languages ​​due to the syntax (functions are the combinators of transformations), but this is possible in ordinary languages. Of course, designing a DSC is even more difficult than a simple DSL, which is why this approach is probably completely unknown in the mainstream.

A bit of practice

Moving from abstract thinking to practice, we turn to the project of the big game "Big Space". In short, this is a realistic space 3D-flyer of real proportions with fully programmable game elements and, probably, with the possibility of time travel. Curious people can look into the design document and other related materials for a more detailed acquaintance, but this description is enough for us to think over the general architecture of the game. But first, a small digression.

The beginning of any project lies in the idea, which, having been born, requires development and reflection. In the case of "Big Space" the main assistants in this were memory cards, which perfectly allow you to understand and structure the cornerstones of a future game. Starting with generalized concepts (such as “Concept”, “Space”, “Motivation”, “Exploring the Galaxy”), you can go down deeper until we get specific elements of the game. Memory cards also help to see and weed out contradictions and inconsistencies.

Big Space: MM-01 'Concept'

Can memory cards be considered an alternative to use cases? Here is an articlewhose author raises the same question. It follows that memory cards are a more general tool, as they can easily contain use cases. Memory cards answer the question "what is in general, and what solutions in particular", and use cases answer the question "what is allowed to be done to solve the problem." The latter is expressed in the fact that each use case is usually associated with a short story (user story), in which high-level changes in the system are described in steps. And several of these options will be levers for which you need to pull the user so that the system takes the necessary state. We can conclude that memory cards and use cases are related in the same way as declarative (in the Prolog style) and imperative programming are related. In the first case, we describe what exactly do we want to get; in the second case, we describe how we want to achieve this. That is, memory cards are more suitable for functional programming, since declarative decisions are encouraged in FI.

Replacing the use case diagram in the first step can go further. Concept cards - relatives of memory cards - are well suited for displaying requirements for rough architecture. This is achieved through three stages (copyright).

  1. Map of need. Contains as large blocks as possible. Links are optional, blocks are placed on the subject. The map shows what the game is based on. When designing, you need to consider the most general requirements; for example, for requirements such as “huge realistic space”, “3D graphics”, “cross-platform”, Cloud Computing, OpenGL and SDL blocks will be present. If you need a third-party game engine, you need to designate it here.
  2. Map items. In the free form, we open the previous diagram without worrying about the structure. The following blocks are elements: “subsystem”, “concept”, “data”, “library”, “object of the subject area”. The links show the most basic idea of ​​how the system works. You can designate the nature of the relationship, for example: "uses", "implements" and others. Blocks are divided into layers. The chart should take into account most of the conceptual requirements. It's okay if several levels of abstraction are mixed, competing options appear, or something doesn't look the best. The diagram only outlines the field of activity and offers possible solutions, but it is not the final architecture of the program.
  3. Subsystem map. In this diagram, specific architectural decisions are made from those proposed in the second stage. Information is structured and placed in such a way as to separate application layers. Libraries and implementation approaches are indicated. The diagram will not describe a fully high-level design, but it will show important dependencies between the subsystems and help to separate them into layers.


Big Space: CM-01 Necessity Map


Big Space: CM-02 Elements Map


Big Space: CM-03 Subsystems Map


The last diagram shows that the game has three layers: Views, Game Logic, Application ( Mike McShaffry, Game Coding Complete ). The Views and Game Logic layers are separated from the main code by their own command interpreters and eDSL facades. It is assumed that all game logic, with the exception of the game state, will be implemented outside the IO monad. The game state, although it applies to game logic, is separate, since it requires shared access from views and from the application side (for network communication, for periodic data loading, for GPGPU and cloud computing). The diagram shows that the concept of Software Transactional Memory will be used to work with the state of the game.; libraries Netwire, SDL, Cloud Haskell and others are also indicated. Almost all game logic, by design, should be implemented using several internal and external languages. Of course, the proposed architecture is one in a thousand, and the diagram says nothing about the lower level of design; research and prototypes are needed to find bottlenecks or miscalculations. But overall, the architecture looks slim and neat.

After the diagram of subsystems, you can proceed to the next stage of design. Having knowledge of architecture, you can build two models: a data model and a type model, and paint the interfaces of all public modules. Finally, the final step will be to implement these interfaces and write internal libraries. But these final stages are already a low-level design and implementation. Partially we will consider them in the following chapters.

Conclusion

The approach presented in this article is downward, that is, directed from the most general to the most particular. We saw that UML is weakly applied to FP. Therefore, we designed the top-level architecture, inventing our own diagrams and building our own methodology. We also learned about the methods of dealing with complexity, which we will definitely need in the design of subsystems.