graninas May 6, 2014 at 19:56

Design and architecture in FP. Part 3


Properties and laws. Scenarios. Inversion of Control in Haskell.

Quite a bit of theory

In the previous part we saw how easy it is to get lost in poorly designed code. Fortunately, the principle of “divide and conquer” has been known since ancient times, and it is widely used in the architecture and design of large systems. It takes many forms: splitting a system into components, reducing dependencies between modules, defining interaction interfaces, abstracting away details, and factoring out domain-specific languages. This works well in imperative languages, and we may assume it works in functional languages too, except that the means of implementation will differ. Which ones?

Consider the Inversion of Control (IoC) principle; detailed descriptions are easy to find online. It reduces coupling between parts of a program by inverting the flow of control. Literally, this means that we inject our code somewhere else so that it is called from there at some point; the injected code is treated as a black box with an abstract interface. Any functional code combines both attributes of IoC, “code injection” and “black box”. A simple example shows this:

progression op = iterate (`op` 2)

geometricProgression, arithmeticalProgression :: Integer -> [Integer]
geometricProgression = progression (*)
arithmeticalProgression = progression (+)

geometricals, arithmeticals :: [Integer]
geometricals = take 10 $ geometricProgression 1
arithmeticals = take 10 $ arithmeticalProgression 1

Here functions (iterate, progression) receive other functions ((*), (+), (`op` 2)) as input; in other words, some code is injected into them. Inside the receiving function this code is treated as a black box of which only the type is known. For iterate in this example, the first argument must have type Integer -> Integer, no matter how complicated it is inside. Thus inversion of control is the very basis of functional programming; in theory, higher-order functions let you build an arbitrarily large application. There is only one problem: this interpretation of IoC is too naive, and it leads, of course, to naive code. Even in the example above the code is a monolithic pyramid, and in a real application it would grow to gigantic proportions and become completely unmaintainable.

Let's look at IoC from the other side, that of the “host” client code. It receives some external artifact that serves a specific purpose. Outside, this artifact can be replaced by another one, but the substitution should be invisible to the receiving side. This is the Liskov substitution principle. It serves as a guideline in the OOP world and prescribes that artifacts behave predictably. It “prescribes” but does not “guarantee”, because OOP languages cannot give such a guarantee: any artifact may turn out to have a side effect that violates the principle. Is this principle applicable in functional languages? Certainly. Moreover, as long as the code is pure, we get stronger guarantees, especially if the language is strongly and statically typed.

At the end of the article there is a brief description of several implementations of Inversion of Control in Haskell. Some patterns are almost exact analogues of those in the imperative world (for example, monadic state injection is a form of Dependency Injection), while others only remotely resemble IoC. But all of them are equally useful for good design.

A lot of practice

It's time to write some good code. In this article we continue to study the design of the game “The Amoeba World”, namely a whole era delimited by two commits in its repository. The era was intense. Besides the complete rewrite of the game logic, lenses were tried out, testing with QuickCheck was introduced, a scripting language was invented and its interpreter written, the A* algorithm for searching the world graph was integrated, and one more specific antipattern was found, which put an end to this era. In this article we discuss only properties and scenarios; the rest is left for the following parts.

Properties and objects

Past experience made it clear what objects really are and what they consist of. The main idea behind this design is that an object is an entity made up of properties. The objects “Karyon”, “Plasma”, “Border” and others were decomposed, and the following set of properties was obtained:

Unique identifier
Name
Durability (maximum and current HP)
Owner (player)
Layer (underground, ground, sky)
Location (on the map)
Age (maximum and current age)
Battery (maximum and current amount of energy)
Movement restriction (on a specific layer in this cell)
Direction
Movement
Factory (the ability to create other objects)
Self-destruction
Collisions (with other objects)

An attentive reader will spot imperfections here. For example, why are “layer” and “location” two separate properties when they seem to describe the same thing? What kind of property is “collisions”? And what about Factory? And “Age” and “Self-destruction”? And why does every object carry a string name that eats memory? The complaints are justified, and in the very next era the list was revised once again, in the same way: by factoring out the properties of properties. As a result only six remained, the most important ones, divided into “runtime” and “static”, while the rest logically turned into external effects and actions ...

For example, here is a verbal description of a couple of real objects that could be on the game map:

Core:
    Name = “Karyon”
    Location = (1, 1, 1)
    Layer = Ground
    Owner = Player1
    Durability = 100/100
    Battery = 300/2000
    Factory = Plasma, Player1

Plasma:
    Name = “Plasma”
    Location = (2, 1, 1)
    Layer = Ground
    Owner = Player1
    Durability = 30/40

Since the number of properties is finite, it was decided to make a wrapper type for each one and to gather them all under a single algebraic type:

-- Object.hs:
data Property = PNamed Named
              | PDurability (Resource Durability)
              | PBattery (Resource Energy)
              | POwnership Player
              | PLayer Layer
              ...
  deriving (Show, Read, Eq)

Define the type of abstract object:

-- Object.hs:
type PropertyKey = Int
type PropertyMap = M.Map PropertyKey Property

data Object = Object { propertyMap :: PropertyMap }
  deriving (Show, Read, Eq)

The first thought on seeing Property is that we are back where we started, at the God ADT problem (last time it was the Item type). But that is not the case. The significant difference lies in the level of abstraction that the Object type gives us. We have what might be called “combinatorial freedom”: a small number of properties yields a combinatorial explosion of ways to assemble new objects. No other properties are planned, and if any do appear, the changes will not spread through the code like a wave through dominoes. We will see this when we get to scenarios, but for now let us ask: how are these concrete objects created?

The easiest way is to fill in a list of properties and convert it to a Data.Map:

-- Objects.hs:
import Object

karyon = Object $ M.fromList [ (1, PObjectId 1)
                             , (4, PNamed "Karyon")
                             , (2, PDurability (Resource 100 100))
                             , (3, PBattery (Resource 300 2000))
                             , (10, POwnership Player1)
                             , (5, PDislocation (Point 1 1 1))
                             , ...]

... but wait! By what logic do we set PObjectId, PDislocation and POwnership here? They only make sense for objects that are on the map! On the other hand, there are shared properties that define a class of objects and never change afterwards: PNamed and PLayer, PFabric and PPassRestriction (the movement restriction). For Karyon the layer can only be Ground, and the PNamed “Plasma” property can only belong to plasma. So objects have to be created at the moment they are placed on the map, and at the same time we need templates with the initial data. So-called “smart constructors” fit the role of templates: functions that build a finished object from a ready-made pattern and a small set of input parameters. Here is what the smarter karyon function looks like:

-- Objects.hs:
import Object

karyon pId player point = Object $ M.fromList [ (1, PObjectId pId)
                                              , (4, PNamed "Karyon")
                                              , (2, PDurability (Resource 100 100))
                                              , (3, PBattery (Resource 300 2000))
                                              , (10, POwnership player)
                                              , (5, PDislocation point)
                                              , ...]

This syntax can hardly be called elegant: there is too much noise and ceremony. Haskell is a concise language, and we should strive for simplicity and functional minimalism; then the code will be more beautiful, clearer and more convenient. Ah, it would be nice if the verbal description of the template given a few paragraphs above could be carried over into code ... Nothing is impossible!

-- Objects.hs:
plasmaFabric :: Player -> Point -> Fabric
plasmaFabric pl p = makeObject $ do
    energyCost .= 1
    scheme .= Plasma pl p
    producing .= True
    placementAlg .= PlaceToNearestEmptyCell

karyon :: Player -> Point -> Object
karyon pl p = makeObject $ do
    namedA |= karyonName
    layerA |= ground
    dislocationA |= p
    batteryA |= (300, Just 2000)
    durabilityA |= (100, Just 100)
    ownershipA |= pl
    fabricA |= plasmaFabric pl p

How comprehensible code is depends on how closely the reader's knowledge and way of thinking match the author's. Is this code clear? It is clear what it does, but how does it work? What, for example, do the operators “.=” and “|=” mean here? How does the makeObject function work? Why do some names end with the letter “A” and some do not? And is that a monad or what?

The short answer to these fair questions is that this code uses an internal language for assembling objects. Its design is based on lenses combined with the State monad. Functions with the “A” suffix are smart constructors (“accessors”) of the properties themselves; each knows the key of its property and how to validate values. Functions without “A” are lenses. The “.=” operator comes from the lens library and sets the value “under the magnifying glass” inside the State monad. The plasmaFabric function fills in the Fabric ADT, and the karyon function fills in the PropertyMap of an Object. In the second example, accessors and data are passed to the custom operator |=; let us call it the “fill operator”. The fill operator works inside the State monad. It takes the current PropertyMap and puts into it the property validated by the accessor:

-- Object.hs:
makeObject :: Default a => State a () -> a
makeObject = flip execState def

data PAccessor a = PAccessor { key :: PropertyKey
                             , constr :: a -> Property }

-- Property fill operator:
(|=) accessor v = do
    props <- get
    let oldPropMap = _propertyMap props
    let newPropMap = insertProperty (key accessor) (constr accessor v) oldPropMap
    put $ props { _propertyMap = newPropMap }

-- Accessor for the Named property:
isNamedValid (Named n) = not . null $ n
namedValidator n | isNamedValid n = n
                 | otherwise = error $ "Invalid named property: " ++ show n

namedA = PAccessor 0 $ PNamed . namedValidator

This design is not perfect. Property validation looks dangerous, since it can crash with a runtime error. Nor do we check whether the set already contains such a property; we simply write the new one over it. Both flaws are easy to fix by building a stack of the Either and State monads and handling errors safely. The code in the module with templates (Objects.hs) would change only slightly. There are many advantages, but there is one objection: as long as the object assembly language is used only to create templates, and as long as they can be tested, the extra logic will only get in the way. On the other hand, once this code is used in scripting, safety will become important.
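As an illustration of the Either-plus-State fix mentioned above, here is a minimal sketch. It is not the game's code: Property is cut down to two constructors, and PropError, Build and the validators are hypothetical names introduced only for this example. It assumes the mtl, transformers and containers packages.

```haskell
import qualified Data.Map as M
import Control.Monad.State (StateT, execStateT, get, put)
import Control.Monad.Trans.Class (lift)

-- Hypothetical, simplified stand-ins for the article's types.
type PropertyKey = Int
data Property = PNamed String | PDurability Int Int
  deriving (Show, Eq)
type PropertyMap = M.Map PropertyKey Property

-- Hypothetical error type: what can go wrong while assembling an object.
data PropError = InvalidValue String | DuplicateKey PropertyKey
  deriving (Show, Eq)

-- An accessor now validates into Either instead of calling `error`.
data PAccessor a = PAccessor
  { key    :: PropertyKey
  , constr :: a -> Either PropError Property
  }

-- State on top of Either: the first Left aborts the whole construction.
type Build = StateT PropertyMap (Either PropError)

-- Safe analogue of the fill operator: validates the value and
-- refuses to overwrite a property that is already present.
(|=) :: PAccessor a -> a -> Build ()
accessor |= v = do
  props <- get
  prop  <- lift (constr accessor v)
  if M.member (key accessor) props
    then lift (Left (DuplicateKey (key accessor)))
    else put (M.insert (key accessor) prop props)

makeObject :: Build () -> Either PropError PropertyMap
makeObject build = execStateT build M.empty

namedA :: PAccessor String
namedA = PAccessor 0 validate
  where
    validate n
      | null n    = Left (InvalidValue "empty name")
      | otherwise = Right (PNamed n)

durabilityA :: PAccessor (Int, Int)
durabilityA = PAccessor 1 validate
  where
    validate (cur, maxV)
      | cur < 0 || cur > maxV = Left (InvalidValue "durability out of range")
      | otherwise             = Right (PDurability cur maxV)

goodObject, emptyName, duplicated :: Either PropError PropertyMap
goodObject = makeObject (namedA |= "Karyon" >> durabilityA |= (100, 100))
emptyName  = makeObject (namedA |= "")
duplicated = makeObject (namedA |= "Karyon" >> namedA |= "Plasma")
```

In this sketch a template that breaks a rule yields a Left value describing the problem instead of crashing, at the cost of every template returning Either.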

Our last question about objects is: what does the World data type look like now? There were no significant changes; the world is still a Map:

type World = M.Map Point Object

Data.Map is not the most efficient structure for this. A two-dimensional array would suit better; Haskell has efficient array libraries such as vector and repa. When it becomes clear that the game is not fast enough, we can come back and reconsider how the world is stored, but for now development speed matters more.

Scenarios

Scenarios are the laws of the world. Each scenario describes a particular phenomenon. Phenomena in the world are local: a phenomenon involves only the properties it needs, within a certain area of the map. For example, when a bomb explodes, we are interested in the durability of objects within radius N: we must reduce it by the amount of damage, and if it drops below 0, remove the objects from the map. When a factory is working, we must first supply it with resources, then take the product and place it somewhere nearby. Durability does not matter here, but resources, the factory itself and an empty cell for the product do.

Scenarios should be run according to the properties present. If there is an object with the “Movement” property on the map, we run the movement scenario. If a factory is working, we run the scenario for producing combat units. Scenarios are not allowed to change the current world directly; they run one at a time and accumulate their results in a shared data structure. Keep in mind that the work of some scenarios sometimes affects others, up to cancelling them completely.

Let us illustrate this with examples. Suppose we have two factories, each producing one tank at a cost of 1 unit, and only 1 unit of resource in stock. The first scenario completes successfully, but the second should discover that the resources are used up and stop. Or another situation: two objects move towards each other. When only one cell is left between them, what should happen? A collision, or should one of the objects be unable to move? There can be many such nuances; we would like scenarios to cover them all while remaining extremely simple to read and write.
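The idea that scenarios read the current world but only accumulate changes can be sketched as follows. This is a simplified illustration, not the game's engine: the Object record, the Changes map and all function names here are hypothetical, and the occupancy check is deliberately crude (a cell counts as taken if it is occupied in the old world or mentioned in the pending changes). It assumes the containers package.

```haskell
import qualified Data.Map as M

type Point = (Int, Int)

data Object = Object
  { objName :: String
  , moving  :: Maybe (Int, Int)   -- movement direction, if the object moves
  } deriving (Show, Eq)

type World = M.Map Point Object

-- Pending changes, kept apart from the world that scenarios read.
-- Nothing means "this cell becomes empty".
type Changes = M.Map Point (Maybe Object)

-- One run of the movement scenario for one object.
-- It never touches the world, it only extends the accumulated changes.
movingScenario :: World -> Changes -> Point -> Object -> Either String Changes
movingScenario world changes (x, y) obj = case moving obj of
  Nothing -> Right changes
  Just (dx, dy)
    | occupied  -> Left "cell is occupied"
    | otherwise -> Right (M.insert target (Just obj) (M.insert (x, y) Nothing changes))
    where
      target   = (x + dx, y + dy)
      occupied = M.member target world || M.member target changes

-- Run the scenario for every object, one at a time; a failed run
-- contributes nothing and does not stop the others.
runMovingScenarios :: World -> Changes
runMovingScenarios world = M.foldlWithKey step M.empty world
  where
    step changes p obj = either (const changes) id (movingScenario world changes p obj)

-- Only after all scenarios have run are the changes applied to the world.
applyChanges :: Changes -> World -> World
applyChanges changes world = M.foldrWithKey apply world changes
  where
    apply p (Just obj) = M.insert p obj
    apply p Nothing    = M.delete p

-- Two objects heading towards each other with one free cell between them.
objA, objB :: Object
objA = Object "A" (Just (1, 0))
objB = Object "B" (Just (-1, 0))

world0 :: World
world0 = M.fromList [((0, 0), objA), ((2, 0), objB)]
```

Here `applyChanges (runMovingScenarios world0) world0` moves A into the middle cell and leaves B where it was, because B's scenario finds the cell already claimed and is cancelled. Which of the two objects wins is decided purely by the order in which scenarios run, which is exactly the kind of nuance the text mentions.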

We outline the requirements for the scripting subsystem:

reliability;
property orientation;
sequence;
simplicity;
scripts can fail;
performance;
scripts may run other scripts;
...

For the game “The Amoeba World”, the Scenario DSL was designed and an interpreter for it was written. Here is what a piece of the scenario for the Fabric property looks like:

-- Scenario.hs:
createProduct :: Energy -> Object -> Eval Object
createProduct eCost sch = do
    pl <- read ownership
    d <- read dislocation
    withdrawEnergy pl eCost
    return $ adjust sch [ownership .~ pl, dislocation .~ d]

placeProduct prod plAlg = do
    l <- withDefault ground $ getProperty layer prod
    obj <- getActedObject
    p <- evaluatePlacementAlg plAlg l obj
    save $ objectDislocation .~ p $ prod

produce f = do
    prodObj <- createProduct (f ^. energyCost) (f ^. scheme)
    placeProduct prodObj (f ^. placementAlg)
    return "Successfully produced."

producingScenario :: Eval String
producingScenario = do
    f <- read fabric
    if f ^. producing
        then produce f
        else return "Producing paused."

In the second part of this series, in the section on let-functions, we saw code that was cumbersome and incomprehensible. Now the code is light; it is still incomprehensible, but a certain system is already visible in it. Let's try to figure it out.

The Scenario DSL is divided into two parts: a query language for game data and the runtime. At the heart of everything is the Eval type, a stack of the Either and State monads:

-- Evaluation.hs:
type EvalType ctx res = EitherT EvalError (State ctx) res
type Eval res = EvalType EvaluationContext res

The inner State monad stores and changes the execution context. The current world, operational data and a random number generator all live in the context:

data DataContext = DataContext { dataObjects :: Eval Objects
                               , dataObjectGraph :: Eval (NeighborsFunc -> ObjectGraph)
                               , dataObjectAt :: Point -> Eval (Maybe Object) }

data EvaluationContext = EvaluationContext { ctxData :: DataContext
                                           , ctxTransactionMap :: TransactionMap
                                           , ctxActedObject :: Maybe Object -- field name assumed (cf. getActedObject)
                                           , ctxNextRndNum :: Eval Int }

The outer Either monad allows execution errors to be handled safely. The most common situation is a collision, when some scenario has to break off in the middle of its work. For the state of the game to remain correct, all of its changes must be rolled back, and if the scenario was called from another scenario, the caller has to react to the problem somehow. That is why so many functions have the Eval type, which hides the Either monad. In fact, every function of type Eval is a scenario. Even interpreter functions (evalTransact, getTransactionObjects) and query language functions (single, find) work in this type and are, in effect, scenarios too. In other words, the Scenario DSL is unified by the Eval type, which makes the code consistent and monadically composable.
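To make the rollback requirement concrete, here is a small sketch of the two-factories situation. It is an illustration under assumptions, not the game's interpreter: it uses ExceptT from mtl in place of the EitherT the article uses, and Ctx, transact and the other names are hypothetical. The point it shows is that with an error layer on top of State, changes made before a failure stay in the state unless the scenario is wrapped in something that restores the saved context.

```haskell
import Control.Monad.Except (ExceptT, runExceptT, throwError, catchError)
import Control.Monad.State (State, runState, get, put)

-- Hypothetical, drastically simplified execution context.
data Ctx = Ctx { energy :: Int, freeCells :: Int, tanks :: Int }
  deriving (Show, Eq)

-- Error layer outside, State inside (ExceptT stands in for EitherT).
type Eval = ExceptT String (State Ctx)

withdrawEnergy :: Int -> Eval ()
withdrawEnergy cost = do
  ctx <- get
  if energy ctx < cost
    then throwError "not enough energy"
    else put ctx { energy = energy ctx - cost }

placeTank :: Eval ()
placeTank = do
  ctx <- get
  if freeCells ctx <= 0
    then throwError "no empty cell"
    else put ctx { freeCells = freeCells ctx - 1, tanks = tanks ctx + 1 }

-- One factory: pay first, then place the product.
produce :: Eval ()
produce = withdrawEnergy 1 >> placeTank

-- Run a scenario as a transaction: on failure restore the saved context
-- and hand the error to the caller as a value.
transact :: Eval a -> Eval (Either String a)
transact scenario = do
  saved <- get
  (Right <$> scenario) `catchError` \err -> do
    put saved
    return (Left err)

twoFactories :: Eval [Either String ()]
twoFactories = mapM transact [produce, produce]

runEval :: Eval a -> Ctx -> (Either String a, Ctx)
runEval = runState . runExceptT
```

With one unit of energy, `runEval twoFactories (Ctx 1 5 0)` reports success for the first factory and "not enough energy" for the second. With two units of energy but a single free cell, the second factory pays and then fails to place its tank; `transact` gives the energy back, whereas the plain `produce >> produce` would leave it spent.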

Since any function of type Eval is a scenario, each of them can be run and tested. Interpreting a scenario is just running the monad stack:

-- Evaluation.hs:
evaluate scenario = evalState (runEitherT scenario)
execute scenario = execState (runEitherT scenario)
run scenario = runState (runEitherT scenario)

Game scenarios have a single entry point, the generalized mainScenario function:

-- Scenario.hs:
mainScenario :: Eval ()
mainScenario = do
    forProperty fabric producingScenario
    forProperty moving movingScenario
    return ()

-- Somewhere in the main code - one tick of the whole game:
stepGame gameContext = runScenario mainScenario gameContext

Individual scenarios are launched in the same way, which means that unit and functional tests can be introduced. Here, for example, is debugging code from the ScenarioTest.hs module; if necessary, it can be turned into a full-fledged QuickCheck or HUnit test:

main = do
    let ctx = testContext $ initialGame 1
    let result = execute (placeProduct (plasma player1 point1) nearestEmptyCell) ctx
    print result

Now that we have looked at some features of the Scenario DSL runtime, let's examine the following function:

withdrawEnergy pl cnt = do
    obj <- singleActual $ named `is` karyonName ~&~ ownership `is` pl ~&~ batteryCharge `suchThat` (>= cnt)
    batRes <- getProperty battery obj
    save $ batteryCharge .~ modifyResourceStock batRes cnt $ obj

This is also a scenario with a specific purpose: withdraw cnt units of energy from the core of player pl. What has to be done? First of all, find an object on the map with the properties Named == “Karyon” and Ownership == pl. In the code above we see a call to singleActual, a function that finds an object by a predicate. Thanks to the query language, the verbal description translates into code almost word for word:

named `is` karyonName
~&~ ownership `is` pl
~&~ batteryCharge `suchThat` (>= cnt)

It is easy to guess that the (~&~) operator means “AND” and that the `is` operator requires a property to equal a given value. The third condition of the predicate selects only objects whose battery holds enough charge for the requested amount to be withdrawn. Of course, the energy may run out, and then the object will not be found; in that case the failure branch of the Either monad is taken and the whole scenario is cancelled. But if the energy can be withdrawn, we withdraw it and accumulate the changes:

save $ batteryCharge .~ modifyResourceStock batRes cnt $ obj
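The query operators described above could be implemented roughly as follows. This is a guess at the general shape, not the game's query language: here a property is just a record field rather than a lens, a query is a plain predicate, and `single` works on a list instead of the world. All types and names below are hypothetical simplifications.

```haskell
-- Hypothetical, simplified object with record fields instead of a property map.
data Player = Player1 | Player2 deriving (Show, Eq)

data Object = Object
  { name          :: String
  , owner         :: Player
  , batteryCharge :: Int
  } deriving (Show, Eq)

-- A query is simply a predicate on objects.
type Query = Object -> Bool

-- "The property equals this value."
is :: Eq a => (Object -> a) -> a -> Query
is prop v obj = prop obj == v

-- "The property satisfies this condition."
suchThat :: (Object -> a) -> (a -> Bool) -> Query
suchThat prop p = p . prop

-- Logical AND of two queries. Backticked functions bind tighter
-- (infixl 9 by default), so no parentheses are needed around them.
infixr 3 ~&~
(~&~) :: Query -> Query -> Query
(q1 ~&~ q2) obj = q1 obj && q2 obj

-- Exactly one match is required; anything else fails the search.
single :: Query -> [Object] -> Either String Object
single q objs = case filter q objs of
  [obj] -> Right obj
  []    -> Left "object not found"
  _     -> Left "more than one object found"

karyonWithEnergy :: Player -> Int -> Query
karyonWithEnergy pl cnt =
  name `is` "Karyon" ~&~ owner `is` pl ~&~ batteryCharge `suchThat` (>= cnt)

sampleObjects :: [Object]
sampleObjects =
  [ Object "Karyon" Player1 300
  , Object "Karyon" Player2 5
  , Object "Plasma" Player1 0
  ]
```

For example, `single (karyonWithEnergy Player1 10) sampleObjects` finds the first object, while the same query for Player2 returns a Left value because that core holds only 5 units, which is how an exhausted battery turns into a cancelled scenario.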

It is worth mentioning that the Scenario DSL makes heavy use of lenses, which shortens the code considerably. For example, without the concise (batteryCharge .~ 10) we would have to dig through the whole chain: Object -> PropertyMap -> PBattery -> Resource -> change stock -> save everything back. Although lenses as an idiom are debatable, the tool is very, very useful.
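The difference can be shown with a toy example. The sketch below does not use the lens library; it defines a deliberately minimal hypothetical Lens type (a getter paired with a setter) and a two-level record instead of the real property map, just to contrast a composed lens with unpacking by hand.

```haskell
-- A deliberately minimal lens: a getter paired with a setter.
-- (The real lens library uses a different, more general representation.)
data Lens s a = Lens { view :: s -> a, set :: a -> s -> s }

-- Compose a lens into a structure with a lens into one of its parts.
(|.|) :: Lens s a -> Lens a b -> Lens s b
outer |.| inner = Lens
  { view = view inner . view outer
  , set  = \b s -> set outer (set inner b (view outer s)) s
  }

-- Hypothetical, simplified types.
data Resource = Resource { stock :: Int, capacity :: Int }
  deriving (Show, Eq)

data Object = Object { objName :: String, battery :: Resource }
  deriving (Show, Eq)

batteryL :: Lens Object Resource
batteryL = Lens battery (\r o -> o { battery = r })

stockL :: Lens Resource Int
stockL = Lens stock (\n r -> r { stock = n })

-- The whole path from the object down to the charge, as one value.
batteryCharge :: Lens Object Int
batteryCharge = batteryL |.| stockL

-- Without lenses: unpack every level by hand and rebuild it on the way back.
setChargeByHand :: Int -> Object -> Object
setChargeByHand n obj = obj { battery = (battery obj) { stock = n } }

karyon :: Object
karyon = Object "Karyon" (Resource 300 2000)
```

Both `set batteryCharge 10 karyon` and `setChargeByHand 10 karyon` give the same object, but the manual version grows with every extra level of nesting, while the lens version stays a single composed path.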

The query language has many useful features. You can find a set of objects by a predicate (the query function) or a single object (the single function), failing the scenario if there are several. There are also search strategies: search only the old data, only the new data, or both together and let the client code sort it out. On the whole, the Scenario DSL did its job well, and there was room to extend it. There was only one serious problem, because of which the foundation of foundations, the design of the Object type, had to be revised once again. The name of this problem is ...

Antipattern Lens + NoMonomorphismRestriction

The cause of all ills lies in the PropertyMap data type and in the lenses for properties:

property k l = propertyMap . at k . traverse . l

named = property (key namedA) _named
durability = property (key durabilityA) _durability
battery = property (key batteryA) _battery
...

In each case the property function yields a different lens, which does not work with the monomorphism restriction enabled. Therefore the NoMonomorphismRestriction language extension had to be turned on. Unfortunately, because of this, type inference started to break in the most unexpected places, and workarounds had to be found. Even worse, NoMonomorphismRestriction began to spread through the code. It appeared wherever the lenses from the Object.hs module were used and drove the type checker insane. In the end, the design of the Scenario DSL started to bend under the limitations of the type checker, which led to several not-so-good decisions.
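For readers unfamiliar with the monomorphism restriction, here is a tiny example unrelated to lenses that shows what the extension changes. The names are made up for illustration; the lens definitions above are affected in a similar way, presumably because they are also bindings without arguments and without type signatures whose types carry a class constraint.

```haskell
{-# LANGUAGE NoMonomorphismRestriction #-}

-- A binding with no arguments and no type signature.
-- With the restriction switched off it keeps its general type:
--   plus :: Num a => a -> a -> a
plus = (+)

-- The same binding used at two different types.
-- With the restriction on (the default in compiled modules) `plus`
-- would be given a single monomorphic type, and one of these two
-- uses would be rejected by the type checker.
asInt :: Int
asInt = plus 1 2

asDouble :: Double
asDouble = plus 1.5 2
```

An alternative that avoids the extension is to write an explicit type signature for each such binding, since the restriction only applies to bindings without one.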

The problem can be eradicated by abandoning the PropertyMap type. Then all properties will appear in the Object type, even those that are not needed by a particular object. There may be other solutions, but the next version of the design did just that:

data Object = Object {
                       -- Properties:
                         objectId :: ObjectId -- static property
                       , objectType :: ObjectType -- predefined property

                       -- Runtime properties, resources:
                       , ownership :: Player -- runtime property ... or can be effect!

                       , lifebound :: IntResource -- runtime property
                       , durability :: IntResource -- runtime property
                       , energy :: IntResource -- runtime property
                       }

Every cloud has a silver lining: as a result of the revision, the other properties turned into external effects and actions. The design became more correct, although most of the work on the Scenario DSL had to be thrown away ...

Instead of a conclusion

The new scripting engine will presumably be based on different principles. In particular, the plan is to make the DSL external rather than internal, so that scenarios can be written in plain text files. Currently the author is working on the Application and View layers and searching for the best model for using FRP. The following parts will explain the idea behind FRP and how reactive programming can connect the disparate parts of a large application.

Haskell Inversion of Control Implementations

Disclaimer: the author did not have time to complete the research for this section. To be continued in future articles.

Monadic state injection

What it is: Dependency Injection.
What it is used for: abstracting work with external state in client code.
Description: external state is provided through the State monad as a context. Client code runs in the State monad with this context. When it accesses the context, the client code receives data from the external state.
Structure:

We define the Context data type; it holds the external state in the form of a State computation:

data Context = Context {ctxNextId :: State Context Int}

Define specific instances of the injected code. The code can produce a constant result:

constantId :: State Context Int
constantId = return 42

Or it may produce different results for each call:

nextId :: Int -> State Context Int
nextId prevId = do let nId = prevId + 1
                   modify (\ ctx -> ctx {ctxNextId = nextId nId})
                   return nId

Create the client code in the State monad:

client = do
    externalId <- get >>= ctxNextId
    doStuff externalId
    return externalId

Run the client code, injecting a specific instance of the external state:

print $ evalState client (Context constantId)
print $ evalState client (Context (nextId 0))

Full example: gist
Example output:
Sequental ids:
[(1, "GNVOERK"), (2, "RIKTIG YOGLA")]
Random ids:
[(59, "GNVOERK"), (64, "RIKTIG YOGLA")]

Modular abstraction

What it is: Black box.
What it is used for: choosing the implementation of an algorithm at run time.
Description: there is a facade module that imports several modules implementing the same function. A switch function in the facade module selects a particular implementation according to some rule. Client code imports the facade module and uses the desired algorithm through the switch function.
Full example: gist
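The structure described above might look like the following sketch. It is not the code from the gist: everything is squeezed into one file, comments mark which module each part would live in, and the module names, the sorting example and the selection rule are all invented for illustration.

```haskell
import Data.List (sort)

-- Would live in a module such as Sorting.Library (hypothetical name):
librarySort :: [Int] -> [Int]
librarySort = sort

-- Would live in a module such as Sorting.Insertion (hypothetical name):
insertionSort :: [Int] -> [Int]
insertionSort = foldr insert []
  where
    insert x [] = [x]
    insert x (y:ys)
      | x <= y    = x : y : ys
      | otherwise = y : insert x ys

-- Would live in the facade module, which imports both implementations.
data Algorithm = Library | Insertion
  deriving (Show, Eq)

-- The switch function: the only thing client code needs to know about.
sortWith :: Algorithm -> [Int] -> [Int]
sortWith Library   = librarySort
sortWith Insertion = insertionSort

-- An arbitrary rule for picking the implementation at run time.
chooseAlgorithm :: [Int] -> Algorithm
chooseAlgorithm xs
  | length xs < 16 = Insertion
  | otherwise      = Library

-- Client code: depends on the facade only, never on a concrete module.
client :: [Int] -> [Int]
client xs = sortWith (chooseAlgorithm xs) xs
```

The client sees the implementations as a black box: as long as every implementation has the same type and the same observable behaviour, swapping one for another in the facade does not require touching client code.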
