youngmysteriouslight August 1, 2015 at 20:59

Clear code with free monads

Transfer

From the translator:
This is a free translation of the article “Purify code using free monads” by Gabriel Gonzales, devoted to the use of free monads to represent code as a syntax tree with subsequent controlled interpretation.
On the hub there are other articles by Gabriel - “Cooperative flows from scratch in 33 lines on Haskell” and “What good free monads are good for” .
To read this article, you need to know what a free monad is and why it is a functor and monad. You can find out about this in the above two translations or in the article to which the author refers .
All translator comments are in italics.
For all comments related to the translation, please contact in PM.

Experienced Haskell programmers often advise novices to keep programs as clean as possible. A function is called pure if it is deterministic (the return value is uniquely determined by the values of all formal arguments) and has no side effects (that is, it does not change the state of the execution environment). In classical mathematics, λ-calculus, and combinatorial logic, all functions are pure. Cleanliness provides many practical benefits:

you can formally prove some properties of the written code,
in addition, you can easily review the code and say what it does,
finally can be driven through QuickCheck .

For demonstration, I will use such a simple program echo:

import System.Exit
main = do x <- getLine
          putStrLn x
          exitSuccess
          putStrLn "Finished"

The above program, however, has one drawback: it mixes business logic and side effects. In the specific case, there is nothing wrong with this, I always write simple programs that I can completely keep in mind. However, I hope to interest you in cool things that come about when side effects are separated from business logic.

Free monads

For starters, we need to clear our program. We can always separate unclean sections from any code using free monads . Free monads allow us to divide any unclean program into a pure representation of its behavior and a minimal unclean interpreter:

import Control.Monad.Free
import System.Exit hiding (ExitSuccess)
data TeletypeF x
  = PutStrLn String x
  | GetLine (String -> x)
  | ExitSuccess
instance Functor TeletypeF where
    fmap f (PutStrLn str x) = PutStrLn str (f x)
    fmap f (GetLine      k) = GetLine (f . k)
    fmap f  ExitSuccess     = ExitSuccess
type Teletype = Free TeletypeF
putStrLn' :: String -> Teletype ()
putStrLn' str = liftF $ PutStrLn str ()
getLine' :: Teletype String
getLine' = liftF $ GetLine id
exitSuccess' :: Teletype r
exitSuccess' = liftF ExitSuccess
run :: Teletype r -> IO r
run (Pure r) = return r
run (Free (PutStrLn str t)) = putStrLn str >>  run t
run (Free (GetLine  f    )) = getLine      >>= run . f
run (Free  ExitSuccess    ) = exitSuccess
echo :: Teletype ()
echo = do str <- getLine'
          putStrLn' str
          exitSuccess'
          putStrLn' "Finished"
main = run echo

The code rewritten in this way collects all unclean actions in the function run. I prefer to call this process “code cleanup” because we clear the entire program logic to clean code, leaving aside the minimal unclean residue only in our interpreter run.

Evidence

It seems that we didn’t hurt much from such code cleansing, but paid for this pleasure with a lot of lines of code (and also overhead of memory and time) . Nevertheless, now we can prove some properties of our code using equational reasoning (equational reasoning - output through equivalent transformations) .

For example, anyone reading this must have noticed an obvious bug in the original program echo:

import System.Exit
main = do x <- getLine
          putStrLn x
          exitSuccess
          putStrLn "Finished" -- ой-ой!

The last command will never be executed ... or will it still be? How to prove it?

In fact, we cannot prove that the last command will never be executed, because it is not . The author of the module System.Exitcould easily change the definition exitSuccessto

exitSuccess :: IO ()
exitSuccess = return ()

This program still passes type checking, but now it also successfully prints Finishedto the console.

But for our cleaned version, we can prove that any command exitSuccess'will never be executed after :

exitSuccess' >> m = exitSuccess'

Let us prove the equational conclusion, this most significant profit of purification. The equational conclusion suggests that we can take any expression and simply substitute the component functions in the definitions of functions. Each step of the following proof is accompanied by a comment that points to a specific functional definition used to proceed to the next step.

exitSuccess' >> m
-- exitSuccess' = liftF ExitSuccess
= liftF ExitSuccess >> m
-- m >> m' = m >>= \_ -> m'
= liftF ExitSuccess >>= \_ -> m
-- liftF f = Free (fmap Pure f)
= Free (fmap Pure ExitSuccess) >>= \_ -> m
-- fmap f ExitSuccess = ExitSuccess
= Free ExitSuccess >>= \_ -> m
-- Free m >>= f = Free (fmap (>>= f) m)
= Free (fmap (>>= \_ -> m) ExitSuccess)
-- fmap f ExitSuccess = ExitSuccess
= Free ExitSuccess
-- fmap f ExitSuccess = ExitSuccess
= Free (fmap Pure ExitSuccess)
-- liftF f = Free (fmap Pure f)
= liftF ExitSuccess
-- exitSuccess' = liftF ExitSuccess
= exitSuccess'

Notice how in the last steps we turned the equalities over and moved back from the functional definition (on the right) to the defined expression (on the left) . Such a technique is completely acceptable, because, due to its purity, the equal sign in Haskell does not mean assignment, but really means equality! This means that, in discussing the code in terms of equality, we can translate expressions in both directions relative to the equal sign. Impressive!

For those who have heard of the λ-calculus (from the translator)

In fact, equality in Haskell does not mean equality in the traditional sense. In the λ-calculus there is the concept of reduction - a term transformation according to a certain rule. Formally speaking, this is just a binary relation on many terms. A term can be reduced to several different terms, and this ambiguity leads to a multiplicity of reduction strategies (energetic - first the argument is reduced, then the expression; lazy - first the whole expression, then the argument, if necessary; and the like). With a fixed reduction strategy, the term is either reduced in a certain way to another term, or is not reduced at all. The latter case is called the normal form of the term; we think of it as a fully completed calculation. The reductions themselves are generated by some kind of rules. The best known are β and η reductions. In a certain sense, each functional equality (function definitions) expresses the same rules. When the calculation takes place, these rules work in one direction. If the rule closes on itself, infinite recursion happens, likeinf = inf :: a. Nevertheless, in the metatheory we can take this binary relation and close it by transitivity and symmetry, having obtained the binary relation of reduction equality. By the Church – Rosser theorem, the reduction equality M = N is valid if and only if there is a term (not necessarily in normal form) P such that both terms M and N are reduced to a common value P. For some finite number of steps, this is how since such equalities are discussed in this article.

It is equally important that the above proof is true, however much this free monad is interpreted. We can change the function runto any other interpreter, including clean and equality still holds

exitSuccess' >> m = exitSuccess'

This means that we can safely use it as a rule of redefinition in GHC (GHC rewrite rule) for optimization of passage on our program:

{-# RULES
  "exit" forall m. exitSuccess' >> m = exitSuccess'
    #-}

... and we can guarantee that the rule will always be applied safely and never broken.

Reasoning about code

I would like to prove that our program always displays the same line that it receives at the input. Unfortunately, we also cannot prove this, because it is not . An author (the maintainer) of a function putStrLncould change his mind and override at any time

putStrLn str = return () -- худший автор в мире!

In fact, we cannot even prove that this is true for the version cleared by the free monad. The function runuses the same putStrLn, so the program is in any case subject to the same threat of a bug. Nevertheless, we can nevertheless prove something about the freest monad itself, considering it through a pure interpreter:

runPure :: Teletype r -> [String] -> [String]
runPure (Pure r)                  xs  = []
runPure (Free (PutStrLn y  t))    xs  = y:runPure t xs
runPure (Free (GetLine     k))    []  = []
runPure (Free (GetLine     k)) (x:xs) = runPure (k x) xs
runPure (Free  ExitSuccess   )    xs  = []

Now we can prove

runPure echo = take 1

using a function takefrom Preludethe Haskell98 standard package. I leave this as an exercise for the reader, since this post is already quite long.

Let's pay attention to one more important point. Examining echothrough the prism of pure code, we find a small detail that was not originally seen: the user may not enter anything! In this case, our (last) interpreter should return an empty list, like a function take _ [] = []. An equational conclusion forces us to be strict when a simple statement “our program always produces the same line that it receives” is not enough. The more you work with such equalities, the more you improve the skill of reasoning about the properties of your code and find errors in your initial assumptions.

It is equally important that equivalent conversions allow you to see your program in a completely new light. We saw that our program became notorious take 1after passing through runPure, which means that we can borrow our intuition about the function taketo understand our program and detect such insidious minor bugs.

After we allocated the clean part of the code as a monadFreeand proved its viability, it becomes a reliable core and we only have to compose an interpreter from this moment. Thus, while we cannot fully prove the correctness of the entire program, we could prove the correctness of everything except the interpreter. Equally important is the fact that we have reduced the interpreter to an absolutely minimally susceptible form for attacks, which allows us to keep it in our head and maintain it manually.

Testing

We cannot prove anything about code immersed in a monad IO. And how could this be done? One could say somehow: “if you compile the code, run the program and enter some single line, the same line will be returned back”, but this is not really an equation, therefore there is nothing strict in such a phrase. However, we can write it as a test for our program.

Unfortunately, tests of unclean code do not scale to large and complex programs, and in test-dependent programming languages writing such tests takes a significant portion of the entire programming cycle.

Fortunately, we can still easily experience clean code with the QuickCheck library , which iterates over clean statements and checks for incorrectness in fully automatic mode.

For example, suppose you were too lazy to prove runPure echo = take 1. We can instead ask QuickCheck to test this judgment for us:

>>> import Test.QuickCheck
>>> quickCheck (\xs -> runPure echo xs == take 1 xs)
+++ OK, passed 100 tests.

Cool! You can test your code much more actively if it is as clean as possible.

Finally

Equational output works only for clean code, so clean sections of code serve as a reliable kernel, which you can:

check for correctness, integrity and correctness (prove soundness) ,
derive the properties of his behavior (reason about behavior) ,
actively test (test aggressively) .

This is why we always fight for a fraction of the clean code in our code and minimize the unclean parts.
Fortunately, the free monad Freeguarantees us the easiest way to achieve the highest level of cleanliness, as much as possible, and the least amount of unclean code. This is why all experienced Haskellist programmers should be fluent in free monads.

Acknowledgments

Thanks to KolodeznyDiver for the help with the translation.

Tags: