graninas March 1, 2012 at 06:12

Haskell - Design

They say that every programmer in his life should write at least one compiler or come up with some programming language. Designing a new language is not easy, because you need to think through dozens of parameters that, like Lego cubes, should work well together. One unsuccessful decision can cross out the fate of the language when it has not even been published. Hundreds of languages vegetate in oblivion, pushed from the catwalk by older brothers, but the world with tenacity worthy of the best use gives birth to two or three new annually. Whether they fall into at least the “alternative worldview group,” or even become mainstream, time will tell. Fortunately, my language does not need this, because you cannot program it, you can only admire it. For this is the Haskell code visualization language, the design of which will be discussed in the article.

Haskell is a wonderful language. He is very expressive and capacious. The code is much more compact than the code in imperative languages (C ++, Java, C #), despite the fact that it has fewer syntactic constructs. The pure functional nature of Haskell and its design make the code look like a math notation. Take for example an indicative (but not the best) version of the factorial:

fact n | n == 0 = 1
| n> 0 = n * fact (n-1)

In these two lines, the recurrent factorial formula familiar from the school is well recognized:

Of course, a certain recursive factorial will not be much worse in any other language, which can raise doubts as to whether Haskell is so expressive.

int fact (int n)
{
if (n == 0) return 1;
return n * fact (n - 1);
}

The trick is that both functions are written in a functional style, so there is not much difference between them. However, sparing the stack, we could use a loop in an imperative language, which looks completely different:

int fact (int n)
{
    int f = 1;
    for (int i = 2; i <= n; ++ i)
        f * = i;
    return f;
}

Programmers are well aware that the last two examples can be visualized using a flowchart:

The second flowchart, in principle, describes what happens in Haskell recursive code. And although the flowcharts are visualizations, they are still different from what I want: they show what is happening and not how it looks . I want to transfer the code to a three-dimensional scene, so that the scene is beautiful, intuitive and as unusual as possible. The meaning of visualization is this.

When designing a language, you have to think about how to expand it later. If you do it wrong or inconvenient, there will be no room for new ventures without destroying backward compatibility. Stories such examples are known; This bowl has not passed even some very popular languages (for example, Python 3.0, which is not compatible with earlier versions). To a large extent, the design extensibility is affected by very small things, so to speak, the syntax in the small. With successful details, the whole language is built as one large and well-designed constructor. Haskell is one of these, which means that the visualization language should also keep the mark. And here I can’t even make a discount on the fact that the Haskell design is known - I still have to overcome these difficulties from the very beginning. Although in a sense, of course, it is easier for me.

So, you need to think over the design of the graphic language, quite independent, but at the same time visual and aesthetically useful. You need to think about individual syntactic units, at the same time looking at a level higher - at their connection and their significance in the general outline. Based on that factorial, I plan to someday come up with sketches even for such a Haskell code:

- | Collects actions for specified box side drawings.
- | It should be used only in this module.
f :: PreparedTextureObjects
    -> GLfVertex3
    -> (BoxSide, QuadColorSpec)
    -> ([BoxSide], [IO ()])
    -> ([BoxSide], [IO ()])
f texRes boxDim (side, qColorSpec) (sList , ioList) = let
    boxIO = do setQuadColorSpec texRes qColorSpec
                GL.renderPrimitive GL.Quads (boxSide boxDim side)
    in (side: sList, boxIO: ioList)

Or for this:

data QuadColorSpec = QuadTexture TextureName
                   | QuadPlainColor GLfColor4
                   | NoQuadColorSpec
    deriving (Show)

data ObjectTextureSpec = BoxTextureSpec
        {quadSideTexes :: [(BoxSide, QuadColorSpec)]
        , defQuadSideTex :: QuadColorSpec
        } deriving (Show)

data CornerCoord = UR | DR | DL | UL
    deriving (Show)

In the meantime, I have the factorial code, and the visualization language began with it:

fact n | n == 0 = 1
| n / = 0 = fact (n - 1) * n

Here we see several syntactic constructions:

function name		fact
Argument		n
2 security expressions		\| n == 0
		\| n / = 0
2 body functions		= 1
		= fact (n - 1) * n

In turn, each body of the function is divided into a tree-like computational expression. Take the second case. Presenting the elements of this tree in the form of boxes, we get the first crude option:

And immediately flaws appear. You can imagine how code visualization will grow if you add some more actions and expressions to the function body. Yes there! Expressions can themselves consist of expressions, and simply correlating each element with the box will result in us just getting a long and unintuitive chain of boxes. The “forehead” solution does not fit here. Recall a useful fact: the pyramid is the clear embodiment of the tree structure. A computational expression can also be represented in this form, and even in several ways:

The last picture is subjectively prettier due to the characteristic pyramidal protrusions. It would be possible to dwell on this form, but binary functions have the peculiarity that they can be operators. The difference between operators and functions is purely intuitive, but from the point of view of the language it is one and the same thing, it is simply written differently in functional and infix entries:

isListElement x xs = elem x xs - functional form
isListElement 'x xs = x `elem` xs - infix form

The default operators are used in infix notation, but functional is acceptable too:

add a b = a + b
add 'a b = (+) a b

And this fact can be easily displayed during visualization. In the next sketch, in addition to highlighting the infix form, I doubled the final syntactic units (variables and constants):

The height of the pyramid decreased, which is good, and it became clear where the infix entry is and where is the functional one. This approach becomes useful when defining consecutive calculations:

add3 a b c = a + b + c
someF x y z = x >> = y >> z

And for functions with a large number of arguments, it is worth preserving the gap between the boxes.

The sketches show that the length of the boxes depends on the name on the box and on the place occupied by the arguments. There is nothing to be done: the inscription is needed, and it is not worth reducing it in size, otherwise we will lose uniformity. It is possible to customize fonts on a web page or in a document, and the code should look strictly, because it is so easier to read. Therefore, complex, massive computational expressions will also look impressive. Let's look at the function for Fibonacci numbers:

fibs = 0: 1: zipWith (+) fibs (tail fibs)

The expression (and it works too!) Looks very beautiful. Let's try to build a sketch for it, remembering that the colon sign used twice is written in infix form, and zipWith is a function that takes three arguments: (+), fibs and (tail fibs). The problem here appears with a plus in parentheses. The sign is passed to another function as an argument, and for Haskell it is a normal and very useful phenomenon, known as a “higher order function”. Additionally, the + operator bit off both of its own arguments, and this is already called a section. According to our past rules, the + operator should be the final argument with an enlarged box. Basically, this is normal:

But looking to the future, we can reflect on how sections and curried functions will generally look. Take for example a little artificial code:

ops = [(+), (*), (-)]

func x y op = x `op` y

result = map (func 2 3) ops

There are three functions: ops returns a list of operators, func - performs the action passed as op on the arguments x and y, and inside result it is all used and something is calculated. To be honest, when executing result, it will return the following list: [5, 6, -1], and it’s easy to track how it turned out. The map function (func 2 3) was applied to all operators from the ops list. We are interested in this case that (func 2 3) is a curried record with one argument that remains vacant. The operators in brackets are also curried (or rather truncated), everything has already been stolen from us ... that is, two arguments were selected. This is not reflected in the Haskell code, and a strange entry like this:

nullMap = map null

it might not tell us anything unless you know what map and null functions do, and if type signatures are not specified. However, in visualization, we could clearly show that these functions require some other arguments. It is enough to postulate that the arguments fit into the grooves of the functions, and the number of grooves corresponds to arity. Here is the result for the result and nullMap functions:

When designing computational expressions, the question of proportions still remains an essential question, and here a total revelry of fantasy is possible. The height of the elements, the distance between the arguments, the length and depth of the empty groove, the height and length of the finite elements, the size of the pyramidal indentation lend themselves to parameterization. If you arm yourself with knowledge of harmony, then you probably should add a magical golden ratio to the proportions. It does everything better!

An important construction of any language is conditional transitions. Haskell also has an if statement, and it is such that the else statement must accompany it. There can be no expression in which one of the alternatives is missing - in a world where everything has a function, the result should always be returned. The following is a calculation of the factorial with if (we, like the last time, do not think about the fact that n can come negative):

fact n = if n == 0 then 1 else fact (n - 1) * n

You can format the code in any way. then and else are not sensitive to indentation and can be located on a new line (only by counting at least one white space from the fact function):

fact n = if n == 0
then 1
else fact (n - 1) * n

Perhaps these permutations somehow help to come up with the design of this design. The naive transfer of the words if, then, and else as the lowest layers of the pyramids overlaps with the previous version and is confusing. But if you separate them with some isolated element, it will turn out to be more interesting. Compare the following two options:

By giving vent to imagination, you can achieve some intuitiveness of the latter option. After all, what is if? This is when a boolean expression is checked for truth, that is, for a match with True. On the principle of coincidence, constructors, jigsaw puzzles and something else are built: there is a coincidence, - the pin fits the groove; there is no coincidence - well, I'm sorry ... Let's try to add this concept to the sketch:

It looks interesting and makes you fantasize further. You can experiment with position, shape, size, but you must not forget that if-then-else is an expression, and it will be used inside other expressions. That is, since we limit ourselves in space for the sake of expressiveness, then the conditional operator should not go beyond some empirical framework.

Perhaps it’s not difficult for if-then-else, and these syntax elements, like a good pair, can fit in any part of the expression, but the related case-construction can be troublesome. It can expand: the number of alternatives is unlimited. We rewrite the factorial function:

fact n = case n == 0 of
True -> 1
False -> fact (n - 1) * n

There is something to think about. A new semantics is added to the case construct: pattern matching. Here, I confess, I have not yet come up with a decent option. But you can try by analogy with one of the last if sketches:

The meaning of the comparison with the sample is that on each of the cubes in front of the "->" sign there can be a rather large expression. The possibilities are enormous: here is the splitting of lists, and an arbitrarily deep comparison with algebraic data types, and even with the connected extension of the language View Patterns - a kind of syntax that resembles functions. I will show only a simple, fairly typical code where pattern matching is used inside the case:

toHabrFormat (s, (ch: [])) = s ++ [ch]
toHabrFormat a @ (s, b @ (ch: inStr)) = let formatted = (foldr1 (<|>) formatters) a
                        in case formatted of
                            Just (res, []) -> res
                            Just (res, (r: rs)) -> toHabrFormat (res ++ [r], rs)
                            Nothing -> toHabrFormat (s ++ [ch], inStr)

Apparently, while there is no clarity on how to represent algebraic data types, lists, tuples, etc., the case-construction will be unconvincing.

Now, nevertheless, let us put aside computational expressions and try to complete the visualization of the factorial function with its name and security expressions.

fact n | n == 0 = 1
| n / = 0 = fact (n - 1) * n

There is something to be said about security expressions. These are regular expressions, but they always return a boolean value - True or False. They serve as some filters, as if not skipping execution in the body, if the condition is not true. Playing with this concept, you can come up with something like the notorious if. Let's evaluate the following option: The

box with equality (“bridge”) arose from the desire to combine security expressions and the body of the function in the same way as in the code. And that seems to be an interesting idea. As it developed, it grew into something more - a visual separation of parts (function name; security expressions; computational expression). You can go further: place them on the platform, thereby delineating the space of each of the parts.

In the code, the guard expression begins with the character '|'. If it weren’t, the code would look contradictory:

fact n n == 0 = 1
n / = 0 = fact (n - 1) * n

The compiler would not understand what the letters and symbols after the word fact are. A vertical bar is a necessary element of syntax, and therefore we need to add it to the visualization. It is logical to put something to the left of the platform with a Boolean expression; it can be, for example, a “wall”:

But is it so intuitive? A solid box looks like an obstacle, a ban, and we need - a condition, variability. If we present the progress in the form of a cube, then, moving from the name of the function to the guard expression, it will simply collide with the wall, and not one of the bodies of the function will be achieved. In this reflection lies the key to a more intuitive version: what if you make a hole in the wall? You can think of yourself this way: if the logical condition is true, the cube will go through the hole and get where you need it. The direction of movement is also worth visualizing: let bridges with arrows show it, they will also help to connect the right and left parts of the function:

Much better! Now we visualize the name of the function with arguments. Again, here we are dealing with pattern matching, and, in a good way, it should be combined with arity. We have already learned how to make cutouts for arguments, and they will come in handy here. Without comparison with the sample, we would have such an option:

So, the basis of the graphic language is built, and we now roughly imagine what it should look like. For a clear picture, you still need to deal with such important things as algebraic data types, lists, tuples, and pattern matching. When they are finished, most likely ideas will appear about what to do with type declarations, with do-constructs, list generators, as-patterns, class types, class incarnations, and other wonderful elements of the Haskell syntax.

Let's start with algebraic data types, lists, and pattern matching. In the case example above, we already tried to visualize the ADT and pattern matching. Imagine that arity instead of cuts is indicated by protrusions. The picture with case, True and False will then be incorrect, because the arity of these two constructors is zero. However, the arity of Just is 1, and he will have one ledge.

Lists are created through the infix operator ":" (which is actually a type constructor, but that's another story [1]). We already know how to visualize it. Let us take a more complicated case, when there are both “unimportant” elements and an empty list. Here is a hypothetical code:

someFunc someList = case someList of
(x1:x2:_:xs) -> Just (x1, x2)
(x1:[]) -> Just (x1, x1)
[] -> Nothing

The constructs of the Haskell language themselves clearly demonstrate their essence. An unimportant argument is replaced by an underscore, and an empty list is replaced by empty square brackets. We will take this into account when rendering. In the code formalism, only the differences between the arguments x1, x2 and xs from the first alternative are not visible. Meanwhile, x1 and x2 are the elements of the list cleaved in front, and xs is the rest of the possibly empty list. It would be useful to show the difference somehow.

Yes, the listings look very good. I especially like the "unimportant" element. What about tuples? Well, there’s no reason to be smart: just put each element of the tuple in a groove on a common board. The function cannot be confused, since the tuple has no name, and its base is thinner. For reliability, you can change the color.

Not such a difficult task! So, logically reasoning, you can build a language for visualization. It's okay if some designs are changed beyond recognition through several versions. Now, without touching on how to visualize the type declaration, I risk later discovering that they do not fit into the semi-finished scheme. You will have to change something, sacrifice something, but the general principles of visualization will remain the same. Of course, there were huge open spaces for imagination. I have not yet selected colors and shades, I have not thought about fonts, and the shape of the elements can be improved in many ways. There was even such an idea: to make boxes with inscriptions glass, and put the inscription inside, and make it all shimmer with reflected light. Perpetuate, so to speak, the code in the glass. It would probably be something like this:

Beauty, of course. And it already seems that the main cost here is to invent a design, and visualization can be quickly cooked up in some kind of graphic editor, but ... In reality, everything is somewhat more complicated, because visualization requires a program, not a static 3D scene. This is exactly the kind of program I create in the GraphServer project. The task is complicated by the fact that not for all designs I came up with visualization; already at the time of writing this article, some designs were greatly changed or invented. But even what is ready at the moment looks promising.

This is a cross article. Read about creating a visualization server in the article “Haskell - Aesthetics” .

PS I apologize for the not very high-quality pictures. Habrastorage seems to be doing something with them.
[1] - Correction in connection with this comment: it-talk.org/post76900.html#p76900

Tags:

Haskell - Design

Also popular now: