Polymorphy and classes in Haskell

    Hello dear reader!

    Today we’ll talk about the polymorphism of functions and about classes in the Haskell language, how to use them and why they are needed. We all used the polymorphism method, for example, in the Java language. How can this be applied to Haskell? Haskell is not an object-oriented language, but it still has classes. Although classes in Haskell have some properties of classes of object-oriented languages, they are more abstracted from this and much more powerful. Drawing an analogy with the Java language further, classes in Haskell are nothing more than interfaces — only function declarations are written to the class, and implementations of these functions themselves will be made later.
    I would like to thank the user Darkus for reading and correcting all the flaws and inaccuracies in the article, as well as for good advice and help in writing.


    Function Polymorphism


    In Haskell, there are two types of polymorphism - parametric and special (in English they call the term “ad hoc”, which came from the Latin language).

    Parametric


    If you have ever programmed in the Haskell language, you have certainly come across examples of parametric polymorphism in functions length, head, teal, curry, uncurry, map, filter, .... What unites all these functions? Let's look at the implementation of some of them:
    length :: [a] -> Int
    length []	= []
    length (_:xs) 	= 1 + length xs
    head :: [a] -> a
    head (x:_) = x
    tail :: [a] -> [a]
    tail (_:xs) = xs
    curry :: ((a,b) -> c) -> (a -> b -> c)
    curry f x y = f (x,y)
    uncurry :: (a -> b -> c) -> ((a,b) -> c)
    uncurry g (x,y) = g x y


    In the first three functions of the input parameter type a, and have curryand uncurrystill band c. Instead of a specific data type ( Int, Bool, Char,...), typing is used. Java example:
    public class Test {...}


    A is also used here - an unknown data type that will be determined at compile time. Polymorphism is used when it is not known what type of data the parameter will be, but the operations that will be performed on it (ilm with it) are known.

    All identifiers of typical variables in the Haskell language must begin with a capital letter (for example a, abc, aA101), and all specific types (constructors) begin with a capital letter (for example String, Int, Node).

    The parameter acan accept any types, be it Int, String or even functions (for example length [f, g, h], where f, g, h are functions that have the same types). It is worth noting that the type bcan also accept any types, including the type of the parameter a.

    The type of function in the GHCi interpreter (and in Hugs) can always be found using the command :t, for example:
    Main>:t length
    length :: [a] -> Int


    Special (ad hoc)


    A synonym for special polymorphism is the term “overloaded functions”. This is a weaker type of polymorphism than parametric. Take for example the addition operator (function) (+). Expressions of this kind
    (+) 2 3 ->> 5
    (+) 2.5 3.85 ->> 6.35
    differ from expressions
    (+) True False
    (+) [1,2,3] [3,2,1]
    in that in the first case, numerical values ​​were used, and in the second, values ​​of type Bool and [Int]. The addition operator is not defined for non-numeric types. This is because this function has a type not
    (+) :: a -> a -> a
    like this
    (+) :: Num a => a -> a -> a. That is, there is a restriction on the type of input (and output) data.

    The restriction that is imposed on type a: Num a says that typeamust be an element of class Num. Such type classes are very important in Haskell, as they add additional protection against programming errors, and can also reduce the amount of code written at times. We will see this in the following example.

    The following binary tree types are given: A
    binary tree that can store any type of data
    data Tree1 a = Nil | Node1 a (Tree1 a) (Tree1 a)


    A binary tree that can store two elements, and each branch ends with a “leaf” with one element (and not Nil):
    data Tree2 a b = Leaf2 b | Node2 a b (Tree2 a b) (Tree2 a b)


    A binary tree that can store data of type String and also ends with a sheet:
    data Tree3 = Leaf3 String | Node3 String Tree3 Tree3


    And, let's say, there are two more kinds of lists:
    regular
    type Lst a = [a]

    and type “rope”
    data Rope a b = Nil | Twisted b (Rope b a)


    Having all these data structures, I would like to write a function sizethat, regardless of the data structure, would display either its depth (for trees), or length (for lists), or in a word - size. A naive solution would be to write a function for everyone:
    Hidden text
    sizeT1 :: Tree1 a -> Int --считаем кол-во узлов
    sizeT1 Nil		= 0
    sizeT1 (Node1 _ l r) 	= 1 + sizeT1 l + sizeT1 r
    sizeT2 :: (Tree2 a b) -> Int --считаем кол-во элементов
    sizeT2 (Leaf 2 _)	= 1
    sizeT2 (Node2 _ _ l r)	= 2 + sizeT2 l + sizeT2 r
    sizeT3 :: Tree3 -> Int --считаем длины элементов типа String
    sizeT3 (Leaf3 m)= length m
    sizeT3 (Node m l r)	= length m + sizeT3 l +sizeT3 r
    -- и для списков:
    sizeLst :: [a] -> Int
    sizeLst = length
    sizeRope :: (Rope a b) -> Int
    sizeRope Nil			= 0
    sizeRope (Twisted _ ls)	= 1 + sizeRope ls


    Now, if any other data structure appears, you will have to make a new function that will be suitable only for this structure. And we would like for one function to get the size of any data structure. As the function (+) had a restriction by the Num class, so we need to make the restriction by the Size class, and for this we need to create it first:
    class Size a where
    	size :: a -> Int


    It remains only to make instances (instances) of this class for specific values a(= for specific data types):
    instance Size (Tree1 a) where
    	size Nil		= 0
    	size (Node1 _ l r)	= 1 + size l + size r
    instance Size (Tree2 a b) where
    	size (Leaf2 _)		= 1
    	size (Node2 _ _ l r)	= 2 + size l + size r
    instance Size Tree3 where
    	size (Leaf3 m)	= length m
    	size (Node3 m l r)	= length m + size l + size r
    instance Size [a] where
    	size = length
    instance Size (Rope a b) where
    	size Nil	 	= 0
    	size (Twisted _ ls)	= 1 + size ls


    Done! Now if we write in GHCi :t size, we will see size :: Size a => a -> Int. We got what we wanted:
    size Nil ->> 0
    size (Node1 "foo" (Node1 "bar" Nil Nil) Nil) ->> 2
    size (Leaf 2 "foo") ->> 1
    size (Node3 "foo" (Node3 "bar" (Leaf3 "abc") (Leaf "cba")) (Leaf "tst")) ->> 15 --3*5
    size [1..5] ->> 5
    size (Rope 2 (Rope ‘a’ (Rope 5 Nil))) ->> 3


    Each instance of the class Sizeimplemented a function size, which is applicable only to values ​​of a certain type (= data structures). In this case, this function, like other predefined operators (+), (*), (-), is overloaded.

    But there are also disadvantages to such a decision. For example, we would like to know the number of elements in a paired list, i.e.
    size [(1,2), (3,4)] ->> 4
    size [('a','b'), ('c','d'), ('e','f')] ->> 6 

    It would be obvious to do the following:
    instance Size [(a,b)] where
    	size = (*2) . length


    But here we have a problem, because we already have an instance for the usual list ( instance Size [a] where), we can no longer use a different definition, because, as mentioned earlier, a type acan be any type, including (b,c), that is, . [a] == [(b,c)]To solve this problem, you can use Overlapping Instances (English overlapping class instances). This solution has its drawbacks (importing one of the modules can change the value of the program, can cause confusing errors, and so on).
    Now a little more about classes.

    Classes


    In the previous example, we passed the class Size, but here is the incomplete declaration of the class in Haskell: (tv = type variable):
    class Name tv where
    	сигнатура включающая переменную tv


    Incomplete because restrictions can also be imposed on the class declaration tv(for example, it tvmust be an element of the class Ord). The signatures can be as many as you like, but each of them must include (as an input and / or output parameter) the variable tv.
    Type classes are collections of types for which special functions are defined. Here are some (one of the most important) classes in Haskell:
    • Eq - class for the test for equality (inequality) of two data types (operations ==, / =)
    • Ord - a class for determining the type order, that is, which element is larger, which is smaller (operations>,> =, <, <=, min, max ...)
    • Enum - a class for types whose values ​​can be "counted" (for example, [1..10])
    • Bounded - the class is also used for types of the Enum class. Used to name the lower and upper bounds of a type.
    • Show - a class for types whose values ​​can be converted to a string (= can be represented as characters)
    • Read - a class for types whose values ​​can be converted from a string


    Only these classes can be automatically inherited by any type of data (= put in a section deriving). There is also a dependency between classes. For example, if we know how to compare two elements (class Ord), then we must be able to determine in advance whether one element is equal to another (class Eq). Indeed, to implement the operator ( >=), you must have an implemented operator (==). Therefore, we can say that the class Ord depends (inherits) from the class Eq. Here is a more detailed diagram of class dependency:


    Let's look at the equivalence class in Eqmore detail. As previously mentioned, this class must have two functions (==) and (/ =):
    class Eq a where
    	(==), (/=) :: a -> a -> Bool --так как сигнатуры (==) и (/=) одинаковы, их можно записать в 1 строчку
    	x /= y = not (x == y)
    	x == y = not (x /= y)


    The code shows that for any instance of the Eq class, it is enough to implement one of two functions. For example, for the Bool type (two types are equal only when they are both True or False) this is done as follows:
    instance Eq Bool where
    	(==) True True		= True
    	(==) False False 		= True
    	(==) _ _			= False 


    As you can see, the case True False and False True (as well as the last line) is not necessary to write, because inequality is the inverse of equality. If before that we had the case that the type is a cokretization of another type (for example, [Int] is an instance of the type [a]), now the type can be an instance of the class (for example, Bool is an instance of the Eq class). By the same analogy, we can write the equality functions of those binary trees that we used above. For example, for Tree2:
    instance (Eq a) => Eq (Tree2 a) where --накладываем условие на a, что этот тип может сравниваться
    	(==) (Leaf2 s) (Leaf2 t)		= (s==t)
    	(==) (Node2 s t1 t2) (Node2 t u1 u2)= (s==t) && (t1 == u1) && (t2 == u2)
    	(==) _ _				= False

    For (Tree3 a b)we will already have to impose a condition on both a, and on b, that is,
    instance (Eq a, Eq b) => Eq (Tree3 a b)

    Everything to the left of the symbol =>is called a context, everything to the right of this symbol must be either a base type (Int, Bool, ...), or type constructors (Tree a, [...], ...). The operator (==) is an overloaded function (ad-hoc polymorphism).

    Equality (==) is a type-dependent property that requires implementation for each type (such as the implementation of equality of binary trees). I would like to be able to compare two functions in this way:
    (==) fac fib ->> False --где fac - факториал числа, а fib - число Фибоначчи
    (==) (\x -> x+x) (\x -> 2*x) ->> True
    (==) (+2) (2+) ->> True

    This would require writing such code in Haskell:
    instace Eq (Int -> Int) where
    (==) f g = ... --где f и g являются какими-то функциями


    Is it possible to write such a function that would yield the result of the equality of any two functions (True or False)? Unfortunately, from the Church-Turing thesis it follows that the equality of any two functions cannot be determined. There is no algorithm that, for any two given functions, always and through a finite number of steps, decides whether the two functions are equal or not. Such an algorithm cannot be programmed in any programming language.

    Instead of implementing its function for equality, for example, binary trees, you can always put those classes that are inherited by this type automatically after the keyword deriving. That is, we could quite safely write:
    data Tree a = Nil | Node a (Tree a) (Tree a) deriving Eq


    This allows us not to write an instance of the class Eq ( instance Eq a => Eq (Tree a) where ...) with our own hands . But for any function, then you have to impose a condition on the variable athat this variable is an element of the class Eq. Sometimes, you still have to write these functions yourself, since the “automatic” comparison does not always work as we would like. For example for a binary tree
    data Tree3 a b = Leaf3 bl | Node3 a b (Tree3 a b) (Tree3 a b) deriving Eq

    The following function will be implemented (automatically):
    instance (Eq a, Eq b) => Eq (Tree3 a b) where
    (==) (Leaf3 q) (Leaf3 s) 			= (q==s)
    (==) (Node3 _ q t1 t2) (Node3 _ s u1 u2)	= (q==s)&&(t1==u1)&&(t2==u2)
    (==) _ _					= False 

    As we can see, the first argument of Node3 was omitted, which we would not want in some cases.

    Conclusion


    The informal difference between the two types of polymorphy:
    • parametric - the same code regardless of type.
    • ad-hoc - different code, although the same function names.

    Only registered users can participate in the survey. Please come in.

    What topic would you like to see the next article about Haskell?

    • 33.8% Error Handling 43
    • 18.8% Abstract Data Structures 24
    • 19.6% Recursion and recursion types 25
    • 3.9% Higher Order Functions 5
    • 23.6% Lambda calculus 30

    Also popular now: