Haskell Range Control Arithmetic with Type-Level Literals

Functional programming (FP), as you know, contributes to the writing of reliable (error-free) code.

Clearly, this is a maxim. There are no programs without errors. However, FP combined with strict static typing and the development of the type system allows, to a large extent, to identify the inevitable errors of the programmer at the compilation stage. I'm talking about Haskell, although it probably applies to OCaml too.

However, if we set ourselves the goal of writing reliable code, we immediately find that the possibilities of Haskell are not unlimited here. Not everything that exists for this purpose (building secure code) in other languages ​​is easily implemented in Haskell. It would be good if they corrected me here, but, alas.

First of all, of course, it is worth paying attention to the Ada language, specially developed for reliable code writing (to a lesser extent, its related Pascal). And although the ideology of Ada, in my opinion, has long been outdated, the syntax draws fossils from the 80s, and some ideas that supposedly increase the security of the code now cause a grin. For all that, in Ada there is a developed system of static and dynamic data verification for given conditions ( data validation , constraint validation ). Simply put, every time a variable is updated, the compiler can add to the output code the execution of the test we specified. The simplest and most common case is range validation.- the value goes beyond the specified limits. There is such control in Pascal. Without claiming to replace Ada (it is the standard among the military, in avionics, etc.), let's try to approach the security standards of Ada, making, for starters, range validation in Haskell. Obviously, you need to overload the arithmetic functions (this is at least) of the Num class , setting the range control in them and throwing exceptions when going beyond it.
Immediately we run into a problem - in arithmetic functions of the Num class, of the form
(+) :: a -> a -> a

There is no place to set the boundaries of the checked range. What can be done?

Option 1 . Create a record of three numbers - the boundaries of the range and the value being processed, and determine (instantiate) for such a Num record . The disadvantage is obvious. It would be enough for us to store the boundaries of the ranges in one instance for each type, and not for each value (which may be 100,000).

Option 2 . We can define hard-bound checks in a class generated using Template Haskell. This option is quite possible. With TH, we can do everything. But, let's try to find a way to set the range limits at compile time in some other way.

Option 3. Relatively recently, starting with GHC 7.8, if I do not confuse, an opportunity called Type-Level Literals , i.e. setting constants in the description of types, which, moreover, can be used in functions.

Let's try to implement range validation using this mechanism.

For a controlled number, we will not economically start the full data type, we write that requires less overhead during the execution of newtype.

newtype RgVld (lo :: Nat) (hi :: Nat) a = RgVld { unRgVld :: a }
        deriving (Eq, Ord) 

RgVld is short for range validation . And lo and hi , which are of type Nat (defined in GHC.TypeLits ), are the same constants in the definition of a type - range boundaries. Here they are integers (they are converted to Integer, but they cannot be negative, alas). There are also string ones - but to describe the restrictions with a string and convert them into strings in runtime - no, we are not writing on a script.

Actually, this type is the essence of the implementation of range validation . Now you can create for it:
instance (KnownNat lo, KnownNat hi, Num a, Ord a, Show a) => Num (RgVld lo hi a) where
  (RgVld l) + (RgVld r) = chkVld "(+)" $ RgVld $ l+r
  (RgVld l) - (RgVld r) = chkVld "(-)" $ RgVld $ l-r
  (RgVld l) * (RgVld r) = chkVld "(*)" $ RgVld $ l*r
  fromInteger n = chkVld "fromInteger" $ RgVld $ fromInteger n
  abs = id
  signum (RgVld v) = RgVld $ signum v

The KnownNat class is also defined in GHC.TypeLits . Because the checks of the resulting values ​​are the same, for them you can create a helper class
class CheckValidation a where  
    chkVld:: String -> a -> a
(which may be suitable for other types of checks) with the only function chkVld , which will pass values ​​that fall into the range through itself and throw an exception for values ​​outside the range. Its first argument is a substring in the exception message showing the function that caused the exception.

instance (KnownNat lo, KnownNat hi, Num a, Ord a, Show a) => CheckValidation (RgVld lo hi a) where
    chkVld whr r@(RgVld v) = let lo' = natVal (Proxy :: Proxy lo)
                                                   hi' = natVal (Proxy :: Proxy hi) in
                                              if v < fromInteger lo' || v > fromInteger hi' 
                                              then throw $ OutOfRangeException $ "out of range [" ++ show lo' ++ " .. " ++
                                                  show hi' ++ "], value " ++ show v ++ " in " ++ whr
                                              else r

Naturally, you must remember to create the exception class itself:

data OutOfRangeException = OutOfRangeException String
    deriving Typeable
instance Show OutOfRangeException where
   show (OutOfRangeException s) = s
instance Exception OutOfRangeException

For the RgVld type, we also implement the Show , Read , and Bounded classes , which are quite simple but obviously useful in this case .

instance (KnownNat lo, KnownNat hi, Show a) => Show (RgVld lo hi a) where
   show (RgVld v) = show v
instance (KnownNat lo, KnownNat hi,  Num a, Ord a, Show a, Read a) => Read (RgVld lo hi a) where
   readsPrec w = \s -> case readsPrec w s of
                    [] -> []
                    [(v,s')] -> [(chkVld "readsPrec" $ RgVld v, s')]
instance (KnownNat lo, KnownNat hi,  Num a, Ord a, Show a) => Bounded (RgVld lo hi a) where
   minBound = fromInteger $ natVal (Proxy :: Proxy lo)
   maxBound = fromInteger $ natVal (Proxy :: Proxy hi)

Because it was about the Ada language, which is strongly associated with the “military”, we will assume that our program controls ICBMs with multiple warheads. Suppose they are numbered from one, and there are 20 in total, and each, of course, carries an atomic bomb - A-bomb, "H-bomb." Reduce to ab. And here’s an auxiliary function for creating an H-bomb:

ab:: Int -> RgVld 1 20 Int
ab = RgVld

The variable is the bomb number in the ICBM, in the range from 1 to 20. If the missile is upgraded, then the number 20 will need to be changed only in this auxiliary function. Check it out.

*RangeValidation> ab 2 + ab 3
5
*RangeValidation> ab 12 + ab 13
*** Exception: out of range [1 .. 20], value 25 in (+)
*RangeValidation> 

- here is the range control in Haskell.

An attentive reader may object: “Usually we do not add two numbers within the range, but add the offset to the range type.” Actually, this is not important for this implementation - it is not the input values ​​of the operations that are checked, but only the output, so it will not cause an interrupt.

*RangeValidation> ab 20 + ab 0
20
*RangeValidation> 

But, it seems, like, and it doesn’t turn out beautifully. We introduce an additional class
class Num' a b where
  (+.) :: a -> b -> a
  (-.) :: a -> b -> a
  (*.) :: a -> b -> a
which implements arithmetic with operands of different types, and make RgVld an instance of it by defining
instance (KnownNat lo, KnownNat hi, Num a, Ord a, Show a) => Num' (RgVld lo hi a) a where
  (RgVld l) +. r = chkVld "(+.)" $ RgVld $ l+r
  (RgVld l) -. r = chkVld "(-.)" $ RgVld $ l-r
  (RgVld l) *. r = chkVld "(*.)" $ RgVld $ l*r

The functions (+.), (-.), (*.) Are similar to the usual ones, but they perform actions with a range type and a regular number. Example:
*RangeValidation> ab 5 -. (3 :: Int)
2
*RangeValidation> 
- yes, the type of number will have to be specified explicitly if it is a constant.

Naturally, the range type does not have to be integer. Create an auxiliary function to determine the fuel level.

fuel:: Double -> RgVld 0 15 Double
fuel = RgVld

And check the operation of the range type when refueling:

*RangeValidation> fuel 4.6 + fuel 4.5
9.1
*RangeValidation> fuel 9.1 + fuel 6
*** Exception: out of range [0 .. 15], value 15.1 in (+)
*RangeValidation> 
- Oh no no no. Poured!

Unfortunately, "due to" the limitations of the applied "technology" - Type-Level Literals, the range is still specified by integers. Maybe the authors of GHC will improve it (although, in general, it was conceived by them somewhat for another). In the meantime, we will be happy with what we met.

Full example code:

{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables, MultiParamTypeClasses, FlexibleInstances #-}
{-# LANGUAGE DeriveDataTypeable #-}
module RangeValidation where
import Data.Proxy
import GHC.TypeLits
import Data.Typeable
import Control.Exception
data OutOfRangeException = OutOfRangeException String
    deriving Typeable
instance Show OutOfRangeException where
   show (OutOfRangeException s) = s
instance Exception OutOfRangeException
class CheckValidation a where  
    chkVld:: String -> a -> a
instance (KnownNat lo, KnownNat hi, Num a, Ord a, Show a) => CheckValidation (RgVld lo hi a) where
    chkVld whr r@(RgVld v) = let lo' = natVal (Proxy :: Proxy lo)
                                 hi' = natVal (Proxy :: Proxy hi) in
                             if v < fromInteger lo' || v > fromInteger hi' 
                             then throw $ OutOfRangeException $ "out of range [" ++ show lo' ++ " .. " ++
                                          show hi' ++ "], value " ++ show v ++ " in " ++ whr
                             else r
newtype RgVld (lo :: Nat) (hi :: Nat) a = RgVld { unRgVld :: a }
        deriving (Eq, Ord)
instance (KnownNat lo, KnownNat hi, Num a, Ord a, Show a) => Num (RgVld lo hi a) where
  (RgVld l) + (RgVld r) = chkVld "(+)" $ RgVld $ l+r
  (RgVld l) - (RgVld r) = chkVld "(-)" $ RgVld $ l-r
  (RgVld l) * (RgVld r) = chkVld "(*)" $ RgVld $ l*r
  fromInteger n = chkVld "fromInteger" $ RgVld $ fromInteger n
  abs = id
  signum (RgVld v) = RgVld $ signum v
infixl 6 +.,-.
infixl 7 *.
class Num' a b where
  (+.) :: a -> b -> a
  (-.) :: a -> b -> a
  (*.) :: a -> b -> a
--  (/.) :: a -> b -> a
instance (KnownNat lo, KnownNat hi, Num a, Ord a, Show a) => Num' (RgVld lo hi a) a where
  (RgVld l) +. r = chkVld "(+.)" $ RgVld $ l+r
  (RgVld l) -. r = chkVld "(-.)" $ RgVld $ l-r
  (RgVld l) *. r = chkVld "(*.)" $ RgVld $ l*r
instance (KnownNat lo, KnownNat hi, Show a) => Show (RgVld lo hi a) where
   show (RgVld v) = show v
instance (KnownNat lo, KnownNat hi,  Num a, Ord a, Show a, Read a) => Read (RgVld lo hi a) where
   readsPrec w = \s -> case readsPrec w s of
                    [] -> []
                    [(v,s')] -> [(chkVld "readsPrec" $ RgVld v, s')]
instance (KnownNat lo, KnownNat hi,  Num a, Ord a, Show a) => Bounded (RgVld lo hi a) where
   minBound = fromInteger $ natVal (Proxy :: Proxy lo)
   maxBound = fromInteger $ natVal (Proxy :: Proxy hi)
-- examples   
ab:: Int -> RgVld 1 20 Int
ab = RgVld
fuel:: Double -> RgVld 0 15 Double
fuel = RgVld 

Also popular now: