Numeric Type Classes in Rust

Published on October 07, 2014

Numeric Type Classes in Rust

    Rust abstractions are different from the usual in OOP. In particular, instead of classes (classes of objects), classes of types called “trait” are used (not to be confused with trait from Scala, where impurities - mixin are hidden under this term ).
    Type classes are not unique to Rust; they are supported in Haskell, Mercury, Go, and can be implemented in a slightly perverse way in Scala and C ++.

    I want to show how they are implemented in Rust using the example of dual numbers and analyze individual non-trivial (or poorly worked out) moments.

    Interfaces of numeric types are rather cumbersome, and I will only insert fragments of code here. All code is available on github (Update: a working version is available on crates.io)
    Most of the interfaces implemented here have the status of experemental or unstable and are likely to change. I will try to keep the code and text relevant.

    Rust supports operations overloading, but, unlike C ++, operations have a synonym method with a common alphabetic name. So a + b can be written a.add (b) , and to override the operation '+' you just need to implement the add method.


    What is a type class?

    A type class is often compared to an interface. Indeed, it determines what can be done with some data types, but these operations must be implemented separately. Unlike the interface, the implementation of the type class for a certain type does not create a new type, but lives with the old one, although the old type about the implemented interface may not know anything. In order for the code using this interface to work with this data type, it is not necessary to edit the data type, nor the interface, nor the code - it is enough to implement an interface for the type.

    Unlike an OOP-style interface, a type class can refer to a type several times. In Rust, this link is called Self., in Haskell it can be called almost anything. For example, in Haskell, the '+' method requires that both arguments have exactly the same type and expect an object of exactly the same type to be returned (in Rust, in the Add type class these types can be different - in particular, Duration and Timespec can be added). The type of the return value is also important - in the arguments the type from the class may not be used at all, and the compiler decides which implementation of the method to use based on which type to get. For example, Rust has a Zero type class and code
    let float_zero:f32 = Zero::zero();
    let int_zero:i32 = Zero::zero();
    assigns different zeros to variables of different types.

    How it is done in Rust

    Description

    A type class is created with the trait keyword, followed by a name (possibly with parameters, something like in C ++) and a list of methods. A method may have a default implementation, but such an implementation does not have access to type internals and should use other methods (for example, express inequality ( ! = , Ne ) through negating equality).
    pub trait PartialEq {
        /// This method tests for `self` and `other` values to be equal, and is used by `==`.
        fn eq(&self, other: &Self) -> bool;
        /// This method tests for `!=`.
        #[inline]
        fn ne(&self, other: &Self) -> bool { !self.eq(other) }
    }
    Here is a description of the type class from the standard library, which includes types that can be compared for equality.
    The first argument, called self or & self of each method is an analog of this from the classic OOP. The presence of an ampersand indicates a method of transferring ownership of an object and, unlike C ++, does not affect the ability to change it (passing by reference or by value). The right to modify the object gives explicit indication mut.
    The second argument must be of the same type as the first - this is indicated by Self .
    Later we will encounter the fact that this argument is not necessary - it turns out something like static methods, although in reality they still remain “dynamic” - the dispatching is carried out according to other parameters or according to the type of expected result.
    pub trait Add<RHS,Result> {
        /// The method for the `+` operator
        fn add(&self, rhs: &RHS) -> Result;
    }
    The '+' operation in Rust is not required to require the same types of arguments and results. For this, the type class is made template: the template arguments are the types of the second argument and result.
    For comparison, in Haskell, type classes are not parameterized (except by the type itself), but can contain not separate types, but pairs, triples and other types of types (extension MultiParamTypeClasses), which allows you to do similar things. They promise to add support for this feature to the release in Rust.
    It is worth paying attention to the syntactical difference from C ++ - the description of an entity in Rust (in this case, a type class) is itself a template, and in C ++ a template is declared separately using a keyword. The C ++ approach, in some ways, is more logical, but more difficult to read.
    Consider another Zero example :
    pub trait Zero: Add<Self, Self> {
        /// Returns the additive identity element of `Self`, `0`.
        ///
        /// # Laws
        ///
        /// ```{.text}
        /// a + 0 = a       ∀ a ∈ Self
        /// 0 + a = a       ∀ a ∈ Self
        /// ```
        ///
        /// # Purity
        ///
        /// This function should return the same result at all times regardless of
        /// external mutable state, for example values stored in TLS or in
        /// `static mut`s.
        // FIXME (#5527): This should be an associated constant
        fn zero() -> Self;
        /// Returns `true` if `self` is equal to the additive identity.
        #[inline]
        fn is_zero(&self) -> bool;
    }
    You can see the inheritance in the description of this type class - to implement Zero, you must first implement Add (parameterized by the same type). This is the usual inheritance of interfaces without implementation. Multiple inheritance is also allowed, for this the ancestors are listed through '+'.
    Pay attention to the fn zero () -> Self method ; . This can be considered as a static method, although later we will see that it is somewhat more dynamic than static methods in OOP (in particular, they can be used to implement "factories").

    Implementation

    Consider the Add implementation for complex numbers:
    impl<T: Clone + Num> Add<Complex<T>, Complex<T>> for Complex<T> {
        #[inline]
        fn add(&self, other: &Complex<T>) -> Complex<T> {
            Complex::new(self.re + other.re, self.im + other.im)
        }
    }
    Complex numbers are a generalized type parameterized by the representation of a real number. The implementation of addition is also parametrized - it is applicable to complex numbers over various variants of real ones, if an interface is implemented for these real numbers. In this case, the required interface is too rich - it assumes the implementation of Clone (allowing you to create a copy) and Num (containing basic operations on numbers, in particular the inheriting Add ).

    Deriving

    If you are too lazy to write implementations of simple standard interfaces yourself, this routine work can be passed to the compiler using the deriving directive.
    #[deriving(PartialEq, Clone, Hash)]
    pub struct Complex<T> {
        /// Real portion of the complex number
        pub re: T,
        /// Imaginary portion of the complex number
        pub im: T
    }
    Here, the developers of the library are asked to create an implementation of the PartialEq, Clone, and Hash interfaces if type T supports everything you need.
    Currently, implementation auto-generation is supported for classes of types Clone, Hash, Encodable, Decodable, PartialEq, Eq, PartialOrd, Ord, Rand, Show, Zero, Default, FromPrimitive, Send, Sync and Copy.

    Numeric Type Classes

    The std :: num module describes a large number of type classes associated with different properties of numbers.
    They can refer to some other traits - for comparison and memory operations (for example, Copy tells the compiler that this type can be copied byte-by-bit).
    I highlighted the interfaces that I implemented for dual numbers in the diagram.

    The implementation of dual numbers

    The data type is trivially arranged:
    pub struct Dual<T> {
      pub val:T,
      pub der:T
    }

    Unlike complex numbers from the standard library, I tried to implement the interface based on minimal assumptions. So the implementation of Add for me requires only the Add interface of the source type, and Mul - only Mul + Add.
    Sometimes this led to strange code. For example, Signed is not required to support Clone, and to return a copy of it for a positive dual in the abs method, we had to add it to zero
    impl<T:Signed> Signed for Dual<T> {
     fn abs(&self) -> Dual<T> {
       if self.is_positive() || self.is_zero() {
        self+Zero::zero() // XXX: bad implementation for clone
       } else if self.is_negative() {
        -self
       } else {
         fail!("Near to zero")
       }
     }
    }
    Otherwise, the compiler cannot trace ownership of this object.
    Note that the type Zero :: zero () is not explicitly set. The compiler guesses what it should be by trying to add self , which implements Num , and therefore Add <Self, Self> . But the type Self at the time of compilation is not yet known - it is set by the template parameter. So, the zero method is dynamically located in the table of Num implementation methods for Dual <T> !

    I’ll also note an interesting trick, how Float implements integer constants characterizing the whole type. That is, they cannot receive an instance at the input (it may not be in the right context), but must be an analog of static methods. The same problem often arises in Haskell, and to solve it, a fake parameter with the required type is added to such methods. Haskell language is lazy and you can always pass the error "Not used" as an unused argument . In the strict Rust language, this technique does not work, and creating an object for this can be too expensive. Therefore, a workaround is used - None of type Option <Self> is passed
    #[allow(unused_variable)]
    impl<T:Float> Float for Dual<T> {
     fn mantissa_digits(_unused_self: Option<Dual<T>>) -> uint {
      let n: Option<T> = None;
      Float::mantissa_digits(n)
     }
    }
    Since the parameter is not used, by default the compiler generates a warning. There are two ways to suppress it - starting the parameter name with the '_' character or using the # [allow (unused_variable)] directive.