Functional languages ​​in hardware development

    Functional languages, as a rule, are not very suitable for low-level programming, although they are used for code generation.

    Project Examples
    generation of safe C code (used by Kaspersky Lab) Ivory , support for reactive programming in Arduino , and so on Atom , Ion

    But if you go down even lower to the hardware level, then suddenly the FP is very useful. After all, the block of combinatorial logic is nothing more than a function of the values ​​of the incoming signals to the values ​​of the outgoing ones, and for consistent logic it is enough to add an old and new state to the parameters and result.

    When I first studied Haskell, I joined one heated discussion "on what is better to simulate an RS-trigger." I immediately noticed that the language I freshly studied solves all the problems that pop up in this discussion.

    Modeling involves observing the evolution of a model’s state over time, but there is no mutable state in Haskell as such. For that there are lazy lists that turn into “horizontal time”.

    What time?
    This ridiculous term was coined by one of the participants, who was very surprised that it was more convenient for me to put the entire evolution of one signal on the line, and the next evolution the whole evolution of the other signal (though on large models this format will lead to serious memory overruns and it’s better to make efforts to print the result as in ordinary languages ​​- data output is the most difficult in Haskell). In other matters, he even liked this format, since it is more like signals on an oscilloscope, compared to printing the values ​​of all signals in one line for each clock cycle.

    An easy way to simulate signals is to present them with lists of values ​​at any given time. If one signal is equal to another with an offset of one quantum in time, we simply add 0 to the beginning of the list:

    delay s = 0:s
    

    Or so
    delay = 0:
    


    You can create your own type for signals - this is more efficient, safer and more correct, but for simplicity we will restrict ourselves to using simple lists for now.

    data Signal v = S v (Signal v)
    delay v s = S v s
    

    If accurate modeling of the operating time is required, then the signal can be represented by a list of pairs (time interval, signal value) and a representation of transient values ​​can be provided.


    An RS trigger is two NOR nodes connected mutually recursively. This system has two stable states in which the output of one NOR is one and the other is zero. By supplying a unit to the second input of one of the NOR nodes, one can switch states.

    Generally speaking, an RS trigger is an asynchronous circuit. But for simplicity of the example, we will simulate it as synchronous, which is not entirely true (even choosing a short “measure” size is difficult to simulate transients, it is better to use a different representation of the signal).

    nor '_' '_' = '~'
    nor _ _ = '_'
    rs r s = (q, nq)
      where 
        q = '_' : zipWith nor r nq
        nq = '_' : zipWith nor q s
    main = let
       r = "~_______"
       s = "___~~___"
       (q,nq) = rs r s
      in do
          print r
          print s
          print q
          print nq
    

    Haskell Quick Reference
    The names of variables (more precisely constants, since they cannot be changed in the limit of the scope) and functions begin with a small letter and consist of alphanumeric characters or consist of special characters and do not start with ':'.

    Capital letters (or with ':') begin with the names of types, constructors (we can assume that these are the names of constants in enumerations) and the names of modules.

    (:) - list constructor. Creates a new list, adding one element to the old one at the beginning.
    0 : [1,2,3,4,5]
    equivalent to [0,1,2,3,4,5]
    Strings in Haskell are represented as a list of characters. “1234” means the same as ['1', '2', '3', '4']

    zip - turns two lists into a list of pairs.

    zip [1,2,3,4] "1234"
    buden equal
    [(1,'1'),(2,'2'),(3,'3'),(4,'4')]


    zipWith applies a function to items from two lists

    zipWith (+) [1,2,3,4] [1,3,5,7]
    will calculate the element-wise sum of lists [2,5,8,11]
    zip is expressed through zipWith
    zip = zipWith (,)
    


    zip3 and zipWith3 work similarly, but for three lists.

    scanl applies a function to each element of the cumulative list. Its type (signature) is described as follows:
    scanl :: (b -> a -> b) -> b -> [a] -> [b]

    The first argument to scanl is a function of two arguments, the second is the initial value of the accumulator, and the third is the input list.
    scanl (+) 0 [1,2,3,4]
    calculate a list of partial sums: [0,1,3,6,10]

    ($) - postfix record of applying the function to the argument.
    f $ x = f x

    It is often used to write fewer brackets:
    fx $ gy is equivalent to fx (gy)

    The notation \ xy -> fyx means an anonymous function (also called closure) with parameters x and y.

    Next, there are several obscure terms. I hope they do not scare the reader. Even if this description is too complicated, it will be easy to figure out how to use these functions using examples.

    fmap - “raises” the function from a single value to a function above the whole container. A container should be a functor, but almost everything is. In particular, such containers are signals storing values ​​for each moment in time. Lists are also such containers, although for historical reasons there is a special “map" function with the same functionality for them.
    liftA is the same as fmap, but for applicative functors (as evidenced by the letter 'A' in the name). Signals are also applicative functors, and lists are more complicated. Formally, lists are also applicative functors and liftA works with them in the expected way. But liftA2 and liftA3 behave unexpectedly, but this is a topic for a separate article.

    liftA2 and liftA3 “lift” functions from two and three arguments to functions from containers. They will work with signals, and for lists it is better to use zipWith and zipWith3.

    This approach makes it relatively easy to simulate quite complex schemes at the RTL level . The clock signal is clearly not present, but is implied wherever it is needed. Registers can be modeled with a delay or by explicitly providing a state in the parameters and the return value of the node code.

    macD r x y = acc
      where
        prods = zipWith (*) x y
        sums = zipWith (+) acc prods
        acc = 0 : zipWith (\r v -> if r == 1 then 0 else v) r sums
    macS r x y = scanl macA 0 $ zip3 r x y
      where
       macA acc (r,x,y) = if r == 1 then 0 else acc+x*y
    

    Two equivalent models of MAC operation (multiplication with addition) with a battery are described here. macD - using a recursive delayed signal, macS - using an explicitly described state.

    If the Haskell subset simulates synchronous hardware so well, why not synthesize HDL from it? There are several compiler extension projects that allow this: commercial Bluespec , free Lava, and CλaSH .

    Clash


    As an example, I want to consider Clash, since it can compile both in VHDL, in SystemVerilog, and in the good old Verilog (which attracts me because it is used not only in microelectronics :)

    The installation process is described in sufficient detail on the site. It should be taken carefully - firstly, compatibility with ghc-7.x is declared (that is, it may not work with 8.x), secondly, you should not try to run “cabal install clash” - this is an outdated package, you need to install clash-ghc ( "Cabal install clash-ghc --enable-documentation").

    The clash executable file (or clash.exe, depending on the OS) will be installed in the "~ / .cabal / bin" directory, it is better to add it to $ PATH.

    The main node from which clash starts compilation is called topEntity, which is a function from an incoming signal to an outgoing one (naturally, signals can be composite).

    For example, consider a single-bit adder:

    topEntity :: Signal (Bool, Bool) -> Signal (Bool, Bool)
    topEntity s = fmap (\(s1,s2) -> (s1 .&. s2, s1 `xor` s2)) s
    

    Whole file
    module ADD1 where
    import CLaSH.Prelude
    topEntity :: Signal (Bool, Bool) -> Signal (Bool, Bool)
    topEntity = fmap (\(s1,s2) -> (s1 .&. s2, s1 `xor` s2))
    

    fmap turns a function from a pair of logical quantities into a function of a signal. You can compile the file in verilog with the command “clash --verilog ADD1.hs”

    Result
    // Automatically generated Verilog-2001
    module ADD1_topEntity_0(a1
                           ,result);
      input [1:0] a1;
      output [1:0] result;
      wire [0:0] app_arg;
      wire [0:0] case_alt;
      wire [0:0] app_arg_0;
      wire [1:0] case_alt_0;
      wire [0:0] s1;
      wire [0:0] s2;
      assign app_arg = s1 & s2;
      reg [0:0] case_alt_reg;
      always @(*) begin
        if(s2)
          case_alt_reg = 1'b0;
        else
          case_alt_reg = 1'b1;
      end
      assign case_alt = case_alt_reg;
      reg [0:0] app_arg_0_reg;
      always @(*) begin
        if(s1)
          app_arg_0_reg = case_alt;
        else
          app_arg_0_reg = s2;
      end
      assign app_arg_0 = app_arg_0_reg;
      assign case_alt_0 = {app_arg
                          ,app_arg_0};
      assign s1 = a1[1:1];
      assign s2 = a1[0:0];
      assign result = case_alt_0;
    endmodule
    

    To work with the state, you can use the Moore and Miles machines. Consider the frequency divider, first using the Moore automaton.

    data DIV3S = S0 | S1 | S2
    div3st S0 _ = S1
    div3st S1 _ = S2
    div3st S2 _ = S0
    div3out S2 = True
    div3out _ = False
    topEntity :: Signal Bool -> Signal Bool
    topEntity = moore div3st div3out S0
    

    data is a Haskell construct describing a data type. In this program, we describe the type of DIV3S representing the state of our machine. Possible values ​​of this type are listed through '|' - S0, S1 and S3.
    div3st - a state function (the symbol "_" is used to mean an unused parameter, in this case, the value of the input signal).
    div3out - function from state to value of output signal.

    The moore library function creates a node based on these two functions and the initial state.

    Output systemverilog
    // Automatically generated SystemVerilog-2005
    module DIV3Moore_moore(w3
                          ,// clock
                          system1000
                          ,// asynchronous reset: active low
                          system1000_rstn
                          ,result);
      input logic [0:0] w3;
      input logic system1000;
      input logic system1000_rstn;
      output logic [0:0] result;
      logic [1:0] s1_app_arg;
      logic [1:0] s1;
      always_comb begin
        case(s1)
          2'b00 : s1_app_arg = 2'd1;
          2'b01 : s1_app_arg = 2'd2;
          default : s1_app_arg = 2'd0;
        endcase
      end
      // register begin
      logic [1:0] dout;
      always_ff @(posedge system1000 or negedge system1000_rstn) begin : DIV3Moore_moore_register
        if (~ system1000_rstn) begin
          dout <= 2'd0;
        end else begin
          dout <= s1_app_arg;
        end
      end
      assign s1 = dout;
      // register end
      always_comb begin
        case(s1)
          2'b10 : result = 1'b1;
          default : result = 1'b0;
        endcase
      end
    endmodule
    

    Same thing with the Miles assault rifle:

    data DIV3S = S0 | S1 | S2
    div3 S0 _ = (S1, False)
    div3 S1 _ = (S2, False)
    div3 S2 _ = (S0, True)
    topEntity :: Signal Bool -> Signal Bool
    topEntity = mealy div3 S0
    

    VHDL Output
    -- Automatically generated VHDL-93
    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.NUMERIC_STD.ALL;
    use IEEE.MATH_REAL.ALL;
    use std.textio.all;
    use work.all;
    use work.div3mealy_types.all;
    entity div3mealy_mealy is
      port(w2              : in boolean;
           -- clock
           system1000      : in std_logic;
           -- asynchronous reset: active low
           system1000_rstn : in std_logic;
           result          : out boolean);
    end;
    architecture structural of div3mealy_mealy is
      signal y         : boolean;
      signal result_0  : div3mealy_types.tup2;
      signal x         : unsigned(1 downto 0);
      signal x_app_arg : unsigned(1 downto 0);
      signal x_0       : unsigned(1 downto 0);
    begin
      result <= y;
      y <= result_0.tup2_sel1;
      with (x) select
        result_0 <= (tup2_sel0 => to_unsigned(1
                                             ,2)
                    ,tup2_sel1 => false) when "00",
                    (tup2_sel0 => to_unsigned(2,2)
                    ,tup2_sel1 => false) when "01",
                    (tup2_sel0 => to_unsigned(0,2)
                    ,tup2_sel1 => true) when others;
      -- register begin
      div3mealy_mealy_register : process(system1000,system1000_rstn)
      begin
        if system1000_rstn = '0' then
          x <= to_unsigned(0,2);
        elsif rising_edge(system1000) then
          x <= x_app_arg;
        end if;
      end process;
      -- register end
      x_app_arg <= x_0;
      x_0 <= result_0.tup2_sel0;
    end;

    Clash uses fixed-size vectors instead of lists, and most library functions are redefined to work with them. You can get to the standard list functions by adding a line to the file (or by executing in REPL) import qualified Data.List as L. After that, you can use the functions by explicitly specifying the prefix “L.” for instance

    *DIV3Mealy L> L.scanl (+) 0 [1,2,3,4]
    [0,1,3,6,10]

    Most familiar list functions work with vectors.

    *DIV3Mealy L> scanl (+) 0 (1 :> 2 :> 3 :> 4 :> Nil)
    <0,1,3,6,10>
    *DIV3Mealy L> scanl (+) 0 $(v [1,2,3,4])
    <0,1,3,6,10>

    But there are many subtleties, for details it is worth referring to the documentation .

    A guide with examples can be found here .

    The site has examples of projects on Clash, in particular, the implementation of the 6502 processor .

    Prospects


    Haskell is a very powerful language, and it can be used to develop DSL, for example, to develop a device software interface (with generation, in addition to HDL, also through Ivory drivers and emulators for virtualization systems), or a description of architecture and microarchitecture (with generation of LLVM backend, which optimizes for this microarchitecture).

    I take this opportunity to express my gratitude to yuripanchul for organizing the publication of the textbook “Digital Circuitry and Computer Architecture”, which I am reading now, and which encouraged me to write this article.

    Also popular now: