Verilog. Digital filter on RAM

    What if you need to fit a large digital filter into an FPGA? What if the board has already been routed, the hardware is old, and there is little room left in the design? This post looks at one possible implementation of a digital FIR filter on an Altera Cyclone II EP2C15 FPGA. It is a continuation of an earlier post on this subject from the sandbox.
    It describes how to build a shift register in RAM, which saves logic elements (LEs), and how to turn that shift register into a digital filter.


    How does a filter work? The basic operation is multiply-accumulate (MAC): the filter coefficients are multiplied by the values in the shift register and the products are summed, y[n] = h[0]*x[n] + h[1]*x[n-1] + ... + h[N-1]*x[n-N+1]. That is all there is to it, if you skip the details. The ingredients have been named, so let's get down to business.

    Multiply-accumulate

    Assume we have already decided on the desired frequency response and the filter order, computed the coefficients, and know the input data rate. It is even better if all of this is parameterized in some way, so let's try to do that. Here is my implementation of multiply-accumulate:
    module mult
    #(parameter COEF_WIDTH = 24, parameter DATA_WIDTH = 16, parameter ADDR_WIDTH = 9, parameter MULT_WIDTH = COEF_WIDTH + DATA_WIDTH)
        (
        input   wire                                    clk,
        input   wire                                    en,
        input   wire            [ (ADDR_WIDTH-1) :  0 ] ad,
        input   wire    signed  [ (COEF_WIDTH-1) :  0 ] coe,
        input   wire    signed  [ (DATA_WIDTH-1) :  0 ] pip,
        output  wire    signed  [ (DATA_WIDTH-1) :  0 ] dout
        );
    wire signed [(MULT_WIDTH-1) :  0 ] mu = coe * pip;
    reg signed [ (MULT_WIDTH-1) :  0 ] rac = {(MULT_WIDTH){1'b0}};
    reg signed [ (DATA_WIDTH-1) :  0 ] ro = {DATA_WIDTH{1'b0}};
    assign dout = ro;
    always @(posedge clk)
    if(en)
        if(ad == {ADDR_WIDTH{1'b0}})
        begin
            rac <= mu;
            ro <= rac[ (MULT_WIDTH-2) -: (DATA_WIDTH) ];
        end
        else
            rac <= rac + mu;
    endmodule
    


    Why is ADDR_WIDTH = 9? Because the filter order was chosen to be 2^9 = 512. First, this makes it easy to derive the clock from a divider or a PLL. Second, I could afford to raise the clock by a factor of 512, because the sample rate was only 16 kHz (16 kHz x 512 = 8.192 MHz). More on that later. The code is admittedly not very readable because of the parameterization, but it can be figured out.

    Filter coefficients

    Did you read the sandbox post linked at the top? It contained a RAM template. That template no longer suits us: I could not get that RAM to read and write in a single clock cycle. Maybe that is just my lack of knowledge, but the filter coefficients are now stored in this module:

    module coef
    #(parameter DATA_WIDTH=24, parameter ADDR_WIDTH=9)
        (
        input wire [(DATA_WIDTH-1):0] data,
        input wire [(ADDR_WIDTH-1):0] addr,
        input wire we,
        input wire clk,
        output wire [(DATA_WIDTH-1):0] coef_rom
        );
    reg [DATA_WIDTH-1:0] rom[2**ADDR_WIDTH-1:0];
    reg [(DATA_WIDTH-1):0] data_out;
    assign coef_rom = data_out;
    initial
    begin
      rom[0  ] = 24'b000000000000000000000000;
      rom[1  ] = 24'b000000000000000000000001;
    // ... coefficients 2 to 509 omitted ...
      rom[510] = 24'b000000000000000000000001;
      rom[511] = 24'b000000000000000000000000;
    end
    always @ (posedge clk)
    begin
        data_out <= rom[addr];
        if (we)
            rom[addr] <= data;
    end
    endmodule


    About 508 coefficients are omitted here to keep things short. Why 24 bits and not 16? I like the resulting spectrum better, but that is not important. Changing the coefficients does not take long. Alternatively, you can load a memory initialization file with the $readmemb or $readmemh system task inside the initial block.
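
    As a rough illustration of the file-based option, here is a minimal sketch of a read-only coefficient memory loaded with $readmemh. The module name coef_file_sketch, the INIT_FILE parameter and the file name "coef.hex" are assumptions made for this example and are not part of the original project; the file is expected to hold one hexadecimal word per line.

```verilog
// Sketch only: coefficient memory initialized from a file instead of inline assignments.
// "coef.hex" is a hypothetical file name, not a file from the article's project.
module coef_file_sketch
#(parameter DATA_WIDTH = 24, parameter ADDR_WIDTH = 9, parameter INIT_FILE = "coef.hex")
    (
    input  wire                        clk,
    input  wire [(ADDR_WIDTH-1):0]     addr,
    output wire [(DATA_WIDTH-1):0]     coef_rom
    );
reg [DATA_WIDTH-1:0] rom[2**ADDR_WIDTH-1:0];
reg [(DATA_WIDTH-1):0] data_out;
assign coef_rom = data_out;
initial
begin
    // Load the memory contents from the file; use $readmemb for binary words
    $readmemh(INIT_FILE, rom);
end
// Registered read, same one-cycle latency as the coef module above
always @ (posedge clk)
    data_out <= rom[addr];
endmodule
```

    The write port is dropped here because the top-level file ties we to zero anyway; keep it if the coefficients have to be changed at run time.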

    Shift register

    This is actually the main reason I am writing this. Some readers will think they already knew all of it. Others may think less kindly of the author, something about reinventing the wheel.
    This section shows how to build a shift register out of RAM with the help of a wrapper. Probably everyone has read in the handbook for their FPGA that RAM can work as a shift register. How? I did it, and there is nothing complicated about it. But why? The Cyclone family is positioned as devices with an emphasis on memory: "devices feature embedded memory structures to address the on-chip memory needs of FPGA designs." And you need to know how to use that memory. The solution has two parts: a RAM and a wrapper. The RAM is similar to the one that stores the filter coefficients:

    module pip
    #(parameter DATA_WIDTH=16, parameter ADDR_WIDTH=9)
        (
        input wire [(DATA_WIDTH-1):0] data,
        input wire [(ADDR_WIDTH-1):0] read_addr, write_addr,
        input wire we,
        input wire clk,
        output wire [(DATA_WIDTH-1):0] pip_ram
        );
    reg [DATA_WIDTH-1:0] ram[2**ADDR_WIDTH-1:0];
    reg [(DATA_WIDTH-1):0] data_out;
    assign pip_ram = data_out;
    always @ (posedge clk)
    begin
        data_out <= ram[read_addr];
        if (we)
            ram[write_addr] <= data;
    end
    endmodule


    The only difference is that this RAM is not initialized, so it is automatically filled with zeros. By the way, the same trick can be used for the coefficient memory when there are fewer than 2^N coefficients: the locations that are never written simply stay zero.
    Now the wrapper itself:

    module upr
    #(parameter COEF_WIDTH = 24, parameter DATA_WIDTH = 16, parameter ADDR_WIDTH = 9) 
        (
        input wire                          clk,
        input wire                          en,
        input wire  [ (DATA_WIDTH-1) :  0 ] ram_upr,
        input wire  [ (DATA_WIDTH-1) :  0 ] data_in,
        output wire [ (DATA_WIDTH-1) :  0 ] upr_ram,
        output wire                         we_ram,
        output wire [ (ADDR_WIDTH-1) :  0 ] adr_out
        );
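    // One-hot state encoding; state2 is reserved and currently unused
    localparam      state0 = 3'b001,
                    state1 = 3'b010,
                    state2 = 3'b100;
    // Registers are declared before they are used in the assigns below
    reg [  2 :  0 ] r_state = state0;
    reg [ (ADDR_WIDTH-1) :  0 ] r_adr = {ADDR_WIDTH{1'b0}};
    // At address zero take the new input sample, otherwise the word read back from RAM
    assign upr_ram = (r_adr == {ADDR_WIDTH{1'b0}}) ? data_in : ram_upr;
    assign we_ram = (r_state == state1) ? 1'b1 : 1'b0;
    assign adr_out = r_adr;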
    always @(posedge clk)
    if(en)
    begin
        case(r_state)
            state0:
                r_state <= state1;
            state1:
                r_state <= state1;
            state2:
                begin
                end
        endcase
    end
    always @(posedge clk)
    case(r_state)
        state0:
            r_adr <= {ADDR_WIDTH{1'b0}};
        state1:
            r_adr <= r_adr + 1'b1;
        state2:
            begin
            end
    endcase
    endmodule

    The same address is fed to both the coefficient RAM and the shift-register RAM. The value read from the previous address is fed back from the shift-register RAM into the wrapper and written to the current address. So the shift does not happen in a single clock cycle; instead, one value is moved per cycle. At address zero, the new input word is written instead.
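
    To make the mechanism easier to follow, here is a compact sketch that folds the RAM and the address counter into a single module. It is not the article's pip/upr pair: the name ram_shift_sketch and its signals are made up for illustration, the FSM is left out, and the zero-filling initial loop is there only so that simulation does not start with unknown values.

```verilog
// Sketch only: a shift register emulated in RAM, one word moved per clock.
// Names are illustrative and do not come from the article's project.
module ram_shift_sketch
#(parameter DATA_WIDTH = 16, parameter ADDR_WIDTH = 9)
    (
    input  wire                        clk,
    input  wire [(DATA_WIDTH-1):0]     data_in,  // new sample, stored when addr is zero
    output wire [(DATA_WIDTH-1):0]     tap       // word read on the previous clock
    );
reg [DATA_WIDTH-1:0] ram[2**ADDR_WIDTH-1:0];
reg [(DATA_WIDTH-1):0] rd   = {DATA_WIDTH{1'b0}};
reg [(ADDR_WIDTH-1):0] addr = {ADDR_WIDTH{1'b0}};
integer i;
// Zero fill for simulation
initial
    for (i = 0; i < 2**ADDR_WIDTH; i = i + 1)
        ram[i] = {DATA_WIDTH{1'b0}};
assign tap = rd;
always @ (posedge clk)
begin
    // Read the old contents of the current address
    rd <= ram[addr];
    // rd still holds the old contents of the previous address,
    // so this write moves that word one position further along
    ram[addr] <= (addr == {ADDR_WIDTH{1'b0}}) ? data_in : rd;
    // The counter wraps around after 2**ADDR_WIDTH - 1
    addr <= addr + 1'b1;
end
endmodule
```

    After one full sweep of the address counter every stored word has moved by one position and a new sample sits at address zero, which is exactly one step of a shift register that is 2**ADDR_WIDTH words long.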
    Why do I stubbornly use a state machine even though some of its states are unused? Recall the sandbox post mentioned at the very beginning. This module now runs twice as fast as that version, which means that, all else being equal, it is idle half the time. In theory that half could be put to use, for example to update the filter coefficients for adaptive filtering, or to run a second filter (something like a time slot). None of that is done here and the FSM is not really needed, but I left this vestige in anyway. It is always easier to remove an FSM than to add one.

    Summary

    Here is the top-level file generated from the schematic:

    module filtr_ram(
    	CLK,
    	D_IN,
    	MULT
    );
    input	CLK;
    input	[15:0] D_IN;
    output	[15:0] MULT;
    wire	SYNTHESIZED_WIRE_13;
    wire	[15:0] SYNTHESIZED_WIRE_1;
    wire	[8:0] SYNTHESIZED_WIRE_14;
    wire	SYNTHESIZED_WIRE_4;
    wire	[15:0] SYNTHESIZED_WIRE_15;
    wire	SYNTHESIZED_WIRE_6;
    wire	[0:23] SYNTHESIZED_WIRE_8;
    wire	[23:0] SYNTHESIZED_WIRE_11;
    assign	SYNTHESIZED_WIRE_4 = 1;
    assign	SYNTHESIZED_WIRE_6 = 0;
    assign	SYNTHESIZED_WIRE_8 = 0;
    pip	b2v_inst(
    	.we(SYNTHESIZED_WIRE_13),
    	.clk(CLK),
    	.data(SYNTHESIZED_WIRE_1),
    	.read_addr(SYNTHESIZED_WIRE_14),
    	.write_addr(SYNTHESIZED_WIRE_14),
    	.pip_ram(SYNTHESIZED_WIRE_15));
    	defparam	b2v_inst.ADDR_WIDTH = 9;
    	defparam	b2v_inst.DATA_WIDTH = 16;
    upr	b2v_inst1(
    	.clk(CLK),
    	.en(SYNTHESIZED_WIRE_4),
    	.data_in(D_IN),
    	.ram_upr(SYNTHESIZED_WIRE_15),
    	.we_ram(SYNTHESIZED_WIRE_13),
    	.adr_out(SYNTHESIZED_WIRE_14),
    	.upr_ram(SYNTHESIZED_WIRE_1));
    	defparam	b2v_inst1.ADDR_WIDTH = 9;
    	defparam	b2v_inst1.COEF_WIDTH = 24;
    	defparam	b2v_inst1.DATA_WIDTH = 16;
    coef	b2v_inst3(
    	.we(SYNTHESIZED_WIRE_6),
    	.clk(CLK),
    	.addr(SYNTHESIZED_WIRE_14),
    	.data(SYNTHESIZED_WIRE_8),
    	.coef_rom(SYNTHESIZED_WIRE_11));
    	defparam	b2v_inst3.ADDR_WIDTH = 9;
    	defparam	b2v_inst3.DATA_WIDTH = 24;
    mult	b2v_inst5(
    	.clk(CLK),
    	.en(SYNTHESIZED_WIRE_13),
    	.ad(SYNTHESIZED_WIRE_14),
    	.coe(SYNTHESIZED_WIRE_11),
    	.pip(SYNTHESIZED_WIRE_15),
    	.dout(MULT));
    	defparam	b2v_inst5.ADDR_WIDTH = 9;
    	defparam	b2v_inst5.COEF_WIDTH = 24;
    	defparam	b2v_inst5.DATA_WIDTH = 16;
    endmodule


    You can see right away what could be tidied up to make it prettier.
    Now, once more, about what we ended up with. The main drawback is that this is a fully serial filter: the filter clock must be 2^ADDR_WIDTH times higher than the input data rate. If the impulse response of the filter is symmetric, this can be eased. The shift-register RAM has to be split into two modules driven by two separate addresses, and the two values read from RAM are added and then multiplied in the mult module, which needs one more input. The clock then only has to be raised 2^(ADDR_WIDTH-1) times.
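
    A possible shape for that modified mult module is sketched below. This is an assumption about how the idea could be implemented, not code from the project: the name mult_sym_sketch, the second data input pip_b and the SUM_WIDTH parameter are invented here, and the output slice keeps the same bit positions as the original mult module, which may not be the scaling you want.

```verilog
// Sketch only: MAC with a pre-adder for a symmetric impulse response.
// pip_a and pip_b are the two samples that share one coefficient.
module mult_sym_sketch
#(parameter COEF_WIDTH = 24, parameter DATA_WIDTH = 16, parameter ADDR_WIDTH = 8,
  parameter SUM_WIDTH = DATA_WIDTH + 1, parameter MULT_WIDTH = COEF_WIDTH + SUM_WIDTH)
    (
    input   wire                                    clk,
    input   wire                                    en,
    input   wire            [ (ADDR_WIDTH-1) :  0 ] ad,
    input   wire    signed  [ (COEF_WIDTH-1) :  0 ] coe,
    input   wire    signed  [ (DATA_WIDTH-1) :  0 ] pip_a,
    input   wire    signed  [ (DATA_WIDTH-1) :  0 ] pip_b,
    output  wire    signed  [ (DATA_WIDTH-1) :  0 ] dout
    );
// One extra bit so that the sum of two samples cannot overflow
wire signed [ (SUM_WIDTH-1)  :  0 ] sum = pip_a + pip_b;
wire signed [ (MULT_WIDTH-1) :  0 ] mu  = coe * sum;
reg  signed [ (MULT_WIDTH-1) :  0 ] rac = {(MULT_WIDTH){1'b0}};
reg  signed [ (DATA_WIDTH-1) :  0 ] ro  = {DATA_WIDTH{1'b0}};
assign dout = ro;
always @(posedge clk)
if(en)
    if(ad == {ADDR_WIDTH{1'b0}})
    begin
        // Start a new accumulation and latch the finished result
        rac <= mu;
        ro <= rac[ (COEF_WIDTH+DATA_WIDTH-2) -: (DATA_WIDTH) ];
    end
    else
        rac <= rac + mu;
endmodule
```

    ADDR_WIDTH defaults to 8 here because only half of the 512 coefficients need to be stored and stepped through; the two shift-register RAMs and their addressing are not shown.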

    Sources and project in Quartus 9.0
    ifolder.ru/27556340
