FPGA - my first steps

    Recently I finally took my first step towards FPGAs and called on you to follow me. My fanatical passion for FPGAs, and the idea that FPGAs are the best platform for building any device, has become almost religious. My sect of FPGA devotees preaches a complete rejection of microcontrollers, and a particularly extremist branch preaches rejecting not only soft processors but sequential computing in general!

    As always, solving real problems helped me comprehend the truths. In today's sermon I would like to talk about the trials that fall to the lot of a young FPGA developer. By overcoming trials, we comprehend the truth. But there are questions I did not find answers to. So I would very much like my brothers on Habr, the experienced FPGA developers, to take part in the discussion and extend a helping hand to their younger brothers.

    This article is for beginners. In it I describe typical problems, questions, misconceptions and mistakes that can come up at the very beginning of learning (because I ran into them myself). The scope of the article is limited, however: development is done for Altera FPGAs, in Quartus, in Verilog.

    It’s hard to live without doing anything, but we are not afraid of difficulties!

    One of the reasons many people do not start learning Verilog right now is that they have no real FPGA. Some cannot order one because it is expensive, some because they do not know which one to pick (the choice was discussed in the previous article). For some, the FPGA is still in the mail.

    But in my own work I came to the conclusion that I need a real FPGA only at the final stage of development, when the project has to be tested "in hardware". Most of the time I spend debugging my code in simulators.

    So my advice: not having an FPGA is no reason to stay idle. Write and debug FPGA modules in simulators!

    Simulator for Verilog

    So, how do you amuse yourself during long, boring working days (if you have them)? By mastering FPGAs, of course! But how do you get the whole Altera development environment onto a work machine if it is three times the size of the monthly Internet quota at work? You can bring it on a flash drive! But if the subject of study is Verilog, you can get by with a text editor and the Icarus Verilog compiler, and look at the result in GTKWave.

    Try it now
    To get started in Windows, just download the iverilog-20130827_setup.exe installation file (development snapshot) from the link http://bleyer.org/icarus/ [11.2MB]

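    Installation is straightforward. Now let's jump ahead a little: create a folder for the project and, in it, a couple of files whose contents are not yet clear:

    Testbench file with the module testing code - bench.v
    `timescale 1ns / 100ps
    module testbench();
    reg clk;
    initial begin
      $dumpfile("test.vcd");        // waveforms for GTKWave
      $dumpvars(0, testbench);
      clk <= 0;
      repeat (100) begin
        #10 clk <= 1;               // the #10 delay values are assumed
        #10 clk <= 0;
      end
      $display("finish");           // debug output
    end
    endmodule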

    The bench.v file describes the testbench module. It creates a test signal source, clk (a square wave). Other modules are created in separate files, or the logic can first be tried out in this module and then moved to a module of its own. Instances of these modules are then added to the testbench module, where we drive test signals into their inputs and read the results. Modules can be arranged into a hierarchy; I think that is clear to everyone.

    A BAT file that compiles and simulates the top-level module, picking up other modules from the current folder - makev.bat
    iverilog -o test -I./ -y./ bench.v
    vvp test

    After running this file we will see on the screen the text passed to $display (this is the debug output); the values of the signals and registers of the circuit will be in the test.vcd file. Click on the file and choose the program to view it with - GTKWave (in my case D:\iverilog\gtkwave\bin\gtkwave.exe). A couple more clicks and we see our clk.
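    As a sketch of the next step described above, here is a hypothetical module under test together with a testbench that instantiates it. The `counter` module, its ports and all timing values are my own illustration, not code from the article. If the module is saved as counter.v next to the testbench, the `-y./` option in makev.bat should let Icarus Verilog find it by name; both modules can also simply live in one file.

```verilog
// counter.v - hypothetical module under test (illustration only)
module counter(
  input  wire       clk,
  input  wire       rst,
  output reg  [3:0] value
);
  // Synchronous reset, otherwise count up by one on every rising edge
  always @(posedge clk) begin
    if (rst)
      value <= 4'd0;
    else
      value <= value + 4'd1;
  end
endmodule

// bench.v - testbench that drives the counter
`timescale 1ns / 100ps
module testbench();
  reg        clk;
  reg        rst;
  wire [3:0] value;

  // Instance of the module under test
  counter dut(.clk(clk), .rst(rst), .value(value));

  // Free-running clock with a 20 ns period (assumed value)
  always #10 clk = ~clk;

  initial begin
    $dumpfile("test.vcd");       // waveforms for GTKWave
    $dumpvars(0, testbench);
    clk = 0;
    rst = 1;
    #25 rst = 0;                 // release reset after the first clock edge
    #400;
    $display("value = %d", value);
    $finish;
  end
endmodule
```

    The testbench uses "#" delays freely because it only ever runs in the simulator; the `counter` module itself contains none.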

    In practice, I write each new module in a text editor and debug it in Icarus Verilog. The next step after that is checking the modules in Quartus. Quartus has its own simulator too, but I use it less often. The reason is how easy it is to update the code and view the result with Icarus Verilog: save the changes to the file, run the BAT file, press the "update" button in GTKWave - that's it! In ModelSim this takes a few more steps, but it is not bad either, especially for complex designs.

    After simulation it is time to launch Quartus. But it is still too early to load the configuration into the FPGA. First you need to make sure that the divine computer correctly understood what circuit we want when we set out our thoughts in Verilog.

    The difference between simulation and work in real hardware

    At first, like a blind kitten, I kept banging my head against the doorposts. Seemingly correct code does not work at all, or does not work the way you expect. Or it was working just a moment ago and now it has suddenly stopped!

    An inquiring kitten begins to look for a relationship between his actions and the result ( “pigeon superstition” ).

    Biggest drama

    Below is a list of oddities, but first the biggest drama I have come across: not all Verilog constructs can be synthesized into hardware. This is because Verilog describes not only hardware logic, which is combined into modules and runs in hardware. Verilog also describes test modules, which tie the modules under test together, drive test signals into their inputs, and exist only for testing on a computer. Changes of signal values over time are specified by constructs containing the "#" sign, which means a time delay. This is how the clk signal is generated in the example above.

    I sinfully thought that a real FPGA could generate, say, the sequence of bits for sending a message over RS232 in the same way. After all, a signal from a 50 MHz oscillator is fed to an FPGA input! Maybe the synthesizer somehow uses it as a reference. As it turned out, I am not the only one who hoped for a miracle: 1, 2, 3, 4, 5. Reality, as always, is harsher: an FPGA is a set of logic, and a time delay can appear in it only by using a counter that is incremented on the oscillator's clock cycles up to a given value, or in some other way (but always in hardware).
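    A minimal sketch of such a hardware delay, assuming a 50 MHz clock as mentioned above. The module name, the port names and the `CYCLES` parameter are hypothetical, and the article's own implementation may differ. The point is only that a "#" delay is replaced by counting clock cycles.

```verilog
// Hypothetical counter-based delay: after a pulse on start, a one-cycle
// pulse on done appears roughly CYCLES clock cycles later.
module delay_counter #(
  parameter CYCLES = 50_000_000   // about 1 second at 50 MHz (assumed clock)
)(
  input  wire clk,
  input  wire rst,     // synchronous reset
  input  wire start,
  output reg  done
);
  reg [31:0] count;
  reg        running;

  always @(posedge clk) begin
    if (rst) begin
      count   <= 32'd0;
      running <= 1'b0;
      done    <= 1'b0;
    end else begin
      done <= 1'b0;                       // done is high for one cycle only
      if (start && !running) begin
        running <= 1'b1;                  // begin counting
        count   <= 32'd0;
      end else if (running) begin
        if (count == CYCLES - 1) begin
          running <= 1'b0;                // delay elapsed
          done    <= 1'b1;
        end else begin
          count <= count + 32'd1;
        end
      end
    end
  end
endmodule
```

    For simulation, `CYCLES` can be overridden with a small value when the module is instantiated, for example `delay_counter #(.CYCLES(8)) d(...)`, so that the waveform stays short.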

    List of oddities found

    These things are surprising, but reading books [1,2] sheds light on the devilry. More than that, it brings grace.

    Declaring a reg does not guarantee that a register will be created

    How did I run into the problem? Suppose there is a module to whose input I have to supply a value (a kind of parameter). Later this value will have to change over time depending on some external events, so it has to be stored in a register (reg). But the handling of external events is not implemented yet, so I do not change the register; I simply give it an initial value, which never changes afterwards.

    // declare an 8-bit register
    reg [7:0] val;
    // initialize it with a value
    initial val <= 8'd0240;
    // wire to which the module output will be connected
    wire [7:0] out_data;
    // some unknown module called bbox
    // the instance of this module is called bb_01
    // assume the module has an input port in_data and an output port out_data
    // the input port is fed from register val, the output is connected to the wire out_data
    bbox bb_01(.in_data(val), .out_data(out_data));

    Where could the catch be? In imperative programming languages we often set variables as constants, never change them afterwards, and everything works. What do we see in hardware?

    First, we do not see the register. Second, 8'hFF is fed to the module input instead of our 8'd0240! That alone is enough for the circuit not to work the way we planned. The missing register is normal. Verilog can describe logic in many ways, and the synthesizer always optimizes the hardware implementation. Even if you write an always block and work with registers in it, if the output value is always fully determined by the inputs, a register would be redundant and the synthesizer will not create one. And vice versa: if for some input values the output does not change, there is no way to do without a storage element (a latch), and the synthesizer will create it (book [1], pp. 88-89). What follows from this? If we start changing the value of the register, for example in response to button presses, the register will be created and everything will work as it should. If it turns out that the buttons do not change anything, the synthesizer will throw it away again and everything will break again. What to do with a constant? Feed it directly to the module input:

    bbox bb_01(.in_data(8'd0240), .out_data(out_data));

    Now the module input has the correct value.

    It remains a mystery to me why, when the register is optimized away, its initial value is not substituted at the module input.
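    If the constant needs a name, one possible approach is a `localparam`, which never asks the synthesizer for a register in the first place. This is only a sketch: the `top` wrapper and the `VAL` name are hypothetical, and the body of `bbox` below is a stand-in, since the article does not show the real module.

```verilog
// Stand-in for the article's bbox; the real module body is not shown there
module bbox(
  input  wire [7:0] in_data,
  output wire [7:0] out_data
);
  assign out_data = in_data;
endmodule

// Hypothetical top-level wrapper
module top(
  output wire [7:0] out_data
);
  // Named compile-time constant: no reg and no initial block involved
  localparam [7:0] VAL = 8'd240;

  bbox bb_01(.in_data(VAL), .out_data(out_data));
endmodule
```

    When the value later has to change in response to external events, the `localparam` can be replaced by a real register that is written inside a clocked always block.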

    It is best to set the wire width yourself

    When developing in Quartus you are allowed not to declare wires in advance. In that case they are created automatically, with a warning. The problem is that such a wire will be 1 bit wide, and if the ports are wider than 1 bit, the value will not get through.

    bbox       bb_01(.in_data(8'd0240), .out_data(int_data));
    other_bbox bb_02(.in_data(int_data), .out_data(out_data));

    Warning (10236): Verilog HDL Implicit Net warning at test.v(15): created implicit net for "int_data"


    As you can see, one bit is connected and the remaining 7 bits are left unconnected (NC). To avoid this problem, declare the wire yourself. It is not for nothing that the Icarus Verilog compiler gives an error rather than a warning if the wire is not declared in advance.

    wire [7:0] int_data;
    bbox       bb_01(.in_data(8'd0240), .out_data(int_data));
    other_bbox bb_02(.in_data(int_data), .out_data(out_data));

    The tool will not dig into the modules to see how wide their ports are. Besides, the widths may differ, and a module input or output may use not all of the bits but only some specific ones.
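    One way to make any tool treat an undeclared wire as an error is the standard `default_nettype none` compiler directive. The sketch below is illustrative: the bodies of `bbox` and `other_bbox` are stand-ins and `top` is a hypothetical wrapper.

```verilog
// Disable implicit net creation: every net must now be declared explicitly
`default_nettype none

// Stand-in modules; the article does not show their real contents
module bbox(
  input  wire [7:0] in_data,
  output wire [7:0] out_data
);
  assign out_data = in_data;
endmodule

module other_bbox(
  input  wire [7:0] in_data,
  output wire [7:0] out_data
);
  assign out_data = ~in_data;
endmodule

module top(
  output wire [7:0] out_data
);
  // Without this declaration the compile fails instead of silently
  // creating a 1-bit implicit net
  wire [7:0] int_data;

  bbox       bb_01(.in_data(8'd240),   .out_data(int_data));
  other_bbox bb_02(.in_data(int_data), .out_data(out_data));
endmodule

// Restore the default so that files compiled later are not affected
`default_nettype wire
```

    With the directive in effect, port declarations also need an explicit net type, which is why every port above is written as `input wire` or `output wire`.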

    You cannot use the output of a logic function as a clock signal

    Sometimes a project needs a lower clock frequency, or a time delay of N clock cycles. A beginner may use a counter plus an extra circuit that detects when the counter reaches a certain value (a comparator). But if you use the comparator output directly as a clock, problems can arise. The logic circuit needs some time to settle to a stable output value. This delay shifts the edge of the signal passing through different parts of the logic relative to the clock, and the result is races, metastability and asynchrony. I once even heard this raised as a criticism of FPGAs: "with FPGAs there are constant problems - signal races".
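    A common alternative, sketched here under my own assumptions, is to keep everything on the one real clock and use the comparator output only as a clock enable. The module name, the port names and the `DIVIDER` value are hypothetical.

```verilog
// Hypothetical example: toggle an output once every DIVIDER clock cycles
// without creating a second clock
module slow_toggle #(
  parameter DIVIDER = 50          // assumed value, chosen for illustration
)(
  input  wire clk,
  input  wire rst,                // synchronous reset
  output reg  out
);
  reg [31:0] count;

  // Comparator output: used as an enable, never as a clock
  wire tick = (count == DIVIDER - 1);

  always @(posedge clk) begin
    if (rst) begin
      count <= 32'd0;
      out   <= 1'b0;
    end else begin
      count <= tick ? 32'd0 : count + 32'd1;
      if (tick)
        out <= ~out;              // still clocked by clk, gated by tick
    end
  end
endmodule
```

    Every register here is still clocked by `clk`, so the timing analyzer sees a single clock domain; the settling delay of the comparator only has to fit within one clock period.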

    If you read at least a couple of articles:
    Flip-flop metastability and inter-clock synchronization
    A couple of words about pipelines in FPGAs

    it becomes clear how FPGA devices are developed: the whole task is divided into hardware blocks, and data moves between them along pipelines, latched synchronously into registers by the clock signal. Knowing the clock frequency, the tools calculate the delay of every combinational circuit between registers, check whether it fits within the clock period, and conclude whether the circuit will or will not work in the FPGA. All this happens when the project is compiled. If the circuits meet timing, you can program the FPGA.
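    To illustrate the idea of data moving along a pipeline, here is a hypothetical two-stage multiply-and-add. The module name, ports and bit widths are my own choices; this is a sketch, not code from the articles mentioned above.

```verilog
// Hypothetical two-stage pipeline computing a*b + c.
// Each stage has a full clock period for its own logic,
// and the result appears two clock cycles after the inputs.
module mul_add_pipe(
  input  wire        clk,
  input  wire [7:0]  a, b,
  input  wire [15:0] c,
  output reg  [16:0] result
);
  reg [15:0] product;   // stage 1 register
  reg [15:0] c_d;       // c delayed by one cycle to stay aligned with product

  always @(posedge clk) begin
    // Stage 1: multiply, and carry c along
    product <= a * b;
    c_d     <= c;
    // Stage 2: add the values registered on the previous cycle
    result  <= product + c_d;
  end
endmodule
```

    Splitting the computation this way shortens the longest combinational path between registers, which is what determines the highest clock frequency the circuit can run at, at the cost of one extra cycle of latency.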

    For a complete understanding, it is worth reading the Altera handbook on the subject of "clock domains", and learning how to set up TimeQuest timing analysis parameters for the project.

    Thus, for the developers of devices based on FPGAs, all the necessary methodologies have been created, and if you adhere to them, then there will be no problems.

    But what if I want to go against the system?

    The development flow and the behavior of the synthesizer tell us what FPGAs are at the hardware level: synchronous circuits. So one of the synthesizer's goals is to meet timing. To do that it, for example, simplifies logic expressions and removes from synthesis any parts of the circuit that are not used by other circuits and are not connected to physical FPGA pins. Asynchronous solutions and analog tricks are not welcome, because their behavior can be unpredictable and depend on anything (voltage, temperature, manufacturing process, batch, FPGA generation), so they do not give a guaranteed, repeatable, portable result. And everyone needs a stable result and common approaches to design!

    But what if you disagree with the synthesizer about throwing out unchanging registers and reducing logic circuits? What if you want to build circuits with asynchronous logic? Need fine tuning? Or maybe you want to assemble a circuit yourself from the low-level components of the FPGA? Easy! Thanks to the Altera developers for this possibility and for the detailed documentation!

    How to do it? You can try the graphical diagram editor. You may have heard that Quartus allows you to draw diagrams? You can choose the building blocks yourself and connect them. But this is not a solution! Even the drawn circuit will be optimized by the synthesizer, if possible.

    As a result, we come to the old truth: if nothing helps, read the instructions. Namely, the part of the "Altera Handbook" called "Quartus II Synthesis Options".

    To begin with, by describing the architecture in Verilog in a certain way, you can get a certain result. Here is sample code for a synchronous and an asynchronous RS flip-flop:

    // synchronous RS flip-flop module
    module rs(clk, r, s, q);
    input wire clk, r,s;
    output reg q;
    always @(posedge clk) begin
      if (r) begin
        q <= 0;
      end else if (s) begin
        q <= 1;
      end
    end
    endmodule

    In this case, you get a synchronous flip-flop.

    If you ignore the clock signal and switch on any change of r and s, the result is an element whose value is set asynchronously - a latch.

    // example module: asynchronous RS flip-flop (latch)
    module ModuleTester(clk, r, s, q);
    input wire clk, r,s;
    output reg q;
    always @(r or s) begin
      if (r) begin
        q <= 0;
      end else if (s) begin
        q <= 1;
      end
    end
    endmodule

    But you can go even further and create a latch yourself from the primitive (primitives are available just like any other Verilog module):

    module ModuleTester(clk, r, s, q);
    input wire clk, r,s;
    output wire q;  // a net, not a reg: it is driven by the primitive instance
    DLATCH lt(.q(q), .clrn(~r), .prn(~s));
    endmodule

    As a result, all the extra logic at the latch input that the synthesizer considered necessary disappears, and we get exactly what we wanted.

    A list of existing primitives can be viewed on the Altera website.

    And now a small example about asynchronous circuits and reduction. Suppose I decided to build an oscillator on the same principle that used to be customary, a ring of inverting elements, only on an FPGA.

    To lengthen the period I will take 4 elements, only one of which inverts:

    module ModuleTester(q);
    output wire q;
    wire a,b,c,d;
    assign a = b;
    assign b = c;
    assign c = d;
    assign d = ~a;
    assign q = a;
    endmodule

    But what we get is a reduction (1 element instead of four). Which is logical. But we wanted a delay line.

    If we tell the synthesizer that the lines a, b, c, d must not be reduced, we get what we intended. Directives are used to give the synthesizer such hints. One way to write a directive is as text in a comment:

    module ModuleTester(q);
    output wire q;
    wire a,b,c,d  /* synthesis keep */; 
    //                       ^^^--- this is a directive for the synthesizer
    assign a = b;
    assign b = c;
    assign c = d;
    assign d = ~a;
    assign q = a;
    endmodule

    And the result is a chain of four elements.
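    Besides the comment form, Verilog-2001 has its own attribute syntax, `(* ... *)`. As far as I know, Quartus accepts the keep directive in this form as well, but treat that as an assumption and check the "Quartus II Synthesis Options" chapter for your version. A sketch of the same module written with attributes:

```verilog
// Same ring as above, with the keep directive written as Verilog-2001
// attributes (assumed to be equivalent to the comment form in Quartus)
module ModuleTester(q);
  output wire q;

  // One declaration per net so that each attribute clearly applies to one wire
  (* keep = 1 *) wire a;
  (* keep = 1 *) wire b;
  (* keep = 1 *) wire c;
  (* keep = 1 *) wire d;

  assign a = b;
  assign b = c;
  assign c = d;
  assign d = ~a;   // the single inverting element closes the ring
  assign q = a;
endmodule
```

    A simulator sees this only as a zero-delay combinational loop, so it is of no use for checking the oscillation itself; the oscillation comes from the real propagation delays inside the chip.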

    And this is far from all! I will leave the rest to the joy of self-study: working with case and the directive that implements it as RAM/ROM or as a logic circuit; working with the built-in memory blocks (RAM/ROM); choosing how multiplication is implemented - a hardware multiplier or a logic circuit.


    Quoting the article, I want to say that "FPGAs are not processors: by 'programming' an FPGA (filling its configuration memory) you create an electronic circuit (hardware), whereas when programming a processor (fixed hardware) you feed it a chain of sequential program instructions (software) stored in memory".

    However much I wanted, at first, not to be tied to a particular piece of hardware, sometimes you have to work at a low level to use resources more efficiently and economically. Often this can be avoided by designing synchronous circuits correctly. But you cannot completely forget that this is hardware.

    I also want to say that my fanaticism and maximalism have diminished over time. At first I tried to perform every action and calculation in the FPGA in a single clock cycle, because the FPGA allows it. That is not always required, though. I have not yet had occasion to use soft processor cores, but using state machines to carry out a specific algorithm has become the norm. Calculations that take more than 1 cycle, and delays of several cycles caused by pipelines, are normal.

    Books that really helped me

    1. V.V. Soloviev - Fundamentals of the Verilog digital hardware design language. 2014
    2. Altera: Quartus II Handbook
    3. Altera: Advanced Synthesis Cookbook
    4. Altera: Designing with Low-Level Primitives

    Related articles on FPGAs, Altera and Verilog

    FPGA industry news
    Microsoft switches to processors of its own design
    Intel intends to release Xeon server processors with integrated FPGA
    Intel plans to buy Altera
    RBC: Intel bought chip maker Altera for $16.7 billion
    Bing search optimized using a neural network on FPGA
    Intel Xeon processors are equipped with Altera FPGA

    Theory
    Development of digital devices based on programmable logic VLSI

    Hardware features
    Flip-flop metastability and inter-clock synchronization
    FPGA timing analysis, or how I learned TimeQuest
    A couple of words about pipelines in FPGAs
    Verilog. RAM wrappers and why they are needed
    Designing synchronous circuits. Quick start with Verilog HDL

    Making a timer, or a first FPGA project
    Clock on FPGA using Quartus II and a bit of Verilog
    How I made a USB device
    Color music based on FPGA
    FPGA programming. Studying the phenomenon of contact bounce and the method of getting rid of it (VHDL!)
    Implementation of a digital IIR filter in Verilog
    Verilog. Digital filter on RAM
    FPGA is simple, or a do-it-yourself ALU
    VGA adapter on FPGA Altera Cyclone III
    Processor research and its functional simulation
    NES, implementation on FPGA
    Video generation by mathematical function on FPGA
    Hardware number sorter in Verilog
    Simple SDR receiver on FPGA
    FPGA standalone SDR receiver
    A look at 10G Ethernet from the FPGA developer's side
    Simple FPGA-based FM radio transmitter
    Making Tetris for FPGA
    Minesweeper on FPGA
    Making an IBM PC on FPGA


    Thanks to everyone who read this far. I hope this article makes the principles of how FPGAs work and how they are used at least a little closer and easier to understand. As an example of their use in a real project, I am preparing another article for release this week: Project: a functional DDS generator on FPGA.
