Implementing a stable UART at 921600 baud and above in Verilog on an FPGA
Hello. A couple of weeks ago I began slowly learning FPGA programming. For this purpose I ordered the cheapest board I could find from China, based on the Altera MAX II EPM240T100C5N chip. After installing Quartus v15, I started studying the Verilog 2001 standard. Once I had blinked some LEDs, I decided to try implementing a data transfer protocol. Naturally, it turned out to be UART :) Looking at other people's examples online, I did not like the excessive piles of logic, the many extra counters and, most importantly, the synchronization problems in the receiver that make it unstable at high speeds. Of course, you can find high-quality, fully configurable implementations with “ideal code”, but where is the sporting interest in that?
So the task was to implement the most compact, stable and simple 8-bit asynchronous transceiver with 1 start bit and 1 stop bit. In a word, a classic. As it turned out, the task is not as trivial as it seemed at first glance. I implemented the receiver and transmitter in a single evening, but had to spend two more making the logic receive and send a stream of bytes correctly, without swallowing any and without errors.
Based on the criticism and suggestions in the comments, I fixed the bugs, and this article presents the second implementation of the module. Reception and transmission now use shift registers, a three-element majority filter was added on the RX input, blocking assignments were removed from synchronous blocks, and the UART clock counters now count down from the maximum value to zero.
Project Files:
- Main
- UART
- UART_TX
- UART_RX
- RXMajority3Filter
Let's start with the UART_TX module:
UART_TX.v
module UART_TX #
(
    parameter CLOCK_FREQUENCY = 50_000_000,
    parameter BAUD_RATE = 9600
)
(
    input clockIN,
    input nTxResetIN,
    input [7:0] txDataIN,
    input txLoadIN,
    output wire txIdleOUT,
    output wire txReadyOUT,
    output wire txOUT
);

localparam HALF_BAUD_CLK_REG_VALUE = (CLOCK_FREQUENCY / BAUD_RATE / 2 - 1);
localparam HALF_BAUD_CLK_REG_SIZE = $clog2(HALF_BAUD_CLK_REG_VALUE);

reg [HALF_BAUD_CLK_REG_SIZE-1:0] txClkCounter = 0;
reg txBaudClk = 1'b0;
reg [9:0] txReg = 10'h001;
reg [3:0] txCounter = 4'h0;

assign txReadyOUT = !txCounter[3:1];
assign txIdleOUT = txReadyOUT & (~txCounter[0]);
assign txOUT = txReg[0];

always @(posedge clockIN) begin : tx_clock_generate
    if(txIdleOUT & (~txLoadIN)) begin
        txClkCounter <= 0;
        txBaudClk <= 1'b0;
    end
    else if(txClkCounter == 0) begin
        txClkCounter <= HALF_BAUD_CLK_REG_VALUE;
        txBaudClk <= ~txBaudClk;
    end
    else begin
        txClkCounter <= txClkCounter - 1'b1;
    end
end

always @(posedge txBaudClk or negedge nTxResetIN) begin : tx_transmit
    if(~nTxResetIN) begin
        txCounter <= 4'h0;
        txReg[0] <= 1'b1;
    end
    else if(~txReadyOUT) begin
        txReg <= {1'b0, txReg[9:1]};
        txCounter <= txCounter - 1'b1;
    end
    else if(txLoadIN) begin
        txReg <= {1'b1, txDataIN[7:0], 1'b0};
        txCounter <= 4'hA;
    end
    else begin
        txCounter <= 4'h0;
    end
end

endmodule
Let's take a look at everything in order:
module UART_TX #
(
    parameter CLOCK_FREQUENCY = 50_000_000,
    parameter BAUD_RATE = 9600
)
(
    input clockIN,
    input nTxResetIN,
    input [7:0] txDataIN,
    input txLoadIN,
    output wire txIdleOUT,
    output wire txReadyOUT,
    output wire txOUT
);
The CLOCK_FREQUENCY and BAUD_RATE parameters are the frequency of the crystal oscillator and the baud rate of the UART transmitter, respectively.
Incoming ports:
clockIN - clock input from the crystal oscillator.
nTxResetIN - asynchronous reset input, active low.
txDataIN - eight-bit data bus.
txLoadIN - input that starts a data transfer.
Outgoing ports:
txIdleOUT - transmitter idle output; goes to logic 1 once the byte transfer cycle has fully completed, provided there is no logic 1 on the txLoadIN port.
txReadyOUT - output whose logic 1 means that the transmitter has reached the stop bit and new data can be loaded.
txOUT - serial data output, to be assigned to an FPGA pin.
localparam HALF_BAUD_CLK_REG_VALUE = (CLOCK_FREQUENCY / BAUD_RATE / 2 - 1);
localparam HALF_BAUD_CLK_REG_SIZE = $clog2(HALF_BAUD_CLK_REG_VALUE);

reg [HALF_BAUD_CLK_REG_SIZE-1:0] txClkCounter = 0;
reg txBaudClk = 1'b0;
reg [9:0] txReg = 10'h001;
reg [3:0] txCounter = 4'h0;

assign txReadyOUT = !txCounter[3:1];
assign txIdleOUT = txReadyOUT & (~txCounter[0]);
assign txOUT = txReg[0];
The local parameter HALF_BAUD_CLK_REG_VALUE is the reload value of the counter that divides the system clock down to one half-period of the UART clock. It is calculated as CLOCK_FREQUENCY / BAUD_RATE / 2 - 1.
The local parameter HALF_BAUD_CLK_REG_SIZE is the width of that counter in bits. It is calculated with the handy $clog2 function: the base-2 logarithm of HALF_BAUD_CLK_REG_VALUE, rounded up.
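To make the arithmetic concrete, here is a small simulation-only sketch that evaluates the same two expressions for the 50 MHz / 921600 baud configuration used in Main. The module name baud_divider_check and the SAFE_REG_SIZE and CLOCKS_PER_BIT parameters are illustrative names, not part of the project. SAFE_REG_SIZE rests on an assumption worth checking against your own parameters: $clog2(N) bits are enough to hold N unless N is an exact power of two (for example 8), in which case the reload value would be truncated and $clog2(N + 1) gives the width that fits.

```verilog
// Simulation-only sketch; baud_divider_check is an illustrative name.
module baud_divider_check;
    localparam CLOCK_FREQUENCY = 50_000_000;
    localparam BAUD_RATE       = 921600;

    // Same expressions as in UART_TX and UART_RX (integer division truncates).
    localparam HALF_BAUD_CLK_REG_VALUE = (CLOCK_FREQUENCY / BAUD_RATE / 2 - 1);
    localparam HALF_BAUD_CLK_REG_SIZE  = $clog2(HALF_BAUD_CLK_REG_VALUE);

    // Assumption, not from the article: a width that still fits the reload
    // value when that value is an exact power of two.
    localparam SAFE_REG_SIZE = $clog2(HALF_BAUD_CLK_REG_VALUE + 1);

    // The counter runs from the reload value down to 0, so one half-period
    // lasts (reload value + 1) clocks and one bit lasts twice that.
    localparam CLOCKS_PER_BIT = 2 * (HALF_BAUD_CLK_REG_VALUE + 1);

    initial begin
        $display("reload value   = %0d", HALF_BAUD_CLK_REG_VALUE);
        $display("counter width  = %0d bits (article), %0d bits (safe)",
                 HALF_BAUD_CLK_REG_SIZE, SAFE_REG_SIZE);
        $display("clocks per bit = %0d", CLOCKS_PER_BIT);
        $display("effective baud = %0d", CLOCK_FREQUENCY / CLOCKS_PER_BIT);
        $finish;
    end
endmodule
```

For this configuration the reload value is 26, so a half-period lasts 27 clocks and one bit lasts 54 clocks, which is roughly 0.5 % faster than the nominal 921600 baud.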
Registers (reg):
txClkCounter - clock divider counter.
txBaudClk - clock signal for the transmitter.
txReg - shift register that holds the data byte together with the start and stop bits.
txCounter - counter of the bits still to be sent.
Wires (wire):
txReadyOUT is continuously assigned the logical negation of the 4th, 3rd and 2nd bits of the txCounter register. It goes to logic 1 when txCounter reaches 1 or 0.
txIdleOUT is continuously assigned txReadyOUT ANDed with the inverted 1st bit of the txCounter register. It goes to logic 1 when txCounter reaches 0.
txOUT is continuously assigned the 1st bit of the txReg register.
Data transfer:
always @(posedge txBaudClk or negedge nTxResetIN) begin : tx_transmit
    if(~nTxResetIN) begin
        txCounter <= 4'h0;
        txReg[0] <= 1'b1;
    end
    else if(~txReadyOUT) begin
        txReg <= {1'b0, txReg[9:1]};
        txCounter <= txCounter - 1'b1;
    end
    else if(txLoadIN) begin
        txReg <= {1'b1, txDataIN[7:0], 1'b0};
        txCounter <= 4'hA;
    end
    else begin
        txCounter <= 4'h0;
    end
end
When nTxResetIN goes low, which is checked in the first condition, the first bit of the txReg register is set to logic 1 and txCounter is set to 0, which puts logic 1 on the txIdleOUT, txReadyOUT and txOUT outputs.
Otherwise, on a rising edge of txBaudClk, txReadyOUT is checked: while it is logic 0, the contents of txReg are shifted towards the low bits and txCounter is decremented by one. When txCounter reaches 1, txReadyOUT goes to logic 1.
Otherwise, on a rising edge of txBaudClk, txLoadIN is checked: if it is logic 1, txReg is loaded with the value of txDataIN framed by the start and stop bits, and txCounter is set to 10 (4'hA). This produces a falling edge on txIdleOUT, txReadyOUT and txOUT, which signals the beginning of the transfer (start bit).
Otherwise, txCounter is set to 0 and txIdleOUT goes to logic 1.
Note that with this logic, as long as txLoadIN is held at logic 1, data is loaded continuously from txDataIN into txReg and transmitted serially on txOUT. That is, to stop transmitting a packet you need to drop txLoadIN to logic 0 before the stop bit has been fully transmitted. The best way is to clear txLoadIN on the falling edge of txReadyOUT. A byte that is already being transmitted cannot be interrupted by a logic 0 on txLoadIN; use nTxResetIN for that.
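The framing done by the tx_transmit block can be watched in isolation. The sketch below is a simulation-only illustration (tx_frame_demo is a hypothetical name, and blocking assignments are used only because this is a testbench-style initial block): it loads the byte 8'h41 the same way the txLoadIN branch does and then shifts until txCounter reaches 1, printing what would appear on txOUT.

```verilog
// Simulation-only sketch; tx_frame_demo is an illustrative name.
module tx_frame_demo;
    reg [9:0] txReg;
    reg [3:0] txCounter;

    initial begin
        // Load as in the txLoadIN branch: stop bit, data byte, start bit.
        txReg     = {1'b1, 8'h41, 1'b0};
        txCounter = 4'hA;
        $display("txCounter=%0d txOUT=%b (start bit)", txCounter, txReg[0]);

        // Shift while txReadyOUT would be 0, i.e. while txCounter[3:1] != 0.
        while (txCounter[3:1] != 3'b000) begin
            txReg     = {1'b0, txReg[9:1]};
            txCounter = txCounter - 1'b1;
            $display("txCounter=%0d txOUT=%b", txCounter, txReg[0]);
        end
        $finish;
    end
endmodule
```

For 8'h41 the bits appear as 0, 1, 0, 0, 0, 0, 0, 1, 0, 1: the start bit, the data byte least significant bit first, then the stop bit, which is on the line when txCounter equals 1.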
Transmitter Clock Generation:
always @(posedge clockIN) begin : tx_clock_generate
    if(txIdleOUT & (~txLoadIN)) begin
        txClkCounter <= 0;
        txBaudClk <= 1'b0;
    end
    else if(txClkCounter == 0) begin
        txClkCounter <= HALF_BAUD_CLK_REG_VALUE;
        txBaudClk <= ~txBaudClk;
    end
    else begin
        txClkCounter <= txClkCounter - 1'b1;
    end
end
On each rising edge of clockIN, the first condition checks for logic 1 on txIdleOUT and logic 0 on txLoadIN. When it holds, txClkCounter is reset to 0 and txBaudClk is set to logic 0. This guarantees that once logic 1 appears on txLoadIN, the transmitter starts sending on the very next rising edge of clockIN.
Otherwise, txClkCounter is compared with 0; when it is 0, it is reloaded with the value of the HALF_BAUD_CLK_REG_VALUE parameter and txBaudClk inverts its state.
Otherwise, txClkCounter is decremented by 1.
Timing diagram of the signals of the UART_TX module:
UART_RX module:
UART_RX.v
module UART_RX #
(
    parameter CLOCK_FREQUENCY = 50_000_000,
    parameter BAUD_RATE = 9600
)
(
    input clockIN,
    input nRxResetIN,
    input rxIN,
    output wire rxIdleOUT,
    output wire rxReadyOUT,
    output wire [7:0] rxDataOUT
);

localparam HALF_BAUD_CLK_REG_VALUE = (CLOCK_FREQUENCY / BAUD_RATE / 2 - 1);
localparam HALF_BAUD_CLK_REG_SIZE = $clog2(HALF_BAUD_CLK_REG_VALUE);

reg [HALF_BAUD_CLK_REG_SIZE-1:0] rxClkCounter = 0;
reg rxBaudClk = 1'b0;
reg [9:0] rxReg = 10'h000;
wire rx;

assign rxIdleOUT = ~rxReg[0];
assign rxReadyOUT = rxReg[9] & rxIdleOUT;
assign rxDataOUT[7:0] = rxReg[8:1];

RXMajority3Filter rxFilter
(
    .clockIN(clockIN),
    .rxIN(rxIN),
    .rxOUT(rx)
);

always @(posedge clockIN) begin : rx_clock_generate
    if(rx & rxIdleOUT) begin
        rxClkCounter <= HALF_BAUD_CLK_REG_VALUE;
        rxBaudClk <= 0;
    end
    else if(rxClkCounter == 0) begin
        rxClkCounter <= HALF_BAUD_CLK_REG_VALUE;
        rxBaudClk <= ~rxBaudClk;
    end
    else begin
        rxClkCounter <= rxClkCounter - 1'b1;
    end
end

always @(posedge rxBaudClk or negedge nRxResetIN) begin : rx_receive
    if(~nRxResetIN) begin
        rxReg <= 10'h000;
    end
    else if(~rxIdleOUT) begin
        rxReg <= {rx, rxReg[9:1]};
    end
    else if(~rx) begin
        rxReg <= 10'h1FF;
    end
end

endmodule
module UART_RX #
(
    parameter CLOCK_FREQUENCY = 50_000_000,
    parameter BAUD_RATE = 9600
)
(
    input clockIN,
    input nRxResetIN,
    input rxIN,
    output wire rxIdleOUT,
    output wire rxReadyOUT,
    output wire [7:0] rxDataOUT
);
In many ways it is similar to the UART_TX module.
Incoming ports:
clockIN and nRxResetIN have the same meaning as in the UART_TX module.
rxIN - incoming serial data input, to be assigned to an FPGA pin.
Outgoing ports:
rxIdleOUT - receiver idle output; goes to logic 1 once the byte receive cycle has fully completed.
rxReadyOUT - receiver ready output. Going to logic 1, it shows that a data byte was received and that it ended with a stop bit (logic 1). It returns to logic 0 on logic 0 at the nRxResetIN port or when reception of the next byte begins.
rxDataOUT - eight-bit bus of received data.
localparam HALF_BAUD_CLK_REG_VALUE = (CLOCK_FREQUENCY / BAUD_RATE / 2 - 1);
localparam HALF_BAUD_CLK_REG_SIZE = $clog2(HALF_BAUD_CLK_REG_VALUE);

reg [HALF_BAUD_CLK_REG_SIZE-1:0] rxClkCounter = 0;
reg rxBaudClk = 1'b0;
reg [9:0] rxReg = 10'h000;
wire rx;

assign rxIdleOUT = ~rxReg[0];
assign rxReadyOUT = rxReg[9] & rxIdleOUT;
assign rxDataOUT[7:0] = rxReg[8:1];
Registers (reg):
rxClkCounter - clock divider counter.
rxBaudClk - clock signal for the receiver.
rxReg - shift register that holds the 8 received data bits together with the start and stop bits.
Wires (wire):
rx - incoming serial data signal after the majority logic of the RXMajority3Filter module.
rxIdleOUT is continuously assigned the inverted 1st bit of the rxReg register. It goes to logic 1 at the end of reception, when the logic 0 written on detecting the start bit has been shifted down into rxReg[0].
rxReadyOUT is continuously assigned the 10th bit of the rxReg register ANDed with rxIdleOUT. It goes to logic 1 if reception has completed and the 10th bit of rxReg holds logic 1 (the stop bit).
rxDataOUT is assigned the 9th through 2nd bits of the rxReg register (rxReg[8:1]).
Data reception:
always @(posedge rxBaudClk or negedge nRxResetIN) begin : rx_receive
    if(~nRxResetIN) begin
        rxReg <= 10'h000;
    end
    else if(~rxIdleOUT) begin
        rxReg <= {rx, rxReg[9:1]};
    end
    else if(~rx) begin
        rxReg <= 10'h1FF;
    end
end
When nRxResetIN goes low, the first condition holds and rxReg is reset to 0, which puts logic 0 on the rxReadyOUT port and logic 1 on rxIdleOUT.
Otherwise, while rxIdleOUT is logic 0, the contents of the rxReg register are shifted towards the low bits and the current state of the rx signal is written to the high bit.
Otherwise, logic 0 on rx means the beginning of a transfer (start bit), and rxReg is loaded with 10'h1FF: ones in every bit except the high bit, which holds the start bit.
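A short simulation-only sketch may help show why loading 10'h1FF is enough to count the bits without a separate counter. It is an illustration under stated assumptions, not project code: rx_frame_demo and samples are hypothetical names, and samples simply holds the nine values that rx would present after the start bit (eight data bits of 8'h41, least significant first, then the stop bit).

```verilog
// Simulation-only sketch; rx_frame_demo and samples are illustrative names.
module rx_frame_demo;
    reg [9:0] rxReg;
    reg [8:0] samples;   // samples[0] is the first data bit, samples[8] the stop bit
    integer   i;

    initial begin
        samples = {1'b1, 8'h41};
        rxReg   = 10'h1FF;          // start bit detected: the only 0 sits in bit 9
        i       = 0;

        // Shift while rxIdleOUT would be 0, i.e. while rxReg[0] is 1.
        while (rxReg[0]) begin
            rxReg = {samples[i], rxReg[9:1]};
            i     = i + 1;
        end

        // Prints: shifts=9 data=41 ready=1
        $display("shifts=%0d data=%h ready=%b", i, rxReg[8:1], rxReg[9] & ~rxReg[0]);
        $finish;
    end
endmodule
```

The single zero loaded into the high bit travels down the register and reaches rxReg[0] after exactly nine shifts, at which point the data byte sits in rxReg[8:1] and the stop bit in rxReg[9].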
Receiver Clock Generation:
always @(posedge clockIN) begin : rx_clock_generate
    if(rx & rxIdleOUT) begin
        rxClkCounter <= HALF_BAUD_CLK_REG_VALUE;
        rxBaudClk <= 0;
    end
    else if(rxClkCounter == 0) begin
        rxClkCounter <= HALF_BAUD_CLK_REG_VALUE;
        rxBaudClk <= ~rxBaudClk;
    end
    else begin
        rxClkCounter <= rxClkCounter - 1'b1;
    end
end
The second and third conditions serve the same purpose as in the UART_TX module: generating the clock signal for the receiver.
The first condition checks for logic 1 on the rx signal and logic 1 on rxIdleOUT. When it holds, rxClkCounter is loaded with the value of the HALF_BAUD_CLK_REG_VALUE parameter and rxBaudClk is set to logic 0.
That is, when logic 0 (the start bit) appears on rx, the counter first counts off half a period of the receiver clock, and only then does data reception begin, so every bit is sampled close to its middle.
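How close to the middle can be estimated with a little arithmetic. The following simulation-only sketch (rx_sampling_check and its variable names are illustrative) assumes ideal edges, measures time in clockIN cycles from the falling edge of the filtered rx signal, and compares each sampling instant with the ideal centre of the corresponding bit for 50 MHz and 921600 baud.

```verilog
// Simulation-only sketch; rx_sampling_check is an illustrative name.
module rx_sampling_check;
    localparam CLOCK_FREQUENCY = 50_000_000;
    localparam BAUD_RATE       = 921600;
    // Clocks in one half-period of rxBaudClk (reload value + 1).
    localparam HALF_PERIOD     = CLOCK_FREQUENCY / BAUD_RATE / 2;

    real    clocksPerBit;
    real    ideal;
    integer actual;
    integer k;

    initial begin
        clocksPerBit = 1.0 * CLOCK_FREQUENCY / BAUD_RATE;
        // k = 0 is the start bit, k = 1..8 the data bits, k = 9 the stop bit.
        for (k = 0; k < 10; k = k + 1) begin
            actual = HALF_PERIOD + 2 * HALF_PERIOD * k;
            ideal  = clocksPerBit * (k + 0.5);
            $display("bit %0d: sampled at clock %0d, bit centre at %.1f, offset %.1f %% of a bit",
                     k, actual, ideal, 100.0 * (actual - ideal) / clocksPerBit);
        end
        $finish;
    end
endmodule
```

With these numbers the sampling point slips early by a couple of clocks over the frame, which stays under 5 % of a bit at the stop bit. Because the counter is re-armed on every start bit, the error does not accumulate from one byte to the next.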
Timing diagram of the signals of the UART_RX module:
RXMajority3Filter module:
RXMajority3Filter.v
module RXMajority3Filter
(
    input clockIN,
    input rxIN,
    output wire rxOUT
);

reg [2:0] rxLock = 3'b111;

assign rxOUT = (rxLock[0] & rxLock[1]) | (rxLock[0] & rxLock[2]) | (rxLock[1] & rxLock[2]);

always @(posedge clockIN) begin
    rxLock <= {rxIN, rxLock[2:1]};
end

endmodule
It implements a majority element over a three-element shift register: rxOUT takes the value held by at least two of the last three samples of rxIN.
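The effect of the filter is easy to see in a tiny simulation. The sketch below is an illustration only: majority3_demo is a hypothetical name, the filter logic is copied inline so the example is self-contained, and the 10-time-unit clock is arbitrary. It first applies a glitch one clock wide and then a real change of level.

```verilog
// Simulation-only sketch; majority3_demo is an illustrative name.
module majority3_demo;
    reg       clock  = 1'b0;
    reg       rxIN   = 1'b1;
    reg [2:0] rxLock = 3'b111;

    // Same logic as RXMajority3Filter, inlined.
    wire rxOUT = (rxLock[0] & rxLock[1]) | (rxLock[0] & rxLock[2]) | (rxLock[1] & rxLock[2]);

    always #5 clock = ~clock;

    always @(posedge clock) begin
        rxLock <= {rxIN, rxLock[2:1]};
    end

    initial begin
        $monitor("t=%0t rxIN=%b rxLock=%b rxOUT=%b", $time, rxIN, rxLock, rxOUT);
        @(negedge clock) rxIN = 1'b0;   // glitch lasting one clock
        @(negedge clock) rxIN = 1'b1;
        repeat (4) @(negedge clock);
        rxIN = 1'b0;                    // real start bit
        repeat (4) @(negedge clock);
        $finish;
    end
endmodule
```

The single-clock glitch moves through rxLock as 011, 101, 110 without ever changing rxOUT, while the real falling edge reaches rxOUT on the second rising edge of the clock after it, so the filter costs two clocks of delay.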
UART module:
UART.v
module UART #
(
    parameter CLOCK_FREQUENCY = 50_000_000,
    parameter BAUD_RATE = 9600
)
(
    input clockIN,
    input nTxResetIN,
    input [7:0] txDataIN,
    input txLoadIN,
    output wire txIdleOUT,
    output wire txReadyOUT,
    output wire txOUT,
    input nRxResetIN,
    input rxIN,
    output wire rxIdleOUT,
    output wire rxReadyOUT,
    output wire [7:0] rxDataOUT
);

defparam uart_tx.CLOCK_FREQUENCY = CLOCK_FREQUENCY;
defparam uart_tx.BAUD_RATE = BAUD_RATE;
UART_TX uart_tx
(
    .clockIN(clockIN),
    .nTxResetIN(nTxResetIN),
    .txDataIN(txDataIN),
    .txLoadIN(txLoadIN),
    .txIdleOUT(txIdleOUT),
    .txReadyOUT(txReadyOUT),
    .txOUT(txOUT)
);

defparam uart_rx.CLOCK_FREQUENCY = CLOCK_FREQUENCY;
defparam uart_rx.BAUD_RATE = BAUD_RATE;
UART_RX uart_rx
(
    .clockIN(clockIN),
    .nRxResetIN(nRxResetIN),
    .rxIN(rxIN),
    .rxIdleOUT(rxIdleOUT),
    .rxReadyOUT(rxReadyOUT),
    .rxDataOUT(rxDataOUT)
);

endmodule
It simply combines the UART_RX and UART_TX modules into a single unit, passing through the incoming and outgoing signals together with the crystal oscillator frequency and UART baud rate parameters.
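The parameters here are passed down with defparam. An alternative that Verilog 2001 also offers is a named parameter override written directly in the instantiation, which keeps the values next to the instance they belong to. The sketch below shows only the syntax on a stand-in module: BaudInfo, override_demo and the instance names are hypothetical and exist purely so the example can be simulated on its own.

```verilog
// Stand-in module with the same two parameters as the UART modules.
module BaudInfo #
(
    parameter CLOCK_FREQUENCY = 50_000_000,
    parameter BAUD_RATE = 9600
);
    initial $display("%m: %0d clocks per bit", CLOCK_FREQUENCY / BAUD_RATE);
endmodule

// Simulation-only sketch; override_demo is an illustrative name.
module override_demo;
    // Named override in the instantiation, instead of defparam.
    BaudInfo #
    (
        .CLOCK_FREQUENCY(50_000_000),
        .BAUD_RATE(921600)
    )
    fast ();

    // No override: the defaults from the module declaration apply.
    BaudInfo slow ();
endmodule
```

Applied to this project, the same form would be written on the UART_TX, UART_RX and UART instances; the behaviour should be the same as with defparam, so this is a matter of style.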
And finally, the top-level module, Main:
Main.v
module Main
(
    input wire clockIN,
    input wire uartRxIN,
    output wire uartTxOUT
);

defparam uart.CLOCK_FREQUENCY = 50_000_000;
defparam uart.BAUD_RATE = 921600;

reg [7:0] txData;
reg txLoad = 1'b0;
wire txReset = 1'b1;
wire rxReset = 1'b1;
wire [7:0] rxData;
wire txIdle;
wire txReady;
wire rxIdle;
wire rxReady;

UART uart
(
    .clockIN(clockIN),
    .nTxResetIN(txReset),
    .txDataIN(txData),
    .txLoadIN(txLoad),
    .txIdleOUT(txIdle),
    .txReadyOUT(txReady),
    .txOUT(uartTxOUT),
    .nRxResetIN(rxReset),
    .rxIN(uartRxIN),
    .rxIdleOUT(rxIdle),
    .rxReadyOUT(rxReady),
    .rxDataOUT(rxData)
);

always @(posedge rxReady or negedge txReady) begin
    if(~txReady)
        txLoad <= 1'b0;
    else if(rxReady) begin
        txLoad <= 1'b1;
        txData <= rxData;
    end
end

endmodule
In essence, it is a simple “echo” test.
On a rising edge of rxReady, the incoming data is written to the txData register, which drives the txDataIN input of the transmitter, and the txLoad register, which drives the txLoadIN input of the transmitter, is set to logic 1 to start the transfer.
On a falling edge of txReady, the txLoad register is set to logic 0.
The module was tested on a board with an Altera MAX II EPM240T100C5N chip and a 50 MHz crystal oscillator, at a UART speed of 921600 baud (the maximum speed that my USB-UART adapter supports).
A common rule of thumb is that the receiver should sample the start bit at a frequency at least 16 times the UART baud rate. So for stable operation of the module at 921600 baud, the oscillator frequency should be no lower than 921600 * 16 = 14,745,600 Hz. For example, a 16 MHz crystal satisfies this bound.
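When choosing a crystal it may be worth checking not only this lower bound but also the baud rate that the integer divider actually produces. The simulation-only sketch below does that for three clock frequencies; clock_choice_check, clocks_per_bit and report are illustrative names, and the formula is simply the one from UART_TX and UART_RX rewritten as clocks per bit.

```verilog
// Simulation-only sketch; all names here are illustrative.
module clock_choice_check;
    localparam BAUD_RATE = 921600;

    // Clocks per bit produced by the divider: two half-periods of
    // (CLOCK_FREQUENCY / BAUD_RATE / 2) clocks each, with integer truncation.
    function integer clocks_per_bit(input integer clockFrequency);
        clocks_per_bit = 2 * (clockFrequency / BAUD_RATE / 2);
    endfunction

    task report(input integer clockFrequency);
        integer actualBaud;
        begin
            actualBaud = clockFrequency / clocks_per_bit(clockFrequency);
            $display("%0d Hz: %0d clocks per bit, %0d baud, error %.2f %%",
                     clockFrequency, clocks_per_bit(clockFrequency), actualBaud,
                     100.0 * (actualBaud - BAUD_RATE) / BAUD_RATE);
        end
    endtask

    initial begin
        report(50_000_000);
        report(16_000_000);
        report(14_745_600);
        $finish;
    end
endmodule
```

With these expressions, 50 MHz gives 54 clocks per bit (about 0.5 % fast) and 14,745,600 Hz divides exactly into 16 clocks per bit. A 16 MHz clock also ends up at 16 clocks per bit, which corresponds to 1,000,000 baud, about 8.5 % above 921600, so with this particular divider it would be prudent to test such a crystal on real hardware before relying on it.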
It is also advisable to put a pull-up resistor at the input of the receiver.
As usual, any optimization and improvement tips are welcome.
Download the updated files here.