About metastability in electronics

    Many novice developers underestimate the effect of asynchrony on digital circuits. In projects with a single clock there are no difficulties: the circuit is fully synchronous, and the developer only needs to meet the setup and hold requirements. But as soon as a second clock appears in the system, the CDC (clock domain crossing) problem arises, caused by parts of the circuit running from independent (asynchronous) oscillators. In the design flow, this problem shows up as extra complexity in static timing analysis in the CAD tools; in hardware, it manifests itself as metastability and abnormal behavior of flip-flops. Metastability has already been written about here, but I suggest a somewhat deeper look at the problem.

    First, a few introductory words. The critical flip-flop parameters setup and hold are described in detail here. In a nutshell, around the moment the clock edge arrives at the flip-flop there is a certain minimum time interval within which the signal at the data input must remain stable. If the signal changes outside this interval, the flip-flop captures it correctly. These are the minimum requirements that a developer must meet when designing a circuit with a single clock. Even if the project uses several frequencies that are multiples derived from one reference, the circuit is still considered synchronous, and the most that has to be taken care of during design is meeting setup and hold.
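
    The setup and hold rule can be written as a simple interval check. The following is a minimal Python sketch; the function name and the picosecond values are hypothetical and serve only to illustrate the rule.

```python
def violates_setup_hold(t_data, t_clk, t_setup, t_hold):
    """Return True if a data transition at t_data falls inside the
    forbidden window [t_clk - t_setup, t_clk + t_hold] around a clock edge."""
    return (t_clk - t_setup) <= t_data <= (t_clk + t_hold)


# Hypothetical numbers in picoseconds: clock edge at 1000 ps,
# setup time 50 ps, hold time 30 ps.
print(violates_setup_hold(900, 1000, 50, 30))   # data changed well before the edge
print(violates_setup_hold(980, 1000, 50, 30))   # data changed inside the setup window
print(violates_setup_hold(1020, 1000, 50, 30))  # data changed inside the hold window
```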

    Now imagine that the project uses two independent clock sources. All flip-flops in the circuit are divided into two domains according to which reference clock drives them. The boundary between these mutually asynchronous domains consists of signals generated in one clock domain and arriving at the data inputs of flip-flops clocked in the other. Such a signal is asynchronous to the clock of the receiving flip-flops, which means that the required setup and hold times cannot be guaranteed. As a result, anomalies (in effect, malfunctions) occasionally occur in the flip-flops at the boundary of the two domains. In timing simulation they usually appear as the undefined state X and are drawn in red on the waveform.
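
    To see why setup and hold cannot be guaranteed across the boundary, one can count how often data edges from the source domain land in the forbidden window of the receiving clock. The sketch below is a simplified illustration: it assumes ideal, jitter-free clocks and data that toggles on every source clock edge, and all periods and window sizes (in picoseconds) are hypothetical.

```python
def count_violations(t_src, t_dst, t_setup, t_hold, n_edges, offset=0):
    """Count data edges (at offset + k * t_src) that fall inside the
    setup/hold window of a receiving clock with period t_dst."""
    violations = 0
    for k in range(n_edges):
        # Time elapsed since the most recent receiving clock edge.
        phase = (offset + k * t_src) % t_dst
        if phase <= t_hold or phase >= t_dst - t_setup:
            violations += 1
    return violations


# Synchronous case: same period, data always changes 200 ps after the edge.
print(count_violations(800, 800, 50, 30, 800, offset=200))
# Asynchronous case: unrelated periods, so the phase drifts through the window.
print(count_violations(1013, 800, 50, 30, 800))
```

    In the synchronous case the phase is fixed, so the designer can keep it outside the window; in the asynchronous case the phase sweeps across the whole clock period, and a fraction of roughly (ts + th) / T of the edges inevitably violates the window.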


    The figure shows tc, the absolute arrival time of the clock edge, which serves as the zero reference for both the Hold axis (directed from zero to the right) and the Setup axis. The Setup axis is directed to the left, since setup time is counted backwards from the moment the clock edge arrives. Next, ts and th are the setup and hold parameters of the flip-flop: between the marks ts and th the signal at the data input must be stable (red area of the figure). Outside the ts + th window the signal can change arbitrarily (blue area of the figure). If the signal changes inside the ts + th window, the switching of the flip-flop can be considerably delayed. The last element of the picture, the metastability region, is highlighted in orange. This is a time interval in which a signal change makes the behavior of the flip-flop outputs unpredictable, as described in detail below. In practice,

    Consider the structure of the classic D flip-flop and its constituent RS latch:



    Imagine that a short active-low pulse arrives at the R (Reset) input of the latch while the S (Set) input is held at its passive level (logic 1). If the pulse is very short, the latch may not switch. And what if the pulse duration is increased? Let us run a series of experiments, applying pulses of different durations to the latch input. The following figure is borrowed from the article by L. R. Marino, General Theory of Metastable Operation:


    The figure has two axes: the voltages at the X and Y outputs of the RS latch. The marks V0 and V1 are the logic 0 and logic 1 output voltages, and Vm is half the supply voltage. The initial state is at the point {X = V1, Y = V0} of the plane, that is, the outputs {X, Y} hold the logic values {1, 0}. A high (passive) level is applied to the S (Set) input, and active-low pulses of different durations are applied to the R (Reset) input (pulse 6 is the shortest, pulse 1 the longest; they are shown at the bottom of the figure). Numbered by the pulse at input R, the figure shows six possible switching trajectories of the output pair {X, Y}: for pulses 1-3 the latch switches completely, for pulses 5-6 it does not switch, and trajectory 4 brings the latch to the center of the metastability zone (point {Vm, Vm}), midway between the logic 1 and logic 0 thresholds.

    So, we have shown that when a pulse of insufficient duration is applied to one arm of the latch, its energy may not be enough for a full switch. With a sufficiently long pulse, the latch is guaranteed to switch. And finally, one can choose an input pulse of such a duration that the latch outputs end up midway between the two stable states. Moreover, the closer the pulse energy pushes the latch outputs to the metastability region, the longer the outputs take to return to a stable state.
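
    The last observation can be illustrated with a common small-signal approximation, which is an assumption here and is not taken from Marino's figure: near the metastable point the difference between the two output voltages grows as v(t) = v0 * exp(t / tau). The time constant tau and the voltage swing below are hypothetical values.

```python
import math


def resolution_time(v0, v_swing=0.5, tau=10e-12):
    """Time in seconds for an initial offset v0 (volts) from the metastable
    point to grow to v_swing, assuming v(t) = v0 * exp(t / tau)."""
    if v0 == 0:
        return math.inf  # perfect balance: this ideal model never leaves the point
    return max(0.0, tau * math.log(v_swing / abs(v0)))


# The smaller the initial offset, the longer the latch takes to resolve.
for v0 in (1e-1, 1e-3, 1e-6, 1e-9):
    print(f"offset {v0:.0e} V -> {resolution_time(v0) * 1e12:.1f} ps")
```

    In this model the recovery time grows only logarithmically as the offset shrinks, but it has no upper bound: an offset of exactly zero never resolves without noise.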

    Now let's turn to the schematic of a modern D flip-flop from a 65 nm library used in ASIC design:


    In the schematic, the data input D passes through the switch GD; the first latch is drawn as two inverters connected in a loop, I1 and G1; the pass switch SW separates the latches; and the second latch is likewise built from two inverters in a loop, I2 and G2. To increase speed, the output is taken from the left arm of the second latch. The flip-flop works as follows. At CK = 0, the input switch passes the signal into the first latch, whose feedback G1 is disabled; the switch between the latches is off; and the second latch holds its state because its feedback G2 is active. At CK = 1, the first latch is cut off from input D and its feedback G1 is enabled; the switch between the latches turns on; and the second latch disables its feedback, so the data from the first latch is copied into the second. If signal D is removed just before the rising edge of CK arrives, the first latch sees an active-low pulse at its input whose duration depends only on the relative timing of the CK and D edges. The situation is thus the same as with the RS latch: the pulse energy on one of the arms may switch the latch, fail to switch it, or drive it into metastability. L. R. Marino's article proves mathematically that the outputs of any flip-flop whatsoever, regardless of its design, can enter a metastable state.
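
    The two-phase operation described above can be sketched as an idealized behavioral model. This is only a logic-level illustration: it ignores delays, output inversion, and metastability itself, and the class and method names are hypothetical.

```python
class MasterSlaveDFF:
    """Idealized master-slave D flip-flop built from two latches."""

    def __init__(self):
        self.master = 0  # state of the first latch
        self.slave = 0   # state of the second latch (drives the output)

    def update(self, ck, d):
        if ck == 0:
            # Input switch is on: the first latch follows D.
            # Switch between the latches is off: the second latch holds.
            self.master = d
        else:
            # First latch is cut off from D and holds its value.
            # Switch between the latches is on: the value is copied over.
            self.slave = self.master
        return self.slave


ff = MasterSlaveDFF()
for ck, d in [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0)]:
    print(f"CK={ck} D={d} -> Q={ff.update(ck, d)}")
```

    Note that a change of D while CK = 1 does not reach the output: only the value present in the first latch at the rising edge is transferred.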

    Let's try to consider all possible states of the latch outputs; to do this, we plot another graph whose two axes are the output potentials. The graph is borrowed from the lectures on logic design by Professor V. B. Marakhovsky, Doctor of Technical Sciences:


    The graph is almost the same as in L. R. Marino's article, but the arrows show the possible trajectories of the latch outputs. Moving along these trajectories, the outputs eventually settle into a stable logic state, {1,0} or {0,1}, corresponding to the state of the inputs (the latch captures the input state). But when the output potentials fall into a certain area in the center of the graph, the final state of the outputs cannot be predicted. This is the region of unstable equilibrium (metastability), and only random factors such as thermal noise determine how the latch leaves it. The final state of the latch outputs after metastability ends is not known in advance: it can be either {1,0} or {0,1}. The size of the metastability region is measured experimentally or calculated with SPICE simulation. Note that in practice metastability shows up not only as a static potential equal to half the supply voltage, but also as weak oscillations around this point. Another important property of metastability is that the time the latch needs to become stable is unpredictable. The exit of a latch from the metastable state is commonly modeled as a Poisson process, so the probability that it is still metastable decays exponentially with time, and in theory the state can last arbitrarily long. Thus, neither the final state of the flip-flop after leaving metastability nor the duration of the metastable state can be predicted in advance. However, if the signal at the flip-flop input does not change during the next clock cycle, the metastable state of the flip-flop outputs at the boundary of the two domains lasts no longer than one clock period.
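
    The statistical nature of the exit time is often summarized by an exponential survival law. The formula below is the commonly used first-order model, given here as an assumption; the symbol tau (the resolution time constant of the latch) does not appear in the sources cited above.

```latex
P(t_{\mathrm{met}} > t) = e^{-t/\tau}
```

    Under this model every additional interval of length tau reduces the probability of still being metastable by a factor of e, but the probability never reaches exactly zero.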

    Formulas and methods for calculating the size of the metastability region and the failure rate can be found in the article Variability in Multistage Synchronizers. I will give only the failure-rate results from that article. The synchronizers were chains of two, three, and four flip-flops connected in series, clocked with a period of 800 ps, with the period of the input signal ranging from 600 ps to 2 ns.


    The vertical axis of the graph shows MTBF in years: the mean time between failures (a failure being a flip-flop falling into a metastable state) for different synchronizers at the entry to a domain clocked with a period of 800 ps. The horizontal axis is the clock period of the domain that is the source of the signal. As the figure shows, the more flip-flops in the synchronizer chain, the less often failures occur.
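
    The trend in the graph can be reproduced qualitatively with the widely used first-order estimate MTBF = exp(S / tau) / (T_W * f_clk * f_data). The sketch below assumes that each extra flip-flop adds one full clock period of settling time S (setup and propagation delays are ignored). The values of tau and T_W are hypothetical and are not taken from the article, so the printed numbers should not be compared with the graph.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def synchronizer_mtbf(n_stages, t_clk, f_data, tau, t_w):
    """First-order MTBF estimate in seconds for a chain of n_stages flip-flops.

    t_clk  - period of the receiving clock, s
    f_data - rate of input transitions, Hz
    tau    - resolution time constant of the flip-flop, s (assumed)
    t_w    - width of the metastability window, s (assumed)
    """
    f_clk = 1.0 / t_clk
    settle = (n_stages - 1) * t_clk  # simplification: one period per extra stage
    return math.exp(settle / tau) / (t_w * f_clk * f_data)


TAU = 20e-12  # hypothetical
T_W = 20e-12  # hypothetical
for n in (2, 3, 4):
    years = synchronizer_mtbf(n, 800e-12, 1e9, TAU, T_W) / SECONDS_PER_YEAR
    print(f"{n} flip-flops: MTBF ~ {years:.2e} years")
```

    Because the settling time sits in the exponent, both adding a stage and lengthening the clock period increase the MTBF by many orders of magnitude.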

    Conclusions


    1. The switching time of a flip-flop depends entirely on the duration of the control pulse at the input of the first latch, which is determined by the phase shift between the data signal and the clock at the flip-flop inputs. If the signal switches outside the ts + th interval, the flip-flop switches correctly and within a predetermined time. But if the input signal changes inside this interval, the flip-flop takes longer to reach a stable state the closer it comes to the metastability zone. If the flip-flop outputs are inside the metastability zone, neither their final logic state nor the settling time can be predicted.

    2. On an oscilloscope, metastability of the flip-flop outputs looks like an output level equal to half the supply voltage, or like weak ripple around this level. In this state, a shoot-through current flows between the power and ground rails in the inverters of the flip-flop. But since the on-resistance of the n- and p-channel transistors is measured in kilohms, this current has no noticeable effect on total circuit power consumption, supply voltage drop (IR drop), or electromigration.

    3. At frequencies near one gigahertz, a failure (due to metastability) in a flip-flop at the boundary of two domains occurs once every few seconds. Connecting two or more flip-flops in series as a synchronizer helps combat metastability. With two flip-flops, a failure at the synchronizer output happens about once a year; with three, once every thousand years; with four, once every 10 billion years. Lowering the operating frequency changes the failure rate exponentially: with a two-flip-flop synchronizer at about 500 MHz, the interval between failures grows by several orders of magnitude, up to one failure per million years. Therefore, if you are designing a circuit for gigahertz frequencies, try to use as few asynchronous domains as possible.

    4. The failure rate is strongly influenced by the parasitic capacitances at the outputs of the first and second latches (which make up the flip-flop); these depend on the parameters of the transistors used and on the flip-flop circuit. In other words, by changing the transistor parameters and the flip-flop circuit you can indirectly affect how often metastability occurs at its outputs. This recipe is only applicable if you design your own cell library.

    5. From the point of view of static timing analysis, all paths between asynchronous domains should be described with the SDC constraint set_false_path.
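
    The synchronizer from conclusion 3 is simply a chain of flip-flops clocked by the receiving domain. The following behavioral sketch shows only its logical effect, namely the added latency; it does not model metastability, and the class name is hypothetical.

```python
class Synchronizer:
    """Chain of n_stages flip-flops clocked by the receiving domain."""

    def __init__(self, n_stages=2):
        self.stages = [0] * n_stages

    def clock(self, async_in):
        # On each receiving clock edge every flip-flop captures the output
        # of the previous one; the first captures the asynchronous input.
        self.stages = [async_in] + self.stages[:-1]
        return self.stages[-1]


sync = Synchronizer(n_stages=2)
print([sync.clock(1) for _ in range(3)])
```

    Each additional stage gives the previous flip-flop one more clock period to leave a metastable state before its value is used, at the cost of one more cycle of latency.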

    Nobody in Russia has studied flip-flop metastability (correct me if I am wrong), but in practice it is enough for a developer to follow the simple recommendations given above. I hope this is useful to someone.
