Re: synchronizer_flop in [oc] misc directory



Hi Lin:

Here are my experience and thoughts about the synchronizer_flop in the misc
directory.  I hope that I can convince everyone to use it ALWAYS!


In a system with several clocks, there are often critical signals which must
cross from one clock domain to another.

The standard way to handle this is to have a bunch of wires which carry
data and an extra set of wires which signal between clock domains that
the data is available, and that it has been consumed.  These might be called
guard signals.

The sender of one of these guard signals sends it based on the sender clock.
The receiver first captures the guard signal in a single flop, then uses the
synchronized guard signal to control state machines.


In an RTL simulation, it is easy to write these in Verilog as simple lines
like:

always @(posedge receive_clock) synchronized_guard_signal <= guard_signal ;

You can let synthesis make the flop for you.

That doesn't work well when you go all the way to silicon.  The problem is
that the flop misbehaves.  You don't know the name of the flop, so you can't
fix it.  But you find out its name soon enough, when your gate-level
simulation fails!


Here is what happens.  You have a sender clock tree and a receiver clock
tree.  Even if you drive these trees from outside your chip with the same
signal, they typically will have different clock tree delays, so the sender
and receiver clocks will arrive at their flops at different times.

It's worse than that.  As you do fast, typical, and slow simulations, the
delays through the clock trees will vary differently for the different
clock trees.

The result is that you will often see that the signal starts out in the
sending clock domain as valid data, but it becomes X's as it is received by
the synchronization flop.  This happens if the setup and hold requirements
of the flop are not met.
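To make the mechanism concrete, here is a rough sketch of how a gate-level
library flop model typically turns a setup or hold violation into an X.
Everything in it is an assumption for illustration: the names
example_udp_dff and example_lib_dff, and the 0.2 ns / 0.1 ns limits, are
hypothetical and do not come from any real cell library.

```verilog
`timescale 1ns/1ps

// Hypothetical UDP: a D flop whose state goes to X when notifier toggles.
primitive example_udp_dff (q, d, clk, notifier);
  output q;
  reg    q;
  input  d, clk, notifier;
  table
  // d  clk  notifier : q : q+
     0   r      ?     : ? : 0 ;  // rising clock edge captures 0
     1   r      ?     : ? : 1 ;  // rising clock edge captures 1
     ?   n      ?     : ? : - ;  // falling clock edge: hold state
     *   ?      ?     : ? : - ;  // data change alone: hold state
     ?   ?      *     : ? : x ;  // notifier toggled: violation, go to X
  endtable
endprimitive

// Hypothetical library cell wrapping the UDP with a timing check.
module example_lib_dff (output wire q, input wire d, input wire clk);
  reg notifier;  // toggled by the simulator when the check below fails

  example_udp_dff u_dff (q, d, clk, notifier);

  specify
    // Assumed limits, for illustration only.
    specparam t_setup = 0.2, t_hold = 0.1;
    // If d changes inside the setup/hold window around posedge clk,
    // the simulator toggles notifier, which drives q to X via the UDP.
    $setuphold(posedge clk, d, t_setup, t_hold, notifier);
  endspecify
endmodule
```

When the sender and receiver clock edges drift relative to each other, the
guard signal can land inside that window, and the X then spreads into the
receive state machine.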


If you can find the special synchronization flops, you can substitute a
special gate-level simulation flop (for ONLY the critical synchronization
flops) which never outputs X unless it has X as an input.  If it has a
valid 1'b1 or 1'b0 as input, it always makes a valid 1'b1 or 1'b0 at its
output.

A similar problem happens with the reset input to the flop.  What if the
reset signal changes at the synchronization flop near when the clock
changes?  1'bX!  You can have a simulation model which produces valid data
instead.
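A minimal sketch of such a substitute simulation model might look like
this.  The module name and port names are my assumptions for illustration,
not necessarily those of the synchronizer_flop in the misc directory.  The
whole trick is that the model has no specify block and no timing checks,
so a setup, hold, or reset-recovery violation has no way to force an X.

```verilog
// Hypothetical simulation-only model for the critical synchronizer flops.
// Swap it in for the library flop ONLY at the synchronization points.
module example_sync_flop_sim_model (
  input  wire data_in,       // signal arriving from the sending clock domain
  input  wire clk_out,       // clock of the receiving domain
  input  wire async_reset,   // active-high asynchronous reset
  output reg  sync_data_out  // captured copy in the receiving domain
);
  // No $setuphold or $recovery checks here, so the output is X only
  // when data_in itself is X at the capturing clock edge.
  always @(posedge clk_out or posedge async_reset) begin
    if (async_reset)
      sync_data_out <= 1'b0;
    else
      sync_data_out <= data_in;
  end
endmodule
```

Note that this only hides the X in simulation.  The real flop can still go
metastable, which is why the timing constraints matter.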

This substitution of low-level flop models is easier if the source
instantiates a flop with a name which is easy to find.  If you manually
instantiate your special synchronization flops, you are at least
documenting at source level what your design intent is.

(I expect that no-touches will be needed to keep the low-level flop from
being synthesized into ordinary logic.)


Here is what I suggest people do when they have logic which must cross clock
domains:

1) Manually instantiate the special synchronization flops needed to cross
clock domains.

2) Use a receive_sync_clock signal as the clock to the special
synchronization flops.

3) Bring the receive_sync_clock signal to the top level, running in parallel
    with the receive_clock signal used for all logic in the receiving clock
    domain.

4) In the synthesis constraints, make sure that the receive_sync_clock to
    receive_clock timing constraint is fewer nanoseconds than the clock
    period of the receiving clock.  This makes sure that the synchronization
    flop has time to settle out of metastability and deliver valid data to
    the rest of the receive state machine.

5) In the synthesis constraints, make sure that the sender_clock to
    receive_sync_clock timing constraint is fewer nanoseconds than the clock
    period of the FASTER of the sender_clock or the receive_clock.  This
    constraint makes it easier to have cycle-accurate simulations across the
    slow-typical-fast parameter set.

6) It is usually a good idea for all resettable flops in the design to come
    out of reset at the same time.  So the external reset signal should be
    synchronized at the chip edge (using a manually instantiated
    synchronizer flop) and the synchronized reset signal should be sent to
    all receivers.  This signal may go to MANY receivers, so it might need
    to be pipelined through a few normal flops so that synthesis can meet
    timing.

7) The test environment will probably want the sender_clock, the
    receive_clock, and the receive_sync_clock (same as receive_clock) to
    have known relationships so that scan testing can be done.  It seems
    that the test environment needs this so much that designing the clock
    trees to operate during test mode might be the FIRST concern (or maybe
    second, after considering timing to external signals).  Carefully
    specify the clock tree constraints for the several clocks, and their
    related _sync_ versions, so that testing is possible.  I am almost
    certain that the designer should NEVER say that a signal is a false-path
    signal, even if it crosses from one clock domain to the other.  Sure,
    the circuit will work.  But if synthesis and layout tools go nuts, you
    won't be able to test the chip.
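
As an illustration of steps 1 to 3 above, a receiver might be written
roughly like this.  It is only a sketch under assumptions: the module
names, the port names, the 8-bit data width, and the rising-edge detect on
the guard signal are all mine, and the sender is assumed to hold data_in
stable until the guard has been seen.

```verilog
// Hypothetical stand-in for the manually instantiated synchronizer flop.
module example_sync_flop (
  input  wire data_in,
  input  wire clk_out,
  input  wire async_reset,
  output reg  sync_data_out
);
  always @(posedge clk_out or posedge async_reset)
    if (async_reset) sync_data_out <= 1'b0;
    else             sync_data_out <= data_in;
endmodule

// Hypothetical receiver: the synchronizer runs on receive_sync_clock,
// everything else in the receiving domain runs on receive_clock.
module example_receiver (
  input  wire       receive_clock,       // clock for ordinary receive logic
  input  wire       receive_sync_clock,  // separate clock pin, brought to top
  input  wire       reset,               // active-high, already synchronized
  input  wire       guard_signal,        // from the sender clock domain
  input  wire [7:0] data_in,             // held stable by the sender
  output reg  [7:0] data_captured
);
  wire synchronized_guard_signal;
  reg  guard_previous;

  // Step 1 and 2: an instance with an easy-to-find name, clocked by
  // receive_sync_clock rather than receive_clock.
  example_sync_flop guard_synchronizer (
    .data_in       (guard_signal),
    .clk_out       (receive_sync_clock),
    .async_reset   (reset),
    .sync_data_out (synchronized_guard_signal)
  );

  // Only the synchronized guard is used by logic on receive_clock.
  always @(posedge receive_clock or posedge reset) begin
    if (reset) begin
      guard_previous <= 1'b0;
      data_captured  <= 8'h00;
    end else begin
      guard_previous <= synchronized_guard_signal;
      // Capture the data on the rising edge of the synchronized guard.
      if (synchronized_guard_signal && !guard_previous)
        data_captured <= data_in;
    end
  end
endmodule
```

Keeping receive_sync_clock as its own port is what lets the constraints in
steps 4 and 5 be written against a separate clock name, even though both
clocks come from the same source.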


I am going to submit this idea to opencores for their "best practices HDL
guide".  I would appreciate info from anyone who knows a better idea!

Blue Beaver



> Hi, Blue Beaver,
>   I don't understand how the synchronizer_flop works. It seems just like a
> common D type flip-flop with asynchronous reset. After synthesis, it must
> also be implemented by the D type flip-flop from the target library. How
> could it avoid the X in the gate-level simulation? Can you explain for me?
>
> Best Regards!
> Lin Sheng


