
Introduction

Having found multiple, sometimes conflicting or incomplete information on the internet and in some training classes about how to create timing constraints in SDC format correctly, I'd like to ask the EE community for help with some general clock generating structures I have encountered.

I know that there are differences in how one would implement a certain functionality on an ASIC or FPGA (I have worked with both), but I think there should be a general, correct way to constrain the timing of a given structure, independent of the underlying technology - please let me know if I'm wrong on that.

There are also differences between the implementation and timing analysis tools of different vendors (despite Synopsys offering SDC parser source code), but I hope these are mainly syntax issues that can be looked up in the documentation.

Please also see the related question ASIC timing constraints via SDC: How to correctly specify a multiplexed clock?

Question

This is about the following ripple clock divider, which is part of the clkgen module, which in turn is part of a larger design that uses the generated clocks:

[Schematic: ripple clock divider]

The generation of clk0 seems to be relatively straight-forward:

create_clock [get_pins clkgen/clk0] -name baseclk -period 500

However, I am not so sure about the SDC commands for the generated, divided clocks clk2, clk4 and clk8: How should the source and target options be specified? My initial thought was that the target is the output pin of the clock-generating cell and the source is as close to the target as possible:

create_generated_clock -name div2clk -source [get_pins clkgen/divA/clk] -divide_by 2 [get_pins clkgen/divA/q]

The source could also be the module's clock input pin:

create_generated_clock -name div2clk -source [get_pins clkgen/clk0] -divide_by 2 [get_pins clkgen/divA/q]

Or the previously defined source clock, as suggested here:

create_generated_clock -name div2clk -source [get_clocks baseclk] -divide_by 2 [get_pins clkgen/divA/q]

...which also raises the question if the source or the target options need to be something other than get_pins, such as get_nets, get_registers or get_ports.

To keep the example as general as possible, let's assume that the generated clocks clk2, clk4 and clk8 could be driving other, potentially interacting (clock domain crossing) registers (not shown in the schematic).

I think the constraints for clk4 and clk8 should be obvious once we know how the clk2 constraint is written.

The X1 instance (a simple buffer) in the schematic is just a place-holder to highlight the issue of where in the clock propagation network the source option of the create_generated_clock should be set, as automatic place&route tools are usually free to place buffers anywhere (such as between the divA1/q and divB1/clk pins).

FriendFX

1 Answer

I'd say the rule of thumb is: use either an input port of the top module or the Q pin of an internal flip-flop as the source of a generated clock.

Example Verilog code:

module top (

input clk,
input rst,

...

);

...

always @(posedge clk or negedge rst)
begin
if (rst == 1'b0)
    div_2_clk <= 1'b0;            // asynchronous, active-low reset
else
    div_2_clk <= ~div_2_clk;      // toggle on every rising edge of clk: clk / 2
end

...

endmodule

Example SDC code:

create_clock -name clk -period 5 [get_ports clk]
...
create_generated_clock -name slow_clk -source [get_ports clk] -divide_by 2 [get_pins div_2_clk_reg/Q]

I did not test the syntax above. Also note the suffix _reg added to the RTL name of the signal: the synthesis tool adds it when it detects that the signal must be represented by a flip-flop. This suffix may vary between tools (I don't know for sure).

If you use an RTL wrapper around flip-flops, set the source of the generated clocks to the internal Q pin of the flip-flop, not the output pin of the wrapper.

If you follow these simple rules, you need not worry about any buffers added by the synthesis or P&R tools.
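
Applied to the ripple divider from the question, the rule above might look like the following sketch. It is untested, like the snippet above, and rests on several assumptions: clk0 is a top-level port, the three divider flip-flops are instances clkgen/divA, clkgen/divB and clkgen/divC with an output pin named q (only divA appears in the question; divB and divC are hypothetical names), and the clock names div4clk and div8clk are made up for illustration.

```tcl
# Base clock on the top-level port (assumed to be named clk0),
# using the period from the question.
create_clock -name baseclk -period 500 [get_ports clk0]

# clk2: baseclk divided by 2, defined on the Q pin of the first flip-flop.
create_generated_clock -name div2clk \
    -source [get_ports clk0] \
    -divide_by 2 [get_pins clkgen/divA/q]

# clk4: the master is the Q pin of the previous stage, which already
# carries div2clk (divB is a hypothetical instance name).
create_generated_clock -name div4clk \
    -source [get_pins clkgen/divA/q] \
    -divide_by 2 [get_pins clkgen/divB/q]

# clk8: same pattern, one stage further (divC is a hypothetical instance name).
create_generated_clock -name div8clk \
    -source [get_pins clkgen/divB/q] \
    -divide_by 2 [get_pins clkgen/divC/q]
```

Because every source and target here is a port or a flip-flop Q pin, none of the constraints references a net or a buffer pin that the tools could restructure.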

Vasiliy
  • Thanks for your answer! Does that mean that the timing analyser will add the path (and potential inserted buffer) delays to the latency of the (generated) clocks? Should my first SDC line therefore be `create_clock [get_ports clk0]` instead (assuming `clk0` is the top-level port)? If you could incorporate that into your answer and also include at least one complete example of the `create_generated_clock` statement, I'd be happy to accept it! – FriendFX Sep 27 '13 at 00:07
  • @FriendFX, I added some code snippets, but treat them with care - they are not verified. – Vasiliy Sep 27 '13 at 09:43
  • thanks for the SDC snippet (didn't need the Verilog). For the `create_*clock` commands, does that mean that the timing analyser will add the path (and potential inserted buffer) delays to the latency of the (generated) clocks? – FriendFX Sep 27 '13 at 13:53
  • @FriendFX, what do you mean by "the latency of the (generated) clock"? Latency with respect to what? Its source clock? If so, then you're probably interested in something related to clock domain crossing between these clocks? – Vasiliy Sep 27 '13 at 22:36
  • What I meant was if the path delay from the `source` to the `target` in a generated clock will include any buffers the P&R tool might insert between these nodes? – FriendFX Sep 28 '13 at 12:29
  • @FriendFX, I'm not sure that I can answer this question without understanding why you are interested in the delay between these clocks. In general, once you perform timing analysis on the post-P&R netlist, all delays in the clock tree will be accounted for (buffers, muxes, high-fanout nets, etc.) – Vasiliy Sep 28 '13 at 12:50
  • My only worry was that I read in the [Best Practices for the Quartus II TimeQuest Timing Analyzer](http://www.altera.co.jp/literature/hb/qts/qts_qii53024.pdf) that "the `-source` option should refer to __the nearest clock pin of the specified target.__" This might only be relevant when using multiple clocks on the same port and for that case, the `-master_clock` option can be used anyway. I just want to make sure I am not preventing the timing analyser from taking all delays properly into account. – FriendFX Sep 30 '13 at 00:10
  • As for your question "why are you interested in delay between these clocks": If there are any register-to-register data paths between `clk0` and `clk4` in my example, any delays in the clock network between `clk0` and `clk4` need to be taken into account by the timing analyser for these paths, am I right? – FriendFX Sep 30 '13 at 00:17
  • @FriendFX, I can't answer Quartus specific question because I've never used Altera's FPGAs. If you have a path between `clk0` and `clk4` registers, you'll either define it as false path (if the clocks are mutually exclusive, if you use synchronizers, ...) or as a multi-cycle path. In the former case the tool should not perform any timing analysis and it is up to you to ensure that no logical failure can occur. In the latter case, all relevant delays were taken into account by other tools that I saw. – Vasiliy Sep 30 '13 at 05:49
  • I also don't want to be Quartus-specific. Unfortunately, I haven't found any other (ASIC-related) SDC resources which even go into these details - any pointers would be appreciated! I hope SDCs to be universal enough to apply across tools and flows, as mentioned in my original question. – FriendFX Sep 30 '13 at 08:07
  • @FriendFX, I have no good references to give you. Try to find the Synopsys Design Compiler User's Guide - I suspect this is the source of the most complete documentation. – Vasiliy Sep 30 '13 at 09:50
  • I found it [here](http://web.mit.edu/fredchen/www/share/dcug.pdf), but unfortunately it doesn't give any specifics about the `create_generated_clock` SDC command. I accepted your answer. Maybe I'll start a separate question about the details of the `-source` option at some point. Thanks for your efforts! – FriendFX Oct 01 '13 at 00:44
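
The two options mentioned in the comments for register-to-register paths between clk0 and clk4 could be written in SDC roughly as follows. This is an untested sketch: the clock names baseclk and div4clk are assumptions (div4clk does not appear in the thread), the multicycle values are placeholders rather than recommendations, and the two options are alternatives that are not meant to be applied together.

```tcl
# Option 1: exclude the crossing from timing analysis, e.g. when
# synchronizers are used. Correct behaviour is then the designer's
# responsibility.
set_false_path -from [get_clocks baseclk] -to [get_clocks div4clk]
set_false_path -from [get_clocks div4clk] -to [get_clocks baseclk]

# Option 2: keep the paths timed, but relax them as multicycle paths.
# The values 2 and 1 are placeholders; the right numbers, and whether
# -start or -end applies, depend on the actual clock relationship.
set_multicycle_path 2 -setup -from [get_clocks baseclk] -to [get_clocks div4clk]
set_multicycle_path 1 -hold  -from [get_clocks baseclk] -to [get_clocks div4clk]
```

With option 2 the paths stay in the timing report, so the clock network delays between the two clocks remain part of the analysis.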