6

Why are latches and 2-phase clocking schemes frowned upon in modern high-speed ASIC design? I understand that single-edge flip-flop-based designs are easier on STA tools, but are there any other good reasons for this bias in the industry?

Revanth Kamaraj
  • I share your curiosity, since it would seem two-phase clocking would make some things much easier. For example, if one had a peripheral running off a 32768 Hz crystal using two-phase clocking, one could switch smoothly between having its clocks synchronized to the main CPU clock and having them be asynchronous. That wouldn't be reliable with single-phase clocking, since switching on the synchronizer might result in the peripheral receiving a runt (non-synchronized) clock pulse followed by a synchronized one, but with two-phase clocking that wouldn't matter. – supercat Jul 01 '16 at 19:59

1 Answer

6

It is not so much a 'bias' in the industry as a matter of design strategy. Two high-speed clocks 180° out of phase (clock A and clock B) sound like a good solution for input-register/output-register (or count-and-store) clocks, or for preventing race conditions.

At clock rates below about 500 MHz a 2-phase clock is not such a big issue, but at GHz frequencies it is nearly impossible to prevent skew between clock A and clock B, especially after the clocks traverse several turns and layer changes. That quickly renders the 2-phase scheme useless, since the skew would have to be constantly corrected.
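To put rough numbers on this: at 500 MHz the half-period separating the two phases is 1 ns, while at 4 GHz it is only 125 ps, so the same absolute skew eats a far larger share of the window. A back-of-envelope sketch (the 30 ps skew figure and the frequencies are illustrative, not tied to any particular process):

```python
# Illustration of how a fixed clock-A-vs-clock-B skew consumes the phase
# window as frequency rises. All numbers here are illustrative only.

def skew_fraction(freq_hz, skew_s):
    """Fraction of a half-period (the A-to-B phase window) eaten by skew."""
    half_period = 1.0 / freq_hz / 2.0
    return skew_s / half_period

# 30 ps of skew at 500 MHz: 3% of the 1 ns window -- easily absorbed.
print(f"{skew_fraction(500e6, 30e-12):.0%}")   # 3%

# The same 30 ps at 4 GHz: 24% of the 125 ps window -- a serious problem.
print(f"{skew_fraction(4e9, 30e-12):.0%}")     # 24%
```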

It is better to have a single master clock and use 'point of use' delays where needed for data setup and hold time, register read time, and so on. Single-edge flip-flops latch data only on the rising or the falling edge of the clock. For longer delays, a NAND latch can 'trap' the clock pulse until it is used, at which point the latch is reset. Often this 'point of use' delay is just a zig-zag pattern in the trace right at the IC that needs the delay.
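The 'trapping' behavior of a cross-coupled NAND (active-low SR) latch can be sketched in a few lines. This is a behavioral toy model, not gate-level timing: a low pulse on the set input captures the event, the latch holds it with both inputs high, and the consumer clears it with a low pulse on the reset input.

```python
def nand(a, b):
    """2-input NAND on 0/1 values."""
    return 0 if (a and b) else 1

def sr_nand_latch(s_n, r_n, q, q_n):
    """Settle a cross-coupled NAND latch (active-low set/reset inputs)."""
    for _ in range(4):  # iterate the feedback loop to a stable state
        q, q_n = nand(s_n, q_n), nand(r_n, q)
    return q, q_n

# A low pulse on s_n 'traps' the event: Q goes high...
q, q_n = sr_nand_latch(0, 1, 0, 1)
assert (q, q_n) == (1, 0)
# ...and Q stays high after the pulse ends (both inputs high = hold)...
q, q_n = sr_nand_latch(1, 1, q, q_n)
assert (q, q_n) == (1, 0)
# ...until the consumer resets the latch with a low pulse on r_n.
q, q_n = sr_nand_latch(1, 0, q, q_n)
assert (q, q_n) == (0, 1)
```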

With attention to detail, the rising edge can be used to latch an address into a DRAM/RAM/register and the falling edge can be used to read or write data. As long as the address/data is stable before the clock edge arrives (even by a few hundred picoseconds), this helps the single-phase clock scheme work at its best.
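The address-on-rising, data-on-falling division of labor can be modeled behaviorally. This is a hypothetical toy memory (the `ToyRam` class and its interface are mine, purely for illustration), not any real RAM protocol:

```python
# Toy model: address latched on the rising edge, data moved on the falling
# edge of the same single-phase clock. Class and method names are
# illustrative only.

class ToyRam:
    def __init__(self):
        self.mem = [0] * 16
        self.addr = 0

    def clock_edge(self, rising, addr=None, write_data=None):
        if rising:
            # Address must be stable before this edge arrives.
            self.addr = addr
            return None
        if write_data is not None:
            # Falling edge with data present: write cycle.
            self.mem[self.addr] = write_data
            return None
        # Falling edge without data: read cycle.
        return self.mem[self.addr]

ram = ToyRam()
ram.clock_edge(True, addr=5)            # rising edge latches the address
ram.clock_edge(False, write_data=42)    # falling edge writes the data
ram.clock_edge(True, addr=5)
print(ram.clock_edge(False))            # -> 42
```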

CPUs and MPUs still use a 4-phase clock, but only in the state-machine core, to perform the fetch/decode/execute/store steps in an orderly fashion. Note that some modern CPUs may use a 6-phase clock, adding pre-fetch and write-back to the sequence.
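The ordering that the 4-phase clock enforces can be sketched as a tiny state machine, one micro-step per phase. This is a minimal illustrative model with a made-up one-instruction ISA, not any real core:

```python
# Minimal sketch of a core stepped through the four phases in order:
# fetch -> decode -> execute -> store. The 'add' instruction format here
# is invented purely for illustration.

PHASES = ("fetch", "decode", "execute", "store")

def run_instruction(mem, pc, regs):
    """Advance one instruction through all four phases; return the new PC."""
    state = {}
    for phase in PHASES:
        if phase == "fetch":
            state["insn"] = mem[pc]                 # read the instruction word
        elif phase == "decode":
            op, dst, a, b = state["insn"]           # split into fields
            state.update(op=op, dst=dst, a=regs[a], b=regs[b])
        elif phase == "execute":
            if state["op"] == "add":
                state["result"] = state["a"] + state["b"]
        elif phase == "store":
            regs[state["dst"]] = state["result"]    # write back the result
    return pc + 1

regs = {"r0": 2, "r1": 3, "r2": 0}
run_instruction([("add", "r2", "r0", "r1")], 0, regs)
print(regs["r2"])  # -> 5
```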

  • 2
We gave up on 2-phase clocks due to threshold mismatch. Using cross-coupled NANDs instead of a T-gate-based latch is significantly better when your thresholds are ±10% – b degnan Jul 01 '16 at 20:07
  • Does that mean all those 32-bit MIPS processors students design are unrealistic in the sense that they are too simplistic to be practical even though architecturally accurate? – Revanth Kamaraj Jul 01 '16 at 20:14
  • 1
    @bdegnan. I referred to the threshold errors as skewing errors. You're correct about the NAND gates: they create a 'semaphore' effect in that the data or clock is held until needed, which then resets the NAND latch. –  Jul 01 '16 at 20:18
  • @rvt. If they are 'architecturally accurate' then the limit is clock speed. With a very slow clock even a sloppy design might work. For GHz speeds, every aspect of every trace and part must be fine-tuned to within picoseconds to work dependably. –  Jul 01 '16 at 20:22
  • @rvt There are a lot of design constraints that you won't get into until you are in the field. The MIPS is a good, simple core, and it's used all over the place. You are actually constrained by bus IO, so processor speed doesn't matter as much as it used to, because it's all about cache in real architectures. – b degnan Jul 01 '16 at 20:43
  • @bdegnan. It is the same old story: trying to keep the core busy even though the outside data bus cannot run at core speed. Fatter multi-port cache solves some of that problem. When FRAM/MRAM is more mature I think we will see another leap in CPU/MPU/FPGA performance. –  Jul 01 '16 at 20:49
  • 1
    @Sparky256 I'm pretty sure that we won't see it, but we could get lucky. Here's the public results from my last survey: http://degnan68k.blogspot.com/2015/04/assessing-trends-in-performance-per.html I've seen dies that are the actual reticle size due to cache, it's a sad state of affairs. – b degnan Jul 01 '16 at 21:17
  • 2
    @bdegnan. Great blog, and thanks for the help with the NAND gates. With Intel now cranking out chips with 10 nm topology, and stacking layers for more 3D space usage, the mobo bottleneck will persist until we begin to use fiber-optic inlays for signals. Even so, copper traces for legacy hardware will be around for some time. –  Jul 01 '16 at 22:17