How do commercial microprocessors meet timing with a gigahertz clock?

Question

I am having trouble making a relatively simple FPGA design (for an Altera Cyclone IV) meet timing for logic driven by a 250 MHz clock. This makes me wonder how commercial microprocessors (such as the Intel Core i7) manage to meet timing at clock frequencies more than an order of magnitude higher.

How can commercial microprocessors meet timing at 3.8 GHz when I'm struggling at 250 MHz for an FPGA?

An FPGA and a processor are apples and oranges. The FPGA is built out of relatively large modules/cells that are interconnected. The compiler for the FPGA is no better or worse than a software compiler, meaning there is a lot of room for performance improvement: your signals are routed all over the chip, through large, slow cells, and that takes time. A processor, by contrast, contains exactly the gates it needs, with no extra routing (apart from things like JTAG scan and BIST). The same compiler issue applies, but there are some better (and expensive) chip compilers out there. — old_timer, Sep 11 '12 at 21:35

score 7 · Accepted Answer · answered Sep 12 '12 at 18:06

FPGAs don't actually have "gates" per se. They typically have Look-Up Tables (LUTs). LUTs are typically implemented using SRAMs. For instance, Spartan 3 FPGAs use 16-bit SRAMs; that is, four address inputs produce one output signal. "Programming" is done by loading the SRAM with a bit pattern representing the truth table, such that for e.g. 2-input XOR, you have address 00 = output 0, address 01 = output 1, address 10 = output 1, address 11 = output 0.
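To make the truth-table idea concrete, here is a minimal Python sketch of a 4-input LUT modelled as a 16-bit word. The function names `make_lut4` and `lut4_read` are made up for this illustration and do not correspond to any vendor tool or primitive; the bit ordering (input `a` as the least significant address bit) is also an assumption.

```python
def make_lut4(func):
    """Build the 16-bit 'SRAM contents' for a 4-input boolean function.

    Bit number `addr` of the result holds the output for that input
    combination. Input a is assumed to be address bit 0 (illustrative choice).
    """
    bits = 0
    for addr in range(16):
        a = (addr >> 0) & 1
        b = (addr >> 1) & 1
        c = (addr >> 2) & 1
        d = (addr >> 3) & 1
        if func(a, b, c, d):
            bits |= 1 << addr
    return bits


def lut4_read(bits, a, b, c, d):
    """Look up the output: the four inputs form the SRAM address."""
    addr = a | (b << 1) | (c << 2) | (d << 3)
    return (bits >> addr) & 1


# "Program" the LUT as a 2-input XOR; inputs c and d are ignored.
xor_bits = make_lut4(lambda a, b, c, d: a ^ b)
print(hex(xor_bits))                      # 0x6666
print(lut4_read(xor_bits, 1, 0, 0, 0))    # 1
print(lut4_read(xor_bits, 1, 1, 0, 0))    # 0
```

Note that the lookup costs the same regardless of which function is loaded: a 2-input XOR still occupies a whole 16-entry table, which is the overhead the answer goes on to describe.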

This all means that FPGAs actually have many, many extra and unnecessary gates to perform the same logic function. If you need FPGAs for reprogrammability and rapid prototyping, then this is great! In fact, some people implement the design first in the FPGA, debug it, and then move to an ASIC, which will be smaller, faster, and consume less power, all while doing the same thing the FPGA does.

Modern microprocessors are also pipelined. For instance, in a simple FPGA program, a very large calculation involving several adds and maybe a few multiplies and a comparison may be carried out in the same clock cycle. Doing all this work in one clock cycle means the clock cycle must be long. In a pipelined implementation (which is possible to implement in FPGAs and is often used to achieve timing closure), the big calculation is broken down into pieces, and each piece is executed in one much shorter clock cycle. It still takes about the same amount of time to do the calculation, but the advantage is that after the first piece is calculated and the first partial datum has moved to the second piece, the first piece can immediately begin processing the second datum. The first calculation will still take many cycles to complete, but once it is done, a new calculation will be completed during every clock cycle.
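The latency-versus-throughput trade-off described above can be put into rough numbers. The sketch below is a simplified model, not a timing analysis: the stage delays are invented for illustration, and register setup time, clock-to-output delay, and clock skew are lumped into one optional `register_overhead_ns` parameter (also an assumption).

```python
def unpipelined_time_ns(stage_delays_ns, n_items):
    """All pieces of the calculation run in one clock cycle.

    The clock period must cover the sum of all the combinational delays.
    """
    period = sum(stage_delays_ns)
    return n_items * period


def pipelined_time_ns(stage_delays_ns, n_items, register_overhead_ns=0.0):
    """One piece per clock cycle, with a register between pieces.

    The clock period only has to cover the slowest piece. The first result
    appears after len(stage_delays_ns) cycles, then one result per cycle.
    """
    period = max(stage_delays_ns) + register_overhead_ns
    cycles = len(stage_delays_ns) + n_items - 1
    return cycles * period


# Hypothetical calculation split into four equal 2 ns pieces.
delays = [2.0, 2.0, 2.0, 2.0]

print(unpipelined_time_ns(delays, 1))      # 8.0    (8 ns clock, 125 MHz)
print(pipelined_time_ns(delays, 1))        # 8.0    (same latency, 2 ns clock)
print(unpipelined_time_ns(delays, 1000))   # 8000.0
print(pipelined_time_ns(delays, 1000))     # 2006.0
```

With these made-up numbers the latency of a single calculation is unchanged, but the clock can run four times faster and a long stream of data finishes in roughly a quarter of the time.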

So, in a nutshell, an FPGA has generic logic while a CPU has specific logic. An FPGA has generic routing while a CPU has specific routing. An FPGA may be pipelined, but a CPU is definitely pipelined.

score 6 · Answer 2 · answered Sep 11 '12 at 21:53

Expanding on dwlech's comment. The processors have direct copper connections. The FPGAs are interconnected through programmable connections. Also the processors put critical stuff next to each other. The FPGAs also need room for the SRAM that holds the programming.

Brian Carlton

Keep in mind that processors from suppliers such as Intel are built on the bleeding edge of technology, where speed and power tradeoffs are state of the art. It is also no simple feat to "meet timing" on a multi-gigahertz processor core, even with the specific advantages noted by Brian Carlton. – Michael Karas Sep 11 '12 at 22:00

Despite what @Michael Karas points out, the latest FPGAs are often on the bleeding edge of technology for the fabs too. – Brian Carlton Sep 11 '12 at 22:51
