I understand the x86 operation to perform integer multiplication of two numbers (e.g. on 64 bits) is MUL.

My question is, how is this operation generally implemented at the hardware level? (for instance, on a modern Intel processor). Also, is it executed in a single CPU cycle?

Weier
    Did you try to google something like "hardware multiplier" or "binary multiplier"? – Eugene Sh. Mar 01 '23 at 16:37
  • *Also, is it executed in a single CPU cycle?* That depends on your CPU. But in general, no. Modern processors are heavily pipelined; pretty much nothing is executed in one cycle, though an instruction might have a throughput of 1 per cycle. – Marcus Müller Mar 01 '23 at 16:48
  • Weier, you mentioned an instruction set that has a long history and for which there isn't a single answer (it was done in different ways at different times), and then you ask how it is done generally. Folks that do these things apply a lot of knowledge to varying goals. So I think the better answer for you is to start by learning about the Booth algorithm. I've not read this, [An effective educational module for Booth's multiplication algorithm](https://www.researchgate.net/publication/262177138_An_effective_educational_module_for_Booth%27s_multiplication_algorithm), but the title sounds good. – periblepsis Mar 01 '23 at 16:49
  • Feel free to look in the Intel CPU documentation to see how many clock cycles it takes in each case. There is no way to know how a specific Intel CPU does it, and there is no single generic method. Wikipedia likely lists multiple different methods of computing multiplication in computers. Have you done any research before asking, and is this some kind of homework question? – Justme Mar 01 '23 at 18:01
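The Booth recoding the commenters mention can be illustrated in software. The sketch below is a textbook radix-2 Booth multiplier modeled in Python, not the circuit of any real CPU; the function name and bit width are illustrative. It scans the multiplier two bits at a time, conditionally adds or subtracts the multiplicand, then arithmetic-shifts the combined register right, exactly as the hardware datapath would over successive cycles:

```python
def booth_multiply(m, r, bits=8):
    """Radix-2 Booth multiplication of two signed `bits`-wide integers.

    Models the classic three-register scheme: A holds the multiplicand
    in the upper half, S its negation, and P accumulates the product
    with the multiplier in the lower half plus one appended 0 bit.
    """
    width = 2 * bits + 1                      # total register width
    A = (m & ((1 << bits) - 1)) << (bits + 1)     # multiplicand, upper half
    S = ((-m) & ((1 << bits) - 1)) << (bits + 1)  # its two's-complement negation
    P = (r & ((1 << bits) - 1)) << 1              # multiplier, with extra 0 bit

    for _ in range(bits):
        pair = P & 0b11          # examine the two lowest bits
        if pair == 0b01:
            P += A               # 01: add the multiplicand
        elif pair == 0b10:
            P += S               # 10: subtract it (add the negation)
        P &= (1 << width) - 1
        sign = P >> (width - 1)  # arithmetic shift right by one
        P = (P >> 1) | (sign << (width - 1))

    P >>= 1                      # drop the appended bit
    if P >= 1 << (2 * bits - 1): # reinterpret as signed 2*bits value
        P -= 1 << (2 * bits)
    return P
```

A sequential implementation like this takes one shift per multiplier bit; real hardware unrolls and combines the recoded partial products in a single combinational tree instead.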

1 Answer

It depends on the processor: an 8086 takes several clock cycles for MUL, and a Pentium takes 10 (if I remember right). It also depends on the operand size, as modern processors have larger registers. (FMULs can be done in one clock cycle.) You can look it up in the processor manual (the only one I have is for an 8086).

I believe MUL in modern processors uses the Dadda tree multiplier hardware algorithm.
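The idea behind a Dadda (or Wallace) tree can be modeled in a few lines of Python. This is a simplified software model of unsigned tree multiplication, not any vendor's actual circuit: it generates all partial-product bits, then compresses each column with full adders (3:2 counters) until at most two bits per column remain, so only one final carry-propagate addition is needed. (A true Dadda scheduler compresses as lazily as possible per stage; this sketch just compresses greedily.)

```python
def tree_multiply(a, b, bits=8):
    """Simplified model of a tree multiplier for unsigned `bits`-wide inputs.

    1. Generate one partial-product bit a_i & b_j per bit pair.
    2. Compress columns with full adders until <= 2 bits per column.
    3. One final carry-propagate add yields the product.
    """
    # Step 1: bucket partial-product bits by column (weight i + j)
    cols = [[] for _ in range(2 * bits)]
    for i in range(bits):
        for j in range(bits):
            cols[i + j].append((a >> i) & (b >> j) & 1)

    # Step 2: column compression with 3:2 counters (full adders)
    while any(len(c) > 2 for c in cols):
        new_cols = [[] for _ in range(2 * bits)]
        for k, c in enumerate(cols):
            while len(c) >= 3:
                x, y, z = c.pop(), c.pop(), c.pop()
                new_cols[k].append(x ^ y ^ z)          # sum stays in column k
                carry = (x & y) | (y & z) | (x & z)    # carry moves to k + 1
                if k + 1 < 2 * bits:
                    new_cols[k + 1].append(carry)
            new_cols[k].extend(c)                      # <= 2 leftover bits
        cols = new_cols

    # Step 3: final carry-propagate add of the two remaining rows
    row0 = sum(c[0] << k for k, c in enumerate(cols) if len(c) > 0)
    row1 = sum(c[1] << k for k, c in enumerate(cols) if len(c) > 1)
    return (row0 + row1) & ((1 << (2 * bits)) - 1)
```

The hardware payoff is that every compression stage runs in parallel, so the reduction finishes in O(log n) full-adder delays rather than the O(n) delays of a chain of ripple adders.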

Voltage Spike
  • Dadda is kind of old. A modified version of Booth is as fast as it gets (see *Ravindra P Rajput & M. N Shanmukha Swamy, 2012*). I'm not current on the most up-to-date work in IC design, but in playing around with FPGAs, there's no question Dadda takes twice the LUTs of modified Booth at the bit widths I've tried. I'm pretty sure **no** modern processor uses Dadda. – periblepsis Mar 03 '23 at 04:17