Dividing numbers on an FPGA

Question

I wrote a program for a Cyclone II FPGA that divides 2 64 bit numbers and returns if the remainder is 0 using the modulus (%) operation.

When I compiled the program with 64 bit numbers for the divisor and dividend it used almost all of the logic cells in the device.

Is there a way to do division, or modulus, or [if modulus==0] in a way that uses less logic cells?

Backup and talk about the actual problem you are trying to solve - chances are this is not the most efficient way to do it! — Chris Stratton, Mar 09 '16 at 02:22
It is more just a general question. I was thinking about using an FPGA as a math accelerator to do complex operations. — Eric Johnson, Mar 09 '16 at 02:42
Division is not a nice operation to try to perform in a single clock cycle. I'm a bit surprised that you actually got that to synthesize, not all synthesizers even support division for synthesis. — alex.forencich, Mar 09 '16 at 02:47
Is there a way I can split it into multiple clock cycles so that I can get a net increase in the number of divisions I can do? What I mean is that if I split it into 4 clock cycles, It could do more than 4 divisions simultaneously. — Eric Johnson, Mar 09 '16 at 02:49
ChrisStratton the project I am currently doing is determining if a 64 bit number is prime but want general answers for division. — Eric Johnson, Mar 09 '16 at 02:53
In practice for this kind of thing you'll find it surprisingly hard to beat a CPU's integer divide unit unless you apply some kind of clever non-general technique. Or find a suitable IP block. The device has a lot of embedded *multipliers*, so you may be able to use some sort of Newton-Raphson technique to find the answer. — pjc50, Mar 09 '16 at 10:55

score 5 · Answer 1 · answered Mar 09 '16 at 10:42

Our Open-Source PoC-Library has a multi-cycle division IP core, which can be synthesized as a pipeline. The bit count of the dividend and divisor as well as the radix can be configured to the users needs. This module returns the quotient as well as the remainder.

entity arith_div is
  generic (
    A_BITS             : positive;          -- Dividend Width
    D_BITS             : positive;          -- Divisor Width
    RAPOW              : positive := 1;     -- Power of Compute Radix (2**RAPOW)
    PIPELINED          : boolean  := false  -- Computation Pipeline
  );
  port (
    -- Global Reset/Clock
    clk : in std_logic;
    rst : in std_logic;

    -- Ready / Start
    start : in  std_logic;
    ready : out std_logic;

    -- Arguments / Result (2's complement)
    A : in  std_logic_vector(A_BITS-1 downto 0);  -- Dividend
    D : in  std_logic_vector(D_BITS-1 downto 0);  -- Divisor
    Q : out std_logic_vector(A_BITS-1 downto 0);  -- Quotient
    R : out std_logic_vector(D_BITS-1 downto 0);  -- Remainder
    Z : out std_logic  -- Division by Zero
  );
end arith_div;

See the source of PoC.arith.div for the full implementation (it's too long to post it here).

score 3 · Answer 2 · answered Mar 09 '16 at 02:57

Do you know anything about the divisor? If you do, then there are several things that can be done.

If the divisor \$m=2^k\$ is a power of 2, then $$\mathbb{mod}(x, m)=0\iff x \& (2^k-1)=0 $$.
If the divisor is a known constant, then there's several tricks, see: http://www.hackersdelight.org/divcMore.pdf and https://hal.inria.fr/ensl-00642145/PDF/LUTConstDiv.pdf

The clue to solving division, in general, is Euclidian Division. Read https://en.wikipedia.org/wiki/Euclidean_division to get an understanding of why pipelining this algorithm is necessary. It basically boils down to the same algorithm you would use to solve division by hand.

Dividing numbers on an FPGA

2 Answers2