
I am storing a 16k constant sine table of 14 bit signed vectors in a package.

I use this package in my module to read out the array in a clocked process.

But I get this warning during synthesis, and synthesis is taking a long time:

"The RAM will be implemented on LUTs either because you have described an asynchronous read or because of currently unsupported block RAM features. If you have described an asynchronous read, making it synchronous would allow you to take advantage of available block RAM resources, for optimized device usage and improved timings. Please refer to your documentation for coding guidelines."

code in package -

TYPE signed_array IS ARRAY (integer RANGE <>) OF signed (DATAWIDTH-1 DOWNTO 0); 

CONSTANT SINE_TABLE_SIZE : integer := QUARTER_LENGTH+1; -- 16384+1
----sine pi/2 = 1 <=> "0111111....1" MSB is 0 because of the signed representation

CONSTANT SINE_TABLE : signed_array(0 TO SINE_TABLE_SIZE-1):= (
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.0), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*9.5873799096e-05), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000191747597311), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000287621393763), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000383495187571), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000479368977855), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000575242763732), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000671116544322), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000766990318743), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000862864086114), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.000958737845553), DATAWIDTH) ,
                    to_signed(integer((2.0**(DATAWIDTH-1)-1.0)*0.00105461159618), DATAWIDTH) ,


process(clk)
begin
    if rising_edge(clk) then
        ctr <= ctr + 1;
    end if;
end process;

cos <= SINE_TABLE(to_integer(unsigned(ctr)));

Any suggestions on how to write VHDL code that infers a block RAM instead of LUTs?

The SINE_TABLE is in a package, and the process is in the main module.

Martin Thompson
Sai Gautam

2 Answers


Move the table lookup right next to the increment, inside the clocked process, so the output is registered.

That is a gigantic lookup table, though. You may want to consider using a compressed lookup table to save on the block RAM. The trade-off is you may need a couple of multipliers.

Here is an example of a pipelined, compressed sine lookup table: https://github.com/alexforencich/verilog-dsp/blob/master/rtl/sine_dds_lut.v . By default, this one has an 18-bit phase input (2^18 = 262,144 equivalent entries) with a 16-bit output width. It consumes 3 block RAMs (two are 512x16 and one is 256x8) and two DSP slices.

alex.forencich

The memory read needs to be registered to be recognised as a block RAM:

process(clk)
begin
    if rising_edge(clk) then
        ctr <= ctr + 1;
        -- Registered (synchronous) read: this is what allows block RAM inference
        cos <= SINE_TABLE(to_integer(unsigned(ctr)));
    end if;
end process;

I would also make ctr of an integer type - then you don't need to faff around with the conversions.
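As a rough illustration of the integer-counter variant, here is a minimal, self-contained sketch. The entity name sine_rom, the generics, and the init_table function are assumptions made up for this example; init_table only stands in for the package constant from the question so that the code is complete, and it assumes the synthesis tool accepts ieee.math_real in constant initialisation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sine_rom is
    generic (
        DATAWIDTH      : positive := 14;
        QUARTER_LENGTH : positive := 16384
    );
    port (
        clk : in  std_logic;
        cos : out signed(DATAWIDTH-1 downto 0)
    );
end entity sine_rom;

architecture rtl of sine_rom is
    type signed_array is array (natural range <>) of signed(DATAWIDTH-1 downto 0);

    -- Hypothetical stand-in for the SINE_TABLE package constant:
    -- a quarter wave, 0 to pi/2 inclusive, computed at elaboration time.
    function init_table return signed_array is
        variable t : signed_array(0 to QUARTER_LENGTH);
    begin
        for i in t'range loop
            t(i) := to_signed(
                integer((2.0**(DATAWIDTH-1) - 1.0)
                        * ieee.math_real.sin(ieee.math_real.MATH_PI_OVER_2
                                             * real(i) / real(QUARTER_LENGTH))),
                DATAWIDTH);
        end loop;
        return t;
    end function;

    constant SINE_TABLE : signed_array(0 to QUARTER_LENGTH) := init_table;

    -- Integer counter: indexes the table directly, no type conversions needed
    signal ctr : integer range 0 to QUARTER_LENGTH := 0;
begin
    process(clk)
    begin
        if rising_edge(clk) then
            -- A ranged integer does not wrap by itself, so wrap explicitly
            if ctr = QUARTER_LENGTH then
                ctr <= 0;
            else
                ctr <= ctr + 1;
            end if;
            -- Registered read of the table
            cos <= SINE_TABLE(ctr);
        end if;
    end process;
end architecture rtl;
```

One thing to watch with an integer counter is the explicit wrap: unlike an unsigned vector, a ranged integer that overflows is an error in simulation.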

Martin Thompson
  • This works, but what I actually want is not a direct assignment of "cos" but an assignment based on the 2 MSBs of ctr, which decide which quadrant of the sine signal is used. So I put a case structure inside this clocked process, but this somehow causes the synthesis tool to stop inferring the block RAM again – Sai Gautam May 04 '15 at 06:46
  • Are you trying to read both the sin and cos in the same clock cycle? If so you need to be careful to use a VHDL construct which the synthesis tool recognises as a dual-read BRAM. Also, keep the memory-reading logic separate from the address calculating logic (in separate processes) if you are using XST - it sometimes struggles to identify RAM constructs when things are mixed together in the same process. – Martin Thompson May 04 '15 at 19:07
  • Thank you...that makes sense why the tool was not inferring the block RAM – Sai Gautam May 06 '15 at 07:01
  • I had mixed a mux inside the same clocked process as the output registration of the block RAM – Sai Gautam May 07 '15 at 07:48
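The separation suggested in the comments could be sketched roughly as follows: one process computes the address from the phase counter, a second contains only the registered table read, and a third applies the quadrant sign. All names here (quadrant_sine, ADDR_BITS, init_table, the process labels) are hypothetical, the table is regenerated locally only to keep the example self-contained, and whether a particular tool infers block RAM from this is not guaranteed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity quadrant_sine is
    generic (
        DATAWIDTH : positive := 14;
        ADDR_BITS : positive := 14  -- quarter wave holds 2**ADDR_BITS + 1 entries
    );
    port (
        clk  : in  std_logic;
        sine : out signed(DATAWIDTH-1 downto 0)
    );
end entity quadrant_sine;

architecture rtl of quadrant_sine is
    constant QUARTER_LENGTH : positive := 2**ADDR_BITS;

    type signed_array is array (natural range <>) of signed(DATAWIDTH-1 downto 0);

    -- Hypothetical stand-in for the SINE_TABLE package constant (0 to pi/2 inclusive)
    function init_table return signed_array is
        variable t : signed_array(0 to QUARTER_LENGTH);
    begin
        for i in t'range loop
            t(i) := to_signed(
                integer((2.0**(DATAWIDTH-1) - 1.0)
                        * ieee.math_real.sin(ieee.math_real.MATH_PI_OVER_2
                                             * real(i) / real(QUARTER_LENGTH))),
                DATAWIDTH);
        end loop;
        return t;
    end function;

    constant SINE_TABLE : signed_array(0 to QUARTER_LENGTH) := init_table;

    -- The two MSBs of phase select the quadrant, the rest index the quarter wave
    signal phase    : unsigned(ADDR_BITS+1 downto 0) := (others => '0');
    signal addr     : integer range 0 to QUARTER_LENGTH := 0;
    signal negate   : std_logic := '0';
    signal negate_d : std_logic := '0';
    signal rom_data : signed(DATAWIDTH-1 downto 0) := (others => '0');
begin
    -- Address calculation, kept away from the memory read
    addr_proc : process(clk)
    begin
        if rising_edge(clk) then
            phase <= phase + 1;
            if phase(ADDR_BITS) = '0' then
                -- Quadrants 0 and 2: read the table forwards
                addr <= to_integer(phase(ADDR_BITS-1 downto 0));
            else
                -- Quadrants 1 and 3: read the table backwards
                addr <= QUARTER_LENGTH - to_integer(phase(ADDR_BITS-1 downto 0));
            end if;
            negate <= phase(ADDR_BITS+1);  -- second half of the period is negative
        end if;
    end process;

    -- Memory read only: nothing but the registered table lookup
    rom_proc : process(clk)
    begin
        if rising_edge(clk) then
            rom_data <= SINE_TABLE(addr);
        end if;
    end process;

    -- Sign selection after the read; negate is delayed to line up with rom_data
    out_proc : process(clk)
    begin
        if rising_edge(clk) then
            negate_d <= negate;
            if negate_d = '1' then
                sine <= -rom_data;
            else
                sine <= rom_data;
            end if;
        end if;
    end process;
end architecture rtl;
```

The extra register on negate is there because the table data appears one clock after the address, so the sign flag has to be delayed by the same amount before the two are combined.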