I'm building my design with Vivado HLS and Vivado, and I'm doing some fairly large transfers between DDR and my custom IP block, in both directions.

Each transfer from DDR to the custom IP is 256x256x4 = 262144 bytes, and it happens 4 times.

My MM2S (Memory Mapped to Stream) throughput is 350 MB/s and my S2MM (Stream to Memory Mapped) throughput is 200 MB/s.

I know I can get better throughput, and I guess these low figures are related to the parameters of the AXI DMA block.

That's what I came here to ask: please help me understand what the correct parameters should be, since I still can't work it out from reading the LogiCORE product guide.

Width of buffer length n: From what I understand, this sets the maximum length of a transfer in bytes as 2^n. So in my case, since 2^18 = 262144, should I put 18 here?

Memory Map Data Width: Data width in bits of the AXI MM2S Memory Map Read data bus. I have no idea here. My words are 32 bits and I defined the input stream of my block to be 32 bits wide, but what is this?

Stream Data Width: I guess I should put 32 here, correct?

Max Burst Size: Burst partition granularity setting. This setting specifies the maximum size of the burst cycles on the AXI4-Memory Map side of MM2S. Valid values are 2, 4, 8, 16, 32, 64, 128, and 256.

Again, I have no idea what to put here.

I could take a trial-and-error approach and change parameters until I find the best ones, but the problem is that each re-synthesis and re-implementation in Vivado takes a lot of time...

João Pereira

1 Answer

Width of buffer length n: This is exactly what you think: the largest transfer, in bytes, that the IP can perform with a single command. Note that an n-bit field can only hold values up to 2^n - 1, so 18 bits tops out at 262143 and you most likely need 19 bits to represent 2^18 = 262144. Check the product guide to make sure.
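As a quick check of the bit-width arithmetic, here is a minimal Python sketch. It assumes the length field is a plain unsigned n-bit value (so its maximum is 2^n - 1), which should be confirmed against the product guide; `min_length_width` is a hypothetical helper name, not part of any Xilinx tool.

```python
def min_length_width(transfer_bytes: int) -> int:
    """Smallest n such that an unsigned n-bit field can hold transfer_bytes.

    Assumption: the field stores the byte count directly, so its
    largest representable value is 2**n - 1.
    """
    if transfer_bytes < 1:
        raise ValueError("transfer_bytes must be positive")
    return transfer_bytes.bit_length()


transfer = 256 * 256 * 4          # bytes per transfer, from the question
width = min_length_width(transfer)
print(f"{transfer} bytes needs a {width}-bit length field")
print(f"an 18-bit field would max out at {2**18 - 1} bytes")
```

Under that assumption, a transfer of exactly 2^18 bytes does not fit in 18 bits, which is why 19 is the safer setting.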

Memory Map Data Width: This is on the AXI side. You can put whatever you want (AXI will upsize/convert as needed), but in my experience it's better to avoid width conversion and clock conversion as much as possible. That means that if your AXI memory is 128 bits at 100 MHz, you should use the same 100 MHz clock here with a 128-bit-wide port. The Zynq expects 32 or 64 bits, and I guess the upsize/convert is "free" there since it's done in the fixed hardware.

Max Burst Size: This also affects the AXI side. It's the maximum number of beats, each Memory Map Data Width bits wide, that the core will issue in a single transfer request. Higher is usually better, because of the way memories work with bursts. However, it will affect your system's performance (arbitration) and possibly inflate the core's size if you use store-and-forward (which I'm pretty sure the IP core now forces you to use; it used to be optional). The impact of that option depends mostly on the AXI infrastructure and load. On a lightly loaded infrastructure with large write/read acceptance, you won't see any impact.
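To get a feel for what the burst setting means in bytes, the following Python sketch works out how many bytes one burst moves and how many bursts a 262144-byte transfer would be split into. It assumes every burst runs at the full configured length and ignores alignment and any splitting done by the interconnect; the function names are hypothetical.

```python
def burst_bytes(mm_width_bits: int, max_burst_beats: int) -> int:
    """Bytes moved by one full-length burst on the memory-mapped side."""
    return (mm_width_bits // 8) * max_burst_beats


def bursts_needed(transfer_bytes: int, mm_width_bits: int, max_burst_beats: int) -> int:
    """Number of full-length bursts needed to cover transfer_bytes (rounded up)."""
    per_burst = burst_bytes(mm_width_bits, max_burst_beats)
    return -(-transfer_bytes // per_burst)  # ceiling division


transfer = 256 * 256 * 4  # bytes per transfer, from the question
for beats in (16, 256):
    print(
        f"64-bit bus, burst of {beats} beats: "
        f"{burst_bytes(64, beats)} bytes per burst, "
        f"{bursts_needed(transfer, 64, beats)} bursts per transfer"
    )
```

Fewer, longer bursts mean fewer address requests for the same amount of data, which is the reason a higher setting usually helps.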

Stream Data Width: This is the AXI-Stream side. This is what your own IP needs; in your case it seems to be 32 bits.

Don't forget that the AXI-Stream and AXI ports don't have to use the same width and clock. However, for maximum throughput, the AXI port must have higher throughput than the AXI-Stream side.

For instance, if your AXI-Stream side (and thus your core) uses 32 bits with a 150 MHz clock, it effectively has a throughput of 4.8 Gbit/s. If your AXI port runs at 100 MHz, it can't be 32 bits wide since it won't have enough throughput (3.2 Gbit/s < 4.8 Gbit/s). At 64 bits (6.4 Gbit/s), you would have enough to feed your IP core continuously.
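The same comparison can be scripted so you can try width and clock combinations without re-running synthesis. This is only a sketch of peak numbers (width times clock); real throughput will be lower because of DDR and interconnect overhead. The candidate width list is an assumption about what the core offers, and the sketch treats equal peak throughput as sufficient, which is also an assumption.

```python
def peak_mbit_s(width_bits: int, clock_mhz: int) -> int:
    """Peak throughput in Mbit/s: one word of width_bits per clock cycle."""
    return width_bits * clock_mhz


def min_mm_width(stream_bits, stream_mhz, mm_mhz,
                 widths=(32, 64, 128, 256, 512, 1024)):
    """Narrowest memory-mapped width whose peak rate covers the stream side.

    `widths` is an assumed list of selectable widths; returns None if
    none of them is wide enough.
    """
    needed = peak_mbit_s(stream_bits, stream_mhz)
    for w in widths:
        if peak_mbit_s(w, mm_mhz) >= needed:
            return w
    return None


# Example from the answer: 32-bit stream at 150 MHz, AXI port at 100 MHz.
print(peak_mbit_s(32, 150))          # stream side, Mbit/s
print(peak_mbit_s(32, 100))          # 32-bit AXI port, Mbit/s
print(min_mm_width(32, 150, 100))    # narrowest width that keeps up
```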

Jonathan Drolet
  • Ty for that answer. I have just one doubt about the Memory Map Data Width. What is my AXI memory? My memory-mapped side is DDR, so which width should I put? Right now I have 64 bits with the same clock as the AXI-Stream (100 MHz), so a throughput of 6.4 Gbit/s, while the AXI-Stream is 32 bits, so 3.2 Gbit/s. Could I use something like 512 bits (51.2 Gbit/s) and get better results? – João Pereira Jul 02 '15 at 11:14
  • It's pointless to have a larger width than the system bus. On Zynq, it's 32 or 64 bits. On a custom DDR, it's given by its parameters. A 32-bit-wide 400 MHz DDR memory accessed with a 1:4 clock will be 256 bits wide at 100 MHz. If it's 1:2, it would be 128 bits at 200 MHz; it all depends on the parameters you choose when you generate the memory interface. On the Zynq, that's fixed and you don't have much choice over it. – Jonathan Drolet Jul 02 '15 at 12:42
  • But I didn't choose any memory interface, I just interact with the standard Zynq DDR. As for `32 bits wide 400MHz DDR memory accessed with a 1:4 clock will be 256 bits wide 100MHz`, I have no idea where I can see mine. Help me with this please and I'll be on my way :) – João Pereira Jul 02 '15 at 13:30
  • On the Zynq you don't have control over that, it's fixed hardware. It's either 32 or 64 bits, depending on the port you're using. – Jonathan Drolet Jul 02 '15 at 13:49
  • Ah ok. So since I'm using the ACP port connected to the `M_AXI_MM2S` and `M_AXI_S2MM` it's 64 bit. Ty for the help ;) – João Pereira Jul 02 '15 at 14:19