
For a project (an FPGA image processing accelerator), I need to create a high-bandwidth read-only memory. I settled on using Quad SPI NOR flash modules (I will use them in XIP mode), but I have some concerns. First, I'm not sure if I need to use buffer ICs for the clock and CS lines (for prototyping I used 4 and it was fine, but the end goal is something like 64 daisy-chained). If I do, can you recommend some high-speed chips? The modules can run at 133 MHz, which is much faster than anything I've worked with in the past.

Another question: do I need to impedance-match the data lines? If yes, does the propagation delay in the clock buffers cause problems?

How about termination resistors?

It'd be great to have some tips on high-speed routing considerations before designing a PCB. If you can point me to any sources that I can learn from, that'd be great!

Also, before you suggest "the right tool for the job" (for example, an FPGA with built-in HBM2 or similar), this is a student project and those solutions are out of the question. These SPI modules are fairly cheap, and the I/O pins on the FPGA are free. A 256-bit bus at 133 MHz gives around 4.26 GB/s, which should be fast enough for real-time image processing. If not, FPGA boards with around 500 I/O pins are still affordable (around $100).
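As a quick check of the bandwidth figure, here is a minimal sketch of the arithmetic. It assumes single data rate (one bit per data line per clock) and ignores the command, address, and dummy cycles that each read needs, so it is a peak number rather than a sustained one. The function name and parameters are made up for illustration.

```python
def peak_bandwidth_bytes_per_s(num_chips, lanes_per_chip, clock_hz):
    """Peak read bandwidth of parallel QSPI chips sharing one clock.

    Assumption: single data rate, i.e. one bit per lane per clock cycle,
    with no command/address/dummy-cycle overhead counted.
    """
    bus_width_bits = num_chips * lanes_per_chip
    return bus_width_bits * clock_hz / 8  # bits/s -> bytes/s


# 64 quad-SPI chips (4 data lines each) form a 256-bit bus.
bw = peak_bandwidth_bytes_per_s(num_chips=64, lanes_per_chip=4, clock_hz=133_000_000)
print(f"{bw / 1e9:.3f} GB/s")
```

The real sustained rate depends on how long each continuous read burst is compared to the per-read overhead, which the flash datasheet specifies.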

OM222O
  • So it looks like the plan is to implement a read-only controller on the FPGA for each QSPI flash? Just making sure... If you are sharing one clock with 64 ICs, you will probably want to use clock fanout ICs. If you have a 1-to-N fanout IC, then you will need log_N(64) levels of fanout ICs, rounded up (1:8 fanout buffers will require 2 levels, for example). I don't think this is as big of a problem on the chip select line, but don't take my word for it. You will probably want series termination resistors and controlled impedance on the data lines. Keeping lengths as short as possible will help. – mooshoomatt Apr 13 '22 at 01:26
  • There will only be one read-only controller (a basic state machine to output the SPI signals and read the result), so on each clock cycle it will read 256 bits and split them into several FIFOs to do the processing in parallel, but that's not really the most important thing. Connecting the clock and CS lines together effectively gives you a single module with more data lines, which is exactly what I need. I'm just not sure about high-speed designs, because I only validated the code on a breadboard at 1 MHz, which is way too slow. – OM222O Apr 13 '22 at 01:38
  • Ok, so you have a way to send read commands to all 64 flash ICs then. My comment regarding the fanout buffers still stands. You will want to make sure that the overall routed length for each clock path is relatively closely matched as well to avoid skew issues, in addition to controlled single-ended impedance and series termination resistors for the data lines and the clock lines, depending on the specs of the clock buffers. If you want to be safe, you could use the same fanout approach for the chip select signals. I don't see any reason why it would not work, assuming your controller works. – mooshoomatt Apr 13 '22 at 17:09
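The fanout estimate from the comments can be sketched as a small calculation. This assumes an idealized tree in which every buffer drives up to N loads and each flash chip counts as one load. The function names are hypothetical and no specific buffer part is implied.

```python
def fanout_levels(loads, outputs_per_buffer):
    """Levels of 1:N buffers needed to reach `loads` endpoints (ceil of log_N)."""
    levels = 0
    reach = 1  # endpoints reachable from a single source
    while reach < loads:
        reach *= outputs_per_buffer
        levels += 1
    return levels


def buffer_count(loads, outputs_per_buffer):
    """Total buffers in the tree, counted from the leaf level up to the root."""
    total = 0
    n = loads
    while n > 1:
        n = -(-n // outputs_per_buffer)  # ceiling division: buffers at this level
        total += n
    return total


for n_out in (2, 4, 8):
    print(n_out, fanout_levels(64, n_out), buffer_count(64, n_out))
```

For 64 flash chips and 1:8 buffers this gives 2 levels and 9 buffers (8 leaf buffers plus 1 root). Each extra level adds buffer propagation delay and part-to-part skew to the clock path, so the buffer datasheet figures matter as much as the count.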

0 Answers