
I have an issue with Altera FIFOs. It seems that the full signal wrfull gets asserted even when the FIFO is not full. My FIFOs are of size 8. The SignalTap traces below show the read levels of my FIFOs (rdlevel) as well as the wrfull signals:

[SignalTap trace image: rdlevel and wrfull signals]

What could explain that the wrfull signals are not behaving properly?

Randomblue

2 Answers


I haven't used Altera, or the FIFO you are using, and the signals in your picture make no sense to me. But I have designed many FIFOs, so maybe I can still give some insight into what you are seeing...

Many FIFOs have a strange notion of "full". Specifically, "full" is asserted when there is one fewer word than you would expect. A 16-word-deep FIFO can only hold 15 words, and a 64-word FIFO can really only hold 63. In these FIFOs the RAM is 16 or 64 words, but the logic that calculates fullness will signal FULL at one less. Attempting to put 16 or 64 words into such a FIFO will result in bad stuff. Not every FIFO is like this, but most of them are.
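To see why the "one fewer word" behavior falls out naturally, here is a minimal Python model (not Altera's implementation, just the common circular-buffer scheme) where "full" is detected by the write pointer reaching one slot behind the read pointer. With that comparison, a depth-8 FIFO can only ever hold 7 words:

```python
# Sketch of a circular-buffer FIFO where "full" means the write
# pointer is one slot behind the read pointer.  Without an extra
# state bit, wr == rd is ambiguous (empty or full?), so one RAM
# slot is sacrificed to keep the two states distinguishable.

class CircularFifo:
    def __init__(self, depth=8):
        self.depth = depth
        self.ram = [None] * depth
        self.rd = 0   # read pointer
        self.wr = 0   # write pointer

    def empty(self):
        return self.wr == self.rd

    def full(self):
        # Full when one more write would make wr == rd, which
        # would look identical to "empty".
        return (self.wr + 1) % self.depth == self.rd

    def write(self, word):
        assert not self.full(), "writing to a full FIFO corrupts data"
        self.ram[self.wr] = word
        self.wr = (self.wr + 1) % self.depth

    def read(self):
        assert not self.empty()
        word = self.ram[self.rd]
        self.rd = (self.rd + 1) % self.depth
        return word

fifo = CircularFifo(depth=8)
accepted = 0
while not fifo.full():
    fifo.write(accepted)
    accepted += 1
print(accepted)   # prints 7, not 8
```

So if your size-8 FIFO asserts wrfull at a level of 7, that is this scheme working as designed.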

Some FIFOs, particularly ones with separate read and write clocks, take time for written data to appear on the read port. Sometimes a surprising amount of time. To put it a different way: say the FIFO is completely empty, and you start writing to it as fast as possible. You might be able to write several words before the "FIFO Not Empty" signal goes active. That means you can't start emptying the FIFO until later than you might expect, which means you might have more words in the FIFO than you anticipated, which could cause it to fill up.

The length of time for a written word to appear on the read port varies with the internal architecture of the FIFO. Sometimes it is within a clock cycle; other FIFOs can require several write-clocks plus several read-clocks.

Requiring several write-clocks and/or read-clocks brings up an interesting issue. Many FIFOs don't behave well with discontinuous clocks, that is, a clock that isn't continuously running. If it takes 3 read-clocks to bring the write data to the read port and you only give it 2 clocks, the data is not going to get there. The data will accumulate in the FIFO and never get read. You can actually get into a state where the FIFO claims to be both empty and full at the same time! Empty, because the data has not reached the read port yet; and full, because there is no more room in the RAM to write new data.

So, those are some reasons why the full flag might go active when you don't expect it to. But you must also consider that you may simply be doing something else wrong: a bug in your code, a noisy clock, or incorrect timing constraints. Those are real possibilities that I can't diagnose remotely.

  • One thing I think would be a helpful feature in a FIFO would be an asynchronous "maybe not empty" flag, with guarantees that: (1) if it's not set, there's definitely no data in the device; (2) if it is set, feeding some number of dequeueing-side clocks will definitely either make data appear or cause "maybe not empty" to become clear. Such a signal (and an asynchronous "maybe not full" on the enqueueing side) would be useful in systems that have sleep modes. – supercat Jun 07 '13 at 19:21

Unless queueing and dequeueing are both controlled by a single clock (perhaps with separate enable signals, so that not every clock cycle will enqueue and dequeue data), it will be necessary for the FIFO to carry information across clock domains. This enables tradeoffs between queue latency and resistance to metastability. In many ways, the cleanest way to have a queue cross clock domains is to have the queue accept clocks from both domains, along with enable signals. A "bytes enqueued" Gray-code counter is kept in the input domain, and a "bytes dequeued" counter is kept in the output domain; both counters should have an "extra" bit [so for a 16-level queue, the counters should be 5 bits]. Each domain keeps a double-synchronized copy of the other domain's counter.
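The reason the cross-domain copies are Gray-coded can be sketched in a few lines of Python (a model, not HDL): successive Gray codes differ in exactly one bit, so a double-synchronizer that samples the counter mid-transition can only ever capture the old value or the new value, never a garbled intermediate count.

```python
# Reflected-binary (Gray) code conversion, as used for the
# cross-domain counter copies described above.

def bin_to_gray(b: int) -> int:
    return b ^ (b >> 1)

def gray_to_bin(g: int) -> int:
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# 5-bit counters for a 16-level queue (one "extra" bit, as above).
BITS = 5
for i in range(2 ** BITS):
    g_now = bin_to_gray(i)
    g_next = bin_to_gray((i + 1) % 2 ** BITS)
    # Exactly one bit changes per increment, including at wraparound,
    # so a synchronizer can never see a half-updated count.
    assert bin(g_now ^ g_next).count("1") == 1
    assert gray_to_bin(g_now) == i   # conversion round-trips
```

A plain binary counter would not have this property: incrementing 7 (0111) to 8 (1000) flips four bits, and a synchronizer could momentarily capture any mixture of them.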

Bytes may be enqueued except when the "bytes enqueued" counter is exactly one queue depth ahead of the synchronized copy of the "bytes dequeued" counter [meaning that the queue is exactly full]. Bytes may be dequeued when the "bytes dequeued" counter does not equal the synchronized copy of the "bytes enqueued" counter. It may be helpful in some cases for the queue to also provide asynchronous not-empty or not-full indicators; though they should not be considered "reliable", they may be useful as "wake-up" signals.
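These full/empty tests can be sketched with the counters viewed in binary (in hardware, the synchronized copies would be Gray-coded as described above; the helper names here are hypothetical). With the "extra" bit, "full" is simply the pointers being one queue depth apart, which for a power-of-two depth means identical low bits and a differing extra bit, so all 16 slots are usable, unlike the depth-minus-one scheme from the other answer:

```python
# Extra-bit pointer comparison for a 16-level queue.
# Binary view of the counters; assumed/hypothetical helper names.

DEPTH = 16
PTR_BITS = 5               # log2(16) + 1 "extra" bit
MASK = (1 << PTR_BITS) - 1

def is_empty(wr: int, rd: int) -> bool:
    # Counters equal: everything enqueued has been dequeued.
    return wr == rd

def is_full(wr: int, rd: int) -> bool:
    # Counters exactly DEPTH apart: same low bits, extra (MSB)
    # bit differs.
    return (wr ^ rd) == (1 << (PTR_BITS - 1))

wr = rd = 0
assert is_empty(wr, rd) and not is_full(wr, rd)

for _ in range(DEPTH):     # enqueue 16 words
    wr = (wr + 1) & MASK
assert is_full(wr, rd)     # all 16 RAM slots in use, none wasted

rd = (rd + 1) & MASK       # dequeue one word
assert not is_full(wr, rd)
```

In a real design each side compares its own counter against a double-synchronized copy of the other side's, so "full" may linger a couple of cycles after a dequeue, which is safe: the flags err only in the conservative direction.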

If a queue uses this approach, data may be enqueued or dequeued at any speed, regardless of the relative speeds of the clocks, subject only to the constraints that data will not become available until two receive-clocks after it is added, and a queue slot will not become available for recycling until two transmit-clocks after it is read. Unlike some other approaches, however, this approach will allow data to be input or output on every clock provided the queue isn't too full or too empty.

Note that while it's possible to have a FIFO chip which operates fully asynchronously, and it's possible for such chips to be used in a way which guarantees there won't be a risk of metastability, synchronous designs as described above allow for higher throughput. The problem with async chips is that a device must try to submit data only when the chip is known not to be full, and retrieve data only when it's known not to be empty. This means that whatever device is going to submit data must, before each byte, determine whether there's space; thus, the submission of each byte becomes a two-step process. In a synchronous system, the sending device may attempt to submit a byte on each cycle without regard for whether the queue is full, and then check whether the request succeeded or failed.

supercat
  • What you describe is patented by Xilinx (who has several FIFO patents involving Gray codes). – Jun 07 '13 at 16:03
  • @DavidKessner: All right then: synchronize using some alternative representation (e.g. use a synchronous counter, but have the count signal latch a copy of the upper bits; put both the latched and unlatched counter values through a double synchronizer, and have the downstream side use the LSB to select which copy is valid). – supercat Jun 07 '13 at 17:15
  • Yeah, Xilinx has done the world a disservice with this. There are lots of ways around this, but it is too complex to cover them in a comment. – Jun 07 '13 at 17:40
  • @DavidKessner: I'm curious what the scope and age of such patents would be. Gray-code counters have been around for many decades, as has the idea of a circular queue in which the "stuff" pointer is owned by the enqueueing process and the "fetch" pointer owned by the dequeueing process. – supercat Jun 07 '13 at 17:48
  • The patents are from around 1999, and involve using Gray-code counters to avoid glitching when calculating the full/empty status while also crossing clock domains. At the time I was running a website similar to OpenCores and ended up having a couple of nasty email exchanges with their lawyers. While they never outright threatened me with a lawsuit, they made it abundantly clear that they did not like what I was doing. – Jun 07 '13 at 17:58