I am trying to understand the correct way to calculate throughput of a digital hardware design block that forms part of a bigger system.
Here are a few scenarios:
DUT takes 10 clock cycles to generate a 20-bit output, then another 10 clock cycles to generate the next 20-bit output. -> The maximum throughput is 20 bits per 10 clock cycles = 2 bits/cycle
DUT takes 10 clock cycles to generate the first 20-bit output, then (being pipelined) it generates a new 20-bit output every cycle -> The maximum throughput is 20 bits per 1 clock cycle = 20 bits/cycle (a quick sketch of this arithmetic is shown below)
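To make the arithmetic concrete, here is a minimal sketch of the calculation I have in mind; the function name and the 100 MHz clock frequency are only illustrative, not part of the actual design:

```python
def throughput_bits_per_cycle(bits_per_output: int, cycles_per_output: int) -> float:
    """Throughput expressed in bits per clock cycle."""
    return bits_per_output / cycles_per_output

# Scenario 1: a new 20-bit output every 10 cycles (no pipelining)
print(throughput_bits_per_cycle(20, 10))  # 2.0 bits/cycle

# Scenario 2: pipelined, a new 20-bit output every cycle after the first
print(throughput_bits_per_cycle(20, 1))   # 20.0 bits/cycle

# If I wanted absolute throughput, I assume I would multiply by the clock
# frequency, e.g. at a hypothetical 100 MHz clock:
clock_hz = 100e6
print(throughput_bits_per_cycle(20, 10) * clock_hz)  # 200e6 bits/s = 200 Mbit/s
```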
Is this correct, or do I also need to involve the clock frequency to calculate the throughput?
EDIT:
DUT = Device-Under-Test is the sub-block for which I am trying to calculate the throughput. This can be the design as a whole, i.e. at the system level, or a single block inside it that generates data.