Let's take a single 8-bit 200MHz DDR chip. All datasheets state that bandwidth is "up to" 400MB/s.
Because it's double data rate, data is read/written on both clock edges, so yes, theoretically it's 400MB/s.
But you need 4 or 5 cycles to do a 4-byte burst read operation, of which only 2 cycles are used to "really" read data (4 edges).
400MB/s would imply only 2 cycles for each read.
Even if you optimize like crazy with CAS latency = 2, I don't see how a burst read can take fewer than 4 cycles.
So the real sustained read speed of a single DDR chip would be 400MB/s * 2/4 = 200MB/s. And I'm not even counting the refresh operations...
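To spell out that arithmetic, here is a minimal Python sketch. The function names are made up for illustration, and it assumes every burst occupies the chip for its full command-to-data duration, with no overlap between consecutive bursts. That assumption is exactly the part I'm unsure about.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_width_bits):
    """Datasheet-style peak: two transfers per clock cycle (both edges)."""
    return clock_mhz * 2 * bus_width_bits / 8


def sustained_bandwidth_mb_s(clock_mhz, bus_width_bits, burst_length, cycles_per_burst):
    """Throughput if each burst costs `cycles_per_burst` clock cycles in total.

    Assumption: bursts are issued strictly one after another, so the
    latency cycles of one burst are never hidden behind another burst.
    """
    bytes_per_burst = burst_length * bus_width_bits // 8
    return bytes_per_burst * clock_mhz / cycles_per_burst


# 8-bit chip at 200MHz, 4-beat burst (4 bytes), 4 cycles per burst
peak = peak_bandwidth_mb_s(200, 8)
sustained = sustained_bandwidth_mb_s(200, 8, burst_length=4, cycles_per_burst=4)
print(peak, sustained)  # 400.0 200.0
```

With cycles_per_burst=5 the same function gives 160MB/s, and only cycles_per_burst=2 gets back to the 400MB/s figure from the datasheets.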
Am I misunderstanding the specifications?