
My journey started from the question: "If I have two cores that want to write values to one memory address at the same (literally) time, how does the computer manage such a situation?" After quite a lot of research I thought I had found the answer: a multicore processor has a clock shared between its cores (to synchronize them somehow), although there is also information that each core usually has its own. The only thing I'm confident about is that the ordering of such signals happens in a memory controller, or at the moment of accessing a system bus.

I'm interested in how this works physically, i.e. how this kind of arbiter actually works (some implementations), because I've come up with a solution that should work but is (in my opinion) extremely inefficient, and probably not implementable at all:

Suppose we can read and write information in some cells. Then we can create a unit with an independent input path from each core, and we give the unit its own clock. Now, whenever a core wants to acquire the resource, it must put some charge into its cell, which will be read by the unit on one of its ticks (if the two events overlap in time and the unit hasn't read enough charge, it must return the charge to the cell). Once the unit finds enough charge in a cell, it must send a notification to that core and stop reading the cells until the core puts charge in again.
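A rough software model of what I mean (purely illustrative: the threshold, the scan order, and all names are made up):

```c
#include <stdio.h>

#define NUM_CORES        2
#define CHARGE_THRESHOLD 3   /* "enough charge" -- an arbitrary made-up value */

static int cell[NUM_CORES];  /* one cell per core; the value stands in for charge */
static int granted = -1;     /* -1 means no core currently holds the resource */

/* What a core does when it wants the resource: deposit charge into its cell. */
static void core_request(int core) {
    cell[core] += 1;
}

/* One tick of the unit's own clock: scan the cells unless a grant is pending. */
static void unit_tick(void) {
    if (granted >= 0)
        return;                       /* stop reading until the grant is released */
    for (int i = 0; i < NUM_CORES; i++) {
        if (cell[i] >= CHARGE_THRESHOLD) {
            cell[i] = 0;              /* consume the charge */
            granted = i;              /* "send a notification" to core i */
            printf("unit: granted core %d\n", i);
            return;
        }
        /* not enough charge yet: it stays in the cell (the "return it" case) */
    }
}

static void core_release(int core) {
    if (granted == core)
        granted = -1;                 /* unit resumes scanning on its next tick */
}

int main(void) {
    for (int t = 0; t < 4; t++) {     /* both cores keep depositing charge */
        core_request(0);
        core_request(1);
        unit_tick();
    }
    core_release(0);                  /* core 0 wins first (scan order), then... */
    unit_tick();                      /* ...core 1's accumulated charge is seen */
    return 0;
}
```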

M.Daniil

1 Answer


It just picks one. Each processor connects to the arbiter (via "independent input ways", as you put it); the arbiter connects to the memory chips. Each clock cycle, the arbiter somehow chooses which processor gets a turn to access memory, then performs that memory access by copying the address onto the address bus and so on; if more than one processor wants to access memory, it tells the others to wait.
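Stripped of everything else, that selection step could be sketched in C roughly like this (a fixed-priority policy with hypothetical names; other policies are discussed further down):

```c
#include <stdint.h>

/* Hypothetical single-cycle arbiter: 'requests' has one bit per processor.
 * Returns the index of the processor that gets this cycle, or -1 if idle.
 * Fixed priority: the lowest-numbered requester always wins. */
int arbiter_pick(uint32_t requests) {
    for (int i = 0; i < 32; i++)
        if (requests & (1u << i))
            return i;        /* everyone else is told to wait this cycle */
    return -1;               /* nobody wants the bus */
}
```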


Actually, a more modern memory system has a lot of latency. The processor sends out a memory request, but it doesn't expect to get the answer straight away; it could take 20 cycles or more, even in a single-core system with no arbiter. So what really happens is that the arbiter has a FIFO buffer (a.k.a. a queue) for each processor, holding that processor's requests. The processor doesn't need to wait unless its queue is full.
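Such a per-processor request queue might look roughly like this (the field names and depth are made up for illustration):

```c
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_DEPTH 8                /* must be a power of two for the math below */

typedef struct {
    uint64_t addr;                   /* which memory address */
    bool     is_write;
    uint64_t data;                   /* payload, used for writes */
} mem_request_t;

typedef struct {
    mem_request_t slots[QUEUE_DEPTH];
    unsigned head, tail;             /* free-running counters; difference = fill level */
} request_fifo_t;

bool fifo_full(const request_fifo_t *q)  { return q->tail - q->head == QUEUE_DEPTH; }
bool fifo_empty(const request_fifo_t *q) { return q->tail == q->head; }

/* Called on the processor side; the processor only stalls when this fails. */
bool fifo_push(request_fifo_t *q, mem_request_t r) {
    if (fifo_full(q))
        return false;                /* queue full: the processor must wait */
    q->slots[q->tail++ % QUEUE_DEPTH] = r;
    return true;
}

/* Called on the arbiter side to take the oldest pending request. */
bool fifo_pop(request_fifo_t *q, mem_request_t *out) {
    if (fifo_empty(q))
        return false;
    *out = q->slots[q->head++ % QUEUE_DEPTH];
    return true;
}
```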

FIFO buffers can also be used to pass data between different clock domains, so the arbiter has everything synchronized to its own clock, even if the processors run on different clocks. This does require each request to spend a minimum of a few cycles in the buffer, because lining data up with a different clock always costs a few clock cycles.
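In real dual-clock FIFOs, the standard trick is to keep the head and tail counters in Gray code, so only one bit changes per increment and a pointer sampled from the other clock domain can never be garbled into a wildly wrong value. The conversion itself is simple:

```c
#include <stdint.h>

/* Binary -> Gray: adjacent values differ in exactly one bit, which makes
 * the counter safe to sample from the other clock domain. */
uint32_t bin_to_gray(uint32_t b) { return b ^ (b >> 1); }

/* Gray -> binary, by folding the XOR back down. */
uint32_t gray_to_bin(uint32_t g) {
    for (uint32_t shift = 1; shift < 32; shift <<= 1)
        g ^= g >> shift;
    return g;
}
```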

Each clock tick, the arbiter chooses a FIFO buffer that has a request in it, removes the request from that buffer, and sends it to the memory chips over the memory bus. If there are no requests, it does nothing. It also has to collect the responses coming back from the memory chips and route each one to the right processor.

How does it choose which queue? Some obvious ways would be a fixed-priority system (e.g. CPU 0 goes first; CPU 1 only goes if CPU 0 doesn't want to go) or a round-robin system (CPU 0, CPU 1, CPU 0, CPU 1, and so on). More complicated schemes might use a token bucket, or priorities that depend on which thread is running on each processor.
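Here is a sketch of the arbiter's per-tick loop with round-robin selection (it reuses request_fifo_t and fifo_pop from the earlier sketch; issue_to_memory_bus is a hypothetical stand-in for driving the address and data lines):

```c
#define NUM_CPUS 2

/* Hypothetical: puts the address (and data, for writes) onto the memory bus. */
void issue_to_memory_bus(unsigned cpu, mem_request_t r);

static request_fifo_t queue[NUM_CPUS];   /* one request queue per processor */
static unsigned last_served;             /* round-robin state */

/* One arbiter clock tick: scan the queues starting just after the one that
 * was served last, dequeue the first pending request found, and issue it. */
void arbiter_tick(void) {
    for (unsigned i = 1; i <= NUM_CPUS; i++) {
        unsigned cpu = (last_served + i) % NUM_CPUS;
        mem_request_t r;
        if (fifo_pop(&queue[cpu], &r)) {
            last_served = cpu;
            issue_to_memory_bus(cpu, r);
            return;                      /* one request per tick */
        }
    }
    /* no queue had a request: do nothing this tick */
}
```

Swapping in fixed priority would just mean always starting the scan from queue 0 instead of rotating.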


I think part of your confusion comes from asking how data produced by a circuit with one clock can be read by a circuit with a different clock, reliably and without causing problems. A very common method is the double flip-flop synchronizer: you simply delay the signal by 2 (or even 3) clock cycles on the receiving side, and that delay is long enough for any metastability to work itself out. We don't have any more "beautiful" way.
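Modeled in software, that synchronizer is just two registers clocked in the receiving domain (a sketch, not real HDL):

```c
#include <stdbool.h>

static bool ff1, ff2;        /* the two flip-flops in the receiving clock domain */

/* Called once per receiving-domain clock edge. 'async_in' is the signal
 * coming from the other clock domain; the return value is the safe copy.
 * If ff1 catches a transition mid-flight and goes metastable, it has a
 * full clock cycle to settle before ff2 samples it. */
bool sync_tick(bool async_in) {
    ff2 = ff1;               /* second stage: now safe to use */
    ff1 = async_in;          /* first stage: may sample at a bad moment */
    return ff2;
}
```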

user253751