Most answers approach this from the side of the software and/or hardware model, but the cleanest way is to consider how the physical RAM chips work. (The cache sits between the processor and the memory, uses the same address bus, and operates completely transparently to the processor.)
A RAM chip has a single address decoder, which receives the address of the memory cell over the address bus (similarly, a data bus carries the data in or out). Present-day memories are built for the "single processor approach", i.e. one processor is connected through one bus to one memory chip. This is the root of the "von Neumann bottleneck": every single instruction must reference memory at least once, if only to be fetched.
Because of this, one wire (or a set of wires, aka a bus) can carry only one signal at a time, so the RAM chip can receive only one cell address at a time. Unless you can guarantee that the two cores put the same address on the address bus, simultaneous bus access by two different bus drivers (such as cores) is physically impossible. (And if the address is the same, the second access is redundant.)
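To make the serialization concrete, here is a minimal Python sketch of a single-ported RAM behind a shared bus. Everything in it is an illustrative assumption rather than a model of a real memory controller: the names `SinglePortRAM` and `run_bus` are made up, and the round-robin grant policy is just one possible arbitration scheme.

```python
from collections import deque


class SinglePortRAM:
    """Toy RAM with one address decoder: one access per bus cycle."""

    def __init__(self, size):
        self.cells = [0] * size
        self.cycles = 0  # number of bus cycles consumed so far

    def access(self, address):
        # Only one address can be on the bus, so each access costs a cycle.
        self.cycles += 1
        return self.cells[address]


def run_bus(ram, requests_per_core):
    """Hypothetical round-robin arbiter: grants the bus to one core per cycle."""
    queues = [deque(requests) for requests in requests_per_core]
    log = []
    while any(queues):
        for core_id, queue in enumerate(queues):
            if queue:
                address = queue.popleft()
                ram.access(address)
                log.append((ram.cycles, core_id, address))
    return log


ram = SinglePortRAM(16)
log = run_bus(ram, [[0, 1, 2], [8, 9, 10]])
for cycle, core_id, address in log:
    print(f"cycle {cycle}: core {core_id} reads address {address}")
```

Two cores with three reads each need six bus cycles in this model, not three: the requests interleave, but no two of them ever share a cycle.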
The rest is so-called hardware acceleration. The coherence bus, the cache, SIMD access, etc. are just nice facades in front of the physical RAM that your question was about. These accelerators may hide the fight for exclusive use of the address bus, and the programming models have little to do with your question. Also note that simultaneous access would go against the "private address space" abstraction.
So, to your questions: simultaneous direct RAM access is not possible, neither with the same address nor with different addresses. A cache may hide this fact and may allow apparently simultaneous access in some cases. That depends on the cache level and construction, as well as on the spatial and temporal locality of your data.
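The effect of locality can be sketched with a toy private cache. This is an assumption-laden illustration only: `CachedCore`, `bus_traffic`, and the line size of 4 words are invented for the example, and the model has no eviction and no coherence protocol.

```python
class CachedCore:
    """Toy core with a private cache of whole lines (no eviction, no coherence)."""

    LINE_SIZE = 4  # assumed number of words per cache line

    def __init__(self):
        self.lines = set()     # cache lines currently held
        self.hits = 0          # reads served without touching the bus
        self.bus_accesses = 0  # reads that had to go out to RAM

    def read(self, address):
        line = address // self.LINE_SIZE
        if line in self.lines:
            self.hits += 1
        else:
            self.bus_accesses += 1
            self.lines.add(line)


def bus_traffic(addresses):
    """Return (hits, bus_accesses) for one core reading the given addresses."""
    core = CachedCore()
    for address in addresses:
        core.read(address)
    return core.hits, core.bus_accesses


sequential = bus_traffic(range(16))      # good spatial locality
strided = bus_traffic(range(0, 64, 4))   # one word per line, no reuse
print(sequential)  # (12, 4)
print(strided)     # (0, 16)
```

Both runs issue 16 reads, but the sequential one needs the bus only 4 times, while the strided one needs it for every read. Only accesses that miss the cache compete for the bus, which is why cores working out of their own caches can appear to access memory simultaneously.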
And yes, you are right: multiprocessing (or multi-core processing) without enhanced RAM access will not help much for RAM-intensive applications.
For better understanding, just recall how Direct Memory Access works. Both the CPU and the DMA device can put an address on the bus, so they have to exclude each other from using the bus at the same time.