How do these purely theoretical models of computation relate to real-world computer architecture?
They don't.
Both λ-calculus and Turing Machines were designed to model the way a human computes. They weren't designed to model computing machines.
This is most obvious with the Turing Machine, which was heavily modeled on the way a "computer" (which in Alan Turing's time was a job description for a human!) worked: by keeping a limited amount of information in his head and writing down the rest on sheets of paper. Turing made the amount of information arbitrarily large (but finite) and allowed for an infinite supply of paper (and he cut the sheets into strips and glued them together to get a simplified one-dimensional tape). Yet even allowing for inhuman amounts of information and a physically impossible infinite tape, he could prove that there are certain things that cannot be computed.
λ-calculus similarly grew out of a desire to formalize what computation means. Instead of starting from the very physical act of a human writing down computations on paper, however, it is based more on an intuition of how one would evaluate a function in one's head: you substitute the arguments into the function's body and keep simplifying.
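To make that intuition a bit more concrete, here is a rough parallel using Python's `lambda`; the names `successor` and `twice` are just illustrative, not part of the formalism, and Python is of course only an approximation of the pure calculus:

```python
# λ-calculus evaluation is essentially repeated substitution of arguments
# into function bodies ("beta reduction").
successor = lambda n: n + 1
twice = lambda f: lambda x: f(f(x))     # λf. λx. f (f x)

# (λf. λx. f (f x)) successor 0  →  successor (successor 0)  →  2
print(twice(successor)(0))              # prints 2
```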
Turing proved that a Universal Turing Machine can exist, i.e. a Turing Machine which can read the description of an arbitrary Turing Machine from its tape and then carry out that machine's operations. It can be considered the very first interpreter, and the very first stored-program computer.
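Here is a minimal sketch of that "machine description as data" idea in Python. The encoding, the `run` helper, and the toy unary-increment machine are all illustrative assumptions of mine, not Turing's actual construction:

```python
from collections import defaultdict

def run(rules, tape, state="start", accept="halt", blank="_", max_steps=10_000):
    """Simulate a Turing Machine given as data.

    rules maps (state, symbol) -> (new_state, symbol_to_write, move).
    A missing (state, symbol) pair simply raises KeyError in this sketch.
    """
    cells = defaultdict(lambda: blank, enumerate(tape))
    head = 0
    for _ in range(max_steps):
        if state == accept:
            break
        state, cells[head], move = rules[(state, cells[head])]
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells)).strip(blank)

# Example machine description: append one '1' to a unary number.
increment = {
    ("start", "1"): ("start", "1", "R"),  # skip over existing 1s
    ("start", "_"): ("halt",  "1", "R"),  # write a 1 at the end and stop
}
print(run(increment, "111"))  # -> "1111"
```

The point is only that the simulating program never changes; swapping in a different rule table makes it run a different machine, which is exactly what an interpreter, or a stored-program computer, does.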
However, people who were actually involved in building the first digital computers claim that the logicians' work on models of computation had little influence on the electrical engineers and applied mathematicians who designed those machines, despite the striking resemblance.
There are models of computation that are much closer to how current mainstream computers work. The Random Access Machine, for example, is a cost model for algorithm analysis that more closely reflects how memory is accessed in a typical computer: any cell can be read or written in constant time. On a Turing Machine, by contrast, access time is linear in the distance from the current head position to the cell you want, because the head has to move across the tape one cell at a time.
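A back-of-the-envelope illustration of that cost difference, using a deliberately simplified model of both machines (my own toy functions, not formal definitions):

```python
# RAM model: reading cell i is a single step, regardless of i.
def ram_read(memory, i):
    return memory[i]            # O(1) random access

# Turing Machine: the head must walk cell by cell from its current position.
def tape_read(tape, head, target):
    steps = abs(target - head)  # O(|target - head|) head movements
    return tape[target], steps

memory = list(range(1000))
print(ram_read(memory, 999))        # one step, value 999
print(tape_read(memory, 0, 999))    # (999, 999): the value, plus 999 head moves
```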
Note that you use the term "real-world computer architecture" as if there were only one. There are in fact many different architectures. The Reduceron, for example, is a graph-reduction CPU specifically designed for Haskell-like languages; it works very differently from, say, a typical x86 CPU. There are architectures that explicitly model the communication costs on the chip, such as the FLEET architecture. The Harvard Architecture strictly separates code and data, as opposed to the von Neumann Architecture, which doesn't distinguish between them. And so on.