64

Can a CPU (such as the Intel i3/i5/i7/Xeon) with on-chip cache RAM use that as its only functional RAM, without any external memory banks attached?

Or must there be external RAM, and the cache cannot be accessed or used alone?

Modern desktop/server CPUs often have more internal cache RAM than many 1990s computers had in total system memory, so there should be plenty of room to run simple code.

CPUs from before cache existed, such as the 6502, would be unable to do anything this way, as their internal storage amounted to only a few bytes for the program counter and registers.

This is not a question of running any sort of modern operating systems, but running simple code programmed into a custom ROM, or hand-entered with a hex input keypad.

JDługosz
Dale Mahalko
  • 1
    Entirely depends on the CPU and what exactly you mean by "cache" as some CPUs have their ram built in and need no external chips. – PlasmaHH Jun 19 '17 at 14:28
  • Well, if you can repurpose cache as RAM, then yes. And there are CPUs that can do that. Or the CPU has RAM built in, and a lot of them do, e.g. the ARM family uses it to run the first few stages of boot loaders. – user3528438 Jun 19 '17 at 14:28
  • 42
    basically, if you include addressable memory in the processor itself, you build what we call a microcontroller. These exist. – Marcus Müller Jun 19 '17 at 14:41
  • @MarcusMüller: Your comment is definitely important enough for someone to write a more detailed answer. Maybe you would like to? – mathreadler Jun 19 '17 at 19:48
  • 1
    Who says you need any RAM at all? – Chris Stratton Jun 20 '17 at 06:18
  • 1
    Depends on how you define "function". I bet an i7 is perfectly capable of producing heat with only a battery connected to it. – Dmitry Grigoryev Jun 20 '17 at 15:26
  • 2
    There are a number of things you can do with just a few registers and no additional RAM. For example, a function generator. – David Schwartz Jun 21 '17 at 07:29
  • 1
    Related: [What use is the INVD instruction?](https://stackoverflow.com/questions/41775371/what-use-is-the-invd-instruction) – Iwillnotexist Idonotexist Jun 21 '17 at 19:20
  • 1
    :-) I'm reminded of *The Hunt for Red October.* "Can you launch an ICBM horizontally?" ... "Sure! Why would you want to?" Thumbs up to @MarcusMüller for pointing out that using a full uProcessor in this way is overkill. – JBH Jul 19 '17 at 08:40
  • It takes a fair amount of code to get DRAM up and running, and although it is possible to do that with only flash and the internal processor registers, ideally one would design a processor with a little RAM to get booted, which is a sane thing to do. But sure, you don't need RAM to run a processor, as mentioned above. – old_timer Jan 05 '18 at 20:21
  • You will find that these parts need more than a power supply. They sometimes need a sequence of things to happen: the different rails coming up at certain times, and power-good feedback before you can bring up another. Intel foundry stuff in particular likes lots of rails. They may also have variable-voltage rails that logic in the chip uses to dynamically tune a core supply. So your "power supply" handwaving is not just a couple of alligator clips on a bench supply, not to mention the socket needed, or getting the balls soldered to a board, to try this. – old_timer Jan 05 '18 at 20:23

4 Answers

68

See this extremely detailed account of the PC boot sequence: http://www.drdobbs.com/parallel/booting-an-intel-architecture-system-par/232300699?pgno=2

Since no DRAM is available at this point, code initially operates in a stackless environment. Most modern processors have an internal cache that can be configured as RAM to provide a software stack. Developers must write extremely tight code when using this cache-as-RAM feature because an eviction would be unacceptable to the system at this point in the boot sequence; there is no memory to maintain coherency. That's why processors operate in "No Evict Mode" (NEM) at this point in the boot process, when they are operating on a cache-as-RAM basis. In NEM, a cache-line miss in the processor will not cause an eviction. Developing code with an available software stack is much easier, and initialization code often performs the minimal setup to use a stack even prior to DRAM initialization.

You can observe this by running a PC without RAM: it will play a series of beeps. The program that plays those is run from the BIOS Flash ROM.

I've also seen this behaviour on some ARM processors. There will be configuration registers inside the SoC that allow you to use the cache as RAM early on in the boot sequence, in order to run a program that finds, enumerates and configures the DRAM.
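
As a rough illustration of why no-evict mode matters when nothing sits behind the cache, here is a toy Python model. It is only a sketch: the `ToyCache` class, its parameters, and the direct-mapped layout are assumptions made for this example, and real cache-as-RAM setup is done through processor-specific configuration registers and is far more involved.

```python
class ToyCache:
    """Toy direct-mapped cache with NO backing DRAM (illustrative only)."""

    def __init__(self, num_lines=4, line_size=16, no_evict=True):
        self.num_lines = num_lines
        self.line_size = line_size
        self.no_evict = no_evict
        # Each slot is None (invalid) or a (tag, bytearray) pair.
        self.lines = [None] * num_lines

    def _locate(self, addr):
        """Split an address into (line index, tag, byte offset)."""
        line_addr = addr // self.line_size
        return (line_addr % self.num_lines,
                line_addr // self.num_lines,
                addr % self.line_size)

    def write(self, addr, value):
        """Store one byte; return True if it ended up in the cache."""
        index, tag, offset = self._locate(addr)
        slot = self.lines[index]
        if slot is not None and slot[0] != tag:
            if self.no_evict:
                # No-evict mode: the resident line stays put and the
                # conflicting write goes nowhere (there is no DRAM).
                return False
            # Normal mode: the old line is evicted. With no DRAM to
            # write it back to, its contents are simply lost.
            slot = None
        if slot is None:
            slot = (tag, bytearray(self.line_size))
            self.lines[index] = slot
        slot[1][offset] = value & 0xFF
        return True

    def read(self, addr):
        """Return the cached byte, or None on a miss (no DRAM to ask)."""
        index, tag, offset = self._locate(addr)
        slot = self.lines[index]
        if slot is None or slot[0] != tag:
            return None
        return slot[1][offset]
```

In this model, addresses `0x00` and `0x40` map to the same line. With `no_evict=True`, a write to `0x40` after a write to `0x00` is dropped and the first value survives; with `no_evict=False`, the second write silently destroys the first. That is the sense in which early boot code has to be "extremely tight": its whole working set has to fit in the cache without conflicts.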

pjc50
  • 3
    [More details on how x86's "cache-as-ram" (CAR) mode works](https://stackoverflow.com/questions/41775371/what-use-is-the-invd-instruction), in an answer which brings it up as a use-case for `INVD` (when exiting CAR mode, invalidate cache instead of having useless data written to memory, potentially over something valuable). – Peter Cordes Jun 23 '17 at 06:15
  • What's wrong with evicting when DRAM isn't available? Isn't the problem *reading* from rather than *writing* to RAM? – user541686 Jul 27 '19 at 19:20
  • @user541686: The eviction itself isn't an immediate problem, but if that cache line wasn't just a tmp buffer then presumably you'll want to read it again later. If it was evicted, the data is gone. If the first access after eviction was a write (e.g. to use it as a tmp buffer again), that might be ok (in a hypothetical case where CAR *without* no-fill mode was possible, so allocations could happen.) But instead of hardware having per-cache-line controls on whether we can or can't evict (which would cost bits just to support this special case), there's one global control. – Peter Cordes May 22 '20 at 02:24
14

Generally, the cache memory is not addressable. A program cannot store or retrieve data intentionally from it.

Lior Bilia
  • 2
    +1 This is a very astute answer. Cachelines are loaded and retired at will from addressable accesses from main memory, in a nearly random way. – Ale..chenski Jun 19 '17 at 16:33
  • 18
    Yup, this is why it requires a special mode (as in pjc's answer), or some kind of special trickery. Another option (besides the accepted answer) might be to make the system *think* it has some DRAM, but actual memory writes just throw away the data, and reads produce all-zeros (or actual data from a ROM for some region of addresses). As long as all loads hit in cache, the system will behave correctly. IDK enough about how x86 boots to know if it's possible to get into 64-bit long mode without needing any real cache-flushes. – Peter Cordes Jun 19 '17 at 19:31
  • 4
    However, [Skylake CPUs with in-package eDRAM cache use it as a memory-side cache](http://www.anandtech.com/show/9582/intel-skylake-mobile-desktop-launch-architecture-analysis/5) (between the memory controller and *everything* else, unlike in Broadwell), so it can even cache DMA accesses from non-CPU system devices, or CPU loads/stores to "uncacheable" memory regions (which bypass L1/L2/L3). – Peter Cordes Jun 19 '17 at 19:38
  • 3
    @PeterCordes However the answer isn't "no" just because a special mode is needed - the question is then about whether the special mode exists. – user253751 Jun 20 '17 at 00:05
  • 1
    The question is - what happens if you use the cache nevertheless and make sure it never misses or gets flushed - it becomes addressable the moment something is written to an address in nonexistent RAM, no? The actual write is dropped on the floor, but the read is served from cache anyway? – rackandboneman Jun 20 '17 at 14:50
  • Oops, that is similar to what @Peter Cordes suggested. I'd suspect the read would give you static, or on a multiplexed bus, the capacitively "cached" state of the address lines (that is why missing RAM on old systems seemed to be full of ASCII tables :) ) – rackandboneman Jun 20 '17 at 14:52
  • @rackandboneman: I was thinking you might wire all the DRAM data lines to ground or something so they'd read as zero. But I suspect that it's not that simple, and some signal line might have to transition from low to high or vice versa before a DDR4 memory controller will think the DRAM has provided data. And does the bus encoding use a scrambler? Since you'd probably wire the ROM to the chipset and program a region of physical addresses to map there instead of DRAM, it shouldn't matter. Although maybe you could wire a ROM into the DDR4 bus, and have all addresses alias into the ROM... – Peter Cordes Jun 20 '17 at 18:07
  • You'd probably have to "ground" them via some kind of pullup, given you will still attempt to write data into a dead short (and probably create a lot of energy waste and thermal stress that way). – rackandboneman Jun 21 '17 at 09:06
  • @PeterCordes "IDK enough about how x86 boots to know if it's possible..." - well what hope do us mortals have then? – Martin Bonner supports Monica Jun 21 '17 at 09:23
  • @MartinBonner: Look at http://wiki.osdev.org/, and maybe the source code for an open-source BIOS like https://www.coreboot.org/ if you're curious :P I know a *lot* about how x86 works once it's up and running in long mode or 32-bit protected mode, with the system programmed to map the DRAM to some ranges of physical address space, and so on. I have little interest in the details of getting there, especially stuff the BIOS does before loading the boot sector or EFI. I know something about how a kernel enables paging and switches from 16-bit real mode to protected or long mode... – Peter Cordes Jun 23 '17 at 05:03
10

While this does not directly address the processor families specified in the question, the scheme below would also work on the earlier x86 processors. So yes, it is possible to operate without either RAM or cache, although this approach requires some creative programming.

Back in the 1980s I came across a design for a radio receiver that decoded the MSF time signals broadcast in the UK. This design used a Z80 processor and only had a ROM for the program storage. All of the processing and data storage was performed using the internal registers within the processor. This obviously meant that there could be no subroutine calls, as there was no memory available to hold the stack.

Back then the cost of RAM was high, and as this was a hobby project, keeping costs down was important, quite apart from it being an interesting academic exercise. This was also before the days of widely available microcontrollers (an 8751 with EPROM cost over £100, IIRC).
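
To give a feel for register-only programming, here is a small Python sketch of one common workaround for the missing stack: keep the return address in a register and "return" with an indirect jump (on a Z80 this would be something like `JP (HL)`). This is not taken from the MSF receiver design, which I no longer have; the function name `run_stackless` and the block labels are made up for illustration.

```python
def run_stackless(rom_data):
    """Sum the doubles of rom_data using a 'subroutine' but no call stack.

    The local variables stand in for CPU registers; rom_data stands in
    for a read-only table in ROM. Nothing else is used for storage.
    """
    pc = "loop"     # program counter: label of the next block to run
    ret = None      # return-address register
    acc = 0         # accumulator
    total = 0       # running sum, held in another register
    i = 0           # index register into the ROM table

    while pc != "halt":
        if pc == "loop":
            if i == len(rom_data):
                pc = "halt"
            else:
                acc = rom_data[i]
                ret = "after_double"  # load the return address into a register
                pc = "double"         # plain jump, not a CALL
        elif pc == "double":
            # The "subroutine": doubles the accumulator.
            acc = acc + acc
            pc = ret                  # indirect jump back, in place of RET
        elif pc == "after_double":
            total += acc
            i += 1
            pc = "loop"
    return total
```

For example, `run_stackless([1, 2, 3])` returns 12. The obvious limitation is that every level of nesting needs its own spare register for the return address, so deep call chains and recursion are out of reach without real RAM.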

uɐɪ
  • Restricting yourself to the data that can be kept in registers really limits the useful functions you can perform. I think the question was more about the cache memory in modern processors. – Mark Ransom Jun 20 '17 at 23:25
  • Of course it goes without saying that not using RAM limits the functionality that is possible. The application cited shows how much is possible with very limited data storage resources, a radio front end and an LCD display. – uɐɪ Jun 21 '17 at 10:18
-3

Typically a CPU will require an external clock. But with that, yes it can.