1

Is it possible, in theory, to recover a process after it has mistakenly been directed to read from a wrong memory address, rather than terminating it?

Let's say an error while working with registers leads the processor to read from a random place in memory, which then raises an illegal instruction exception. At this point, is there any way to recover to a stable state rather than terminating the process?

Are there any processor architectures (especially for embedded systems) with extra features to deal with this issue?

Also, are there any research papers that try to determine which return addresses are valid for a function? For example, if my function tries to return to an address that is (virtually) invalid for my program, have there been any efforts to detect such a violation, either at the programming-language level or in the operating system and memory management?

Note: Illegal Instruction is an exception thrown by the processor.

Update

Would manually saving the last known valid instruction pointer somewhere help? Maybe in an unused or reserved register, but it would have to be guaranteed to stay untouched by the rest of the program.

53777A
  • possible duplicate of [Recommend a design pattern/approach to exposing/tolerating/recovering from system errors, Exception handling (e.g.s in Java, C++, Perl, PHP)](http://programmers.stackexchange.com/questions/109297/recommend-a-design-pattern-approach-to-exposing-tolerating-recovering-from-syste) – gnat Jan 06 '15 at 11:13
  • see also: [Is it possible/good idea to reduce chance of crashing by catching Error?](http://programmers.stackexchange.com/questions/258201/java-is-it-possible-good-idea-to-reduce-chance-of-crashing-by-catching-error) – gnat Jan 06 '15 at 11:16
  • 2
    What you would need to do is to recover the value of the instruction pointer register. Since it's already been overwritten, this is non-trivial. There *might* be an interesting research idea in this. – Kilian Foth Jan 06 '15 at 11:23
  • 1
    @gnat In my question, the process is already out of control. I can't see how the suggested duplicates address this issue. I mean, how could error or exception handling help at that point? – 53777A Jan 06 '15 at 11:25
  • 2
    @KilianFoth: You need to get back to a known-good state. That can include more than just the IP. It also means you often don't want to go back to one of the recent states as those are suspect. If you step back just one instruction by restoring just the previous IP, you will just crash again when you retry the illegal instruction. – MSalters Jan 06 '15 at 15:29

2 Answers

6

When the processor raises an illegal instruction error, there are usually so many unknowns about the program state that the easiest way to get back to a known-good state is to let the process crash and let the fail-safe mechanisms restart it, or let a fall-back system take over. This might go as far as restarting the embedded device.

All processors that I know of signal errors like illegal instruction through interrupts. This is how an OS like Windows can inform the user that an application did something terribly wrong. In the firmware for embedded devices, you have enough control over the OS to hook up a different handler to that interrupt. This alternative handler might try to recover the situation, but in my experience, the only recovery used in practice is "terminate and restart".
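To illustrate what hooking up a different handler can look like, here is a minimal sketch for a hosted POSIX system rather than bare-metal firmware. It assumes the fault is delivered as `SIGILL`, and all names (`run_guarded`, `on_fault`, `ok_task`, `faulty_task`) are hypothetical. `raise(SIGILL)` only simulates the fault; a real illegal opcode would leave the program state far less trustworthy.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Hypothetical names throughout; a hosted POSIX sketch, not firmware. */
static sigjmp_buf recovery_point;          /* checkpoint to fall back to   */
static volatile sig_atomic_t last_fault;   /* signal number seen, or 0     */

/* Handler for the illegal-instruction signal: record the fault, abandon
 * the faulting code, and jump back to the checkpoint. */
static void on_fault(int signo)
{
    last_fault = signo;
    siglongjmp(recovery_point, 1);
}

/* Runs task() with the SIGILL handler installed.
 * Returns 0 if task() finished, the signal number if it faulted,
 * or -1 if the handler could not be installed. */
int run_guarded(void (*task)(void))
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    last_fault = 0;
    /* The second argument (1) saves the signal mask, so SIGILL is
     * unblocked again after the jump out of the handler. */
    if (sigsetjmp(recovery_point, 1) == 0)
        task();

    sigaction(SIGILL, &old, NULL);         /* restore the previous handler */
    return last_fault;
}

/* Demo tasks: one well-behaved, one that simulates the fault. */
void ok_task(void) { }
void faulty_task(void) { raise(SIGILL); }
```

Calling `run_guarded(faulty_task)` returns `SIGILL` instead of killing the process, and a later `run_guarded(ok_task)` returns 0. Note that this only discards the call stack of the faulting task. Any global state the task corrupted before faulting stays corrupted, which is why "terminate and restart" remains the usual choice.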

Bart van Ingen Schenau
  • Isn't the distinction between *process* and *application* important here? – Wolf Jan 06 '15 at 11:51
  • @Wolf: I don't see a significant difference, but I have edited anyway. – Bart van Ingen Schenau Jan 06 '15 at 11:55
  • @BartvanIngenSchenau Does that mean that if the OS has a snapshot of the program from the last interrupt, it can recover to that previous state when this interrupt happens? I mean, is it possible at least in principle? – 53777A Jan 06 '15 at 11:58
  • @BartvanIngenSchenau Thanks +1 - I mean, address spaces can be well-isolated for processes (on some platforms). If an application consists of multiple processes, one of them can be restarted while the application keeps (sort of) running. – Wolf Jan 06 '15 at 12:00
  • 1
    @53777A: No, the OS does not keep snapshots. But when an interrupt is being serviced, all the state that the processor needs to continue with the interrupted process is available and can be stored in a safe place. This information can also be altered to let the processor resume somewhere else. – Bart van Ingen Schenau Jan 06 '15 at 12:27
1

On embedded systems, where you can't always just exit the process and let the OS clean up (what if there's no OS?), an alternative method is to jump to a reinit() handler which reinitializes RAM and jumps to the program's entry point.

This is a reasonable technique if the CPU doesn't need complex reinitialization after such an error. But you probably should have a fallback technique to reset the actual CPU if the simple reinit() fails. Sometimes a full clean restart may be needed.
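As a rough, host-runnable sketch of that policy: try the cheap reinit() a few times, then fall back to a full reset. Everything here is an assumption made for illustration: the retry limit, the `app_state` layout, and `request_hard_reset()`, which stands in for whatever the platform really offers (for example letting a watchdog expire). On real hardware reinit() would jump to the entry point instead of returning, and the retry counter would have to live in memory that reinit() does not clear.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_SOFT_RESTARTS 3   /* assumed policy, tune per device */

/* Hypothetical application state that a soft restart wipes. */
struct app_state {
    int counter;
    unsigned char buffer[16];
};

static struct app_state state;

/* Bookkeeping that must survive reinit(); on a real target this would
 * sit in a RAM section excluded from reinitialization (assumption). */
static int soft_restarts;
static bool hard_reset_requested;

/* Stand-in for a real CPU reset; here it only records the request. */
static void request_hard_reset(void)
{
    hard_reset_requested = true;
}

/* Soft restart: reinitialize RAM. A real implementation would then jump
 * to the program's entry point and never return. */
void reinit(void)
{
    memset(&state, 0, sizeof state);
}

/* Meant to be called from the fault handler. */
void on_fatal_fault(void)
{
    if (soft_restarts < MAX_SOFT_RESTARTS) {
        soft_restarts++;
        reinit();
    } else {
        /* The simple reinit keeps failing: escalate to a clean restart. */
        request_hard_reset();
    }
}
```

The counter is what prevents an endless loop of soft restarts when the fault comes back immediately after every reinit().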

MSalters