The "higher level" steps are as you've guessed: Get characters from keyboard, convert to integers, add the integers, convert the result back to characters, then display those characters.
The CPU does none of this itself.
The CPU executes tiny little instructions that each do a very simple (and very specific) thing. These instructions are the fundamental (lowest level) building blocks of software. Each of those high level steps represents many instructions.
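To make the contrast concrete, here's roughly what those higher level steps look like from the application's point of view, as a minimal C sketch (the buffer sizes and the use of fgets/strtol/printf are just assumptions for illustration). Each of these few library calls hides all of the layers described below:

```
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char line_a[64], line_b[64];

    /* Get characters from the keyboard (one line per number) */
    if (fgets(line_a, sizeof line_a, stdin) == NULL) return 1;
    if (fgets(line_b, sizeof line_b, stdin) == NULL) return 1;

    /* Convert the characters to integers */
    long a = strtol(line_a, NULL, 10);
    long b = strtol(line_b, NULL, 10);

    /* Add the integers */
    long sum = a + b;

    /* Convert the result back to characters and display them */
    printf("%ld\n", sum);
    return 0;
}
```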
For a detailed example, here's what it takes to get 1 character from the keyboard (making multiple assumptions about the computer's hardware and the OS being used, and over-simplifying):
- Some sort of controller (that the keyboard is connected to) will send an IRQ (Interrupt ReQuest) to the CPU, and the CPU will (sooner or later) respond by starting an interrupt handler.
- The OS's interrupt handler will figure out what the IRQ was and invoke a device driver's interrupt handler.
- The device driver's interrupt handler will do whatever it has to do to get the byte from the controller (this can be several layers of "complex" for some cases - e.g. USB). Then it'll send that byte to a keyboard driver.
- The keyboard driver will figure out a "scan code", which typically involves a state machine to figure out if multiple bytes are part of the same scan code (or part of a new scan code). Then it will typically convert the "potentially multi-byte scan code" into a "fixed size integer key-code".
- Then the keyboard driver will use the key-code and various lookup tables and other meta-data (that depend on which keyboard layout is being used) to determine if there is/isn't a "character" (Unicode codepoint?) associated with that key. Note that a lot of keys simply don't have any character. (A toy sketch of this scan-code to key-code to character step appears after this list.)
- The keyboard driver will combine this with other information to form some sort of "key press event"; and send that event somewhere (e.g. to a GUI).
- The "key press event" will make its way through various processes (e.g. from X to GUI to terminal emulator to shell to foreground console app) until it finds its way to an application. This can involve stripping a lot of useful information at some point (terminal emulator) to make it work for legacy
stdin
.
- Once the key/character arrives at the application, there's typically some sort of input buffering that allows the user to edit (and supports things like backspace, delete, cursor movement, cut/copy/paste, etc). Also, the "current buffer" is typically being displayed while the user edits it (so that they can see what they're doing). Usually, when the user presses "enter" the entered text is considered complete. This may all be done by a library (e.g. the C standard library).
- Then the application determines if the input is valid. E.g. if it's expecting a string representing a number but the user typed "FOO" then it may (should) display an appropriate error message and reject the input.
- While doing input validation, or after doing input validation (or instead of doing input validation, for extremely bad software), the application converts the input text (a string representing a number) into an integer. (A sketch of this validation and conversion also appears after this list.)
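To make the scan-code and key-code steps a little more concrete, here's a toy sketch of the state machine and lookup mentioned above. It assumes "PS/2 scan code set 1"-style bytes (where 0xE0 is a prefix byte for multi-byte scan codes); the key-code names and the tiny layout table are invented for this example and not taken from any real driver:

```
#include <stdint.h>

/* Toy fixed-size key-codes invented for this sketch */
enum { KEY_NONE = 0, KEY_A, KEY_ENTER, KEY_RIGHT_ARROW, KEY_MAX };

/* State carried between bytes, so multi-byte scan codes can be recognised */
static int got_extended_prefix = 0;

/* Feed one byte from the controller; returns a key-code, or KEY_NONE
   if more bytes are needed (or the byte isn't recognised) */
static int scancode_byte_to_keycode(uint8_t byte)
{
    if (byte == 0xE0) {                  /* start of a multi-byte scan code */
        got_extended_prefix = 1;
        return KEY_NONE;
    }
    if (got_extended_prefix) {
        got_extended_prefix = 0;
        return (byte == 0x4D) ? KEY_RIGHT_ARROW : KEY_NONE;
    }
    switch (byte) {                      /* single-byte scan codes */
    case 0x1E: return KEY_A;
    case 0x1C: return KEY_ENTER;
    default:   return KEY_NONE;
    }
}

/* Toy layout table: key-code -> Unicode codepoint (0 means "no character",
   which is the case for arrow keys, shift, function keys, ...) */
static const uint32_t layout_table[KEY_MAX] = {
    [KEY_A]           = 'a',
    [KEY_ENTER]       = '\n',
    [KEY_RIGHT_ARROW] = 0,
};
```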
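And here's a sketch of the last two steps (validating the text and converting it to an integer), assuming strtol is used for the conversion; the function name parse_number is made up for this example:

```
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 0 and fills *out on success; prints an error and returns -1
   for input like "FOO", empty lines, or out-of-range values */
static int parse_number(const char *text, long *out)
{
    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (end == text || errno == ERANGE || (*end != '\0' && *end != '\n')) {
        fprintf(stderr, "\"%s\" is not a valid number\n", text);
        return -1;
    }
    *out = value;
    return 0;
}
```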
Note that all of the above can easily add up to thousands of tiny little instructions that are executed by the CPU, even though it covers only a fraction of the work (barely more than one of the "higher level steps" we started with) and even though I didn't provide any details of how the input buffer is displayed while it's being edited (font engine, text layout engine, 2D graphics renderer).
For all of the higher level steps (get 2 numbers from the user, add them, then display the result) the total number of instructions that the CPU executes can be (literally) millions.