4

I am working on embedded systems as a beginner and have come across files like start.s or cstart that run before the main() function begins. What is the purpose of these or similar files? What information are we telling the system? I've heard of initialization but don't know exactly what that is.

doubleE
  • 3
    If you have found files like start.s or cstart then you can probably open these files and try to understand them, do not be so lazy. If you are so lazy then you can use google before asking questions: http://stackoverflow.com/questions/3393611/flow-of-startup-code-in-an-embedded-system-concept-of-boot-loader – Al Bundy Sep 20 '16 at 18:18
  • I tried to read them but couldn't decrypt what they are doing. – doubleE Sep 20 '16 at 18:21
  • 2
    Your question should be more specific. As asked, it leaves people guessing at a lot of information. For example: what kind of processor? Which compiler? Also, you should let us know what you have done to solve your problem before asking. Why is it cryptic? Is it written in assembler? Finally, have you read the processor datasheet? Have you gone through the compiler manual? – Krauss Sep 20 '16 at 18:30
  • Related: [What _should_ happen before main()](https://stackoverflow.com/a/47940277/584518). – Lundin Oct 31 '18 at 15:00

4 Answers

6

It is completely dependent on the compiler and architecture, but generally that code initializes the most basic hardware required for the rest of the code to run. For example, the code:

  • Defines the reset vectors

  • Defines the layout of data in memory (many systems use a linker script instead)

  • Defines the addresses of interrupt service routines in a big table (the interrupt vector table)

  • Initializes CPU registers, e.g. the stack pointer

  • Configures the core clock

In addition, that section also serves the runtime needs of the programming language used. It:

  • Initializes whatever function parameter passing system is used

  • Initializes global variables by e.g. copying flash contents to RAM and zero-initializing memory

  • If dynamic memory allocation is used, initializes the heap

  • If floating point math is enabled, initializes the FPU (if available) or initializes the floating point library

  • If exceptions are used, initializes exception handling.
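The "copy flash contents to RAM and zero-initialize memory" step can be sketched on a desktop machine. This is only a simulation under stated assumptions: on a real target the region boundaries come from the linker script, and every name below (`flash_data_image`, `ram_data`, `ram_bss`, `toy_reset_handler`, `demo_main`) is hypothetical rather than taken from any vendor's startup file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sizes; a real linker script would define the regions. */
enum { DATA_SIZE = 4, BSS_SIZE = 8 };

/* Initial values of initialized globals, as they would sit in flash. */
const unsigned char flash_data_image[DATA_SIZE] = { 0x11, 0x22, 0x33, 0x44 };

/* Simulated RAM regions for initialized (.data) and zeroed (.bss) variables. */
unsigned char ram_data[DATA_SIZE];
unsigned char ram_bss[BSS_SIZE];

/* Fill RAM with junk to mimic its undefined contents after power-up. */
void simulate_power_up(void)
{
    memset(ram_data, 0xAA, sizeof ram_data);
    memset(ram_bss, 0xAA, sizeof ram_bss);
}

/* What the startup code does for the C runtime:
   copy .data from flash to RAM, then clear .bss. */
void startup_init_memory(void)
{
    memcpy(ram_data, flash_data_image, sizeof ram_data);
    memset(ram_bss, 0, sizeof ram_bss);
}

/* Stand-in for the application's main(); it relies on both regions
   already holding their expected values. */
int demo_main(void)
{
    return ram_data[0] + ram_bss[0];
}

/* Toy reset handler: prepare memory first, only then enter main. */
int toy_reset_handler(int (*app_main)(void))
{
    assert(app_main != NULL);
    startup_init_memory();
    return app_main();
}
```

Calling `simulate_power_up()` followed by `toy_reset_handler(demo_main)` shows why the order matters: without the copy and clear step, `demo_main` would read the junk values instead.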

jms
3

Ubuntu 20.04 glibc 2.31 RTFS + GDB

glibc does some setup before main so that some of its functionalities will work. Let's try to track down the source code for that.

hello.c

#include <stdio.h>

int main() {
    puts("hello");
    return 0;
}

Compile and debug:

gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o hello.out hello.c
gdb hello.out

Now in GDB:

b main
r
bt -past-main

gives:

#0  main () at hello.c:3
#1  0x00007ffff7dc60b3 in __libc_start_main (main=0x555555555149 <main()>, argc=1, argv=0x7fffffffbfb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffbfa8) at ../csu/libc-start.c:308
#2  0x000055555555508e in _start ()

This already contains the line of the caller of main: https://github.com/cirosantilli/glibc/blob/glibc-2.31/csu/libc-start.c#L308.

The function has a billion ifdefs as can be expected from the level of legacy/generality of glibc, but some key parts which seem to take effect for us should simplify to:

# define LIBC_START_MAIN __libc_start_main

STATIC int
LIBC_START_MAIN (int (*main) (int, char **, char **),
         int argc, char **argv
         /* remaining parameters elided */)
{
  int result;

  /* Initialize some stuff. */

  result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
  exit (result);
}
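To make the shape "initialize, call main, pass the result on" concrete, here is a toy version. It is not glibc's code: `toy_start_main`, `toy_init` and `toy_main` are hypothetical names, and unlike the real function it returns the status to its caller instead of handing it to exit().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this only mimics the control flow. */
typedef int (*main_fn)(int, char **);

int init_calls = 0;            /* how many times the init hook ran */
int init_ran_before_main = 0;  /* set by toy_main if init had already run */

/* Stand-in for the setup work done before main. */
void toy_init(void)
{
    init_calls++;
}

/* Stand-in for the user's main. */
int toy_main(int argc, char **argv)
{
    (void)argv;
    init_ran_before_main = (init_calls == 1);
    return argc + 41;  /* arbitrary status value for the demo */
}

/* Run the init hook, then main, then hand back main's result.
   The real function passes that result to exit() and never returns. */
int toy_start_main(main_fn user_main, int argc, char **argv,
                   void (*init)(void))
{
    assert(user_main != NULL);
    if (init != NULL)
        init();
    return user_main(argc, argv);
}
```

Calling `toy_start_main(toy_main, 1, argv, toy_init)` runs `toy_init` first and then returns whatever `toy_main` returned, which is the value the real startup code would turn into the process exit status.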

Before __libc_start_main we are already at _start, which we know is the entry point because adding -Wl,--verbose to the gcc command shows that the linker script contains:

ENTRY(_start)

and is therefore the very first instruction executed after the dynamic loader finishes.

To confirm that in GDB, we can get rid of the dynamic loader by compiling with -static:

gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -static -o hello.out hello.c
gdb hello.out

and then make GDB stop at the very first instruction executed with starti and print the first instructions:

starti
display/12i $pc

which gives:

=> 0x401c10 <_start>:   endbr64 
   0x401c14 <_start+4>: xor    %ebp,%ebp
   0x401c16 <_start+6>: mov    %rdx,%r9
   0x401c19 <_start+9>: pop    %rsi
   0x401c1a <_start+10>:        mov    %rsp,%rdx
   0x401c1d <_start+13>:        and    $0xfffffffffffffff0,%rsp
   0x401c21 <_start+17>:        push   %rax
   0x401c22 <_start+18>:        push   %rsp
   0x401c23 <_start+19>:        mov    $0x402dd0,%r8
   0x401c2a <_start+26>:        mov    $0x402d30,%rcx
   0x401c31 <_start+33>:        mov    $0x401d35,%rdi
   0x401c38 <_start+40>:        addr32 callq 0x4020d0 <__libc_start_main>

By grepping the source for _start and focusing on x86_64 hits we see that this seems to correspond to sysdeps/x86_64/start.S:58:


ENTRY (_start)
    /* Clearing frame pointer is insufficient, use CFI.  */
    cfi_undefined (rip)
    /* Clear the frame pointer.  The ABI suggests this be done, to mark
       the outermost frame obviously.  */
    xorl %ebp, %ebp

    /* Extract the arguments as encoded on the stack and set up
       the arguments for __libc_start_main (int (*main) (int, char **, char **),
           int argc, char *argv,
           void (*init) (void), void (*fini) (void),
           void (*rtld_fini) (void), void *stack_end).
       The arguments are passed via registers and on the stack:
    main:       %rdi
    argc:       %rsi
    argv:       %rdx
    init:       %rcx
    fini:       %r8
    rtld_fini:  %r9
    stack_end:  stack.  */

    mov %RDX_LP, %R9_LP /* Address of the shared library termination
                   function.  */
#ifdef __ILP32__
    mov (%rsp), %esi    /* Simulate popping 4-byte argument count.  */
    add $4, %esp
#else
    popq %rsi       /* Pop the argument count.  */
#endif
    /* argv starts just at the current stack top.  */
    mov %RSP_LP, %RDX_LP
    /* Align the stack to a 16 byte boundary to follow the ABI.  */
    and  $~15, %RSP_LP

    /* Push garbage because we push 8 more bytes.  */
    pushq %rax

    /* Provide the highest stack address to the user code (for stacks
       which grow downwards).  */
    pushq %rsp

#ifdef PIC
    /* Pass address of our own entry points to .fini and .init.  */
    mov __libc_csu_fini@GOTPCREL(%rip), %R8_LP
    mov __libc_csu_init@GOTPCREL(%rip), %RCX_LP

    mov main@GOTPCREL(%rip), %RDI_LP
#else
    /* Pass address of our own entry points to .fini and .init.  */
    mov $__libc_csu_fini, %R8_LP
    mov $__libc_csu_init, %RCX_LP

    mov $main, %RDI_LP
#endif

    /* Call the user's main function, and exit with its value.
       But let the libc call main.  Since __libc_start_main in
       libc.so is called very early, lazy binding isn't relevant
       here.  Use indirect branch via GOT to avoid extra branch
       to PLT slot.  In case of static executable, ld in binutils
       2.26 or above can convert indirect branch into direct
       branch.  */
    call *__libc_start_main@GOTPCREL(%rip)

which ends up calling __libc_start_main as expected.
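One visible effect of the init hook passed above is that user code can be made to run before main. A minimal sketch, assuming GCC or Clang on an ELF target: the `constructor` attribute is a compiler extension, and `setup_done`, `early_setup` and `check_setup` are hypothetical names. As far as I understand, in glibc 2.31 it is `__libc_csu_init` that walks the table of such functions for the main executable.

```c
#include <assert.h>

/* Hypothetical flag recording that the early hook has run. */
int setup_done = 0;

/* GCC/Clang extension: the compiler registers this function so that
   the startup code calls it before main() is entered. */
__attribute__((constructor))
void early_setup(void)
{
    setup_done = 1;
}

/* Call this from main(): by then the constructor should have run. */
int check_setup(void)
{
    assert(setup_done == 1);
    return setup_done;
}
```

Putting a breakpoint on `early_setup` in GDB and running `bt` is a quick way to see which part of the startup code calls it on a given system.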

Unfortunately -static makes the bt from main not show as much info:

#0  main () at hello.c:3
#1  0x0000000000402560 in __libc_start_main ()
#2  0x0000000000401c3e in _start ()

If we remove -static and start from starti, we get instead:

=> 0x7ffff7fd0100 <_start>:     mov    %rsp,%rdi
   0x7ffff7fd0103 <_start+3>:   callq  0x7ffff7fd0df0 <_dl_start>
   0x7ffff7fd0108 <_dl_start_user>:     mov    %rax,%r12
   0x7ffff7fd010b <_dl_start_user+3>:   mov    0x2c4e7(%rip),%eax        # 0x7ffff7ffc5f8 <_dl_skip_args>
   0x7ffff7fd0111 <_dl_start_user+9>:   pop    %rdx

By grepping the source for _dl_start_user this seems to come from sysdeps/x86_64/dl-machine.h:L147

/* Initial entry point code for the dynamic linker.
   The C function `_dl_start' is the real entry point;
   its return value is the user program's entry point.  */
#define RTLD_START asm ("\n\
.text\n\
    .align 16\n\
.globl _start\n\
.globl _dl_start_user\n\
_start:\n\
    movq %rsp, %rdi\n\
    call _dl_start\n\
_dl_start_user:\n\
    # Save the user entry point address in %r12.\n\
    movq %rax, %r12\n\
    # See if we were run as a command with the executable file\n\
    # name as an extra leading argument.\n\
    movl _dl_skip_args(%rip), %eax\n\
    # Pop the original argument count.\n\
    popq %rdx\n\

and this is presumably the dynamic loader entry point.

If we break at _start and continue, this seems to end up in the same location as when we used -static, which then calls __libc_start_main.

TODO:

2

Somewhat related question: Who receives the value returned by main()?

main() is an ordinary C function, so it requires certain things to be initialized before it is called. These are related to:

  • Setting up a valid stack
  • Creating a valid argument list (usually on the stack)
  • Initializing the interrupt-handling hardware
  • Initializing global and static variables (including library code)

The last item includes such things as setting up a memory pool that malloc() and free() can use, if your environment supports dynamic memory allocation. Similarly, any form of "standard I/O" that your system might have access to will also be initialized.
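As an illustration of what "setting up a memory pool" can mean, here is a minimal bump-allocator sketch. It is an assumption-laden toy, not how any particular C library implements malloc(): the names `pool_init` and `pool_alloc` and the sizes are hypothetical, and there is no free() at all.

```c
#include <assert.h>
#include <stddef.h>

enum { POOL_SIZE = 64, POOL_ALIGN = 8 };

/* Backing storage; the union forces alignment suitable for common types. */
static union {
    unsigned char bytes[POOL_SIZE];
    long long force_align;
} pool;

static size_t pool_used;  /* bytes handed out so far */
static int pool_ready;    /* set once startup has initialized the pool */

/* Would be called by the startup code before main(). */
void pool_init(void)
{
    pool_used = 0;
    pool_ready = 1;
}

/* Hand out the next free chunk, rounded up to POOL_ALIGN bytes.
   Returns NULL when the pool cannot satisfy the request. */
void *pool_alloc(size_t n)
{
    assert(pool_ready);  /* using the pool before init is a bug */
    size_t rounded = (n + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    if (rounded < n || rounded > POOL_SIZE - pool_used)
        return NULL;
    void *p = &pool.bytes[pool_used];
    pool_used += rounded;
    return p;
}
```

The assert in `pool_alloc` is the point of the sketch: an allocator only works once something has initialized its bookkeeping, and on an embedded system that something is the code that runs before main().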

Pretty much anything else is going to be application-dependent, and will have to be initialized from within main(), before you enter your "main loop".

Dave Tweed
  • 1
    Except for maybe setting up an MMU (memory management unit), I have never seen the C startup code (crt0.s for example) actually do any I/O initialization, this is usually done right at the top of main using a series of C function calls. – tcrosley Sep 20 '16 at 18:35
  • @tcrosley: Like I said, "*might* have access to". It's rare, but I've seen it done. In most systems with MMUs, you're running under an OS, not on the bare metal, so the environment for `main()` is more like running on a desktop system. – Dave Tweed Sep 20 '16 at 18:38
  • 1
    `main` is most certainly not an ordinary C function. There are many very specific rules that apply to `main`, such as no prototype, implicit `return 0` etc. The allowed forms of `main` are also special and dictated by the given compiler - so unlike when declaring/defining regular functions, the programmer cannot decide the function format. – Lundin Sep 21 '16 at 11:06
  • Regarding MMU, it is quite common that even simple microcontrollers have some form of limited MMU. Some devices allow custom memory mapping of registers, RAM, flash etc. As for more complex devices, they don't necessarily have an OS. I've done some bare metal Power PC projects - it has a rather complex MMU which obviously needs to be initialized very early on, even before the stack is initialized. – Lundin Sep 21 '16 at 11:09
  • As for I/O initialization from start-up code, I've never seen that either. I have however written such code myself, for example when the MCU has all pins set as input by default. On some sensitive applications, you might then want to set them as outputs as soon as possible for EMC reasons. Also, professional programs initialize fundamental safety features like watchdog and brown-out detect as early as possible, often before main. Amateur/hobbyist/wannabe programmers often do the massive copy-down of data to all RAM cells in .data and .bss with no wdog or LVD yet activated. – Lundin Sep 21 '16 at 11:19
2

On a typical embedded system, startup code will at minimum have to load all initialized variables with their defined values and zero out all uninitialized variables. Depending on the hardware platform, it may also have to configure the CPU stack pointer (on some platforms a reset automatically sets the stack pointer to the top of memory; on others it must be set manually) or configure various other features of the CPU or memory controller.

The startup code is usually pretty short and simple, and some platforms document how it works and allow a user to substitute something else. For example, if a user-supplied startup routine has to copy some code from a serial flash chip into RAM and then execute it, it may make sense to have initialized variables be part of the code image itself, rather than storing their initial values in the image only to have them copied to another area of RAM on startup and ignored thereafter.

supercat