
The startup files for STM32 Cortex-M MCUs that ship for most GCC toolchains are often the Atollic TrueStudio startup assembly files, bundled with the HAL libraries; in my case, STM32CubeF4.

I'm looking at startup_stm32f407xx.s, and it starts with a section that looks as follows:

.syntax unified
.cpu cortex-m4
.fpu softvfp
.thumb

I want to rewrite the assembler startup script in C, as part of learning the Cortex-M startup process.

When compiling with the GCC ARM Toolchain, or perhaps any other GCC-based toolchain, does that mean I have to transfer these arguments, found in the assembly startup file, to command-line arguments for arm-none-eabi-gcc:

arm-none-eabi-gcc -mcpu=cortex-m4 -mfpu=softvfp -mthumb ...

Do these assembler lines correlate to the respective GCC arguments, or are they used for something entirely different?

josef.van.niekerk
  • I can give you the startup file written in C if you want.. – Eugene Sh. Nov 02 '15 at 15:21
  • That would be awesome, Eugene! Still would also like to understand the .s file and startup process a bit better. – josef.van.niekerk Nov 02 '15 at 15:24
  • Even better, go to [here](http://gnuarmeclipse.github.io/plugins/install/), install the plugins and they will generate the startup code for you (it's kind of a light alternative to Cube, and it uses C). – Eugene Sh. Nov 02 '15 at 15:27
  • It loads the PC and the SP, not much to learn there (on a pure hardware view) the interrupt vector table is usually also initialized in the startup file, probably because it's right next to the PC and SP value. So you'll generate a structure corresponding to the layout and tell the linker to put it at the very beginning. – Arsenal Nov 02 '15 at 15:28
  • @EugeneSh. Hehe, funny thing is, I'm already using GNU ARM Eclipse. I'm looking at a generated STM32F4 project to write my own startup step by step. It's all about doing an in-depth examination of the startup, and I'm trying to come up with the bare minimum to get an STM32F4 up and running. Don't even want newlib in there for now. – josef.van.niekerk Nov 02 '15 at 15:31
  • I know it doesn't make sense, why I'm doing this, but it's my own attempt, as a hobbyist to get an in depth understanding of the startup cycle. – josef.van.niekerk Nov 02 '15 at 15:32
  • @josef.van.niekerk Just a friendly piece of advice. I am an experienced embedded developer, but I find it really difficult to follow the STM32fxx docs and code. They have done a terrible job of making stuff transparent. If you want an easier path with the same class of micros, start with TI. Of course it's my opinion. – Eugene Sh. Nov 02 '15 at 15:34
  • @EugeneSh. Point taken. I was sniffing around the other day to see if other vendors like Freescale, NXP and even TI might be a bit friendlier to code with and setup. Maybe I must get an eval kit from TI and see how it goes. – josef.van.niekerk Nov 02 '15 at 15:39
  • Disclaimer: I am not familiar with NXP and Freescale, and these might be better than TI. But you can get a Tiva C launchpad board for like 10 bucks :) – Eugene Sh. Nov 02 '15 at 15:40
  • Just for example, tm4C123 startup code: http://pastebin.com/Mg3Aprza – Eugene Sh. Nov 02 '15 at 15:44
  • While I agree that ST does a very good job at making it hard for developers, the start-up process (until you get to main) will look identical for all Cortex-M4 cores on the same compiler. If you want to learn more on peripherals and write your own hardware drivers, go to TI, their peripherals have given me much less trouble than the ST ones. – Arsenal Nov 02 '15 at 15:47
  • I'll take a look at something like the Tiva C TM4C123G, which I can get from RS Components here in South Africa. Would still like to use GNU ARM Eclipse on Mac. – josef.van.niekerk Nov 02 '15 at 15:55
  • It works just fine with opensource tools (gcc, eclipse, openocd). Just to save your time, get the GCC precompiled from [here](https://launchpad.net/gcc-arm-embedded), and the other stuff from [here](http://gnuarmeclipse.github.io/debug/openocd/). – Eugene Sh. Nov 02 '15 at 15:57

1 Answer


This page gives a nice overview of the "special" assembler directives of the GNU ARM assembler.

As you suspected, these directives play basically the same role as the compiler switches, and the equivalent options should be passed when compiling the C sources.

The ones used:

  • .syntax [unified | divided]: This directive sets the Instruction Set Syntax as described in the ARM-Instruction-Set section. (unified: ARM and THUMB use the same syntax)
  • .cpu cortex-m4: Select the target processor. Valid values for name are the same as for the -mcpu commandline option. Specifying .cpu clears any previously selected architecture extensions.
  • .fpu softvfp: Select the floating-point unit to assemble for. Valid values for name are the same as for the -mfpu commandline option.
  • .thumb: This performs the same action as .code 16.
  • .code 16: This directive selects the instruction set being generated. The value 16 selects Thumb, with the value 32 selecting ARM.

Some, if not all, of them can also be configured via the command-line interface. I'd guess that they were included in the assembly file to make sure that it gets assembled exactly the way it was intended. The inline directives take higher priority than the command-line switches.
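When the startup code moves from assembly to C, there are no directives left in the file, so the target has to come from the command line (for example -mcpu=cortex-m4 -mthumb). A minimal sketch of how C code can check what the compiler was actually told, using predefined macros; the function names are made up for illustration, and the float ABI mapping is my reading of the GCC/ACLE macros rather than something taken from the startup file:

```c
/* Reports which instruction set the compiler is generating code for. */
const char *instruction_set(void)
{
#if defined(__thumb__)
    return "thumb";             /* -mthumb, the only option on Cortex-M */
#elif defined(__arm__)
    return "arm";               /* 32-bit ARM instruction set */
#else
    return "not an ARM target"; /* e.g. when built with a desktop compiler */
#endif
}

/* Reports the floating-point calling convention in use. */
const char *float_abi(void)
{
#if !defined(__arm__)
    return "not an ARM target";
#elif defined(__ARM_PCS_VFP)
    return "hard";              /* arguments passed in FPU registers */
#elif defined(__SOFTFP__)
    return "soft";              /* no FPU instructions generated */
#else
    return "softfp";            /* assumption: the remaining ABI choice */
#endif
}
```

Built with a desktop compiler, both functions simply report that the target is not ARM; built with arm-none-eabi-gcc and the flags above, instruction_set() should report "thumb".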


As for the description of the start-up process, don't know if you actually asked this, but I felt like writing it:

From a hardware point of view, the reset sequence is a matter of the core and is described in the ARMv7-M Architecture Reference Manual (available upon registration). Section B1.5.5 explains the reset behaviour:

Asserting reset causes the processor to abandon the current execution state without saving it. On the deassertion of reset, all registers that have a defined reset value contain that value, and the processor performs the actions described by the TakeReset() pseudocode.

// TakeReset()
// ============
TakeReset()
    CurrentMode = Mode_Thread;
    PRIMASK<0> = '0'; /* priority mask cleared at reset */
    FAULTMASK<0> = '0'; /* fault mask cleared at reset */
    BASEPRI<7:0> = Zeros(8); /* base priority disabled at reset */
    if HaveFPExt() then /* initialize the Floating Point Extn */
        CONTROL<2:0> = '000'; /* FP inactive, stack is Main, thread is privileged */
        CPACR.cp10 = '00';
        CPACR.cp11 = '00';
        FPDSCR.AHP = '0';
        FPDSCR.DN = '0';
        FPDSCR.FZ = '0';
        FPDSCR.RMode = '00';
        FPCCR.ASPEN = '1';
        FPCCR.LSPEN = '1';
        FPCCR.LSPACT = '0';
        FPCAR = bits(32) UNKNOWN;
        FPFSR = bits(32) UNKNOWN;
        for i = 0 to 31
            S[i] = bits(32) UNKNOWN;
    else
        CONTROL<1:0> = '00'; /* current stack is Main, thread is privileged */
    for i = 0 to 511 /* all exceptions Inactive */
        ExceptionActive[i] = '0';
    ResetSCSRegs(); /* catch-all function for System Control Space reset */
    ClearExclusiveLocal(ProcessorID()); /* Synchronization (LDREX* / STREX*) monitor support */
    ClearEventRegister(); /* see WFE instruction for more details */
    for i = 0 to 12
        R[i] = bits(32) UNKNOWN;
    bits(32) vectortable = VTOR<31:7>:'0000000';
    SP_main = MemA_with_priv[vectortable, 4, AccType_VECTABLE] AND 0xFFFFFFFC<31:0>;
    SP_process = ((bits(30) UNKNOWN):'00');
    LR = 0xFFFFFFFF<31:0>; /* preset to an illegal exception return value */
    tmp = MemA_with_priv[vectortable+4, 4, AccType_VECTABLE];
    tbit = tmp<0>;
    APSR = bits(32) UNKNOWN; /* flags UNPREDICTABLE from reset */
    IPSR<8:0> = Zeros(9); /* Exception Number cleared */
    EPSR.T = tbit; /* T bit set from vector */
    EPSR.IT<7:0> = Zeros(8); /* IT/ICI bits cleared */
    BranchTo(tmp AND 0xFFFFFFFE<31:0>); /* address of reset service routine */

ExceptionActive[*] is a conceptual array of active flag bits for all exceptions, meaning it has active flags for the fixed-priority system exceptions, the configurable-priority system exceptions, and the external interrupts. The active flags for the fixed-priority exceptions are conceptual only, and are not required to exist in a system register.
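The part of TakeReset() that matters most for writing a startup file is the tail end: two words are read from the vector table and turned into the initial stack pointer, the Thumb bit and the first program counter value. A small sketch of just that step in plain C, runnable on a PC; the struct and function names are invented for illustration, and the addresses in the usage note are example values only:

```c
#include <stdint.h>

/* The three values the core derives from the first two vector table words. */
typedef struct {
    uint32_t sp_main; /* initial Main stack pointer */
    uint32_t pc;      /* address of the reset service routine */
    uint32_t thumb;   /* EPSR.T, taken from bit 0 of the reset vector */
} reset_state_t;

/* Mirrors the SP_main / tbit / BranchTo lines of the TakeReset() pseudocode. */
reset_state_t decode_reset_vectors(const uint32_t *vectortable)
{
    reset_state_t state;
    uint32_t tmp = vectortable[1];                   /* word at vectortable+4 */

    state.sp_main = vectortable[0] & 0xFFFFFFFCu;    /* force 4-byte alignment */
    state.thumb   = tmp & 1u;                        /* tbit = tmp<0> */
    state.pc      = tmp & 0xFFFFFFFEu;               /* branch target, bit 0 cleared */
    return state;
}
```

For a table starting with the words 0x20020000 and 0x08000189, this yields sp_main = 0x20020000, thumb = 1 and pc = 0x08000188. It also shows why the reset vector must have bit 0 set: a Cortex-M core only executes Thumb code, so a reset vector with the T bit clear leads to a fault.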

The steps you have to take to implement the startup behaviour in software depend on the compiler and linker you use, so a general solution will probably not exist.

It usually consists of creating a vector table structure and filling it with the right values. The initial stack pointer is the first value; it has to hold the RAM address where your stack starts (the stack grows downward on Cortex-M, so this is normally the top of the stack area). This value is usually defined in the linker control file as an exported symbol. The next value is the address of your startup code, so you just put the function pointer to c_startup(), or whatever you call it, there.

What follows is a long list of function pointers pointing to the individual interrupt service handlers. Take care not to skip positions in the table just because an ISR is not implemented. Initialize those entries either with 0 (things can go wrong) or with a "not-implemented handler" consisting of a while(true) loop (the safe way).
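A sketch of what such a table can look like in C, limited to the 16 core exception slots (a real STM32F407 table continues with the device interrupts). Several names here are assumptions of mine: _estack and the .isr_vector section have to match whatever your linker script defines, BUILD_FOR_TARGET is a hypothetical macro you would define only for the MCU build, and the host_stack array exists purely so the sketch also compiles on a PC:

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*handler_t)(void);

/* Slot 0 holds a stack address, every other slot a handler address. */
typedef union {
    handler_t handler;
    void     *stack_top;
} vector_entry_t;

/* "Not-implemented handler": park here so a debugger can find the culprit. */
static void Default_Handler(void)
{
    for (;;) {
    }
}

/* Placeholder entry point; the real one initializes RAM and calls main(). */
static void c_startup(void)
{
    for (;;) {
    }
}

#ifdef BUILD_FOR_TARGET                 /* hypothetical macro for the MCU build */
extern uint32_t _estack;                /* assumed linker-script symbol */
#define INITIAL_SP  ((void *)&_estack)
#define VECTOR_ATTR __attribute__((section(".isr_vector"), used))
#else
static uint32_t host_stack[64];         /* stand-in stack for a PC build */
#define INITIAL_SP  ((void *)&host_stack[64])
#define VECTOR_ATTR
#endif

const vector_entry_t vector_table[16] VECTOR_ATTR = {
    { .stack_top = INITIAL_SP },        /*  0: initial Main stack pointer */
    { .handler = c_startup },           /*  1: Reset */
    { .handler = Default_Handler },     /*  2: NMI */
    { .handler = Default_Handler },     /*  3: HardFault */
    { .handler = Default_Handler },     /*  4: MemManage */
    { .handler = Default_Handler },     /*  5: BusFault */
    { .handler = Default_Handler },     /*  6: UsageFault */
    { 0 }, { 0 }, { 0 }, { 0 },         /*  7-10: reserved by the architecture */
    { .handler = Default_Handler },     /* 11: SVCall */
    { .handler = Default_Handler },     /* 12: Debug Monitor */
    { 0 },                              /* 13: reserved by the architecture */
    { .handler = Default_Handler },     /* 14: PendSV */
    { .handler = Default_Handler },     /* 15: SysTick */
};
```

The union is only there to keep the stack address and the function pointers type-correct without casts; many startup files use a plain array of function pointers and cast the stack symbol instead.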

After that you have to fill c_startup() with life, which depends on the compiler. Typical things to do are: initializing RAM with the initial values of global variables, initializing the FPU, calling constructors of static objects, and finally jumping to main().
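A minimal sketch of the RAM initialization part, assuming a GCC-style linker script. The two helper functions are plain C and can be tried on a PC; the symbol names _sidata, _sdata, _edata, _sbss and _ebss are assumptions that must match your own linker script, and BUILD_FOR_TARGET is a hypothetical macro defined only for the MCU build. FPU setup and static constructors are left out here:

```c
#include <stdint.h>

/* Copy the initial values of global variables from flash to RAM (.data). */
void copy_words(const uint32_t *src, uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Clear the variables that C expects to start out as zero (.bss). */
void zero_words(uint32_t *dst, const uint32_t *dst_end)
{
    while (dst < dst_end) {
        *dst++ = 0u;
    }
}

#ifdef BUILD_FOR_TARGET /* hypothetical macro, defined only for the MCU build */
/* Assumed linker-script symbols; only their addresses are meaningful. */
extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
extern int main(void);

void c_startup(void)
{
    copy_words(&_sidata, &_sdata, &_edata);
    zero_words(&_sbss, &_ebss);
    (void)main();
    for (;;) {
        /* main() is not expected to return on a bare-metal target */
    }
}
#endif
```

Until these two loops have run, global variables hold garbage, so c_startup() itself should not rely on any initialized or zeroed globals.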

As a last step you have to tell the linker to place your newly created super-structure at the address from which the core fetches the vector table (usually the first address of the flash, but this might vary depending on how the vector table is fetched in the device).

Arsenal