I am working with the STM32F303VC discovery kit and I am slightly puzzled by its performance. To get acquainted with the system, I've written a very simple program to test the bit-banging speed of this MCU. The code can be broken down as follows:
- HSI clock (8 MHz) is turned on;
- PLL is started with a multiplication factor of 16 to achieve HSI / 2 * 16 = 64 MHz;
- PLL is designated as the SYSCLK;
- SYSCLK is monitored on the MCO pin (PA8), and one of the pins (PE10) is constantly toggled in the infinite loop.
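As a sanity check of the clock arithmetic in the list above, here is a minimal host-side sketch in C. The function name pll_output_hz is hypothetical (it is not part of any ST header), and the sketch assumes the HSI / 2 input path described above.

```c
#include <stdint.h>

/* Hypothetical helper (not from the ST headers): PLL output frequency
 * when the PLL input is HSI / 2, as in the configuration listed above. */
uint32_t pll_output_hz(uint32_t hsi_hz, uint32_t pll_mul)
{
    /* The HSI is halved before it reaches the PLL, then multiplied. */
    return (hsi_hz / 2u) * pll_mul;
}
```

With an 8 MHz HSI and a factor of 16, pll_output_hz(8000000u, 16u) evaluates to 64000000, which is the 64 MHz I expect to see on the MCO pin.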
The source code for this program is presented below:
#include "stm32f3xx.h"

int main(void)
{
    // Initialize the HSI:
    RCC->CR |= RCC_CR_HSION;
    while(!(RCC->CR&RCC_CR_HSIRDY));

    // Initialize the LSI:
    // RCC->CSR |= RCC_CSR_LSION;
    // while(!(RCC->CSR & RCC_CSR_LSIRDY));

    // PLL configuration:
    RCC->CFGR &= ~RCC_CFGR_PLLSRC;     // HSI / 2 selected as the PLL input clock.
    RCC->CFGR |= RCC_CFGR_PLLMUL16;    // HSI / 2 * 16 = 64 MHz
    RCC->CR |= RCC_CR_PLLON;           // Enable PLL
    while(!(RCC->CR&RCC_CR_PLLRDY));   // Wait until PLL is ready

    // Flash configuration:
    FLASH->ACR |= FLASH_ACR_PRFTBE;
    FLASH->ACR |= FLASH_ACR_LATENCY_1;

    // Main clock output (MCO):
    RCC->AHBENR |= RCC_AHBENR_GPIOAEN;
    GPIOA->MODER |= GPIO_MODER_MODER8_1;
    GPIOA->OTYPER &= ~GPIO_OTYPER_OT_8;
    GPIOA->PUPDR &= ~GPIO_PUPDR_PUPDR8;
    GPIOA->OSPEEDR |= GPIO_OSPEEDER_OSPEEDR8;
    GPIOA->AFR[0] &= ~GPIO_AFRL_AFRL0;

    // Output on the MCO pin:
    //RCC->CFGR |= RCC_CFGR_MCO_HSI;
    //RCC->CFGR |= RCC_CFGR_MCO_LSI;
    //RCC->CFGR |= RCC_CFGR_MCO_PLL;
    RCC->CFGR |= RCC_CFGR_MCO_SYSCLK;

    // PLL as the system clock
    RCC->CFGR &= ~RCC_CFGR_SW;         // Clear the SW bits
    RCC->CFGR |= RCC_CFGR_SW_PLL;      // Select PLL as the system clock
    while ((RCC->CFGR & RCC_CFGR_SWS_PLL) != RCC_CFGR_SWS_PLL); // Wait until PLL is used

    // Bit-bang monitoring:
    RCC->AHBENR |= RCC_AHBENR_GPIOEEN;
    GPIOE->MODER |= GPIO_MODER_MODER10_0;
    GPIOE->OTYPER &= ~GPIO_OTYPER_OT_10;
    GPIOE->PUPDR &= ~GPIO_PUPDR_PUPDR10;
    GPIOE->OSPEEDR |= GPIO_OSPEEDER_OSPEEDR10;

    while(1)
    {
        GPIOE->BSRRL |= GPIO_BSRR_BS_10;
        GPIOE->BRR |= GPIO_BRR_BR_10;
    }
}
The code was compiled with CoIDE V2 with the GNU ARM Embedded Toolchain using -O1 optimization. The signals on pins PA8 (MCO) and PE10, examined with an oscilloscope, look like this:
The SYSCLK appears to be configured correctly, as the MCO (orange curve) oscillates at nearly 64 MHz (allowing for the error margin of the internal clock). The weird part for me is the behavior on PE10 (blue curve). In the infinite while(1) loop it takes 4 + 4 + 5 = 13 clock cycles to perform an elementary 3-step operation (i.e. bit-set/bit-reset/return). It gets even worse at other optimization levels (e.g. -O2, -O3, or -Os): several additional clock cycles are added to the LOW part of the signal, i.e. between the falling and rising edges of PE10 (enabling the LSI somehow seems to remedy this situation).
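To put the cycle count above into time units, here is a small host-side sketch in C. The helper names loop_period_ns and toggle_freq_hz are hypothetical, and the numbers assume that SYSCLK is exactly 64 MHz and that the core runs at SYSCLK (the code above leaves the AHB prescaler at its default).

```c
#include <stdint.h>

/* Hypothetical helper: duration of one loop iteration in nanoseconds,
 * given the core clock and the number of cycles per iteration. */
double loop_period_ns(uint32_t sysclk_hz, uint32_t cycles_per_loop)
{
    return (double)cycles_per_loop * 1e9 / (double)sysclk_hz;
}

/* Hypothetical helper: resulting square-wave frequency on the pin,
 * since one loop iteration produces one full HIGH/LOW period. */
double toggle_freq_hz(uint32_t sysclk_hz, uint32_t cycles_per_loop)
{
    return (double)sysclk_hz / (double)cycles_per_loop;
}
```

For 13 cycles at 64 MHz this gives a period of about 203 ns, i.e. a PE10 frequency of roughly 4.9 MHz.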
Is this behavior expected from this MCU? I would imagine a task as simple as setting and resetting a bit ought to be 2-4 times faster. Is there a way to speed things up?