I am using an STM32F103C8T6 with STM32CubeIDE and the HAL.

While accessing a 1602 LCD over I2C, I get a hard fault.

I tried to debug it, but the debugger does not show the call stack. See the following screenshot:

[screenshot]

The breakpoints are inside stm32f1xx_hal_i2c.c, in this code (==> marks a breakpoint):

static HAL_StatusTypeDef I2C_WaitOnFlagUntilTimeout(I2C_HandleTypeDef *hi2c, uint32_t Flag, FlagStatus Status, uint32_t Timeout, uint32_t Tickstart)
{
  /* Wait until flag is set */
  ==> while (__HAL_I2C_GET_FLAG(hi2c, Flag) == Status)
  {
    /* Check for the Timeout */
    if (Timeout != HAL_MAX_DELAY)
    {
      if (((HAL_GetTick() - Tickstart) > Timeout) || (Timeout == 0U))
      {
        hi2c->PreviousState     = I2C_STATE_NONE;
        hi2c->State             = HAL_I2C_STATE_READY;
        hi2c->Mode              = HAL_I2C_MODE_NONE;
        hi2c->ErrorCode         |= HAL_I2C_ERROR_TIMEOUT;

        /* Process Unlocked */
        __HAL_UNLOCK(hi2c);

        ==> return HAL_ERROR;
      }
    }
  }
  return HAL_OK;
}

This function is called from within many functions inside the same file.

The hard fault occurs after this function has returned HAL_ERROR: on the next call, execution stops at the 'while' loop, and the fault is raised when I step INTO the while condition (evaluating __HAL_I2C_GET_FLAG). At that point, the following stack is shown:

[screenshot of the call stack]

How can I debug what exactly happens between the call and the hard fault? Or even better, does anybody know how such a hard fault can happen inside ST's HAL library?

Michel Keijzers
  • Do you have a watchdog enabled? Are you sure that your I2C device is responding? – Ron Beyer May 07 '20 at 19:49
  • I don't have a watchdog ... I don't know if it responds ... I could try a (simple) logic analyzer, but that wouldn't explain a hard fault (which seems more like a software problem). – Michel Keijzers May 07 '20 at 19:51
  • Set a breakpoint until you find the line that causes the hard fault, then start single-stepping into the code to see where the fault is. I'd be surprised if it was in the I2C library code, this is more likely something in your code like a bad I2C handle pointer. – Ron Beyer May 07 '20 at 20:09
  • Also @MichelKeijzers make sure your compiler optimizations are either configured to be 'Optimized for Debug' or off, otherwise you might not be able to step to the actual line causing the fault – Ocanath May 07 '20 at 20:13
  • __HAL_I2C_GET_FLAG is a macro, so it can't be called. What most likely happens is you run out of memory and trash the stack so it returns to some random memory address that is not executable. The problem most likely is not in this function at all. Show the code that uses the I2C HAL. – Justme May 07 '20 at 20:20
  • @Ron Beyer, I did exactly that, and you are right: it was completely unrelated to the I2C code. I will put it in my question as an 'answer'. – Michel Keijzers May 07 '20 at 21:59
  • @JustMe Yes, you are right, the memory is completely corrupted, and I will put in my question why. – Michel Keijzers May 07 '20 at 22:00
  • @MichelKeijzers Don't put the answer in your question, there's nothing wrong with providing an answer to your own question, at least this way you can mark it as the answer and this question won't show up in the future when it gets bumped. – Ron Beyer May 07 '20 at 22:04
  • @RonBeyer ok I will move it. – Michel Keijzers May 07 '20 at 22:05

3 Answers

Note that the cause was in a completely different part of the program than the one the question describes.

The problem was that I use EEPROM simulation, which on an STM32 means that part of the flash is used as storage and gets erased and rewritten.

By default page 15 (out of 31) is used, which lies somewhat below the halfway point of the 64 KB of flash.

Because my program has been growing gradually, it eventually reached that page, so formatting/writing the 'EEPROM simulation' flash space was overwriting my program.

I changed the page number to 30 and now the problem is solved (unless my program gets too big again).

I thought the simulation flash 'library' would use the next 'free' page but it does not.
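
The overlap described above can be checked with simple address arithmetic. The following is a minimal sketch, not the EEPROM library's own code: the helper names flash_page_start and image_overlaps_page are hypothetical, 0x08000000 is the usual start address of main flash on STM32F1 parts, and the page size is passed in as a parameter because it depends on the part and on how the library counts pages.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: main flash starts here (the usual value on STM32F1 parts). */
#define FLASH_START_ADDR 0x08000000u

/* Hypothetical helper: start address of flash page `page_index`,
   given the page size in bytes. */
uint32_t flash_page_start(uint32_t page_index, uint32_t page_bytes)
{
    return FLASH_START_ADDR + page_index * page_bytes;
}

/* Hypothetical helper: the program image occupies
   [FLASH_START_ADDR, image_end). Returns true if it extends past the
   start of the page reserved for the EEPROM simulation. */
bool image_overlaps_page(uint32_t image_end,
                         uint32_t page_index,
                         uint32_t page_bytes)
{
    return image_end > flash_page_start(page_index, page_bytes);
}
```

For example, with 1 KB pages, page 15 would start at 0x08003C00, so a program image ending above that address would be damaged by every format or write of that page. On the target, the image end address would have to come from the linker script or the map file.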

Michel Keijzers

Your debugging strategy should be to set a breakpoint as close to the fault as possible, and step one line at a time until you find the exact location of the fault. Make sure your compiler optimization settings are 'optimized for debug' or off; otherwise the debugger might skip lines or execute them in a different order.

I think you're looking at the wrong piece of code. Hard faults are usually due to bad memory access. You're clearly using one of the blocking HAL I2C functions, so if I had to guess I'd say you're probably overrunning an array. Check your array size and make sure you're calling your I2C function with the correct number of bytes.

Ocanath
  • Or maybe add the call to the parent HAL I2C function to your question? – Ocanath May 07 '20 at 20:33
  • Thanks for answering. I did set the breakpoint as close as possible, and stepped into the code just a few times. Setting the compiler optimization is a good tip. And yes, you are right, it was a memory problem, completely unrelated to what I described, but I didn't realize it. I will put it in my question (which I might delete). – Michel Keijzers May 07 '20 at 21:58
  • I couldn't see it because the memory was garbage. I will put it in my question (as a sort of answer). – Michel Keijzers May 07 '20 at 21:59
  • Don't delete it! That's an unusual problem, it could be useful to someone else down the line. Glad to hear you figured it out – Ocanath May 07 '20 at 22:06
  • Also your mention of the optimized for debug flag is useful. – Michel Keijzers May 07 '20 at 22:07

For future reference, you need to define DEBUG_DEFAULT_INTERRUPT_HANDLERS in your preprocessor macros to capture the real cause of the hardfault interrupt.

Unfortunately, I don't know how to do this with the STM32CubeIDE. Here is a tutorial on how to do it with VisualGDB and hopefully this points you in the right direction: https://visualgdb.com/tutorials/arm/tracing/traceback/
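
Independent of that macro, a common way to find where a hard fault came from is to look at the registers the Cortex-M core pushes onto the stack on exception entry. The sketch below is only an illustration under assumptions, not the VisualGDB mechanism: fault_frame_t, fault_frame_decode, hard_fault_report and g_last_fault are hypothetical names, and the handler part assumes GCC and a Cortex-M3 (it is compiled only when __ARM_ARCH_7M__ is defined). A CubeIDE-generated project normally already defines HardFault_Handler in stm32f1xx_it.c, so that definition would have to be replaced rather than duplicated.

```c
#include <stdint.h>

/* Words the Cortex-M core stacks on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12;
    uint32_t lr;    /* link register of the interrupted code */
    uint32_t pc;    /* return address: at or just after the faulting instruction */
    uint32_t xpsr;
} fault_frame_t;

/* Hypothetical helper: copy the eight stacked words into a struct. */
fault_frame_t fault_frame_decode(const uint32_t *stacked_sp)
{
    fault_frame_t f;
    f.r0   = stacked_sp[0];
    f.r1   = stacked_sp[1];
    f.r2   = stacked_sp[2];
    f.r3   = stacked_sp[3];
    f.r12  = stacked_sp[4];
    f.lr   = stacked_sp[5];
    f.pc   = stacked_sp[6];
    f.xpsr = stacked_sp[7];
    return f;
}

#if defined(__ARM_ARCH_7M__) && defined(__GNUC__)
/* Inspect this variable in the debugger after a fault. */
volatile fault_frame_t g_last_fault;

/* Called from the handler below with the stack pointer that was active
   when the fault occurred. Set a breakpoint on the loop. */
void hard_fault_report(const uint32_t *stacked_sp)
{
    g_last_fault = fault_frame_decode(stacked_sp);
    for (;;) {
        /* stay here so the debugger can read g_last_fault */
    }
}

/* Bit 2 of EXC_RETURN (in LR) tells whether MSP or PSP held the frame. */
__attribute__((naked)) void HardFault_Handler(void)
{
    __asm volatile(
        "tst   lr, #4            \n"
        "ite   eq                \n"
        "mrseq r0, msp           \n"
        "mrsne r0, psp           \n"
        "b     hard_fault_report \n");
}
#endif
```

After a fault, g_last_fault.pc can be looked up in the map file or the disassembly. If it points into erased or rewritten flash, or to an address that is not code at all, that suggests corrupted program memory or a trashed stack rather than a bug at the line where the debugger stopped.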

Alan Samet
  • The answer I added describes the main problem in my case, and I couldn't try your solution because I don't want to intentionally break my program. Still, I think it is a good approach for the future and a generic way to find memory-related problems. – Michel Keijzers May 11 '20 at 08:00