ana_lazareva April 5, 2018 at 13:46

Debugging multithreaded programs based on FreeRTOS

Tutorial

Debugging multitasking programs is not easy, especially the first time you run into it. Once the joy of launching your first task or demo program has faded, along with the endlessly exciting sight of LEDs each blinking in its own task, there comes a moment when you realize that you understand rather little (~~nothing at all~~) about what is really going on. Classics of the genre: “I gave the operating system as much as 3KB and launched only 3 tasks with a 128B stack each, yet for some reason there is not enough memory for the fourth one” or “How much stack should I allocate to a task? Is this much enough? What about this much?” Many people solve these problems by trial and error, so in this article I decided to collect most of the points that currently make my life much easier and let me debug multithreaded FreeRTOS-based programs more deliberately.

This article is intended primarily for those who have only recently started learning FreeRTOS, but readers who already know this operating system well will probably find something of interest too. In addition, although the article is aimed at embedded software developers, it should also interest application programmers, since much of it is about FreeRTOS as such, regardless of microcontroller romance.

In this article I will talk about the following points:

Configuring OpenOCD to work with FreeRTOS.
Do not forget to enable the hooks.
Static or dynamic memory allocation?
A tale about the configMINIMAL_STACK_SIZE parameter.
Resource usage monitoring.

Configuring OpenOCD to work with FreeRTOS

The first thing you may encounter when using FreeRTOS is the lack of any useful information in the Debug window:

It looks as sad as it gets. Fortunately, OpenOCD supports FreeRTOS debugging; you just need to configure it correctly:

Add the file FreeRTOS-openocd.c to the project.
Add a flag to the linker (Properties > C/C++ Build > Settings > Cross ARM C++ Linker > Miscellaneous > Other linker flags):
```
-Wl,--undefined=uxTopUsedPriority
```
Add a flag to the debugger (Run > Debug Configurations > Debugger > Config options):
```
-c "$_TARGETNAME configure -rtos auto"
```
Uncheck Run > Debug Configurations > Startup > Set breakpoint at main.
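
For reference, the helper file from the first step is tiny. Below is a minimal sketch of what it contains, assuming your copy matches the one shipped in OpenOCD's contrib directory: OpenOCD looks up the uxTopUsedPriority symbol, which newer kernels no longer define, so the file supplies it, and the linker flag keeps the otherwise unreferenced symbol from being discarded. The configMAX_PRIORITIES fallback is a stand-in so that the snippet compiles on its own (in a real project the value comes from FreeRTOSConfig.h), and the extern "C" wrapper is only needed because this sketch is written as C++ rather than C.

```cpp
// Stand-in for the value normally provided by FreeRTOSConfig.h
// (an assumption made so that this sketch is self-contained).
#ifndef configMAX_PRIORITIES
#define configMAX_PRIORITIES 5
#endif

// Ask the compiler to emit the symbol even though nothing references it.
#ifdef __GNUC__
#define USED __attribute__((used))
#else
#define USED
#endif

extern "C" {
// The declaration gives the constant external linkage and an unmangled name,
// so that the debugger can find it by symbol lookup.
extern const int uxTopUsedPriority;
const int USED uxTopUsedPriority = configMAX_PRIORITIES - 1;
}
```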

With these settings, the Debug window displays all existing threads in detail, so we always have access to the state of any particular task and to what it is currently doing:

Do not forget to enable the hooks

If our program crashes into some hard_fault_handler(), then with the settings from the previous section we can tell which task we came from. However, we know nothing about the cause of the crash.

For example, in the picture above we see that an error occurred while the YellowLedTask task was executing. The first thing we do is step through the task's infinite loop line by line in the debugger to pin down where the crash happens. Suppose we find out that the program crashes while executing the dummy() function (by the way, there is a way to see immediately which function the fault occurred in; you can read about it in this article). We start checking the function body for errors or typos. An hour passes, an eye begins to twitch, and we are as sure that the function is written correctly as we are that the chair we are sitting on exists. So what is going on? The thing is that the error may have nothing to do with your function at all, and the problem lies in the operation of the OS. This is where hooks come to our aid.

FreeRTOS provides the following hooks:

/* Hook function related definitions. */
#define configUSE_IDLE_HOOK                     0
#define configUSE_TICK_HOOK                     0
#define configCHECK_FOR_STACK_OVERFLOW          2
#define configUSE_MALLOC_FAILED_HOOK            1
#define configUSE_DAEMON_TASK_STARTUP_HOOK      0

The most important ones for debugging are configCHECK_FOR_STACK_OVERFLOW and configUSE_MALLOC_FAILED_HOOK.

The configCHECK_FOR_STACK_OVERFLOW parameter can be enabled with a value of 1 or 2, depending on which stack overflow detection method you want to use. Read more about this here. If you enable this hook, you need to define the function
void vApplicationStackOverflowHook(TaskHandle_t xTask, char* pcTaskName), which is called when the kernel detects that the stack allocated to a task is not enough for it, and most importantly you will see the hook in the call stack of the offending task. The check is performed when the task is switched out, so detection is not instantaneous and not every overflow is guaranteed to be caught. In most cases, solving the problem only requires increasing the size of the stack allocated to the task.

vApplicationStackOverflowHook

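// The hook is called from the kernel (C code), so it needs C linkage in a C++ file.
extern "C" void vApplicationStackOverflowHook(TaskHandle_t xTask, char* pcTaskName)
{
    // Unused here, but in the debugger pcTaskName names the offending task.
    (void)xTask;
    (void)pcTaskName;
    rtos::CriticalSection::Enter();
    {
        // Park here so that the debugger shows this hook in the call stack.
        while (true)
        {
            portNOP();
        }
    }
    rtos::CriticalSection::Exit(); // never reached
}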

The configUSE_MALLOC_FAILED_HOOK parameter is enabled by setting it to 1, like most of the configurable FreeRTOS parameters. If you enable this hook, you need to define the void vApplicationMallocFailedHook() function. It is called when the heap allocated to FreeRTOS does not have enough free space for the next object. And, again, the main thing is that we will see all of this in the call stack. Therefore, all we need to do to solve the problem is increase the size of the heap allocated to FreeRTOS.

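vApplicationMallocFailedHook

// The hook is called from the kernel (C code), so it needs C linkage in a C++ file.
extern "C" void vApplicationMallocFailedHook()
{
    rtos::CriticalSection::Enter();
    {
        // Park here so that the debugger shows this hook in the call stack.
        while (true)
        {
            portNOP();
        }
    }
    rtos::CriticalSection::Exit(); // never reached
}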

Now, if we run our program again, then when it crashes in hard_fault_handler() we will see the reason for the crash in the Debug window:

By the way, if you have ever found an interesting use for configUSE_IDLE_HOOK, configUSE_TICK_HOOK or configUSE_DAEMON_TASK_STARTUP_HOOK, it would be very interesting to read about it in the comments :)

Static or dynamic memory allocation?

So, we have figured out how to monitor stack and heap overflow in FreeRTOS, and now it is time to talk about the eternal: memory.

In this section, we will consider the following FreeRTOS options:

/* Memory allocation related definitions. */
#define configSUPPORT_STATIC_ALLOCATION         0
#define configSUPPORT_DYNAMIC_ALLOCATION        1
#define configTOTAL_HEAP_SIZE                   100000
#define configAPPLICATION_ALLOCATED_HEAP        0

In FreeRTOS, memory for tasks, semaphores, timers, and other RTOS objects can be allocated either statically (configSUPPORT_STATIC_ALLOCATION) or dynamically (configSUPPORT_DYNAMIC_ALLOCATION). If you enable dynamic allocation, you must also specify the size of the heap the RTOS may use (configTOTAL_HEAP_SIZE). In addition, if you want the heap to live at a specific location rather than wherever the linker places it, enable the configAPPLICATION_ALLOCATED_HEAP parameter and define the uint8_t ucHeap[configTOTAL_HEAP_SIZE] array yourself. And do not forget that dynamic allocation requires adding one of the files heap_1.c, heap_2.c, heap_3.c, heap_4.c or heap_5.c to the FreeRTOS sources in your project, depending on which memory manager suits you best.
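
As an illustration of the configAPPLICATION_ALLOCATED_HEAP case, here is a minimal sketch of an application-defined heap array placed in a dedicated linker section. The section name .freertos_heap is hypothetical and would have to be described in your linker script, and the configTOTAL_HEAP_SIZE fallback is a stand-in for the value from FreeRTOSConfig.h so that the snippet compiles on its own.

```cpp
#include <cstdint>

// Stand-in for the value normally provided by FreeRTOSConfig.h (assumption).
#ifndef configTOTAL_HEAP_SIZE
#define configTOTAL_HEAP_SIZE 100000
#endif

// Hypothetical section name; it only has an effect on GCC-compatible ELF toolchains.
#if defined(__GNUC__) && defined(__ELF__)
#define HEAP_SECTION __attribute__((section(".freertos_heap")))
#else
#define HEAP_SECTION
#endif

extern "C" {
// The memory manager expects this exact name and C linkage
// when configAPPLICATION_ALLOCATED_HEAP is set to 1.
std::uint8_t ucHeap[configTOTAL_HEAP_SIZE] HEAP_SECTION;
}
```

The array still counts towards the statically allocated RAM of the project, so the budgeting from the build output stays the same; only the placement changes.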

To estimate how much memory you can give to the FreeRTOS heap, look at the size of the .bss section after building the project. It shows how much RAM is needed for all zero-initialized static variables, and the FreeRTOS heap array is one of them (initialized statics live in .data, which also takes up RAM). For example, I have a controller with 128K of RAM, I gave FreeRTOS 50K, and after the build the .bss section occupies 62304 bytes. This means that my project statically allocates 12304 bytes of variables plus 50000 bytes for the OS heap. We must also remember to reserve a couple of kilobytes for the main() stack, so in the end the FreeRTOS heap can be increased by (128000 - 62304 - 2000) = 63696 bytes.
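
The same arithmetic as a small compile-time sketch. The numbers are the ones quoted above (RAM is taken as 128000 bytes, as in the text); the function and parameter names are made up for this illustration.

```cpp
#include <cstdint>

// Static data other than the FreeRTOS heap: .bss minus the heap array.
constexpr std::uint32_t other_static_bytes(std::uint32_t bss, std::uint32_t heap) {
  return bss - heap;
}

// RAM still unclaimed after .bss and a reserve for the main() stack.
constexpr std::uint32_t heap_headroom_bytes(std::uint32_t ram, std::uint32_t bss,
                                            std::uint32_t main_stack) {
  return ram - bss - main_stack;
}

// Largest heap size that still fits: the current heap plus the headroom.
constexpr std::uint32_t max_heap_bytes(std::uint32_t ram, std::uint32_t bss,
                                       std::uint32_t heap, std::uint32_t main_stack) {
  return heap + heap_headroom_bytes(ram, bss, main_stack);
}

// Values from the example in the text.
static_assert(other_static_bytes(62304, 50000) == 12304, "static variables");
static_assert(heap_headroom_bytes(128000, 62304, 2000) == 63696, "headroom");
```

With these inputs the heap could grow from 50000 to at most 113696 bytes, provided nothing else in the project needs that RAM.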

The advantages of each approach to memory allocation are described here, and a detailed comparison of the memory managers is given here.

As for my opinion, at this stage of development I see no reason to use static memory allocation, so it is turned off in the config above. Here is why:

Why allocate the stack buffer and the StaticTask_t structure yourself when the operating system ships as many as 5 memory managers for every taste, which will work out for themselves where and how to create things, and will even let you know if they fail? In particular, heap_1.c is more than enough for most microcontroller programs.
You may need some third-party library that is written very efficiently and thoroughly but uses malloc(), calloc() or new[] internally. What then? Give it up in favor of a less efficient one (if there is even a choice)? Or you can simply use dynamic memory allocation with heap_2.c or heap_4.c. The only thing you need to do is redefine the corresponding functions so that memory is allocated by FreeRTOS from the heap given to it:
```
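#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include "FreeRTOS.h"

// C library entry points: C linkage is required so that C code links against them.
extern "C" void* malloc(size_t size) {
    return pvPortMalloc(size);
}
extern "C" void* calloc(size_t num, size_t size) {
    // calloc must reject an overflowing num * size and return zeroed memory
    if (size != 0 && num > SIZE_MAX / size) {
        return nullptr;
    }
    void* ptr = pvPortMalloc(num * size);
    if (ptr != nullptr) {
        memset(ptr, 0, num * size);
    }
    return ptr;
}
extern "C" void free(void* ptr) {
    vPortFree(ptr);
}
void* operator new(size_t sz) {
    return pvPortMalloc(sz);
}
void* operator new[](size_t sz) {
    return pvPortMalloc(sz);
}
void operator delete(void* p) {
    vPortFree(p);
}
void operator delete[](void* p) {
    vPortFree(p);
}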
```

In my projects, I use only dynamic memory allocation with heap_4.c, give the OS heap as much memory as possible, and always redefine malloc(), calloc(), new and the rest, regardless of whether they are currently used or not.

While advocating dynamic memory allocation, I of course do not deny that there are tasks for which static allocation is the ideal solution (this, by the way, can also be discussed in the comments).

A tale about the configMINIMAL_STACK_SIZE parameter

The value of the configMINIMAL_STACK_SIZE parameter is measured NOT in bytes but in words! Moreover, the word size differs from one OS port to another and is defined in the portmacro.h file by portSTACK_TYPE. For example, in my case the word size is 4 bytes. So the fact that configMINIMAL_STACK_SIZE is 128 in my configuration means that the minimum task stack size is 512 bytes.
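
The conversion can be sketched in a couple of lines. StackWord below is a stand-in for the port's stack type and is assumed to be 4 bytes wide, as in my case; the helper names are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for StackType_t / portSTACK_TYPE on a 32-bit port (assumption).
using StackWord = std::uint32_t;

// Stack depth in words -> size in bytes.
constexpr std::size_t stack_bytes(std::size_t depth_words) {
  return depth_words * sizeof(StackWord);
}

// Size in bytes -> stack depth in words, rounded up to a whole word.
constexpr std::size_t stack_words(std::size_t bytes) {
  return (bytes + sizeof(StackWord) - 1) / sizeof(StackWord);
}

static_assert(stack_bytes(128) == 512, "128 words are 512 bytes on this port");
```

The second helper is handy when a byte budget is known in advance and the stack depth passed at task creation has to be derived from it.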

That's all.

Resource usage monitoring

It would be great to have answers to questions such as:

Have I chosen an adequate stack size for the task? Is it too much? Or maybe too little?
And how much processor time does my task take?
And how much of the heap allocated to the OS is actually used? Is the program already at the limit, or is there still room to grow?

In this section I will give an example of simple resource monitoring that helps to get unambiguous answers to all of the questions above.

FreeRTOS has a toolkit for collecting resource usage statistics on the fly; it is controlled by the following parameters:

#define configGENERATE_RUN_TIME_STATS           0
#define configUSE_TRACE_FACILITY                0
#define configUSE_STATS_FORMATTING_FUNCTIONS    0

I will explain the meaning of each parameter a bit later, but first we need to create a task, let's call it MonitorTask, whose infinite loop collects statistics every config::MonitorTask::SLEEP_TIME_MS milliseconds and sends them to the terminal.

After the task is created, we need to set the configUSE_TRACE_FACILITY parameter to 1, which makes the following function available:

UBaseType_t uxTaskGetSystemState(TaskStatus_t* const pxTaskStatusArray, const UBaseType_t uxArraySize, uint32_t* const pulTotalRunTime)

The pxTaskStatusArray parameter must point to a buffer of at least sizeof(TaskStatus_t) * uxTaskGetNumberOfTasks() bytes, i.e. it must be large enough to hold information about all existing tasks.

By the way, about the TaskStatus_t structure. What information can we get about each task? Here it is:

TaskStatus_t

typedef struct xTASK_STATUS
{
    /* The handle of the task to which the rest of the information in the
    structure relates. */
    TaskHandle_t xHandle;

    /* A pointer to the task's name. This value will be invalid if the task was
    deleted since the structure was populated! */
    const signed char *pcTaskName;

    /* A number unique to the task. */
    UBaseType_t xTaskNumber;

    /* The state in which the task existed when the structure was populated. */
    eTaskState eCurrentState;

    /* The priority at which the task was running (may be inherited) when the
    structure was populated. */
    UBaseType_t uxCurrentPriority;

    /* The priority to which the task will return if the task's current priority
    has been inherited to avoid unbounded priority inversion when obtaining a
    mutex. Only valid if configUSE_MUTEXES is defined as 1 in
    FreeRTOSConfig.h. */
    UBaseType_t uxBasePriority;

    /* The total run time allocated to the task so far, as defined by the run
    time stats clock. Only valid when configGENERATE_RUN_TIME_STATS is
    defined as 1 in FreeRTOSConfig.h. */
    unsigned long ulRunTimeCounter;

    /* Points to the lowest address of the task's stack area. */
    StackType_t *pxStackBase;

    /* The minimum amount of stack space that has remained for the task since
    the task was created. The closer this value is to zero the closer the task
    has come to overflowing its stack. */
    unsigned short usStackHighWaterMark;
} TaskStatus_t;

So, now we are ready to write the infinite loop of the MonitorTask task. It may look, for example, like this:

MonitorTask function

TickType_t delay = rtos::Ticks::MsToTicks(config::MonitorTask::SLEEP_TIME_MS);
while(true)
{
  UBaseType_t task_count = uxTaskGetNumberOfTasks();
  // Skip the report if the fixed-size buffer cannot hold every task
  if (task_count <= config::MonitorTask::MAX_TASKS_MONITOR)
  {
    uint32_t _total_runtime = 0;
    TaskStatus_t _buffer[config::MonitorTask::MAX_TASKS_MONITOR];
    // Returns the number of entries actually written to _buffer
    task_count = uxTaskGetSystemState(_buffer, task_count, &_total_runtime);
    for (UBaseType_t task = 0; task < task_count; task++)
    {
      _logger.add_str(DEBG, "[DEBG] %20s: %c, %u, %6u, %u",
                            _buffer[task].pcTaskName,
                            _task_state_to_char(_buffer[task].eCurrentState),
                            _buffer[task].uxCurrentPriority,
                            _buffer[task].usStackHighWaterMark,
                            _buffer[task].ulRunTimeCounter);
    }
    _logger.add_str(DEBG, "[DEBG] Current Heap Free Size: %u",
                          xPortGetFreeHeapSize());
    _logger.add_str(DEBG, "[DEBG] Minimal Heap Free Size: %u",
                          xPortGetMinimumEverFreeHeapSize());
    _logger.add_str(DEBG, "[DEBG] Total RunTime:  %u ms", _total_runtime);
    _logger.add_str(DEBG, "[DEBG] System Uptime:  %u ms\r\n",
                          xTaskGetTickCount() * portTICK_PERIOD_MS);
  }
  rtos::Thread::Delay(delay);
}
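
The _task_state_to_char helper used in the loop is not shown in the article. A minimal sketch of what it might look like follows; the TaskState enum is a stand-in that mirrors the order of the kernel's eTaskState values so that the snippet compiles without FreeRTOS headers, the letters B, R, S and D are the ones used in the log, and 'X' for a running task and '?' for anything else are my own assumptions.

```cpp
// Stand-in for eTaskState (assumption: same set and order of states as in the kernel).
enum class TaskState { Running, Ready, Blocked, Suspended, Deleted, Invalid };

// Maps a task state to the one-letter code printed in the log.
inline char task_state_to_char(TaskState state) {
  switch (state) {
    case TaskState::Running:   return 'X';  // assumption: not listed in the article
    case TaskState::Ready:     return 'R';
    case TaskState::Blocked:   return 'B';
    case TaskState::Suspended: return 'S';
    case TaskState::Deleted:   return 'D';
    default:                   return '?';  // Invalid or unknown state
  }
}
```

In the real task the function would take eTaskState directly; MonitorTask itself is always reported as running, because it is the task that takes the snapshot.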

Suppose that, besides MonitorTask, my program has several more tasks with the following parameters, where configMINIMAL_STACK_SIZE = 128:

TasksConfig.h

static constexpr uint32_t MIN_TASK_STACK_SIZE   = configMINIMAL_STACK_SIZE;
static constexpr uint32_t MIN_TASK_PRIORITY     = 1;
// Valid priorities are 0 .. configMAX_PRIORITIES - 1
static constexpr uint32_t MAX_TASK_PRIORITY     = configMAX_PRIORITIES - 1;
struct LoggerTask {
    static constexpr uint32_t STACK_SIZE        = MIN_TASK_STACK_SIZE * 2;
    static constexpr const char NAME[]          = "Logger Task";
    static constexpr uint32_t PRIORITY          = MIN_TASK_PRIORITY;
    static constexpr uint32_t SLEEP_TIME_MS     = 100;
};
struct MonitorTask {
    static constexpr uint32_t STACK_SIZE        = MIN_TASK_STACK_SIZE * 3;
    static constexpr const char NAME[]          = "Monitor Task";
    static constexpr uint32_t PRIORITY          = MIN_TASK_PRIORITY;
    static constexpr uint32_t SLEEP_TIME_MS     = 1000;
    static constexpr uint32_t MAX_TASKS_MONITOR = 10;
};
struct GreenLedTask {
    static constexpr uint32_t STACK_SIZE        = MIN_TASK_STACK_SIZE * 2;
    static constexpr const char NAME[]          = "Green Led Task";
    static constexpr uint32_t PRIORITY          = MIN_TASK_PRIORITY;
    static constexpr uint32_t SLEEP_TIME_MS     = 1000;
};
struct RedLedTask {
    static constexpr uint32_t STACK_SIZE        = MIN_TASK_STACK_SIZE * 2;
    static constexpr const char NAME[]          = "Red Led Task";
    static constexpr uint32_t PRIORITY          = MIN_TASK_PRIORITY;
    static constexpr uint32_t SLEEP_TIME_MS     = 1000;
};
struct YellowLedTask {
    static constexpr uint32_t STACK_SIZE        = MIN_TASK_STACK_SIZE * 2;
    static constexpr const char NAME[]          = "Yellow Led Task";
    static constexpr uint32_t PRIORITY          = MIN_TASK_PRIORITY;
    static constexpr uint32_t SLEEP_TIME_MS     = 1000;
};

Then, after starting the program, I will see the following information in the terminal:

Wow, this is already good! So let's figure out what we see in this log.

We see the names of all existing tasks. Besides the tasks described in the TasksConfig.h file, we also see the IDLE task, which is created automatically when the RTOS scheduler starts (its purpose is described here).
We see the state of each task, where B = Blocked, R = Ready, S = Suspended, D = Deleted.
We see the priority of each task.
We see the minimum amount of free stack space since the task was created. Here it becomes obvious that we allocated too much stack for most tasks. For example, a stack of 256 words was allocated for LoggerTask, but in reality it uses only 40 words. So a stack of 64 words is enough for the task to work. This is where optimization starts.
We see the current and the minimum (since the scheduler started) free space on the heap. In our simple example these values are equal, but in more complex programs they will, of course, differ. Thus we learn that out of the 100KB allocated to FreeRTOS it uses less than 10KB, so more than 90KB of memory is still free.
And finally, we see the time elapsed since the scheduler started, in milliseconds.
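
To make the stack figure less magical, here is a small host-side simulation of the idea behind the high-water mark: the stack is pre-filled with a known byte (the kernel uses 0xA5), and the free space is found by counting how many bytes at the far end still hold that value. FakeStack and its methods are invented for this sketch, and a downward-growing stack with 4-byte words is assumed; it is an illustration of the principle, not the kernel's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint8_t kFillByte = 0xA5;  // fill pattern, same value the kernel uses
constexpr std::size_t kWordBytes = 4;     // assumption: 4-byte stack words

// Simulated task stack that grows downwards from the highest index.
class FakeStack {
public:
  explicit FakeStack(std::size_t size_bytes) : mem_(size_bytes, kFillByte) {}

  // Pretend the task reached a depth of used_bytes, overwriting the fill pattern.
  void use(std::size_t used_bytes) {
    for (std::size_t i = 0; i < used_bytes && i < mem_.size(); ++i) {
      mem_[mem_.size() - 1 - i] = 0x00;
    }
  }

  // Count untouched bytes from the low end: the least free space ever seen.
  std::size_t high_water_mark_words() const {
    std::size_t untouched = 0;
    while (untouched < mem_.size() && mem_[untouched] == kFillByte) {
      ++untouched;
    }
    return untouched / kWordBytes;
  }

private:
  std::vector<std::uint8_t> mem_;
};
```

For a 256-word stack (1024 bytes) on which the task has at most used 40 words (160 bytes), high_water_mark_words() reports 216, and the number never grows back after the task returns to a shallower depth.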

Applying the acquired knowledge to the TasksConfig.h file and lowering the value of the configMINIMAL_STACK_SIZE parameter from 128 to 64, we get the following picture:

Super! Now every task has a sensible reserve of free stack space: not too large and not too small. In addition, we have freed up almost 3KB of memory.
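
The amount of freed memory can be checked with a bit of arithmetic. The sketch below sums the stack sizes of the five application tasks using the multipliers from TasksConfig.h and, optionally, the idle task, whose stack is configMINIMAL_STACK_SIZE words; a 4-byte word is assumed, and the helper names are made up for this illustration.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kStackWordBytes = 4;  // assumption: 4-byte stack words

// Stack multipliers of Logger, Monitor, Green, Red and Yellow tasks.
constexpr std::array<unsigned, 5> kAppTaskMultipliers = {{2, 3, 2, 2, 2}};

// Total stack memory in bytes for a given minimal stack size (in words).
inline std::size_t total_stack_bytes(std::size_t min_stack_words, bool include_idle) {
  std::size_t words = include_idle ? min_stack_words : 0;
  for (unsigned multiplier : kAppTaskMultipliers) {
    words += multiplier * min_stack_words;
  }
  return words * kStackWordBytes;
}

// Bytes saved by lowering the minimal stack size from old_min to new_min words.
inline std::size_t stack_bytes_saved(std::size_t old_min, std::size_t new_min,
                                     bool include_idle) {
  return total_stack_bytes(old_min, include_idle) -
         total_stack_bytes(new_min, include_idle);
}
```

Going from 128 to 64 words saves 2816 bytes on the application tasks alone, and 3072 bytes if the idle task is counted as well.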

And now it is time to talk about what we do not yet see in the log. We do not see how much processor time each task uses, i.e. how long the task was in the Running state. To find out, we need to set the configGENERATE_RUN_TIME_STATS parameter to 1 and add the following definitions to the FreeRTOSConfig.h file:

#if configGENERATE_RUN_TIME_STATS == 1
void vConfigureTimerForRunTimeStats(void);
unsigned long vGetTimerForRunTimeStats(void);
#define portCONFIGURE_TIMER_FOR_RUN_TIME_STATS()    vConfigureTimerForRunTimeStats()
#define portGET_RUN_TIME_COUNTER_VALUE()            vGetTimerForRunTimeStats()
#endif

Now we need to start an external timer that counts time (preferably in microseconds, because some tasks may run for less than a millisecond, and we still want to know about all of them). We add the declarations of two static functions to the MonitorTask.h file:

static void config_timer();
static unsigned long get_counter_value();

In the MonitorTask.cpp file, write their implementation:

void MonitorTask::config_timer()
{
  _timer->disable_counter();
  _timer->set_counter_direction(cm3cpp::tim::Timer::CounterDirection::UP);
  _timer->set_alignment(cm3cpp::tim::Timer::Alignment::EDGE);
  _timer->set_clock_division(cm3cpp::tim::Timer::ClockDivision::TIMER_CLOCK_MUL_1);
  _timer->set_prescaler_value(hw::config::MONITOR_TIMER_PRESQ);
  _timer->set_autoreload_value(hw::config::MONITOR_AUTORELOAD);
  _timer->enable_counter();
  _timer->set_counter_value(0);
}
unsigned long MonitorTask::get_counter_value()
{
  static unsigned long _counter = 0;
  _counter += _timer->get_counter_value();
  _timer->set_counter_value(0);
  return (_counter);
}
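
The implementation above resets the hardware counter on every read, so any ticks that arrive between the read and the reset are lost. As an alternative (this is a different technique, not the code used in the demo project), the counter can be left free-running and the elapsed time accumulated from the difference between two readings, which also survives wrap-around. The sketch assumes a 16-bit timer; RunTimeAccumulator is a name invented for the example.

```cpp
#include <cstdint>

// Accumulates a free-running 16-bit hardware counter into a 32-bit total.
class RunTimeAccumulator {
public:
  // 'now' is the current raw counter value; returns the accumulated total.
  std::uint32_t update(std::uint16_t now) {
    // Unsigned 16-bit subtraction gives the right delta even after wrap-around,
    // as long as update() is called at least once per counter period.
    const std::uint16_t delta = static_cast<std::uint16_t>(now - last_);
    last_ = now;
    total_ += delta;
    return total_;
  }

private:
  std::uint16_t last_ = 0;
  std::uint32_t total_ = 0;
};
```

For example, readings of 1000, 65000 and then 200 (after the counter wrapped) accumulate to 1000, 65000 and 65736 ticks.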

And in the main.cpp file we implement the functions vConfigureTimerForRunTimeStats() and vGetTimerForRunTimeStats(), which we declared in FreeRTOSConfig.h:

#if configGENERATE_RUN_TIME_STATS == 1
void vConfigureTimerForRunTimeStats(void)
{
    tasks::MonitorTask::config_timer();
}
unsigned long vGetTimerForRunTimeStats(void)
{
    return (tasks::MonitorTask::get_counter_value());
}
#endif

Now, after starting the program, our log will look like this:

Comparing the values of Total RunTime and System Uptime, we can conclude that our program is busy with tasks only a third of the time, with 98% of that time spent in IDLE and 2% in all other tasks. What does our program do for the remaining two-thirds of the time? Judging by these numbers, this time goes to the scheduler and to switching between tasks. Sad but true. Of course, there are ways to optimize this time, but that is a topic for the next article.
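
The percentages mentioned above are simply each task's ulRunTimeCounter divided by the total run time returned by uxTaskGetSystemState. A minimal sketch of that calculation follows; cpu_percent is a name made up for the example, and the result is truncated to whole percent.

```cpp
#include <cstdint>

// Share of the total run time consumed by one task, in whole percent.
inline std::uint32_t cpu_percent(std::uint32_t task_counter,
                                 std::uint32_t total_runtime) {
  if (total_runtime == 0) {
    return 0;  // statistics not collected yet, avoid division by zero
  }
  // Widen before multiplying so that large counters do not overflow.
  return static_cast<std::uint32_t>(
      (static_cast<std::uint64_t>(task_counter) * 100u) / total_runtime);
}
```

A task that ran very briefly shows up as 0%, so for short tasks it can be more informative to print the raw counter next to the percentage.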

As for the configUSE_STATS_FORMATTING_FUNCTIONS parameter, it is quite secondary and is mostly used in the demos provided by the FreeRTOS developers. Its point is that it enables two functions:

void vTaskList(char* pcWriteBuffer);
void vTaskGetRunTimeStats(char* pcWriteBuffer);

Both of these functions are NOT considered part of the FreeRTOS kernel. Internally they call the same uxTaskGetSystemState function that we used above and write already formatted data into pcWriteBuffer. The developers themselves do not recommend using these functions (but, of course, do not forbid it), pointing out that they are intended mainly for demonstration and that it is better to call uxTaskGetSystemState directly, as we did.

That's all. As always, I hope this article was useful and informative :)

To build and debug the demo project described in the article, we used the Eclipse + GNU MCU Eclipse (formerly GNU ARM Eclipse) + OpenOCD bundle.

Third Pin Company Blog
