
STM32 and FreeRTOS. 1. Fun with threads
- Tutorial
This series of 5 articles is for those who have outgrown the usual ATtiny chips and Arduinos, but whose attempts to switch to more powerful controllers either failed or brought less pleasure than they could have. I have said everything below many times at the “crash courses” for our studio's programmers (who often admitted that switching from ATtiny to STM32 opens up so many possibilities that you freeze, not knowing what to grab first), so I dare to hope it will be useful to everyone. I assume the reader is a curious person who has managed to find and install Keil and STM32Cube and to click the OK buttons. For practice I use the STM32F3DISCOVERY evaluation board, because it is cheap and has a powerful processor and a bunch of LEDs.
Each article is meant to be “repeated” and “digested” in roughly one evening, because there is also home, family and rest ...

Very often (who am I kidding, almost always) microcontrollers are used where several parameters have to be monitored at once. Or, the other way round, several devices have to be controlled simultaneously.
Here is an example problem: we have 4 outputs, each of which must emit pulses of a different duration with different pauses. All we have is a system timer that counts milliseconds.
Let's complicate the task in the spirit of "torturing yourself on Arduino": the timers are busy with something else, and PWM is not suitable, because it does not work on every pin and usually cannot be driven into the required modes. After a little thought, we sit down and write something like this:
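// initialization
int time1on=500;  // how long output 1 must stay on, ms
int time1off=250; // how long output 1 must stay off, ms
unsigned long now=millis(); // millis() returns unsigned long
....
// somewhere in the loop
if(millis()<now+time1on)
{
  digitalWrite(1, HIGH); // pulse (the pin number is illustrative)
}
else
{
  digitalWrite(1, LOW); // pause
  if(millis()>now+time1on+time1off)
  {
    now=millis(); // start the next period
  }
}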
And so on, in much the same way, for all 4 ports. The result is a decent wall of code several screens long, but this wall works, and works quite fast, which matters on a microcontroller.
Then the programmer suddenly notices that the port is rewritten on every pass of the loop, even when its state does not change. The whole wall gets reworked. Then the number of ports with the same requirements doubles. The programmer spits and rewrites everything into a single function along the lines of PortBlink(int port_num).
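To make the "touch the port only when the state changes" idea concrete, here is a minimal sketch of what such a PortBlink-style helper might look like. It is plain C with the millisecond count passed in as a parameter, so it can be tried on a PC. The Blinker structure, blinker_update and the writes counter are illustrative names of my own, not code from the original project; on real hardware the marked line would be the actual pin write.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-output state, one instance per port. */
typedef struct {
    uint32_t on_ms;        /* pulse length */
    uint32_t off_ms;       /* pause length */
    uint32_t period_start; /* time at which the current period began */
    bool     level;        /* last level actually written */
    uint32_t writes;       /* how many times the output really changed */
} Blinker;

/* Call on every pass of the main loop with the current millisecond count.
   Returns the level the output has after this call. */
static bool blinker_update(Blinker *b, uint32_t now_ms)
{
    uint32_t period = b->on_ms + b->off_ms;
    if (period == 0) {
        return b->level; /* nothing sensible to do with an empty period */
    }
    /* Unsigned subtraction stays correct when the counter wraps around. */
    while (now_ms - b->period_start >= period) {
        b->period_start += period;
    }
    bool wanted = (now_ms - b->period_start) < b->on_ms;
    if (wanted != b->level) {
        b->level = wanted; /* on real hardware: write the pin here */
        b->writes++;
    }
    return b->level;
}
```

With an array of such structures the same function serves 4 or 8 outputs, and the pin is written only at the edges of each pulse rather than on every pass.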
Happiness was almost here, but suddenly one of the ports needs, along with the output control, to read something first and drive the port based on what was read. The programmer swears again and writes yet another function, specially for that port.
Happiness? Not quite. The customer has hooked something up to it, and that something can easily stall the whole process for seconds ... The groaning begins: the programmers patch the code yet again, finally turning it into unreadable garbage, the managers quote the customer wild prices for the added functionality, and the customer swears and decides never to deal with embedded solutions again.
(a bit of advertising and self-praise) And why? Because the wrong decision about the platform was made at the very start. Whenever possible, we propose a powerful platform even for primitive tasks. In our experience, the cost of development and support then turns out much lower. So now, to control 8 outputs, I will take an STM32F3, which can run at 72 MHz. (whispering) Actually, I just happen to have a demo board with it at hand :) There was also one with an L1, but we accidentally used it up in one of the projects.
Open STM32Cube, select the board, tick the checkbox next to FreeRTOS and generate the project as usual. We do not need anything special, so we leave everything at the defaults.
What is FreeRTOS? It is an almost-real-time operating system for microcontrollers. That is, everything you have heard about operating systems, such as multitasking, semaphores and mutexes, is there. Why FreeRTOS? Simply because STM32Cube supports it ;-). There are plenty of similar systems, ChibiOS for example. In essence they are all the same and differ only in their commands and the format of those commands. I am not going to rewrite the mountain of books and manuals on working with operating systems here; I will just go in broad strokes over the most interesting things that greatly help programmers in their hard work.
Okay, I will assume you have read up on it on the Internet and got the idea. Let's see what has changed.
Somewhere at the beginning of main.c:
static void StartThread(void const * argument);
and after all the initializations:
/* Create Start thread */
osThreadDef(USER_Thread, StartThread, osPriorityNormal, 0, configMINIMAL_STACK_SIZE);
osThreadCreate (osThread(USER_Thread), NULL);
/* Start scheduler */
osKernelStart(NULL, NULL);
Plus an empty StartThread with a single infinite loop and osDelay(1);
Surprised? Yet what you see is almost 90% of the functionality you will ever use. The first two lines create a thread with normal priority, and the last line starts the task scheduler. And all this splendor fits into 6 kilobytes of flash.
But we need to check that it works. Replace osDelay with the following code:
HAL_GPIO_WritePin(GPIOE,GPIO_PIN_8,GPIO_PIN_RESET);
osDelay(500);
HAL_GPIO_WritePin(GPIOE,GPIO_PIN_8,GPIO_PIN_SET);
osDelay(500);
Compile and flash. If everything was done correctly, the blue LED should blink (on the STM32F3DISCOVERY a bunch of LEDs are soldered to PE8-PE15, so if you have a different board, change the code).
Now let's replicate the resulting function for each LED.
static void PE8Thread(void const * argument);
static void PE9Thread(void const * argument);
static void PE10Thread(void const * argument);
static void PE11Thread(void const * argument);
static void PE12Thread(void const * argument);
static void PE13Thread(void const * argument);
static void PE14Thread(void const * argument);
static void PE15Thread(void const * argument);
Add a thread for each LED:
osThreadDef(PE8_Thread, PE8Thread, osPriorityNormal, 0, configMINIMAL_STACK_SIZE);
osThreadCreate (osThread(PE8_Thread), NULL);
And move in the code that blinks the LED:
static void PE8Thread(void const * argument)
{
  for(;;)
  {
    HAL_GPIO_WritePin(GPIOE,GPIO_PIN_8,GPIO_PIN_RESET);
    osDelay(500);
    HAL_GPIO_WritePin(GPIOE,GPIO_PIN_8,GPIO_PIN_SET);
    osDelay(500);
  }
}
The other threads are all the same.
Compile, flash ... and get nothing. Absolutely nothing. Not a single LED blinks.
Through painful debugging by commenting things out, we find that with 3 threads everything works, and with 4 nothing works any more. What is the problem? The problem is the memory allocated for the scheduler and the stacks.
We look in FreeRTOSConfig.h:
#define configMINIMAL_STACK_SIZE ((unsigned short)128)
#define configTOTAL_HEAP_SIZE ((size_t)3000)
3000 bytes for everything, and each task gets 128 bytes. Plus, information about the task and other useful things has to be stored somewhere. That is why, if nothing is done, the scheduler does not even start when there is not enough memory.
Judging by the FAQ, with full optimization enabled FreeRTOS itself takes 250 bytes. Plus, for each task, 128 bytes for the stack, 64 for the internal structures and 16 for the task name. Let's count: 250 + 3 * (128 + 64 + 16) = 874. That does not even reach a kilobyte. And we have 3 ...
So what is the problem? The version of FreeRTOS shipped with STM32Cube is too old (7.6.0) to have vTaskInfo, so I take a detour:
Before and after creating the thread I put the following (fre is an ordinary size_t):
fre=xPortGetFreeHeapSize();
We set breakpoints and get the following numbers: before the task was created there were 2376 free bytes, and after it 1768. That is, one task costs 608 bytes. We check again and get the sequence 2992-2376-1768-1160. The step is the same, give or take a few bytes. By simple logic we conclude that the numbers in the FAQ were taken for some bare-bones processor, with optimizations turned on and all sorts of modules turned off. On top of that, FreeRTOS counts stack depth in words rather than bytes, so on the 32-bit Cortex-M the 128 from configMINIMAL_STACK_SIZE is already 512 bytes of stack. Looking further, we find that starting the scheduler eats about 580 more bytes.
So for the calculations we take 610 bytes per task with the minimum stack and another 580 bytes for the OS itself. In total, configTOTAL_HEAP_SIZE must be 610 * 9 + 580 = 6070. Round it up and give it 6100 bytes, let it feast.
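The arithmetic above fits into a couple of helper functions. This is only a sketch for redoing the estimate when the number of threads changes: per_task_cost and heap_budget are hypothetical names, and the inputs are the readings quoted in the text for this particular build, not universal constants.

```c
#include <stddef.h>

/* Heap consumed by one task, taken as the drop between two
   free-heap readings made before and after creating it. */
static size_t per_task_cost(size_t free_before, size_t free_after)
{
    return free_before - free_after;
}

/* Heap needed for `tasks` threads plus the fixed cost
   of starting the scheduler. */
static size_t heap_budget(size_t tasks, size_t per_task, size_t kernel_overhead)
{
    return tasks * per_task + kernel_overhead;
}
```

For example, heap_budget(9, 610, 580) reproduces the 6070 bytes from the text, while heap_budget(3, 128 + 64 + 16, 250) gives the optimistic 874 bytes from the FAQ-style estimate.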
We compile, flash and watch all the LEDs blink at once. We try reducing the heap to 6050 and again nothing works. So we calculated correctly :)
Now you can play around and set your own “pulse” and “pause” intervals for each LED. In principle, if you update FreeRTOS or tinker with the code, it is easy to get a resolution of 0.01 ms (by default, 1 tick is 1 ms).
Agree that working with 8 separate tasks is much more pleasant than handling all 8 in a single one? In real projects we usually run 30-40 threads. How many programmers would have died if all that processing were crammed into one function, I am afraid even to count :)
Next we need to deal with priorities. As in real life, some tasks are “more equal” than others and need more resources. To begin with, we replace one blinker with a blinker done wrong, where the pause is produced not by the OS but by a plain loop.
That is, instead of osDelay(), this horror is inserted:
volatile unsigned long c = 0; // volatile keeps the compiler from optimizing the loop away
for(int i=0;i<1000000;i++)
{
  c++;
}
The number of iterations is usually chosen experimentally (because if there are several such delays, calculating them guarantees a pile of headaches). Aesthetes can calculate the instruction execution times.
Replace, compile, run. The LEDs blink as before, but somehow sluggishly. A look with an oscilloscope shows that instead of crisp edges (say, 50 ms on and 50 ms off) the edges have started to drift by 1-2 ms (the eye, oddly enough, notices this). Why? Because FreeRTOS is not a strict real-time system and can afford such liberties.
Now let's raise the priority of this task by one step, to osPriorityAboveNormal. Run it and see a single lonely LED blinking. Why?
Because the scheduler distributes processor time according to priority. What does it see? That a high-priority task constantly demands the processor. And the result? The other tasks get no time to run.
Now lower the priority to one step below normal, to osPriorityBelowNormal. As a result the scheduler, having served the normal tasks, gives the remaining resources to the “bad” one.
From this the programmer's first rule follows easily: if a function has nothing to do, hand control back to the scheduler.
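The starvation effect can be reproduced on a PC with a deliberately crude model. The sketch below is not how FreeRTOS is implemented; it is a toy, tick-granular scheduler that on every tick simply runs the highest-priority task that is ready, taking equal priorities in turn. ToyTask, toy_run and all field names are made up for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      priority;     /* larger number = more important */
    bool     never_blocks; /* true for the busy-loop task */
    uint32_t sleep_ticks;  /* how long a polite task sleeps after each run */
    uint32_t wake_at;      /* tick at which the task is ready again */
    uint32_t ran;          /* ticks of CPU time received so far */
} ToyTask;

/* Runs the toy scheduler for total_ticks ticks. */
static void toy_run(ToyTask *tasks, size_t count, uint32_t total_ticks)
{
    if (count == 0) {
        return;
    }
    size_t last = count - 1; /* so the first search starts at index 0 */
    for (uint32_t now = 0; now < total_ticks; now++) {
        size_t chosen = count; /* "count" means nobody is ready */
        for (size_t step = 1; step <= count; step++) {
            size_t i = (last + step) % count;
            if (tasks[i].wake_at > now) {
                continue; /* still sleeping */
            }
            if (chosen == count || tasks[i].priority > tasks[chosen].priority) {
                chosen = i;
            }
        }
        if (chosen == count) {
            continue; /* idle tick */
        }
        tasks[chosen].ran++;
        if (!tasks[chosen].never_blocks) {
            /* A polite task does one tick of work and then sleeps. */
            tasks[chosen].wake_at = now + 1 + tasks[chosen].sleep_ticks;
        }
        last = chosen;
    }
}
```

Put a never-blocking task one priority above a polite blinker and the blinker gets zero ticks; put it one priority below and the blinker runs exactly when it wants to, while the busy task collects whatever is left over.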
FreeRTOS has two “wait” options.
The first is “just wait N ticks”. An ordinary pause with no frills: we wait exactly as long as we were told. This is vTaskDelay (osDelay is essentially a synonym for it). If you look at the timing at run time, it will be something like this (assume the useful work takes 24 ms):
... [0ms] - control transfer - work [24ms] pause of 100ms [124ms] - control transfer - work [148ms] pause of 100ms [248ms] ...
It is easy to see that, because of the time the work itself takes, control is not handed over every 100 ms, as one might initially assume. For such cases there is vTaskDelayUntil. With it the timeline looks like this:
... [0ms] - control transfer - work [24ms] pause of 76ms [100ms] - control transfer - work [124ms] pause of 76ms [200ms] ...
As you can see, the task receives control at strictly defined intervals, which is what we wanted. To check the accuracy of the scheduler, I asked for a 1 ms pause in one of the threads. In the picture you can judge the accuracy when working with 9 threads (let's not forget StartThread).
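The two timelines above can be recomputed with a few lines of plain C. This is a PC-side simulation of the timing arithmetic only, not FreeRTOS code: simulate_relative and simulate_periodic are hypothetical helpers that record the moments at which the task would receive control. The real vTaskDelayUntil works the same way in spirit, taking a pointer to the previous wake time and the period.

```c
#include <stddef.h>
#include <stdint.h>

/* vTaskDelay-style: the pause is counted from the moment the work ends. */
static void simulate_relative(uint32_t *starts, size_t n,
                              uint32_t work_ms, uint32_t delay_ms)
{
    uint32_t now = 0;
    for (size_t i = 0; i < n; i++) {
        starts[i] = now; /* the task receives control */
        now += work_ms;  /* useful work */
        now += delay_ms; /* full pause, regardless of the work time */
    }
}

/* vTaskDelayUntil-style: the pause ends a fixed period
   after the previous wake-up. */
static void simulate_periodic(uint32_t *starts, size_t n,
                              uint32_t work_ms, uint32_t period_ms)
{
    uint32_t now = 0;
    uint32_t last_wake = 0;
    for (size_t i = 0; i < n; i++) {
        starts[i] = now;        /* the task receives control */
        now += work_ms;         /* useful work */
        last_wake += period_ms; /* next wake-up target */
        if (now < last_wake) {
            now = last_wake;    /* sleep only for the remainder */
        }
    }
}
```

With 24 ms of work and a 100 ms interval, the first variant starts at 0, 124 and 248 ms and keeps drifting, while the second stays on the 0, 100, 200 ms grid as long as the work fits inside the period.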

This is where I usually stop, because people get so absorbed in playing with priorities and finding out “when it breaks” that it is easier to shut up and let them have fun.
The fully assembled project with all the source code is available at kaloshin.ru/stm32/freertos/stage1.rar
To be continued in Part 2. About semaphores