Two approaches to software design for embedded systems
I want to talk a little about two approaches to software design in embedded systems: using a supercycle (superloop), or using an RTOS (Real-Time Operating System).
I think that in the course of the story it will also become clear in which cases you can get by with the first approach, and in which you cannot do without the second.
I hope it will be interesting to everyone who wants to peek into the world of embedded development. For those who have already cut their teeth on embedded, there will most likely be nothing new here.
Just a little theory (for those taking the very first steps).
We have a microcontroller, which is essentially a processor, a bit of memory, and various peripherals, for example: analog-to-digital converters, timers, Ethernet, USB, SPI; the exact set depends heavily on the controller and the tasks being solved.
You can connect a sensor to an ADC input, for example a temperature sensor that, when powered, converts temperature into a voltage measured by that ADC.
And to a controller output, called GPIO (General-Purpose Input/Output), you can connect, say, an LED (or something more powerful like a motor, though then through a driver stage).
Via SPI, RS232, USB, etc., the controller can communicate with the outside world in more complex ways, receiving and sending messages using a predetermined protocol.
In 90% of cases the software is written in C; sometimes C++ or assembler is used. More and more often, though, it is possible to write in something higher level, as long as it does not involve direct work with peripherals and maximum speed is not required.
To better picture what you have to deal with, here is a typical environment: the controller's flash (the analogue of a hard disk) is 16-256 kilobytes, and the RAM is 64-256 kilobytes! And in such an environment it is quite possible to run not only an application, but also a real-time operating system with full multitasking support!
The examples below are in pseudo-code, often very close to C, with implementation details omitted where they are not essential for understanding.
So, the “supercycle” approach.
The program in this approach looks simple:
int main()
{
    while (1)
    {
        doSomething();
        doSomethingElse();
        doSomethingMore();
    }
}
An infinite loop in which the controller sequentially does everything it has to do.
The most interesting part of embedded systems, of course, is working with peripherals (those same ADCs, SPI, GPIO, etc.). The controller can work with peripherals in two ways: by polling or by using interrupts. In the first case, if we want, for example, to read a character from the RS232 console, we periodically check whether a character has arrived until we get one. In the second case, we configure the RS232 controller so that it generates an interrupt the moment a new character arrives.
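As an illustration, here is roughly what the two styles look like for receiving one character (a sketch in the same pseudo-code; the register names UART_STATUS_REG, RX_READY_BIT, and UART_DATA_REG are hypothetical):
// Polling: busy-wait until the UART status register reports a received character.
char read_char_polling()
{
    while (!(UART_STATUS_REG & RX_READY_BIT))
    {
        // spin until a character arrives
    }
    return UART_DATA_REG;
}

// Interrupts: the UART is configured to call this handler on reception.
volatile char last_char;
interrupt void uart_rx_handler()
{
    last_char = UART_DATA_REG; // grab the character while it is fresh
    clear_interrupt_condition();
}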
A demonstration of the first approach: suppose we want to monitor the temperature and, if it exceeds a set limit, light an LED. It will look something like this:
int main()
{
    init_adc();
    init_gpio_as_out();
    while (1)
    {
        int temperature = readTemperature();
        if (temperature > TEMPERATURE_LIMIT)
        {
            turnLedOn();
        }
        else
        {
            turnLedOff();
        }
    }
}
So far everything should be simple and clear. (I will not show the functions for reading the temperature and driving the LED; that is not the point of this article.)
But what if we need to do something at a given frequency? In the example above, the temperature is checked as often as possible. What if, say, we need to blink an LED once a second? Or poll a sensor at a strict 10-millisecond interval?
Then timers come to the rescue (almost any microcontroller has them). A timer can be configured to generate an interrupt at a given frequency. The blinking LED then looks something like this:
volatile int interrupt_happened = 0;

interrupt void timer_int_handler()
{
    interrupt_happened = 1;
    clear_interrupt_condition();
}

int main()
{
    init_timer(1_SECOND_INTERVAL, timer_int_handler);
    while (1)
    {
        if (interrupt_happened)
        {
            ledToggle();
            interrupt_happened = 0;
        }
    }
}
The peculiarity of working with interrupts is that the interrupt handler (the code called immediately when the interrupt occurs) should be as short as possible. The most common solution is therefore to set a global flag variable in the handler (yes, alas, there is no avoiding global variables here), check it in the main loop, and when it changes, do the actual work required to process the event.
This global variable must be declared with the volatile qualifier; otherwise the optimizer can simply throw away code that, from its point of view, is never used.
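A sketch of the failure mode (the exact behavior is compiler- and optimization-level dependent; each fragment is shown in isolation):
// Without volatile: the optimizer may read the flag once and turn the
// loop into while (1), since nothing visible in the loop body changes it.
int flag = 0;
while (!flag)
{
}

// With volatile: the compiler must re-read the flag from memory on every
// iteration, so a write from an interrupt handler is actually noticed.
volatile int vflag = 0;
while (!vflag)
{
}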
And what if you need to blink two LEDs, so that one blinks once a second and the other three times a second? You could, of course, use two timers, but with that approach we would run out of timers very quickly. Instead, we make one timer run at a much higher frequency and implement a divider in software.
volatile uint millisecond_counter = 0;

interrupt void timer_int_handler()
{
    ++millisecond_counter;
    clear_interrupt_condition();
}

int main()
{
    init_timer(1_MILLISECOND_INTERVAL, timer_int_handler);
    uint timestart1 = millisecond_counter; // must be initialized before the loop,
    uint timestart2 = millisecond_counter; // not inside it
    while (1)
    {
        if (millisecond_counter - timestart1 > 1000) // 1 second interval
        {
            led1Toggle();
            timestart1 = millisecond_counter;
        }
        if (millisecond_counter - timestart2 > 333) // 1/3 second interval
        {
            led2Toggle();
            timestart2 = millisecond_counter;
        }
    }
}
Note that we do not need to monitor overflow of the millisecond counter: since an unsigned type is used, the subtraction stays correct even after the counter wraps around, because unsigned arithmetic is performed modulo 2^N.
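A quick illustration of that wraparound arithmetic (a sketch; the 32-bit counter width is an assumption):
uint32_t start = 0xFFFFFFF0u;   // counter value shortly before overflow
uint32_t now   = 0x00000014u;   // counter value after wrapping past zero
uint32_t elapsed = now - start; // 0x24 = 36 ticks: still correct, because
                                // the subtraction is performed modulo 2^32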
Now imagine that we have a debug console implemented on top of the RS232 interface (the most common solution in the embedded world!). We want to print debug messages there (visible when the controller is connected to a computer via a COM port). And at the same time we need to poll a sensor connected to the controller at a strictly specified (and high) frequency.
And here a question arises: how do you implement something as banal as printing a string to the console? An obvious solution like
void sendString(char * str)
{
    foreach (ch in str)
    {
        put_ch(ch);
    }
}
is unacceptable in this case. It will print the string, but it will hopelessly violate the requirement to poll the sensor at a strictly specified frequency. We do everything in one big loop where all actions are performed sequentially, remember? And the console is a slow device: at a typical 9600 baud, one character takes about a millisecond, so printing even a 20-character line eats twice our 10-millisecond polling interval. The example below shows how NOT to do it!
int main()
{
    while (1)
    {
        …
        if (something)
        {
            send_string("something_happened");
        }
        …
        if (10_millisecond_timeout())
        {
            value = readADC();
        }
    }
}
Another example: suppose you want to implement software overload protection. You add a current sensor, connect it to the controller's ADC, and drive a safety relay from one of the I/O pins. Naturally, you want the protection to act as soon as possible after the overload event (otherwise everything simply burns out). But you have that same common loop in which all actions are performed strictly in order, and the guaranteed reaction time to an event can never be less than the execution time of one loop iteration. If the loop contains operations that take a long time to complete, that is it: they will dictate the system's reaction time to everything else.
And if an error creeps in somewhere in this loop, the whole system will crash, including the reaction to overload (which I really would not want to allow, would you?).
Theoretically, something can still be done about the first problem. For example, replace the simple but slow string-printing function with something like:
int position = 0;

int send_string(char * str)
{
    if (position < strlen(str))
    {
        put_ch(str[position]);
        ++position;
        return 1;
    }
    else
    {
        return 0;
    }
}
And replace the simple call to this function with something like:
int main()
{
    int do_print = 0;
    while (1)
    {
        …
        if (something)
        {
            do_print = 1;
            position = 0;
        }
        if (do_print)
        {
            do_print = send_string("something_happened");
        }
        …
        if (10_millisecond_timeout())
        {
            value = readADC();
        }
    }
}
As a result, we have reduced the duration of one loop pass from the time needed to print a whole line to the time needed to print a single character. But in exchange for the primitive and readable string-output function, we had to add two state machines to the code: one inside the printing function (to remember the position in the string), and one in the main loop (to remember that we are still printing the line over the next several passes). Long live global variables, functions with hidden state, and other wonderful things that quickly and easily turn code into unmaintainable spaghetti.
Now imagine that the system must simultaneously poll a dozen sensors, react to several critical events that demand an immediate response, process commands arriving from the user or a computer, print debug messages, and control a dozen indicators or actuators. And each of these actions has its own constraints on reaction time and on polling or control frequency. Now try to stuff all of that into one sequential common loop.
Of course, all of this is doable. But I do not envy anyone who has to maintain it even a year after it was written.
Another problem of the supercycle design is the difficulty of measuring system load. Suppose you have this code:
volatile int interrupt_happened = 0;

interrupt void external_interrupt_handler()
{
    interrupt_happened = 1;
    clear_interrupt_condition();
}

int main()
{
    while (1)
    {
        if (interrupt_happened)
        {
            doSomething();
            interrupt_happened = 0;
        }
    }
}
The system somehow responds to an interrupt coming from outside. The question is: how many such interrupts per second can the system handle? How busy will the processor be while processing 100 events per second?
It will be very hard to measure how much time is spent processing events and how much on polling the "did an interrupt happen?" flag: after all, everything happens in one loop!
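You can try instrumenting the loop by hand, but the bookkeeping quickly becomes intrusive; a sketch of such an attempt (read_cycle_counter() is a hypothetical hardware cycle counter):
uint busy_time = 0;
uint total_time = 0;

int main()
{
    while (1)
    {
        uint loop_start = read_cycle_counter();
        if (interrupt_happened)
        {
            uint work_start = read_cycle_counter();
            doSomething();
            interrupt_happened = 0;
            busy_time += read_cycle_counter() - work_start;
        }
        total_time += read_cycle_counter() - loop_start;
        // load = busy_time / total_time, but now every branch of the loop
        // must be instrumented by hand, and the measurement adds overhead
    }
}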
And here the second approach comes to the rescue.
Application of a real-time operating system.
The easiest way to illustrate it is with the same example: polling a sensor at a given frequency while printing a long debug line to the console.
os_queue_t dbg_task_queue;

void SensorPollingTask()
{
    while (1)
    {
        int value = SensorRead();
        if (value > LIMIT)
        {
            doSomething();
        }
        taskDelay(10_MILLISECOND_DELAY);
    }
}

void DebugTask()
{
    while (1)
    {
        char * str = os_queue_read(dbg_task_queue);
        foreach (ch in str)
        {
            put_ch(ch);
        }
    }
}

void OtherTask()
{
    other_task_init();
    …
    while (1)
    {
        …
        // we want to do a dbg_printout here
        os_queue_put(dbg_task_queue, "Long Debug Output String");
        …
    }
}

int main()
{
    dbg_task_queue = os_queue_create(); // create the queue before any task uses it
    os_task_create(SensorPollingTask, HIGH_PRIORITY);
    os_task_create(DebugTask, LOW_PRIORITY);
    os_task_create(OtherTask, OTHER_PRIORITY);
    os_start_scheduler();
}
As you can see, main no longer contains a single main infinite loop. Instead, each task has its own infinite loop. (Yes, the os_start_scheduler() function will never return!) And most importantly, tasks have priorities. The operating system itself ensures what we need: a high-priority task runs first, and a low-priority one runs only when there is time left for it.
And while the reaction time to, say, an interrupt in the supercycle design is, in the worst case, the execution time of the entire loop (the interrupt itself fires right away, of course, but the necessary actions cannot always be done directly in the handler), with a real-time OS the reaction time equals the task-switching time, which is small enough to be considered immediate. That is, the interrupt fires in one task, and as soon as the handler finishes, we switch to the other task, which has been waiting for the event "triggered" from the interrupt:
os_semaphore_t overcurrent_semaphore;

interrupt void overcurrent_handler()
{
    os_semaphore_give(overcurrent_semaphore);
    clear_interrupt_condition();
}

void OvercurrentTask()
{
    overcurrent_semaphore = os_semaphore_create();
    while (1)
    {
        os_semaphore_take(overcurrent_semaphore);
        DoOvercurrentActions();
    }
}
As for measuring processor load, with an OS this task becomes trivial. By default, every RTOS has the most gluttonous (but also lowest-priority) Idle task, which runs an empty infinite loop and gets control only when all other tasks are inactive. Accounting for the time spent in Idle is usually already implemented too; it only remains to print it to the console.
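In FreeRTOS terms it can look roughly like this. A minimal sketch: vApplicationIdleHook is FreeRTOS's real idle hook (enabled by setting configUSE_IDLE_HOOK to 1 in FreeRTOSConfig.h), while LoadReportTask, the calibration constant, and dbg_printf are assumptions of ours:
#include <stdint.h>
#include "FreeRTOS.h"
#include "task.h"

static volatile uint32_t idle_counter = 0;

// Called by FreeRTOS from the Idle task on every pass of its loop.
void vApplicationIdleHook(void)
{
    ++idle_counter;
}

// Low-priority task: once a second, compare idle iterations against a
// calibration value measured once on an unloaded system.
void LoadReportTask(void * params)
{
    (void) params;
    const uint32_t idle_per_second_unloaded = 1000000; // system-specific, measured once
    while (1)
    {
        uint32_t start = idle_counter;
        vTaskDelay(pdMS_TO_TICKS(1000));
        uint32_t idle_ticks = idle_counter - start;
        uint32_t load = 100 - (idle_ticks * 100) / idle_per_second_unloaded;
        dbg_printf("CPU load: %u%%\r\n", load); // hypothetical console output
    }
}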
Also, if you do let some error slip through, only the task containing the error will crash (and possibly the tasks with lower priority too), while higher-priority tasks will keep running, providing at least the minimal vital functions of the device, such as overload protection.
To summarize: if the system is very simple and undemanding about reaction time, it is easier to build it as a supercycle. If the system is going to grow large, combining many different actions and reactions that are also time-critical, then there is no real alternative to a real-time OS.
Another plus of using an OS is simpler and clearer code, since we can group code by task and avoid the global variables, state machines, and other clutter that the supercycle design requires.
The downside of using an OS is that it requires more flash, RAM, experience, and knowledge (and although there is nothing overly complicated there, multitasking is a priori more complex and less predictable than sequentially executed code). You absolutely must understand the principles of working in a multitasking environment: thread-safe code, data synchronization, and much more.
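As a taste of what "data synchronization" means in practice, here is a minimal sketch of guarding shared data with a mutex, written against the FreeRTOS API (the writer task and the shared counter are hypothetical):
#include "FreeRTOS.h"
#include "semphr.h"
#include "task.h"

static SemaphoreHandle_t data_mutex;
static int shared_value; // data touched by more than one task

void sync_init(void)
{
    data_mutex = xSemaphoreCreateMutex(); // create once, before the tasks run
}

void WriterTask(void * params)
{
    (void) params;
    while (1)
    {
        xSemaphoreTake(data_mutex, portMAX_DELAY); // lock
        ++shared_value;                            // modify shared state safely
        xSemaphoreGive(data_mutex);                // unlock
        vTaskDelay(pdMS_TO_TICKS(10));
    }
}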
For "playing around" you can take FreeRTOS - a free open source project, which is quite stable and easy to learn. Although not uncommon and commercial projects using this particular OS.