LifeV April 24, 2015 at 16:37

Lightweight threads in Embox

Today, as promised, I will continue the topic of light entity planning, which I have already begun in my series of articles. In it, I talked about the internal structure of tasklet , workqueue and protothread . Of course, the topic is not limited to just these examples: there is FreeRTOS with its coroutine , or GNU Portable threads ; or you can move away from the structures and libraries used in the OS and recall the various green threads, which are becoming more and more.

This time I want to share how we implemented lightweight streams in the Embox project . On the one hand, we tried to take into account the experience of previous developments, on the other, to bring something new.

How it all started

The problem approached us from different angles.

Streams require their stack, consuming relatively much RAM. Our cooperative multitasking was carried out only by disabling the timer responsible for calling the scheduler after a time slice. What did this lead to? On boards with limited resources, sometimes you have to completely abandon multitasking.

In addition, tasks more and more often appeared for which it would be nice to have some kind of event processing mechanism. It is desirable with priorities and primitives of synchronization, and so that all this is processed quickly and does not require extra memory.

This prompted us to think of lightweight, almost cooperative multitasking for some non-stacked entities. But at the same time, no one was going to refuse full multitasking.

Therefore, we decided not only to introduce lightweight planning units into the system, but also to entrust the dispatching of these light flows to the main scheduler of the system on a common basis with full flows. Potentially, we saw the following benefits:

You can get rid of the softirq layer and handle pending interrupts in high priority scheduler queues;
In general, with proper prioritization, some hardware interrupts can be handled in the same way;
Flexible tune of priorities, including at runtime. For example, recently we had a task in which it was necessary to process timer interruptions earlier than from the network stack. And some real-time tasks may require processing of high-priority user threads ahead of some pending interrupts.

So what we dreamed about:

A scheduler that can manage both regular flows and light;
Efficient lightweight streams, some planned features like event handlers with synchronization support.

To reduce the connectivity of the system, the planner will now manage abstract planning units.

Schedee, or Abstract Planning Unit

The idea, in fact, is quite simple, especially when considered in terms of OOP. Take a look at the chart:

Schedee is the parent abstract class of various types of threads. Each type of stream must provide some kind of interface for managing it. The scheduler does not know anything about specific implementations, and in its function it just determines who will be executed next, does some preparation and delegates the processing to a specific schedee, for example, by calling the special function doSomethng ().

Next, I will talk about how we applied this idea in our planner.

In C, unfortunately, you have to implement inheritance by hand, namely, on the basis of a structure as a member of another, more specific structure. We will have an abstract schedee structure from which specific thread and lthread structure types are inherited and provide the implementation of the necessary “abstract” functions.

We have already implemented the usual POSIX-compatible threads and the central scheduler that controls them. Therefore, the expansion of the functionality of the scheduler occurred with an eye on an existing implementation.

The first thing we needed to do was to isolate from the structure of ordinary flows the part that the scheduler manages. Based on this, we will construct the abstract structure sсhedee.

Selecting common fields from the structure turned out to be quite simple. But is something else needed? To answer this question, it is necessary, firstly, to understand how the types of flows can fundamentally differ, and, secondly, to understand the work of the scheduler.

Streams can be:

Crowded out or cooperative. The former are supplanted by flows with a higher priority or after a time slice, and the latter transfer control to the scheduler themselves. From the point of view of the function of the scheduler itself, there is no difference, the only difference is when and by whom it will be called next time;
With or without a stack. The presence of a stack in threads leads to additional memory consumption and the cost of switching contexts. The lack of a stack makes it impossible to wait with a change of context and, as a result, crowding out (at least, full). This difference is already important for the scheduler, such flows should be handled differently;
With one or more entry points. With one entry point, everything is simple: the function runs from and to. The presence of several entry points is typical for coroutines and iterators. For such flows, it is necessary to store information about the current entry point; you can consider such information as an analog of the context of a stream with a stack waiting for an event. In any case, the planner should not be aware of this.

Now let's see how the planner was arranged. The code is simplified for better understanding.

void schedule(void) {
    struct thread *prev, *next;
    ipl_t ipl;
    prev = thread_self();
    /* Защита от прерываний */
    ipl = ipl_save();
    if (!prev->waiting)
        runq_enqueue(&rq.queue, prev);
    next = runq_extract(&rq.queue);
    if (prev != next)                 // Это код, специфичный
        sched_switch(prev, next);     // для стековых потоков.
    ipl_restore(ipl);
}

The scheduler is arranged quite simply: it determines which thread will be executed next, and if the thread has changed, it switches the context. After this, a new thread naturally leaves the function of the scheduler and continues its work. That is, the scheduler is geared to work only with stack streams.

In order to abstract from concrete implementations of schedee, specific processing needs to be carried out in a special function, we will call it process (). A pointer to this function will be in schedee.

Now you need to understand how to handle streams without a stack. What do we know about them?
Such flows are not crowded out. That is, there should be no implicit rescheduling calls;
For stackless streams, there is no need to switch context. In fact, there are different approaches here, in Linux, for example, a separate stack is allocated for sequential processing of softirq. So far, we have settled on the option of processing such flows on the stack of the last executable full thread.

Thus, if for regular flows it is necessary to leave the scheduler function, then stackless flows should be processed in a loop directly in the scheduler, which, however, is a rather common approach. The cycle stops as soon as a stream with a stack has arrived for processing.

void schedule(int preempt) {
    struct schedee *prev, *next, *last;
    prev = schedee_get_current();
    /* Запрещаем прерывания */
    ipl_save();
    if (!prev->waiting)
        runq_enqueue(&rq.queue, prev);
    while (1) {
        next = runq_extract(&rq.queue);
        /* Прерывания разрешаются прямо внутри process() */
        last = next->process(prev, next);
        /* process() возвращает поток, на контекст которого мы переключились.
         * Если переключения не было, то был выполнен бесстековый поток,
         * крутимся дальше. */
        if (last)
            break;
        /* Запрещаем прерывания */
        ipl_save();
    }
}

Encapsulation is a good principle, even if we are dealing with procedural programming. So all the scheduler code should know only about schedee, but not about specific threads. Here is the complete schedee structure code:

struct schedee {
    runq_item_t       runq_link; /* Ссылка в очереди runq */
    spinlock_t        lock; /* Для блокировки в случае SMP */
    /* Главная функция-обработчик, должна быть задана при инициализации
     * конкретной реализацией schedee. Как мы уже выяснили, в случае обычного
     * потока это переключение контекста, обработка сигналов и т.д.
     * Функция возвращает schedee, на стеке которого нужно продолжить 
     * исполнение. Если schedee кооперативный, то функция возвращает NULL. */
    struct schedee    *(*process)(struct schedee *prev, struct schedee *next);
    /* Эти три поля отвечают за состояния в машине состояний планировщика. */
    unsigned int active;
    unsigned int ready;
    unsigned int waiting;
    /* Специфичные поля */
    struct affinity         affinity;
    struct sched_timing     sched_timing;
    /* Приоритет в очереди runq */
    struct schedee_priority priority;
    /* Ссылка в очереди waitq. О том, почему она здесь, речь пойдет ниже. */
    struct waitq_link waitq_link;
};

And finally, what the process () function looks like for regular threads:

struct schedee *thread_process(struct schedee *prev, struct schedee *next) {
	struct thread *next_t, *prev_t;
	next_t = mcast_out(next, struct thread, schedee);
	prev_t = mcast_out(prev, struct thread, schedee);
	if (prev != next) {
		thread_context_switch(prev_t, next_t);
	}
	ipl_enable();
	if (!prev_t->siglock) {
		thread_signal_handle();
	}
	return &thread_self()->schedee;
}

Light streams

Now let's look at the light streams of Embox, for which everything was started. Our lightweight streams are stackless, cooperative and support multiple entry points.

In one of the tasks, it was necessary to process information constantly coming from an external source, moreover, with synchronization. Therefore, the initial question was about adequate support for synchronization, which in fact rests on the ability:

First of all, for example, to capture a mutex; in the general case, wait for any condition (analogue of conditional variables)
in case of failure, return control to the scheduler in the middle of the function
start from the right place when the thread wakes up, for example, when releasing the necessary mutex.

In the case of regular threads, everything is simple: when you fall asleep, you call the scheduler and reschedule. The flow simply calls, for example, the sleep () function. Paragraphs 1 and 3 are implemented due to the safety of the stack and context. With stackless streams, things get more complicated:

It is necessary to exit the stream function explicitly so that the stack is in the same state as before the start of the light stream.
The easy flow function will start from the very beginning. When called again, local variables cease to be relevant.

To combat the last point, we decided to follow the coroutine path, like Adam Dunkels in our protothread library . But, unlike Adam, we decided not to hide the logic behind cunning macros, but to make the interface as explicit as possible. Our solution is based on the extension of GCC Labels as Values . This extension allows you to get links to tags, as well as to calculate the indent from a specific tag, using the usual manipulations with pointers.

Light flows and their structure

To understand everything in more detail, consider the implementation details. Firstly, the structure of lthread itself:

struct lthread {
	struct schedee schedee;
	int (*run)(struct lthread *);
	int label_offset;
	struct schedee *joining;
	struct sched_wait_info info;
};

The first thing to clarify is the signature of the run function. Since light streams do not have a stack, it is assumed that the structure of light streams will often be a member of another structure containing information that will need to be stored from call to call. Therefore, it is reasonable to use the link to the light stream itself as the main argument, and refer to the rest of the information using type casting.

The return value of the run function is input point information. If the input point for the function is assumed to be one, or when restarting, the initial input point should be used, then the return value will be 0. Otherwise, the return value is set by the lthread_yield () macro. This and other macros I will describe below.

The label_offset field is the indent from the start label of the function. This is a service field for saving the input point of the stream. Working with various input points is done using the lthread_resume () and lthread_yield () macros.

The joining field is a reference to the schedee of a thread awaiting the completion of a light thread, similar to regular threads. About it in more detail below.

And finally, info is a service field with the necessary information to wait by timeout.

Light Flow API

The basic easy-stream API is pretty simple. The initialization and start functions are unlikely to require explanation:

void lthread_init(struct lthread *lt, int (*run)(struct lthread *));
void lthread_launch(struct lthread *lt);

When the application has worked, you need to make sure that the light threads are completed and no one else wakes them up. If this is not so, then it is unsafe to terminate the application: the awakening of a light stream can occur when its structure is irrelevant. Therefore, you must first wait for the flow to be in the desired state. This is similar to the behavior of the thread_join () function. We need its analogue for light flows:

extern void lthread_join(struct lthread *lt);

Now this function has one drawback: it can only be called from a stream with a stack. In fact, it can be expanded to the general case, if necessary.

There is another important point regarding this feature. In order to understand it, let's digress a bit and pay attention to the life cycle of the light flow. A light stream can only be in two states: readiness and expectation. Ready-made light threads are in the runq scheduler queue, and while it is in this state, you cannot add it to the queue a second time. However, as soon as the finished thread gets out of the queue, it goes into a waiting state, and now it can be scheduled again, including this he can do. For example, tasklets in Linux behave in the same way. If the light thread plans itself (actually never ends), then the thread that called lthread_join () just hangs. The user is responsible for ensuring that the easy flow does not plan itself in this case.

Consider the part of the API that provides label management, i.e. multiple entry points. An important note: work with labels is carried out only in the run () function itself, but not in the functions that it calls. Unfortunately, these restrictions can not be avoided without a stack.

A macro that returns a previously saved label:

/* Шаблон использования: goto lthread_resume(lt, start); */
#define lthread_resume(lt, initial_label) \
            *((lt)->label_offset + (initial_label))

A macro that returns indentation relative to the start label:

/* Шаблон использования: return lthread_yield(lt, start, resume_point); */
#define lthread_yield(initial_label, target_label) \
            (target_label) - (initial_label);

The lthread_resume () macro should always be paired with the goto statement, and lthread_yield () should be paired with the return statement.

All standby and synchronization functions for light flows operate on the same principle. Consider an example of a simple timeout wait macro:

SCHED_WAIT_TIMEOUT_LTHREAD(self, cond_expr, timeout);

This function returns one of three possible codes:

0 if the condition has become true;
ETIMEDOUT, if after the timeout the condition has not been met;
EAGAIN, if the condition is not yet true, the timer has been started, it is necessary to exit the light flow function in order to return control to the scheduler.

Now consider the function of processing light flows, which is called from the scheduler:

/* Блокировки: прерывания и планировщик */
static struct schedee *lthread_process(struct schedee *prev, struct schedee *next) {
    struct lthread *lt = mcast_out(next, struct lthread, schedee);
    schedee_set_current(next);
    /* Как уже говорилось, у легкого потока всего два состояния и они
     * взаимоисключающие. */
    next->ready = false;
    next->waiting = true;
    /* В функии process() необходимо разрешить прерывания. */
    ipl_enable();
    /* Вот здесь исполняется функция и сохраняется метка. Помним про макрос
     * lthread_yield(). */
    lt->label_offset =lt->run(lt);	
    /* Если поток в конечном или начальном состоянии, будим schedee, ожидающего
     * его завершения. */
    if (lt->joining && __lthread_is_disabled(lt))
		sched_wakeup(lt->joining);
    return NULL;
}

Example

To demonstrate, consider the Race game, which works in two easy streams: one is responsible for moving the road with obstacles, the other checks if the control key is pressed.

The whole code of the game lies here , here I will give only the main function of the light flow, which is responsible for moving the road.

static int move_road(struct lthread *self) {
    int to_wait;
    /* Функция lthread_resume() вернет указатель на сохраненную метку.
     * Если поток запущен впервые, то переход будет выполнен к метке,
     * указанной вторым аргументом - это начальная метка. */
    goto lthread_resume(self, &&update);
update:
    is_game_over |= is_obstacle(car_line_nr*RACE_ROAD_LEN + 1);
    if (is_game_over)
        return 0;
    race_update_road();
    race_print_road(road);
wait:
    to_wait = RACE_ROAD_UPD_MS - (step / RACE_LVL_STEP) * RACE_LVLUP_MS;
    if (SCHED_WAIT_TIMEOUT_LTHREAD(self, 0, to_wait) == -EAGAIN) {
        /* При возвращении из функции необходимо либо указать 0, что будет
         * соответствовать начальной метке, либо с помощью функции
         * lthread_yield() указать метку, к которой нужно будет перейти в
         * следующий раз. */
        return lthread_yield(&&update, &&wait);
    }
    goto update;
}

This application was written to demonstrate the operation of the scheduler and light flows on the STM32VLDISCOVERY board with a wh1602b LCD screen. In fact, the characteristics of STM32VL allow you to run Embox with this game and with extruded streams: 8kb RAM and 128kb flash. Therefore, I will give a description of the Embox images for a board with a game on light streams and on ordinary streams:

	cooperative	combined
text Os / O2	28724/31504	30152/33116
data	436	436
bss	3168	3102
Main stack + thread stack	1536 + 0	1078 + 1782 * 2

Finally, a video with the game process.

Conclusion

All the code that I cited in the article can be viewed in the github repository of our project:

We still have something to work on. Light flows are still young and, most likely, they will change and improve more than once. Already there are ideas and tasks:

Perhaps it makes sense to divide light streams into two different types: using coroutines and simple atomic functions like tasklet. In addition, it is possible to achieve coroutine functionality without labels, simply by breaking the stream function into several and replacing the function pointer with each call. It would be a sort of state machine.
Potentially, it is possible to realize the displacement of light flows by higher priority light flows. This is important for real time, but it is difficult from the point of view that light flows will no longer be atomic, which means that various locks may be required.
Of the synchronization mechanisms, so far, only mutexes are implemented for light streams, since we most often use them. Others are also waiting in the wings.

Of course, there are a number of subsystems for which you want to implement light flows. For example, the tty driver, the pnet subsystem.

Over time, all these problems will be resolved. You can do it too :)

PS In St. Petersburg, Linux runs every last Wednesday of the month. Once again I will talk about light flows, about how the Linux kernel workqueue and taklet are arranged and, maybe, something else :)
There you can also listen to other interesting presentations. See the SPbLUG newsletter for information . Meetings are like at 19:00 at the address: 10th line V.O. 33-35, Faculty of Geography, St. Petersburg State University.

Tags: