GarryC June 15, 2016 at 18:26

To the question of standard libraries

We will begin this story with a riddle,
Even Alice is unlikely to answer,
What remains of the fairy tale afterwards,
After she was told?

This essay touches on several topics, among them an answer to the question posed in the subtitle, and the narrative will revolve mainly around the problems of initializing the peripherals of modern microcontrollers (MCUs).
So, let us outline the main problem of setting up MCU hardware: we have to set a significant number of parameters, most of which are not specified in any particular case and yet cannot be left arbitrary; they must take some predefined values. If you believe that such a simple problem statement can trigger a stream of consciousness and lead to some not quite obvious solutions, then read on.

Let's look at an example involving an interface that has long set everyone's teeth on edge, namely IRPS (the Radial Serial Interface; it was under this name that the character now known as UART first appeared). To answer right away why I use the Russian abbreviation: I write part of my notes on the road, and I would not call switching layouts a convenient part of the Android keyboard (by the way, if anyone knows a convenient keyboard that has arrow keys and is not sluggish, drop me a private message, because poking at the screen is really awkward).
Of course, the issues raised are not limited to IRPS; other MCU peripherals need configuring too, and I simply took it as an example. So, to configure IRPS we must at least set the frame format, which is determined by the following parameters: transmission speed, number of data bits, presence and type of the parity bit, and number of stop bits. And this is only the beginning; there is also an extended configuration, which is much longer, but that is not the point now. It should also be kept in mind that in 70 percent of cases IRPS is used with the standard 9600-8-0-1 configuration, in another 25 percent only the speed changes, and all other configurations share the remaining 5 percent, yet the ability to set them must exist.

Before we begin to consider various ways of solving this problem, we should define the criteria for judging a particular option, otherwise choosing the most suitable one will turn into a discussion of taste. In this post I proceed primarily from the criterion of reliability, since I consider the following circumstances decisive: 1. It is human nature to make mistakes. 2. The compiler's task is (among other things) to point out these mistakes as early as possible, and ideally to prevent them from appearing. It is from these positions that I will judge the acceptability of one solution or another. If you disagree with at least one of these two postulates, you will most likely not enjoy this post and should stop reading it.

To begin with, let us understand why this problem arose at all: after all, on MCUs of previous generations (PIC, AVR, 51) we did perfectly well configuring the hardware by writing directly to the corresponding registers. Well, first of all, for that you need to know the registers or, as it is customary to put it today, to read the manuals, and there are a lot of letters in them. Although to me personally this occupation does not seem particularly annoying, for a significant part of the community, especially given the quality of current manuals, this approach can really be a problem.

A small (ha-ha, I really thought so) note about the quality of modern documentation. Maybe the grass was greener before, but I fully agree with a phrase by Jack Hansley in one of his recent blogs: “Lately I come across more and more tools whose documentation it would be too mild to call insufficient; in earlier days it would simply have been called absent.” Of course, the Internet is a great thing (no sarcasm here, it really is a new wonder of the world): you can ask about unclear points both directly of the developers and on forums, and often get a lot of answers, among which, with some luck, there will even be correct ones. But why not write the documentation so that it leaves no room for differing interpretations? They usually answer me,

I can only recommend trying to outsource the writing of documentation, preferably to developers who build products around your devices, since the developers of the device itself take too much for granted, and what is obvious to the developer is not always (more precisely, never) obvious to the user. Maybe someone should set up a firm providing this kind of service, although, given the attitude of our manufacturers to documentation, they are unlikely to use such an expensive and unnecessary service.

A small reminiscence on the topic of documentation. I recently studied the description of a certain chip (by Intel, by the way) that has an input with the telling name PE_RST_N, described as a +3.3 Vdc input which, when asserted, signals that power and clock are present. Whether the input is asserted by a high or a low level is not indicated, and there are no timing diagrams for this signal (although a PERST# signal appears on a timing diagram and is mentioned under that name in the device state diagram). Further on it is stated that the value 0b indicates an active reset, and in the table of operating modes the reset state has a check mark in the value column, which sort of hints. In general, from a combination of circumstantial evidence one can conclude that the active level on this pin, the one that resets the device, is, yes, low.
Why am I forced onto the shaky ground of conjecture and speculation? Probably, for me personally, this is punishment for bad karma in past lives (well, I have been no angel in this one either), but what did the other developers do to deserve it? One of my colleagues offered an interesting hypothesis that such documentation is produced on purpose, to make it harder for us to rise from our knees, but then why is it written in English? Or is it made specially for us, while in the West they use the real, correct documentation? Although in this case the phrase “Do not attribute to malice what can be explained by simple laziness” fits perfectly (I will not offend the creators of such documentation by quoting it verbatim).

But the real masterpiece in my personal collection is the description of an otherwise quite decent MCU from a domestic manufacturer, in which the control bit for the pull-up resistors was described as follows: "0 - pull-up resistors are switched off", 1 - ... guess what is written there? "Pull-up resistors are not switched off" is a good try, but wrong ... drum roll in the studio ... "the value is the opposite of 0". Everyone applauds, curtain.

But all of the above (I mean the poor quality of documentation, in case anyone has forgotten) is only part of the problem; its second component lies in the fact that modern MCUs really are more complicated than their predecessors. I have my own view on why hardware keeps getting more complicated, and “ever more fully satisfying the ever-growing needs of developers” is not first on my list of reasons, but my thoughts do not change the overall situation: modern MCUs really are more complex than their predecessors, they have many more hardware blocks, and the blocks themselves have become much more complex and implement additional functions that developers can (and should) put to work. I have a post in mind about some (really useful) features in the START UART family, and I will definitely write it once I finish this one; it was while implementing support for those features that I ran into the problems that brought this post to life.

The next part of the problem is the growing range both of MCU families from various manufacturers and of MCU subspecies within one series from one manufacturer, down to a considerable variety of devices within one subspecies. As the range expands, the life cycle of each particular family member shrinks, which raises the question of constantly migrating to new devices. This is far from always necessary, but it is very hard to resist, so it is advisable to develop software so that switching to another MCU is as painless as possible and ideally comes down to replacing one line

#define device xxxxx

which, of course, is made easier by the use of standard libraries, and that is why I am writing this post about the correct way to write them (correct from my point of view, and you already know what I think of the other one).

We now turn to possible ways of implementing IRPS setup, and the first of them is simply an initialization function that takes all possible parameters (here I struggled for a while with the temptation to write the examples in ANP, an algorithmic programming language, but then decided that this would already be on the far side of the border separating good from evil and light trolling from bullying). We get something like

UARTInit(9600,8,0,1);

for the aforementioned standard case (from here on we leave aside the question of selecting one of the several hardware channels available in the MCU). Everything seems fine here and nothing supernatural has to be written, but let us try to find the shortcomings of this option and (of course, why else look for them) eliminate them.

First of all, we need to decide what we want from a hardware setup procedure and what requirements we place on it. In my opinion, the code should above all be reliable, safe, and understandable. Efficiency requirements are less significant, since initialization is as a rule performed once and need not be particularly fast, but compact code with a low execution cost is an additional plus. By the way, at least one technique, namely Charlieplexing, requires changing the mode of the MCU pins on the fly, so speed should not be neglected entirely.

What shortcomings do we see in this approach (and we do see them, otherwise what would there be to write about)? First of all, the need to list a large number of parameters in a strictly defined order; if we make a mistake somewhere, the result will not be the one we were counting on. The ordering problem can be eased somewhat by defining a custom data type for each parameter; then we get:

typedef enum {…UARTSpeed9600…} UARTSpeedT;
void UARTInit(UARTSpeedT UartSpeed,…);

and when we try to make a mistake, we get a warning from the compiler. This approach costs nothing at run time and very little at compile time, so I can highly recommend it, and at the same time express pained bewilderment that library authors (including the vendors' own) neglect such a simple and yet effective technique.
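A minimal sketch of the typed-parameter idea follows. Everything beyond `UARTInit`, `UARTSpeedT` and `UARTSpeed9600` (the other enumerations, the `UARTFormatT` record and the `UARTTypedDemo` helper) is hypothetical and exists only for illustration. Note the hedge: C itself permits implicit conversion between enumeration types, so whether a swapped argument is reported depends on the compiler; recent GCC and Clang have a `-Wenum-conversion` warning for this.

```c
/* Hypothetical enumerations: one distinct type per frame parameter. */
typedef enum { UARTSpeed4800 = 4800, UARTSpeed9600 = 9600, UARTSpeed19200 = 19200 } UARTSpeedT;
typedef enum { UARTData7 = 7, UARTData8 = 8 } UARTDataBitsT;
typedef enum { UARTParityNone, UARTParityEven, UARTParityOdd } UARTParityT;
typedef enum { UARTStop1 = 1, UARTStop2 = 2 } UARTStopBitsT;

/* Stand-in for the hardware: remembers the last requested frame format. */
typedef struct {
    UARTSpeedT    speed;
    UARTDataBitsT dataBits;
    UARTParityT   parity;
    UARTStopBitsT stopBits;
} UARTFormatT;

static UARTFormatT UARTLastFormat;

void UARTInit(UARTSpeedT speed, UARTDataBitsT dataBits,
              UARTParityT parity, UARTStopBitsT stopBits)
{
    /* A real implementation would write the peripheral registers here. */
    UARTLastFormat.speed    = speed;
    UARTLastFormat.dataBits = dataBits;
    UARTLastFormat.parity   = parity;
    UARTLastFormat.stopBits = stopBits;
}

/* Configures the standard 9600-8-N-1 case and returns the stored speed. */
int UARTTypedDemo(void)
{
    UARTInit(UARTSpeed9600, UARTData8, UARTParityNone, UARTStop1);
    /* Swapping the first two arguments would pass a UARTDataBitsT where a
       UARTSpeedT is expected; that is the mistake the compiler can flag. */
    return (int)UARTLastFormat.speed;
}
```

The call site now reads as a list of named values instead of four anonymous numbers, which is half of the benefit even on a compiler that stays silent.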

Their only excuse is the need to use names that have to be memorized, in contrast to the magic number 9600, which is intuitive (that was sarcasm). An alternative is a large number of assertions at the entry to the function, which check the parameters in place of the compiler. In principle the approach is not so bad (even a small fish is better than a large cockroach), but it requires us to ship a debug build and moves error reporting to run time, which is worse than getting the errors at compile time.
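For comparison, the assertion-based alternative might look roughly like this. The function names, the accepted value sets and the parity encoding (0 none, 1 even, 2 odd) are assumptions made for the sketch, not part of any real library.

```c
#include <assert.h>

/* Hypothetical run-time validation of untyped parameters.
   Returns 1 when every value is in its (assumed) legal set, 0 otherwise. */
int UARTParamsValid(int speed, int dataBits, int parity, int stopBits)
{
    return (speed == 4800 || speed == 9600 || speed == 19200)
        && (dataBits == 7 || dataBits == 8)
        && (parity >= 0 && parity <= 2)
        && (stopBits == 1 || stopBits == 2);
}

/* The check only fires in builds where NDEBUG is not defined, which is
   the "ship a debug build" cost mentioned in the text. */
int UARTInitChecked(int speed, int dataBits, int parity, int stopBits)
{
    assert(UARTParamsValid(speed, dataBits, parity, stopBits));
    /* A real implementation would write the peripheral registers here. */
    return 0;
}
```

A swapped call such as `UARTInitChecked(8, 9600, 0, 1)` compiles without complaint and is caught only when that line actually executes.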

Having somewhat eased the requirement on parameter order (we now get scolded when we try to violate it), we still have to list all the parameters, which is annoying. If we worked in an advanced language, we would have default parameter values, but if we stay within classical C (and we do, in case I had not warned you), this path is closed to us. Besides, the method has a significant limitation: all defaulted parameters must follow the explicitly specified ones, so to set the last parameter in the list explicitly we must spell out all the previous ones.

If we again work in an advanced language such as C++, we have one more method, function overloading, although the prospect of writing (4 + 4 * 3 + 4 + 1 = 21) setup functions with a fixed order of arguments, and an even more unimaginable number with an arbitrary order, is hardly inspiring. Nevertheless, the possibility exists, and the rules of decency require mentioning it, although they do not oblige us to use it.

If we had a good preprocessor that really gave us the capabilities of a macro language, we could write a macro call with a variable number of typed arguments and let the preprocessor generate the function call, but we do not have one (C has no good preprocessor; it does not even have an average one). If any reader considers the standard C preprocessor a true macro language, I highly recommend getting acquainted with the description of the macro assembler for DEC machines, after which we can debate the subject in more detail. But only in that order, and acquaintance with other well-developed code generation tools is welcome too. In any case, this approach is unavailable to us because of the limited means of expression, and we will not write our own preprocessor, although sometimes we feel like it. On this pessimistic note we conclude the discussion of options based on an initialization function with directly passed parameters.

Another option is to use a certain control structure and separate the process of setting the values of the fields of this structure (including those set by default) from the initialization process itself. What gives us such a separation? First of all, the long-awaited opportunity to modify only those parameters that should differ from the default values.
Of course, we then have to know and remember those default values very well, but the two-command compiler remains an unattainable ideal. Everything in this world has its price, and the price of this opportunity is the need to guarantee those very default values ourselves, whereas with a direct function call they were guaranteed by the compiler in one way or another.

Of course, if we write a function that sets default values and call it for the control structure we plan to use, there will be no problems. But what if we forget to do so? At best we get something slightly different from what we expected; in a moderately bad case we get something not at all what we expected; in a very bad case we get it so wrong that we damage the hardware; and in the worst case we get all of the above intermittently, that is, in rare and poorly reproducible situations. All this is to show that the problem is by no means contrived.

Again, if we used C++, the constructor would be the natural answer to our aspirations, and if we also bolted on smart pointers, the solution would be close to ideal, but we still remain within pure C, since this is the way of the samurai (the Continuous Path to Segmentation Fault).

Let me state my position on one issue related to the control structure, namely the format of the structure itself and the way its members are changed.
I will say right away that I am personally against giving the user information about the internals of a library in general and about its control structures in particular. The reasons for this deviant behavior lie far from the principles of OOP; they come from my cursed past and were laid down in the distant days when computers were large and RAM was small, and a large symbol table could really slow compilation. That is why, in my time, I accepted the concepts of encapsulation and interface quite calmly, without inner resistance, although they were created for reasons quite different from saving memory at compile time.
By the way, I recommend the book “Programming for Mathematicians”, based on a course taught (at one time; I do not know how it is now) at the VMK, which describes the principles of OOP perfectly, from the standpoint of operations and without ever using the term.

That my behavior is a deviation, and a deviation from the mainstream at that, is confirmed by studying the sources of well-known software packages, including those from STM and TI. For reasons I cannot fathom, the authors of these packages believe that everyone should know everything about everything, which they achieve by nested inclusion of header files and include guards. That is, if the IRPS module suddenly cannot find out the bit layout of the USB host control register, it will “suffer, wither, and even ... uh ... die”.

A small digression on this topic: I really do consider the use of the include directive inside .h files to be evil, and I regard the recommendation to put a conditional macro at the top of a header to guard against repeated inclusion as advice on how to reduce the harm from smoking. The correct and simple solution, not to smoke, is not even considered by the authors of such recommendations. In other words, it is assumed a priori that the average embedded programmer (or even an above-average one, since we are talking about the products of well-known companies) is by definition incapable of thinking through the architecture of a software package, determining the relationships between modules and building their hierarchy, so all we can talk about is minimizing the harm from clumsy hands.
Well, I do not know how true that is, but for me personally such nested header files cast doubt on their creator's ability to write good (reliable and convenient) code. But this is a personal opinion, expressed in a private conversation, and, as they say, "I did not know it could be done that way."

So they (STM and TI) may do as they please, and I will keep to the principle "The less you know, the better you sleep," or, in another wording, "What you do not know cannot worry you." Therefore I believe the user of a library has no need for information about its internals, although within C we are obliged to provide it (and how nice it was in Turbo Pascal with its Unit concept; but why dream of the unattainable, we have already been told that there will be no modules in C++17). So somewhere we will have a declaration like

typedef struct {
	UARTSpeedT UARTSpeed;
…
} UARTConfigT;

and its appearance is as inevitable as the victory of communist labor. But no one forces us to take another step along the road to the quagmire and set the values of control information in the style

UARTConfigT UARTConfig;
UARTConfig.UARTSpeed=UARTSpeed9600;

because the user is not at all interested in our particular fields; he just needs confidence that what is needed will happen. Therefore some kind of setter seems preferable, and whether it is implemented as a separate function or as a member function remains a matter of taste, as shown in the following code fragment

void UARTSetSpeed(UARTConfigT *UARTConfig, UARTSpeedT UARTSpeed);
UARTSetSpeed(&UARTConfig, UARTSpeed9600);

I apologize for the heavy function naming style, but if an extra 20 characters in a name save you half an hour of debugging, you have won.
This approach, besides following the principles of OOP, has a utilitarian value: if we used C++ (but we do not, remember?), we could write one overloaded function and use it to set various parameters in the style

void UARTSetParam(UARTConfigT *UARTConfig, UARTSpeedT UartSpeed);
void UARTSetParam(UARTConfigT *UARTConfig, UARTParityT UartParity);

and so on, which reduces the load on the user's brain by cutting down the number of function names to remember.
Of course, we must understand that TANSTAAFL (a special hello to those who appreciate Heinlein): calling a setter takes more execution time and more code memory than assigning a value to the field directly (although for inline functions even this statement is doubtful), but from my point of view the advantages outweigh the costs.
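To illustrate the remark about inline functions, here is a small sketch in C99. `UARTSetSpeed` follows the prototype shown earlier; `UARTSetParity`, `UARTConfigSetDefaults` and the contents of the enumerations are hypothetical additions for the example.

```c
typedef enum { UARTSpeed4800, UARTSpeed9600 } UARTSpeedT;
typedef enum { UARTParityNone, UARTParityEven, UARTParityOdd } UARTParityT;

typedef struct {
    UARTSpeedT  UARTSpeed;
    UARTParityT UARTParity;
} UARTConfigT;

/* Hypothetical helper: puts the structure into the standard state. */
static inline void UARTConfigSetDefaults(UARTConfigT *UARTConfig)
{
    UARTConfig->UARTSpeed  = UARTSpeed9600;
    UARTConfig->UARTParity = UARTParityNone;
}

/* Setters declared static inline: a compiler is free to reduce each call
   to the same single store that a direct field assignment would produce. */
static inline void UARTSetSpeed(UARTConfigT *UARTConfig, UARTSpeedT UARTSpeed)
{
    UARTConfig->UARTSpeed = UARTSpeed;
}

static inline void UARTSetParity(UARTConfigT *UARTConfig, UARTParityT UARTParity)
{
    UARTConfig->UARTParity = UARTParity;
}
```

Whether the call really disappears depends on the compiler and the optimization level, so the generated code is worth checking if the cost matters.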

Another interesting aspect is that the user does not need our control structure and should not know about it (whatever he may think on the subject), so it would be nice to make it invisible to him. A global static variable would do here, bearing in mind that it imposes certain restrictions on how it may be used, or an anonymous instance, but more on that later.

So, with any method of setting individual fields, the question arises of the fields that are not set explicitly and are therefore left at their defaults. Two aspects must be distinguished here: guaranteeing that the fields contain no garbage, and values left over from a previous use of the structure. The first can still be fought with initialized variables, but the second cannot be solved without an explicit, direct call to an initialization function, not even by constructors (which we do not have anyway).

As for the first initialization, it is either direct initialization at the point where the structure is declared, which is in principle correct and permissible because it is under our control, or placing the structure among the global variables, which guarantees that it is zeroed. I consider the latter unacceptable in principle, since it is not under our control and imposes a severe restriction on the default values: they must all be equal to 0. In any case, neither technique solves the problem of repeated use, so even initialization at the point of declaration can only be seen as a demonstration of good style, not as a solution.
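As a sketch of initialization at the point of declaration, C99 designated initializers can be used. The `UART_CONFIG_DEFAULTS` macro, the parity member and the demo function are assumptions for illustration; the enumeration is deliberately ordered so that the default speed is not the zero value.

```c
typedef enum { UARTSpeed4800, UARTSpeed9600 } UARTSpeedT;   /* 9600 is NOT 0 */
typedef enum { UARTParityNone, UARTParityEven, UARTParityOdd } UARTParityT;

typedef struct {
    UARTSpeedT  UARTSpeed;
    UARTParityT UARTParity;
} UARTConfigT;

/* Hypothetical macro: the defaults live in one place, next to the type. */
#define UART_CONFIG_DEFAULTS { .UARTSpeed = UARTSpeed9600, .UARTParity = UARTParityNone }

/* Returns 1 if the explicitly initialized structure holds the expected
   values while a merely zeroed one silently selects another speed. */
int UARTInitAtDeclarationDemo(void)
{
    UARTConfigT explicitConfig = UART_CONFIG_DEFAULTS;
    static UARTConfigT zeroedConfig;          /* static storage: all zero */

    explicitConfig.UARTParity = UARTParityEven;   /* change one field only */

    return explicitConfig.UARTSpeed == UARTSpeed9600
        && explicitConfig.UARTParity == UARTParityEven
        && zeroedConfig.UARTSpeed == UARTSpeed4800;
}
```

The zeroed instance shows why relying on zero fill constrains the defaults: it works only if every default happens to be encoded as 0.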

So what solution do I consider acceptable, after trashing everything else? It is a comprehensive one: it provides default values, lets you change parameters selectively, rules out user errors, lowers blood cholesterol and does a bunch of other useful and necessary things, as befits a comprehensive solution. Another important property is that it is written in pure C, so the path of bushido has suffered no damage.

After all, you can talk as much as you like about the poor properties of Japanese steel and the impossibility of classic blade-on-blade fencing, but it is very beautiful to settle a fight with a single blow that begins with drawing the blade from the scabbard and ends with returning it, all in one motion, shaking the unlucky opponent's blood onto the road. As for the critics, I recently read a wonderful phrase: “an impotent's criticism of Don Juan may be objectively fair, but it still leaves an unpleasant aftertaste.”
Honestly, I am very tempted to stop the post right here and leave the reader in utter bewilderment and disappointment, but I will put him through further trials instead and let him experience another kind of disappointment, when the solution described so beautifully turns out to be awkward and inconvenient.

So there it is.

UARTConfigT UARTConfigInit(void) {
	UARTConfigT UARTConfig;
	UARTConfig.UARTSpeed=UARTSpeed9600;
	return UARTConfig;
}

Yes, this really is allowed: in C a function can return any type except an array, and it can return a structure containing an array, which surprises me slightly, but apparently Kernighan and Ritchie had reasons for this decision; it is a pity they are not clear to me. In any case nothing bad can happen: this solution is perfectly reliable and conforms to the language standard. But that is only the initialization procedure; how do we set the parameters that matter? The option with an intermediate variable is rejected with indignation, because it does not rule out user errors, so instead we build chained calls in the following style:

UARTConfigT UARTConfigSpeed(UARTSpeedT UARTSpeed,UARTConfigT UARTConfig) {
	UARTConfig.UARTSpeed=UARTSpeed;
	return UARTConfig;
}

Note the order of the parameters; the advantage of this choice is visible in the line

UARTConfigSpeed(UARTSpeed4800,UARTConfigParity(UARTParityEven,UARTConfigInit()));

where the parameter value comes immediately after the function name, which is easier to read than the following expression

UARTConfigSpeed(UARTConfigParity(UARTConfigInit(),UARTParityEven),UARTSpeed4800);

Those who programmed in TurboVision will recognize this unforgettable style with a heap of closing brackets at the end, but nobody forces us to build a single-line expression, and the alternative already looks less nightmarish

UARTConfigSpeed(UARTSpeed4800,
 UARTConfigParity(UARTParityEven,
  UARTConfigInit()
 )
);

but this is already a matter of taste and is not subject to discussion by definition - they do not argue about tastes.

What are the advantages of this option? It rules out the very possibility of skipping the initialization of the control structure; it spares the user any acquaintance with that structure, since it is anonymous; it checks the function parameters (thanks to the enumerated types); and it can even catch one more possible mistake of ours, when we did everything right but forgot to actually use the structure (we remember that it is human nature to make mistakes). Indeed, we have lost sight of the fact that all our manipulations have not yet configured the IRPS itself, and we still need a function

int UARTConfigUse(UARTConfigT UARTConfig) {
 return DO_something(); // the actual configuration, which may fail
}

and our example in the final form will look like

UARTConfigUse(
 UARTConfigSpeed(UARTSpeed4800,
  UARTConfigParity(UARTParityEven,
   UARTConfigInit()
  )
 )
);
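Put together, the by-value chain can be sketched as a compilable fragment. The parity type, the `UARTConfigParity` function and the `UARTApplied` variable that stands in for the hardware registers (in place of `DO_something()`) are assumptions added to make the example self-contained.

```c
typedef enum { UARTSpeed4800 = 4800, UARTSpeed9600 = 9600 } UARTSpeedT;
typedef enum { UARTParityNone, UARTParityEven, UARTParityOdd } UARTParityT;

typedef struct {
    UARTSpeedT  UARTSpeed;
    UARTParityT UARTParity;
} UARTConfigT;

/* Stand-in for the peripheral: records what was finally applied. */
static UARTConfigT UARTApplied;

/* Produces a structure with every field set to its default. */
UARTConfigT UARTConfigInit(void)
{
    UARTConfigT UARTConfig;
    UARTConfig.UARTSpeed  = UARTSpeed9600;
    UARTConfig.UARTParity = UARTParityNone;
    return UARTConfig;
}

/* Each step takes the structure by value, changes one field, returns it. */
UARTConfigT UARTConfigSpeed(UARTSpeedT UARTSpeed, UARTConfigT UARTConfig)
{
    UARTConfig.UARTSpeed = UARTSpeed;
    return UARTConfig;
}

UARTConfigT UARTConfigParity(UARTParityT UARTParity, UARTConfigT UARTConfig)
{
    UARTConfig.UARTParity = UARTParity;
    return UARTConfig;
}

/* Applies the finished configuration; 0 means success in this sketch. */
int UARTConfigUse(UARTConfigT UARTConfig)
{
    UARTApplied = UARTConfig;
    return 0;
}

int UARTChainDemo(void)
{
    return UARTConfigUse(
        UARTConfigSpeed(UARTSpeed4800,
            UARTConfigParity(UARTParityEven,
                UARTConfigInit())));
}
```

The only way to obtain a `UARTConfigT` for the innermost argument is to call `UARTConfigInit()`, which is exactly what makes skipping the defaults impossible.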

As the cherry on the cake, let us show that the last mistake, forgetting to use the structure we have built, can be caught as well. Unfortunately, the mechanism is not part of the standard language and exists only in the GCC family: it is a warning about an ignored return value, for which the initialization and setting functions must be declared with __attribute__((warn_unused_result)). Whether this works for you depends on the developers of your compiler; for example, it worked in KEIL but not in IAR (this is in no way meant to belittle IAR, which I like and use, but facts are facts). There are other solutions to this problem, based on macros and an intermediate compile-time variable, but unfortunately they do not guarantee the result.
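A hedged sketch of how the attribute might be attached follows. The `UART_MUST_USE` macro is a hypothetical wrapper that expands to nothing on compilers that do not define `__GNUC__`; the attribute itself is the documented GCC one (C23 later standardized a similar `[[nodiscard]]`).

```c
#if defined(__GNUC__)
#define UART_MUST_USE __attribute__((warn_unused_result))
#else
#define UART_MUST_USE   /* no equivalent assumed on other compilers */
#endif

typedef enum { UARTSpeed4800 = 4800, UARTSpeed9600 = 9600 } UARTSpeedT;
typedef struct { UARTSpeedT UARTSpeed; } UARTConfigT;

/* The attribute goes on the declarations of the chain-building functions. */
UART_MUST_USE UARTConfigT UARTConfigInit(void);
UART_MUST_USE UARTConfigT UARTConfigSpeed(UARTSpeedT UARTSpeed, UARTConfigT UARTConfig);
int UARTConfigUse(UARTConfigT UARTConfig);

UARTConfigT UARTConfigInit(void)
{
    UARTConfigT UARTConfig;
    UARTConfig.UARTSpeed = UARTSpeed9600;
    return UARTConfig;
}

UARTConfigT UARTConfigSpeed(UARTSpeedT UARTSpeed, UARTConfigT UARTConfig)
{
    UARTConfig.UARTSpeed = UARTSpeed;
    return UARTConfig;
}

int UARTConfigUse(UARTConfigT UARTConfig)
{
    /* Stand-in for the real setup: report the speed that would be applied. */
    return (int)UARTConfig.UARTSpeed;
}

int UARTMustUseDemo(void)
{
    /* Writing only "UARTConfigSpeed(UARTSpeed4800, UARTConfigInit());" as a
       statement would discard the result, which is the case GCC reports. */
    return UARTConfigUse(UARTConfigSpeed(UARTSpeed4800, UARTConfigInit()));
}
```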

Why can I personally not consider this setup method ideal, although it solves every problem of expressiveness and reliability? Solely because of its poor efficiency at run time or, to put it simply, its exceptional appetite for time.
The moment has come to answer the question posed in the epigraph, as applied to a function that returns a result of a non-primitive type. After all, we cannot return a pointer to our local variable, because we would get a compiler warning (and if warnings were disabled, well, then we could, yes).
How does the compiler solve this problem? I do not know how it is supposed to be done; I have not read the C standard (yes, that is so, and I am not ashamed to admit it), and I am not sure the implementation details are described there. For IAR, however, I simply looked at the generated assembly and saw the following mechanism. On entry to a function (main included) that calls a function returning a non-primitive type, enough stack space is reserved to hold a variable of that type. The initialization function then copies its local variable bit for bit onto the stack below its return address, so that after it finishes the returned value lies at the top of the stack. Before the function that changes a parameter is called, this value is copied onto the stack again and passed to it as an argument.
In other words, a chunk of data (possibly a very large one) is constantly dragged back and forth between the reserved memory area (on the stack, by the way) and the top of the stack, and of course this does not make for fast operation.

How can we speed the functions up? By changing the type of the return value and returning a pointer, which is cheap to copy. But then the pointer produced by the initialization function must point to something, and we run the risk of falling into bad recursion. Creating a temporary object and destroying it later in the use function would be nice, but we cannot use dynamic objects for religious reasons (only static, only hardcore).

Oddly enough, returning a pointer to a local variable does work if all the configuration functions take parameters of the same size (as long as the data left beyond the top of the stack is not overwritten), but it still leaves the impression of a dance with sabres (real sabres, that is; it is a very risky trick). Particular thrills await you with this solution if interrupts are enabled.

We could use a local structure, but then it acquires a name and caring for it is shifted onto the fragile shoulders of the user. Therefore, however we resist fate, a global control structure with a pointer to it seems to be the only acceptable and reliable option. You may ask why we then pass the pointer from function to function if it never changes. Because this is the only way to guarantee that the initialization function is called. We could pass something other than a pointer, something unrelated to our structure, an arbitrary integer for example, but passing the pointer forces the user to call the initialization function for exactly the type we plan to work with, so the result type is left as a pointer to the structure.

Taking into account the above considerations, we obtain the following solution modification:

typedef UARTConfigT *UARTConfigPT;
UARTConfigPT UARTConfigInit(void) {
	static UARTConfigT UARTConfig;
	UARTConfig.UARTSpeed=UARTSpeed9600;
	return &UARTConfig;
}
UARTConfigPT UARTConfigSpeed(UARTSpeedT UARTSpeed,UARTConfigPT UARTConfigP) {
	UARTConfigP->UARTSpeed=UARTSpeed;
	return UARTConfigP;
}
int UARTConfigUse(UARTConfigPT UARTConfigP) {
 return DO_something(); // the actual configuration, which may fail
}

but the application remains unchanged.
Performance improves significantly; the price is a small amount of memory reserved forever for the control structure, which is no longer needed once the hardware has been initialized. If we really need to switch hardware operating modes quickly, whether such a solution is applicable is open to discussion, but then the use of standard (universal and therefore inefficient) libraries in general is a big question; most likely you will have to do it by hand.
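One restriction of the static-structure variant deserves a small demonstration: every chain shares the same object, so a chain must be finished (passed to the use function) before the next one is started. The sketch below repeats the pointer-based functions with a hypothetical `UARTAliasDemo` helper added.

```c
typedef enum { UARTSpeed4800 = 4800, UARTSpeed9600 = 9600 } UARTSpeedT;
typedef struct { UARTSpeedT UARTSpeed; } UARTConfigT;
typedef UARTConfigT *UARTConfigPT;

UARTConfigPT UARTConfigInit(void)
{
    static UARTConfigT UARTConfig;          /* one instance for everybody */
    UARTConfig.UARTSpeed = UARTSpeed9600;
    return &UARTConfig;
}

UARTConfigPT UARTConfigSpeed(UARTSpeedT UARTSpeed, UARTConfigPT UARTConfigP)
{
    UARTConfigP->UARTSpeed = UARTSpeed;
    return UARTConfigP;
}

/* Returns 1 when the second Init call has clobbered the first chain. */
int UARTAliasDemo(void)
{
    UARTConfigPT first  = UARTConfigSpeed(UARTSpeed4800, UARTConfigInit());
    UARTConfigPT second = UARTConfigInit();  /* resets the shared object */

    return first == second && first->UARTSpeed == UARTSpeed9600;
}
```

For the same reason the chain is not reentrant: building a configuration in an interrupt handler while the main code is in the middle of its own chain would corrupt both.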

By the way, an ideal solution, fast and costing no permanent memory, would be to take the address of the structure returned by the initialization function and then pass it down the chain, but the compiler brutally clipped the wings of my fancy, saying that the & operator applies only to an lvalue. In general I understand what this means; I still do not understand why they are so cruel to me.
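For what it is worth, C99 offers one construct that comes close to this wish: a compound literal is an lvalue, so its address may be taken, and inside a function it lives until the end of the enclosing block. The sketch below is not the author's method but a swapped-in technique; `UARTConfigInitIn`, the macro form of `UARTConfigInit` and `UARTApplied` are hypothetical names.

```c
typedef enum { UARTSpeed4800 = 4800, UARTSpeed9600 = 9600 } UARTSpeedT;
typedef struct { UARTSpeedT UARTSpeed; } UARTConfigT;
typedef UARTConfigT *UARTConfigPT;

static UARTConfigT UARTApplied;   /* stand-in for the hardware registers */

/* Fills caller-provided storage with the defaults and returns it. */
UARTConfigPT UARTConfigInitIn(UARTConfigPT UARTConfigP)
{
    UARTConfigP->UARTSpeed = UARTSpeed9600;
    return UARTConfigP;
}

/* The macro creates an unnamed automatic object at the call site, so
   nothing is reserved permanently and the user never names the structure. */
#define UARTConfigInit() UARTConfigInitIn(&(UARTConfigT){ 0 })

UARTConfigPT UARTConfigSpeed(UARTSpeedT UARTSpeed, UARTConfigPT UARTConfigP)
{
    UARTConfigP->UARTSpeed = UARTSpeed;
    return UARTConfigP;
}

int UARTConfigUse(UARTConfigPT UARTConfigP)
{
    UARTApplied = *UARTConfigP;
    return 0;
}

int UARTLiteralDemo(void)
{
    return UARTConfigUse(UARTConfigSpeed(UARTSpeed4800, UARTConfigInit()));
}
```

The price is that the structure type must be visible wherever the macro is used, and the pointer must not outlive the block in which the chain was written.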

This came out rather long for a post on such a simple topic, but the digressions took up a lot of space; I hope they will give pleasure to the thoughtful reader and prompt comments and discussion to entertain the most esteemed public.

Tags:

programming
