# On the issue of division and TI

### “Don't show off, Maria Ivanovna, just listen to your favorite song ‘Valenki’”

Despite the title, the order of presentation will be reversed: first about Texas Instruments (of course, not about the company itself; I am an engineer, not a business analyst), and only then about division.

Part one of the Marlezon Ballet.

The subject under consideration is the relatively new CC13xx family (CC1310 / CC1350 / CC1352), but it serves only as a starting point for discussing the state of affairs in embedded-systems programming. The MCUs in question are intended for designing devices with various kinds of wireless interfaces (I am not fond of the newfangled term IoT, all the more so since it does not exhaust the possible applications of this family).

The MCU is built around a Cortex-M3 core with quite acceptable, though not record-breaking, clock frequency and amounts of program and data memory, and it has a sufficient set of interfaces, but none of that is the interesting part. The distinctive feature is that the chip contains three processor cores: one central and two peripheral, one for working with external devices and one for communicating over the air. What motivated such a decision?

First of all, the desire to minimize energy consumption. The traditional way to reduce consumption, performing resource-intensive calculations in a relatively short burst and then dropping into a standby mode at a reduced clock frequency, has natural limits: a fast core still consumes comparatively much at a low frequency while failing to provide the required response time to external events. To resolve this contradiction, a second core, the sensor controller, was introduced; it handles interaction with the outside world in the low-power mode and wakes the main core only when resource-intensive computation is needed.

The third controller solves (albeit in a rather roundabout way) the same problem of reducing consumption. Maintaining radio protocols often means meeting very tight timing constraints, and trying to satisfy them on the central core (simultaneously with running the application) would require raising the core clock frequency and, accordingly, the power drawn from the supply. Splitting the functions removes this contradiction (a typical TRIZ-style solution).

I am not at all sure that such a division of functions was strictly necessary, or that the required parameters could not have been achieved with simpler architectural solutions, but if the price stays within reasonable limits and the needed capabilities are delivered, then why not. Besides, this post is not really about the hardware side of the MCU; that was just to set the scene. We will look at the process of creating software for this class of MCU.

Let's start with the main core. Everything here is standard: the Cortex-M3 core itself is well known, the toolchain is gcc, and the company recommends two IDEs, CCS and IAR. I have worked a great deal with the latter, so this time I decided to see what kind of product CCS is.

The only thing I would immediately like to express bewilderment about: the company kindly offers additional utilities for working with this MCU (for generating a firmware image for over-the-air loading and for transferring it to the MCU), but for some reason they are written not in Java, which underlies the programming environment and whose runtime is included in the installation package, but in Python. Not that I strongly dislike the latter (although I do not accept structuring a program with indentation, but in the end tastes differ); it is just unclear why to drag in clearly unnecessary entities. And the utilities themselves are nothing complicated.

The second piece of the puzzle: Eclipse itself is popular largely thanks to how easily plug-ins can be embedded in it. Given that fact, the decision of the environment's developers looks extremely mysterious: the user is forced to invoke these utilities by hand from the command line, first manually disconnecting the terminal in the development environment, launching the program to load data onto the MCU, and then manually reconnecting the terminal. Perhaps to whoever wrote this, the solution seemed the only possible and completely justified one, but a large firm could afford to hire more qualified staff; then again, maybe I am missing something.

Next, programming the sensor controller (the name is not great, but it is a direct calque) is done with a separate product, Sensor Controller Studio. Immediately another question: why a separate product is needed at all. Not that switching windows is very hard, but still... especially since the generated code eventually becomes part of the code for the main MCU (of course, it is not executed there, but it is included in the common address space of the program memory).

But here is another oddity: we are told nothing about this core (about its architecture), except that it is 16-bit, low-power, and proprietary. In principle that is fine, it works and works well, but an aftertaste remains. A description of this core's instruction set is provided, from which one may guess that it is a modification of the MSP430, albeit with features such as looping instructions. The placement of the peripheral registers in the common and local address spaces is given, but then the oddities begin again: many registers, including peripheral ones, are accompanied by the phrase "used only by the TI library", and for some registers the bit-field assignments are given while for others they are not. Not that these register descriptions hurt me much, but why publish them in such a half-finished form?

The peripherals of this core can be accessed from the main core through special libraries; code can be written in the same programming environment, again via certain plug-ins. Everything here is fine: the documentation is quite sufficient, the settings are convenient, there is a graphical representation of the configuration, and there is built-in debugging in a special mode (in the normal mode the debugger is occupied by the main core). Altogether quite decent.

The next part is the core for working with the radio, built on an M0, which executes its program from its own ROM (most likely part of the non-volatile memory; it is unlikely that a modern chip carries a mask ROM) that cannot be modified by the user (at least, the documentation says not a word about it). Information on the internal structure of the radio path is extremely scarce; in fact, it can be extracted only from the descriptions of the mode-setting commands, but an ordinary embedded developer does not need it anyway.

The interaction between the main core and the radio part is built on a message flow, which is documented in sufficient detail. And that documentation is quite enough to understand a simple thing: you will not (well, I certainly will not) build your own inter-core interaction mechanisms, let alone the communication protocols. You will use the library implemented by the company, since the interaction is rather complicated and its design has to take a large number of factors into account, so the benefit of writing your own stack would not compensate for the cost of creating it. Therefore, we again use a library sublayer to organize the interaction of the main core with the radio core and, most likely, ready-made solutions for a standardized radio channel built from ready-made modules, limiting ourselves to parametrizing the channel.

I think it has become clear to the reader that writing a full-fledged program for this MCU is not an easy task. It is quite within reach for an advanced professional, but what is an "ordinary" developer to do (as Oleg Artamonov recently put it, "Have you realized yet that you are in the wrong profession?"). The company has taken care of this case too and ships, along with the development environment, a large package of example programs for all occasions called SimpleLink. Moreover, the solutions come in two flavors: using the real-time operating system TI-RTOS (in my opinion the more convenient way) and as a super-loop (in case you are among those who hate embedded OSes so much you "can't even eat"). I am quite comfortable with an RTOS on an MCU, so I use the first option and advise you to do the same.

My personal attitude toward this package is ambivalent. On the one hand, it is a wonderful undertaking that really does drastically simplify using the MCU (which is why I use it). On the other hand, it is "trash, waste, and sodomy", a great illustration of the phrase "If the cost of good architecture seems high to you, think about the cost of bad architecture" (which is why I condemn it). It is not just the distribution of functions across modules and the relationships between them (although not everything is smooth there), but also the distribution of modules across files, the file names, the distribution of files across directories, their names and structure, and so on. Violating the principles of KISS and DRY is almost a rule for the package's developers, but if you do not dig into its source (I cannot rid myself of this silly habit) and use it "as is", life is quite bearable.

But now we can smoothly move on to the main point of this post (better late than never). I have always believed that it is extremely difficult to write a framework that combines true versatility with high efficiency. The design examples offered by the company constitute exactly such a framework with the emphasis on versatility: application configuration tools are almost completely absent, and everything is done at the level of editing the source code. Take one of the many examples, change a small piece related to measurement and analysis (and to transmission, of course), and everything is ready. Of course, the resulting code will generally be quite bloated, but the company offers various examples for different application conditions, and you only need to pick the one most suitable for your task. And just in case, a sufficiently large program memory is built into the MCU, so you do not have to think about saving that resource. Moreover, a significant part of the executive libraries is already hidden in ROM, and you just have to call them carefully.

Not that I am against such an approach or categorically insist on reinventing wheels. Code reuse is a categorical imperative and a guarantee of a programmer's high productivity, but only if the following condition is met: the subroutine library used should be:

- carefully designed,
- neatly programmed,
- exhaustively documented,
- universal,
- efficient.

And if the last two points are merely desirable (highly desirable, but nonetheless...), the first three are mandatory.

So how does the SimpleLink package offered by the company measure up against these requirements? Below are grades on a five-point scale, formed from a fairly superficial acquaintance with the package.

1a) The links between modules should be thought out, the competence of each module clearly delimited to eliminate duplication of functions, and the interfaces well developed: a solid four; on the whole, the job has been done.

1b) Distribution of modules across files with a well-thought-out directory structure: more like a three plus; the repeated duplication of module source text across different files alone is worth that much.

2. A package should not contain hard-to-detect, rarely manifesting errors (that it should not contain constantly manifesting ones is obvious). It is difficult to give grades here, but I will note one unpleasant circumstance: most functions can be called both from ordinary program memory and from ROM, and while the source code of the first variant is available and can be verified, things are much worse with the second: we are given no reference to its source, nor any statement of its identity with the first variant, so verification is out of the question.

3. Documentation: a solid four. Even the fact that the authors did not resort to the "power and expressiveness" of Doxygen for the documentation adds at least a point; there is a system of contextual links, and the principles of operation are described. I withhold the five simply because I never give a five for documentation, not even to myself.

4. No more than a four, owing to the lack of advanced configuration, but I have already mentioned that.

5. Hard to say. I usually judge by the SPI implementation: on the one hand, it is simple enough that the absence of the standard pitfalls can be appreciated; on the other hand, it is complex enough that there is somewhere to plant them. So, this package does contain the reference, inefficient byte-by-byte read/write routines, but if you dive into the depths, you can find the actually used variants based on DMA; about those I cannot say anything yet.

A note about the depths: they are really deep (4-5 nested function calls), and here I cannot fail to mention one feature of the package: it is written in C. No, I did not forget to append two pluses to the letter; they really are not there. For those in the know, much becomes clear at this point; everyone else I invite to undergo the fascinating quest of determining the set of functions executed at the hardware level when a non-trivial object is implemented. Of course, with classes this task would become trivial, but that is not the way of the true Jedi. I understand this is forced consideration for users who spurn the advantages of C++, but then why stop there: what about the unfortunate assembler users, why have they been slighted?

And in conclusion, I want to emphasize, "so that I am understood correctly at the top": I am not condemning the MCU family, the development environment, or the software package at all; I merely want them to become even better, more convenient, and more attractive to the user. I have my own scores to settle with TI: I will never forgive them for swallowing National, or for the earlier acquisition of Luminary with the subsequent killing of an interesting MCU line (although in the latter case they punished themselves), or for refunding my money in 2014 for chips I had ordered (even though I personally had nothing to do with Crimea). But this deep feeling does not prevent me from being objective: they have done a good job. The concept the company proposes in the epigraph is not too close to my heart, but they are probably right, and for complex chips there is no other way. It is a trend, and fighting it is pointless.

And that it is indeed a trend is confirmed by the situation with the new power-management chips from Vicor. The chip itself is good (the opposite would be surprising from a reputable company) and the parameters are very good; I mention it only in connection with the section on choosing external components, specifically the inductor. In the documentation this section consists of a single paragraph, which names a specific inductor model from a specific manufacturer and states explicitly that other options are not considered; see the epigraph. While fully understanding the chip designer's reasons (the switching frequency is high, the currents are significant, and designing inductors for such conditions is not trivial), I must note that for now this design approach is unusual for me, though the key words here are "for now". Perhaps it, too, will become the norm.

Part two of the Marlezon Ballet.

And now about division, a use of which turned up in one of TI's libraries, although it is not directly related to the company: it is a feature of the gcc compiler. So, let us formulate a problem from the realm of address arithmetic: we need to compute the difference in indices (indices, not bytes) between two array elements (or simply sequentially placed data of the same type) given by pointers to them. As a rule, one of the elements is the first element of the array, but that is not essential.

The problem itself is simple, and the solution in C is obvious:

`(pointer1 - pointer0) / sizeof(type)`

The devil hides in the implementation: a division instruction is by no means standard equipment in common MCU architectures, so division is not fast. If the divisor is a variable, there is simply no good solution, but for a constant divisor there are options based on multiplication. They rest on the wonderful property of multiplication

$$(a + b)(c + d) = ac + ad + bc + bd$$

(distributivity, thank you, Captain Obvious), thanks to which hardware multiplication is much more common among implementations and fast enough (hardware multipliers are an interesting topic in themselves, but not today's). Unfortunately, division has no analogous property, which is why it is a rare guest among hardware implementations. So instead of dividing (by a constant, this is important) we want to multiply. Watch my hands:

$$a/c = a \cdot (1/c) = a \cdot \frac{N}{N \cdot c} = \frac{a \cdot (N/c)}{N}$$

where N is some additional constant. A logical question arises: what nonsense is this, now we have two divisions instead of one, plus a multiplication; where is the gain? The answer lies in a correctly chosen N: if it is a power of two, the division by N turns into a right shift, which is much cheaper, and if the exponent is a multiple of 8, the division turns into a mere renumbering of bytes, which costs nothing. Since the factor N/c is constant, we compute it in advance, and everything seems fine, except for one detail: the precision with which this factor is represented.

Consider the specific case of dividing by 10, which arises when converting numbers from binary to decimal. Taking N = 256, we expect a multiplier constant of 256/10 = 25.6, and rounding to an integer introduces an error: 25 (-2.3%) or 26 (+1.6%). Then, for example, 100 * 26 = 2600 = 256 * 10 + 40, and the high part of the result is 10, which matches the expected 100/10 = 10. One can calculate at which dividend values the result deviates from the correct one by one or more, but for that we would have to solve equations of the form

$$\lfloor n/10 \rfloor = \lfloor n \cdot k / N \rfloor + 1$$

(where the brackets denote the integer part, rounded down), and that is not a very transparent procedure; it is easier to run a numerical simulation and verify that the result is correct up to a certain n. One can introduce a corrective term and compensate for the loss of precision with the formula

$$(n \cdot 26 - n/2 + 1) / 256$$

and significantly expand the admissible range of dividend values (up to 256, that is, on an 8-bit unsigned integer we never err), but the fundamental drawback of the method cannot be eliminated: the presence of error. Besides, we get only a partial computation of the remainder; if it is needed (not our case), it can be obtained only by a separate operation, which hurts performance. In general, the method is quite workable, but rather limited in scope.

Therefore I was slightly surprised when, in the compiled code (thanks, as always, to the godbolt website for the opportunity it provides), I saw the address arithmetic done as multiplication by some (rather large) constant instead of division by a (very small) one. The next question was that the constant was nothing like the one computed for the method above. And finally, the result taken was not the upper half of the product, but the lower part. Altogether a somewhat strange method, some kind of nonsense, yet calculation shows that it works. A brief reflection reveals the secret, and the method turns out to be absolutely correct, but... (of course, the reader was expecting this "but...", since replacing division with multiplication in the general case is impossible) it has limitations.

Consider the magic number 143 and explore its amusing properties: 77 * 143 = 11011, and its low part, 011 = 11 = 77/7. It is easy to check that 784 * 143 = 112112, whose low part 112 = 784/7, and so on for all numbers not exceeding 999 * 7 = 6993. There it is, the natural limitation of the method; but by taking another magic number, 7143, we extend the range to 9999 * 7 = 69993. It is easy to find other magic numbers with similar magical properties.

How does one obtain these numbers? We want to find a number such that multiplying the dividend by it yields a result whose low part contains the quotient, in this case of division by 7. It sounds abstruse, but in fact everything is simple: for the input number 7 we want to get xxx001; suppose xxx = 001, and here it is, plain street magic: 1001/7 = 143. Obviously 143 * 7 = 1001, so for any number of the form n * 7 we have (n * 7) * 143 = n * (7 * 143) = n * 1001, and the low part of the result is n. QED.

Now the fundamental flaw of this method is visible: it works only for numbers that are exactly divisible by the divisor, and for all other inputs the multiplication gives a completely unpredictable result (in the sense of having nothing to do with the division). But for this particular application, where we subtract the addresses of array elements, the result is guaranteed to be a multiple of the element size, and we have every right to use the method. If we fed it the wrong numbers, the quotient should not interest us anyway: "a mill is a machine; if you pour stones into it, you will not get flour."

Finding magic numbers for divisors other than 7, as well as a proof of their existence, I leave to the inquisitive reader. It would also be interesting to plot the multipliers as a function of the divisor and examine the dips and peaks; their presence is probably connected somehow with number theory. For example, for the composite 21 = 3 * 7 this number is 381 (the minimal one, of course; the others are of no interest), which is clearly smaller than the product of the two magic numbers, 667 * 143 = 95381, and is not even a factor of it, as I naively expected, although 95381 does end in 381.

Another interesting fact is that the magic numbers differ between number systems. For example, in the decimal system there is no constant for dividing by 5, since no number of the form x...1 can have five as a divisor, and our method does not work. Yet in binary (hexadecimal) the problem is solvable, since 1024 + 1 = 1025 = 0x401 divides by 5 beautifully, with the result 205 = 0xCD, and our algorithm works again: for example, 130 * 205 = 0x82 * 0xCD = 0x681A, and 0x1A = 26 = 130/5. Moreover, one can prove (well, I believe so) that the problem of finding a multiplier is now solvable for any odd divisor, while any even divisor reduces to an odd one by a finite number of shifts (that I can definitely prove). What a convenient and useful thing the binary representation is; we are very lucky to have it.

P.S. My regular readers (I flatter myself with the hope that such exist) are perplexed: the post is coming to an end, and where is the "lament of Yaroslavna"? Do not worry, my dears, here it is.

The debug boards for the CC1350 and CC1352 use the same antenna-switch chip (used in different ways, but that does not matter), namely the XMSSJE3G0PA (do not try to read that aloud in Russian transcription; I am sure muRata meant nothing of the sort). So, a switch with modest characteristics: frequency band up to 2.7 GHz, insertion loss of 0.28 dB, isolation of 33 dB, switched power of 20 dBm. But all this is redeemed by two parameters: dimensions of 1.5 * 1.5 mm and a cost of $0.9, despite the "Silicon On Insulator" and gallium-arsenide technologies involved.

HOW do they do it, above all at that price? And secondly, why don't we do the same? A rhetorical question.
