# To the question of speed and measuring it in Arduino

This problem arose in the study of the speed of the work of Arduino when executing various commands (more on this in a separate post). In the course of the study, doubts arose about the constancy of the operation time of individual teams when the value of the operands changed (as it turned out, not unreasonable) and it was decided to try to estimate the execution time of a particular command. For this, a small program was written (who said the sketch was to leave the class), which, at first glance, confirmed the hypothesis. In the output, you can observe the values ​​of 16 and 20, but sometimes there are 28 and even 32 microseconds. If you multiply the data by 16 (clock frequency MK), we get the execution time in clock ticks MK (from 256 to 512). Unfortunately, the repeated run of the main program cycle (with the same initial data), while maintaining the overall picture, gives a different distribution of the execution time, so that the time variations that actually take place are not related to the original data. The original hypothesis has been refuted, but it becomes interesting, and with what exactly is such a significant scatter associated.

The necessary note - I understand perfectly well that to measure the execution time of commands you should use more complex programs, but for a rough estimate it is quite enough and such that will be demonstrated later.

So, time is changing, and very significantly, looking for the causes of this phenomenon. First of all, we pay attention to the multiplicity of the obtained values, look at the description of the work with the time library and see that 4 microsec is a measurement quantum, therefore it’s better to go over to the quanta and understand that we get 4 or 5 (quite often) and 6 or 7 or 8 (very rarely) units of measurement. With the first half, everything is easy - if the measured value lies between 4 and 5 units, then the spread becomes inevitable. Moreover, considering the counts to be independent, we can increase the measurement accuracy by statistical methods, which we do, obtaining acceptable results.

But with the second half (6,7,8) things are worse. We found that the spread does not correlate with the source data, which means that this is a manifestation of other processes that affect the execution time of commands. Note that emissions are quite rare and are not significant for the calculated average value. It would be possible to ignore them altogether, but this is not our style. In general, over the years in engineering, I realized that it was impossible to leave neponyatki, no matter how insignificant they seemed, because they have the disgusting property of beating in the back (well, or even reaching out) at the most inopportune moment.

Begin to put forward a hypothesis 1- the most convenient (in terms of convenience and versatility, it is second only to the direct intervention of the Creator) - software glitches, of course, not mine, my programs never buggy, but plug-in libraries (compiler, operating system, browser, etc. - substitute ). Moreover, since I run the program in the emulator on www.tinkercad.com , then we can still refer to the emulator bugs and close the topic, because the source code is not available to us. Cons of this hypothesis:

1. From cycle to cycle, the location of deviations varies, which hints.
2. This site still supports AutoDesk, although the argument is weak.
3. "We accepted the postulate that what is happening is not a hallucination, otherwise it would be just not interesting."

The following assumption is the effect of some background processes on the measurement result. It seems to be doing nothing, except as we believe, although ... we display the results in Serial. There is a hypothesis 2 - the time of withdrawal is sometimes (strangely like that ... but everything happens) is added to the command execution time. Although it is doubtful how much of that output there is, but all the same - we add Flush and it did not help, we add a delay to the conclusion of the output and it didn’t help, we generally bring the output out of the cycle - still time jumps - this is definitely not Serial.

Okay, what remains is the organization of the cycle itself (with what a fright it is to change its duration, it is not clear) and that’s all ... although micros () remained. I meant that the execution time of the first call of this function and the second one is the same when subtracting these two values, I get zero, but if this assumption is wrong?

Hypothesis 3 - sometimes the second time reference is performed longer than the first one, or the actions related to time measurement sometimes affect the result. We look at the source code of the function of working with time (arduino-1.8.4 \ hardware \ arduino \ avr \ cores \ arduino \ wiring.c - I have repeatedly expressed my attitude to such things, I will not repeat) and see that 1 time out of 256 hardware increment cycles of the lower part of the counter are interrupted to increment the higher part of the counter.

Our cycle execution time is from 4 to 5, so you can expect 170 * (4..5) / 256 = from three to four anomalous values ​​on a segment of 170 measurements. We look - it is very similar, there are really 4 of them. To separate the first and second reasons, we do the calculations with a critical section with forbidden interrupts. The result does not change much, outliers still take place, which means that extra time introduces a call to micros (). Here we cannot do anything, although the source code is available, but we cannot change it - libraries are included in binary. Of course, we can write our own functions of working with time and watch their behavior, but there is an easier way.

If the possible reason for the increase in duration is the "long" interrupt processing, we exclude the possibility of its occurrence in the measurement process. To do this, wait for its manifestation, and only then do a measurement cycle. Since an interrupt occurs much less frequently than our measurement cycle lasts, its absence can be guaranteed. We write the corresponding program fragment (using dirty hacks information extracted from the source code) and “this is such street magic”, everything becomes normal - we measure the execution time of 4 and 5 quanta with an average value of the execution time of the addition operation with PT in 166 cycles, corresponds to the previously measured value. The hypothesis can be considered confirmed.

One more question remains - what is being done in interruptions so long that it takes
(7.8) - (5) ~ 2 quantum = * 4 = 8 μs * 16 = 128 processor cycles? We turn to the source code (that is, the assembler code generated by the compiler on godbolt.com) and see that the interrupt itself executes approximately 70 cycles, of which 60 are permanent, and when reading there are additional costs of 10 cycles, totaling 70 when hit on interruption - less than received, but close enough. The difference is attributed to the difference between compilers or modes of their use.

Well, now we can measure the actual execution time of the PT addition command with various arguments and make sure that it really changes greatly when the arguments change: from 136 cycles for 0.0 to 190 for 0.63 (magic number), and it is only 162 for 10.63. With a probability of 99.9%, this is due to the need for alignment and the peculiarities of its implementation in this particular library, but this study clearly goes beyond the limits of the problem under consideration.

Application - the text of the program:
``````voidsetup(){
Serial.begin(9600);
}
volatilefloat t; 	// так надоvoidloop(){
int d[170];
unsignedlong time,time1;
float dt=1/170.;
for (int i=0; i<170; ++i) {
{
// ждем переполнения счетчика и обработки прерывания
time1=micros();
long time2;
do { time2=micros(); }
while ((time2 & ~0xFF) == (time1 & ~0xFF));
};
/**/
time1=micros();	// засекаем время/*
cli();	// тут был вход в критическую секцию - не помогло
*/
t=10.63;	// начальное значение для операции
t=t+dt;		// измеряемая операция/*
sei();	// завершение критической секции
*/
time = micros();	// время окончания
time1=time-time1;
d[i]=time1/4;
/*
Serial.print(time1);   // вот тут результаты и прыгали
Serial.flush(); 	     // не помогло убрать выбросы
Delay(20);		     // тоже не помогло
*/
};
// выводим запомненные результаты, считаем и выводим среднееfloat sum=0;
for (int i=0; i<170; ++i) {
sum+=d[i];
Serial.println(d[i]);
};
Serial.println((sum/170-2.11)*4*16); 	//2.11 – получается при пустой операции
Serial.flush();	// здесь ставим точку останова, чтобы посмотреть графики вывода
}
``````