Measuring Harmony - Sound Spectrum Analyzer on the STM32L4 Discovery

    In a previous post, we connected a cheap Chinese LCD screen to the STM32L4 Discovery board. Now we will use this combination for something that goes beyond the traditional blinking LED: a sound spectrum analyzer that uses the on-board microphone. Along the way I will explain how to use the FreeRTOS operating system and why it is needed, why there are 12 notes in a musical octave, and why 53 notes would be better than 12.





    Sound digitization


    We want to receive the signal from the microphone, calculate its spectrum using the fast Fourier transform (with the FPU to help us), and show the result on the LCD as a color 'waterfall'. The sound intensity is encoded in color. We draw a line of pixels at the edge of the display, where the leftmost pixel corresponds to the minimum frequency and the rightmost one to the maximum, while the previous picture is shifted by one line to free up space for the new line. Our microcontroller is too complicated to start from scratch, so let's start with an example from the STM32Cube kit called DFSDM_AudioRecord. What is DFSDM? It stands for Digital Filter for Sigma-Delta Modulators. Unlike the good old analog microphones, the one on the Discovery board does not output a voltage proportional to the sound pressure. Instead it outputs a sequence of zeros and ones at a clock frequency of several megahertz. If you pass this sequence through a low-pass filter, you recover the analog signal.

    In previous models of microcontrollers, you had to implement a digital filter yourself in order to receive the audio signal in digital form. Now the microcontroller has a dedicated module for this, and all that is required is to configure it at the start of the program. To do this, you can either dig into the documentation or use a ready-made example. I went the second way. The following picture illustrates the internal structure of the DFSDM_AudioRecord program.
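
    To illustrate the idea of turning the microphone's bit stream into samples, here is a minimal host-side sketch. It does not reproduce what the DFSDM hardware does internally (the peripheral implements configurable sinc filters). The function name pdm_boxcar_decimate, the one-bit-per-byte input layout, and the simple boxcar average are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Crude low-pass filter plus decimation for a 1-bit stream.
 * bits  : input stream, one bit per byte (0 or nonzero)
 * nbits : number of input bits
 * ratio : how many input bits are averaged into one output sample
 *         (assumed to be at most 32767 so the sum fits in int16_t)
 * out   : receives nbits / ratio samples in the range -ratio..+ratio
 * Returns the number of samples written. */
static size_t pdm_boxcar_decimate(const uint8_t *bits, size_t nbits,
                                  size_t ratio, int16_t *out)
{
    size_t nout = 0;

    if (ratio == 0) {
        return 0;
    }
    for (size_t i = 0; i + ratio <= nbits; i += ratio) {
        int32_t acc = 0;
        for (size_t j = 0; j < ratio; j++) {
            acc += bits[i + j] ? 1 : -1;   /* a one pushes up, a zero pushes down */
        }
        out[nout++] = (int16_t)acc;
    }
    return nout;
}
```

    A stream of all ones gives the maximum positive sample, and an evenly alternating stream gives zero, which is the density-to-amplitude mapping described above.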



    The digitized sound is written into a ring buffer by DMA. The DMA raises an interrupt twice per pass through the buffer: once when the buffer is half full and again when it is full. The interrupt handler simply sets the corresponding flag. After initialization, the main() function runs an infinite loop that checks these flags and, when one is set, copies the corresponding half of the buffer. The example copies the data to another buffer, from which it is sent, again by DMA, to the headphone amplifier. I kept this functionality and added the calculation of the audio spectrum.
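
    The flag scheme can be sketched on a host machine as follows. This is a simplified model, not the code of the ST example: the names on_dma_half_complete, on_dma_complete and poll_ring, as well as the tiny buffer size, are hypothetical, and on real hardware the ring buffer would be filled by DMA rather than by the program.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SAMPLES 8                   /* hypothetical size, tiny for the demo */
#define HALF_SAMPLES (RING_SAMPLES / 2)

static int32_t ring[RING_SAMPLES];       /* filled by DMA on real hardware */
static volatile bool first_half_ready;
static volatile bool second_half_ready;

/* Would be called from the "half transfer" interrupt: only raise a flag. */
static void on_dma_half_complete(void) { first_half_ready = true; }

/* Would be called from the "transfer complete" interrupt: only raise a flag. */
static void on_dma_complete(void) { second_half_ready = true; }

/* One pass of the main loop: copy whichever half is ready into dst.
 * Returns the number of samples copied (0 when nothing was ready). */
static size_t poll_ring(int32_t *dst)
{
    if (first_half_ready) {
        first_half_ready = false;
        memcpy(dst, &ring[0], HALF_SAMPLES * sizeof ring[0]);
        return HALF_SAMPLES;
    }
    if (second_half_ready) {
        second_half_ready = false;
        memcpy(dst, &ring[HALF_SAMPLES], HALF_SAMPLES * sizeof ring[0]);
        return HALF_SAMPLES;
    }
    return 0;
}
```

    While the main loop copies one half, the DMA keeps writing into the other, so the copy has to finish before the DMA comes back around to the same half.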

    When there are many tasks


    The straightforward way to add new functionality to our code is to add more flags and write functions that are called when these flags are set. The result is usually a mess of flags, handler functions, and global context, which has to be global because the solution of one problem is split into many small steps implemented by separate event-handler functions. The alternative is to entrust task management to an operating system such as FreeRTOS. This simplifies the logic considerably: each task is solved inside its own event-processing loop, and the tasks interact with each other through operating system functions. For example, we can add data processing as a separate loop that waits for data to be ready on a synchronization primitive, a semaphore. The semaphore is very simple: a task can pass it only when its flag is raised, and passing it automatically lowers the flag. In our case, the data source raises the flag when it has prepared data for another task. In the same way you can build arbitrary chains of data-source and data-consumer tasks, much as is done, for example, in the Linux operating system.
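
    The semaphore behavior described here can be modeled in a few lines of plain C. This is only a toy model for a host machine, not FreeRTOS code: the BinSem type and the functions binsem_give, binsem_try_take and consumer_step are hypothetical names. In FreeRTOS itself the corresponding operations are xSemaphoreGive and xSemaphoreTake, and a real take puts the waiting task to sleep instead of returning false.

```c
#include <stdbool.h>

/* Toy binary semaphore: a single flag. */
typedef struct {
    volatile bool raised;
} BinSem;

/* Data source side: raise the flag. Raising it twice has no extra effect. */
static void binsem_give(BinSem *s) { s->raised = true; }

/* Consumer side: pass only if the flag is raised, lowering it on the way. */
static bool binsem_try_take(BinSem *s)
{
    if (!s->raised) {
        return false;
    }
    s->raised = false;
    return true;
}

/* One iteration of a hypothetical consumer loop.
 * Returns true if a block of data was processed. */
static bool consumer_step(BinSem *data_ready, int *blocks_processed)
{
    if (!binsem_try_take(data_ready)) {
        return false;            /* nothing to do yet */
    }
    (*blocks_processed)++;       /* stand-in for the real data processing */
    return true;
}
```

    Unlike this model, an operating system makes the check-and-lower step atomic and switches to another task while the consumer waits.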



    Of course, simultaneous execution of tasks is an illusion, especially when there is only one computing core. In this case the processor executes a single thread of the program. Semaphores, like other synchronization primitives, play the role of a magical rabbit hole into which the flow of execution falls, only to emerge in another task.

    Connecting FreeRTOS to your project is quite simple. You only need to replace the endless loop that usually ends the main() function of a microcontroller program with a call to osKernelStart(). After that, the compiler will tell you exactly what it is missing. Everything you previously did in the loop has to be moved into a separate task and registered with an xTaskCreate call. After that you can add as many tasks as you want. Keep in mind that it is better not to place any code that works with the hardware between the calls to xTaskCreate and osKernelStart, because the system timer may not work correctly there. A call to the operating system timer handler osSystickHandler() must be added to SysTick_Handler(), and the two functions SVC_Handler and PendSV_Handler must be removed from your code, since they are implemented in the OS code. When registering tasks, it is important to get the stack size right. If it is too small, you will get crashes in the most unexpected places. The first thing corrupted by a stack overflow is the structure that describes the task itself. IAR can show the list of tasks. If you see a task whose name has changed, you need to increase its stack size.

    Calculate the spectrum


    To calculate the spectrum, we use the fast Fourier transform. The corresponding function is already in the library. It takes a buffer filled with complex data and writes the result into the same buffer. Accordingly, at the input it needs a buffer in which the digitized sound samples alternate with zeros (the imaginary part is 0). At the output we get complex numbers, for which we immediately calculate the squared magnitude by adding the squares of the real and imaginary parts. We do this only for half of the buffer, because the spectrum is symmetrical. We would need the second half if we wanted to do the inverse transform, but it is not needed for simply displaying the spectrum.

    Some additional effort is needed to calculate the spectrum in different frequency ranges. To get the spectrum for low frequencies, I accumulate data over several buffer-read cycles, effectively reducing the sampling frequency of the sound, which is initially 44.1kHz. The result is 6 ranges: 20kHz, 10kHz, 5kHz, 2600Hz, 1300Hz, 650Hz. The ranges are switched with the joystick, which is handled by a separate task. The joystick also starts and stops the waterfall and adjusts the sensitivity.

    It is more convenient to show the spectrum in logarithmic units (decibels), since its dynamic range is usually very large, and on a linear scale we could distinguish only the strongest components. Computing a logarithm takes quite a long time even on the FPU, so I replaced the real logarithm with a piecewise linear approximation, which is easy to obtain if you know the format used to represent a number in float32. The most significant bit is the sign. The next 8 bits are the binary exponent plus 127. The remaining bits are the fractional part of the mantissa, with the integer part implied to be 1 (we omit the nuances of denormalized numbers for simplicity). So by extracting the exponent from the float32 and taking the several most significant bits of the mantissa, you get a good approximation of the logarithm. Using a precomputed table, we convert the resulting number into an RGB code for display on the LCD. This gives a color scale spanning 90 or 60 decibels. The volume level corresponding to the zero of this scale can be adjusted by pushing the joystick up and down.
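
    The bit trick can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's actual code: the names log2_fixed and power_log_spectrum, the choice of keeping 4 mantissa bits, and the interleaved re, im layout of the FFT output are assumptions. Zero and denormalized inputs are not treated specially and simply come out as very large negative values.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MANT_BITS_KEPT 4   /* assumption: 4 mantissa bits, so 1 unit = 1/16 of log2 */

/* Piecewise linear approximation of log2(x) for positive float32 x,
 * returned in fixed point with MANT_BITS_KEPT fractional bits. */
static int32_t log2_fixed(float x)
{
    uint32_t bits;

    memcpy(&bits, &x, sizeof bits);   /* reinterpret the float32 as raw bits */
    bits &= 0x7FFFFFFFu;              /* drop the sign bit */
    /* What remains is (exponent + 127) followed by the top mantissa bits. */
    return (int32_t)(bits >> (23 - MANT_BITS_KEPT)) - (127 << MANT_BITS_KEPT);
}

/* fft_out holds n complex points as interleaved re, im pairs.
 * Writes the approximate log2 of the power for the first n / 2 bins. */
static void power_log_spectrum(const float *fft_out, size_t n, int32_t *out)
{
    for (size_t k = 0; k < n / 2; k++) {
        float re = fft_out[2 * k];
        float im = fft_out[2 * k + 1];
        out[k] = log2_fixed(re * re + im * im);   /* squared magnitude */
    }
}
```

    With 4 mantissa bits kept, one unit of the result is 1/16 of a factor of two in power, or about 0.19 dB, and the value can be used directly as an index into a color table after subtracting the chosen zero level.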

    Displaying the picture - on the benefits of reading datasheets


    Now we just have to display the picture and bring our 'waterfall' to life. The straightforward way is to store the image of the entire screen in a buffer, update it there, and redraw it every time new data appears. Not only is this extremely inefficient, we also do not have enough memory to store the entire picture. The LCD itself, however, has enough memory for this, and it ought to be able to do something interesting with it. Indeed, studying the datasheet revealed a scrolling command we had not used before, which lets you dynamically change how the LCD controller memory is mapped to the screen. Imagine that the memory is a tape joined into a ring that you see under the glass of the screen. The Vertical Scrolling Start Address (0x37) command sets the position on the tape that corresponds to the top edge of the screen. So all we need to animate the 'waterfall' is to write the new spectrum at this position and scroll the memory tape. The corresponding code was added to the LCD driver, borrowed from the reputable Peter Drescher and adapted as described here. The only drawback of this approach is that scrolling works only along the long side of the screen, so only the short side is available for the spectrum output.
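
    The bookkeeping for the memory 'tape' can be sketched like this. It is an illustration under assumptions, not the driver code: the function names waterfall_next_top and vscrsadd_encode are hypothetical, the scroll direction (the start line decreases so that older lines move down) is a guess, and the two-byte, high-byte-first parameter follows the usual convention for this command and should be checked against the datasheet of your controller.

```c
#include <stdint.h>

#define LCD_CMD_VSCRSADD 0x37u   /* Vertical Scrolling Start Address */

/* Move the top of the visible window one line back along the ring of
 * `lines` memory lines. The returned line is where the new spectrum
 * row is written and also the new scroll start address. */
static uint16_t waterfall_next_top(uint16_t top, uint16_t lines)
{
    return (uint16_t)(top == 0 ? lines - 1 : top - 1);
}

/* Build the byte sequence for the scroll command:
 * the command byte followed by the 16-bit line number, high byte first. */
static void vscrsadd_encode(uint16_t line, uint8_t out[3])
{
    out[0] = LCD_CMD_VSCRSADD;
    out[1] = (uint8_t)(line >> 8);
    out[2] = (uint8_t)(line & 0xFFu);
}
```

    For each new spectrum the program would compute the next top line, write one row of pixels into that memory line, and then send the three bytes, the first as a command and the other two as data.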

    Why are there 12 notes in an octave?


    Let's move on to the practical applications of our device. The first thing that is easy to see in the spectrum is harmonics, that is, frequencies that are multiples of the fundamental frequency. There are especially many of them in the voice. They are also present in the sounds of musical instruments. It is easy to understand why the notes of neighboring octaves differ in frequency by a factor of 2: the notes of the higher octave then coincide in frequency with the second harmonic of the notes of the lower octave. Such notes are said to sound "in unison". It is a little harder to understand why there are 12 notes in an octave: seven main ones (the white keys on the piano keyboard) plus five additional ones (the black keys). The additional notes are written as main notes with sharp and flat signs, although in fact there is no difference between them and the main notes. All 12 notes form a geometric progression in which the frequency ratio between adjacent notes is the twelfth root of 2. The point of dividing the octave this way is that for any note there is another note whose frequency is one and a half times higher. This combination is called a fifth. The two notes of a fifth sound good together because the second harmonic of the higher note coincides in frequency with the third harmonic of the lower note. The photo below shows the spectra of the notes Do and Sol (C and G), which form a fifth. The matching harmonics are circled in yellow.



    So why 12 notes? Since the notes form a geometric progression, we switch to logarithms: ln(1.5) / ln(2) = 0.58496... The fraction 7/12 = 0.583... gives a close value. That is, seven semitones (intervals between adjacent notes) come very close to a fifth, giving a ratio of 1.498. Interestingly, the fraction 31/53 = 0.58491... is much more accurate, so that the resulting fifth differs from 1.5 only in the fifth decimal place. This fact did not go unnoticed, but musical instruments with 53 notes in an octave never became widespread. They are difficult to tune and difficult to play, and the percentage of people who can hear the difference from conventional instruments is vanishingly small.
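
    The arithmetic above is easy to check numerically. The sketch below uses hypothetical helper names (int_pow, step_ratio, tempered_interval, best_fifth_steps) and finds the n-th root of 2 by bisection, so it needs nothing beyond the core language. It is a way to verify the numbers, not part of the analyzer firmware.

```c
/* base raised to a non-negative integer power */
static double int_pow(double base, int exp)
{
    double result = 1.0;
    while (exp-- > 0) {
        result *= base;
    }
    return result;
}

/* Frequency ratio between adjacent notes when the octave is divided
 * into `divisions` equal steps: the root of degree `divisions` of 2,
 * found by bisection on the interval [1, 2]. */
static double step_ratio(int divisions)
{
    double lo = 1.0, hi = 2.0;
    for (int i = 0; i < 60; i++) {
        double mid = (lo + hi) / 2.0;
        if (int_pow(mid, divisions) < 2.0) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return (lo + hi) / 2.0;
}

/* Frequency ratio of an interval `steps` notes wide. */
static double tempered_interval(int steps, int divisions)
{
    return int_pow(step_ratio(divisions), steps);
}

/* Number of steps whose interval is closest to a pure fifth (3/2). */
static int best_fifth_steps(int divisions)
{
    int best = 1;
    double best_err = 2.0;
    for (int s = 1; s <= divisions; s++) {
        double err = tempered_interval(s, divisions) - 1.5;
        if (err < 0.0) {
            err = -err;
        }
        if (err < best_err) {
            best_err = err;
            best = s;
        }
    }
    return best;
}
```

    For 12 notes the best fifth is 7 steps wide with a ratio of about 1.4983, and for 53 notes it is 31 steps wide with a ratio of about 1.49994.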

    Source


    The source code is here. It was compiled with IAR Embedded Workbench for ARM 7.50.2. No other libraries are required for compilation.
