Overcoming the 32 KB limit for data in the ROM of AVR microcontrollers

What could be worse than kludges? Only incompletely documented kludges.


Here is a screenshot from Atmel Studio 7, the latest official integrated development environment for 8-bit AVR microcontrollers, with a project written in C. As the Value column shows, the variable my_array contains the number 0x8089. In other words, the array my_array is located in memory starting at address 0x8089.

At the same time, the Type column tells us something else: my_array is an array of 4 int16_t elements located in ROM (this is what the word prog denotes, as opposed to data for RAM), starting at address 0x18089. But wait: 0x8089 != 0x18089. So what is the actual address of the array?

The C language and the Harvard architecture

8-bit AVR microcontrollers, formerly produced by Atmel and now by Microchip, are popular in part because Arduino is built on them. They use the Harvard architecture, that is, code and data are located in different address spaces. The official documentation contains code examples in two languages: assembly and C. Originally, the manufacturer offered a free integrated development environment that supported only assembly. But what about those who wanted to program in C, or even C++? There were paid solutions, for example IAR AVR and CodeVisionAVR. I never used them myself, because when I started programming AVRs in 2008 there was already the free WinAVR, which could be integrated with AVR Studio 4, and the current Atmel Studio 7 simply includes it.

The WinAVR project is based on the GNU GCC compiler, which was designed for the von Neumann architecture and therefore assumes a single address space for code and data. When GCC was adapted to AVR, the following kludge was applied: addresses from 0 to 0x007fffff were assigned to code (ROM, flash), and addresses from 0x00800100 to 0x0080ffff to data (RAM, SRAM). There were other tricks as well; for example, addresses from 0x00800000 to 0x008000ff represent the registers, which can be accessed with the same opcodes as RAM. In principle, if you are an ordinary programmer, such as a novice Arduino user, rather than a hacker who mixes assembly and C/C++ in one firmware image, you do not need to know any of this.
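
The address layout described above can be written down as a small classifier. This is only a sketch in plain C that runs on a PC: the ranges are the ones quoted in the text, not values read from any particular device's linker script, and the names classify_linear_address and data_space_address are hypothetical.

```c
#include <stdint.h>

/* Regions of the single linear address space that GCC sees on AVR,
   using the ranges quoted in the article (assumed, not device-specific). */
typedef enum {
    REGION_FLASH,      /* 0x00000000 .. 0x007fffff: code (ROM, flash) */
    REGION_REGISTERS,  /* 0x00800000 .. 0x008000ff: registers         */
    REGION_SRAM,       /* 0x00800100 .. 0x0080ffff: data (RAM, SRAM)  */
    REGION_OTHER       /* anything above the ranges listed here       */
} avr_region_t;

/* Hypothetical helper: tell which region a linear address belongs to. */
avr_region_t classify_linear_address(uint32_t addr)
{
    if (addr <= 0x007fffffUL) return REGION_FLASH;
    if (addr <= 0x008000ffUL) return REGION_REGISTERS;
    if (addr <= 0x0080ffffUL) return REGION_SRAM;
    return REGION_OTHER;
}

/* Hypothetical helper: strip the 0x00800000 offset to get the 16-bit
   address used inside the data space. Only meaningful for addresses
   in the register or SRAM regions. */
uint16_t data_space_address(uint32_t addr)
{
    return (uint16_t)(addr - 0x00800000UL);
}
```

For instance, the address 0x18089 from the screenshot falls in the flash range, while 0x00800100 is the first SRAM address and corresponds to data-space address 0x0100.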

In addition to the compiler itself, WinAVR includes various libraries (part of the C standard library and AVR-specific modules) in the form of the AVR Libc project. The latest version, 2.0.0, was released almost three years ago, and the documentation is available not only on the project site itself, but also on the site of the microcontroller manufacturer. There are also unofficial Russian translations.

Data in the code address space

Sometimes you need to put into the microcontroller not just a lot of data, but a huge amount: so much that it simply does not fit in RAM. And this data is constant, known when the firmware is built. For example, a raster image, a melody, or some kind of table. At the same time, the code often takes up only a small fraction of the available ROM. So why not use the remaining space for data? Easy! The avr-libc 2.0.0 documentation devotes an entire chapter, Data in Program Space, to this. If you skip the part about strings, everything is extremely simple. Consider an example. For RAM, we write this:

unsigned char array2d[2][3] = {...};
unsigned char element = array2d[i][j];

And for the ROM as follows:

#include <avr/pgmspace.h>

const unsigned char array2d[2][3] PROGMEM = {...};
unsigned char element = pgm_read_byte(&(array2d[i][j]));

It is so simple that this technique has been covered many times, even on the Russian-language internet.

So what's the problem?

Remember the claim that 640 KB ought to be enough for everybody? Remember moving from a 16-bit architecture to a 32-bit one, and from 32-bit to 64-bit? How Windows 98 became unstable with more than 512 MB of RAM, even though it was designed for up to 2 GB? Have you ever updated the BIOS so that a motherboard would work with hard drives larger than 8 GB? Remember the jumpers on 80 GB hard drives that clipped them to 32 GB?

The first problem caught up with me when I tried to create an array of at least 32 KB in ROM. Why in ROM and not in RAM? Because 8-bit AVRs with more than 32 KB of RAM simply do not exist at present, while ones with more than 256 B do. This is probably why the compiler's creators chose a size of 16 bits (2 bytes) for pointers into RAM (and, at the same time, for the type int), as you can find by reading the Data types paragraph of Chapter 11.14, "What registers are used by the C compiler?", of the AVR Libc documentation. Oh, and we were not going to do any hacking, yet here we are reading about registers... But back to the array. It turned out that it is impossible to create an object larger than 32,767 B (2^(16 - 1) - 1 B). I do not know why the object length was made signed, but it is a fact: no object, not even a multidimensional array, can have a length of 32,768 B or more.

As far as I know, this problem has no solution. If you want to put an object of 32,768 B or more into ROM, split it into smaller objects.
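
One way to do such a split is sketched below, under stated assumptions: the part names, CHUNK_SIZE, the sample values, and big_table_read are all hypothetical, and the near read via pgm_read_byte assumes every part ends up within the first 64 KB of flash. The #else branch only provides stand-ins so that the same file also builds on a PC.

```c
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
/* Host stand-ins (not part of avr-libc) so the sketch also builds on a PC. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Hypothetical chunk size; any value up to 32767 bytes per object works. */
#define CHUNK_SIZE 16384u

/* One logical 48 KB table stored as three separate objects.
   Only the first three bytes of each part are given sample values;
   the rest is zero-initialized. */
static const uint8_t table_part0[CHUNK_SIZE] PROGMEM = { 10, 11, 12 };
static const uint8_t table_part1[CHUNK_SIZE] PROGMEM = { 20, 21, 22 };
static const uint8_t table_part2[CHUNK_SIZE] PROGMEM = { 30, 31, 32 };

/* Read byte `index` of the logical table; returns 0 when out of range. */
uint8_t big_table_read(uint32_t index)
{
    uint16_t offset = (uint16_t)(index % CHUNK_SIZE);

    switch (index / CHUNK_SIZE) {
    case 0:  return pgm_read_byte(&table_part0[offset]);
    case 1:  return pgm_read_byte(&table_part1[offset]);
    case 2:  return pgm_read_byte(&table_part2[offset]);
    default: return 0;
    }
}
```

The caller keeps using a single 32-bit index, and each individual object stays below the 32,767 B limit.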

Once again we turn to the Data types paragraph: pointers are 16 bits. Let us apply this knowledge to Chapter 5, Data in Program Space. No, theory is not enough; practice is needed. I wrote a test program, started the debugger (unfortunately a software one, not hardware) and saw that the function pgm_read_byte can return only data whose addresses fit into 16 bits (64 KB; thankfully not 15). Beyond that, overflow occurs and the high-order part of the address is discarded. This is logical, given that pointers are 16-bit. But two questions arise: why is this not written in Chapter 5 (a rhetorical question, but it is what prompted me to write this article), and how can we still get past the 64 KB boundary in ROM without switching to assembly?
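
The discarded high-order part can be reproduced on a PC with plain integer arithmetic. This is a minimal sketch, not AVR code; the function name near_pointer_view is hypothetical, and the sample addresses are the two values from the debugger screenshot at the start of the article.

```c
#include <stdint.h>

/* What a 16-bit pointer keeps of a flash byte address:
   only the low 16 bits survive, everything above is dropped. */
uint16_t near_pointer_view(uint32_t flash_address)
{
    return (uint16_t)(flash_address & 0xFFFFu);
}
```

With this function, both 0x18089 and 0x8089 map to 0x8089, which would be consistent with the debugger showing 0x8089 as the value of an array that is actually located at 0x18089.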

Fortunately, besides Chapter 5 there is also section 25.18, pgmspace.h File Reference, from which we learn that the pgm_read_* family of functions is merely an alias for pgm_read_*_near, which accepts 16-bit addresses, but that there is also pgm_read_*_far, which accepts a 32-bit address. Eureka!

Write the code:

unsigned char element = pgm_read_byte_far(&(array2d[i][j]));

It compiles, but it does not work the way we would like (if array2d is located beyond the first 32 KB). Why? Because the & operator returns a signed 16-bit number! It is funny that the pgm_read_*_near family accepts unsigned 16-bit addresses, that is, it can work with 64 KB of data, while the & operator is only good for 32 KB.
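
A host-side sketch of why the sign matters, assuming (as the text describes) that the 16-bit result of & is treated as signed when it is widened to the 32-bit argument of pgm_read_byte_far. The function name widen_signed16 is hypothetical, and the sign extension is written out by hand so that the behavior does not depend on the compiler.

```c
#include <stdint.h>

/* Widen a 16-bit pointer value to 32 bits the way a signed conversion does:
   bit 15 is copied into bits 16..31. */
uint32_t widen_signed16(uint16_t near_ptr)
{
    uint32_t wide = near_ptr;

    if (near_ptr & 0x8000u)
        wide |= 0xFFFF0000UL;   /* sign bit set: upper half becomes all ones */

    return wide;
}
```

An address below 32 KB such as 0x7FFF stays 0x00007FFF, but 0x8089 turns into 0xFFFF8089, which is not a valid 32-bit flash address for the array.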

Let us go further. What else does pgmspace.h offer besides pgm_read_*? There is pgm_get_far_address(var), which has a half-page description and replaces the & operator.

Probably the right way:

unsigned char element = pgm_read_byte_far(pgm_get_far_address(array2d[i][j]));

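Compilation error. We read the description more carefully: var has to be resolved at link time as an existing symbol, that is, a simple variable name, an array name (not an indexed element of the array), a struct name or a struct field name, a function identifier, a linker defined identifier, ...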

We apply another kludge: replace array indexing with pointer arithmetic:

unsigned char element = pgm_read_byte_far(pgm_get_far_address(array2d) + i*3*sizeof(unsigned char) + j*sizeof(unsigned char));

Now everything works.
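
The offset arithmetic in that expression can be factored into a small helper. The sketch below is an illustration, not the article's code: element_offset_2d and read_array2d are hypothetical names, and the AVR-only part assumes a device for which avr-libc provides pgm_read_byte_far and an array2d defined elsewhere exactly as in the example above. The helper itself is plain C and can be checked on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element [i][j] in a row-major array with `cols` columns
   whose elements are `elem_size` bytes each. */
uint32_t element_offset_2d(uint32_t i, uint32_t j, uint32_t cols, size_t elem_size)
{
    return (i * cols + j) * (uint32_t)elem_size;
}

#ifdef __AVR__
#include <avr/pgmspace.h>

/* Assumed to be defined in another file, as in the article's example. */
extern const unsigned char array2d[2][3] PROGMEM;

/* Far read of array2d[i][j]: 32-bit base address plus computed offset. */
unsigned char read_array2d(uint8_t i, uint8_t j)
{
    return pgm_read_byte_far(pgm_get_far_address(array2d)
                             + element_offset_2d(i, j, 3, sizeof(unsigned char)));
}
#endif
```

For an array of int16_t, the same helper would be called with sizeof(int16_t), and the read would use pgm_read_word_far instead.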


If you write C/C++ for 8-bit AVR microcontrollers using the GCC compiler and store data in ROM, then:

  • with a ROM size of no more than 32 KB, you will not run into problems even if you have read only Chapter 5, Data in Program Space;
  • with a ROM of more than 32 KB, use the pgm_read_*_far family of functions, pgm_get_far_address instead of &, and pointer arithmetic instead of array indices; in either case, no single object can exceed 32,767 B.

