Why buffers should grow exponentially

Mozilla engineer Nicholas Nethercote published a note with a very clear explanation of why a program's memory buffer should grow exponentially rather than linearly.

Suppose we have a data structure that needs ever more memory, such as a string or a vector. When new elements no longer fit in the buffer, a new, larger buffer is allocated, the contents of the old one are copied into it, and the old buffer is freed. This is exactly what realloc() typically does.

Now imagine that our buffer starts at 1 byte and grows by 1 byte at a time until it reaches 1 MiB. How much memory have we allocated cumulatively along the way?

1 + 2 + 3 + ... + 1,048,575 + 1,048,576 = 549,756,338,176 bytes

Quite a lot, isn't it?

Experienced programmers have long known about this, to the point that some consider linear buffer growth a "typical novice mistake."

The bottom line: if each new array is larger than the previous one by a constant amount, the total work is O(n²). If each new array is larger by a constant factor, the total work is O(n). Here "work" means the calls to malloc() and realloc(), together with the subsequent copying of values in memory, i.e., the total load on the system.

Of course, there are situations, such as firmware for embedded hardware, that can be considered exceptions to the rule. But for desktop systems this is practically a law. The only open question is which growth factor to use: 2 or 1.5.

Nicholas Nethercote adds that in practice the total memory consumption will, of course, be lower, because the memory manager rounds up allocation sizes. For example, Firefox's allocator (an old, heavily modified version of jemalloc) rounds allocation requests between 4 KiB and 1 MiB up to the next multiple of 4 KiB. So in our example, growing a buffer to a total of 1 MiB costs only:

4,096 + 8,192 + 12,288 + ... + 1,044,480 + 1,048,576 = 134,742,016 bytes

For comparison, if we had used a growth factor of 2 from the start:

4,096 + 8,192 + 16,384 + 32,768 + 65,536 + 131,072 + 262,144 + 524,288 + 1,048,576 = 2,093,056 bytes

This "typical newbie mistake" shows up in many projects wherever I/O code fails to scale the buffer geometrically. Nethercote quickly found such non-optimal code in the implementations of XDRBuffer, nsTArray, and SQLite.
