Again about STL

  • Tutorial
This collection of short notes about C ++ containers appeared as a result of a request to prepare a concise overview of STL for novice colleagues . Why was this necessary when there is original documentation on this subject and many whole separate books, of which at least 5-6 excellent quality exist in translations into Russian?
But the point, however, is, and here it is:

  • All books were published in the late 90s - early 2000s. Later C ++ language standards (up to C ++ 11) introduce new syntactic constructs which, when applied with STL, give a cross-effect ... with very interesting interference. This often allows the use of STL constructs with much greater ease.
  • Books usually describe the subject in too much detail (this is good for students, but redundant for programmers, even junior level, who, after other languages, for example, need only basic familiarization). The original documentation, on the contrary, is crammed with formal syntactic definitions (this is great as a reference at hand, but redundant for acquaintance). The present notes, completely avoiding formal definitions, are built around examples of use, for the most part understandable to the programmer, even without any additional explanation.
  • The contingent for which the texts were originally prepared, among other things, is largely focused on numerical mathematical methods, processing data in a stream, which existing publications are not designed for. Such a bias is also closer to me, and such an emphasis will be noticeable in the examples on which the description is built.

Due to the requirements of mandatory uniformity of objects in containers, they could be called regular containers with clarification, this clarifies a lot (such a clarification is not done just because everyone is clear about what this is about). This, of course, is about STL containers, but the traditional C / C ++ array is also a regular container, and they will appear in the text and examples. (Structures, and even more generally, classes with data fields are also containers, but they cannot be called regular.)

I would like to hope that these notes will be useful to someone mastering STLs and simplify this process of understanding. And the suggestions and comments, when they will be, in essence, from those readers who are already pros in C ++, will improve these texts for the future, when they can be useful to someone else.

Arrays with static and dynamic dimension


But you need to start the overview of STL containers from traditional arrays, since the former are a logical development of the latter. Arrays are one of the most used forms of data organization, and historically one of the very first forms of structuring that appeared in programming languages ​​(late 50s), before the appearance of structures and any other aggregation methods in languages. An array is a representation of a set of consecutive elements of the same type, and so it is organic for an internal representation for machine computing at the instruction level. Fundamentally important in this definition are 2 points that must be satisfied for an array:

  1. Each element of the array can be indicated by the number of its location in the sequence of elements similar to it.
  2. All elements of the array must necessarily be of the same type . Everyone knows and understands the simplest definitions, for example, of integer arrays: int array [100]. But this does not mean at all that only the simplest built-in types of the C ++ language can be organized into arrays - variable objects of any composite type (class) and any degree of complexity can be organized into an array.

The only limitation is that all elements of the same array must be of the same type. For example, this is how a student group can be described in the dean’s accounting program:
class  student {
...
}
student group[ 30 ];

We will return to this extremely important circumstance — the types of array elements — in the future.

Ever since the earliest programming languages ​​(FORTRAN, etc.), there was a very strong restriction on arrays: the size of the array should be determined only by an integer constant , the value of which should be determined at the time of compilation of the code. The same restriction remained in the C language, which became the progenitor of C ++. For example:
#define size 100
double array[ size ];

In C ++ (in the classic, except for the standards of recent years!), This restriction is slightly relaxed to the extent that the size of the array can be an integer constant, the value of which can be calculated at the time of compilation of the code. For example, like this:
#include  
using namespace std; 
float array[ (int)( 10. * 10. )  + 2 ]; 
int main( void ) { 
   cout << "array size = " 
        << sizeof( array ) / sizeof( array[ 0 ] ) 
        << endl; 
} 

$ ./siz1 
array size = 102 

In all such cases, after defining the array, its size is fixed and we cannot increase its size in any way (if, for example, during the calculations it turns out that we lack this size). Thus, certain arrays are called arrays with statically declared (at the time of writing the code) size.

Note: Due to the fact that all elements of the array are arranged sequentially (the first rule from the ones mentioned above), to calculate the size of the array (in the field of visibility), you can use the trick shown in the example: the size of the array is equal to the length of the entire array divided by the length any of its elements (since they are all the same in type).

It might seem that arrays are defined differently without specifying a size, but with a list of initializing values:
float array[] = { 1., 2., 3., 4., 5. };

But this is not so! It’s just that here the constant value of the size of the declared array is extracted from the list of values, and it is equal, in the shown example 5.

The main way to create an array, in classical C and C ++, with the size of N elements, calculated at the time the array was created (at run time) - it was way to dynamically allocate an array. Which is done in C and C ++, respectively:
double *array = (double*)calloc( N, sizeof( double ) ); // это C
double *array = new double[ N ];                        // а это C++

For example, like this:
#include  
#include  
using namespace std; 
int main( void ) { 
   int size = (int)( sin( M_PI / 2 ) * pow( 2, 10 ) ); 
   float *array = new float[ size ]; 
   cout << "array size = " << size << endl; 
   delete [] array; 
} 

$ ./siz2 
array size = 1024 

Note: Dynamic array allocation is the main way, although there are some other ways, such as alloca () from the C API, or the inclusion of zero-length arrays in extensible structures (GCC compiler extension) or arrays that replaced them with variable boundaries (standard extension C99):
typedef struct vararr {
//   int n, data[ 0 ];    // массив нулевой длины - расширение GCC
     int n, data[];       // массив с переменными границами - расширение C99
} vararr_t;

But all these methods, judging by the frequency of their use, are only exotic in comparison with dynamic allocation. And their diversity in different versions is only an indication that the static size of the array is a strong limiting factor, and all this is a search for removing this restriction.

The latest standards (C99, C ++ 11) have introduced extensions that allow the creation of localarrays in functions with sizes calculated at runtime when entering the function (under such conditions, the array will be allocated in the function stack). This is already quite significant in cases where we cannot know in advance the size of the processed data (the C99 standard calls it: VLA - variable-length array). As an example, let's look at the problem of finding all primes not exceeding N (the sieve of Eratosthenes), where N is given by the command line parameter when the program starts:
#include  
#include  
using namespace std; 
int main( int argc, char **argv ) { 
   unsigned k, j, n;             // верхняя граница 
   bool a[ n = atoi( argv[ 1 ] ) + 1 ]; 
   a[ 0 ] = a[ 1 ] = false; 
   for( k = 2; k < n; k++ ) 
      a[ k ] = true; 
   for( k = 2; k < n; k++ ) 
      for( j = k + k; j < n; j += k ) // вычеркивание всех чисел кратных k 
         a[ j ] = false; 
   const int line = 10; 
   for( k = 0, j = 0; k < n; k++ ) 
     if( a[ k ] ) { 
        cout << k << "\t"; 
        if( 0 == ++j % line ) cout << endl; 
     } 
   if( j % line != 0 ) cout << endl; 
   return 0; 
}

We are already obliged to compile this code with the 2011 C ++ standard (whether by compiler options or the properties of the project being built):
$ g++ -Wall -std=c++11 -O3 erastof.cc -o erastof 
$ ./erastof 300 
2	3	5	7	11	13	17	19	23	29	 
31	37	41	43	47	53	59	61	67	71	 
73	79	83	89	97	101	103	107	109	113	 
127	131	137	139	149	151	157	163	167	173	 
179	181	191	193	197	199	211	223	227	229	 
233	239	241	251	257	263	269	271	277	281	 
283	293	

But even after all the extensions, the simplest array, as a form of organizing a set of objects, is not flexible enough, the main limiting factors of which are:

  • No matter how the size of the array is determined (by a constant or by calculating the definition point), it is impossible to increase this size in the future (unless you have guessed the required size in advance, or if you have not made a sufficient margin).
  • According to C / C ++ rules, when calling functions instead of an array, a pointer to its beginning (the address of the 1st element) is passed as a parameter to the function. This allows you to greatly increase the efficiency of many computational algorithms, but at the same time information about the size of the array is completely lost (it is necessary, as an option) to transmit a separate parameter. For example, if we wanted to form the sieve of Eratosthenes not in the main () function, but in a separate function, then we should formulate its call as erastof (a, n).
  • Many intuitively simple operations on arrays cause difficulties, such as, for example: in a 15-element array, element 10 is inserted between elements 2 and 3. Moreover, a). all elements from 3 to 9 must be copied one position to the right, b). this can only be done in a descending order from 9 to 3 and c). all indices of these operations must be monitored manually.

In order to more easily understand access mechanisms in STL containers, it is useful to first recall how this is done to array elements. The choice (for reading or writing) of an arbitrary element of the array (let's call it arbitrarily arr []) can be selected by two mechanisms: a). the indexing operation is arr [n], or b). by the pointer counted from the beginning of the array - * (& arr [0] + n). The process of moving a pointer along an array during access to its various elements can be called an iteration, and a pointer used in this quality can be called an iterator. Thus, access to elements of any arrays can be carried out either by index or by iterator.

The C ++ 11 standard complements the operation of cyclic access to array elements with a new syntactic form, in the manner of the for_each () algorithm from STL, which (using type inference from the same C ++ 11) can look quite unusual for the traditional syntax:
   for( auto i : arr ) . . .
   for( auto &i : arr ) . . . // или так

The following example shows all of these features at the same time:
#include  
using namespace std; 
double arr[] = { 1, 2, 3, 4, 5, 6, 8, 9 }; 
const unsigned n = sizeof( arr ) / sizeof( arr[ 0 ] ); 
int main( void ) { 
   for( unsigned i = 0; i < n; i++ ) cout << arr[ i ] << " "; 
   cout << endl; 
   for( double* i = arr; i - arr < (int)n; i++ ) cout << *i << " "; 
   cout << endl; 
   for( auto i : arr ) cout << i << " "; 
   cout << endl; 
   for( auto &i : arr ) i = i * i; 
   for( double i : arr ) cout << i << " "; 
   cout << endl; 
} 

Pay attention to the for (auto & i: arr) expression when using a reference to array elements, representing a left-side expression that allows not only to read, but also write values ​​to array elements:
$ g++ -Wall -std=c++11 -O3 siz3.cc -o siz3 
$ ./siz3 
1 2 3 4 5 6 8 9 
1 2 3 4 5 6 8 9 
1 2 3 4 5 6 8 9 
1 4 9 16 25 36 64 81 

Postscript: Naturally, what is described here, and even more in the following parts, not only works, but at least even compiles in an elementary way:

  • When compiling the examples, you must explicitly indicate compliance with the C ++ 11 standard: either by the command line option (GCC and Clang - as shown in the text), or in the properties of the compiled project (Code :: Blocks). Implicitly using C ++ 11 constructs is not yet supported.
  • It is necessary that your compiler elementary knows and supports the C ++ 11 standard, that is, the compiler (or the IDE that it is part of) was later ... well, say, 2013.

The last condition is fully satisfied by all currently in use versions of GCC or Clang on Linux. At the request of readers, the given code examples (hereinafter) were tested in the Windows 7 virtual machine, and in the Visual Sudio 2013 and Code :: Blocks 2013 implementations. With the declared support for C ++ 11 (with reservations to “incompleteness”) in Visual Sudio 2013, support there, if any, is very limited, and the compiler is not suitable for experimenting with the examples shown. But Code :: Blocks (with MinGW) turned out to be quite suitable (although, in my firm opinion, learning C ++ languages, and especially C, is necessary only in POSIX / Linux, and then used in any environment ... but this is a purely personal conviction, which can be ignored).

From such a quick overview of arrays as containers, it is clear that the needs of practice often require more, which created the need for STL containers (Standard Template Library) as a means of expanding the functionality of arrays.

Also popular now: