Again about STL: containers

  • Tutorial
In the previous article, we talked about arrays as a prototype and ancestor of containers. Now the turn has come to the actual container classes and their supporting libraries.

Under the term standard template library (STL, S tandard T emplate Library) understand the set of interfaces and components originally developed by Alexander Stepanov, Meng Lee and other employees of AT&T Bell Laboratories and Hewlett-Packard Research Laboratories in the early 90s (although quite a few more later had a hand in what became standard today C ++ component). Further, the STL library became the property of SGI, and was also included as a component in the Boost library set. Finally, the STL library entered the C ++ standards of 1998 and 2003 (ISO / IEC 14882: 1998 and ISO / IEC 14882: 2003) and has since been considered one of the components of the standard C ++ libraries.

The standard does not name this part of the STL library, but it would be nice to take this chronology into account when sorting out which version of the compiler, language and literature you are dealing with - in the process of reducing HP STL to sizes suitable for standardization, some algorithms and functors fell out of the library, and something is added over time (for example, expanding the set of overridden prototypes of some container methods). The text will use the traditional name STL only to make it clear which part of the standard C ++ library we have in mind.

The initial goal of STL (this can be clearly seen from the chronology of comments in the header files) was to create a more flexible model of regular containers compared to arrays and generalize some widely used algorithms (such as search, sort, and some others) to them. But the idea turned out to be more fruitful than the original intentions, and was significantly expanded. STL introduces a number of concepts and data structures that in almost all cases can greatly simplify program code. The following categories of concepts are introduced:

  1. A container is a way of storing a set of objects in memory.
  2. An iterator is a means of accessing the contents of individual objects in a container.
  3. Algorithm - determination of the most standard computational procedures on containers.
  4. Adapter - adaptation of the main categories to provide the most used interfaces (such as a stack or queue).
  5. Functor (functional object) - hiding a function in an object for use by other categories.

The STL library is a very vast area. A whole separate book is devoted to its description. Here, due to the fairly initial level of acquaintance, limited volume and following the previously announced goals, we will consider the technique of using STL with intuitively clear examples. The STL syntax is based on the use of C ++ syntax constructs such as class templates and function templates. But for a successful application of the STL technique, a deep understanding of the templates technique is not necessary at all (this may come later).

The central concept of STL, around which everything else revolves, is a container (they also use the term collection). A container is a collection of a certain number of elements of the same type, packed in a container in a certain way. The simplest prototype container in the classic C ++ language is an array. The way that the elements are packed into a container determines the type of container and the features of working with elements in such a container. STL introduces a number of different types of containers, the main ones are:

  • sequential containers - vector (vector), doubly linked list (list), deck (deque);
  • associative containers - sets (set and multiset), hash tables (map and multimap);
  • pseudo containers - bit masks (bitset), strings (string and wstring), arrays (valarray);

We will look at examples of using sequentially the main ones, moving from simpler to more complex. The simplest type of container is a vector . A close prototype of a vector is a C ++ array, but the size of a vector can be dynamically changed at any time by the operations of adding (push_back () method) or deleting (for example, pop_back () method) an element. As for the array, we can refer to an arbitrary element of the vector by the indexing operation [n]. What has already been said is already the first, superficial layer of knowledge about vector, but which is sufficient for it to allow us to start working with it on the analogies of how we work with traditional arrays:

#include  
#include  
#include  
using namespace std; 
void put( const vector& v ) { 
   cout << "capacity=" << v.capacity() 
        << ", size=" << v.size() << " : "; 
   for( unsigned i = 0; i < v.size(); i++ ) 
      cout << v[ i ] << " "; 
   cout << endl; 
} 
vector& operator +=( vector& v, unsigned n ) { 
   float last = v[ v.size() - 1 ]; 
   for( unsigned i = 0; i < n; i++ ) 
      v.push_back( last + i + 1 ); 
   return v; 
} 
int main( void ) { 
   float data[] = { 1., 2., 3., 4., 5., 6., 7. }; 
   int n = sizeof( data ) / sizeof( data[ 0 ] ); 
   vector array( data, data + n ); 
   cout << "max_size=" << array.max_size() 
        << "  ((INT_MAX+1)/2)=" << ( (unsigned)INT_MAX + 1 ) / 2 << endl; 
   put( array ); 
   put( array += 2 ); 
   put( array += 6 ); 
}

Description vector(this is the previously mentioned template in the class description) declares an object in the code : vector
elements of type float. Next we see such methods of the vector class as max_size () - the maximum possible length of the vectors in general (implementation constant), size () - the current size (number of elements) of the vector, capacity () - the current capacity of the vector, the maximum number of elements that can be placed into a vector at its current location . Executing this snippet will give something like the following (details may vary depending on implementation):

$ ./vect1 
max_size=1073741823  ((INT_MAX+1)/2)=1073741824 
capacity=7, size=7 : 1 2 3 4 5 6 7 
capacity=14, size=9 : 1 2 3 4 5 6 7 8 9 
capacity=28, size=15 : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 

Here you can see a rather interesting behavior of the vector (this is its meaning): as soon as the next element of the vector is added, its capacity is not enough for another element, a new placement of the vector is made, reserving double capacity for it (with a margin so that the next addition of a new element did not immediately require a new relocation).

Note:The doubling of the capacity of the vector shown above during relocation is a characteristic behavior for the implementation of the GCC compiler libraries. But the standard leaves the exact reserve reservation algorithm for future elements at the discretion of the implementer, so you cannot count on it, and it is described here only for a qualitative understanding of the picture (Visual Studio implementations behave differently, reserving only a small excess ... you will learn this yourself).

We note for the time being, without comment, the important fact that the operation of placing elements in the container copies the element , which entails a). The requirement for a copy constructor for the type of elements and b). For structural elements, this can lead to noticeable cost performance .

Thus, we got the equivalent of a C ++ array, the size of which (size ()) dynamically changes in an arbitrary range from several units to millions of elements. Pay attention (this is very important), that the increase in size of the vector is achieved in any case not indexed beyond its current size, and "Pushing» (push_back () method), a new element to the end of the vector (symmetrical, pop_back () method pushes the last element from an array and reduces its size ()). Another way to resize a vector is to immediately call the resize () methods to the desired size. Just because the size of a vector, unlike an array, can dynamically change, there are 2 different indexing methods for a vector : as an operation[i] and as a method-function at (i). They differ : the at () method checks the current size of the vector size (), and when indexing beyond its boundary, throws an exception . On the contrary, the indexing operation does not check the border, which is unsafe, but it is faster. The at () method allows us to control going beyond the boundaries of the vector and either qualify it as a logical error, or adjust the current container size to fit the need, like in such a fragment (here there are twice as many access attempts than the actual operations performed):

int main( void ) { 
   vector nums; 
   for( int i = 0; i < 10; ) { 
      try { 
         nums.at( i ) = i;    // vector::at throws an out-of-range 
         i++; 
      } 
      catch( const out_of_range& ) { 
         cout << i << " "; 
         nums.resize( i + 1 ); 
      } 
   } 
   cout << endl << nums.size() << endl; 
} 

$ ./vect7 
0 1 2 3 4 5 6 7 8 9 
10 

The C ++ 11 standard introduces additional expressive tools, such as, for example, initialization lists and type inference, which greatly simplify the work with containers (and even make old familiar writing methods unnecessary). This is how a matrix can be described when it is simultaneously described a). configuration (square, although it may be rectangular or even triangular), b). dimension (3x3) and c). initialization values:

void print( const vector< vector >& m ) { 
   for( auto &row : m ) { 
      for( auto x : row ) 
         cout << x << ' '; 
      cout << endl; 
   } 
} 
void trans( vector< vector >& m ) { 
   for( unsigned i = 0; i < m.size(); i++ ) 
      for( unsigned j = i + 1; j < m[ i ].size(); j++ ) { 
         float tmp = m[ i ][ j ]; 
         m[ i ][ j ] = m[ j ][ i ]; 
         m[ j ][ i ] = tmp; 
      } 
} 
int main( void ) { 
   vector< vector > matrix = { 
      { 1, 2, 3 }, 
      { 4, 5, 6 }, 
      { 7, 8, 9 } 
   }; 
   print( matrix ); 
   cout << "---------" << endl; 
   trans( matrix ); 
   print( matrix ); 
} 

And at the same time, it also shows the work with vectors (transposition of a square matrix and output to the output stream) as pseudo-arrays (using only indexing), which are essentially the vectors (in particular, it is shown how the type of the element is determined by vector output type according to the C ++ 11 standard):

$ ./vect6 
1 2 3 
4 5 6 
7 8 9 
--------- 
1 4 7 
2 5 8 
3 6 9 

Note: In the framework of what we already know about vectors, the question sometimes arises: how should the type of the returned size () result be strictly determined (in order to avoid platform dependence) and, accordingly, of any variable cycles that operate on the size of the vector? From time to time, from the guardians of syntax purity, the answer follows that this should be size_t, and this answer is incorrect (especially since for many platforms size_t is defined as unsigned int). If you want to write an absolutely strict definition of the size () type of a vector, then the line in the example above should be written like this:

for( vector::size_type j = i + 1; j < m[ i ].size(); j++ ) { ...

Or, relying on C ++ 11 type inference, like this:
for( auto j = i + 1; j < m[ i ].size(); j++ ) { ...

Having noted this subtle nuance here (taking into account), we will not apply it further in order to avoid unnecessarily cumbersome examples, but we will simply use unsigned for dimensions.

PS Whom it may be of interest to, an archive of all code examples for verification and further experiments (both the previous and this parts, and all subsequent ones) can be taken here or here so as not to pound all this manually.

Also popular now: