Computer Club - Group C: STL

Topics to be Covered

1. STL

Why STL?
idea of templates & generics
vector - push_back(), resize(), clear(), memory allocation, complexities
pair - advantage over simple struct, operator overloading
Iterators - analogous to pointers, normal access, random access,
Set - insert(), erase(), find(), implemented as red-black tree
Map - Next class
Algorithms - Next class
Question Codes - vector / pair : BYTESE2 BUSYMAN
multiset : PRO(use multiset instead of set, rest syntax is same)
set : RPLD

-----------------------------------------------------------------------------------------------------------------------------
Reference - Topcoder Link

We will review some of the powerful features of the Standard Template Library (STL)

Containers

Any time you need to operate with many elements you require some kind of container. In native C (not C++) there was only one type of container: the array.

The main problem with array is that many problems require a container with greater functionality.
For example, we may need one or more of the following operations:

Add some string to a container.
Remove a string from a container.
Determine whether a string is present in the container.
Return a number of distinct elements in a container.
Iterate through a container and get a list of added strings in some order.

Of course, one can implement this functionality in an ordinal array. But the trivial implementation would be very inefficient. You can create the tree- of hash- structure to solve it in a faster way, but think a bit: does the implementation of such a container depend on elements we are going to store? Do we have to re-implement the module to make it functional, for example, for points on a plane but not strings?

If not, we can develop the interface for such a container once, and then use everywhere for data of any type. That, in short, is the idea of STL containers.
---------------------------------------------------------------------------------------------------------------------

Container types (and algorithms, functors and all STL as well) are defined in a special namespace called “std.” So you need to add the following line to use them directly :

using namespace std;

The type of a container is the template parameter. Template parameters are specified with the ‘<’/’>’ “brackets” in code. For example:

vector<int> N;

-----------------------------------------------------------------------------------------------------------------------

Vector

Vector is just an array with extended functionality. It's backward compatible with C.

 vector<int> v(10);          
 for(int i = 0; i < 10; i++)

   {                      
      v[i] = (i+1)*(i+1);  
   }

int elements_count = v.size().

Two remarks:

size() is unsigned, which may sometimes cause problems.

use empty() function instead of comparing size() to 0.This is because not all the containers can report their size in O(1).

        bool is_nonempty_notgood = (v.size() >= 0); // Try to avoid this
        bool is_nonempty_ok = !v.empty();

push_back() - Push_back adds an element to the end of vector, increasing its size by one. Consider the following example:

 vector<int> v;                     
 for(int i = 1; i < 1000000; i *= 2)

 {                               
      v.push_back(i);                
 }                               
 int elements_count = v.size();

memory allocation — vector will not allocate just one element each time. Instead, vector allocates more memory then it actually needs when adding new elements with push_back.

resize() - The resize() function makes vector contain the required number of elements. If you require less elements than vector already contain, the last ones will be deleted. If you ask vector to grow, it will enlarge its size and fill the newly created elements with zeroes.

             vector<int> v(20); 
             for(int i = 0; i < 20; i++)

             { 
                 v[i] = i+1; 
             } 
             v.resize(25); 
             for(int i = 20; i < 25; i++)

             { 
                 v[i] = i*2; 
             }

If you use push_back() after resize(), it will add elements AFTER the newly allocated size, but not INTO it.

clear() - This function makes vector to contain 0 elements. It does not make elements zeroes, it completely erases the container.
As pointed out by Saurabh Sir, the complexity for push_back can be O(n) in the worst case since vector needs to create elements on the fly and yet maintain the contiguous property. The rule of thumb is to use vector::reserve beforehand to avoid re-allocation.

There are many ways to initialize vector:

 vector<int> v1; 
 // ... 
 vector<int> v2 = v1; 
 vector<int> v3(v1);

 vector<int> v4(v.begin(), v.end()); // v3 equals to v2

 vector<int> Data(1000); //create a vector of specific size

 vector<string> names(20, “Unknown”); // create a vector with specific size & a default value

 vector< vector<int> > Matrix; //create a 2-D array via vector

 vector< vector<int> > Matrix(N, vector<int>(M, -1)); //create a matrix of size N*M and fill it with -1.

 int data[] = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31 }; 
 vector<int> primes(data, data+(sizeof(data) / sizeof(data[0])));

 vector<int> v2(v.begin(), v.begin() + (v.size()/2));

insert() , erase()- To add data somewhere other than the end, there is the insert() member function. And there is also the erase() member function to erase elements, as well.

Passing vectors as parameters to functions -When vector is passed as a parameter to some function, a copy of vector is actually created. It may take a lot of time and memory to create new vectors when they are not really needed.

           void some_function(vector<int> v)

             // Never do it unless you’re sure what you do! 
             // ... 
           }

Instead, use the following construction:

           void some_function(vector<int>& v)

             // OK 
             // ... 
          }

-----------------------------------------------------------------------------------

Pairs

STL std::pair is just a pair of elements. The simplest form would be the following:

 template<typename T1, typename T2> struct pair

 { 
      T1 first; 
      T2 second; 
 };

At a more complex level, pair<string, pair<int, int> > is a pair of string and two integers. In the second case, the usage may be like this:

 pair<string, pair<int,int> > P; 
 string s = P.first; // extract string 
 int x = P.second.first; // extract first int 
 int y = P.second.second; // extract second int

The great advantage of pairs is that they have built-in operations to compare themselves. Pairs are compared first-to-second element. If the first elements are not equal, the result will be based on the comparison of the first elements only; the second elements will be compared only if the first ones are equal.
Hence array (or vector) of pairs can easily be sorted by STL internal functions.

For example, if you want to sort the array of integer points so that they form a polygon, it’s a good idea to put them to the vector< pair<double, pair<int,int> >, where each element of vector is { polar angle, { x, y } }. One call to the STL sorting function will give you the desired order of points.

----------------------------------------------------------------------------------------------------------------------------

Iterators

In STL iterators are the most general way to access data in containers. Consider the simple problem:

Reverse the array A of N int’s. Let’s begin from a C-like solution:

 void reverse_array_simple(int *A, int N)

 { 
      int first = 0, last = N-1; // First and last indices of elements to be swapped 
      While(first < last)

           // Loop while there is something to swap 
           swap(A[first], A[last]); // swap(a,b) is the standard STL function 
           first++; // Move first index forward 
           last--; // Move last index back 
      } 
 }

Rewriting it in terms of pointers:

 void reverse_array(int *A, int N)

 { 
      int *first = A, *last = A+N-1; 
      while(first < last)

      { 
           Swap(*first, *last); 
           first++; 
           last--; 
      } 
 }

The above code uses only four distinct operations on pointers ‘first’ and ‘last’:

compare pointers (first < last),
get value by pointer (*first, *last),
increment pointer, and
decrement pointer

Now if we had to reverse the contents of a double-linked list, or a part of it, the first code, which uses indexing, will definitely not work. The second code can work for ANY pointer-like object. The only restriction is that that object can perform the operations described above. Objects with these properties that are associated with containers are called iterators. Any STL container may be traversed by means of an iterator.

There are so-called “normal iterators” and “random access iterators”. Normal iterators may be compared with ‘==’ and ‘!=’, and they may also be incremented and decremented. They may not be subtracted and we can not add a value to the normal iterator. Basically, it’s impossible to implement the described operations in O(1) for all container types.

The following operations are defined for iterators

get value of an iterator, int x = *it;
increment and decrement iterators it1++, it2--;

compare iterators by ‘!=’ and by ‘<’

add an immediate to iterator it += 20; <=> shift 20 elements forward {random-access only}

get the distance between iterators, int n = it2-it1; { random-access only}

STL algorithms always use two iterators, called “begin” and “end.” The end iterator is pointing not to the last object, however, but to the first invalid object, or the object directly following the last one. Each STL container has member functions begin() and end() that return the begin and end iterators for that container.

Based on these principles, c.begin() == c.end() if and only if c is empty, and c.end() – c.begin() will always be equal to c.size() {for random-access iterators}.

The STL-compliant reverse function should be written as follows:

 template<typename T> void reverse_array_stl_compliant(T *begin, T *end)

 { 
      // We should at first decrement 'end' 
      // But only for non-empty range 
      if(begin != end) 
      { 
           end--; 
           if(begin != end)

           { 
                while(true)

                { 
                     swap(*begin, *end); 
                     begin++; 
                     If(begin == end) break; 
                     end--; 
                     if(begin == end) break; 
                } 
           } 
      } 
 }

This is the actual implementation of std::reverse() found in algorithms module. The main difference between this code and the previous one is that we don’t use the “<” comparison on iterators, just the “==” one.

Each container also has the rbegin()/rend() functions, which return reverse iterators. Reverse iterators are used to traverse the container in backward order. Thus:

     vector<int> v; 
     vector<int> v2(v.rbegin()+(v.size()/2), v.rend());

will create v2 with first half of v, ordered back-to-front.

To create an iterator object, we must specify its type. The type of iterator can be constructed by a type of container by appending “::iterator”, “::const_iterator”, “::reverse_iterator” or “::const_reverse_iterator” to it. Thus, vector can be traversed in the following way:

        vector<int> v; 

        // ... 

        // Traverse all container, from begin() to end() 
        for(vector<int>::iterator it = v.begin(); it != v.end(); it++)

        { 
           *it++; // Increment the value iterator is pointing to 
        }

find() :-The find() algorithm looks for appropriate elements in an interval. If the element is found, the iterator pointing to the first occurrence of the element is returned. Otherwise, the return value equals the end of interval. See the code:

         vector<int> v; 
         for(int i = 1; i < 100; i++)

         { 
            v.push_back(i*i); 
         } 
         if(find(v.begin(), v.end(), 49) != v.end()) { 
            // ... 
         }

To get the index of element found, one should subtract the beginning iterator from the result of find():

     int i = (find(v.begin(), v.end(), 49) - v.begin(); 
      if(i < v.size()) { 
        // ... 
      }

Min and Max To get the value of min/max element, like in find(), use *min_element(…) or *max_element(…), to get index in array subtract the begin iterator of a container or range:

        int data[5] = { 1, 5, 2, 4, 3 }; 
        vector<int> X(data, data+5); 
        int v1 = *max_element(X.begin(), X.end()); // Returns value of max element in vector 
        int i1 = min_element(X.begin(), X.end()) – X.begin; // Returns index of min element in vector

sort():- Consider the following examples:

      vector<int> X; 

      // ... 

     sort(X.begin(), X.end()); // Sort array in ascending order 
     sort(all(X)); // Sort array in ascending order, use our #define 
     sort(X.rbegin(), X.rend()); // Sort array in descending order using with reverse iterators

typeof:- This operator is replaced to the type of an expression during the compilation.

   typeof(a+b) x = (a+b);

This will create the variable x of type matching the type of (a+b) expression.

insert():- One can insert an element to vector by using the insert() function:

         vector<int> v; 
         // ... 
         v.insert(1, 42); // Insert value 42 after the first

All elements from second (index 1) to the last will be shifted right one element to leave a place for a new element.

insert() also has an interval form to add many elements

         vector<int> v; 
         vector<int> v2; 
         // .. 
         // Shift all elements from second to last to the appropriate number of elements. 
         // Then copy the contents of v2 into v. 
         v.insert(1, all(v2));

erase():-Vector also has a member function erase, which has two forms. Guess what they are:

       erase(iterator); // single element of vector is deleted
       erase(begin iterator, end iterator);//the interval is erased from vector.

---------------------------------------------------------------------------------------------------------------------------

String

There is a special container to manipulate with strings. The string container has a few differences from vector<char>. Most of the differences come down to string manipulation functions and memory management policy.

String has a substring function without iterators, just indices:

 string s = "hello"; 
 string 
      s1 = s.substr(0, 3), // "hel" 
      s2 = s.substr(1, 3), // "ell" 
      s3 = s.substr(0, s.length()-1), "hell" 
      s4 = s.substr(1); // "ello"

Beware of (s.length()-1) on empty string because s.length() is unsigned and unsigned(0) – 1 is definitely not what you are expecting!

----------------------------------------------------------------------------------------------------------------------------

Set

Why Set? - Consider we need a container with the following features:

add an element, but do not allow duplicates

remove elements

get count of elements (distinct elements)

check whether elements are present in set.
STL provides the special container for it – set.

Set can add, remove and check the presence of particular element in O(log N) complexity.
While adding elements to set, the duplicates are discarded. A count of the elements in the set, N, is returned in O(1).
Set is implemented internally as red-black tree.
Syntax:

         set<int> s; 
         for(int i = 1; i <= 100; i++)

         { 
              s.insert(i); // Insert 100 elements, [1..100] 
         } 
         s.insert(42); // does nothing, 42 already exists in set 
         for(int i = 2; i <= 100; i += 2)

         { 
              s.erase(i); // Erase even values 
         } 
         int n = int(s.size()); // n will be 50

         int r = 0; 
         for(set<int>::const_iterator it = s.begin(); it != s.end(); it++)

         { 
              r += *it;// Calculate the sum of elements in set  
         }

The order in which elements are inserted in set does not matter.

Since set is not a linear container, it’s impossible to take the element in set by index. Therefore, the only way to traverse the elements of set is to use iterators.

An elegant way to traverse a complex set(or any other container)

     #define tr(Co, it) for(typeof(Co.begin()) it = Co.begin(); it != Co.end(); it++)

     set< pair<string, pair< int, vector<int> > > SS; 
     int total = 0; 
     tr(SS, it)  
         total += it->second.first;

Notice the ‘it->second.first’ syntax. Since ‘it’ is an iterator, we need to take an object from ‘it’ before operating. So, the correct syntax would be ‘(*it).second.first’ or ‘it->second.first’

‘find()’- To determine whether some element is present in set. While searching in set(and
map), do not use global find – instead, use member function ‘set::find()’.

            set<int> s; 
            // ... 
            if(s.find(42) != s.end())

            { 
                 // 42 presents in set 
            } 
            else

            { 
                 // 42 not presents in set 
            }

Following is a useful & convenient macro check if an element is present:

       #define present(container, element) (container.find(element) != container.end())

To erase an element from set use the erase() function.

          set<int> s; 
          // … 
          s.erase(29);

The erase() function also has the interval form:

           set<int> s; 
           // .. 
           set<int>::iterator it1, it2; 
           it1 = s.find(10); 
           it2 = s.find(100); 
           s.erase(it1, it2); // Note that 10 will be deleted, but 100 will remain in the container

Set has an interval constructor:

           int data[5] = { 5, 1, 4, 2, 3 }; 
           set<int> S(data, data+5);

It gives us a simple way to get rid of duplicates in vector, and sort it:

           vector<int> v; 
           // … 
           set<int> s(all(v)); 
           vector<int> v2(all(s));

----------------------------------------------------------------------------------------------------------------------------

Map

Map is very much like set, except -

it contains not just values but pairs <key, value>. Map ensures that at most one pair with specific key exists.
map has operator [] defined.

Internally map and set are almost always stored as red-black trees. The elements of map and set are always sorted in ascending order while traversing these containers. If you make the modification in the key value, that breaks the order and will lead to improper functionality of container’s algorithms.
Don’t change the key of map element by iterator, because it may break the integrity of map internal data structure

A simple STL map has the following Syntax:-

Definition :     map<string, int> M; 
Mapping    :     M["Top"] = 1; 
                 M["Coder"] = 2; 
                 M["SRM"] = 10; 
Accessing  :     int x = M["Top"] + M["Coder"]; //for accessing find() can also be used

Difference between map::find() and map::operator []:

While map::find() will never change the contents of map, operator [] will create an element if it does not exist.
It’s a bad idea to use operator [] when you do not want to add new elements.
Operator [] may not be used if map is passed as a const reference parameter to some function.

 void f(const map<string, int>& M)

 { 
      if(M["the meaning"] == 42)

            // Error! Cannot use [] on const map objects! 
      } 
      if(M.find("the meaning") != M.end() && M.find("the meaning")->second == 42)

           // Correct 
           cout << "Don't Panic!" << endl; 
      } 
 }

Map Iterator:

The iterator will be an std::pair of key and value. To get the value use it->second.

      map<string,int> M;

      //...

      map<string,int>::iterator it;

      int r = 0;

      for(it= M.begin(); it != M.end(); it++)

      { 
         r += it->second; 
      }

-----------------------------------------------------------------------------------------------------------------------

STL Algorithms

Most algorithms are declared in the #include <algorithm> standard header.

There are 3 very simple algorithms:
- min(a,b)
- max(a,b)
- swap(a,b).

Algorithm sort() is also widely used. The call to sort(begin, end) sorts an interval in ascending order. sort() requires random access iterators, so it will not work on all containers. However, you probably won’t ever call sort() on set, which is already ordered.

The find(begin, end, element) method returns the iterator where ‘element’ first occurs, or end if the element is not found.
count(begin, end, element) returns the number of occurrences of an element in a container or a part of a container.

[ Remember that set and map have the member functions find() and count(), which works in O(log N), while std::find() and std::count() take O(N) ]

Some other useful algorithms are next_permutation() and prev_permutation().
The call to next_permutation(begin, end) makes the interval [begin, end) hold the next permutation of the same elements, or returns false if the current permutation is the last one.

If you want to check all permutations :

          vector<int> v; 
          for(int i = 0; i < 10; i++) 
              v.push_back(i); 
          do

           { 
              Solve(..., v); 
           }

          while(next_permutation(all(v));

The initial state should form the very first permutation!

The short list of templates and macros follows:

 typedef vector<int> vi; 
 typedef vector<vi> vvi; 
 typedef pair<int,int> ii; 
 #define sz(a) int((a).size()) 
 #define pb push_back 
 #defile all(c) (c).begin(),(c).end() 
 #define tr(c,i) for(typeof((c).begin() i = (c).begin(); i != (c).end(); i++) 
 #define present(c,x) ((c).find(x) != (c).end()) 
 #define cpresent(c,x) (find(all(c),x) != (c).end())

3 comments:

Balo BabaAugust 8, 2015 at 2:41 AM
great tutorial, even for those who want to revise things fast :-D
UnknownAugust 9, 2015 at 2:42 AM
Probably it would be worth mentioning in the vector section that the complexity for push_back can be O(n) in the worst case since vector needs to create elements on the fly and yet maintain the contigous property. The rule of thumb is to use vector::reserve beforehand to avoid re-allocation

Computer Club - Group C

Thursday, August 6, 2015

STL - Class 1