The programming paradigm underlying STL is called generic programming. Here is one definition [Jazaeri98]:
Generic programming is a sub-discipline of computer science that deals with finding abstract representations of efficient algorithms, data structures, and other software concepts, and with their systematic organization. The goal of generic programming is to express algorithms and data structures in a broadly adaptable, interoperable form that allows their direct use in software construction. Key ideas include:
- Expressing algorithms with minimal assumptions about data abstractions, and vice versa, thus making them as interoperable as possible.
- Lifting of a concrete algorithm to as general a level as possible without losing efficiency; i.e., the most abstract form such that when specialized back to the concrete case the result is just as efficient as the original algorithm.
- When the result of lifting is not general enough to cover all uses of an algorithm, additionally providing a more general form, but ensuring that the most efficient specialized form is automatically chosen when applicable.
- Providing more than one generic algorithm for the same purpose and at the same level of abstraction, when none dominates the others in efficiency for all inputs. This introduces the necessity to provide sufficiently precise characterizations of the domain for which each algorithm is the most efficient.
template <class T>
void swap( T& a, T& b) {
    T tmp = a;
    a = b;
    b = tmp;
}
When the template is instantiated (by calling the function), the placeholder T becomes an actual type. However, compilation can only succeed if this actual type has an assignment operator and a copy constructor. The function could have been implemented using a default constructor and assignment, but the copy constructor is more likely to exist than the default constructor (given that the assignment operator is required anyway).
We can distinguish between syntactic requirements and semantic requirements. The syntactic requirements in our example are the assignment operator and the copy constructor. If an actual type fails to comply with these requirements, a compilation error points that out. The semantic requirements are that the copy constructor and the assignment operator actually copy the values, are free of side effects, and in general behave according to the C object model; e.g., tmp = y; x = tmp; should give the same result as x = y;. Remember that these are user-defined functions. Semantic requirements cannot be checked at compile time.
Instead of always documenting requirements in full detail, it is convenient to group them into frequently used combinations. We call these collections of requirements concepts. The concept for the swap function parameter is called Assignable.
If an actual type fulfills the requirements of a concept, it is called a model of this concept. In our example, int is a model of the concept Assignable.
| Concept | Syntactic requirements |
| Assignable | copy constructor assignment operator |
| Default Constructible | default constructor |
| Equality Comparable | equality and inequality operator |
| LessThan Comparable | order comparison with operators <, <=, >=, and > |
A regular type is one that is a model of Assignable, Default Constructible, and Equality Comparable, and in which these operations interact in the expected way; for example, after x = y; we may assume that x == y is true.
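The split between syntactic and semantic requirements can be made concrete with a small sketch. It assumes a C++11 compiler and the standard <type_traits> header; the names meets_syntactic_requirements, assignment_implies_equality, and No_copy are made up for this illustration. The syntactic part is checked at compile time, while the semantic part can only be spot-checked at run time for individual values.

```cpp
#include <type_traits>

namespace sketch {

// Syntactic requirements of Assignable and Default Constructible,
// checked at compile time with standard type traits (C++11).
template <class T>
struct meets_syntactic_requirements {
    static const bool value = std::is_copy_constructible<T>::value
                           && std::is_copy_assignable<T>::value
                           && std::is_default_constructible<T>::value;
};

// One semantic requirement of a regular type, checked for a single value:
// after x = y, the expression x == y must be true.
template <class T>
bool assignment_implies_equality(const T& y) {
    T x = T();       // Default Constructible (value-initialized)
    x = y;           // Assignable
    return x == y;   // Equality Comparable
}

// Hypothetical type that is not a model of Assignable: copying is disabled.
struct No_copy {
    No_copy() {}
    No_copy(const No_copy&) = delete;
    No_copy& operator=(const No_copy&) = delete;
};

} // namespace sketch
```

With these definitions, meets_syntactic_requirements<int>::value is true and meets_syntactic_requirements<No_copy>::value is false. A passing run-time check for one value does not prove the semantic requirement in general.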
In general, concepts factor out common signature and behavior for template arguments. One can think of a concept as the `greatest common denominator' of all types for which a function template is supposed to work. Of course, the function has then to be implemented using only the operations specified in the concept.
In analogy to the object-oriented paradigm, concepts correspond to virtual base classes, and models correspond to derived classes. However, there is the important difference that concepts are nowhere explicitly coded in the language. They are only communicated in the documentation. This is a maintenance disadvantage, but also an advantage, because it avoids the coupling introduced by a common base class. A common base class needs a header file, and all derived classes have to agree on this single header file, on linking, etc.
In general, the flexibility is resolved at compile time, which gives us the advantages of strong type checking and inline efficiency where needed. If runtime flexibility is needed, the generic data structures and algorithms can be parameterized with a base class, as used in object-oriented programming.
The following table shows the different iterator concepts and the refinement relation between them and the basic concepts (see above). The syntactic requirements are only sketched here; see [ISO-C++-98, SGI-STL] for the full requirements.
| Concept | Refinement of | Syntactic requirements |
| Trivial Iterator | Assignable, Equality Comparable | operator*() operator->() |
| Input Iterator | Trivial Iterator | operator++(), ... |
| Output Iterator | Assignable | operator*(), operator++() ... |
| Forward Iterator | Input Iterator, Output Iterator, Default Constructible | ... |
| Bidirectional Iterator | Forward Iterator | operator--(), ... |
| Random Access Iterator | Bidirectional Iterator, LessThan Comparable | operator+(), operator+=(), operator-(), operator[](), ... |
Sequences of items are specified by a range [first,beyond) of two iterators. This notion of a half-open interval denotes the sequence of all iterators obtained by starting with the iterator first and advancing first until the iterator beyond is reached, but it does not include beyond. The iterator beyond is also referred to as the past-the-end position.
A container class is supposed to provide a member type called iterator, which is a model of the Iterator concept, and two member functions: begin() returns the start iterator of the sequence and end() returns the iterator referring to the past-the-end position of the sequence. The list class template example from the previous section can be extended as follows, though we leave the actual implementation of the iterator open.
template <class T> class list {
public:
    void push_back( const T& t); // append t to list.
    typedef ... iterator;
    iterator begin();
    iterator end();
};
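One way the elided iterator could be filled in is sketched here for a singly linked list. This is an assumption for illustration only, not the implementation the text has in mind: the Node structure, the namespace sketch, and the decision to disable copying are all choices made to keep the example short, and C++11 is assumed for nullptr and deleted functions.

```cpp
namespace sketch {

template <class T>
class list {
    struct Node {
        T     value;
        Node* next;
        explicit Node(const T& v) : value(v), next(nullptr) {}
    };
    Node* head;   // first node, or nullptr for an empty list
    Node* tail;   // last node, or nullptr for an empty list

public:
    // Minimal iterator: Default Constructible, Assignable,
    // Equality Comparable, dereferenceable, and incrementable.
    class iterator {
        Node* node;
    public:
        iterator() : node(nullptr) {}
        explicit iterator(Node* n) : node(n) {}
        bool operator==(const iterator& other) const { return node == other.node; }
        bool operator!=(const iterator& other) const { return node != other.node; }
        T& operator*() const { return node->value; }
        T* operator->() const { return &node->value; }
        iterator& operator++() { node = node->next; return *this; }
        iterator operator++(int) { iterator tmp = *this; ++*this; return tmp; }
    };

    list() : head(nullptr), tail(nullptr) {}
    list(const list&) = delete;             // copying omitted in this sketch
    list& operator=(const list&) = delete;
    ~list() {
        while (head != nullptr) {
            Node* next = head->next;
            delete head;
            head = next;
        }
    }

    // Append t to the list.
    void push_back(const T& t) {
        Node* n = new Node(t);
        if (tail != nullptr) tail->next = n; else head = n;
        tail = n;
    }

    iterator begin() { return iterator(head); }
    iterator end()   { return iterator(nullptr); }  // past-the-end position
};

} // namespace sketch
```

For an empty list, begin() and end() compare equal, which matches the half-open range convention. The sketch omits the nested types (such as iterator_category) that the standard algorithms expect from an iterator.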
Generic algorithms in the STL are not written for a particular container
class; they use iterators instead. For example, a generic
contains function can be written to work for any model of an
input iterator. It returns true iff the value is
contained in the values of the range [first,beyond).
template <class InputIterator, class T>
bool contains( InputIterator first, InputIterator beyond, const T& value){
    while ((first != beyond) && (*first != value))
        ++first;
    return (first != beyond);
}
This generic contains function can be used with C-pointers
referring to a C-array. Recall that C-pointers are a model of a
random access iterator, which is a refinement of an input iterator,
so they can be used wherever an input iterator is expected.
The following example declares an array of a hundred integers and
searches for a 42.
int a[100];
// ... initialize elements of a.
bool found = contains( a, a+100, 42);
We can also search only a part of an array.
bool in_first_half = contains( a, a+50, 42);
bool in_third_quarter = contains( a+50, a+75, 42);
This generic contains function can also be used with our list class template as illustrated in the following example:
list<int> ls;
// ... insert some elements into ls.
bool found = contains( ls.begin(), ls.end(), 42);
A generic copy function copies the values of an iterator range to a sequence starting where another iterator points to. The copy function returns an iterator pointing to the past-the-end position of the target sequence after copying.
template <class InputIterator, class OutputIterator>
OutputIterator copy( InputIterator first, InputIterator beyond,
                     OutputIterator result){
    while (first != beyond)
        *result++ = *first++;
    return result;
}
Let's copy 100 elements from an array of integers to another array of integers.
int a1[100];
int a2[100];
// ... initialize elements of a1.
copy( a1, a1+100, a2);
The copy function overwrites the already existing elements in a2. If we want to copy the 100 elements into a list that is initially empty, we cannot use the begin() iterator of the list. For an empty list the begin() iterator is equal to the end() iterator, which is not dereferenceable.
In these cases the STL provides small adaptors that interface between the concepts. Here, the adaptor is a model of an output iterator, and it uses a model of a container class, here the list, to append a new element to the end of the container whenever an element is written to the iterator. We will see later on how this back_inserter adaptor is actually implemented. Here is an example of how it is used with the copy function and the list class, assuming we still have the array a1 at hand.
list<int> ls;
copy( a1, a1+100, back_inserter(ls));
There are also adaptors to interface between C++ I/O streams and iterators. The following example reads integers from the standard input stream and writes them to the standard output stream, each integer followed by a newline "\n". The istream_iterator with the empty parentheses denotes the past-the-end position for this range, which is the end-of-file condition for the stream.
copy( istream_iterator<int>(cin), istream_iterator<int>(),
ostream_iterator<int>( cout, "\n"));
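The same stream adaptors work with any standard stream, which makes the idea easy to try without a terminal. The following sketch uses string streams in place of cin and cout; the function name echo_ints is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <sstream>
#include <string>

// Copies all integers that can be read from `input` into a string,
// each integer followed by a newline.
inline std::string echo_ints(const std::string& input) {
    std::istringstream in(input);
    std::ostringstream out;
    std::copy(std::istream_iterator<int>(in),   // start of the input range
              std::istream_iterator<int>(),     // past-the-end position
              std::ostream_iterator<int>(out, "\n"));
    return out.str();
}
```

Reading stops at the end of the input or at the first token that cannot be parsed as an integer, because in both cases the stream iterator becomes equal to the past-the-end iterator.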
The concepts in the STL and the adaptors form an extremely flexible
toolkit. Most adaptors are small classes and functions. Custom adaptors
for other concepts are easy to add. The whole is more than the sum of
its parts.
template <class T>
class Const_value {
    T t;
public:
    // Default Constructible !
    Const_value() {}
    explicit Const_value( const T& s) : t(s) {}
    // Assignable by default.
    // Equality Comparable (not so easy what that should mean here)
    bool operator==( const Const_value<T>& cv) const { return ( this == &cv); }
    bool operator!=( const Const_value<T>& cv) const { return !(*this == cv); }
    // Trivial Iterator:
    const T& operator* () const { return t; }
    const T* operator->() const { return & operator*(); }
    // Input Iterator
    Const_value<T>& operator++() { return *this; }
    Const_value<T> operator++(int) {
        Const_value<T> tmp = *this;
        ++*this;
        return tmp;
    }
};
Note that operator!= and operator++(int) are
implemented in terms of other member functions of the iterator. In
this example, they are unnecessarily complicated. But in general, only
a small subset of the member functions needs to be implemented for a
new iterator; all other member functions are generic.
Other examples of such simple input iterators are a counting iterator and a random number generator.
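A counting iterator can be sketched along the same lines as Const_value. The class name Counting_iterator and its interface are assumptions made for this illustration, and integer overflow is ignored. In contrast to Const_value, equality has an obvious meaning here, so two counting iterators delimit a finite range.

```cpp
namespace sketch {

// Input iterator over the sequence start, start+1, start+2, ...
class Counting_iterator {
    int n;
public:
    // Default Constructible
    Counting_iterator() : n(0) {}
    explicit Counting_iterator(int start) : n(start) {}
    // Equality Comparable: two iterators are equal if they are at the
    // same position in the sequence.
    bool operator==(const Counting_iterator& c) const { return n == c.n; }
    bool operator!=(const Counting_iterator& c) const { return !(*this == c); }
    // Trivial Iterator
    const int& operator*() const { return n; }
    // Input Iterator
    Counting_iterator& operator++() { ++n; return *this; }
    Counting_iterator operator++(int) {
        Counting_iterator tmp = *this;
        ++*this;
        return tmp;
    }
};

} // namespace sketch
```

The range [Counting_iterator(0), Counting_iterator(5)) then denotes the values 0, 1, 2, 3, 4 without storing them anywhere.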
Using the concept of lazy evaluation from functional programming languages we can also imagine iterators representing more complex and potentially infinite sequences, for example, the sequence of prime numbers.
However, there is no point in copying an infinite sequence. Instead, we might be interested in a finite subsequence. Another generic function, copy_n, solves this. Note that copy_n was not part of the C++98 standard (std::copy_n was added in C++11), but it is available in most implementations of the STL and is easy to write (see also Const_value.C).
int a[100];
Const_value<int> cv( 42);
copy_n( cv, 100, a); // fills a with 100 times 42.
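A possible copy_n can be written in a few lines. This is a sketch, not the code of any particular STL implementation; the parameter order (iterator, count, output iterator) is chosen to match the call above, and the function lives in a namespace sketch to keep it apart from any library version.

```cpp
namespace sketch {

// Copies n values, starting at first, to the sequence starting at result.
// Returns the past-the-end position of the target sequence.
template <class InputIterator, class Size, class OutputIterator>
OutputIterator copy_n(InputIterator first, Size n, OutputIterator result) {
    for (; n > 0; --n) {
        *result = *first;
        ++result;
        ++first;
    }
    return result;
}

} // namespace sketch
```

Only the operations of an input iterator and an output iterator are used, so the function also accepts iterators such as Const_value that never reach a past-the-end position.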
| Concept | Refinement of | Syntactic requirements |
| Generator | Assignable | function call, no arguments: Result operator()() |
| Unary Function | Assignable | function call, one argument: Result operator()(Arg1) |
| Binary Function | Assignable | function call, two arguments: Result operator()(Arg1, Arg2) |
| Predicate | Unary Function | result type is bool |
| Binary Predicate | Binary Function | result type is bool |
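The rows of this table can be illustrated with one small model per concept. All four class names are hypothetical, and the examples are restricted to int to keep them short.

```cpp
namespace sketch {

// Generator: called with no arguments; yields 0, 1, 2, ... on
// successive calls.
struct Sequence_generator {
    int next;
    Sequence_generator() : next(0) {}
    int operator()() { return next++; }
};

// Unary Function: called with one argument.
struct Square {
    int operator()(int x) const { return x * x; }
};

// Predicate: a Unary Function whose result type is bool.
struct Is_even {
    bool operator()(int x) const { return x % 2 == 0; }
};

// Binary Predicate: a Binary Function whose result type is bool.
// Compares by absolute value (overflow for the smallest int is ignored).
struct Less_by_abs {
    bool operator()(int a, int b) const {
        return (a < 0 ? -a : a) < (b < 0 ? -b : b);
    }
};

} // namespace sketch
```

All four classes get their copy constructor and assignment operator from the compiler, so they are models of Assignable as the table requires.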
Function objects are well suited as parameters for generic functions. A typical example would be the exchange of the equality comparison with a function object, which is currently hard coded as the operator== in the generic contains function from above. First, we define a function object equals that performs the same comparison.
template <class T>
struct equals {
    bool operator()( const T& a, const T& b) { return a == b; }
};
We modify the iterator-based generic contains function from
above. It needs an additional template parameter Eq and takes
an additional function parameter eq for a binary function
object which is used for the comparison.
template <class InputIterator, class T, class Eq>
bool contains( InputIterator first, InputIterator beyond, const T& value,
               Eq eq ) {
    while ((first != beyond) && ( ! eq( *first, value)))
        ++first;
    return (first != beyond);
}
The example using C-arrays with the contains function now needs
an additional argument, the function object. The expression
equals<int>() calls the default constructor of the
class equals<int> from above, which is a
function object comparing two integers for equality.
int a[100];
// ... initialize elements of a.
bool found = contains( a, a+100, 42, equals<int>());
The next section illustrates how the additional parameter of the contains function can be selected automatically if the value type of the iterator is known.
C++ also allows plain function pointers to be used as function objects. The advantage of objects is that they can have an internal state. We continue our example of the contains function and define a comparison object that is true when the absolute value of the difference of its two arguments is not larger than eps. The eps value is stored in the function object itself. The actual value for eps is initialized when the function object is constructed, in our example to one, so that the contains function will also return true if the values 41 or 43 occur in the range.
template <class T>
struct eps_equals {
    T epsilon;
    eps_equals( const T& eps) : epsilon(eps) {}
    bool operator()( const T& a, const T& b) {
        return (a-b <= epsilon) && (b-a <= epsilon);
    }
};
bool found = contains( a, a+100, 42, eps_equals<int>(1));
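The remark that plain function pointers can serve as function objects can be sketched as follows. The contains function is repeated so that the example is self-contained, and the comparison function same_parity is a made-up example.

```cpp
namespace sketch {

template <class InputIterator, class T, class Eq>
bool contains(InputIterator first, InputIterator beyond, const T& value,
              Eq eq) {
    while ((first != beyond) && !eq(*first, value))
        ++first;
    return first != beyond;
}

// An ordinary function. When its name is passed to contains, Eq is
// deduced as a pointer to function, and eq(*first, value) calls the
// function through that pointer.
inline bool same_parity(const int& a, const int& b) {
    return (a - b) % 2 == 0;
}

} // namespace sketch
```

A call such as sketch::contains(a, a+3, 7, sketch::same_parity) then searches for any value with the same parity as 7. Unlike eps_equals, a function pointer cannot carry a value such as eps along with it.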
How about a function object that counts the number of comparisons
needed as a side-effect? Here it is:
template <class T>
struct count_equals {
    size_t& count;
    count_equals( size_t& c) : count(c) {}
    bool operator()( const T& a, const T& b) {
        ++count;
        return a == b;
    }
};
size_t counter = 0;
bool found = contains( a, a+100, 42, count_equals<int>(counter));
// counter contains number of comparisons needed.
Note that since function objects are usually passed by value in the
STL we store a reference to an external counter and not the counter
value itself in the function objects.
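The effect of passing function objects by value can be observed directly by comparing two variants of the counting object. In this sketch the names count_equals_by_value and count_equals_by_reference are hypothetical, and contains is repeated to keep the example self-contained.

```cpp
#include <cstddef>

namespace sketch {

template <class InputIterator, class T, class Eq>
bool contains(InputIterator first, InputIterator beyond, const T& value,
              Eq eq) {
    while ((first != beyond) && !eq(*first, value))
        ++first;
    return first != beyond;
}

// Counter stored inside the function object. contains works on its own
// copy, so the caller's object never sees the increments.
template <class T>
struct count_equals_by_value {
    std::size_t count;
    count_equals_by_value() : count(0) {}
    bool operator()(const T& a, const T& b) { ++count; return a == b; }
};

// Counter stored outside the function object. Every copy refers to the
// same variable, so the increments remain visible to the caller.
template <class T>
struct count_equals_by_reference {
    std::size_t& count;
    explicit count_equals_by_reference(std::size_t& c) : count(c) {}
    bool operator()(const T& a, const T& b) { ++count; return a == b; }
};

} // namespace sketch
```

Searching the array {7, 8, 42, 9} for 42 takes three comparisons. After the call, the by-value object held by the caller still reports zero, while the external counter used by the by-reference variant reports three.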
struct iterator_over_ints {
    typedef int value_type;
    // ...
};
Since a C-pointer is a valid iterator, this approach is not
sufficient. The solution chosen for the STL is the iterator traits
class, which is a class template parameterized with an iterator:
template <class Iterator>
struct iterator_traits {
    typedef typename Iterator::value_type value_type;
    // ...
};
The value type of the iterator example class above can now be
expressed as iterator_traits< iterator_over_ints
>::value_type. For C-pointers a specialized version of the
iterator traits class exists.
template <class T>
struct iterator_traits<T*> {
    typedef T value_type;
    // ...
};
Now the value type of a C-pointer, e.g., to int, can be
expressed as iterator_traits< int* >::value_type. Here,
partial specialization is required. The iterator traits class
also contains definitions of the difference_type, the
iterator_category, the pointer type, and the
reference type of the iterator.
The example of the generic contains function with the function object from above can be made more convenient for the default use with a default initializer as follows (see also contains.C):
template <class InputIterator, class T>
bool contains( InputIterator first, InputIterator beyond, const T& value) {
    typedef typename iterator_traits<InputIterator>::value_type value_type;
    typedef equals<value_type> Equal;
    return contains( first, beyond, value, Equal());
}
The STL makes use of traits classes in other places as well. For example,
char_traits defines the equality test and other operations
for a character type. In addition, this character traits class is used
as a template parameter for the basic_string class template,
which allows the adaptation of the string class to different character
sets.
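A well-known illustration of this flexibility is a string type that compares without regard to case. The sketch below derives a traits class from std::char_traits<char> and replaces only the comparison-related functions; the names ci_char_traits and ci_string are hypothetical, and the approach assumes single-byte characters handled by std::toupper.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace sketch {

// Character traits that compare characters without regard to case.
struct ci_char_traits : public std::char_traits<char> {
    static char upper(char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return upper(a) == upper(b); }
    static bool lt(char a, char b) { return upper(a) < upper(b); }
    // Three-way comparison of the first n characters of s1 and s2.
    static int compare(const char* s1, const char* s2, std::size_t n) {
        for (; n > 0; --n, ++s1, ++s2) {
            if (lt(*s1, *s2)) return -1;
            if (lt(*s2, *s1)) return 1;
        }
        return 0;
    }
    // First position in [s, s+n) that is equal to c, or null if none.
    static const char* find(const char* s, std::size_t n, char c) {
        for (; n > 0; --n, ++s)
            if (eq(*s, c)) return s;
        return 0;
    }
};

typedef std::basic_string<char, ci_char_traits> ci_string;

} // namespace sketch
```

With this typedef, ci_string("Hello") == ci_string("hELLO") holds, because the comparison operators of basic_string delegate to the traits class. Note that ci_string is a different type from std::string, so the two do not convert into each other implicitly.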
| Concept | Refinement of | Syntactic requirements, model T |
| Adaptable Generator | Generator | T::result_type |
| Adaptable Unary Function | Unary Function | T::result_type, T::argument_type |
| Adaptable Binary Function | Binary Function | T::result_type, T::first_argument_type, T::second_argument_type |
| Adaptable Predicate | Predicate, Adaptable Unary Function | |
| Adaptable Binary Predicate | Binary Predicate, Adaptable Binary Function | |
Small helper classes help to define adaptable function objects easily. For example, our function object equals from above could be derived from std::binary_function to declare the appropriate types.
#include <functional>
template <class T>
struct equals : public std::binary_function<T,T,bool> {
    bool operator()( const T& a, const T& b) { return a == b; }
};
The definition of binary_function in the STL is as follows:
template <class Arg1, class Arg2, class Result>
struct binary_function {
    typedef Arg1 first_argument_type;
    typedef Arg2 second_argument_type;
    typedef Result result_type;
};
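Since binary_function only contributes three typedefs, the same effect can be had by writing them by hand. This matters with newer compilers, because std::binary_function was deprecated in C++11 and removed in C++17. The following sketch shows the hand-written form together with a hypothetical adaptor, swap_arguments, that relies on the nested types; the names minus and swap_arguments in namespace sketch are assumptions for this illustration.

```cpp
namespace sketch {

// Adaptable binary predicate; the three nested types are written by hand
// instead of being inherited from binary_function.
template <class T>
struct equals {
    typedef T    first_argument_type;
    typedef T    second_argument_type;
    typedef bool result_type;
    bool operator()(const T& a, const T& b) const { return a == b; }
};

// Adaptable binary function computing a - b.
template <class T>
struct minus {
    typedef T first_argument_type;
    typedef T second_argument_type;
    typedef T result_type;
    T operator()(const T& a, const T& b) const { return a - b; }
};

// Adaptor that calls an adaptable binary function with its two arguments
// swapped. It needs the nested types to declare its own signature.
template <class Operation>
class swap_arguments {
    Operation op;
public:
    typedef typename Operation::second_argument_type first_argument_type;
    typedef typename Operation::first_argument_type  second_argument_type;
    typedef typename Operation::result_type          result_type;
    explicit swap_arguments(const Operation& x) : op(x) {}
    result_type operator()(const first_argument_type& a,
                           const second_argument_type& b) const {
        return op(b, a);
    }
};

} // namespace sketch
```

Applied to minus<int>, the adaptor computes b - a instead of a - b, and the result is again an adaptable binary function.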
Adaptable function objects can be used with adaptors to compose
function objects. The adaptors need the annotated type information to
declare proper function signatures etc. An example is the negator
unary_negate, which takes a unary predicate and is itself a model
of a unary predicate, but with negated boolean values.
template <class Predicate>
class unary_negate
    : public unary_function< typename Predicate::argument_type, bool> {
protected:
    Predicate pred;
public:
    explicit unary_negate( const Predicate& x) : pred(x) {}
    bool operator()(const typename Predicate::argument_type& x) const {
        return ! pred(x);
    }
};
The function adaptors are paired with function templates for easy
creation. The idea is that the function template derives the type for
the template argument automatically (because of the matching
types).
template <class Predicate>
inline unary_negate< Predicate>
not1( const Predicate& pred) {
    return unary_negate< Predicate>( pred);
}
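The negator and its helper function can be tried out in isolation. The sketch below repeats them in a namespace sketch, together with a minimal unary_function, so that it does not depend on the versions in <functional> (which newer C++ standards have deprecated or removed). The predicate Is_even is a made-up example.

```cpp
#include <algorithm>

namespace sketch {

template <class Arg, class Result>
struct unary_function {
    typedef Arg    argument_type;
    typedef Result result_type;
};

template <class Predicate>
class unary_negate
    : public unary_function<typename Predicate::argument_type, bool> {
protected:
    Predicate pred;
public:
    explicit unary_negate(const Predicate& x) : pred(x) {}
    bool operator()(const typename Predicate::argument_type& x) const {
        return !pred(x);
    }
};

// The template argument is deduced from the function argument, so the
// caller never has to spell out unary_negate<Is_even>.
template <class Predicate>
inline unary_negate<Predicate> not1(const Predicate& pred) {
    return unary_negate<Predicate>(pred);
}

// Adaptable predicate used to exercise the adaptor.
struct Is_even : public unary_function<int, bool> {
    bool operator()(const int& x) const { return x % 2 == 0; }
};

} // namespace sketch
```

For the array {1, 2, 3, 4, 5}, std::count_if with Is_even() counts two elements, and with sketch::not1(Is_even()) it counts the remaining three.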
A short program in [Stepanov95]
makes use of this negater. The program copies all integers from
cin to cout that cannot be divided by the integer
parameter given to the program. (see also remove_if_divides.C)
int main( int argc, char** argv) {
    if ( argc != 2)
        throw( "usage: remove_if_divides integer\n");
    remove_copy_if( istream_iterator<int>(cin), istream_iterator<int>(),
                    ostream_iterator<int>(cout, "\n"),
                    not1( bind2nd( modulus<int>(), atoi( argv[1]))));
    return 0;
}
The other function object adaptor in this example, bind2nd,
is again a small helper function to create an object of type
binder2nd.
template < class Operation, class Tp>
inline binder2nd< Operation>
bind2nd( const Operation& fn, const Tp& x) {
    typedef typename Operation::second_argument_type Arg2_type;
    return binder2nd< Operation>( fn, Arg2_type(x));
}
An object of type binder2nd stores an adaptable binary
function object and a value compatible with the type of the second
argument of the adaptable binary function object. The object itself
then behaves like a unary function object. Whenever its function call
operator is invoked, it returns the value of the binary function object
called with its argument and its internally stored value as second
argument. This adaptor binds a value to the free variable of the second
argument of a binary function object. There is a similar adaptor called
binder1st that binds a value to the first argument. This is
similar to currying, known from functional programming languages
(it needs much more writing in C++ to make it work, but then it
works). So, these are higher-order function objects.
template <class Operation>
class binder2nd
    : public unary_function< typename Operation::first_argument_type,
                             typename Operation::result_type> {
protected:
    Operation op;
    typename Operation::second_argument_type value;
public:
    binder2nd( const Operation& x,
               const typename Operation::second_argument_type& y)
        : op(x), value(y) {}
    typename Operation::result_type
    operator()(const typename Operation::first_argument_type& x) const {
        return op(x, value);
    }
};
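To see the binder at work without relying on <functional>, the following sketch repeats binder2nd and bind2nd in a namespace sketch and adds a hand-written modulus as a stand-in for std::modulus. The stand-in and the namespace are assumptions for this illustration.

```cpp
namespace sketch {

template <class Arg, class Result>
struct unary_function {
    typedef Arg    argument_type;
    typedef Result result_type;
};

// Adaptable binary function computing a % b.
template <class T>
struct modulus {
    typedef T first_argument_type;
    typedef T second_argument_type;
    typedef T result_type;
    T operator()(const T& a, const T& b) const { return a % b; }
};

template <class Operation>
class binder2nd
    : public unary_function<typename Operation::first_argument_type,
                            typename Operation::result_type> {
protected:
    Operation op;
    typename Operation::second_argument_type value;   // the bound argument
public:
    binder2nd(const Operation& x,
              const typename Operation::second_argument_type& y)
        : op(x), value(y) {}
    typename Operation::result_type
    operator()(const typename Operation::first_argument_type& x) const {
        return op(x, value);
    }
};

template <class Operation, class Tp>
inline binder2nd<Operation> bind2nd(const Operation& fn, const Tp& x) {
    typedef typename Operation::second_argument_type Arg2_type;
    return binder2nd<Operation>(fn, Arg2_type(x));
}

} // namespace sketch
```

The expression sketch::bind2nd(sketch::modulus<int>(), 3) yields a unary function object that maps x to x % 3, so applying it to 10 gives 1.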
Other function object adaptors exist that can compose function
objects, or encapsulate function pointers and member function pointers
in adaptable function objects.
template <class Container>
class back_insert_iterator {
protected:
    Container* container;
public:
    typedef Container container_type;
    typedef output_iterator_tag iterator_category;
    typedef void value_type;
    typedef void difference_type;
    typedef void pointer;
    typedef void reference;
    explicit back_insert_iterator(Container& x) : container(&x) {}
    back_insert_iterator<Container>&
    operator=(const typename Container::value_type& value) {
        container->push_back(value);
        return *this;
    }
    back_insert_iterator<Container>& operator*() { return *this; }
    back_insert_iterator<Container>& operator++() { return *this; }
    back_insert_iterator<Container>& operator++(int) { return *this; }
};
A small helper function template provides again the convenience not to
type the template arguments explicitly.
template <class Container>
inline back_insert_iterator<Container> back_inserter(Container& x) {
    return back_insert_iterator<Container>(x);
}
Here is a short example of its use with a list class.
list<int> ls;
copy( a1, a1+100, back_inserter(ls));
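The same call can be tried with the standard library versions of list, copy, and back_inserter. The wrapper function to_list in this sketch is hypothetical and only there to make the example self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <list>

// Appends the n integers starting at `first` to an initially empty list.
inline std::list<int> to_list(const int* first, std::size_t n) {
    std::list<int> ls;   // empty, so begin() == end() and neither can be written to
    std::copy(first, first + n, std::back_inserter(ls));
    return ls;
}
```

Every assignment through the insert iterator turns into a push_back on the list, so the list grows by one element per copied value.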
The C++ standard defines five empty classes to denote the different iterator categories. These types will be used as symbolic tags at compile time.
struct input_iterator_tag {};
struct output_iterator_tag {};
struct forward_iterator_tag : public input_iterator_tag {};
struct bidirectional_iterator_tag : public forward_iterator_tag {};
struct random_access_iterator_tag : public bidirectional_iterator_tag {};
An iterator is assumed to have a local type iterator_category
that is defined to be one of these tags.
struct Some_iterator {
    typedef forward_iterator_tag iterator_category;
    // ...
};
This iterator category is accessed using iterator traits. Now we can
implement a generic distance function (original implementation as it is in the STL):
template <class InputIterator>
inline typename iterator_traits<InputIterator>::difference_type
__distance( InputIterator first, InputIterator last, input_iterator_tag) {
    typename iterator_traits<InputIterator>::difference_type n = 0;
    while (first != last) {
        ++first; ++n;
    }
    return n;
}
template <class RandomAccessIterator>
inline typename iterator_traits<RandomAccessIterator>::difference_type
__distance( RandomAccessIterator first, RandomAccessIterator last,
            random_access_iterator_tag) {
    return last - first;
}
template <class InputIterator>
inline typename iterator_traits<InputIterator>::difference_type
distance( InputIterator first, InputIterator last) {
    typedef typename iterator_traits<InputIterator>::iterator_category
        Category;
    return __distance(first, last, Category());
}
Note how the class hierarchy among the iterator tags is used to reduce
the number of overloaded functions __distance that need to be
implemented here. Following the refinement relation of the iterator
concepts, the forward_iterator_tag should also be derived
from the output_iterator_tag. Obscure reasons about multiple
derivation kept this derivation out of the standard. On the other
hand, this derivation isn't likely to simplify real implementations
anyway.
These tags are quite convenient to annotate symbolic information at compile time. However, there is a catch. An object always has non-zero size, even an object of an empty class. This is reasonable (the address identifies an object) and helps in defining invariants about size, allocation, arrays, etc. However, if we derive from an empty class, as we do with function objects and binary_function<Arg1,Arg2,Result>, we would like to avoid any size penalty. In principle the compiler can perform this optimization, but not every compiler does; at the time of writing, for example, g++ did not. The following program shows the effect.
#include <iostream>
using namespace std;
class A {};
class B : public A {
    int i;
};
class C {
    int i;
};
int main() {
    cout << "size of A = " << sizeof(A) << endl;
    cout << "size of B = " << sizeof(B) << endl;
    cout << "size of C = " << sizeof(C) << endl;
    return 0;
}
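The size question can be examined a little further by comparing derivation from an empty class with holding an empty class as a member. The following sketch uses made-up type names; the exact sizes are implementation-defined, so the comments state only what the language guarantees.

```cpp
namespace sketch {

struct Empty {};

// Empty class used as a base: the compiler is allowed to let the base
// share storage with the derived object (empty base optimization), in
// which case sizeof(Derived_from_empty) == sizeof(int).
struct Derived_from_empty : public Empty {
    int i;
};

// Empty class used as a member: the member needs storage of its own, so
// this struct is always larger than an int.
struct Holds_empty {
    Empty e;
    int   i;
};

} // namespace sketch
```

Printing sizeof for the three types on a given compiler shows whether the optimization is applied there. This is one reason why adaptors tend to derive from empty helper classes rather than store them as members.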