A
catalog of STL algorithms
This
section provides a quick reference for when you’re searching for the
appropriate algorithm. I leave the full exploration of all the STL algorithms
to other references (see the end of this chapter, and Appendix XX), along with
the more intimate details of complexity, performance, etc. My goal here is for
you to become rapidly comfortable and facile with the algorithms, and I will
assume you will look into the more specialized references if you need more
depth of detail.
Although
you will often see the algorithms described using their full template
declaration syntax, I am not doing that here because you already know they are
templates, and it’s quite easy to see what the template arguments are
from the function declarations. The type names for the arguments provide
descriptions for the types of iterators required. I think you’ll find
this form is easier to read, while you can quickly find the full declaration in
the template header file if for some reason you feel the need.
The
names of the iterator classes describe the iterator type they must conform to.
The iterator types were described in the previous chapter, but here is a summary:
InputIterator.
You
(or rather, the STL algorithm and any algorithms you write that use
InputIterators)
can
increment this with
operator++
and dereference it with
operator*
to
read
the value (and
only
read the value), but you can only read each value once.
InputIterators
can
be tested with
operator==
and
operator!=.
That’s all. Because an
InputIterator
is so limited, it can be used with
istreams
(via
istream_iterator). OutputIterator.
This
can be incremented with
operator++,
and dereferenced with
operator*
to
write
the value (and
only
write the value), but you can only dereference/write each value once.
OutputIterators
cannot be tested with
operator==
and
operator!=,
however, because you assume that you can just keep sending elements to the
destination and that you don’t have to see if the destination’s end
marker has been reached. That is, the container that an
OutputIterator
references can take an infinite number of objects, so no end-checking is
necessary. This requirement is important so that an
OutputIterator
can be used with
ostreams
(via
ostream_iterator),
but you’ll also commonly use the “insert” iterators
insert_iterator,
front_insert_iterator
and
back_insert_iterator
(generated by the helper templates
inserter( ),
front_inserter( )
and
back_inserter( )). With
both
InputIterator
and
OutputIterator,
you cannot have multiple iterators pointing to different parts of the same
range. Just think in terms of iterators to support
istreams
and
ostreams,
and
InputIterator
and
OutputIterator
will make perfect sense. Also note that
InputIterator
and
OutputIterator
put the weakest restrictions on the types of iterators they will accept, which
means that you can use any “more sophisticated” type of iterator
when you see
InputIterator
or
OutputIterator
used as STL algorithm template arguments.
ForwardIterator.
InputIterator
and
OutputIterator
are the most restricted, which means they’ll work with the largest number
of actual iterators. However, there are some operations for which they are too
restricted; you can only read from an
InputIterator
and write to an
OutputIterator,
so you can’t use them to read and modify a range, for example, and you
can’t have more than one active iterator on a particular range, or
dereference such an iterator more than once. With a
ForwardIterator
these restrictions are relaxed; you can still only move forward using
operator++,
but you can both write and read and you can write/read multiple times in each
location. A
ForwardIterator
is much more like a regular pointer, whereas
InputIterator
and
OutputIterator
are a bit strange by comparison.
BidirectionalIterator.
Effectively,
this is a
ForwardIterator
that can also go backward. That is, a
BidirectionalIterator
supports all the operations that a
ForwardIterator
does, but in addition it has an
operator--.
RandomAccessIterator.
An iterator that is random access supports all the same operations that a
regular pointer does: you can add and subtract integral values to move it
forward and backward by jumps (rather than just one element at a time), you can
subscript it with
operator[ ],
you can subtract one iterator from another, and iterators can be compared to
see which is greater using
operator<,
operator>,
etc. If you’re implementing a sorting routine or something similar,
random access iterators are necessary to be able to create an efficient
algorithm.
The
names used for the template parameter types consist of the above iterator types
(sometimes with a ‘1’ or ‘2’ appended to distinguish
different template arguments), and may also include other arguments, often
function objects.
When
describing the group of elements that an operation is performed on,
mathematical “range” notation will often be used. In this, the
square bracket means “includes the end point” while the parenthesis
means “does not include the end point.” When using iterators, a
range is determined by the iterator pointing to the initial element, and the
“past-the-end” iterator, pointing past the last element. Since the
past-the-end element is never used, the range determined by a pair of iterators
can thus be expressed as
[first,
last)
,
where
first
is the iterator pointing to the initial element and
last
is the past-the-end iterator.
Most
books and discussions of the STL algorithms arrange them according to side
effects: nonmutating algorithms don’t change the elements in the range,
mutating algorithms do change the elements, etc. These descriptions are based
more on the underlying behavior or implementation of the algorithm – that
is, the designer’s perspective. In practice, I don’t find this a
very useful categorization so I shall instead organize them according to the
problem you want to solve: are you searching for an element or set of elements,
performing an operation on each element, counting elements, replacing elements,
etc. This should help you find the one you want more easily.
Note
that all the algorithms are in the
namespace
std
.
If you do not see a different header such as
<utility>
or
<numerics>
above the function declarations, that means it appears in
<algorithm>.
Support
tools for example creation
It’s
useful to create some basic tools with which to test the algorithms.
Displaying
a range is something that will be done constantly, so here is a templatized
function that allows you to print any sequence, regardless of the type
that’s in that sequence:
//: C21:PrintSequence.h
// Prints the contents of any sequence
#ifndef PRINTSEQUENCE_H
#define PRINTSEQUENCE_H
#include <iostream>
template<typename InputIter>
void print(InputIter first, InputIter last,
char * nm = "", char* sep = "\n",
std::ostream& os = std::cout) {
if(*nm != '\0') // Only if you provide a string
os << nm << ": " << sep; // is this printed
while(first != last)
os << *first++ << sep;
os << std::endl;
}
// Use template-templates to allow type deduction
// of the typename T:
template<typename T, template<typename> class C>
void print(C<T>& c, char * nm = "",
char* sep = "\n",
std::ostream& os = std::cout) {
if(*nm != '\0') // Only if you provide a string
os << nm << ": " << sep; // is this printed
std::copy(c.begin(), c.end(),
std::ostream_iterator<T>(os, " "));
cout << endl;
}
#endif // PRINTSEQUENCE_H ///:~
There
are two forms here, one that requires you to give an explicit range (this
allows you to print an array or a sub-sequence) and one that prints any of the
STL containers, which provides notational convenience when printing the entire
contents of that container. The second form performs template type deduction to
determine the type of
T
so it can be used in the
copy( )
algorithm. That trick wouldn’t work with the first form, so the
copy( )
algorithm is avoided and the copying is just done by hand (this could have been
done with the second form as well, but it’s instructive to see a
template-template in use). Because of this, you never need to specify the type
that you’re printing when you call either template function.
The
default is to print to
cout
with newlines as separators, but you can change that. You may also provide a
message to print at the head of the output.
Next,
it’s useful to have some generators (classes with an
operator( )
that returns values of the appropriate type) which allow a sequence to be
rapidly filled with different values.
//: C21:Generators.h
// Different ways to fill sequences
#ifndef GENERATORS_H
#define GENERATORS_H
#include <set>
#include <cstdlib>
#include <ctime>
// A generator that can skip over numbers:
class SkipGen {
int i;
int skp;
public:
SkipGen(int start = 0, int skip = 1)
: i(start), skp(skip) {}
int operator()() {
int r = i;
i += skp;
return r;
}
};
// Generate unique random numbers from 0 to mod:
class URandGen {
std::set<int> used;
int modulus;
public:
URandGen(int mod) : modulus(mod) {
std::srand(std::time(0));
}
int operator()() {
while(true) {
int i = (int)std::rand() % modulus;
if(used.find(i) == used.end()) {
used.insert(i);
return i;
}
}
}
};
// Produces random characters:
class CharGen {
static const char* source;
static const int len;
public:
CharGen() { std::srand(std::time(0)); }
char operator()() {
return source[std::rand() % len];
}
};
// Statics created here for convenience, but
// will cause problems if multiply included:
const char* CharGen::source = "ABCDEFGHIJK"
"LMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const int CharGen::len = strlen(source);
#endif // GENERATORS_H ///:~
To
create some interesting values, the
SkipGen
generator skips by the value
skp
each time its
operator( )
is called. You can initialize both the start value and the skip value in the
constructor.
URandGen
(‘U’ for “unique”) is a generator for random
ints
between 0 and
mod,
with the additional constraint that each value can only be produced once (thus
you must be careful not to use up all the values). This is easily accomplished
with a
set. CharGen
generates
chars
and can be used to fill up a
string
(when treating a
string
as a sequence container). You’ll note that the one member function that
any generator implements is
operator( )
(with no arguments). This is what is called by the “generate”
functions.
The
use of the generators and the
print( )
functions is shown in the following section.
Finally,
a number of the STL algorithms that move elements of a sequence around
distinguish between “stable” and “unstable” reordering
of a sequence. This refers to preserving the original order of the elements for
those elements that are equivalent but not identical. For example, consider a
sequence
{
c(1), b(1), c(2), a(1), b(2), a(2) }
.
These elements are tested for equivalence based on their letters, but their
numbers indicate how they first appeared in the sequence. If you sort (for
example) this sequence using an unstable sort, there’s no guarantee of
any particular order among equivalent letters, so you could end up with
{
a(2), a(1), b(1), b(2), c(2), c(1) }
.
However, if you used a stable sort, it guarantees you will get
{
a(1), a(2), b(1), b(2), c(1), c(2) }
. To
demonstrate the stability versus instability of algorithms that reorder a
sequence, we need some way to keep track of how the elements originally
appeared. The following is a kind of
string
object that keeps track of the order in which that particular object originally
appeared, using a
static
map
that maps
NStrings
to
Counters.
Each
NString
then contains an
occurrence
field that indicates the order in which this
NString
was discovered:
//: C21:NString.h
// A "numbered string" that indicates which
// occurrence this is of a particular word
#ifndef NSTRING_H
#define NSTRING_H
#include <string>
#include <map>
#include <iostream>
class NString {
std::string s;
int occurrence;
struct Counter {
int i;
Counter() : i(0) {}
Counter& operator++(int) {
i++;
return *this;
} // Post-incr
operator int() { return i; }
};
// Keep track of the number of occurrences:
typedef std::map<std::string, Counter> csmap;
static csmap occurMap;
public:
NString() : occurrence(0) {}
NString(const std::string& x)
: s(x), occurrence(occurMap[s]++) {}
NString(const char* x)
: s(x), occurrence(occurMap[s]++) {}
// The synthesized operator= and
// copy-constructor are OK here
friend std::ostream& operator<<(
std::ostream& os, const NString& ns) {
return os << ns.s << " ["
<< ns.occurrence << "]";
}
// Need this for sorting. Notice it only
// compares strings, not occurrences:
friend bool
operator<(const NString& l, const NString& r) {
return l.s < r.s;
}
// For sorting with greater<NString>:
friend bool
operator>(const NString& l, const NString& r) {
return l.s > r.s;
}
// To get at the string directly:
operator const std::string&() const {return s;}
};
// Allocate static member object. Done here for
// brevity, but should actually be done in a
// separate CPP file:
NString::csmap NString::occurMap;
#endif // NSTRING_H ///:~
In
the constructors (one that takes a
string,
one that takes a
char*),
the simple-looking initialization
occurrence(occurMap[s]++)
performs all the work of maintaining and assigning the occurrence counts (see
the demonstration of the
map
class in the previous chapter for more details).
To
do an ordinary ascending sort, the only operator that’s necessary is
NString::operator<( ),
however to sort in reverse order the
operator>( )
is also provided so that the
greater
template can be used.
As
this is just a demonstration class I am getting away with the convenience of
putting the definition of the static member
occurMap
in the header file, but this will break down if the header file is included in
more than one place, so you should normally relegate all
static
definitions to CPP files.
Filling
& generating
These
algorithms allow you to automatically fill a range with a particular value, or
to generate a set of values for a particular range (these were introduced in
the previous chapter). The “fill” functions insert a single value
multiple times into the container, while the “generate” functions
use an object called a
generator
(described earlier) to create the values to insert into the container.
void
fill(ForwardIterator first, ForwardIterator last, const T& value);
void
fill_n(OutputIterator first, Size n, const T& value);
fill( )
assigns
value
to every element in the range
[first,
last)
.
fill_n( )
assigns
value
to
n
elements starting at
first. void
generate(ForwardIterator first, ForwardIterator last, Generator gen);
void
generate_n(OutputIterator first, Size n, Generator gen);
generate( )
makes a call to
gen( )
for each element in the range
[first,
last)
,
presumably
to
produce a different value for each element.
generate_n( )
calls
gen( )
n
times and assigns each result to
n
elements starting at
first.
Example
The
following example fills and generates into
vectors.
It also shows the use of
print( ):
//: C21:FillGenerateTest.cpp
// Demonstrates "fill" and "generate"
#include <vector>
#include <algorithm>
#include <string>
#include "Generators.h"
#include "PrintSequence.h"
using namespace std;
int main() {
vector<string> v1(5);
fill(v1.begin(), v1.end(), "howdy");
print(v1, "v1", " ");
vector<string> v2;
fill_n(back_inserter(v2), 7, "bye");
print(v2.begin(), v2.end(), "v2");
vector<int> v3(10);
generate(v3.begin(), v3.end(), SkipGen(4,5));
print(v3, "v3", " ");
vector<int> v4;
generate_n(back_inserter(v4),15, URandGen(30));
print(v4, "v4", " ");
} ///:~
A
vector<string>
is created with a pre-defined size. Since storage has already been created for
all the
string
objects in the
vector,
fill( )
can use its assignment operator to assign a copy of “howdy” to each
space in the
vector.
To print the result, the second form of
print( )
is used which simply needs a container (you don’t have to give the first
and last iterators). Also, the default newline separator is replaced with a
space.
The
second
vector<string>
v2
is not given an initial size so
back_inserter
must be used to force new elements in instead of trying to assign to existing
locations. Just as an example, the other
print( )
is used which requires a range.
The
generate( )
and
generate_n( )
functions have the same form as the “fill” functions except that
they use a generator instead of a constant value; here, both generators are
demonstrated.
Counting
All
containers have a method
size( )
that will tell you how many elements they hold. The following two algorithms
count objects only if they satisfy certain criteria.
IntegralValue
count(InputIterator first, InputIterator last,
const EqualityComparable& value);
Produces
the number of elements in
[first,
last)
that are equivalent to
value
(when tested using
operator==). IntegralValue
count_if(InputIterator first, InputIterator last, Predicate pred);
Produces
the number of elements
in
[first,
last)
which each cause
pred
to return
true.
Example
Here,
a
vector<char>
v
is
filled
with random characters (including some duplicates). A
set<char>
is initialized from
v,
so it holds only one of each letter represented in
v.
This
set
is used to count all the instances of all the different characters, which are
then displayed:
//: C21:Counting.cpp
// The counting algorithms
#include <vector>
#include <algorithm>
#include "Generators.h"
#include "PrintSequence.h"
using namespace std;
int main() {
vector<char> v;
generate_n(back_inserter(v), 50, CharGen());
print(v, "v", "");
// Create a set of the characters in v:
set<char> cs(v.begin(), v.end());
set<char>::iterator it = cs.begin();
while(it != cs.end()) {
int n = count(v.begin(), v.end(), *it);
cout << *it << ": " << n << ", ";
it++;
}
int lc = count_if(v.begin(), v.end(),
bind2nd(greater<char>(), 'a'));
cout << "\nLowercase letters: " << lc << endl;
sort(v.begin(), v.end());
print(v, "sorted", "");
} ///:~
The
count_if( )
algorithm is demonstrated by counting all the lowercase letters; the predicate
is created using the
bind2nd( )
and
greater
function object templates.
Manipulating
sequences
These
algorithms allow you to move sequences around.
OutputIterator
copy(InputIterator, first InputIterator last, OutputIterator destination);
Using
assignment, copies from
[first,
last)
to
destination,
incrementing
destination
after each assignment. Works with almost any type of source range and almost
any kind of destination. Because assignment is used, you cannot directly insert
elements into an empty container or at the end of a container, but instead you
must wrap the
destination
iterator in an
insert_iterator
(typically by using
back_inserter( ),
or
inserter( )
in the case of an associative container).
The
copy algorithm is used in many examples in this book.
BidirectionalIterator2
copy_backward(BidirectionalIterator1 first,
BidirectionalIterator1 last, BidirectionalIterator2 destinationEnd);
Like
copy( ),
but performs the actual copying of the elements in reverse order. That is, the
resulting sequence is the same, it’s just that the copy happens in a
different way. The source range
[first,
last)
is copied to the destination, but the first destination element is
destinationEnd
- 1
.
This iterator is then decremented after each assignment. The space in the
destination range must already exist (to allow assignment), and the destination
range cannot be within the source range.
void
reverse(BidirectionalIterator first, BidirectionalIterator last);
OutputIterator
reverse_copy(BidirectionalIterator first, BidirectionalIterator last,
OutputIterator destination);
Both
forms of this function reverse the range
[first,
last)
.
reverse( )
reverses the range in place, while
reverse_copy( )
leaves the original range alone and copies the reversed elements into
destination,
returning the past-the-end iterator of the resulting range.
ForwardIterator2
swap_ranges(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2);
Exchanges
the contents of two ranges of equal size, by moving from the beginning to the
end of each range and swapping each set of elements.
void
rotate(ForwardIterator first, ForwardIterator middle, ForwardIterator last);
OutputIterator
rotate_copy(ForwardIterator first, ForwardIterator middle,
ForwardIterator last, OutputIterator destination);
Swaps
the two ranges
[first,
middle)
and
[middle,
last)
.
With
rotate( ),
the swap is performed in place, and with
rotate_copy( )
the original range is untouched and the rotated version is copied into
destination,
returning the past-the-end iterator of the resulting range. Note that while
swap_ranges( )
requires that the two ranges be exactly the same size, the “rotate”
functions do not.
bool
next_permutation(BidirectionalIterator first, BidirectionalIterator last);
bool
next_permutation(BidirectionalIterator first, BidirectionalIterator last,
StrictWeakOrdering binary_pred);
bool
prev_permutation(BidirectionalIterator first, BidirectionalIterator last);
bool
prev_permutation(BidirectionalIterator first, BidirectionalIterator last,
StrictWeakOrdering binary_pred);
A
permutation
is one unique ordering of a set of elements. If you have
n
unique elements, then there are
n!
(
n
factorial) distinct possible combinations of those elements. All these
combinations can be conceptually sorted into a sequence using a lexicographical
ordering, and thus produce a concept of a “next” and
“previous” permutation. Therefore, whatever the current ordering of
elements in the range, there is a distinct “next” and
“previous” permutation in the sequence of permutations.
The
next_permutation( )
and
prev_permutation( )
functions re-arrange the elements into their next or previous permutation, and
if successful return
true.
If there are no more “next” permutations, it means that the
elements are in sorted order so
next_permutation( )
returns
false.
If there are no more “previous” permutations, it means that the
elements are in descending sorted order so
previous_permutation( )
returns
false. The
versions of the functions which have a
StrictWeakOrdering
argument perform the comparisons using
binary_pred
instead of
operator<. void
random_shuffle(RandomAccessIterator first, RandomAccessIterator last);
void
random_shuffle(RandomAccessIterator first, RandomAccessIterator last
RandomNumberGenerator& rand);
This
function randomly rearranges the elements in the range. It yields uniformly
distributed results. The first form uses an internal random number generator
and the second uses a user-supplied random-number generator.
BidirectionalIterator
partition(BidirectionalIterator first, BidirectionalIterator last,
Predicate pred);
BidirectionalIterator
stable_partition(BidirectionalIterator first,
BidirectionalIterator last, Predicate pred);
The
“partition” functions use
pred
to organize the elements in the range
[first,
last)
so they are before or after the partition (a point in the range). The partition
point is given by the returned iterator. If
pred(*i)
is
true
(where
i
is the iterator pointing to a particular element), then that element will be
placed before the partition point, otherwise it will be placed after the
partition point.
With
partition( ),
the order of the elements is after the function call is not specified, but with
stable_parition( )
the relative order of the elements before and after the partition point will be
the same as before the partitioning process.
Example
This
gives a basic demonstration of sequence manipulation:
//: C21:Manipulations.cpp
// Shows basic manipulations
#include <vector>
#include <string>
#include <algorithm>
#include "Generators.h"
#include "PrintSequence.h"
#include "NString.h"
using namespace std;
int main() {
vector<int> v1(10);
// Simple counting:
generate(v1.begin(), v1.end(), SkipGen());
print(v1, "v1", " ");
vector<int> v2(v1.size());
copy_backward(v1.begin(), v1.end(), v2.end());
print(v2, "copy_backward", " ");
reverse_copy(v1.begin(), v1.end(), v2.begin());
print(v2, "reverse_copy", " ");
reverse(v1.begin(), v1.end());
print(v1, "reverse", " ");
int half = v1.size() / 2;
// Ranges must be exactly the same size:
swap_ranges(v1.begin(), v1.begin() + half,
v1.begin() + half);
print(v1, "swap_ranges", " ");
// Start with fresh sequence:
generate(v1.begin(), v1.end(), SkipGen());
print(v1, "v1", " ");
int third = v1.size() / 3;
for(int i = 0; i < 10; i++) {
rotate(v1.begin(), v1.begin() + third,
v1.end());
print(v1, "rotate", " ");
}
cout << "Second rotate example:" << endl;
char c[] = "aabbccddeeffgghhiijj";
const char csz = strlen(c);
for(int i = 0; i < 10; i++) {
rotate(c, c + 2, c + csz);
print(c, c + csz, "", "");
}
cout << "All n! permutations of abcd:" << endl;
int nf = 4 * 3 * 2 * 1;
char p[] = "abcd";
for(int i = 0; i < nf; i++) {
next_permutation(p, p + 4);
print(p, p + 4, "", "");
}
cout << "Using prev_permutation:" << endl;
for(int i = 0; i < nf; i++) {
prev_permutation(p, p + 4);
print(p, p + 4, "", "");
}
cout << "random_shuffling a word:" << endl;
string s("hello");
cout << s << endl;
for(int i = 0; i < 5; i++) {
random_shuffle(s.begin(), s.end());
cout << s << endl;
}
NString sa[] = { "a", "b", "c", "d", "a", "b",
"c", "d", "a", "b", "c", "d", "a", "b", "c"};
const int sasz = sizeof sa / sizeof *sa;
vector<NString> ns(sa, sa + sasz);
print(ns, "ns", " ");
vector<NString>::iterator it =
partition(ns.begin(), ns.end(),
bind2nd(greater<NString>(), "b"));
cout << "Partition point: " << *it << endl;
print(ns, "", " ");
// Reload vector:
copy (sa, sa + sasz, ns.begin());
it = stable_partition(ns.begin(), ns.end(),
bind2nd(greater<NString>(), "b"));
cout << "Stable partition" << endl;
cout << "Partition point: " << *it << endl;
print(ns, "", " ");
} ///:~
The
best way to see the results of the above program is to run it (you’ll
probably want to redirect the output to a file).
The
vector<int>
v1
is initially loaded with a simple ascending sequence and printed. You’ll
see that the effect of
copy_backward( )
(which copies into
v2,
which is the same size as
v1)
is the same as an ordinary copy. Again,
copy_backward( )
does the same thing as
copy( ),
it just performs the operations in backward order.
reverse_copy( ),
however, actually does created a reversed copy, while
reverse( )
performs the reversal in place. Next,
swap_ranges( )
swaps the upper half of the reversed sequence with the lower half. Of course,
the ranges could be smaller subsets of the entire vector, as long as they are
of equivalent size.
After
re-creating the ascending sequence,
rotate( )
is demonstrated by rotating one third of
v1
multiple times. A second
rotate( )
example uses characters and just rotates two characters at a time. This also
demonstrates the flexibility of both the STL algorithms and the
print( )
template, since they can both be used with arrays of
char
as easily as with anything else.
To
demonstrate
next_permutation( )
and
prev_permutation( ),
a set of four characters “abcd” is permuted through all
n!
(
n
factorial) possible combinations. You’ll see from the output that the
permutations move through a strictly-defined order (that is, permuting is a
deterministic process).
A
quick-and-dirty demonstration of
random_shuffle( )
is to apply it to a
string
and see what words result. Because a
string
object has
begin( )
and
end( )
member functions that return the appropriate iterators, it too may be easily
used with many of the STL algorithms. Of course, an array of
char
could also have been used.
Finally,
the
partition( )
and
stable_partition( )
are demonstrated, using an array of
NString.
You’ll note that the aggregate initialization expression uses
char
arrays, but
NString
has a
char*
constructor which is automatically used.
When
partitioning a sequence, you need a predicate which will determine whether the
object belongs above or below the partition point. This takes a single argument
and returns
true
(the object is above the partition point) or
false
(it isn’t). I could have written a separate function or function object
to do this, but for something simple, like “the object is greater than
‘b’”, why not use the built-in function object templates? The
expression is:
bind2nd(greater<NString>(),
"b")
And
to understand it, you need to pick it apart from the middle outward. First,
produces
a binary function object which compares its first and second arguments:
and
returns a
bool.
But we don’t want a binary predicate, and we want to compare against the
constant value “
b.”
So
bind2nd( )
says: create a new function object which only takes one argument, by taking this
greater<NString>( )
function and forcing the second argument to always be “
b.”
The first argument (the only argument) will be the one from the vector
ns. You’ll
see from the output that with the unstable partition, the objects are correctly
above and below the partition point, but in no particular order, whereas with
the stable partition their original order is maintained.
Searching
& replacing
All
of these algorithms are used for searching for one or more objects within a
range defined by the first two iterator arguments.
InputIterator
find(InputIterator first, InputIterator last,
const EqualityComparable& value);
Searches
for
value
within
a range of elements. Returns an iterator in the range
[first,
last)
that points to the first occurrence of
value.
If
value
isn’t in the range, then
find( )
returns
last.
This is a
linear
search
,
that is, it starts at the beginning and looks at each sequential element
without making any assumptions about the way the elements are ordered. In
contrast, a
binary_search( )
(defined later) works on a sorted sequence and can thus be much faster.
InputIterator
find_if(InputIterator first, InputIterator last, Predicate pred);
Just
like
find( ),
find_if( )
performs a linear search through the range. However, instead of searching for
value,
find_if( )
looks for an element such that the
Predicate
pred
returns
true
when applied to that element. Returns
last
if no such element can be found.
ForwardIterator
adjacent_find(ForwardIterator first, ForwardIterator last);
ForwardIterator
adjacent_find(ForwardIterator first, ForwardIterator last,
BinaryPredicate binary_pred);
Like
find( ),
performs a linear search through the range, but instead of looking for only one
element it searches for two elements that are right next to each other. The
first form of the function looks for two elements that are equivalent (via
operator==).
The second form looks for two adjacent elements that, when passed together to
binary_pred,
produce a
true
result. If two adjacent elements cannot be found,
last
is returned.
ForwardIterator1
find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2, ForwardIterator2 last2);
ForwardIterator1
find_first_of(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2, ForwardIterator2 last2, BinaryPredicate
binary_pred);
Like
find( ),
performs a linear search through the range. The first form finds the first
element in the first range that is equivalent to any of the elements in the
second range. The second form finds the first element in the first range that
produces
true
when passed to
binary_pred
along with any of the elements in the second range. When a
BinaryPredicate
is used with two ranges in the algorithms, the element from the first range
becomes the first argument to
binary_pred,
and the element from the second range becomes the second argument.
ForwardIterator1
search(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2, ForwardIterator2 last2);
ForwardIterator1
search(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2, ForwardIterator2 last2 BinaryPredicate binary_pred);
Attempts
to find the entire range
[first2,
last2)
within the range
[first1,
last1)
.
That is, it checks to see if the second range occurs (in the exact order of the
second range) within the first range, and if so returns an iterator pointing to
the place in the first range where the second range begins. Returns
last1
if no subset can be found. The first form performs its test using
operator==,
while the second checks to see if each pair of objects being compared causes
binary_pred
to return
true. ForwardIterator1
find_end(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2, ForwardIterator2 last2);
ForwardIterator1
find_end(ForwardIterator1 first1, ForwardIterator1 last1,
ForwardIterator2 first2, ForwardIterator2 last2, BinaryPredicate binary_pred);
The
forms and arguments are just like
search( )
in that it looks for the second range within the first range, but while
search( )
looks for the first occurrence of the second range,
find_end( )
looks for the
last
occurrence of the second range within the first.
ForwardIterator
search_n(ForwardIterator first, ForwardIterator last,
Size count, const T& value);
ForwardIterator
search_n(ForwardIterator first, ForwardIterator last,
Size count, const T& value, BinaryPredicate binary_pred);
Looks
for a group of
count
consecutive values in
[first,
last)
that are all equal to
value
(in the first form) or that all cause a return value of
true
when passed into
binary_pred
along with
value
(in the second form). Returns
last
if such a group cannot be found.
ForwardIterator
min_element(ForwardIterator first, ForwardIterator last);
ForwardIterator
min_element(ForwardIterator first, ForwardIterator last,
BinaryPredicate binary_pred);
Returns
an iterator pointing to the first occurrence of the smallest value in the range
(there may be multiple occurrences of the smallest value). Returns
last
if the range is empty. The first version performs comparisons with
operator<
and the value
r
returned is such that
*e
< *r
is
false for every element
e
in the range. The second version compares using
binary_pred
and the value
r
returned is such that
binary_pred
(*e, *r)
is false for every element
e
in the range.
ForwardIterator
max_element(ForwardIterator first, ForwardIterator last);
ForwardIterator
max_element(ForwardIterator first, ForwardIterator last,
BinaryPredicate binary_pred);
Returns
an iterator pointing to the first occurrence of the largest value in the range
(there may be multiple occurrences of the largest value). Returns
last
if the range is empty. The first version performs comparisons with
operator<
and the value
r
returned is such that
*r
< *e
is
false for every element
e
in the range. The second version compares using
binary_pred
and the value
r
returned is such that
binary_pred
(*r, *e)
is false for every element
e
in the range.
void
replace(ForwardIterator first, ForwardIterator last,
const T& old_value, const T& new_value);
void
replace_if(ForwardIterator first, ForwardIterator last,
Predicate pred, const T& new_value);
OutputIterator
replace_copy(InputIterator first, InputIterator last,
OutputIterator result, const T& old_value, const T& new_value);
OutputIterator
replace_copy_if(InputIterator first, InputIterator last,
OutputIterator result, Predicate pred, const T& new_value);
Each
of the “replace” forms moves through the range
[first,
last)
,
finding values that match a criterion and replacing them with
new_value.
Both
replace( )
and
replace_copy( )
simply look for
old_value
to replace, while
replace_if( )
and
replace_copy_if( )
look for values that satisfy the predicate
pred.
The “copy” versions of the functions do not modify the original
range but instead make a copy with the replacements into
result
(incrementing
result
after each assignment).
Example
To
provide easy viewing of the results, this example will manipulate
vectors
of
int.
Again, not every possible version of each algorithm will be shown (some that
should be obvious have been omitted).
//: C21:SearchReplace.cpp
// The STL search and replace algorithms
#include <vector>
#include <algorithm>
#include <functional>
#include "PrintSequence.h"
using namespace std;
struct PlusOne {
bool operator()(int i, int j) {
return j == i + 1;
}
};
class MulMoreThan {
int value;
public:
MulMoreThan(int val) : value(val) {}
bool operator()(int v, int m) {
return v * m > value;
}
};
int main() {
int a[] = { 1, 2, 3, 4, 5, 6, 6, 7, 7, 7,
8, 8, 8, 8, 11, 11, 11, 11, 11 };
const int asz = sizeof a / sizeof *a;
vector<int> v(a, a + asz);
print(v, "v", " ");
vector<int>::iterator it =
find(v.begin(), v.end(), 4);
cout << "find: " << *it << endl;
it = find_if(v.begin(), v.end(),
bind2nd(greater<int>(), 8));
cout << "find_if: " << *it << endl;
it = adjacent_find(v.begin(), v.end());
while(it != v.end()) {
cout << "adjacent_find: " << *it
<< ", " << *(it + 1) << endl;
it = adjacent_find(it + 2, v.end());
}
it = adjacent_find(v.begin(), v.end(),
PlusOne());
while(it != v.end()) {
cout << "adjacent_find PlusOne: " << *it
<< ", " << *(it + 1) << endl;
it = adjacent_find(it + 1, v.end(),
PlusOne());
}
int b[] = { 8, 11 };
const int bsz = sizeof b / sizeof *b;
print(b, b + bsz, "b", " ");
it = find_first_of(v.begin(), v.end(),
b, b + bsz);
print(it, it + bsz, "find_first_of", " ");
it = find_first_of(v.begin(), v.end(),
b, b + bsz, PlusOne());
print(it,it + bsz,"find_first_of PlusOne"," ");
it = search(v.begin(), v.end(), b, b + bsz);
print(it, it + bsz, "search", " ");
int c[] = { 5, 6, 7 };
const int csz = sizeof c / sizeof *c;
print(c, c + csz, "c", " ");
it = search(v.begin(), v.end(),
c, c + csz, PlusOne());
print(it, it + csz,"search PlusOne", " ");
int d[] = { 11, 11, 11 };
const int dsz = sizeof d / sizeof *d;
print(d, d + dsz, "d", " ");
it = find_end(v.begin(), v.end(), d, d + dsz);
print(it, v.end(),"find_end", " ");
int e[] = { 9, 9 };
print(e, e + 2, "e", " ");
it = find_end(v.begin(), v.end(),
e, e + 2, PlusOne());
print(it, v.end(),"find_end PlusOne"," ");
it = search_n(v.begin(), v.end(), 3, 7);
print(it, it + 3, "search_n 3, 7", " ");
it = search_n(v.begin(), v.end(),
6, 15, MulMoreThan(100));
print(it, it + 6,
"search_n 6, 15, MulMoreThan(100)", " ");
cout << "min_element: " <<
*min_element(v.begin(), v.end()) << endl;
cout << "max_element: " <<
*max_element(v.begin(), v.end()) << endl;
vector<int> v2;
replace_copy(v.begin(), v.end(),
back_inserter(v2), 8, 47);
print(v2, "replace_copy 8 -> 47", " ");
replace_if(v.begin(), v.end(),
bind2nd(greater_equal<int>(), 7), -1);
print(v, "replace_if >= 7 -> -1", " ");
} ///:~
The
example begins with two predicates:
PlusOne
which is a binary predicate that returns
true
if the second argument is equivalent to one plus the first argument, and
MulMoreThan
which returns
true
if the first argument times the second argument is greater than a value stored
in the object. These binary predicates are used as tests in the example.
In
main( ),
an array
a
is created and fed to the constructor for
vector<int>
v
.
This vector will be used as the target for the search and replace activities,
and you’ll note that there are duplicate elements – these will be
discovered by some of the search/replace routines.
The
first test demonstrates
find( ),
discovering the value 4 in
v.
The return value is the iterator pointing to the first instance of 4, or the
end of the input range (
v.end( ))
if the search value is not found.
find_if( )
uses a predicate to determine if it has discovered the correct element. In the
above example, this predicate is created on the fly using
greater<int>
(that is, “see if the first
int
argument
is greater than the second”) and
bind2nd( )
to fix the second argument to 8. Thus, it returns true if the value in
v
is greater than 8.
Since
there are a number of cases in
v
where two identical objects appear next to each other, the test of
adjacent_find( )
is designed to find them all. It starts looking from the beginning and then
drops into a
while
loop, making sure that the iterator
it
has not reached the end of the input sequence (which would mean that no more
matches can be found). For each match it finds, the loop prints out the matches
and then performs the next
adjacent_find( ),
this time using
it
+ 2
as the first argument (this way, it moves past the two elements that it already
found).
You
might look at the
while
loop and think that you can do it a bit more cleverly, to wit:
while(it != v.end()) {
cout << "adjacent_find: " << *it++
<< ", " << *it++ << endl;
it = adjacent_find(it, v.end());
}
Of
course, this is exactly what I tried at first. However, I did not get the
output I expected, on any compiler. This is because there is no guarantee about
when the increments occur in the above expression. A bit of a disturbing
discovery, I know, but the situation is best avoided now that you’re
aware of it.
The
next test uses
adjacent_find( )
with the
PlusOne
predicate, which discovers all the places where the next number in the sequence
v
changes from the previous by one. The same
while
approach is used to find all the cases.
find_first_of( )
requires a second range of objects for which to hunt; this is provided in the
array
b.
Notice that, because the first range and the second range in
find_first_of( )
are controlled by separate template arguments, those ranges can refer to two
different types of containers, as seen here. The second form of
find_first_of( )
is also tested, using
PlusOne. search( )
finds exactly the second range inside the first one, with the elements in the
same order. The second form of
search( )
uses a predicate, which is typically just something that defines equivalence,
but it also opens some interesting possibilities – here, the
PlusOne
predicate causes the range
{
4, 5, 6 }
to be found.
The
find_end( )
test discovers the
last
occurrence of the entire sequence
{
11, 11, 11 }
.
To show that it has in fact found the last occurrence, the rest of
v
starting from
it
is printed.
The
first
search_n( )
test looks for 3 copies of the value 7, which it finds and prints. When using
the second version of
search_n( ),
the predicate is ordinarily meant to be used to determine equivalence between
two elements, but I’ve taken some liberties and used a function object
that multiplies the value in the sequence by (in this case) 15 and checks to
see if it’s greater than 100. That is, the
search_n( )
test above says “find me 6 consecutive values which, when multiplied by
15, each produce a number greater than 100.” Not exactly what you
normally expect to do, but it might give you some ideas the next time you have
an odd searching problem.
min_element( )
and
max_element( )
are straightforward; the only thing that’s a bit odd is that it looks
like the function is being dereferenced with a ‘
*’.
Actually, the returned iterator is being dereferenced to produce the value for
printing.
To
test replacements,
replace_copy( )
is used first (so it doesn’t modify the original vector) to replace all
values of 8 with the value 47. Notice the use of
back_inserter( )
with the empty vector
v2.
To demonstrate
replace_if( ),
a function object is created using the standard template
greater_equal
along with
bind2nd
to replace all the values that are greater than or equal to 7 with the value -1.
Comparing
ranges
These
algorithms provide ways to compare two ranges. At first glance, the operations
they perform seem very close to the
search( )
function above. However,
search( )
tells you where the second sequence appears within the first, while
equal( )
and
lexicographical_compare( )
simply
tell you whether or not two sequences are exactly identical (using different
comparison algorithms). On the other hand,
mismatch( )
does tell you where the two sequences go out of sync, but those sequences must
be exactly the same length.
bool
equal(InputIterator first1, InputIterator last1, InputIterator first2);
bool
equal(InputIterator first1, InputIterator last1, InputIterator first2
BinaryPredicate binary_pred);
In
both of these functions, the first range is the typical one,
[first1,
last1)
.
The second range starts at
first2,
but there is no “last2” because its length is determined by the
length of the first range. The
equal( )
function returns true if both ranges are exactly the same (the same elements in
the same order); in the first case, the
operator==
is used to perform the comparison and in the second case
binary_pred
is used to decide if two elements are the same.
bool
lexicographical_compare(InputIterator1 first1, InputIterator1 last1
InputIterator2 first2, InputIterator2 last2);
bool
lexicographical_compare(InputIterator1 first1, InputIterator1 last1
InputIterator2 first2, InputIterator2 last2, BinaryPredicate binary_pred);
These
two functions determine if the first range is “lexicographically
less” than the second (they return
true
if range 1 is less than range 2, and false otherwise. Lexicographical equality,
or “dictionary” comparison, means that the comparison is done the
same way we establish the order of strings in a dictionary, one element at a
time. The first elements determine the result if these elements are different,
but if they’re equal the algorithm moves on to the next elements and
looks at those, and so on. until it finds a mismatch. At that point it looks at
the elements, and if the element from range 1 is less than the element from
range two, then
lexicographical_compare( )
returns
true,
otherwise it returns
false.
If it gets all the way through one range or the other (the ranges may be
different lengths for this algorithm) without finding an inequality, then range
1 is
not
less
than range 2 so the function returns
false. If
the two ranges are different lengths, a missing element in one range acts as
one that “precedes” an element that exists in the other range. So
{‘a’, ‘b’} lexicographically precedes {‘a’,
‘b’, ‘a’ }.
In
the first version of the function,
operator<
is used to perform the comparisons, and in the second version
binary_pred
is used.
pair<InputIterator1,
InputIterator2> mismatch(InputIterator1 first1,
InputIterator1 last1, InputIterator2 first2);
pair<InputIterator1,
InputIterator2> mismatch(InputIterator1 first1,
InputIterator1 last1, InputIterator2 first2, BinaryPredicate binary_pred);
As
in
equal( ),
the length of both ranges is exactly the same, so only the first iterator in
the second range is necessary, and the length of the first range is used as the
length of the second range. Whereas
equal( )
just tells you whether or not the two ranges are the same,
mismatch( )
tells you where they begin to differ. To accomplish this, you must be told (1)
the element in the first range where the mismatch occurred and (2) the element
in the second range where the mismatch occurred. These two iterators are
packaged together into a
pair
object and returned. If no mismatch occurs, the return value is
last1
combined with the past-the-end iterator of the second range.
As
in
equal( ),
the first function tests for equality using
operator==
while the second one uses
binary_pred.
Example
Because
the standard C++
string
class is built like a container (it has
begin( )
and
end( )
member functions which produce objects of type
string::iterator),
it can be used to conveniently create ranges of characters to test with the STL
comparison algorithms. However, you should note that
string
has
a fairly complete set of native operations, so you should look at the
string
class before using the STL algorithms to perform operations.
//: C21:Comparison.cpp
// The STL range comparison algorithms
#include <vector>
#include <algorithm>
#include <functional>
#include <string>
#include "PrintSequence.h"
using namespace std;
int main() {
// strings provide a convenient way to create
// ranges of characters, but you should
// normally look for native string operations:
string s1("This is a test");
string s2("This is a Test");
cout << "s1: " << s1 << endl
<< "s2: " << s2 << endl;
cout << "compare s1 & s1: "
<< equal(s1.begin(), s1.end(), s1.begin())
<< endl;
cout << "compare s1 & s2: "
<< equal(s1.begin(), s1.end(), s2.begin())
<< endl;
cout << "lexicographical_compare s1 & s1: " <<
lexicographical_compare(s1.begin(), s1.end(),
s1.begin(), s1.end()) << endl;
cout << "lexicographical_compare s1 & s2: " <<
lexicographical_compare(s1.begin(), s1.end(),
s2.begin(), s2.end()) << endl;
cout << "lexicographical_compare s2 & s1: " <<
lexicographical_compare(s2.begin(), s2.end(),
s1.begin(), s1.end()) << endl;
cout << "lexicographical_compare shortened "
"s1 & full-length s2: " << endl;
string s3(s1);
while(s3.length() != 0) {
bool result = lexicographical_compare(
s3.begin(), s3.end(), s2.begin(),s2.end());
cout << s3 << endl << s2 << ", result = "
<< result << endl;
if(result == true) break;
s3 = s3.substr(0, s3.length() - 1);
}
pair<string::iterator, string::iterator> p =
mismatch(s1.begin(), s1.end(), s2.begin());
print(p.first, s1.end(), "p.first", "");
print(p.second, s2.end(), "p.second","");
} ///:~
Note
that the only difference between
s1
and
s2
is the capital ‘T’ in
s2’s
“Test.” Comparing
s1
and
s1
for equality yields
true,
as expected, while
s1
and
s2
are not equal because of the capital ‘T’.
To
understand the output of the
lexicographical_compare( )
tests, you must remember two things: first, the comparison is performed
character-by-character, and second that capital letters “precede”
lowercase letters. In the first test,
s1
is compared to
s1.
These are exactly equivalent, thus one is
not
lexicographically less than the other (which is what the comparison is looking
for) and thus the result is
false.
The second test is asking “does
s1
precede
s2?”
When the comparison gets to the ‘t’ in “test”, it
discovers that the lowercase ‘t’ in
s1
is “greater” than the uppercase ‘T’ in
s2,
so the answer is again
false.
However, if we test to see whether
s2
precedes
s1,
the answer is
true. To
further examine lexicographical comparison, the next test in the above example
compares
s1
with
s2
again (which returned
false
before). But this time it repeats the comparison, trimming one character off
the end of
s1
(which is first copied into
s3)
each time through the loop until the test evaluates to
true.
What you’ll see is that, as soon as the uppercase ‘T’ is
trimmed off of
s3
(the copy of
s1),
then the characters, which are exactly equal up to that point, no longer count
and the fact that
s3
is shorter than
s2
is what makes it lexicographically precede
s2. The
final test uses
mismatch( )
.
In order to capture the return value, you must first create the appropriate
pair
p
,
constructing the template using the iterator type from the first range and the
iterator type from the second range (in this case, both
string::iterators).
To print the results, the iterator for the mismatch in the first range is
p.first,
and for the second range is
p.second.
In both cases, the range is printed from the mismatch iterator to the end of
the range so you can see exactly where the iterator points.
Removing
elements
Because
of the genericity of the STL, the concept of removal is a bit constrained.
Since elements can only be “removed” via iterators, and iterators
can point to arrays, vectors, lists, etc., it is not safe or reasonable to
actually try to destroy the elements that are being removed, and to change the
size of the input range
[first,
last)
(an array, for example, cannot have its size changed). So instead, what the STL
“remove” functions do is rearrange the sequence so that the
“removed” elements are at the end of the sequence, and the
“un-removed” elements are at the beginning of the sequence (in the
same order that they were before, minus the removed elements – that is,
this is a
stable
operation). Then the function will return an iterator to the “new
last” element of the sequence, which is the end of the sequence without
the removed elements and the beginning of the sequence of the removed elements.
In other words, if
new_last
is the iterator that is returned from the “remove” function, then
[first,
new_last)
is the sequence without any of the removed elements, and
[new_last,
last)
is the sequence of removed elements.
If
you are simply using your sequence, including the removed elements, with more
STL algorithms, you can just use
new_last
as the new past-the-end iterator. However, if you’re using a resizable
container
c
(not
an array) and you actually want to eliminate the removed elements from the
container you can use
erase( )
to do so, for example:
c.erase(remove(c.begin(),
c.end(), value), c.end());
The
return value of
remove( )
is the
new_last
iterator, so
erase( )
will delete all the removed elements from
c. The
iterators in
[new_last,
last)
are dereferenceable but the element values are undefined and should not be used.
ForwardIterator
remove(ForwardIterator first, ForwardIterator last, const T& value);
ForwardIterator
remove_if(ForwardIterator first, ForwardIterator last,
Predicate pred);
OutputIterator
remove_copy(InputIterator first, InputIterator last,
OutputIterator result, const T& value);
OutputIterator
remove_copy_if(InputIterator first, InputIterator last,
OutputIterator result, Predicate pred);
Each
of the “remove” forms moves through the range
[first,
last)
,
finding values that match a removal criterion and copying the un-removed
elements over the removed elements (thus effectively removing them). The
original order of the un-removed elements is maintained. The return value is an
iterator pointing past the end of the range that contains none of the removed
elements. The values that this iterator points to are unspecified.
The
“if” versions pass each element to
pred( )
to determine whether it should be removed or not (if
pred( )
returns
true,
the element is removed). The “copy” versions do not modify the
original sequence, but instead copy the un-removed values into a range
beginning at
result,
and return an iterator indicating the past-the-end value of this new range.
ForwardIterator
unique(ForwardIterator first, ForwardIterator last);
ForwardIterator
unique(ForwardIterator first, ForwardIterator last,
BinaryPredicate binary_pred);
OutputIterator
unique_copy(InputIterator first, InputIterator last,
OutputIterator result);
OutputIterator
unique_copy(InputIterator first, InputIterator last,
OutputIterator result, BinaryPredicate binary_pred);
Each
of the “unique” functions moves through the range
[first,
last)
,
finding adjacent values that are equivalent (that is, duplicates) and
“removing” the duplicate elements by copying over them. The
original order of the un-removed elements is maintained. The return value is an
iterator pointing past the end of the range that has the adjacent duplicates
removed.
Because
only duplicates that are adjacent are removed, it’s likely that
you’ll want to call
sort( )
before calling a “unique” algorithm, since that will guarantee that
all
the duplicates are removed.
The
versions containing
binary_pred
call, for each iterator value
i
in the input range:
and
if the result is true then
*(i-1)
is considered a duplicate.
The
“copy” versions do not modify the original sequence, but instead
copy the un-removed values into a range beginning at
result,
and return an iterator indicating the past-the-end value of this new range.
Example
This
example gives a visual demonstration of the way the “remove” and
“unique” functions work.
//: C21:Removing.cpp
// The removing algorithms
#include <vector>
#include <algorithm>
#include <cctype>
#include "Generators.h"
#include "PrintSequence.h"
using namespace std;
struct IsUpper {
bool operator()(char c) {
return isupper(c);
}
};
int main() {
vector<char> v(50);
generate(v.begin(), v.end(), CharGen());
print(v, "v", "");
// Create a set of the characters in v:
set<char> cs(v.begin(), v.end());
set<char>::iterator it = cs.begin();
vector<char>::iterator cit;
// Step through and remove everything:
while(it != cs.end()) {
cit = remove(v.begin(), v.end(), *it);
cout << *it << "[" << *cit << "] ";
print(v, "", "");
it++;
}
generate(v.begin(), v.end(), CharGen());
print(v, "v", "");
cit = remove_if(v.begin(), v.end(), IsUpper());
print(v.begin(), cit, "after remove_if", "");
// Copying versions are not shown for remove
// and remove_if.
sort(v.begin(), cit);
print(v.begin(), cit, "sorted", "");
vector<char> v2;
unique_copy(v.begin(), cit, back_inserter(v2));
print(v2, "unique_copy", "");
// Same behavior:
cit = unique(v.begin(), cit, equal_to<char>());
print(v.begin(), cit, "unique", "");
} ///:~
The
vector<char>
v
is filled with randomly-generated characters and then copied into a
set.
Each element of the
set
is used in a
remove
statement, but the entire
vector
v
is printed out each time so you can see what happens to the rest of the range,
after the resulting endpoint (which is stored in
cit). To
demonstrate
remove_if( ),
the address of the Standard C library function
isupper( )
(in
<cctype>
is
called inside of the function object class
IsUpper,
an object of which is
passed
as the predicate for
remove_if( ).
This
only returns
true
if a character is uppercase, so only lowercase characters will remain. Here,
the end of the range is used in the call to
print( )
so only the remaining elements will appear. The copying versions of
remove( )
and
remove_if( )
are not shown because they are a simple variation on the non-copying versions
which you should be able to use without an example.
The
range of lowercase letters is sorted in preparation for testing the
“unique” functions (the “unique” functions are not
undefined if the range isn’t sorted, but it’s probably not what you
want). First,
unique_copy( )
puts the unique elements into a new
vector
using the default element comparison, and then the form of
unique( )
that takes a predicate is used; the predicate used is the built-in function
object
equal_to( ),
which produces the same results as the default element comparison.
Sorting
and operations on sorted ranges
There
is a significant category of STL algorithms which require that the range they
operate on be in sorted order.
There
is actually only one “sort” algorithm used in the STL. This
algorithm is presumably the fastest one, but the implementor has fairly broad
latitude. However, it comes packaged in various flavors depending on whether
the sort should be stable, partial or just the regular sort. Oddly enough, only
the partial sort has a copying version; otherwise you’ll need to make
your own copy before sorting if that’s what you want. If you are working
with a very large number of items you may be better off transferring them to an
array (or at least a
vector,
which uses an array internally) rather than using them in some of the STL
containers.
Once
your sequence is sorted, there are many operations you can perform on that
sequence, from simply locating an element or group of elements to merging with
another sorted sequence or manipulating sequences as mathematical sets.
Each
algorithm involved with sorting or operations on sorted sequences has two
versions of each function, the first that uses the object’s own
operator<
to perform the comparison, and the second that uses an additional
StrictWeakOrdering
object’s
operator( )(a,
b)
to compare two objects for
a
<
b.
Other than this there are no differences, so the distinction will not be
pointed out in the description of each algorithm.
Sorting
One
STL container (
list)
has its own built-in
sort( )
function which is almost certainly going to be faster than the generic sort
presented here (especially since the
list
sort
just swaps pointers rather than copying entire objects around). This means that
you’ll only want to use the sort functions here if (a) you’re
working with an array or a sequence container that doesn’t have a
sort( )
function or (b) you want to use one of the other sorting flavors, like a
partial or stable sort, which aren’t supported by
list’s
sort( ). void
sort(RandomAccessIterator first, RandomAccessIterator last);
void
sort(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
Sorts
[first,
last)
into ascending order. The second form allows a comparator object to determine
the order.
void
stable_sort(RandomAccessIterator first, RandomAccessIterator last);
void
stable_sort(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
Sorts
[first,
last)
into ascending order, preserving the original ordering of equivalent elements
(this is important if elements can be equivalent but not identical). The second
form allows a comparator object to determine the order.
void
partial_sort(RandomAccessIterator first,
RandomAccessIterator middle, RandomAccessIterator last);
void
partial_sort(RandomAccessIterator first,
RandomAccessIterator middle, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
Sorts
the number of elements from
[first,
last)
that can be placed in the range
[first,
middle)
.
The rest of the elements end up in
[middle,
last)
,
and have no guaranteed order. The second form allows a comparator object to
determine the order.
RandomAccessIterator
partial_sort_copy(InputIterator first, InputIterator last,
RandomAccessIterator result_first, RandomAccessIterator result_last);
RandomAccessIterator
partial_sort_copy(InputIterator first,
InputIterator last, RandomAccessIterator result_first,
RandomAccessIterator result_last, StrictWeakOrdering binary_pred);
Sorts
the number of elements from
[first,
last)
that can be placed in the range
[result_first,
result_last)
,
and copies those elements into
[result_first,
result_last)
.
If the range
[first,
last)
is smaller than
[result_first,
result_last)
,
then the smaller number of elements is used. The second form allows a
comparator object to determine the order.
void
nth_element(RandomAccessIterator first,
RandomAccessIterator nth, RandomAccessIterator last);
void
nth_element(RandomAccessIterator first,
RandomAccessIterator nth, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
Just
like
partial_sort( ),
nth_element( )
partially orders a range of elements. However, it’s much “less
ordered” than
partial_sort( ).
The only thing that
nth_element( )
guarantees is that whatever
location
you
choose will become a dividing point. All the elements in the range
[first,
nth)
will
be less than (they could also be equivalent to) whatever element ends up at
location
nth
and
all the elements in the range
(nth,
last]
will be greater than whatever element ends up location
nth.
However, neither range is in any particular order, unlike
partial_sort( )
which has the first range in sorted order.
If
all you need is this very weak ordering (if, for example, you’re
determining medians, percentiles and that sort of thing) this algorithm is
faster than
partial_sort( ).
Example
The
StreamTokenizer
class from the previous chapter is used to break a file into words, and each
word is turned into an
NString
and added to a
deque<NString>.
Once the input file is completely read, a
vector<NString>
is
created from the contents of the
deque.
The
vector
is then used to demonstrate the sorting algorithms:
//: C21:SortTest.cpp
//{L} ../C20/StreamTokenizer
// Test different kinds of sorting
#include <algorithm>
#include <fstream>
#include <queue>
#include <vector>
#include <cctype>
#include "../require.h"
#include "../C20/StreamTokenizer.h"
#include "PrintSequence.h"
#include "NString.h"
#include "Generators.h"
using namespace std;
// For sorting NStrings and ignore string case:
struct NoCase {
bool operator()(
const NString& x, const NString& y) {
/* Somthing's wrong with this approach but I
can't seem to see it. It would be much faster:
const string& lv = x;
const string& rv = y;
int len = min(lv.size(), rv.size());
for(int i = 0; i < len; i++)
if(tolower(lv[i]) < tolower(rv[i]))
return true;
return false;
}
*/
// Brute force: copy, force to lowercase:
string lv(x);
string rv(y);
lcase(lv);
lcase(rv);
return lv < rv;
}
void lcase(string& s) {
int n = s.size();
for(int i = 0; i < n; i++)
s[i] = tolower(s[i]);
}
};
int main(int argc, char* argv[]) {
requireArgs(argc, 1);
ifstream in(argv[1]);
assure(in, argv[1]);
StreamTokenizer words(in);
deque<NString> nstr;
string word;
while((word = words.next()).size() != 0)
nstr.push_back(NString(word));
print(nstr);
// Create a vector from the contents of nstr:
vector<NString> v(nstr.begin(), nstr.end());
sort(v.begin(), v.end());
print(v, "sort");
// Use an additional comparator object:
sort(v.begin(), v.end(), NoCase());
print(v, "sort NoCase");
copy(nstr.begin(), nstr.end(), v.begin());
stable_sort(v.begin(), v.end());
print(v, "stable_sort");
// Use an additional comparator object:
stable_sort(v.begin(), v.end(),
greater<NString>());
print(v, "stable_sort greater");
copy(nstr.begin(), nstr.end(), v.begin());
// Partial sorts. The additional comparator
// versions are obvious and not shown here.
partial_sort(v.begin(),
v.begin() + v.size()/2, v.end());
print(v, "partial_sort");
// Create a vector with a preallocated size:
vector<NString> v2(v.size()/2);
partial_sort_copy(v.begin(), v.end(),
v2.begin(), v2.end());
print(v2, "partial_sort_copy");
// Finally, the weakest form of ordering:
vector<int> v3(20);
generate(v3.begin(), v3.end(), URandGen(50));
print(v3, "v3 before nth_element");
int n = 10;
vector<int>::iterator vit = v3.begin() + n;
nth_element(v3.begin(), vit, v3.end());
cout << "After ordering with nth = " << n
<< ", nth element is " << v3[n] << endl;
print(v3, "v3 after nth_element");
} ///:~
The
first class is a binary predicate used to compare two
NString
objects while ignoring the case of the
strings.
You can pass the object into the various sort routines to produce an alphabetic
sort (rather than the default lexicographic sort, which has all the capital
letters in one group, followed by all the lowercase letters).
As
an example, try the source code for the above file as input. Because the
occurrence numbers are printed along with the strings you can distinguish
between an ordinary sort and a stable sort, and you can also see what happens
during a partial sort (the remaining unsorted elements are in no particular
order). There is no “partial stable sort.”
You’ll
notice that the use of the second “comparator” forms of the
functions are not exhaustively tested in the above example, but the use of a
comparator is the same as in the first part of the example.
The
test of
nth_element
does not use the
NString
objects because it’s simpler to see what’s going on if
ints
are used. Notice that, whatever the nth element turns out to be (which will
vary from one run to another because of
URandGen),
the elements before that are less, and after that are greater, but the elements
have no particular order other than that. Because of
URandGen,
there are no duplicates but if you use a generator that allows duplicates you
can see that the elements before the nth element will be less than or equal to
the nth element.
Locating
elements in sorted ranges
Once
a range is sorted, there are a group of operations that can be used to find
elements within those ranges. In the following functions, there are always two
forms, one that assumes the intrinsic
operator<
has been used to perform the sort, and the second that must be used if some
other comparison function object has been used to perform the sort. You must
use the same comparison for locating elements as you do to perform the sort,
otherwise the results are undefined. In addition, if you try to use these
functions on unsorted ranges the results will be undefined.
bool
binary_search(ForwardIterator first, ForwardIterator last, const T& value);
bool
binary_search(ForwardIterator first, ForwardIterator last, const T& value,
StrictWeakOrdering binary_pred);
Tells
you whether
value
appears in the sorted range
[first,
last)
. ForwardIterator
lower_bound(ForwardIterator first, ForwardIterator last,
const T& value);
ForwardIterator
lower_bound(ForwardIterator first, ForwardIterator last,
const T& value, StrictWeakOrdering binary_pred);
Returns
an iterator indicating the first occurrence of
value
in the sorted range
[first,
last)
.
Returns
last
if
value
is not found.
ForwardIterator
upper_bound(ForwardIterator first, ForwardIterator last,
const T& value);
ForwardIterator
upper_bound(ForwardIterator first, ForwardIterator last,
const T& value, StrictWeakOrdering binary_pred);
Returns
an iterator indicating one past the last occurrence of
value
in the sorted range
[first,
last)
.
Returns
last
if
value
is not found.
pair<ForwardIterator,
ForwardIterator>
equal_range(ForwardIterator first, ForwardIterator last,
const T& value);
pair<ForwardIterator,
ForwardIterator>
equal_range(ForwardIterator first, ForwardIterator last,
const T& value, StrictWeakOrdering binary_pred);
Essentially
combines
lower_bound( )
and
upper_bound( )
to return a
pair
indicating the first and one-past-the-last occurrences of
value
in the sorted range
[first,
last)
.
Both iterators indicate
last
if
value
is not found.
Example
Here,
we can use the approach from the previous example:
//: C21:SortedSearchTest.cpp
//{L} ../C20/StreamTokenizer
// Test searching in sorted ranges
#include <algorithm>
#include <fstream>
#include <queue>
#include <vector>
#include "../C20/StreamTokenizer.h"
#include "PrintSequence.h"
#include "NString.h"
#include "../require.h"
using namespace std;
int main() {
ifstream in("SortedSearchTest.cpp");
assure(in, "SortedSearchTest.cpp");
StreamTokenizer words(in);
deque<NString> dstr;
string word;
while((word = words.next()).size() != 0)
dstr.push_back(NString(word));
vector<NString> v(dstr.begin(), dstr.end());
sort(v.begin(), v.end());
print(v, "sorted");
typedef vector<NString>::iterator sit;
sit it, it2;
string f("include");
cout << "binary search: "
<< binary_search(v.begin(), v.end(), f)
<< endl;
it = lower_bound(v.begin(), v.end(), f);
it2 = upper_bound(v.begin(), v.end(), f);
print(it, it2, "found range");
pair<sit, sit> ip =
equal_range(v.begin(), v.end(), f);
print(ip.first, ip.second,
"equal_range");
} ///:~
The
input is forced to be the source code for this file because the word
“include” will be used for a find string (since
“include” appears many times). The file is tokenized into words
that are placed into a
deque
(a better container when you don’t know how much storage to allocate),
and left unsorted in the
deque.
The
deque
is copied into a
vector
via the appropriate constructor, and the
vector
is sorted and printed.
The
binary_search( )
function only tells you if the object is there or not;
lower_bound( )
and
upper_bound( )
produce iterators to the beginning and ending positions where the matching
objects appear. The same effect can be produced more succinctly using
equal_range( )
(as shown in the previous chapter, with
multimap
and
multiset).
Merging
sorted ranges
As
before, the first form of each function assumes the intrinsic
operator<
has been used to perform the sort. The second form must be used if some other
comparison function object has been used to perform the sort. You must use the
same comparison for locating elements as you do to perform the sort, otherwise
the results are undefined. In addition, if you try to use these functions on
unsorted ranges the results will be undefined.
OutputIterator
merge(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator result);
OutputIterator
merge(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Copies
elements from
[first1,
last1)
and
[first2,
last2)
into
result,
such that the resulting range is sorted in ascending order. This is a stable
operation.
void
inplace_merge(BidirectionalIterator first,
BidirectionalIterator middle, BidirectionalIterator last);
void
inplace_merge(BidirectionalIterator first,
BidirectionalIterator middle, BidirectionalIterator last,
StrictWeakOrdering binary_pred);
This
assumes that
[first,
middle)
and
[middle,
last)
are each sorted ranges. The two ranges are merged so that the resulting range
[first,
last)
contains the combined ranges in sorted order.
Example
It’s
easier to see what goes on with merging if
ints
are used; the following example also emphasizes how the algorithms (and my own
print
template) work with arrays as well as containers.
//: C21:MergeTest.cpp
// Test merging in sorted ranges
#include <algorithm>
#include "PrintSequence.h"
#include "Generators.h"
using namespace std;
int main() {
const int sz = 15;
int a[sz*2] = {0};
// Both ranges go in the same array:
generate(a, a + sz, SkipGen(0, 2));
generate(a + sz, a + sz*2, SkipGen(1, 3));
print(a, a + sz, "range1", " ");
print(a + sz, a + sz*2, "range2", " ");
int b[sz*2] = {0}; // Initialize all to zero
merge(a, a + sz, a + sz, a + sz*2, b);
print(b, b + sz*2, "merge", " ");
// set_union is a merge that removes duplicates
set_union(a, a + sz, a + sz, a + sz*2, b);
print(b, b + sz*2, "set_union", " ");
inplace_merge(a, a + sz, a + sz*2);
print(a, a + sz*2, "inplace_merge", " ");
} ///:~
In
main( ),
instead of creating two separate arrays both ranges will be created end-to-end
in the same array
a
(this will come in handy for the
inplace_merge).
The first call to
merge( )
places the result in a different array,
b.
For comparison,
set_union( )
is also called, which has the same signature and similar behavior, except that
it removes the duplicates. Finally,
inplace_merge( )
is used to combine both parts of
a.
Set
operations on sorted ranges
Once
ranges have been sorted, you can perform mathematical set operations on them.
bool
includes(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2);
bool
includes (InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2,
StrictWeakOrdering binary_pred);
Returns
true
if
[first2,
last2)
is a subset of
[first1,
last1)
.
Neither range is required to hold only unique elements, but if
[first2,
last2)
holds
n
elements of a particular value, then
[first1,
last1)
must also hold
n
elements if the result is to be
true. OutputIterator
set_union(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator result);
OutputIterator
set_union(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Creates
the mathematical union of two sorted ranges in the
result
range, returning the end of the output range. Neither input range is required
to hold only unique elements, but if a particular value appears multiple times
in both input sets, then the resulting set will contain the larger number of
identical values.
OutputIterator
set_intersection (InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator result);
OutputIterator
set_intersection (InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Produces,
in
result,
the intersection of the two input sets, returning the end of the output range.
That is, the set of values that appear in both input sets. Neither input range
is required to hold only unique elements, but if a particular value appears
multiple times in both input sets, then the resulting set will contain the
smaller number of identical values.
OutputIterator
set_difference (InputIterator1 first1, InputIterator1 last1, InputIterator2
first2, InputIterator2 last2, OutputIterator result);
OutputIterator
set_difference (InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, InputIterator2 last2, OutputIterator result,
StrictWeakOrdering binary_pred);
Produces,
in
result,
the mathematical set difference, returning the end of the output range. All the
elements that are in
[first1,
last1)
but not in
[first2,
last2)
are placed in the result set. Neither input range is required to hold only
unique elements, but if a particular value appears multiple times in both input
sets (
n
times in set 1 and
m
times in set 2), then the resulting set will contain
max(n-m,
0)
copies of that value.
OutputIterator
set_symmetric_difference(InputIterator1 first1,
InputIterator1 last1, InputIterator2 first2, InputIterator2 last2,
OutputIterator result);
OutputIterator
set_symmetric_difference(InputIterator1 first1,
InputIterator1 last1, InputIterator2 first2, InputIterator2 last2,
OutputIterator result, StrictWeakOrdering binary_pred);
Constructs,
in
result,
the set containing:
- All
the elements in set 1 that are not in set 2
- All
the elements in set 2 that are not in set 1.
Neither
input range is required to hold only unique elements, but if a particular value
appears multiple times in both input sets (
n
times in set 1 and
m
times in set 2), then the resulting set will contain
abs(n-m)
copies of that value, where
abs( )
is the absolute value. The return value is the end of the output range
Example
It’s
easiest to see the set operations demonstrated using simple vectors of
characters, so you view the sets more easily. These characters are randomly
generated and then sorted, but the duplicates are not removed so you can see
what the set operations do when duplicates are involved.
//: C21:SetOperations.cpp
// Set operations on sorted ranges
#include <vector>
#include <algorithm>
#include "Generators.h"
#include "PrintSequence.h"
using namespace std;
int main() {
vector<char> v(50), v2(50);
CharGen g;
generate(v.begin(), v.end(), g);
generate(v2.begin(), v2.end(), g);
sort(v.begin(), v.end());
sort(v2.begin(), v2.end());
print(v, "v", "");
print(v2, "v2", "");
bool b = includes(v.begin(), v.end(),
v.begin() + v.size()/2, v.end());
cout << "includes: " <<
(b ? "true" : "false") << endl;
vector<char> v3, v4, v5, v6;
set_union(v.begin(), v.end(),
v2.begin(), v2.end(), back_inserter(v3));
print(v3, "set_union", "");
set_intersection(v.begin(), v.end(),
v2.begin(), v2.end(), back_inserter(v4));
print(v4, "set_intersection", "");
set_difference(v.begin(), v.end(),
v2.begin(), v2.end(), back_inserter(v5));
print(v5, "set_difference", "");
set_symmetric_difference(v.begin(), v.end(),
v2.begin(), v2.end(), back_inserter(v6));
print(v6, "set_symmetric_difference","");
} ///:~
After
v
and
v2
are generated, sorted and printed, the
includes( )
algorithm is tested by seeing if the entire range of
v
contains the last half of
v,
which of course it does so the result should always be true. The vectors
v3,
v4,
v5
and
v6
are created to hold the output of
set_union( ),
set_intersection( ),
set_difference( )
and
set_symmetric_difference( ),
and the results of each are displayed so you can ponder them and convince
yourself that the algorithms do indeed work as promised.
Heap
operations
The
heap operations in the STL are primarily concerned with the creation of the STL
priority_queue,
which provides efficient access to the “largest” element, whatever
“largest” happens to mean for your program. These were discussed in
some detail in the previous chapter, and you can find an example there.
As
with the “sort” operations, there are two versions of each
function, the first that uses the object’s own
operator<
to perform the comparison, the second that uses an additional
StrictWeakOrdering
object’s
operator( )(a,
b)
to compare two objects for
a
<
b.
void
make_heap(RandomAccessIterator first, RandomAccessIterator last);
void
make_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
Turns
an arbitrary range into a heap. A heap is just a range that is organized in a
particular way.
void
push_heap(RandomAccessIterator first, RandomAccessIterator last);
void
push_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
Adds
the element *(
last-1)
to the heap determined by the range
[first,
last-1)
.
Yes, it seems like an odd way to do things but remember that the
priority_queue
container presents the nice interface to a heap, as shown in the previous
chapter.
void
pop_heap(RandomAccessIterator first, RandomAccessIterator last);
void
pop_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
Places
the largest element (which is actually in
*first,
before the operation, because of the way heaps are defined) into the position
*(last-1)
and
reorganizes the remaining range so that it’s still in heap order. If you
simply grabbed
*first,
the next element would not be the next-largest element so you must use
pop_heap( )
if you want to maintain the heap in its proper priority-queue order.
void
sort_heap(RandomAccessIterator first, RandomAccessIterator last);
void
sort_heap(RandomAccessIterator first, RandomAccessIterator last,
StrictWeakOrdering binary_pred);
This
could be thought of as the complement of
make_heap( ),
since it takes a range that is in heap order and turns it into ordinary sorted
order, so it is no longer a heap. That means that if you call
sort_heap( )
you can no longer use
push_heap( )
or
pop_heap( )
on that range (rather, you can use those functions but they won’t do
anything sensible). This is not a stable sort.
Applying
an operation to each element in a range
These
algorithms move through the entire range and perform an operation on each
element. They differ in what they do with the results of that operation:
for_each( )
discards the return value of the operation (but returns the function object
that has been applied to each element), while
transform( )
places the results of each operation into a destination sequence (which can be
the original sequence).
UnaryFunction
for_each(InputIterator first, InputIterator last, UnaryFunction f);
Applies
the function object
f
to each element in
[first,
last)
,
discarding the return value from each individual application of
f.
If
f
is
just a function pointer then you are typically not interested in the return
value, but if
f
is
an object that maintains some internal state it can capture the combined return
value of being applied to the range. The final return value of
for_each( )
is
f. OutputIterator
transform(InputIterator first, InputIterator last,
OutputIterator result, UnaryFunction f);
OutputIterator
transform(InputIterator1 first, InputIterator1 last,
InputIterator2 first2, OutputIterator result, BinaryFunction f);
Like
for_each( ),
applies a function object
f
to each element in the range
[first,
last)
.
However, instead of discarding the result of each function call,
transform( )
copies the result (using
operator=)
into
*result,
incrementing
result
after each copy (the sequence pointed to by
result
must have enough storage, otherwise you should use an inserter to force
insertions instead of assignments).
The
first form of
transform( )
simply calls
f( )
and passes it each object from the input range as an argument. The second form
passes an object from the first input range and one from the second input range
as the two arguments to the binary function
f
(note
the length of the second input range is determined by the length of the first).
The return value in both cases is the past-the-end iterator for the resulting
output range.
Examples
Since
much of what you do with objects in a container is to apply an operation to all
of those objects, these are fairly important algorithms and merit several
illustrations.
First,
consider
for_each( ).
This sweeps through the range, pulling out each element and passing it as an
argument as it calls whatever function object it’s been given. Thus
for_each( )
performs operations that you might normally write out by hand. In
Stlshape.cpp,
for example:
for(Iter j = shapes.begin();
j != shapes.end(); j++)
delete *j;
If
you look in your compiler’s header file at the template defining
for_each( ),
you’ll see something like this:
template <class InputIterator, class Function>
Function for_each(InputIterator first,
InputIterator last,
Function f) {
while (first != last) f(*first++);
return f;
}
Function
f
looks
at first like it must be a pointer to a function which takes, as an argument,
an object of whatever
InputIterator
selects. However, the above template actually only says that you must be able
to call
f
using parentheses and an argument. This is true for a function pointer, but
it’s also true for a function object – any class that defines the
appropriate
operator( ).
The
following example shows several different ways this template can be expanded.
First, we need a class that keeps track of its objects so we can know that
it’s being properly destroyed:
//: C21:Counted.h
// An object that keeps track of itself
#ifndef COUNTED_H
#define COUNTED_H
#include <vector>
class Counted {
static int count;
char * id;
public:
Counted(char * ID) : id(ID) { count++; }
~Counted() {
std::cout << id << " count = "
<< --count << std::endl;
}
};
int Counted::count = 0;
class CountedVector :
public std::vector<Counted*> {
public:
CountedVector(char* ID) {
for(int i = 0; i < 5; i++)
push_back(new Counted(ID));
}
};
#endif // COUNTED_H ///:~
The
class
Counted
keeps a static count of how many
Counted
objects have been created, and tells you as they are destroyed. In addition,
each
Counted
keeps a
char*
identifier to make tracking the output easier.
The
CountedVector
is inherited from
vector<Counted*>,
and in the constructor it creates some
Counted
objects, handing each one your desired
char*.
The
CountedVector
makes testing quite simple, as you’ll see.
//: C21:ForEach.cpp
// Use of STL for_each() algorithm
#include <iostream>
#include <vector>
#include <algorithm>
#include "Counted.h"
using namespace std;
// Simple function:
void Destroy(Counted* fp) { delete fp; }
// Function object:
template<class T>
class DeleteT {
public:
void operator()(T* x) { delete x; }
};
// Template function:
template <class T>
void wipe(T* x) { delete x; }
int main() {
CountedVector A("one");
for_each(A.begin(), A.end(), Destroy);
CountedVector B("two");
for_each(B.begin(),B.end(),DeleteT<Counted>());
CountedVector C("three");
for_each(C.begin(), C.end(), wipe<Counted>);
} ///:~
In
main( ),
the first approach is the simple pointer-to-function, which works but has the
drawback that you must write a new
Destroy
function for each different type. The obvious solution is to make a template,
which is shown in the second approach with a templatized function object. On
the other hand, approach three also makes sense: template functions work as
well.
Since
this is obviously something you might want to do a lot, why not create an
algorithm to
delete
all the pointers in a container? This was accomplished with the
purge( )
template created in the previous chapter. However, that used explicitly-written
code; here, we could use
transform( ).
The value of
transform( )
over
for_each( )
is that
transform( )
assigns the result of calling the function object into a resulting range, which
can actually be the input range. That case means a literal transformation for
the input range, since each element would be a modification of its previous
value. In the above example this would be especially useful since it’s
more appropriate to assign each pointer to the safe value of zero after calling
delete
for that pointer.
Transform( )
can easily do this:
//: C21:Transform.cpp
// Use of STL transform() algorithm
#include <iostream>
#include <vector>
#include <algorithm>
#include "Counted.h"
using namespace std;
template<class T>
T* deleteP(T* x) { delete x; return 0; }
template<class T> struct Deleter {
T* operator()(T* x) { delete x; return 0; }
};
int main() {
CountedVector cv("one");
transform(cv.begin(), cv.end(), cv.begin(),
deleteP<Counted>);
CountedVector cv2("two");
transform(cv2.begin(), cv2.end(), cv2.begin(),
Deleter<Counted>());
} ///:~
This
shows both approaches: using a template function or a templatized function
object. After the call to
transform( ),
the vector contains zero pointers, which is safer since any duplicate
deletes
will have no effect.
One
thing you cannot do is
delete
every pointer in a collection without wrapping the call to
delete
inside
a function or an object. That is, you don’t want to say something like
this:
for_each(a.begin(),
a.end(), ptr_fun(operator delete));
You
can say it, but what you’ll get is a sequence of calls to the function
that releases the storage. You will not get the effect of calling
delete
for each pointer in
a,
however; the destructor will not be called. This is typically not what you
want, so you will need wrap your calls to
delete. In
the previous example of
for_each( ),
the return value of the algorithm was ignored. This return value is the
function that is passed in to
for_each( ).
If the function is just a pointer to a function, then the return value is not
very useful, but if it is a function object, then that function object may have
internal member data that it uses to accumulate information about all the
objects that it sees during
for_each( ). For
example, consider a simple model of inventory. Each
Inventory
object has the type of product it represents (here, single characters will be
used for product names), the quantity of that product and the price of each item:
//: C21:Inventory.h
#ifndef INVENTORY_H
#define INVENTORY_H
#include <iostream>
#include <cstdlib>
#include <ctime>
class Inventory {
char item;
int quantity;
int value;
public:
Inventory(char it, int quant, int val)
: item(it), quantity(quant), value(val) {}
// Synthesized operator= & copy-constructor OK
char getItem() const { return item; }
int getQuantity() const { return quantity; }
void setQuantity(int q) { quantity = q; }
int getValue() const { return value; }
void setValue(int val) { value = val; }
friend std::ostream& operator<<(
std::ostream& os, const Inventory& inv) {
return os << inv.item << ": "
<< "quantity " << inv.quantity
<< ", value " << inv.value;
}
};
// A generator:
struct InvenGen {
InvenGen() { std::srand(std::time(0)); }
Inventory operator()() {
static char c = 'a';
int q = std::rand() % 100;
int v = std::rand() % 500;
return Inventory(c++, q, v);
}
};
#endif // INVENTORY_H ///:~
There
are member functions to get the item name, and to get and set quantity and
value. An
operator<<
prints the
Inventory
object to an
ostream.
There’s also a generator that creates objects that have
sequentially-labeled items and random quantities and values. Note the use of
the return value optimization in
operator( ). To
find out the total number of items and total value, you can create a function
object to use with
for_each( )
that has data members to hold the totals:
//: C21:CalcInventory.cpp
// More use of for_each()
#include <vector>
#include <algorithm>
#include "Inventory.h"
#include "PrintSequence.h"
using namespace std;
// To calculate inventory totals:
class InvAccum {
int quantity;
int value;
public:
InvAccum() : quantity(0), value(0) {}
void operator()(const Inventory& inv) {
quantity += inv.getQuantity();
value += inv.getQuantity() * inv.getValue();
}
friend ostream&
operator<<(ostream& os, const InvAccum& ia) {
return os << "total quantity: "
<< ia.quantity
<< ", total value: " << ia.value;
}
};
int main() {
vector<Inventory> vi;
generate_n(back_inserter(vi), 15, InvenGen());
print(vi, "vi");
InvAccum ia = for_each(vi.begin(),vi.end(),
InvAccum());
cout << ia << endl;
} ///:~
InvAccum’s
operator( )
takes a single argument, as required by
for_each( ).
As
for_each( )
moves through its range, it takes each object in that range and passes it to
InvAccum::operator( ),
which performs calculations and saves the result. At the end of this process,
for_each( )
returns the
InvAccum
object which you can then examine; in this case it is simply printed.
You
can do most things to the
Inventory
objects using
for_each( ).
For example, if you wanted to increase all the prices by 10%,
for_each( )
could do this handily. But you’ll notice that the
Inventory
objects have no way to change the
item
value. The programmers who designed
Inventory
thought this was a good idea, after all, why would you want to change the name
of an item? But marketing has decided that they want a “new,
improved” look by changing all the item names to uppercase; they’ve
done studies and determined that the new names will boost sales (well,
marketing has to have
something
to do ...). So
for_each( )
will not work here, but
transform( )
will:
//: C21:TransformNames.cpp
// More use of transform()
#include <vector>
#include <algorithm>
#include <cctype>
#include "Inventory.h"
#include "PrintSequence.h"
using namespace std;
struct NewImproved {
Inventory operator()(const Inventory& inv) {
return Inventory(toupper(inv.getItem()),
inv.getQuantity(), inv.getValue());
}
};
int main() {
vector<Inventory> vi;
generate_n(back_inserter(vi), 15, InvenGen());
print(vi, "vi");
transform(vi.begin(), vi.end(), vi.begin(),
NewImproved());
print(vi, "vi");
} ///:~
Notice
that the resulting range is the same as the input range, that is, the
transformation is performed in-place.
Now
suppose that the sales department needs to generate special price lists with
different discounts for each item. The original list must stay the same, and
there need to be any number of generated special lists. Sales will give you a
separate list of discounts for each new list. To solve this problem we can use
the second version of
transform( ):
//: C21:SpecialList.cpp
// Using the second version of transform()
#include <vector>
#include <algorithm>
#include <cstdlib>
#include <ctime>
#include "Inventory.h"
#include "PrintSequence.h"
using namespace std;
struct Discounter {
Inventory operator()(const Inventory& inv,
float discount) {
return Inventory(inv.getItem(),
inv.getQuantity(),
inv.getValue() * (1 - discount));
}
};
struct DiscGen {
DiscGen() { srand(time(0)); }
float operator()() {
float r = float(rand() % 10);
return r / 100.0;
}
};
int main() {
vector<Inventory> vi;
generate_n(back_inserter(vi), 15, InvenGen());
print(vi, "vi");
vector<float> disc;
generate_n(back_inserter(disc), 15, DiscGen());
print(disc, "Discounts:");
vector<Inventory> discounted;
transform(vi.begin(),vi.end(), disc.begin(),
back_inserter(discounted), Discounter());
print(discounted, "discounted");
} ///:~
Discounter
is a function object that, given an
Inventory
object and a discount percentage, produces a new
Inventory
with the discounted price.
DiscGen
just generates random discount values between 1 and 10 percent to use for
testing. In
main( ),
two
vectors
are created, one for
Inventory
and one for discounts. These are passed to
transform( )
along with a
Discounter
object, and
transform( )
fills
a new
vector<Inventory>
called
discounted.
Numeric
algorithms
These
algorithms are all tucked into the header
<numeric>,
since they are primarily useful for performing numerical calculations.
<numeric>T
accumulate(InputIterator first, InputIterator last, T result);
T
accumulate(InputIterator first, InputIterator last, T result,
BinaryFunction f);
The
first form is a generalized summation; for each element pointed to by an
iterator
i
in
[first,
last)
,
it performs the operation
result
= result + *i
,
where
result
is of type
T.
However, the second form is more general; it applies the function
f(result,
*i)
on each element
*i
in the range from beginning to end. The value
result
is initialized in both cases by
resultI,
and if the range is empty then
resultI
is returned.
Note
the similarity between the second form of
transform( )
and the second form of
accumulate( ).
<numeric>T
inner_product(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, T init);
T
inner_product(InputIterator1 first1, InputIterator1 last1,
InputIterator2 first2, T init
BinaryFunction1 op1, BinaryFunction2 op2);
Calculates
a generalized inner product of the two ranges
[first1,
last1)
and
[first2,
first2 + (last1 - first1))
.
The return value is produced by multiplying the element from the first sequence
by the “parallel” element in the second sequence, and then adding
it to the sum. So if you have two sequences {1, 1, 2, 2} and {1, 2, 3, 4} the
inner product becomes:
(1*1)
+ (1*2) + (2*3) + (2*4)
Which
is 17. The
init
argument is the initial value for the inner product; this is probably zero but
may be anything and is especially important for an empty first sequence,
because then it becomes the default return value. The second sequence must have
at least as many elements as the first.
While
the first form is very specifically mathematical, the second form is simply a
multiple application of functions and could conceivably be used in many other
situations. The
op1
function is used in place of addition, and
op2
is used instead of multiplication. Thus, if you applied the second version of
inner_product( )
to the above sequence, the result would be the following operations:
init = op1(init, op2(1,1));
init = op1(init, op2(1,2));
init = op1(init, op2(2,3));
init = op1(init, op2(2,4));
Thus
it’s similar to
transform( )
but two operations are performed instead of one.
<numeric>OutputIterator
partial_sum(InputIterator first, InputIterator last,
OutputIterator result);
OutputIterator
partial_sum(InputIterator first, InputIterator last,
OutputIterator result, BinaryFunction op);
Calculates
a generalized partial sum. This means that a new sequence is created, beginning
at
result,
where each element is the sum of all the elements up to the currently selected
element in
[first,
last)
.
For example, if the original sequence is
{1,
1, 2, 2, 3}
then the generated sequence is
{1,
1 + 1, 1 + 1 + 2, 1 + 1 + 1 + 2 + 2, 1 + 1 + 1 + 2 + 2 + 3}
,
that is,
{1,
2, 4, 6, 9}
. In
the second version, the binary function
op
is used instead of the
+
operator to take all the “summation” up to that point and combine
it with the new value. For example, if you use
multiplies<int>( )
as the object for the above sequence, the output is
{1,
1, 2, 4, 12}
.
Note that the first output value is always the same as the first input value.
The
return value is the end of the output range
[result,
result + (last - first) )
. <numeric>OutputIterator
adjacent_difference(InputIterator first, InputIterator last,
OutputIterator result);
OutputIterator
adjacent_difference(InputIterator first, InputIterator last,
OutputIterator result, BinaryFunction op);
Calculates
the differences of adjacent elements throughout the range
[first,
last)
.
This means that in the new sequence, the value is the value of the difference
of the current element and the previous element in the original sequence (the
first value is the same). For example, if the original sequence is
{1,
1, 2, 2, 3}
,
the resulting sequence is
{1,
1 – 1, 2 – 1, 2 – 2, 3 – 2}
,
that is:
{1,
0, 1, 0, 1}
. The
second form uses the binary function
op
instead of the
–
operator to perform the “differencing.” For example, if you use
multiplies<int>( )
as the function object for the above sequence, the output is
{1,
1, 2, 4, 6}
. The
return value is the end of the output range
[result,
result + (last - first) )
.
Example
This
program tests all the algorithms in
<numeric>
in both forms, on integer arrays. You’ll notice that in the test of the
form where you supply the function or functions, the function objects used are
the ones that produce the same result as form one so the results produced will
be exactly the same. This should also demonstrate a bit more clearly the
operations that are going on, and how to substitute your own operations.
//: C21:NumericTest.cpp
#include <numeric>
#include <algorithm>
#include <iostream>
#include <iterator>
#include <functional>
#include "PrintSequence.h"
using namespace std;
int main() {
int a[] = { 1, 1, 2, 2, 3, 5, 7, 9, 11, 13 };
const int asz = sizeof a / sizeof a[0];
print(a, a + asz, "a", " ");
int r = accumulate(a, a + asz, 0);
cout << "accumulate 1: " << r << endl;
// Should produce the same result:
r = accumulate(a, a + asz, 0, plus<int>());
cout << "accumulate 2: " << r << endl;
int b[] = { 1, 2, 3, 4, 1, 2, 3, 4, 1, 2 };
print(b, b + sizeof b / sizeof b[0], "b", " ");
r = inner_product(a, a + asz, b, 0);
cout << "inner_product 1: " << r << endl;
// Should produce the same result:
r = inner_product(a, a + asz, b, 0,
plus<int>(), multiplies<int>());
cout << "inner_product 2: " << r << endl;
int* it = partial_sum(a, a + asz, b);
print(b, it, "partial_sum 1", " ");
// Should produce the same result:
it = partial_sum(a, a + asz, b, plus<int>());
print(b, it, "partial_sum 2", " ");
it = adjacent_difference(a, a + asz, b);
print(b, it, "adjacent_difference 1"," ");
// Should produce the same result:
it = adjacent_difference(a, a + asz, b,
minus<int>());
print(b, it, "adjacent_difference 2"," ");
} ///:~
Note
that the return value of
inner_product( )
and
partial_sum( )
is the past-the-end iterator for the resulting sequence, so it is used as the
second iterator in the
print( )
function.
Since
the second form of each function allows you to provide your own function
object, only the first form of the functions is purely “numeric.”
You could conceivably do some things that are not intuitively numeric with
something like
inner_product( ).
General
utilities
Finally,
here are some basic tools that are used with the other algorithms; you may or
may not use them directly yourself.
<utility>struct
pair;
make_pair( );
This
was described and used in the previous chapter and in this one. A
pair
is simply a way to package two objects (which may be of different types)
together into a single object. This is typically used when you need to return
more than one object from a function, but it can also be used to create a
container that holds
pair
objects,
or to pass more than one object as a single argument. You access the elements
by saying
p.first
and
p.second,
where
p
is the
pair
object. The function
equal_range( ),
described in the last chapter and in this one, returns its result as a
pair
of iterators. You can
insert( )
a
pair
directly into a
map
or
multimap;
a
pair
is the
value_type
for those containers.
If
you want to create a
pair,
you typically use the template function
make_pair( )
rather than explicitly constructing a
pair
object.
<iterator>distance(InputIterator
first, InputIterator last);
Tells
you the number of elements between
first
and
last.
More precisely, it returns an integral value that tells you the number of times
first
must be incremented before it is equal to
last.
No dereferencing of the iterators occurs during this process.
<iterator>void
advance(InputIterator& i, Distance n);
Moves
the iterator
i
forward by the value of
n
(the iterator can also be moved backward for negative values of
n
if the iterator is also a bidirectional iterator). This algorithm is aware of
bidirectional iterators, and will use the most efficient approach.
<iterator>back_insert_iterator<Container>
back_inserter(Container& x);
front_insert_iterator<Container>
front_inserter(Container& x);
insert_iterator<Container>
inserter(Container& x, Iterator i);
These
functions are used to create iterators for the given containers that will
insert elements into the container, rather than overwrite the existing elements
in the container using
operator=
(which
is the default behavior). Each type of iterator uses a different operation for
insertion:
back_insert_iterator
uses
push_back( ),
front_insert_iterator
uses
push_front( )
and
insert_iterator
uses
insert( )
(and thus it can be used with the associative containers, while the other two
can be used with sequence containers). These were shown in some detail in the
previous chapter, and also used in this chapter.
const
LessThanComparable& min(const LessThanComparable& a,
const LessThanComparable& b);
const
T& min(const T& a, const T& b, BinaryPredicate binary_pred);
Returns
the lesser of its two arguments, or the first argument if the two are
equivalent. The first version performs comparisons using
operator<
and the second passes both arguments to
binary_pred
to perform the comparison.
const
LessThanComparable& max(const LessThanComparable& a,
const LessThanComparable& b);
const
T& max(const T& a, const T& b, BinaryPredicate binary_pred);
Exactly
like
min( ),
but returns the greater of its two arguments.
void
swap(Assignable& a, Assignable& b);
void
iter_swap(ForwardIterator1 a, ForwardIterator2 b);
Exchanges
the values of
a
and
b
using assignment. Note that all container classes use specialized versions of
swap( )
that are typically more efficient than this general version.
iter_swap( )
is a backwards-compatible remnant in the standard; you can just use
swap( ).
Go to CodeGuru.com
Contact: webmaster@codeguru.com
© Copyright 1997-1999 CodeGuru