Operating on strings
If you’ve programmed in C, you are accustomed to the convenience of a large family of functions for writing, searching, rearranging, and copying char arrays. However, there are two unfortunate aspects of the Standard C library functions for handling char arrays. First, there are three loosely organized families of them: the “plain” group, the group that manipulates the characters without respect to case, and the ones which require you to supply a count of the number of characters to be considered in the operation at hand. The roster of function names in the C char array handling library literally runs to several pages, and though the kind and number of arguments to the functions are somewhat consistent within each of the three groups, to use them properly you must be very attentive to details of function naming and parameter passing.
The second inherent trap of the Standard C char array tools is that they all rely explicitly on the assumption that the character array includes a null terminator. If by oversight or error the null is omitted or overwritten, there’s very little to keep the C char array handling functions from manipulating the memory beyond the limits of the allocated space, sometimes with disastrous results.
C++ string objects provide a vast improvement in convenience and safety. For purposes of actual string handling operations, there are a modest two or three dozen member function names. It’s worth your while to become acquainted with these. Each function is overloaded, so you don’t have to learn a new string member function name simply because of small differences in the parameters.
Appending, inserting and concatenating strings
One of the most valuable and convenient aspects of C++ strings is that they grow as needed, without intervention on the part of the programmer. Not only does this make string handling code inherently more trustworthy, it also almost entirely eliminates a tedious “housekeeping” chore: keeping track of the bounds of the storage in which your strings live. For example, if you create a string object and initialize it with a string of 50 copies of ‘X’, and later store in it 50 copies of “Zowie”, the object itself will reallocate sufficient storage to accommodate the growth of the data. Perhaps nowhere is this property more appreciated than when the strings manipulated in your code will change in size, but you don’t know how big the change is. Appending, concatenating, and inserting strings often give rise to this circumstance, but the string member functions append( ) and insert( ) transparently reallocate storage when a string grows.
//: C17:StrSize.cpp
#include <string>
#include <iostream>
using namespace std;
int main() {
  string bigNews("I saw Elvis in a UFO. ");
  cout << bigNews << endl;
  // How much data have we actually got?
  cout << "Size = " << bigNews.size() << endl;
  // How much can we store without reallocating?
  cout << "Capacity = "
    << bigNews.capacity() << endl;
  // Insert this string in bigNews immediately
  // before bigNews[1]
  bigNews.insert(1, " thought I ");
  cout << bigNews << endl;
  cout << "Size = " << bigNews.size() << endl;
  cout << "Capacity = "
    << bigNews.capacity() << endl;
  // Make sure that there will be this much space
  bigNews.reserve(500);
  // Add this to the end of the string
  bigNews.append("I've been working too hard.");
  cout << bigNews << endl;
  cout << "Size = " << bigNews.size() << endl;
  cout << "Capacity = "
    << bigNews.capacity() << endl;
} ///:~
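Here is the output from one particular implementation (the capacity figures will vary from one library to another):
I saw Elvis in a UFO.
Size = 22
Capacity = 31
I thought I  saw Elvis in a UFO.
Size = 33
Capacity = 63
I thought I  saw Elvis in a UFO. I've been working too hard.
Size = 60
Capacity = 511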
This example demonstrates that even though you can safely relinquish much of the responsibility for allocating and managing the memory your strings occupy, C++ strings provide you with several tools to monitor and manage their size. The size( ), resize( ), capacity( ), and reserve( ) member functions can be very useful when it’s necessary to work back and forth between data contained in C++ style strings and traditional null-terminated C char arrays. Note the ease with which we changed the size of the storage allocated to the string.
The exact fashion in which the string member functions will allocate space for your data is dependent on the implementation of the library. When one implementation was tested with the example above, it appeared that reallocations occurred on even word boundaries, with one byte held back. The architects of the string class have endeavored to make it possible to mix the use of C char arrays and C++ string objects, so it is likely that figures reported by StrSize.cpp for capacity reflect that in this particular implementation, a byte is set aside to easily accommodate the insertion of a null terminator.
Replacing string characters
insert( ) is particularly nice because it absolves you of making sure the insertion of characters in a string won’t overrun the storage space or overwrite the characters immediately following the insertion point. Space grows and existing characters politely move over to accommodate the new elements. Sometimes, however, this might not be what you want to happen. If the data in the string needs to retain the ordering of the original characters relative to one another or must be a specific constant size, use the replace( ) function to overwrite a particular sequence of characters with another group of characters. There are quite a number of overloaded versions of replace( ), but the simplest one takes three arguments: an integer telling where to start in the string, an integer telling how many characters to eliminate from the original string, and the replacement string (which can be a different number of characters than the eliminated quantity). Here’s a very simple example:
//: C17:StringReplace.cpp
// Simple find-and-replace in strings
#include <string>
#include <iostream>
using namespace std;
int main() {
  string s("A piece of text");
  string tag("$tag$");
  s.insert(8, tag + ' ');
  cout << s << endl;
  // find() returns string::size_type, not int
  string::size_type start = s.find(tag);
  cout << "start = " << start << endl;
  cout << "size = " << tag.size() << endl;
  s.replace(start, tag.size(), "hello there");
  cout << s << endl;
} ///:~
The tag is first inserted into s (notice that the insertion happens before the character at the index given as the insertion point, and that an extra space was added after tag), then it is found and replaced. You should actually check to see if you’ve found anything before you perform a replace( ). The above example replaces with a char*, but there’s an overloaded version that replaces with a string.
Here’s a more complete demonstration of replace( ):
//: C17:Replace.cpp
#include <string>
#include <iostream>
using namespace std;
void replaceChars(string& modifyMe,
  const string& findMe, const string& newChars) {
  // Look in modifyMe for the "find string"
  // starting at position 0
  string::size_type i = modifyMe.find(findMe, 0);
  // Did we find the string to replace?
  if(i != string::npos)
    // Replace the find string with newChars.
    // The count is the length of the text being
    // removed, not of the text being put in:
    modifyMe.replace(i, findMe.size(), newChars);
}
int main() {
  string bigNews =
    "I thought I saw Elvis in a UFO. "
    "I have been working too hard.";
  string replacement("wig");
  string findMe("UFO");
  // Find "UFO" in bigNews and overwrite it:
  replaceChars(bigNews, findMe, replacement);
  cout << bigNews << endl;
} ///:~
The output from Replace.cpp looks like this:
I thought I saw Elvis in a wig. I have been working too hard.
If find( ) doesn’t find the search string, it returns npos. npos is a static constant member of the basic_string class.
Unlike insert( ), replace( ) won’t grow the string’s storage space if you overwrite characters in the middle of the string with the same number of new characters. However, it will grow the storage space if the replacement is longer than the text it replaces and the result no longer fits, or if you make a “replacement” that writes beyond the end of the existing characters. Here’s an example:
//: C17:ReplaceAndGrow.cpp
#include <string>
#include <iostream>
using namespace std;
int main() {
  string bigNews("I saw Elvis in a UFO. "
    "I have been working too hard.");
  string replacement("wig");
  // The first arg says "replace chars
  // beyond the end of the existing string":
  bigNews.replace(bigNews.size(),
    replacement.size(), replacement);
  cout << bigNews << endl;
} ///:~
The call to replace( ) begins “replacing” beyond the end of the existing array. The output looks like this:
I saw Elvis in a UFO. I have been working too hard.wig
Notice that replace( ) expands the array to accommodate the growth of the string due to “replacement” beyond the bounds of the existing array.
Simple character replacement using the STL replace( ) algorithm
You may have been hunting through this chapter trying to do something relatively simple like replace all the instances of one character with a different character. Upon finding the above section on replacing, you thought you found the answer but then you started seeing groups of characters and counts and other things that looked a bit too complex. Doesn’t string have a way to just replace one character with another everywhere?
The answer comes in observing that the string class doesn’t solve all these problems alone. It contains a significant number of functions, but there are other problems it doesn’t solve. These are relegated to the STL algorithms, and enabled because the string class can look just like an STL container, which is what the STL algorithms want to work with. All the STL algorithms specify a “range” of elements within a container that the algorithm will work upon. Usually that range is just “from the beginning of the container to the end.” A string object looks like a container of characters, and to get the beginning of the range you use string::begin( ) and to get the end of the range you use string::end( ). The following example shows the use of the STL replace( ) algorithm to replace all the instances of ‘X’ with ‘Y’:
//: C17:StringCharReplace.cpp
#include <string>
#include <algorithm>
#include <iostream>
using namespace std;
int main() {
  string s("aaaXaaaXXaaXXXaXXXXaaa");
  cout << s << endl;
  replace(s.begin(), s.end(), 'X', 'Y');
  cout << s << endl;
} ///:~
Notice that this replace( ) is not called as a member function of string. Also, unlike the string::replace( ) functions, which only perform one replacement, the STL replace( ) replaces all instances of one character with another.
The STL replace( ) algorithm only works with single objects (in this case, char objects), and will not perform replacements of quoted char arrays or of string objects.
Since a string looks like an STL container, there are a number of other STL algorithms that can be applied to it, which may solve other problems you have that are not directly addressed by the string member functions. See Chapter 21 for more information on the STL algorithms.
Concatenation using non-member overloaded operators
One of the most delightful discoveries awaiting a C programmer learning about C++ string handling is how simply strings can be combined and appended using operator+ and operator+=. These operators make combining strings syntactically equivalent to adding numeric data.
//: C17:AddStrings.cpp
#include <string>
#include <iostream>
using namespace std;
int main() {
  string s1("This ");
  string s2("That ");
  string s3("The other ");
  // operator+ concatenates strings
  s1 = s1 + s2;
  cout << s1 << endl;
  // Another way to concatenate strings
  s1 += s3;
  cout << s1 << endl;
  // You can index the string on the right
  s1 += s3 + s3[4] + "oh lala";
  cout << s1 << endl;
} ///:~
The output looks like this:
This That
This That The other
This That The other The other ooh lala
operator+ and operator+= are a very flexible and convenient means of combining string data. On the right-hand side of the statement, you can use almost any type that evaluates to a group of one or more characters.