MFC Programmer's SourceBook : Thinking in C++
Bruce Eckel's Thinking in C++, 2nd Ed

Operating on strings

If you’ve programmed in C, you are accustomed to the convenience of a large family of functions for writing, searching, rearranging, and copying char arrays. However, there are two unfortunate aspects of the Standard C library functions for handling char arrays. First, there are three loosely organized families of them: the “plain” group, the group that manipulates the characters without respect to case, and the ones which require you to supply a count of the number of characters to be considered in the operation at hand. The roster of function names in the C char array handling library literally runs to several pages, and though the kind and number of arguments to the functions are somewhat consistent within each of the three groups, to use them properly you must be very attentive to details of function naming and parameter passing.

The second inherent trap of the standard C char array tools is that they all rely explicitly on the assumption that the character array includes a null terminator. If by oversight or error the null is omitted or overwritten, there’s very little to keep the C char array handling functions from manipulating the memory beyond the limits of the allocated space, sometimes with disastrous results.

C++ string objects provide a vast improvement in convenience and safety. For purposes of actual string handling operations, there are a modest two or three dozen member function names, and it’s worth your while to become acquainted with them. Each function is overloaded, so you don’t have to learn a new string member function name simply because of small differences in the parameters.
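To make the point about overloading concrete, here is a minimal sketch in which a single name, append( ) or insert( ), accepts a string, a char array, or a repeated character. The function names buildGreeting( ) and buildRuler( ) are made up for this illustration and are not part of the book's code.

```cpp
#include <string>

// One name, append( ), used with three different argument lists.
std::string buildGreeting() {
  std::string s("Hello");
  std::string world("world");
  s.append(", ");    // append(const char*)
  s.append(world);   // append(const string&)
  s.append(3, '!');  // append(count, char)
  return s;          // "Hello, world!!!"
}

// insert( ) is overloaded along the same lines.
std::string buildRuler() {
  std::string s("[]");
  s.insert(1, 5, '-');            // insert(pos, count, char): "[-----]"
  s.insert(3, "|");               // insert(pos, const char*): "[--|---]"
  s.insert(0, std::string(">"));  // insert(pos, const string&)
  return s;                       // ">[--|---]"
}
```

In each case the compiler picks the right version from the argument types, so there is only one function name to remember per operation.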

Appending, inserting and concatenating strings

One of the most valuable and convenient aspects of C++ strings is that they grow as needed, without intervention on the part of the programmer. Not only does this make string handling code inherently more trustworthy, it also almost entirely eliminates a tedious “housekeeping” chore: keeping track of the bounds of the storage in which your strings live. For example, if you create a string object and initialize it with a string of 50 copies of ‘X’, and later store in it 50 copies of “Zowie”, the object itself will reallocate sufficient storage to accommodate the growth of the data. Perhaps nowhere is this property more appreciated than when the strings manipulated in your code will change in size, but you don’t know how big the change will be. Appending, concatenating, and inserting strings often give rise to this circumstance, but the string member functions append( ) and insert( ) transparently reallocate storage when a string grows.

//: C17:StrSize.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string bigNews("I saw Elvis in a UFO. ");
  cout << bigNews << endl;
  // How much data have we actually got?
  cout << "Size = " << bigNews.size() << endl;
  // How much can we store without reallocating
  cout << "Capacity = " 
    << bigNews.capacity() << endl;
  // Insert this string in bigNews immediately
  // before bigNews[1]
  bigNews.insert(1, " thought I");
  cout << bigNews << endl;
  cout << "Size = " << bigNews.size() << endl;
  cout << "Capacity = " 
    << bigNews.capacity() << endl;
  // Make sure that there will be this much space
  bigNews.reserve(500);
  // Add this to the end of the string
  bigNews.append("I've been working too hard.");
  cout << bigNews << endl;
  cout << "Size = " << bigNews.size() << endl;
  cout << "Capacity = " 
    << bigNews.capacity() << endl;
} ///:~ 

Here is the output from one implementation (the capacity figures will differ from one library to another):

I saw Elvis in a UFO.
Size = 22
Capacity = 31
I thought I saw Elvis in a UFO.
Size = 32
Capacity = 63
I thought I saw Elvis in a UFO. I've been 
working too hard.
Size = 59
Capacity = 511 

This example demonstrates that even though you can safely relinquish much of the responsibility for allocating and managing the memory your strings occupy, C++ strings provide you with several tools to monitor and manage their size. The size( ), resize( ), capacity( ), and reserve( ) member functions can be very useful when it’s necessary to work back and forth between data contained in C++ style strings and traditional null terminated C char arrays. Note the ease with which we changed the size of the storage allocated to the string.
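As a rough sketch of that back-and-forth between strings and C char arrays, the following helpers use size( ) to size a C-style buffer and resize( ) to force a string to a fixed width. The names toCBuffer( ) and fitToWidth( ) are invented for this illustration, and a vector<char> stands in for a hand-managed char array.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Copy a string into a C-style buffer sized from size( ),
// leaving one extra element for the null terminator.
std::vector<char> toCBuffer(const std::string& s) {
  std::vector<char> buf(s.size() + 1);
  // c_str( ) points at size( ) characters followed by a null
  std::memcpy(buf.data(), s.c_str(), s.size() + 1);
  return buf;
}

// resize( ) changes the number of characters: it truncates
// when shrinking and pads with the fill character when growing.
std::string fitToWidth(std::string s, std::string::size_type width) {
  s.resize(width, '.');
  return s;
}
```

With these definitions, fitToWidth("UFO", 5) yields "UFO.." and fitToWidth("Elvis lives", 5) yields "Elvis", while toCBuffer("Elvis") returns six elements, the last of which is the null terminator.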

The exact fashion in which the string member functions will allocate space for your data is dependent on the implementation of the library. When one implementation was tested with the example above, it appeared that reallocations occurred on even word boundaries, with one byte held back. The architects of the string class have endeavored to make it possible to mix the use of C char arrays and C++ string objects, so it is likely that figures reported by StrSize.cpp for capacity reflect that in this particular implementation, a byte is set aside to easily accommodate the insertion of a null terminator.
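If you are curious how your own library behaves, one way to find out is to watch capacity( ) while a string grows. The sketch below (countReallocations( ) is a made-up name) appends characters one at a time and counts how often the capacity changes; the count you get without reserve( ) depends entirely on the implementation.

```cpp
#include <string>

// Append n characters one at a time and count how often
// capacity( ) changes, which indicates a reallocation.
// If reserveFirst is true, all the space is requested up front.
int countReallocations(std::string::size_type n, bool reserveFirst) {
  std::string s;
  if(reserveFirst)
    s.reserve(n);
  int reallocations = 0;
  std::string::size_type cap = s.capacity();
  for(std::string::size_type i = 0; i < n; i++) {
    s += 'x';
    if(s.capacity() != cap) {
      // The library has moved the data to larger storage
      cap = s.capacity();
      reallocations++;
    }
  }
  return reallocations;
}
```

Calling reserve( ) first should bring the count down to zero, which is the practical reason to use it when you know roughly how large a string will become.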

Replacing string characters

insert( ) is particularly nice because it absolves you of making sure the insertion of characters in a string won’t overrun the storage space or overwrite the characters immediately following the insertion point. Space grows and existing characters politely move over to accommodate the new elements. Sometimes, however, this might not be what you want to happen. If the data in the string needs to retain the ordering of the original characters relative to one another or must be a specific constant size, use the replace( ) function to overwrite a particular sequence of characters with another group of characters. There are quite a number of overloaded versions of replace( ), but the simplest one takes three arguments: an integer telling where to start in the string, an integer telling how many characters to eliminate from the original string, and the replacement string (which can be a different number of characters than the eliminated quantity). Here’s a very simple example:

//: C17:StringReplace.cpp
// Simple find-and-replace in strings
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s("A piece of text");
  string tag("$tag$");
  s.insert(8, tag + ' ');
  cout << s << endl;
  string::size_type start = s.find(tag);
  cout << "start = " << start << endl;
  cout << "size = " << tag.size() << endl;
  s.replace(start, tag.size(), "hello there");
  cout << s << endl;
} ///:~ 

The tag is first inserted into s (notice that the insert happens before the value indicating the insert point, and that an extra space was added after tag), then it is found and replaced.

You should actually check to see if you’ve found anything before you perform a replace( ). The above example replaces with a char*, but there’s an overloaded version that replaces with a string. Here’s a more complete demonstration of replace( ):

//: C17:Replace.cpp
#include <string>
#include <iostream>
using namespace std;

void replaceChars(string& modifyMe, 
  const string& findMe, const string& newChars){
  // Look in modifyMe for the "find string"
  // starting at position 0
  string::size_type i = modifyMe.find(findMe, 0);
  // Did we find the string to replace?
  if(i != string::npos)
    // Replace the find string with newChars
    modifyMe.replace(i,findMe.size(),newChars);
}

int main() {
  string bigNews = 
   "I thought I saw Elvis in a UFO. "
   "I have been working too hard.";
  string replacement("wig");
  string findMe("UFO");
  // Find "UFO" in bigNews and overwrite it:
  replaceChars(bigNews, findMe,  replacement);
  cout << bigNews << endl;
} ///:~ 

The output from Replace.cpp looks like this:

I thought I saw Elvis in a wig. I have been
working too hard. 

If find( ) doesn’t find the search string, it returns npos. npos is a static constant member of the basic_string class.

Unlike insert( ), replace( ) doesn’t always make the string longer: if the replacement has the same number of characters as the sequence it overwrites, the size of the string is unchanged. The string does grow, however, if the replacement is longer than the sequence it replaces, or if you make a “replacement” that starts at the very end of the existing array. Here’s an example of the latter:

//: C17:ReplaceAndGrow.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string bigNews("I saw Elvis in a UFO. "
    "I have been working too hard.");
  string replacement("wig");
  // The first arg says "replace chars 
  // beyond the end of the existing string":
  bigNews.replace(bigNews.size(), 
    replacement.size(), replacement);
  cout << bigNews << endl;
} ///:~ 

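The call to replace( ) begins “replacing” at position size( ), that is, just past the last character of the existing array. (This is the largest starting position allowed; anything greater causes replace( ) to throw an out_of_range exception.) The output looks like this: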

I saw Elvis in a UFO. I have 
been working too hard.wig 

Notice that replace( ) expands the array to accommodate the growth of the string due to “replacement” beyond the bounds of the existing array.
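The following sketch gathers the replace( ) behavior described in this section into two small helpers: the string shrinks or grows to fit whatever replaces the found sequence, and a starting position beyond size( ) is rejected with an exception. Both function names are hypothetical and exist only for this illustration.

```cpp
#include <stdexcept>
#include <string>

// Replace the first occurrence of findMe in text with newChars
// and return the result. The text is returned unchanged if
// findMe is not found.
std::string replaceFirst(std::string text, const std::string& findMe,
                         const std::string& newChars) {
  std::string::size_type pos = text.find(findMe);
  if(pos != std::string::npos)
    // Remove findMe.size() characters, put newChars in their place
    text.replace(pos, findMe.size(), newChars);
  return text;
}

// A starting position greater than size( ) is an error:
// replace( ) throws std::out_of_range rather than growing.
bool replacePastEndThrows(std::string text) {
  try {
    text.replace(text.size() + 1, 0, "wig");
  } catch(const std::out_of_range&) {
    return true;
  }
  return false;
}
```

For example, replaceFirst("Elvis in a UFO.", "UFO", "pink Cadillac") produces "Elvis in a pink Cadillac.", which is ten characters longer than the original, and replacing with an empty string simply removes the sequence.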

Simple character replacement using the STL replace( ) algorithm

You may have been hunting through this chapter trying to do something relatively simple like replace all the instances of one character with a different character. Upon finding the above section on replacing, you thought you found the answer but then you started seeing groups of characters and counts and other things that looked a bit too complex. Doesn’t string have a way to just replace one character with another everywhere?

The answer comes in observing that the string class doesn’t solve all these problems alone. It contains a significant number of functions, but there are other problems it doesn’t solve. These are relegated to the STL algorithms, and enabled because the string class can look just like an STL container, which is what the STL algorithms want to work with. All the STL algorithms specify a “range” of elements within a container that the algorithm will work upon. Usually that range is just “from the beginning of the container to the end.” A string object looks like a container of characters, and to get the beginning of the range you use string::begin( ) and to get the end of the range you use string::end( ). The following example shows the use of the STL replace( ) algorithm to replace all the instances of ‘X’ with ‘Y’:

//: C17:StringCharReplace.cpp
#include <string>
#include <algorithm>
#include <iostream>
using namespace std;

int main() {
  string s("aaaXaaaXXaaXXXaXXXXaaa");
  cout << s << endl;
  replace(s.begin(), s.end(), 'X', 'Y');
  cout << s << endl;
} ///:~ 

Notice that this replace( ) is not called as a member function of string. Also, unlike the string::replace( ) functions which only perform one replacement, the STL replace is replacing all instances of one character with another.

The STL replace( ) algorithm only works with single objects (in this case, char objects), and will not perform replacements of quoted char arrays or of string objects.
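If what you need is to replace every occurrence of one group of characters with another, neither tool does it in a single call, but a short loop around string::find( ) and string::replace( ) will. This is only one possible approach, and replaceAll( ) is a name invented for this sketch, not a library function.

```cpp
#include <string>

// Replace every occurrence of findMe in text with newChars.
std::string replaceAll(std::string text, const std::string& findMe,
                       const std::string& newChars) {
  if(findMe.empty())
    return text;  // Nothing to search for
  std::string::size_type pos = text.find(findMe);
  while(pos != std::string::npos) {
    text.replace(pos, findMe.size(), newChars);
    // Resume after the inserted text so that a replacement
    // containing findMe is not matched again.
    pos = text.find(findMe, pos + newChars.size());
  }
  return text;
}
```

Skipping past the newly inserted characters matters: without it, replacing "a" with "aa" would never terminate.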

Since a string looks like an STL container, there are a number of other STL algorithms that can be applied to it, which may solve other problems you have that are not directly addressed by the string member functions. See Chapter 21 for more information on the STL algorithms.
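As a small taste of what this makes possible, here is a sketch applying three more STL algorithms, count( ), reverse( ), and transform( ), to a string through begin( ) and end( ). The wrapper function names are made up for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// count( ): how many times does c appear in s?
std::string::size_type countChar(const std::string& s, char c) {
  return static_cast<std::string::size_type>(
    std::count(s.begin(), s.end(), c));
}

// reverse( ): reverse the order of the characters.
std::string reversed(std::string s) {
  std::reverse(s.begin(), s.end());
  return s;
}

// Helper for transform( ): toupper( ) expects a value in the
// range of unsigned char, so convert before calling it.
char upperChar(char c) {
  return static_cast<char>(
    std::toupper(static_cast<unsigned char>(c)));
}

// transform( ): apply upperChar( ) to every character,
// writing the results back over the originals.
std::string upperCased(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), upperChar);
  return s;
}
```

None of these operations exists as a string member function, yet each takes only a line once the string is treated as a range of characters.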

Concatenation using non-member overloaded operators

One of the most delightful discoveries awaiting a C programmer learning about C++ string handling is how simply strings can be combined and appended using operator+ and operator+=. These operators make combining strings syntactically equivalent to adding numeric data.

//: C17:AddStrings.cpp
#include <string>
#include <iostream>
using namespace std;

int main() {
  string s1("This ");
  string s2("That ");
  string s3("The other ");
  // operator+ concatenates strings
  s1 = s1 + s2;
  cout << s1 << endl;
  // Another way to concatenate strings
  s1 += s3;
  cout << s1 << endl;
  // You can index the string on the right
  s1 += s3 + s3[4] + "oh lala";
  cout << s1 << endl;
} ///:~ 

The output looks like this:

This That
This That The other
This That The other The other ooh lala 

operator+ and operator+= are a very flexible and convenient means of combining string data. On the right hand side of the statement, you can use almost any type that evaluates to a group of one or more characters.
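One caveat is worth a sketch: the operators are defined for string, so at least one operand of every + must already be a string. The function joinPieces( ) below is a hypothetical example showing combinations that work, with a comment on one that does not.

```cpp
#include <string>

std::string joinPieces() {
  std::string s("The other");
  const char* cstr = " side";
  char ch = '!';
  // string + char array + char: fine, because the leftmost
  // operand is a string and each + yields another string.
  std::string result = s + cstr + ch;
  // A literal can come first as long as the other operand
  // of that + is a string.
  result = "[" + result + "]";
  // This would NOT compile, because the first + would be
  // applied to two char arrays before any string is involved:
  //   std::string bad = "[" + " side" + s;
  return result;  // "[The other side!]"
}
```

When an expression starts with two literals, wrapping the first one in string( ) is enough to get the chain of operator+ calls started.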
