MFC Programmer's SourceBook : Thinking in C++
Bruce Eckel's Thinking in C++, 2nd Ed Contents | Prev | Next

A string application

My friend Daniel (who designed the cover and page layout for this book) does a lot of work with Web pages. One tool he uses creates a “site map” consisting of a Java applet to display the map and an HTML tag that invokes the applet and provides it with the necessary data to create the map. Daniel wanted to use this data to create an ordinary HTML page (sans applet) that would contain regular links as the site map. The resulting program turns out to be a nice practical application of the string class, so it is presented here.

The input is an HTML file that contains the usual stuff along with an applet tag with a parameter that begins like this:

<param name="source_file" value="

The rest of the line contains encoded information about the site map, all combined into a single line (it’s rather long, but fortunately string objects don’t care). Each entry may or may not begin with a number of ‘#’ signs; each of these indicates one level of depth. If no ‘#’ sign is present, the entry is considered to be at level one. After the ‘#’ signs comes the text to be displayed on the page, followed by a ‘%’ and the URL to use as the link. Each entry is terminated by a ‘*’. Thus, a single entry in the line might look like this:

###|Useful Art%./Build/useful_art.html*

The ‘|’ at the beginning is an artifact that needs to be removed.
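To make the format concrete, here is a minimal sketch that splits one entry into its three parts using find( ) and substr( ). The names ParsedEntry and parseEntry are hypothetical and exist only for this illustration; they are not part of the program presented in this section, and the sketch assumes a well-formed entry.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration only.
struct ParsedEntry {
  int depth;         // Number of leading '#' signs (1 if none)
  std::string text;  // Text to display on the page
  std::string url;   // Link target
};

// Splits one entry such as
// "###|Useful Art%./Build/useful_art.html*"
ParsedEntry parseEntry(const std::string& entry) {
  ParsedEntry result;
  // The position of the first non-'#' character is
  // also the number of leading '#' signs:
  std::size_t pos = entry.find_first_not_of('#');
  if(pos == std::string::npos) pos = entry.size();
  result.depth = (pos == 0) ? 1 : static_cast<int>(pos);
  // Skip the '|' artifact, if it is there:
  if(pos < entry.size() && entry[pos] == '|') pos++;
  // The display text runs up to the '%':
  std::size_t percent = entry.find('%', pos);
  result.text = entry.substr(pos, percent - pos);
  if(percent == std::string::npos) return result;
  // The URL runs from after the '%' up to the '*':
  std::size_t star = entry.find('*', percent + 1);
  result.url =
    entry.substr(percent + 1, star - (percent + 1));
  return result;
}
```

Given the sample entry above, parseEntry( ) should yield a depth of 3, the text Useful Art and the URL ./Build/useful_art.html.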

My solution was to create an Item class whose constructor would take input text and create an object that contains the text to be displayed, the URL and the level. The objects essentially parse themselves, and at that point you can read any value you want. In main( ), the input file is opened and read until the line contains the parameter that we’re interested in. Everything but the site map codes is stripped away from this string, and then it is parsed into Item objects:

//: C17:SiteMapConvert.cpp
// Using strings to create a custom conversion
// program that generates HTML output
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include "../require.h"
using namespace std;

class Item {
  string id, url;
  int depth;
  string removeBar(string s) {
    if(!s.empty() && s[0] == '|')
      return s.substr(1);
    else return s;
  }
public:
  Item(string in, int& index) : depth(0) {
    // Test the bound before indexing so that
    // in[index] is never read past the end:
    while(index < in.size() && in[index] == '#'){
      depth++;
      index++;
    }
    // 0 means no '#' marks were found:
    if(depth == 0) depth = 1;
    while(index < in.size() && in[index] != '%')
      id += in[index++];
    id = removeBar(id);
    index++; // Move past '%'
    while(index < in.size() && in[index] != '*')
      url += in[index++];
    url = removeBar(url);
    index++; // To move past '*'
  }
  string identifier() { return id; }
  string path() { return url; }
  int level() { return depth; }
};

int main(int argc, char* argv[]) {
  requireArgs(argc, 1,
    "usage: SiteMapConvert inputfilename");
  ifstream in(argv[1]);
  assure(in, argv[1]);
  ofstream out("plainmap.html");
  string line;
  while(getline(in, line)) {
    if(line.find("<param name=\"source_file\"")
       != string::npos) {
      // Extract data from start of sequence
      // until the terminating quote mark:
      line = line.substr(line.find("value=\"") 
             + string("value=\"").size());
      line = line.substr(0, 
               line.find_last_of("\""));
      int index = 0;
      while(index < line.size()) {
        Item item(line, index);
        string startLevel, endLevel;
        if(item.level() == 1) {
          startLevel = "<h1>";
          endLevel = "</h1>";
        } else
          for(int i = 0; i < item.level(); i++)
            for(int j = 0; j < 5; j++)
              out << "&nbsp;";
        string htmlLine = "<a href=\""
          + item.path() + "\">"
          + item.identifier() + "</a><br>";
        out << startLevel << htmlLine 
            << endLevel << endl;
      }
      break; // Out of while loop
    }
  }
} ///:~ 

Item contains a private member function removeBar( ) that is used internally to strip off the leading bars, if they appear.

The constructor for Item initializes depth to 0 to indicate that no ‘#’ signs were found yet; if none are found then it is assumed the Item should be displayed at level one. Each character in the string is examined using operator[ ] to find the depth, id and url values. The other member functions simply return these values.
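The character-by-character scan can be factored into a small helper, sketched here under the assumption that each field ends at a single delimiter character. The function name readUntil is hypothetical and does not appear in the program. Note that the bounds test comes before the call to operator[ ], so the string is never indexed past its end.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: collects characters starting at
// index until 'stop' is found, then steps past 'stop'.
// index is a reference, so the caller sees it advance.
std::string readUntil(const std::string& in,
    std::size_t& index, char stop) {
  std::string token;
  // Check the bound first, then look at the character:
  while(index < in.size() && in[index] != stop)
    token += in[index++];
  // Move past the delimiter, if one was found:
  if(index < in.size()) index++;
  return token;
}
```

Calling readUntil( ) first with ‘%’ and then with ‘*’ on the same index would pull out the display text and the URL of one entry, leaving index at the start of the next entry.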

After opening the files, main( ) uses string::find( ) to locate the line containing the site map data. At this point, substr( ) strips off the information before and after the site map data. The subsequent while loop performs the parsing. Notice that index is passed by reference into the Item constructor, and that the constructor increments index as it parses each new Item, thus moving forward in the sequence.
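The find( ) and substr( ) step can be isolated as follows. This is only a sketch: the function name extractSiteMapData is hypothetical, and unlike main( ) it checks for string::npos so that a line without the value=" marker yields an empty string instead of a bad offset.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text between
// value=" and the last quote mark on the line.
std::string extractSiteMapData(const std::string& line) {
  const std::string marker = "value=\"";
  std::size_t start = line.find(marker);
  // No marker on this line, so nothing to extract:
  if(start == std::string::npos) return "";
  start += marker.size();
  std::size_t end = line.find_last_of('"');
  // If the only quote is the one inside the marker,
  // there is no closing quote; keep the rest of the line:
  if(end == std::string::npos || end < start)
    return line.substr(start);
  return line.substr(start, end - start);
}
```

The two-argument form of substr( ) takes a starting position and a count, which is why the second argument is end - start and not end itself.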

If an Item is at level one, then an HTML h1 tag is used, otherwise the elements are indented using HTML non-breaking spaces. Note in the initialization of htmlLine how easy it is to construct a string – you can just combine quoted character arrays and other string objects using operator+.
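Here is the same concatenation wrapped in a function, as a small sketch; the name makeLink is hypothetical. One thing to keep in mind is that operator+ needs a string object on at least one side, so an expression that adds two quoted character arrays directly to each other will not compile.

```cpp
#include <string>

// Hypothetical helper: builds one HTML link line.
std::string makeLink(const std::string& url,
    const std::string& text) {
  // The first + combines a quoted character array with
  // a string and produces a string, so every later +
  // also has a string on its left-hand side:
  return "<a href=\"" + url + "\">" + text + "</a><br>";
}
```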

When the output is written to the destination file, startLevel and endLevel produce output only if they have been assigned a value. A default-constructed string is empty, so inserting it into the stream writes nothing.
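To see this behavior in isolation, the following sketch writes into an ostringstream instead of a file so the result can be inspected as a string. The function name renderLine is hypothetical; the logic mirrors the formatting code in main( ).

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats one line of the page
// the way main() does, but returns it as a string.
std::string renderLine(int level,
    const std::string& htmlLine) {
  std::ostringstream out;
  // Default-constructed, so both start out empty:
  std::string startLevel, endLevel;
  if(level == 1) {
    startLevel = "<h1>";
    endLevel = "</h1>";
  } else {
    // Five non-breaking spaces per level of depth:
    for(int i = 0; i < level; i++)
      for(int j = 0; j < 5; j++)
        out << "&nbsp;";
  }
  // For levels other than one, the two empty strings
  // add nothing to the output:
  out << startLevel << htmlLine << endLevel;
  return out.str();
}
```

For a level-one line the result is wrapped in h1 tags; for any other level only the indentation and the line itself appear.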

© Copyright 1997-1999 CodeGuru