MFC Programmer's SourceBook : Thinking in C++

The basic object

Step one in C++ is exactly that. Functions can now be placed inside structs as “member functions.” Here’s what it looks like after converting the C version of CStash to the C++ Stash:

//: C04:Libcpp.h
// C library converted to C++

struct Stash {
  int size;      // Size of each space
  int quantity;  // Number of storage spaces
  int next;      // Next empty space
  // Dynamically allocated array of bytes:
  unsigned char* storage;
  // Functions!
  void initialize(int size);
  void cleanup();
  int add(void* element);
  void* fetch(int index);
  int count();
  void inflate(int increase);
}; ///:~

The first thing you’ll notice is the new comment syntax, //. This is in addition to C-style comments, which still work fine. A C++ comment runs only to the end of the line, which is often very convenient. In this book we also put a colon after the // on the first line of the file, followed by the name of the file and a brief description. This allows an exact inclusion of the file from the source code, and it lets you easily identify the file in the electronic source code from its name in the book listing.

Next, notice there is no typedef. Instead of requiring you to create a typedef, the C++ compiler turns the name of the structure into a new type name for the program (just like int, char, float and double are type names). The use of Stash is still the same.

All the data members are exactly the same as before, but now the functions are inside the body of the struct. In addition, notice that the first argument from the C version of the library has been removed. In C++, instead of forcing you to pass the address of the structure as the first argument to all the functions that operate on that structure, the compiler secretly does this for you. Now the only arguments for the functions are concerned with what the function does, not the mechanism of the function’s operation.

It’s important to realize that the function code is effectively the same as it was with the C library. The number of arguments is the same (even though you don’t see the structure address being passed in, it’s still there), and there’s only one function body for each function. That is, just because you say

Stash A, B, C;

doesn’t mean you get a different add( ) function for each variable.

So the code that’s generated is almost the same as you would have written for the C library. Interestingly enough, this includes the “name mangling” you probably would have done to produce Stash_initialize( ), Stash_cleanup( ), and so on. When the function name is inside the struct, the compiler effectively does the same thing. Therefore, initialize( ) inside the structure Stash will not collide with initialize( ) inside any other structure. Most of the time you don’t have to worry about the function name mangling; you use the unmangled name. But sometimes you do need to be able to specify that this initialize( ) belongs to the struct Stash, and not to any other struct. In particular, when you’re defining the function you need to fully specify which one it is. To accomplish this full specification, C++ has a new operator, ::, the scope resolution operator (named so because names can now be in different scopes: at global scope, or within the scope of a struct). For example, if you want to specify initialize( ), which belongs to Stash, you say Stash::initialize(int size); . You can see how the scope resolution operator is used in the function definitions for the C++ version of Stash:

//: C04:Libcpp.cpp {O}
// C library converted to C++
// Declare structure and functions:
#include <cstdlib> // Dynamic memory
#include <cstring> // memcpy()
#include <cstdio>
#include "../require.h" // Error testing code
#include "Libcpp.h"
using namespace std;

void Stash::initialize(int sz) {
  size = sz;
  quantity = 0;
  storage = 0;
  next = 0;
}

void Stash::cleanup() {
  if(storage) {
    puts("freeing storage");
    free(storage);
  }
}

int Stash::add(void* element) {
  if(next >= quantity) // Enough space left?
    inflate(100);
  // Copy element into storage,
  // starting at next empty space:
  memcpy(&(storage[next * size]),
         element, size);
  next++;
  return(next - 1); // Index number
}

void* Stash::fetch(int index) {
  if(index >= next || index < 0)
    return 0;  // Out of bounds
  // Produce pointer to desired element:
  return &(storage[index * size]);
}

int Stash::count() {
  return next; // Number of elements in Stash
}

void Stash::inflate(int increase) {
  void* v =
    realloc(storage, (quantity+increase)*size);
  require(v != 0);  // Was it successful?
  storage = (unsigned char*)v;
  quantity += increase;
} ///:~

There are several other things that are different about this file. First, the declarations in the header files are required by the compiler. In C++ you cannot call a function without declaring it first. The compiler will issue an error message otherwise. This is an important way to ensure that function calls are consistent between the point where they are called and the point where they are defined. By forcing you to declare the function before you call it, the C++ compiler virtually ensures you will perform this declaration by including the header file. If you also include the same header file in the place where the functions are defined, then the compiler checks to make sure the declaration in the header and the definition match up. This means that the header file becomes a validated repository for function declarations and ensures that functions are used consistently throughout all translation units in the project.

Of course, global functions can still be declared by hand every place where they are defined and used. (This is so tedious that it becomes very unlikely.) However, structures must always be declared before they are defined or used, and the most convenient place to put a structure definition is in a header file (except for those you intentionally hide in a single file).

You can see that all the member functions are virtually the same, except for the scope resolution and the fact that the first argument from the C version of the library is no longer explicit. It’s still there, of course, because the function has to be able to work on a particular struct variable. But notice that inside the member function the member selection is also gone! Thus, instead of saying s->size = Size; you say size = Size; and eliminate the tedious s->, which didn’t really add anything to the meaning of what you were doing anyway. Of course, the C++ compiler must still be doing this for you. Indeed, it is taking the “secret” first argument and applying the member selector whenever you refer to one of the data members of the struct. This means that whenever you are inside a member function of a struct, you can refer to any member of that struct (including another member function) by simply giving its name. The compiler will search through the local structure’s names before looking for a global version of that name. You’ll find that this feature means that not only is your code easier to write, it’s a lot easier to read.

But what if, for some reason, you want to be able to get your hands on the address of the structure? In the C version of the library it was easy because each function’s first argument was a Stash* called s. In C++, things are even more consistent. There’s a special keyword, called this, which produces the address of the struct. It’s the equivalent of s in the C version of the library. So we can revert to the C style of things by saying

this->size = Size;

The code generated by the compiler is exactly the same. Usually, you don’t use this very often, but when you need it, it’s there.

There’s one last change in the definitions. In inflate( ) in the C library, you could assign a void* to any other pointer like this:

s->storage = v;

and there was no complaint from the compiler. But in C++, this statement is not allowed. Why? In C, you can assign a void* (which is what malloc( ), calloc( ), and realloc( ) return) to any other pointer without a cast. C is not so particular about type information, so it allows this kind of thing. Not so with C++. Type is critical in C++, and the compiler stamps its foot when there are any violations of type information. This has always been important, but it is especially important in C++ because you have member functions in structs. If you could pass pointers to structs around with impunity in C++, then you could end up calling a member function for a struct that doesn’t even logically exist for that struct! A real recipe for disaster. Therefore, while C++ allows the assignment of any type of pointer to a void* (this was the original intent of void*, which is required to be large enough to hold a pointer to any type), it will not allow you to assign a void pointer to any other type of pointer. A cast is always required, to tell the reader and the compiler that you know the type that it is going to. Thus you will see that in inflate( ) the return value of realloc( ) is explicitly cast to (unsigned char*) .

This brings up an interesting issue. One of the important goals for C++ is to compile as much existing C code as possible to allow for an easy transition to the new language. Notice in the above example how Standard C library functions are used. In addition, all C operators and expressions are available in C++. However, this doesn’t mean any code that C allows will automatically be allowed in C++. There are a number of things the C compiler lets you get away with that are dangerous and error-prone. (We’ll look at them as the book progresses.) The C++ compiler generates warnings and errors for these situations. This is often much more of an advantage than a hindrance. In fact, there are many situations where you are trying to run down an error in C and just can’t find it, but as soon as you recompile the program in C++, the compiler points out the problem! In C, you’ll often find that you can get the program to compile, but then you have to get it to work. In C++, often when the program compiles correctly, it works, too! This is because the language is a lot stricter about type.

You can see a number of new things in the way the C++ version of Stash is used, in the following test program:

//: C04:Libtest.cpp
//{L} Libcpp
// Test of C++ library
#include <cstdio>
#include "../require.h"
#include "Libcpp.h"
using namespace std;
#define BUFSIZE 80

int main() {
  Stash intStash, stringStash;
  int i;
  FILE* file;
  char buf[BUFSIZE];
  char* cp;
  // ....
  intStash.initialize(sizeof(int));
  for(i = 0; i < 100; i++)
    intStash.add(&i);
  // Holds 80-character strings:
  stringStash.initialize(sizeof(char)*BUFSIZE);
  file = fopen("Libtest.cpp", "r");
  require(file != 0);
  while(fgets(buf, BUFSIZE, file))
    stringStash.add(buf);
  fclose(file);

  for(i = 0; i < intStash.count(); i++)
    printf("intStash.fetch(%d) = %d\n", i,
           *(int*)intStash.fetch(i));

  i = 0;
  while(
    (cp = (char*)stringStash.fetch(i++))!=0)
      printf("stringStash.fetch(%d) = %s",
             i - 1, cp);
  putchar('\n');
  intStash.cleanup();
  stringStash.cleanup();
} ///:~

The code is quite similar, but when a member function is called, the call occurs using the member selection operator ‘.’ preceded by the name of the variable. This is a convenient syntax because it mimics the selection of a data member of the structure. The difference is that this is a function member, so it has an argument list.

Of course, the call that the compiler actually generates looks much more like the original C library function. Thus, considering name mangling and the passing of this, the C++ function call intStash.initialize(sizeof(int)) becomes something like Stash_initialize(&intStash, sizeof(int)) . If you ever wonder what’s going on underneath the covers, remember that the original C++ compiler cfront from AT&T produced C code as its output, which was then compiled by the underlying C compiler. This approach meant that cfront could be quickly ported to any machine that had a C compiler, and it helped to rapidly disseminate C++ compiler technology.

You’ll also notice an additional cast in

while((cp = (char*)stringStash.fetch(i++)) != 0)

This is due again to the stricter type checking in C++.
