Bruce Eckel's Thinking in C++, 2nd Ed

All-purpose virtual constructors

Show simpler version of virtual constructor scheme, letting the user create the object with new. Probably make constructor for objects private and use a maker function to force all objects on the heap.

Consider the oft-cited “shapes” example. It seems logical that inside the constructor for a Shape object, you would want to set everything up and then draw( ) the Shape. draw( ) should be a virtual function, a message to the Shape that it should draw itself appropriately, depending on whether it is a circle, square, line, and so on. However, this doesn’t work inside the constructor, for the reasons given in Chapter 13: Virtual functions resolve to the “local” function bodies when called in constructors.
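The constructor behavior described above can be seen in a minimal sketch. The names Base, Derived, name( ), and constructorVersusLater( ) are hypothetical and are not part of the Shape example; the point is only that the call made inside the base-class constructor reaches the base-class body, while the same call made after construction reaches the derived-class body.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes, for illustration only.
struct Base {
  std::string log;
  Base() { log += name(); }  // Virtual call inside the constructor
  virtual ~Base() {}
  virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
  std::string name() const { return "Derived"; }
};

// Returns what the constructor saw, then what a later call sees.
std::string constructorVersusLater() {
  Derived d;
  assert(d.log == "Base");  // The constructor reached only Base::name()
  return d.log + "/" + d.name();
}
```

Calling constructorVersusLater( ) yields "Base/Derived": the first half was recorded while the Base constructor was running, the second half after the Derived object was complete.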

If you want to be able to call a virtual function inside the constructor and have it do the right thing, you must use a technique to simulate a virtual constructor. This is a conundrum. Remember the idea of a virtual function is that you send a message to an object and let the object figure out the right thing to do. But a constructor builds an object. So a virtual constructor would be like saying, “I don’t know exactly what type of object you are, but build yourself anyway.” In an ordinary constructor, the compiler must know which VTABLE address to bind to the VPTR, and if it existed, a virtual constructor couldn’t do this because it doesn’t know all the type information at compile-time. It makes sense that a constructor can’t be virtual because it is the one function that absolutely must know everything about the type of the object.

And yet there are times when you want something approximating the behavior of a virtual constructor.

In the Shape example, it would be nice to hand the Shape constructor some specific information in the argument list and let the constructor create a specific type of Shape (a Circle, Square, or Triangle) with no further intervention. Ordinarily, you’d have to make an explicit call to the Circle, Square, or Triangle constructor yourself.

Coplien[70] calls his solution to this problem “envelope and letter classes.” The “envelope” class is the base class, a shell that contains a pointer to an object of the base class. The constructor for the “envelope” determines (at run-time, when the constructor is called, not at compile-time, when the type checking is normally done) what specific type to make, then creates an object of that specific type (on the heap) and assigns the object to its pointer. All the function calls are then handled by the base class through its pointer.

Here’s a simplified version of the “shape” example:

//: C:ShapeV.cpp
// "Virtual constructors"
// Used in a simple "Shape" framework
#include <iostream>
#include <vector>
#include "../purge.h"
using namespace std;

class Shape {
  Shape* S;
  // Prevent copy-construction & operator=
  Shape(Shape&);
  Shape operator=(Shape&);
protected:
  Shape() { S = 0; }  // "Letter" objects carry a null S
public:
  enum type { tCircle, tSquare, tTriangle };
  Shape(type);  // "Virtual" constructor
  virtual void draw() { S->draw(); }
  virtual ~Shape() {
    cout << "~Shape\n";
    delete S;
  }
};

class Circle : public Shape {
  Circle(Circle&);
  Circle operator=(Circle&);
public:
  Circle() {}
  void draw() { cout << "Circle::draw\n"; }
  ~Circle() { cout << "~Circle\n"; }
};

class Square : public Shape {
  Square(Square&);
  Square operator=(Square&);
public:
  Square() {}
  void draw() { cout << "Square::draw\n"; }
  ~Square() { cout << "~Square\n"; }
};

class Triangle : public Shape {
  Triangle(Triangle&);
  Triangle operator=(Triangle&);
public:
  Triangle() {}
  void draw() { cout << "Triangle::draw\n"; }
  ~Triangle() { cout << "~Triangle\n"; }
};

Shape::Shape(type t) {
  switch(t) {
    case tCircle: S = new Circle; break;
    case tSquare: S = new Square; break;
    case tTriangle: S = new Triangle; break;
  }
  draw();  // Virtual call in the constructor
}

int main() {
  vector<Shape*> shapes;
  cout << "virtual constructor calls:" << endl;
  shapes.push_back(new Shape(Shape::tCircle));
  shapes.push_back(new Shape(Shape::tSquare));
  shapes.push_back(new Shape(Shape::tTriangle));
  cout << "virtual function calls:" << endl;
  for(vector<Shape*>::size_type i = 0; i < shapes.size(); i++)
    shapes[i]->draw();
  Shape c(Shape::tCircle); // Can create on stack
  purge(shapes);
} ///:~ 

The base class Shape contains a pointer to an object of type Shape as its only data member. When you build a “virtual constructor” scheme, you must exercise special care to ensure this pointer is always initialized to a live object.

The type enumeration inside class Shape and the requirement that the constructor for Shape be defined after all the derived classes are two of the restrictions of this method. Each time you derive a new subtype from Shape, you must go back and add the name for that type to the type enumeration. Then you must modify the Shape constructor to handle the new case. The disadvantage is you now have a dependency between the Shape class and all classes derived from it. However, the advantage is that the dependency that normally occurs in the body of the program (and possibly in more than one place, which makes it less maintainable) is isolated inside the class. In addition, this produces an effect like the “cheshire cat” technique in Chapter 2; in this case all the specific shape class definitions can be hidden inside the implementation files. That way, the base-class interface is truly the only thing the user sees.

In this example, the information you must hand the constructor about what type to create is very explicit: It’s an enumeration of the type. However, your scheme may use other information – for example, in a parser the output of the scanner may be handed to the “virtual constructor,” which then uses that text string to determine what exact token to create.
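As a rough sketch of the parser idea, the same envelope/letter arrangement can take a text string instead of an enumeration. Everything here is an assumption made for illustration: the class names Token, Number, and Identifier, the member kind( ), and the rule that a leading digit means a number. A real scanner would apply far richer rules.

```cpp
#include <cassert>
#include <cctype>
#include <string>

class Token {
  Token* letter;                  // Envelope's pointer to the real object
  Token(const Token&);            // Copying disabled, as in Shape
  Token& operator=(const Token&);
protected:
  Token() : letter(0) {}          // Used only when building a letter
public:
  explicit Token(const std::string& text);  // "Virtual" constructor
  virtual ~Token() { delete letter; }
  virtual std::string kind() const { return letter->kind(); }
};

class Number : public Token {
public:
  std::string kind() const { return "number"; }
};

class Identifier : public Token {
public:
  std::string kind() const { return "identifier"; }
};

// Defined after the derived classes, because it creates them.
Token::Token(const std::string& text) : letter(0) {
  assert(!text.empty());
  // Assumed rule: a leading digit means a number
  if(std::isdigit(static_cast<unsigned char>(text[0])))
    letter = new Number;
  else
    letter = new Identifier;
}
```

With these definitions, a Token built from "42" reports "number" from kind( ), and one built from "x1" reports "identifier"; the caller never names Number or Identifier.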

The “virtual constructor” Shape(type) can only be declared inside the class; it cannot be defined until after all the derived classes have been declared, because its body creates objects of those classes. The default constructor, however, can be defined inside class Shape, but it should be made protected so that client code cannot create a bare Shape (an “envelope” with no “letter”). This default constructor is only called by the constructors of derived-class objects. You are forced to explicitly create a default constructor because the compiler will create one for you automatically only if there are no constructors defined. Because you must define Shape(type), you must also define Shape( ).

The default constructor in this scheme has at least one very important chore – it must set the value of the S pointer to zero. This sounds strange at first, but remember that the default constructor will be called as part of the construction of the actual object – in Coplien’s terms, the “letter,” not the “envelope.” However, the “letter” is derived from the “envelope,” so it also inherits the data member S. In the “envelope,” S is important because it points to the actual object, but in the “letter,” S is simply excess baggage. Even excess baggage should be initialized, however, and if S is not set to zero by the default constructor called for the “letter,” bad things happen (as you’ll see later).

The “virtual constructor” takes as its argument information that completely determines the type of the object. Notice, though, that this type information isn’t read and acted upon until run-time, whereas normally the compiler must know the exact type at compile-time (one other reason this system effectively imitates virtual constructors).

Inside the virtual constructor there’s a switch statement that uses the argument to construct the actual object, which is then assigned to the pointer inside the “envelope.” After that, the construction of the “letter” is completed, so any virtual calls will be properly directed.

As an example, consider the call to draw( ) inside the virtual constructor. If you trace this call (either by hand or with a debugger), you can see that it starts in the draw( ) function in the base class, Shape. This function calls draw( ) through the “envelope’s” S pointer, which points to its “letter.” All types derived from Shape share the same interface, so this virtual call is properly executed, even though it seems to be in the constructor. (Actually, the constructor for the “letter” has already completed.) As long as all virtual calls in the base class simply make calls to the identical virtual function through the pointer to the “letter,” the system operates properly.

To understand how it works, consider the code in main( ). To fill the vector shapes, “virtual constructor” calls are made to Shape. Ordinarily in a situation like this, you would call the constructor for the actual type, and the VPTR for that type would be installed in the object. Here, however, the VPTR used in each case is the one for Shape, not the one for the specific Circle, Square, or Triangle.

In the for loop where the draw( ) function is called for each Shape, the virtual function call resolves, through the VPTR, to the corresponding type. However, this is Shape in each case. In fact, you might wonder why draw( ) was made virtual at all. The reason shows up in the next step: The base-class version of draw( ) makes a call, through the “letter” pointer S, to the virtual function draw( ) for the “letter.” This time the call resolves to the actual type of the object, not just the base class Shape. Thus the run-time cost of using “virtual constructors” is one more virtual call every time you make a virtual function call.
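One way to check this claim about the two VPTRs is with run-time type identification. The sketch below is a cut-down Shape with a single derived class; letterType( ) and envelopeIsShapeLetterIsCircle( ) are hypothetical helpers added only for the experiment and are not part of ShapeV.cpp.

```cpp
#include <cassert>
#include <typeinfo>

class Shape {
  Shape* S;
  Shape(const Shape&);             // Copying disabled, as in the listing
  Shape& operator=(const Shape&);
protected:
  Shape() : S(0) {}
public:
  enum type { tCircle };
  Shape(type t);                   // "Virtual" constructor
  virtual ~Shape() { delete S; }
  // Hypothetical helper: dynamic type of the "letter"
  const std::type_info& letterType() const {
    assert(S != 0);                // Only meaningful for an envelope
    return typeid(*S);
  }
};

class Circle : public Shape {};

Shape::Shape(type) : S(new Circle) {}

// True when the envelope is a plain Shape whose letter is a Circle
bool envelopeIsShapeLetterIsCircle() {
  Shape* p = new Shape(Shape::tCircle);
  bool result = typeid(*p) == typeid(Shape) &&
                p->letterType() == typeid(Circle);
  delete p;
  return result;
}
```

The function returns true: the object the client holds really is a Shape, and only the object behind S carries the derived type, which is why every client call costs two virtual dispatches.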

A remaining conundrum

In the “virtual constructor” scheme presented here, calls made to virtual functions inside destructors will be resolved in the normal way, at compile-time. You might think that you can get around this restriction by cleverly calling the base-class version of the virtual function inside the destructor. Unfortunately, this results in disaster. The base-class version of the function is indeed called. However, its this pointer is the one for the “letter” and not the “envelope.” So when the base-class version of the function makes its call – remember, the base-class versions of virtual functions always assume they are dealing with the “envelope” – it will go to the S pointer to make the call. For the “letter,” S is zero, set by the protected default constructor. Of course, you could prevent this by adding even more code to each virtual function in the base class to check that S is not zero. Or, you can follow two rules when using this “virtual constructor” scheme:

  1. Never explicitly call root class virtual functions from derived-class functions.
  2. Always redefine virtual functions defined in the root class.

The second rule results from the same problem as rule one. If you call a virtual function inside a “letter,” the function gets the “letter” this pointer. If the virtual call resolves to the base-class version, which will happen if the function was never redefined, then a call will be made through the “letter” this pointer to the “letter” S, which is again zero.
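The alternative mentioned above, checking S in every base-class virtual function, might look like the following sketch. The member describe( ), the placeholder text "(no letter)", and the function guardedCall( ) are assumptions made for illustration. Circle deliberately breaks rule 1 so that the guard has something to catch.

```cpp
#include <cassert>
#include <string>

class Shape {
  Shape* S;
  Shape(const Shape&);             // Copying disabled, as in the listing
  Shape& operator=(const Shape&);
protected:
  Shape() : S(0) {}
public:
  enum type { tCircle };
  Shape(type t);                   // "Virtual" constructor
  virtual ~Shape() { delete S; }
  virtual std::string describe() const {
    if(S == 0)                     // "this" is a letter: nothing to forward to
      return "(no letter)";
    return S->describe();
  }
};

class Circle : public Shape {
public:
  // Breaks rule 1 on purpose by calling the root-class version
  std::string describe() const {
    return "Circle, then " + Shape::describe();
  }
};

Shape::Shape(type t) : S(0) {
  assert(t == tCircle);
  S = new Circle;
}

std::string guardedCall() {
  Shape s(Shape::tCircle);
  return s.describe();
}
```

Here guardedCall( ) returns "Circle, then (no letter)". Without the test for zero, the inner Shape::describe( ) would dereference the letter's null S, which is the disaster the two rules are meant to prevent.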

You can see that there are costs, restrictions, and dangers to using this method, which is why you don’t want to have it as part of the programming environment all the time. It’s one of those features that’s very useful for solving certain types of problems, but it isn’t something you want to cope with all the time. Fortunately, C++ allows you to put it in when you need it, but the standard way is the safest and, in the end, easiest.
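For comparison, the simpler arrangement mentioned in the note at the start of this section (private constructors plus a maker function that forces every object onto the heap) could be sketched as follows. This is only one possible reading of that note: the static member make( ), the friend declarations, and name( ) are assumptions, and unlike the envelope scheme the client receives a pointer rather than a Shape object.

```cpp
#include <cassert>
#include <string>

class Shape {
protected:
  Shape() {}
public:
  enum type { tCircle, tSquare };
  static Shape* make(type t);      // Maker function (hypothetical name)
  virtual ~Shape() {}
  virtual std::string name() const = 0;
};

class Circle : public Shape {
  Circle() {}                      // Private: clients cannot construct one
  friend class Shape;              // Lets Shape::make() call the constructor
public:
  std::string name() const { return "Circle"; }
};

class Square : public Shape {
  Square() {}
  friend class Shape;
public:
  std::string name() const { return "Square"; }
};

Shape* Shape::make(type t) {
  Shape* s = 0;
  switch(t) {
    case tCircle: s = new Circle; break;
    case tSquare: s = new Square; break;
  }
  assert(s != 0);  // The object is complete here, so virtual calls are safe
  return s;
}
```

A client writes Shape* p = Shape::make(Shape::tSquare); and later delete p;. Because the object is fully built before make( ) returns, any setup that needs virtual calls can be done inside make( ) with ordinary dispatch and no second level of indirection.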

Destructor operation

The activities of destruction in this scheme are also tricky. To understand, let’s verbally walk through what happens when you call delete for a pointer to a Shape object (specifically, a Square) created on the heap. (This is more complicated than an object created on the stack.) This will be a delete through the polymorphic interface, as happens for each element of shapes when purge(shapes) is called in main( ).

Each pointer stored in shapes has the base-class type Shape*, so the compiler makes the call through Shape. Normally, you might say that it’s a virtual call, so Square’s destructor will be called. But with this “virtual constructor” scheme, the compiler is creating actual Shape objects, even though the constructor initializes the “letter” pointer to a specific type of Shape. The virtual mechanism is used, but the VPTR inside the Shape object is Shape’s VPTR, not Square’s. This resolves to Shape’s destructor, which calls delete for the “letter” pointer S, which actually points to a Square object. This is again a virtual call, but this time it resolves to Square’s destructor.

With a destructor, however, all destructors in the hierarchy must be called. Square’s destructor is called first, followed by any intermediate destructors, in order, until finally the base-class destructor is called. This base-class destructor has code that says delete S. When this destructor was called originally, it was for the “envelope” S, but now it’s for the “letter” S, which is there because the “letter” was inherited from the “envelope,” and not because it contains anything. So this call to delete should do nothing.

The solution to the problem is to make the “letter” S pointer zero. Then when the “letter” base-class destructor is called, you get delete 0, which by definition does nothing. Because the default constructor is protected, it will be called only during the construction of a “letter,” so that’s the only situation where S is set to zero.
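The walk-through above can be traced with a small sketch that records each destructor call in a string instead of printing it. The global trace, the labels written into it, and destroyOne( ) are hypothetical; the class layout follows ShapeV.cpp, reduced to a single derived class.

```cpp
#include <cassert>
#include <string>

std::string trace;  // Hypothetical log of destructor calls

class Shape {
  Shape* S;
  Shape(const Shape&);             // Copying disabled, as in the listing
  Shape& operator=(const Shape&);
protected:
  Shape() : S(0) {}                // Letters get a null S
public:
  enum type { tSquare };
  Shape(type t);                   // "Virtual" constructor
  virtual ~Shape() {
    trace += S ? "~Shape(envelope) " : "~Shape(letter) ";
    delete S;                      // For a letter this is delete 0: a no-op
  }
};

class Square : public Shape {
public:
  ~Square() { trace += "~Square "; }
};

Shape::Shape(type t) : S(0) {
  assert(t == tSquare);
  S = new Square;
}

// Creates one envelope on the heap, deletes it, and returns the log.
std::string destroyOne() {
  trace.clear();
  Shape* p = new Shape(Shape::tSquare);
  delete p;
  return trace;
}
```

The returned string is "~Shape(envelope) ~Square ~Shape(letter) ": the envelope's destructor starts first, its delete S runs Square's destructor, and the letter's inherited Shape destructor then finds S equal to zero and deletes nothing.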


[70] James O. Coplien, Advanced C++ Programming Styles and Idioms, Addison-Wesley, 1992.

© Copyright 1997-1999 CodeGuru