MFC Programmer's SourceBook : Thinking in C++
Bruce Eckel's Thinking in C++, 2nd Ed Contents | Prev | Next

Object creation

When a C++ object is created, two events occur:

  1. Storage is allocated for the object.
  2. The constructor is called to initialize that storage.
By now you should believe that step two always happens. C++ enforces it because uninitialized objects are a major source of program bugs. It doesn’t matter where or how the object is created – the constructor is always called.

Step one, however, can occur in several ways, or at alternate times:

  1. Storage can be allocated before the program begins, in the static storage are
a. This storage exists for the life of the program.

  1. Storage can be created on the stack
whenever a particular execution point is reached (an opening brace). That storage is released automatically at the complementary execution point (the closing brace). These stack-allocation operations are built into the instruction set of the processor and are very efficient. However, you have to know exactly how much storage you need when you’re writing the program so the compiler can generate the right code.

  1. Storage can be allocated from a pool of memory called the heap
(also known as the free store). This is called dynamic memory allocation. To allocate this memory, a function is called at run-time; this means you can decide at any time that you want some memory and how much you need. You are also responsible for determining when to release the memory, which means the lifetime of that memory can be as long as you choose – it isn’t determined by scope.

Often these three regions are placed in a single contiguous piece of physical memory: the static area, the stack, and the heap (in an order determined by the compiler writer). However, there are no rules. The stack may be in a special place, and the heap may be implemented by making calls for chunks of memory from the operating system. As a programmer, these things are normally shielded from you, so all you need to think about is that the memory is there when you call for it.

C’s approach to the heap

To allocate memory dynamically at run-time, C provides functions in its standard library: malloc( ) and its variants calloc( ) and realloc( ) to produce memory from the heap, and free( ) to release the memory back to the heap. These functions are pragmatic but primitive and require understanding and care on the part of the programmer. To create an instance of a class on the heap using C’s dynamic memory functions, you’d have to do something like this:

//: C13:Malclass.cpp
// Malloc with class objects
// What you'd have to do if not for "new"
#include <cstdlib> // Malloc() & free()
#include <cstring> // Memset()
#include <iostream>
#include "../require.h"
using namespace std;

class Obj {
  int i, j, k;
  enum { sz = 100 };
  char buf[sz];
public:
  void initialize() { // Can't use constructor
    cout << "initializing Obj" << endl;
    i = j = k = 0;
    memset(buf, 0, sz);
  }
  void destroy() { // Can't use destructor
    cout << "destroying Obj" << endl;
  }
};

int main() {
  Obj* obj = (Obj*)malloc(sizeof(Obj));
  require(obj != 0);
  obj->initialize();
  // ... sometime later:
  obj->destroy();
  free(obj);
} ///:~ 

You can see the use of malloc( ) to create storage for the object in the line:

Obj* obj = (Obj*)malloc(sizeof(Obj));

Here, the user must determine the size of the object (one place for an error). malloc( ) returns a void* because it’s just a patch of memory, not an object. C++ doesn’t allow a void* to be assigned to any other pointer, so it must be cast.

Because malloc( ) may fail to find any memory (in which case it returns zero), you must check the returned pointer to make sure it was successful.

But the worst problem is this line:

Obj->initialize();

If they make it this far correctly, users must remember to initialize the object before it is used. Notice that a constructor was not used because the constructor cannot be called explicitly – it’s called for you by the compiler when an object is created. The problem here is that the user now has the option to forget to perform the initialization before the object is used, thus reintroducing a major source of bugs.

It also turns out that many programmers seem to find C’s dynamic memory functions too confusing and complicated; it’s not uncommon to find C programmers who use virtual memory machines allocating huge arrays of variables in the static storage area to avoid thinking about dynamic memory allocation. Because C++ is attempting to make library use safe and effortless for the casual programmer, C’s approach to dynamic memory is unacceptable.

operator new

The solution in C++ is to combine all the actions necessary to create an object into a single operator called new. When you create an object with new (using a new-expression), it allocates enough storage on the heap to hold the object, and calls the constructor for that storage. Thus, if you say

MyType *fp = new MyType(1,2);

at run-time, the equivalent of malloc(sizeof(MyType)) is called (often, it is literally a call to malloc( )), and the constructor for MyType is called with the resulting address as the this pointer, using (1,2) as the argument list. By the time the pointer is assigned to fp, it’s a live, initialized object – you can’t even get your hands on it before then. It’s also automatically the proper MyType type so no cast is necessary.

The default new also checks to make sure the memory allocation was successful before passing the address to the constructor, so you don’t have to explicitly determine if the call was successful. Later in the chapter you’ll find out what happens if there’s no memory left.

You can create a new-expression using any constructor available for the class. If the constructor has no arguments, you can make the new-expression without the constructor argument list:

MyType *fp = new MyType;

Notice how simple the process of creating objects on the heap becomes – a single expression, with all the sizing, conversions, and safety checks built in. It’s as easy to create an object on the heap as it is on the stack.

operator delete

The complement to the new-expression is the delete-expression, which first calls the destructor and then releases the memory (often with a call to free( )). Just as a new-expression returns a pointer to the object, a delete-expression requires the address of an object.

delete fp;

cleans up the dynamically allocated MyType object created earlier.

delete can be called only for an object created by new. If you malloc( ) (or calloc( ) or realloc( )) an object and then delete it, the behavior is undefined. Because most default implementations of new and delete use malloc( ) and free( ), you’ll probably release the memory without calling the destructor.

If the pointer you’re deleting is zero, nothing will happen. For this reason, people often recommend setting a pointer to zero immediately after you delete it, to prevent deleting it twice. Deleting an object more than once is definitely a bad thing to do, and will cause problems.

A simple example

This example shows that the initialization takes place:

//: C13:Newdel.cpp
// Simple demo of new & delete
#include <iostream>
using namespace std;

class Tree {
  int height;
public:
  Tree(int height) {
    height = height;
  }
  ~Tree() { cout << "*"; }
  friend ostream&
  operator<<(ostream& os, const Tree* t) {
    return os << "Tree height is: "
              << t->height << endl;
  }
};

int main() {
  Tree* t = new Tree(40);
  cout << t;
  delete t;
} ///:~ 

We can prove that the constructor is called by printing out the value of the Tree. Here, it’s done by overloading the operator<< to use with an ostream. Note, however, that even though the function is declared as a friend, it is defined as an inline! This is a mere convenience – defining a friend function as an inline to a class doesn’t change the friend status or the fact that it’s a global function and not a class member function. Also notice that the return value is the result of the entire output expression, which is itself an ostream& (which it must be, to satisfy the return value type of the function).

Memory manager overhead

When you create auto objects on the stack, the size of the objects and their lifetime is built right into the generated code, because the compiler knows the exact quantity and scope. Creating objects on the heap involves additional overhead, both in time and in space. Here’s a typical scenario. (You can replace malloc( ) with calloc( ) or realloc( ).)

  1. You call malloc( ), which requests a block of memory from the pool. (This code may actually be part of malloc( ).)
  2. The pool is searched for a block of memory large enough to satisfy the request. This is done by checking a map or directory of some sort that shows which blocks are currently in use and which blocks are available. It’s a quick process, but it may take several tries so it might not be deterministic – that is, you can’t necessarily count on malloc( 
) always taking exactly the same amount of time.

  1. Before a pointer to that block is returned, the size and location of the block must be recorded so further calls to malloc( ) won’t use it, and so that when you call free( 
), the system knows how much memory to release.

The way all this is implemented can vary widely. For example, there’s nothing to prevent primitives for memory allocation being implemented in the processor. If you’re curious, you can write test programs to try to guess the way your malloc( ) is implemented. You can also read the library source code, if you have it.

Contents | Prev | Next


Go to CodeGuru.com
Contact: webmaster@codeguru.com
© Copyright 1997-1999 CodeGuru