Elimination of the definition block

In C (prior to the C99 standard), you must always define all the variables at
the beginning of a block, after the
opening brace. This is not an uncommon requirement in programming languages,
and the reason given has always been that it’s “good programming
style.” On this point, I have my suspicions. It has always seemed
inconvenient to me, as a programmer, to pop back to the beginning of a block
every time I need a new variable. I also find code more readable when the
variable definition is close to its point of use. Perhaps
these arguments are stylistic. In C++, however, there’s a significant
problem in being forced to define all objects at the beginning of a scope. If a
constructor exists, it must be called when the object is created. However, if
the constructor takes one or more initialization arguments, how do you know you
will have that initialization information at the beginning of a scope? In the
general programming situation, you won’t. Because C has no concept of private,
this separation of definition and initialization is no problem. However, C++
guarantees that when an object is created, it is simultaneously initialized.
This ensures you will have no uninitialized objects running around in your
system. C doesn’t care; in fact, C encourages this practice by requiring you to
define variables at the beginning of a block before you necessarily have the
initialization information.

Generally, C++ will not allow you to create an object before you have the initialization
information for the constructor, so you don’t have to define variables at
the beginning of a scope. In fact, the style of the language would seem to
encourage the definition of an object as close to its point of use as possible.
In C++, any rule that applies to an “object” automatically refers
to an object of a built-in type, as well. This means that any class object or
variable of a built-in type can also be defined at any point in a scope. It
also means that you can wait until you have the information for a variable
before defining it, so you can always define and initialize at the same time:
//: C06:DefineInitialize.cpp
// Defining variables anywhere
#include <cstdio>
#include <cstdlib>
#include "../require.h"
using namespace std;

class G {
  int i;
public:
  G(int ii);
};

G::G(int ii) { i = ii; }

int main() {
  #define SZ 100
  char buf[SZ];
  printf("initialization value? ");
  // fgets() is used instead of gets(), which cannot limit the
  // input length and has been removed from Standard C and C++
  char* retval = fgets(buf, SZ, stdin);
  require(retval != 0);
  int x = atoi(buf);
  int y = x + 3;
  G g(y);
} ///:~

You can see that buf is defined, then some code is executed, then x is defined
and initialized using a function call, then y and g are defined. C (before the
C99 standard) would not allow a variable to be defined anywhere except at the
beginning of the scope.

Generally, you should define variables as close to their point of use as possible, and
always initialize them when they are defined. (This is a stylistic suggestion
for built-in types, where initialization is optional.) This is a safety issue.
By reducing the duration of the variable’s availability within the scope,
you are reducing the chance it will be misused in some other part of the scope.
In addition, readability is improved because the reader doesn’t have to
jump back and forth to the beginning of the scope to know the type of a variable.
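
As a minimal sketch of this advice, the two functions below compute the same
result. The names sum_c_style and sum_cpp_style are hypothetical and do not
come from the example above. The first uses a C-style definition block; the
second defines each variable where it is first needed and initializes it in
the same statement.

```cpp
#include <cstddef>

// C-style: everything is defined up front, so total and i are
// visible (and at first uninitialized) for the whole function.
int sum_c_style(const int* data, std::size_t n) {
  int total;
  std::size_t i;
  total = 0;
  for(i = 0; i < n; i++)
    total += data[i];
  return total;
}

// C++ style: each variable is defined at its point of use
// and initialized as part of its definition.
int sum_cpp_style(const int* data, std::size_t n) {
  int total = 0;
  for(std::size_t i = 0; i < n; i++)
    total += data[i];
  return total;
}
```

In the second version there is no stretch of code in which total or i exists
without a meaningful value, and i cannot be touched once the loop is done.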

for loops

In C++, you will often see a for loop counter defined right inside the for
expression:
for(int j = 0; j < 100; j++) {
  printf("j = %d\n", j);
}
for(int i = 0; i < 100; i++)
  printf("i = %d\n", i);

The above statements are important special cases, which cause confusion to new
C++ programmers.

The variables i and j are defined directly inside the for expression (which you
cannot do in C before the C99 standard). They are then available for use in the
for loop. It’s a very convenient syntax because the context removes all
question about the purpose of i and j, so you don’t need to use such ungainly
names as i_loop_counter for clarity.

The problem is the lifetime of the variables, which was formerly determined by
the enclosing scope. This was a design decision made from a compiler writer’s
view of what is logical; as a programmer, you obviously intend i to be used
only inside the statement(s) of the for loop. Unfortunately, however, if you
previously took this approach and said
for(int i = 0; i < 100; i++)
  printf("i = %d\n", i);
// ....
for(int i = 0; i < 100; i++){
  printf("i = %d\n", i);
}
(with or without curly braces) within the same scope, compilers written for the
old specification gave you a multiple-definition error for i. The Standard C++
specification says that the lifetime of a loop counter defined within the
control expression of a for loop lasts only until the end of the statement the
loop controls, so the above statements will work with a conforming compiler.
Watch out, though, for local variables that hide variables in the enclosing
scope.
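
The following sketch, which assumes a Standard-conforming compiler, illustrates
both points. The function names sum_two_loops and outer_value_after_loop are
hypothetical. The first function reuses the name i in two loops within one
scope; the second shows a loop counter hiding a variable of the same name in
the enclosing scope.

```cpp
// Two loops in the same scope, each defining its own counter i.
int sum_two_loops() {
  int sum = 0;
  for(int i = 0; i < 3; i++)
    sum += i;
  for(int i = 0; i < 3; i++) { // Not a redefinition: the first i is gone
    sum += i;
  }
  return sum;
}

// The loop counter hides the outer i, but only inside the loop.
int outer_value_after_loop() {
  int i = 42;
  for(int i = 0; i < 3; i++) {
    // Only the loop's own i is visible here
  }
  return i; // The outer i, untouched by the loop
}
```

Hiding is legal, so the compiler is not required to complain about it; the
outer i simply becomes unreachable by name inside the loop, which is easy to
overlook in a long function.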

I find small scopes an indicator of good design. If you have several pages for
a single function, perhaps you’re trying to do too much with that function.
More granular functions are not only more useful, they are also easier to
debug.

Storage allocation

A variable can now be defined at any point in a scope, so it might seem
initially that the storage for a variable may not be allocated until its point
of definition. It’s more likely that the compiler will follow the practice in C
of allocating all the storage for a block at the opening brace of that block.
It doesn’t matter because, as a programmer, you can’t get the storage (a.k.a.
the object) until it has been defined. Although the storage is allocated at the
beginning of the block, the constructor call doesn’t happen until the sequence
point where the object is defined because the identifier isn’t available until
then. The compiler even checks to make sure you don’t put the object definition
(and thus the constructor call) where execution only conditionally passes
through its sequence point, such as in a switch statement or somewhere a goto
can jump past it. Uncommenting the statements in the following code will
generate a warning or an error:
//: C06:Nojump.cpp {O}
// Can't jump past constructors
class X {
public:
  X() {}
};

void f(int i) {
  if(i < 10) {
    //! goto jump1; // Error: goto bypasses init
  }
  X x1;  // Constructor called here
 jump1:
  switch(i) {
    case 1 :
      X x2;  // Constructor called here
      break;
  //! case 2 : // Error: case bypasses init
      X x3;  // Constructor called here
      break;
  }
} ///:~

In the above code, both the goto and the switch can potentially jump past the
sequence point where a constructor is called. That object will then be in scope
even if the constructor hasn’t been called, so the compiler gives an error
message. This once again guarantees that an object cannot be created unless it
is also initialized.
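
One common way to define objects inside a switch without tripping this check is
to give each case its own block, so that no case label sits inside the scope of
an object defined under an earlier label. The sketch below is an illustration
under that assumption; the class Logged and the function trace are
hypothetical. Each constructor appends a character to a log, which also shows
that a constructor runs at the object's point of definition rather than at the
opening brace of the enclosing block.

```cpp
#include <string>

// Hypothetical class whose constructor records that it ran
class Logged {
public:
  Logged(std::string& log, char tag) { log += tag; }
};

std::string trace(int i) {
  std::string log;
  log += 's';              // Runs before any Logged constructor
  Logged a(log, 'a');      // Constructor called here
  switch(i) {
    case 1 : {
      Logged b(log, 'b');  // b's scope ends at the closing brace
      break;
    }
    case 2 : {
      Logged c(log, 'c');  // Jumping here bypasses nothing in scope
      break;
    }
  }
  return log;
}
```

Calling trace(1) yields "sab", trace(2) yields "sac", and any other argument
yields "sa". The 's' always comes first, even though storage for a may well
have been set aside when the function was entered.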

All the storage allocation discussed here happens, of course, on the stack. The
storage is allocated by the compiler by moving the stack pointer “down” (a
relative term, which may indicate an increase or decrease of the actual stack
pointer value, depending on your machine). Objects can also be allocated on the
heap, but that’s the subject of Chapter 11.