If
you don’t know how many objects you’re going to need to solve a
particular problem, or how long they will last, you also don’t know how
to store those objects. How can you know how much space to create? You
can’t, since that information isn’t known until run time.
The
solution to most problems in object-oriented design seems flippant: you create
another type of object. For the storage problem, the new type of object holds
other objects, or pointers to objects. Of course, you can do the same thing
with an array, but there’s more. This new type of object, which is
typically referred to in C++ as a
container
(also called a
collection
in some languages), will expand itself whenever necessary to accommodate
everything you place inside it. So you don’t need to know how many
objects you’re going to hold in a collection. You just create a
collection object and let it take care of the details.
Fortunately,
a good OOP language comes with a set of containers as part of the package. In
C++, it’s the Standard Template Library (STL). In some libraries, a
generic container is considered good enough for all needs, and in others (C++
in particular) the library has different types of containers for different
needs: a vector for consistent access to all elements, and a linked list for
consistent insertion at all elements, for example, so you can choose the
particular type that fits your needs. These may include sets, queues, hash
tables, trees, stacks, etc.
All
containers have some way to put things in and get things out. The way that you
place something into a container is fairly obvious. There’s a function
called “push” or “add” or a similar name. Fetching
things out of a container is not always as apparent; if it’s an
array-like entity such as a vector, you might be able to use an indexing
operator or function. But in many situations this doesn’t make sense.
Also, a single-selection function is restrictive. What if you want to
manipulate or compare a group of elements in the container?
The
solution is an
iterator,
which is an object whose job is to select the elements within a container and
present them to the user of the iterator. As a class, it also provides a level
of abstraction. This abstraction can be used to separate the details of the
container from the code that’s accessing that container. The container,
via the iterator, is abstracted to be simply a sequence. The iterator allows
you to traverse that sequence without worrying about the underlying structure
– that is, whether it’s a vector, a linked list, a stack or
something else. This gives you the flexibility to easily change the underlying
data structure without disturbing the code in your program.
From
the design standpoint, all you really want is a sequence that can be
manipulated to solve your problem. If a single type of sequence satisfied all
of your needs, there’d be no reason to have different kinds. There are
two reasons that you need a choice of containers. First, containers provide
different types of interfaces and external behavior. A stack has a different
interface and behavior than that of a queue, which is different than that of a
set or a list. One of these might provide a more flexible solution to your
problem than the other. Second, different containers have different
efficiencies for certain operations. The best example is a vector and a list.
Both are simple sequences that can have identical interfaces and external
behaviors. But certain operations can have radically different costs. Randomly
accessing elements in a vector is a constant-time operation; it takes the same
amount of time regardless of the element you select. However, in a linked list
it is expensive to move through the list to randomly select an element, and it
takes longer to find an element if it is further down the list. On the other
hand, if you want to insert an element in the middle of a sequence, it’s
much cheaper in a list than in a vector. These and other operations have
different efficiencies depending upon the underlying structure of the sequence.
In the design phase, you might start with a list and, when tuning for
performance, change to a vector. Because of the abstraction via iterators, you
can change from one to the other with minimal impact on your code.
In
the end, remember that a container is only a storage cabinet to put objects in.
If that cabinet solves all of your needs, it doesn’t really matter
how
it is implemented (a basic concept with most types of objects). If you’re
working in a programming environment that has built-in overhead due to other
factors, then the cost difference between a vector and a linked list might not
matter. You might need only one type of sequence. You can even imagine the
“perfect” container abstraction, which can automatically change its
underlying implementation according to the way it is used.
STL
reference documentation
You
will notice that this chapter does not contain exhaustive documentation
describing each of the member functions in each STL container. Although I
describe the member functions that I use, I’ve left the full descriptions
to others: there are at least two very good on-line sources of STL
documentation in HTML format that you can keep resident on your computer and
view with a Web browser whenever you need to look something up. The first is
the Dinkumware library (which covers the entire Standard C and C++ library)
mentioned at the beginning of this book section (page XXX). The second is the
freely-downloadable SGI STL and documentation, freely downloadable at
http://www.sgi.com/Technology/STL/. These should provide complete references
when you’re writing code. In addition, the STL books listed in Appendix
XX will provide you with other resources.