MFC Programmer's SourceBook : Thinking in C++
Bruce Eckel's Thinking in C++, 2nd Ed Contents | Prev | Next

Holding bits

Most of my computer education was in hardware-level design and programming, and I spent my first few years doing embedded systems development. Because C was a language that purported to be “close to the hardware,” I have always found it dismaying that there was no native binary representation for numbers. Decimal, of course, and hexadecimal (tolerable only because it’s easier to group the bits in your mind), but octal? Ugh. Whenever you read specs for chips you’re trying to program, they don’t describe the chip registers in octal, or even hexadecimal – they use binary. And yet C won’t let you say 0b0101101, which is the obvious solution for a language close to the hardware.

Although there’s still no native binary representation in C++, things have improved with the addition of two classes: bitset and vector<bool>, both of which are designed to manipulate a group of on-off values. The primary differences between these types are:

  1. The bitset holds a fixed number of bits. You establish the quantity of bits in the bitset template argument. The vector<bool> can, like a regular vector, expand dynamically to hold any number of bool values.
  2. The bitset is explicitly designed for performance when manipulating bits, and not as a “regular” container. As such, it has no iterators and it’s most storage-efficient when it contains an integral number of long values. The vector<bool>, on the other hand, is a specialization of a vector, and so has all the operations of a normal vector – the specialization is just designed to be space-efficient for bool.
There is no trivial conversion between a bitset and a vector<bool>, which implies that the two are for very different purposes.

bitset<n>

The template for bitset accepts an integral template argument which is the number of bits to represent. Thus, bitset<10> is a different type than bitset<20>, and you cannot perform comparisons, assignments, etc. between the two.

A bitset provides virtually any bit operation that you could ask for, in a very efficient form. However, each bitset is made up of an integral number of longs (typically 32 bits), so even though it uses no more space than it needs, it always uses at least the size of a long. This means you’ll use space most efficiently if you increase the size of your bitsets in chunks of the number of bits in a long. In addition, the only conversion from a bitset to a numerical value is to an unsigned long , which means that 32 bits (if your long is the typical size) is the most flexible form of a bitset.

The following example tests almost all the functionality of the bitset (the missing operations are redundant or trivial). You’ll see the description of each of the bitset outputs to the right of the output so that the bits all line up and you can compare them to the source values. If you still don’t understand bitwise operations, running this program should help.

//: C20:BitSet.cpp
// Exercising the bitset class
#include <iostream>
#include <bitset>
#include <cstdlib>
#include <ctime>
#include <string>
using namespace std;
const int sz = 32;
typedef bitset<sz> BS;

template<int bits>
bitset<bits> randBitset() {
  bitset<bits> r(rand());
  for(int i = 0; i < bits/16 - 1; i++) {
    r <<= 16;
    // "OR" together with a new lower 16 bits:
    r |= bitset<bits>(rand()); 
  }
  return r;
}  

int main() {
  srand(time(0));
  cout << "sizeof(bitset<16>) = " 
    << sizeof(bitset<16>) << endl;
  cout << "sizeof(bitset<32>) = " 
    << sizeof(bitset<32>) << endl;
  cout << "sizeof(bitset<48>) = " 
    << sizeof(bitset<48>) << endl;
  cout << "sizeof(bitset<64>) = " 
    << sizeof(bitset<64>) << endl;
  cout << "sizeof(bitset<65>) = " 
    << sizeof(bitset<65>) << endl;
  BS a(randBitset<sz>()), b(randBitset<sz>());
  // Converting from a bitset:
  unsigned long ul = a.to_ulong();
  string s = b.to_string();
  // Converting a string to a bitset:
  char* cbits = "111011010110111";
  cout << "char* cbits = " << cbits <<endl;
  cout << BS(cbits) << " [BS(cbits)]" << endl;
  cout << BS(cbits, 2) 
    << " [BS(cbits, 2)]" << endl;
  cout << BS(cbits, 2, 11)
    << " [BS(cbits, 2, 11)]" << endl;
  cout << a << " [a]" << endl;
  cout << b << " [b]"<< endl;
  // Bitwise AND:
  cout << (a & b) << " [a & b]" << endl;
  cout << (BS(a) &= b) << " [a &= b]" << endl;
  // Bitwise OR:
  cout << (a | b) << " [a | b]" << endl;
  cout << (BS(a) |= b) << " [a |= b]" << endl;
  // Exclusive OR:
  cout << (a ^ b) << " [a ^ b]" << endl;
  cout << (BS(a) ^= b) << " [a ^= b]" << endl;
  cout << a << " [a]" << endl; // For reference
  // Logical left shift (fill with zeros):
  cout << (BS(a) <<= sz/2) 
    << " [a <<= (sz/2)]" << endl;
  cout << (a << sz/2) << endl;
  cout << a << " [a]" << endl; // For reference
  // Logical right shift (fill with zeros):
  cout << (BS(a) >>= sz/2) 
    << " [a >>= (sz/2)]" << endl;
  cout << (a >> sz/2) << endl;
  cout << a << " [a]" << endl; // For reference
  cout << BS(a).set() << " [a.set()]" << endl;
  for(int i = 0; i < sz; i++)
    if(!a.test(i)) {
      cout << BS(a).set(i) 
        << " [a.set(" << i <<")]" << endl;
      break; // Just do one example of this
    }
  cout << BS(a).reset() << " [a.reset()]"<< endl;
  for(int j = 0; j < sz; j++)
    if(a.test(j)) {
      cout << BS(a).reset(j) 
        << " [a.reset(" << j <<")]" << endl;
      break; // Just do one example of this
    }
  cout << BS(a).flip() << " [a.flip()]" << endl;
  cout << ~a << " [~a]" << endl;  
  cout << a << " [a]" << endl; // For reference
  cout << BS(a).flip(1) << " [a.flip(1)]"<< endl;
  BS c;
  cout << c << " [c]" << endl;
  cout << "c.count() = " << c.count() << endl;
  cout << "c.any() = " 
    << (c.any() ? "true" : "false") << endl;
  cout << "c.none() = " 
    << (c.none() ? "true" : "false") << endl;
  c[1].flip(); c[2].flip();
  cout << c << " [c]" << endl;
  cout << "c.count() = " << c.count() << endl;
  cout << "c.any() = " 
    << (c.any() ? "true" : "false") << endl;
  cout << "c.none() = " 
    << (c.none() ? "true" : "false") << endl;
  // Array indexing operations:
  c.reset();
  for(int k = 0; k < c.size(); k++)
    if(k % 2 == 0)
      c[k].flip();
  cout << c << " [c]" << endl;
  c.reset();
  // Assignment to bool:
  for(int ii = 0; ii < c.size(); ii++)
    c[ii] = (rand() % 100) < 25;
  cout << c << " [c]" << endl;
  // bool test:
  if(c[1] == true) 
    cout << "c[1] == true"; 
  else 
    cout << "c[1] == false" << endl;
} ///:~ 

To generate interesting random bitsets, the randBitset( ) function is created. The Standard C rand( ) function only generates an int, so this function demonstrates operator<<= by shifting each 16 random bits to the left until the bitset (which is templatized in this function for size) is full. The generated number and each new 16 bits is combined using the operator|=.

The first thing demonstrated in main( ) is the unit size of a bitset. If it is less than 32 bits, sizeof produces 4 (4 bytes = 32 bits), which is the size of a single long on most implementations. If it’s between 32 and 64, it requires two longs, greater than 64 requires 3 longs, etc. Thus you make the best use of space if you use a bit quantity that fits in an integral number of longs. However, notice there’s no extra overhead for the object – it’s as if you were hand-coding to use a long.

Another clue that bitset is optimized for longs is that there is a to_ulong( ) member function that produces the value of the bitset as an unsigned long . There are no other numerical conversions from bitset, but there is a to_string( ) conversion that produces a string containing ones and zeros, and this can be as long as the actual bitset. However, using bitset<32> may make your life simpler because of to_ulong( ).

There’s still no primitive format for binary values, but the next best thing is supported by bitset: a string of ones and zeros with the least-significant bit (lsb) on the right. The three constructors demonstrated show taking the entire string (the char array is automatically converted to a string), the string starting at character 2, and the string from character 2 through 11. You can write to an ostream from a bitset using operator<< and it comes out as ones and zeros. You can also read from an istream using operator>> (not shown here).

You’ll notice that bitset only has three non-member operators: and ( &), or ( |) and exclusive-or ( ^). Each of these create a new bitset as their return value. All of the member operators opt for the more efficient &=, |=, etc. form where a temporary is not created. However, these forms actually change their lvalue (which is a in most of the tests in the above example). To prevent this, I created a temporary to be used as the lvalue by invoking the copy-constructor on a; this is why you see the form BS(a). The result of each test is printed out, and occasionally a is reprinted so you can easily look at it for reference.

The rest of the example should be self-explanatory when you run it; if not you can find the details in your compiler’s documentation or the other documentation mentioned earlier in this chapter.

vector<bool>

vector<bool> is a specialization of the vector template. A normal bool variable requires at least one byte, but since a bool only has two states the ideal implementation of vector<bool> is such that each bool value only requires one bit. This means the iterator must be specially-defined, and cannot be a bool*.

The bit-manipulation functions for vector<bool> are much more limited than those of bitset. The only member function that was added to those already in vector is flip( ), to invert all the bits; there is no set( ) or reset( ) as in bitset. When you use operator[ ] , you get back an object of type vector<bool>::reference, which also has a flip( ) to invert that individual bit.

//: C20:VectorOfBool.cpp
// Demonstrate the vector<bool> specialization
#include <iostream>
#include <sstream>
#include <vector>
#include <bitset>
#include <iterator>
using namespace std;

int main() {
  vector<bool> vb(10, true);
  vector<bool>::iterator it;
  for(it = vb.begin(); it != vb.end(); it++)
    cout << *it;
  cout << endl;
  vb.push_back(false);
  ostream_iterator<bool> out (cout, "");
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  bool ab[] = { true, false, false, true, true, 
    true, true, false, false, true };
  // There's a similar constructor:
  vb.assign(ab, ab + sizeof(ab)/sizeof(bool));
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  vb.flip(); // Flip all bits
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  for(int i = 0; i < vb.size(); i++)
    vb[i] = 0; // (Equivalent to "false")
  vb[4] = true;
  vb[5] = 1;
  vb[7].flip(); // Invert one bit
  copy(vb.begin(), vb.end(), out);
  cout << endl;
  // Convert to a bitset:
  ostringstream os;
  copy(vb.begin(), vb.end(), 
    ostream_iterator<bool>(os, ""));
  bitset<10> bs(os.str());
  cout << "Bitset:\n" << bs << endl;
} ///:~ 

The last part of this example takes a vector<bool> and converts it to a bitset by first turning it into a string of ones and zeros. Of course, you must know the size of the bitset at compile-time. You can see that this conversion is not the kind of operation you’ll want to do on a regular basis.

Contents | Prev | Next


Go to CodeGuru.com
Contact: webmaster@codeguru.com
© Copyright 1997-1999 CodeGuru