Experimental Methods in Computer Science – Exercise 12

Experimental Methods in Computer Science

Exercise 12 – Bin Packing Algorithms

Goals

Find which is better: first fit or best fit
...and under what conditions

Background

Bin packing is one of the most studied problems in computer science. The on-line version proceeds as follows. You are given a suply of unit-size bins, and a sequence of items that need to be packed into these bins. Items have a size in the range 0 to 1. As each item arrives, you need to decide which bin to pack it into, without knowing what items will appear in the future. The items packed into the same bin must not violate the bin capacity, i.e. the sum of their sizes is bounded by 1. Your goal is to use as few bins as possible.

When a new item arrives, first fit scans all the bins that have been used so far from the first to the last, to see if the new item fits in any of them; it is mapped to the first one where it fits. If it does not fit into any available bin, a new bin is started for it.

Best fit is similar; the difference is that it always scans all the current bins, and selects the one in which the fit is snuggest (i.e. the leftover space in the bin will be the smallest).

Two other possible algorithms are worst fit and next fit. Worst fit is the opposite of best fit: it scans all the bins, but chooses the one that leaves the most space available after the allocation. Next fit is more like first fit, except that it treats the bins as a cycle, and always starts with the bin following the one in which the previous allocation was made, instead of starting from bin number 1.

Assignment

You need to measure the performance of first fit and best fit (the other two are optional if you do the exercise on time, and required if you want an extra few weeks due to exams pressure). The first assignment is therefore to implement them.
Given working bin packing algorithms, measure their performance as a function of how many items are being packed, and what distribution they come from.
Use the following 3 distributions:
- Uniform on 0 to 1
- Exponential with mean 1, but truncated at 1 because we can't pack bigger items in unit-size bins. Note that due to the truncation the actual mean will be lower.
- Exponential with mean 0.1, also truncated at 1.
Do this for different numbers of items n, ranging from a few to very many (you define the exact range). For the truncated exponential distributions use C*n items, where C compensates for the fact that the distribution's mean is lower than 0.5 and that only a fraction of the samples are used. For the exponential with mean 1 C comes out about 1.9, and for the exponential with mean 0.1 it will be around 5.
The metric to use is the excess capacity requirement (or relative fragmentation), i.e. the ratio of the capacities of all the bins you needed divided by the total sizes of the packed items.
Repeat the measurement several times to note the effect of the random sampling on the results. Make sure that each repetition uses a different seed!

Submit

Submit a single pdf file that contains all the following information:

Your names, logins, and IDs.
Your implementations of the bin packing algorithms.
Results of the experiments, detailing the effect of n and of the distribution. Use graphs as appropriate.
Your conclusion: which algorithm is better and when.

The official submission deadline is Tuesday morning, 30/6/09, but you can get an extension of a few weeks if you need due to exam pressure. If you submit late, send me an email about it so I don't forget to check it.

Please do the exercise in pairs.

To the course home page