The Feitelson 1996 Model
This model is based on observations from 6 accounting logs from
production parallel machines:
- The 128-node iPSC/860 at NASA Ames
- The 128-node IBM SP1 at Argonne
- The 400-node Paragon at SDSC
- The 126-node Butterfly at LLNL
- The 512-node IBM SP2 at CTC
- The 96-node Paragon at ETH, Zurich
In the first 3 cases the logs were analyzed directly, and in the
others published observations were used.
The goal was to produce a workload model that reflects real workloads,
and can be used to drive simulations of parallel schedulers.
Thus the emphasis was on capturing various features of the workloads,
at the expense of mathematical niceties.
In particular, the model captures the discrete nature of the
distribution of degrees of parallelism, the correlation between
parallelism and runtime, and the repeated executions of the same
jobs.
Degrees of Parallelism
Two conspicuous effects were observed in the logs and modeled.
One is the dominance of small jobs over large ones: all logs had a
significant number of serial jobs, and many others using only less
than 10 nodes, even on machines with hundreds of nodes.
The second is the discrete nature of the distribution, with some sizes
preferred over others.
The most popular sizes by far are powers of two, even on machines
where there is no architectural reason for using such sizes.
The model for sizes is based on a harmonic distribution of order 1.5
(the probability for size n is proportional to
1/n1.5), which is then hand-tailored to further
emphasize small sizes and powers of two.
Correlation of Runtime with Parallelism
Two observations guided the modeling of runtimes.
One is the well-known fact that the variability of runtimes is high,
such that the coefficient of variation of the distribution is larger
than one (the coefficient of variation is the quotient of the standard
deviation divided by the mean; for an exponential distribution it is
1).
The other is that there is a weak correlation between size (that is,
degree of parallelism) and length (runtime).
While there are small jobs that run for a long time and large jobs
that run for a short time, the tendancy is for large jobs to run
longer.
The model for runtimes is a hyperexponential distribution, so that the
coefficient of variation is larger than one.
In addition, a linear relation is made between the job size and the
probability of using the exponential with the higher mean.
Thus there is actually a different distribution for each size, and
the mean runtime for larger sizes is higher.
Repeated Executions
It is often assumed that jobs are statistically equivalent and
independent of each other.
Analysis of the logs shows that this is in fact not always so.
In particular, there are many cases where the same user runs the same
job repeatadly for many times, creating a sequence of jobs with
similar properties.
Defining the runlength of a job to be the number of repeated
executions, it was found that runlegths come from a Zipf distribution.
The model was therefore that the probability of a runlength of
n is proportional to n-2.5.
An important consequence of this feature is that this model captures a
degree of feedback from the scheduler to the workload.
Specifically, when a job is repeated n times, each repetition
is submitted when the previous one terminates.
Thus if the scheduler delays one repetition, it causes a corresponding
delay in the submital of the next one.
Arrival Process
The arrival process used in this model is Poisson.
However, the fact that each job may be repeated multiple times changes
this, because the repetitions are generated immediately after the
previous run completes.
They do not wait a randomly chosen interarrival time.
Two Versions
The availabe code actually includes two versions: the original version
from 1996, and an improved one from 1997.
The differences are in the details of the distributions.
In particular, the original model the runtimes were in arbitrary units
with a mean of around 2, whereas in the improved model they were meant
to reflect seconds.
It is recomended that the 1997 version be used.
Parallel Workloads Archive - Models