Experimental Methods in Computer Science
Exercise 3 – Measuring Trap Overhead
Goals
- Understand the factors affecting a microbenchmark
- Measure the overhead of a trap and get reliable results
Background
A basic characterization of a computer system is often done using
microbenchmarks.
These are short programs that are designed to measure a single well-defined
feature of the system.
For example, the lmbench
benchmark suite includes programs to measure
- memory latency and bandwidth
- inter-process communication bandwidth
- I/O bandwidth
- signal handling overhead
- process creation overhead
- ...and a few other things
To obtain precise results, the measurements are typically repeated multiple
times and averaged.
However, this leads to a risk of increased noise due to interrupts, and a risk
of measuring the wrong thing due to unknown compiler or hardware optimizations
(for example, when trying to measure memory latency we may end up measuring
cache latency instead).
It is therefore necessary to carefully design the benchmarks, and to ensure
they are measuring the right thing.
The specific measurement we will perform in this exercise is the overhead
involved in trapping into the kernel and returning to user mode.
We will measure this using several system calls that are expected to do very
little while actually in the kernel:
- Closing a file descriptor that was not open,
e.g. close(13).
- Obtaining the process ID with getpid().
- Writing one word to /dev/null.
For this, first use open("/dev/null",O_WRONLY) to get
a file descriptor, define an integer variable x, and
then measure the time to perform
write(fd,&x,sizeof(x)).
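As a minimal sketch, the three candidate calls could be wrapped as shown here. The wrapper names (call_close, call_getpid, call_write, open_devnull) are hypothetical and only serve the illustration. The sketch also assumes that descriptor 13 is really not open in your process, in which case close(13) fails with EBADF.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open /dev/null for writing; returns a file descriptor, or -1 on error. */
int open_devnull(void)
{
    return open("/dev/null", O_WRONLY);
}

/* Close a descriptor that is assumed NOT to be open (returns -1 in that case). */
int call_close(void)
{
    return close(13);
}

/* Obtain the process ID. */
pid_t call_getpid(void)
{
    return getpid();
}

/* Write one integer-sized word to fd; returns the number of bytes written. */
ssize_t call_write(int fd)
{
    int x = 0;
    return write(fd, &x, sizeof(x));
}
```

Note that wrapping the calls in functions adds call overhead of its own. In the actual measurement loop you may prefer to issue the system calls directly.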
Assignment
The assignment has 3 parts.
Part one is to characterize gettimeofday(), which we
will use to measure time.
Write a short program that calls gettimeofday
repeatedly, and then look at the intervals between the obtained values.
What can you say about the accuracy, precision, and resolution?
Think about how to write the best program for this purpose, and repeat the
measurement if you come up with new ideas.
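One possible starting point for part one is sketched below. It records readings into an array first and analyzes them afterwards, so that printing does not disturb the timing. The helper names (tv_to_usec, sample_clock, min_nonzero_delta) are hypothetical, and the smallest non-zero interval is only an empirical hint about the resolution, not a full answer to the questions above.

```c
#include <stddef.h>
#include <sys/time.h>

/* Convert a struct timeval to microseconds since the epoch. */
long long tv_to_usec(const struct timeval *tv)
{
    return (long long)tv->tv_sec * 1000000LL + tv->tv_usec;
}

/* Fill stamps[0..n-1] with back-to-back gettimeofday() readings.
 * No I/O happens inside the loop. */
void sample_clock(long long *stamps, size_t n)
{
    struct timeval tv;
    for (size_t i = 0; i < n; i++) {
        gettimeofday(&tv, NULL);
        stamps[i] = tv_to_usec(&tv);
    }
}

/* Smallest positive difference between consecutive readings.
 * Returns 0 if no positive difference was observed. */
long long min_nonzero_delta(const long long *stamps, size_t n)
{
    long long best = 0;
    for (size_t i = 1; i < n; i++) {
        long long d = stamps[i] - stamps[i - 1];
        if (d > 0 && (best == 0 || d < best))
            best = d;
    }
    return best;
}
```

A caller would allocate an array of, say, a few thousand entries, call sample_clock on it, and then examine the full distribution of intervals rather than only the minimum.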
Part two is to study how the loop structure used to perform a
measurement affects the result.
Specifically, you should compare measurements obtained using averages of 1, 10,
100, 1000, 10000, 100000, and 1000000 repetitions of the
close(13) system call.
Think about how to account for the loop overhead.
Consider the use of loop unrolling, and repeating measurements more than once.
It is crucial to take compiler optimizations into account here: use
gcc -S to generate assembly code, and inspect it to verify that
the compiler neither optimized your measurement away nor added spurious
instructions (do this just for the main measurement code, not the whole
program, to make it easier to identify what is going on).
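A minimal sketch of one such loop structure follows, assuming a simple for loop and a matching loop without the system call to estimate the loop overhead. The names now_usec, time_close_loop, time_empty_loop, and sink are hypothetical. The volatile variable is one common way to keep the compiler from discarding the loop body, but you should still check the generated assembly as described above.

```c
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Results are stored here so that the compiler cannot drop the loop body. */
volatile int sink;

/* Current time in microseconds, taken from gettimeofday(). */
long long now_usec(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000000LL + tv.tv_usec;
}

/* Average time per iteration (in microseconds) of reps calls to close(13).
 * Assumes reps > 0 and that descriptor 13 is not open. */
double time_close_loop(long reps)
{
    long long start = now_usec();
    for (long i = 0; i < reps; i++)
        sink = close(13);
    long long end = now_usec();
    return (double)(end - start) / (double)reps;
}

/* Same loop with the same volatile store but no system call,
 * as a rough estimate of the loop overhead per iteration. */
double time_empty_loop(long reps)
{
    long long start = now_usec();
    for (long i = 0; i < reps; i++)
        sink = (int)i;
    long long end = now_usec();
    return (double)(end - start) / (double)reps;
}
```

Under these assumptions, time_close_loop(reps) - time_empty_loop(reps) gives a rough per-call estimate for a given number of repetitions. With reps = 1 the result is dominated by the resolution of gettimeofday(), which is part of what this exercise asks you to observe.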
Part three is to compare the three system calls listed above.
Based on your data from part two, decide on a measurement scheme (i.e. how many
measurements to conduct and how to structure the loops),
and then measure all three system calls.
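If your scheme repeats the whole measurement several times, you will need some way to summarize the trials. The sketch below shows two common choices, the minimum and the median, which are less sensitive to occasional interrupts than the mean. These summaries are a suggestion rather than a required method, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison function for qsort() on doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Smallest value in v[0..n-1]; assumes n > 0. */
double trial_min(const double *v, size_t n)
{
    double m = v[0];
    for (size_t i = 1; i < n; i++)
        if (v[i] < m)
            m = v[i];
    return m;
}

/* Median of v[0..n-1]; sorts v in place; assumes n > 0. */
double trial_median(double *v, size_t n)
{
    qsort(v, n, sizeof v[0], cmp_double);
    if (n % 2 == 1)
        return v[n / 2];
    return 0.5 * (v[n / 2 - 1] + v[n / 2]);
}
```

Each entry of v would hold the per-call time obtained from one complete run of your measurement loop.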
Submit
Submit a single PDF file that contains all of the following information:
- Your names, logins, and IDs
- A short explanation of what you did and why, organized as answers to
the following questions:
- On what machine did you run your tests (machine name, CPU type, and
clock rate; use hostname and see
/proc/cpuinfo)?
- What were your considerations in writing the program to characterize
gettimeofday()?
- Your results pertaining to gettimeofday: what
can you say about its accuracy, precision, and resolution?
- Did you encounter any problems with compiler optimizations? What did you
do to avoid them?
- What were the results obtained for the trap overhead using different
numbers of repetitions?
If you used several loop structures, provide all the results in an organized
manner (but don't just create lots of meaningless combinations and cause clutter).
- What measurement scheme do you think produces the best results, and why?
- Your results of the trap overhead as measured by the three different
system calls.
If these results do not agree with each other,
- Try to explain the discrepancy.
- Which one do you trust the most as an estimate of the trap overhead, and
why?
- Any relevant data or graphs that you want to use to illustrate or
present your results.
When creating graphs, remember to label the axes, use a legend, etc. as needed.
- The program used to characterize gettimeofday()
and the program you used to measure the three system calls
(include the code listing at the end of the report; don't submit the code itself).
Submission deadline is
Monday morning, 7/3/11,
so I can give feedback in class on Tuesday.
Please do the exercise in pairs.
Remember, this is to allow you to collaborate and discuss how to get the best
solution.
Benchmarking is tricky, so thinking together can be a big help.