Topics in Performance Evaluation – Exercise 1

Exercise 1 – Plotting Research Results

Looking at data is a basic prerequisite in performance evaluation. It is used both to characterize measured system conditions and to analyze evaluation results. In this exercise we'll do the latter.

The Data

The following table contains data about the performance of I/O operations in virtual machines on a certain cloud platform (it is taken from a paper by Zaharia et al. published at OSDI 2008). It gives the write bandwidth achieved at different levels of load, where load is the number of virtual machines on the host. It also shows how common the different load levels were: the column labeled "VMs" is the number of VMs that experienced this load level, and the performance is given as the average and standard deviation over this set of VMs.

Load Level    VMs    Write Perf (MB/s)    Std Dev (MB/s)
1 VMs/host    202    61.8                 4.9
2 VMs/host    264    56.5                 10.0
3 VMs/host    201    53.6                 11.2
4 VMs/host    140    46.4                 11.9
5 VMs/host    45     34.2                 7.9
6 VMs/host    12     25.4                 2.5
7 VMs/host    7      24.8                 0.9
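
Before plotting, it can help to have the table in machine-readable form and to derive a few quantities from it. The following is a minimal Python sketch, not part of the original data set: the variable names are illustrative, and the standard error of the mean assumes that the VMs at each load level are independent samples, which the table does not state.

```python
import math

# (VMs per host, number of VMs observed, mean write bandwidth [MB/s], std dev [MB/s])
# Values copied from the table above.
TABLE = [
    (1, 202, 61.8, 4.9),
    (2, 264, 56.5, 10.0),
    (3, 201, 53.6, 11.2),
    (4, 140, 46.4, 11.9),
    (5, 45, 34.2, 7.9),
    (6, 12, 25.4, 2.5),
    (7, 7, 24.8, 0.9),
]

total_vms = sum(count for _, count, _, _ in TABLE)

# Average bandwidth over all VMs, weighting each load level by its VM count.
weighted_mean = sum(count * mean for _, count, mean, _ in TABLE) / total_vms

print(f"total VMs: {total_vms}")
print(f"count-weighted mean write bandwidth: {weighted_mean:.1f} MB/s")

for load, count, mean, std in TABLE:
    share = 100.0 * count / total_vms
    # Standard error of the mean; assumes independent samples (an assumption).
    sem = std / math.sqrt(count)
    print(f"{load} VMs/host: {share:5.1f}% of VMs, "
          f"{mean:.1f} +/- {std:.1f} MB/s (SEM {sem:.2f})")
```

The share of VMs at each load level and the size of the error relative to the mean are the kind of secondary information a graph may or may not need to convey.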

Making Graphs

The most important thing to do when plotting data is to decide what you want to show and how. Then there's the technical issue of actually doing it.

The simplest tool for many people is Excel. You may use Excel. However, be warned that Excel has pretty bad defaults for lots of things, and excuses like "this is how Excel did it" will not be accepted. Example problems are the use of line plots instead of X-Y plots and limitations with logarithmic axes. Another is the default color scheme, although this was improved in the latest version.

It may be worth your time to learn to use some other graphics package. Commonly used free packages include gnuplot and ploticus. R is a full statistical analysis environment with good graphics capabilities. Or you could just use MATLAB.

In this exercise and in all future exercises strong emphasis will be placed on graphical excellence. This means you should take care of the following:

  1. Scales should be appropriate to show the data clearly, without misleading the reader. If relevant, consider using logarithmic scales or axis breaks.
  2. The axes should be labeled and the units included in square brackets. The only exception for units is when you are plotting a pure number, e.g. a count.
  3. The available space should be used efficiently (that is, avoid situations where the graph occupies only a small part of the plotting area, unless you have a good reason related to the story that the graph is trying to convey).
  4. Colors and shapes should be used intelligently to make connections as appropriate. Avoid situations where a line is plotted in light yellow on white background, or lines are hard to distinguish from each other.
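
As an illustration of how the four points above might translate into code, here is a minimal sketch using Python with matplotlib (one possible tool, not one named in this exercise). The function names `axis_label` and `plot_bandwidth` are hypothetical helpers introduced for this example, and the specific choices (a y axis starting at zero, error bars showing one standard deviation) are assumptions, not requirements.

```python
def axis_label(quantity, unit=None):
    """Return 'quantity [unit]', or just the quantity for a pure number."""
    return f"{quantity} [{unit}]" if unit else quantity


def plot_bandwidth(loads, means, stds, out_file=None):
    """Plot mean write bandwidth with std-dev error bars against load."""
    # Imported here so that axis_label stays usable without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render off-screen; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(5, 3.5))
    # Point 4: a dark, saturated color and visible markers.
    ax.errorbar(loads, means, yerr=stds, fmt="o-", color="tab:blue",
                capsize=3, label="mean +/- std dev")
    # Point 2: units in square brackets; a count needs no unit.
    ax.set_xlabel(axis_label("VMs per host"))
    ax.set_ylabel(axis_label("Write bandwidth", "MB/s"))
    # Point 1: start the y axis at zero so the drop is not exaggerated.
    ax.set_ylim(bottom=0)
    # Point 3: x limits hug the data so that space is not wasted.
    ax.set_xlim(min(loads) - 0.5, max(loads) + 0.5)
    ax.set_xticks(list(loads))
    ax.legend()
    fig.tight_layout()
    if out_file:
        fig.savefig(out_file)
    return fig, ax
```

Calling `plot_bandwidth([1, 2, 3], [61.8, 56.5, 53.6], [4.9, 10.0, 11.2], out_file="bandwidth.pdf")` would write the figure to a PDF file. Note that this sketch says nothing about how common each load level was, so it is a starting point rather than a complete presentation of the table.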

Assignment

The authors of the paper chose to present the data in a table. Your assignment is to present it graphically. In particular, you should

  1. Define what you are trying to achieve. In your report, state explicitly the main thing you want to show, and any secondary goals. Ideally, this will cover all the data that appears in the table.
  2. Create a graph that you think implements your goals. If you think it is worthwhile, you may use more than one graph, but note that more is not necessarily better – you want to convey information, not to create clutter.

Submit

Use Moodle to submit a report on your work in PDF format. Note that I request PDF; do not send me a Microsoft Word (.doc) file. The report should include the following:

  1. Your names, logins, and IDs.
  2. Your rationale for drawing the graph as you did.
  3. The resulting graph(s).

Note that in this and future exercises I do not appreciate wordy and excessively long reports. If you can say the important things in one page or less, that is better than using more pages.

Submission deadline is Monday, 24 February 2014. No extensions will be given, as I need to check the reports and bring highlights to the discussion in class on Tuesday.

Please do the exercise in pairs. Use this to work together and improve your ideas.
