Looking at data is a basic prerequisite for performance evaluation. It is used both to characterize measured system conditions and to analyze evaluation results. In this exercise we'll do the latter.
The following table contains data about the performance of I/O operations in virtual machines on a certain cloud platform (it comes from a paper by Zaharia et al. from OSDI 2008). It gives the write bandwidth achieved for different levels of load, where load is the number of virtual machines on the host. It also shows how common the different load levels were: the column labeled "VMs" is the count of how many VMs experienced this load level, and the performance is given as the average and standard deviation for this set of VMs.
Load Level | VMs | Avg Write Perf (MB/s) | Std Dev (MB/s) |
1 VM/host | 202 | 61.8 | 4.9 |
2 VMs/host | 264 | 56.5 | 10.0 |
3 VMs/host | 201 | 53.6 | 11.2 |
4 VMs/host | 140 | 46.4 | 11.9 |
5 VMs/host | 45 | 34.2 | 7.9 |
6 VMs/host | 12 | 25.4 | 2.5 |
7 VMs/host | 7 | 24.8 | 0.9 |
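If you work in a scripting environment rather than a spreadsheet, a reasonable first step is to type the table in as data and compute a few derived quantities before deciding what to plot. The following is a minimal sketch in Python using only the standard library. The names `TABLE`, `total_vms`, `weighted_mean`, and `standard_errors` are made up for this illustration, and the standard error formula (standard deviation divided by the square root of the count) assumes that the measurements within each load level are independent, which the table itself does not tell us.

```python
import math

# Each row: (VMs per host, number of VMs observed,
#            average write bandwidth in MB/s, standard deviation in MB/s).
# Values are copied from the table above.
TABLE = [
    (1, 202, 61.8, 4.9),
    (2, 264, 56.5, 10.0),
    (3, 201, 53.6, 11.2),
    (4, 140, 46.4, 11.9),
    (5, 45, 34.2, 7.9),
    (6, 12, 25.4, 2.5),
    (7, 7, 24.8, 0.9),
]


def total_vms(rows):
    """Total number of VMs across all load levels."""
    return sum(count for _, count, _, _ in rows)


def weighted_mean(rows):
    """Average bandwidth over all VMs, weighting each level by its VM count."""
    return sum(count * mean for _, count, mean, _ in rows) / total_vms(rows)


def standard_errors(rows):
    """Standard error of each per-level average: std / sqrt(n).

    Assumption: samples within a load level are independent.
    """
    return {load: std / math.sqrt(count) for load, count, _, std in rows}


print("total VMs:", total_vms(TABLE))
print("weighted mean (MB/s): %.2f" % weighted_mean(TABLE))
for load, se in standard_errors(TABLE).items():
    print("%d VMs/host: standard error %.2f MB/s" % (load, se))
```

Note how uneven the counts are: the two highest load levels together account for only 19 of the 871 VMs. Whatever you choose to plot, this is worth keeping in mind when interpreting their small standard deviations.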
The most important thing to do when plotting data is to decide what you want to show and how. Then there's the technical issue of actually doing it.
The simplest tool for many people is Excel. You may use Excel. However, be warned that Excel has pretty bad defaults for lots of things, and excuses like "this is how Excel did it" will not be accepted. Example problems are the use of line plots (where the X values are treated as evenly spaced labels) instead of X-Y plots (where they are treated as numbers), and limitations with logarithmic axes. Another is the color scheme, although this was improved in the latest version.
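To see why the choice between a line plot and an X-Y plot matters, it helps to look at where each kind of plot places the points along the horizontal axis. The sketch below is an illustration only: the X values are hypothetical (they are not taken from the exercise data), the function names are invented here, and the code computes positions rather than drawing anything.

```python
import math

# Hypothetical, unevenly spaced X values (not from the exercise data).
x_values = [1, 2, 4, 8, 16]


def line_plot_positions(xs):
    """Line plot: X values are treated as labels and spaced evenly."""
    return list(range(len(xs)))


def xy_plot_positions(xs):
    """X-Y plot with a linear axis: each point sits at its numeric value."""
    return [float(x) for x in xs]


def log_axis_positions(xs):
    """X-Y plot with a logarithmic axis: position is log10 of the value.

    Only defined for strictly positive values.
    """
    return [math.log10(x) for x in xs]


print("line plot:  ", line_plot_positions(x_values))
print("X-Y linear: ", xy_plot_positions(x_values))
print("X-Y log10:  ", [round(p, 3) for p in log_axis_positions(x_values)])
```

With these values, the line plot and the logarithmic X-Y plot both space the points evenly, while the linear X-Y plot does not. A line plot can therefore silently change the apparent shape of a curve whenever the X values are not evenly spaced.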
It may be worth your time to learn to use some other graphics package. Commonly used free packages include gnuplot and ploticus. R is a full statistical analysis environment with good graphics capabilities. Or you could just use MATLAB.
In this exercise and in all future exercises strong emphasis will be placed on graphical excellence. This means you should take care of the following:
The authors of the paper chose to present the data in a table. Your assignment is to present it graphically. In particular, you should
Use Moodle to submit a report on your work in PDF format. Note that I require PDF; do not send me a Microsoft Word (.doc) file. The report should include the following:
Submission deadline is Monday, 24 February 2014. No extensions, as I need to check the submissions and bring highlights to the discussion in class on Tuesday.
Please do the exercise in pairs. Use this to work together and improve your ideas.