Experimental Methods in Computer Science – Exercise 4

Experimental Methods in Computer Science

Exercise 4 – Finding Cache Parameters

Goals

Background

Modern microprocessors have a multi-level memory hierarchy. Usually there are two or even three levels of caches between the processor and the main memory. Data is moved between the levels according to usage: When a datum is first used, it is inserted into the caches. When space is needed to serve other data, older data is evicted.

In order to benefit from spatial locality, data is moved between memory levels in chunks called cache lines. Each cache line contains a certain number of words (or bytes). Bringing in a cache line is an expensive operation, but once it's there, you can access its elements for a very low cost. For example, accessing only one element from each cache line takes approximately the same time as accessing all of them.

The caches themselves have different sizes: the L1 cache is the smallest, the L2 cache bigger, but it takes more time to access the L2 cache. Caches may also have different levels of associativity.

To make things even more interesting, not all the memory is directly accessible to the processor. The processor works with virtual addresses, and these have to be translated to physical addresses using the page table. In order to work faster, the most useful part of the page table is cached in another, special cache, called the TLB. Thus only memory that is mapped using the TLB is directly accessible; memory whose mapping does not appear in the TLB will first suffer a TLB miss, and the access can only be tried after the TLB is loaded with the required translation. Luckily, the TLB may not be relevant in this exercise.

Assignment

In this exercise we want to write a microbenchmark that provides data regarding the size of the cache line and the sizes of the different caches.

First, measure the time needed to access the elements of a large array ("large" means really large, so that you can be sure it doesn't fit in the caches, e.g. 64MB). Measure the total time when accessing every cell, every other cell, every fourth cell, and so on, up to every 1024th cell (all values are powers of two). Draw a graph of the results. Are the results as may be expected? What can you learn from this about the cache line size?

Now measure the time per access when traversing arrays of different sizes. Start with a size of 1KB, and go up to 64MB in powers of two. You do not need to touch all the elements of each array -- it is enough to touch one from each cache line, based on the previous measurement. Draw a graph of the results. Are the results as may be expected? What can you learn from this about the cache sizes?

When you design your measurements, think about the following:

Run the measurements and deduce the sizes on two different machines (e.g. Levy, Aquarium, Inferno, Mangal, Santa).

Submit

Submit a single pdf file that contains all the following information:

  1. Your names, logins, and IDs (PLEASE DON'T FORGET LOGINS)
  2. Specifics of where you ran the measurements: what machines, and any information you can get on their model and cache parameters (from the machines operating system, not from your measurements).
  3. Measurements of cache line size:
    1. Your considerations in designing the first set of measurements (for the cache line size).
    2. Your measurement results, in a graphical format, for both machines. Think about how to best show what you got. Don't forget to label the axes etc.
    3. A short explanation of what the results show -- what is the cache line size, and why you think so. Is the data consistent and good, or are there things you don't understand?
    4. The program you wrote to perform the measurements.
  4. Measurements of cache sizes:
    1. Your considerations in designing the second set of measurements (for the cache sizes).
    2. Your measurement results, in a graphical format. Think about how to best show what you got. Don't forget to label the axes etc.
    3. A short explanation of what the results show -- what are the cache sizes, and why you think so. Is the data consistent and good, or are there things you don't understand?
    4. The program you wrote to perform the measurements.
Submission deadline is Monday morning, 14/3/11, so I can give feedback in class on Tuesday.

Please do the exercise in pairs.

To the course home page