This log contains over two years' worth of accounting records from the national grid of the Czech Republic, called MetaCentrum. It is a longer log from a later period than the original MetaCentrum log. The MetaCentrum grid is composed of a varying number of clusters, each with several multiprocessor machines with multicore CPUs or GPUs. Importantly, the scheduling system underwent a significant reconfiguration in the middle of this period, which is the subject of a paper based on this log. For more information about the system, see URL http://metavo.metacentrum.cz/en/index.html.

The MetaCentrum workload log was graciously provided by the Czech National Grid Infrastructure MetaCentrum. If you use this log in your work, please use a similar acknowledgment. It was made available via the web page of Dalibor Klusacek, which also includes data about the configuration, specifying 19 clusters with 495 nodes and 8412 cores in total (however, the log appears to contain some jobs that ran on additional clusters as well). To acknowledge Dalibor's work, please consider citing the paper that introduced this log: D. Klusacek, S. Toth, and G. Podolnikova, ``Real-life Experience with Major Reconfiguration of Job Scheduling System''. In Job Scheduling Strategies for Parallel Processing, May 2015.

The configuration of the clusters during the logging period was as follows:
no. | Cluster | From | To | Nodes x Cores/node | Total cores | Mem/node (GB) | GPUs/node |
---|---|---|---|---|---|---|---|
1 | ajax.zcu.cz | start | end | 1x8 | 8 | 72 | - |
2 | alela.feec.vutbr.cz | start | 5-Oct-2013 | 12x8 | 96 | 32 | - |
3 | doom.metacentrum.cz | 30-Sep-2013 | end | 30x16 | 480 | 67 | 2xGPU |
4 | eru.ruk.cuni.cz | start | end | 2x32 | 64 | 264 | - |
5 | gram.zcu.cz | start | end | 10x16 | 160 | 67 | 4xGPU |
6 | haldir.metacentrum.cz | 2-Apr-2013 | end | 1x64 | 64 | 1040 | - |
7 | hda.cerit-sc.cz (zapat) | start | end | 112x16 | 1792 | 134 | - |
8 | hdb.cerit-sc.cz (zigur) | 22-Apr-2013 | end | 32x8 | 256 | 134 | - |
9 | hdc.cerit-sc.cz (zegox) | 22-Apr-2013 | end | 48x12 | 576 | 94 | - |
10 | hermes.metacentrum.cz | 19-Feb-2013 | end | 11x8 | 88 | 14 | - |
11a | hildor.prf.jcu.cz | start | 11-Feb-2013 (renamed) | 26x16 | 416 | 67 | - |
11b | hildor.metacentrum.cz | 11-Feb-2013 | end | 26x16 | 416 | 67 | - |
12 | konos.fav.zcu.cz | start | end | 9x12 | 108 | 24 | 2xGPU |
13 | losgar.ics.muni.cz | start | end | 2x48 | 96 | 64 | - |
14 | loslab.ics.muni.cz | start | end | 14x12 | 168 | 12 | - |
15 | luna.fzu.cz | start | end | 47x16 | 752 | 96 | - |
16 | mandos.ics.muni.cz | start | end | 14x64 | 896 | 264 | - |
17 | manegrot.ics.muni.cz | 16-Dec-2014 | end | 4x32 | 128 | 512 | - |
18 | manwe.ics.muni.cz | start | end | 7x16 | 112 | 66 | - |
19 | minos.zcu.cz | start | end | 49x12 | 588 | 20 | - |
20 | mudrc.metacentrum.cz | 17-May-2014 | end | 12x4 | 48 | 3 | - |
21 | nympha.zcu.cz | start | end | 19x8 | 152 | 14 | - |
22a | perian1-20.ncbr.muni.cz | start | 29-May-2014 | 20x8 | 160 | 25 | - |
22b | perian21-40.ncbr.muni.cz | start | end | 20x8 | 160 | 25 | - |
22c | perian41-56.ncbr.muni.cz | start | end | 16x12 | 192 | 50 | - |
23 | quark.video.muni.cz | start | end | 3x8 | 24 | 18 | - |
24 | ramdal.ics.muni.cz | start | end | 1x32 | 32 | 1058 | - |
25 | skirit.ics.muni.cz | start | 5-Oct-2013 | 28x4 | 112 | 3 | - |
26 | tarkil.cesnet.cz | start | end | 28x8 | 224 | 22 | - |
27 | ungu.cerit-sc.cz | 12-Dec-2013 | end | 1x288 | 288 | 6144 | - |
28 | urga.cerit-sc.cz | 19-Nov-2014 | end | 1x384 | 384 | 6144 | - |
29a | zewura.cerit-sc.cz | start | 12-Nov-2014 (split) | 20x80 | 1600 | 512 | - |
29b | zewura.cerit-sc.cz | 12-Nov-2014 | end | 8x80 | 640 | 512 | - |
30 | zebra.cerit-sc.cz | 12-Nov-2014 | end | 12x80 | 960 | 512 | - |
31 | zorg.cerit-sc.cz | 11-Dec-2014 | end | 4x10 | 40 | 1536 | - |
32 | kalpa.fzu.cz | 6-Nov-2013 | end | 2x24 | 48 | 256 | - |
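Because clusters were added, removed, renamed, or split during the logging period, the total installed capacity is not constant. As a rough illustration, the following Python sketch shows how the table above can be used to estimate the number of cores available on a given date; only a few rows are included as an excerpt, and the `LOG_START`/`LOG_END` constants are stand-ins for the symbolic "start" and "end" entries.

```python
from datetime import date

# Stand-ins for the symbolic "start" and "end" entries in the table above.
LOG_START = date(2013, 1, 1)
LOG_END = date(2014, 12, 31)

# Excerpt of the cluster table: (name, available from, available to, total cores).
CLUSTERS = [
    ("doom.metacentrum.cz", date(2013, 9, 30), LOG_END, 480),
    ("hda.cerit-sc.cz (zapat)", LOG_START, LOG_END, 1792),
    ("zewura.cerit-sc.cz", LOG_START, date(2014, 11, 12), 1600),
    ("zewura.cerit-sc.cz", date(2014, 11, 12), LOG_END, 640),
    ("zebra.cerit-sc.cz", date(2014, 11, 12), LOG_END, 960),
    # ... remaining rows of the table go here ...
]

def cores_available(day):
    """Sum the cores of all clusters present on the given day."""
    return sum(cores for _, start, end, cores in CLUSTERS if start <= day <= end)

print(cores_available(date(2014, 12, 1)))
```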
Jobs could run on processors from more than one cluster. This is relatively rare, but it did happen for 7011 jobs in the log.
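How a multi-cluster job can be identified depends on the format of the original log; assuming the allocated hosts are recorded in TORQUE's exec_host style (`host/core+host/core+...`) and that hostnames are the cluster's short name followed by a node number (as in the table above), a minimal sketch might look like this:

```python
def clusters_of_job(exec_host):
    """Return the set of clusters a job ran on, given a TORQUE-style
    exec_host string such as 'zapat10/0+zapat10/1+zigur3/0'.
    The hostname format is an assumption, not taken from the log itself."""
    clusters = set()
    for spec in exec_host.split('+'):
        host = spec.split('/')[0]            # drop the core index
        cluster = host.rstrip('0123456789')  # drop the trailing node number
        clusters.add(cluster)
    return clusters

# A job is "multi-cluster" if it used hosts from more than one cluster.
print(len(clusters_of_job('zapat10/0+zapat10/1+zigur3/0')) > 1)  # True
```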
Scheduling is done by TORQUE with a custom-built scheduler, employing a system of general queues served by two scheduling servers. The scheduler uses common approaches such as backfilling and fairshare. Documentation is available on the MetaCentrum site. The main queues are as follows:
Queue | Priority | Time limit |
---|---|---|
q_2h | 50 | 2h |
q_4h | 500 | 4h |
q_1d | 50 | 24h |
q_2d | 50 | 48h |
q_4d | 50 | 96h |
q_1w | 50 | 168h |
q_2w | 50 | 336h |
q_2w_plus | 50 | 720/1488h |
backfill | 20 | 24h |
short | 50 | 2h |
normal | 50 | 24h |
long | 50 | 720h |
uv | 30 | 96h |
gpu | 75 | 24h |
gpu_long | 55 | 168h |
The above data is valid for the second half of the log, from January 2014 onwards. Note that nearly all the queues have the same priority (50). The practical effect is that jobs are prioritized by fairshare alone, and the queues are basically used only to define various per-user and per-group limits. Thus, the system effectively operates one "virtual" queue ordered by fairshare. Before January 2014 there was a fixed queue ordering, with the highest priority for "long" (70), followed by "short" (60), "normal" (50), and "backfill" (20); fairshare was only used "locally", within a given queue. The changes in configuration are described in detail in the paper that introduced the log [klusacek15]. The change in configuration apparently led to a change in utilization, as seen in the figures below.
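The effect of the reconfiguration on job ordering can be illustrated with a small sketch. This is not the actual scheduler code; it merely contrasts the two orderings, assuming each waiting job is described by its queue priority and its owner's fairshare usage (lower usage means the user is further below their fair share and should be served first).

```python
# Each waiting job: (queue_priority, owner_fairshare_usage, job_id).
waiting = [
    (70, 0.9, "j1"),  # "long" queue, heavy user
    (50, 0.1, "j2"),  # "normal" queue, light user
    (20, 0.0, "j3"),  # "backfill" queue, new user
]

# Before January 2014: fixed queue ordering first, fairshare only within a queue.
old_order = sorted(waiting, key=lambda j: (-j[0], j[1]))

# From January 2014: (almost) all queues share priority 50, so the ordering
# degenerates to a single virtual queue sorted purely by fairshare.
new_order = sorted(waiting, key=lambda j: j[1])

print([j[2] for j in old_order])  # ['j1', 'j2', 'j3']
print([j[2] for j in new_order])  # ['j3', 'j2', 'j1']
```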
Importantly, the original log includes an additional field with data about the specific requests made by users. This is a ':'-separated list of properties, such as the number of nodes and cores requested, the architecture, and specific clusters to use or to avoid. The possible properties and the mapping of properties to clusters are available in the MetaCentrum documentation. This is valuable because it enables evaluations that take all these different constraints into account.
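The exact syntax of this field is defined by the MetaCentrum documentation; as a rough illustration only, the following sketch splits such a ':'-separated property list into a dictionary. The example string, including the `^` prefix used here for a cluster to avoid, is hypothetical rather than taken from the log.

```python
def parse_request(props):
    """Split a ':'-separated property list into a dict.
    'key=value' items become key/value pairs; bare flags (e.g. a required
    architecture or cluster) are stored with the value True."""
    request = {}
    for item in props.split(':'):
        if not item:
            continue
        key, _, value = item.partition('=')
        request[key] = value if value else True
    return request

print(parse_request('nodes=2:ppn=8:mem=16gb:x86_64:^cl_doom'))
# {'nodes': '2', 'ppn': '8', 'mem': '16gb', 'x86_64': True, '^cl_doom': True}
```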
The log contains all the jobs that started in the logging period, which is all of 2013-2014. Some of these jobs are extremely long, as the maximal runtime allowed on this system is 30 days. Therefore edge effects may occur at both ends of the log, where the logged data does not faithfully represent the actual load. In particular, all the jobs executing in 2015 are actually leftovers from 2014.
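Analyses that are sensitive to these edge effects may want to discard the affected jobs. A minimal sketch, assuming the data has been converted to the standard workload format (so submit time, wait time, and runtime are in seconds relative to the log start), is to skip a warm-up period of one maximal runtime at the beginning and drop jobs whose execution extends past the end of 2014:

```python
LOG_LENGTH = 2 * 365 * 24 * 3600   # all of 2013-2014, in seconds
MAX_RUNTIME = 30 * 24 * 3600       # 30-day runtime limit on this system

def unaffected_by_edges(submit, wait, runtime):
    """True if the job both started and ended well inside the log."""
    start = submit + wait
    end = start + runtime
    return start >= MAX_RUNTIME and end <= LOG_LENGTH

# Example: a one-hour job submitted 400 days into the log is kept.
print(unaffected_by_edges(submit=400 * 24 * 3600, wait=120, runtime=3600))  # True
```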