This log contains several months' worth of accounting records from Atlas, a large Linux cluster installed at Lawrence Livermore National Lab. For more information about Linux clusters at LLNL, see https://computing.llnl.gov/tutorials/linux_clusters/. This specific cluster has 1152 nodes, each with 8 processors, for a total of 9216 processors. Atlas is considered a "capability" computing resource, meaning that it is intended for running large parallel jobs that cannot execute on lesser machines. This is in contrast with Thunder, which is a "capacity" machine, used for running large numbers of smaller jobs. Note that the log does not include arrival information, only start times. The LLNL Atlas workload log was graciously provided by Moe Jette, who also helped with background information and interpretation. If you use this log in your work, please use a similar acknowledgment.
Downloads:
The nodes are divided into three partitions:
login | 8 nodes |
debug | 32 nodes |
batch | 1072 nodes |
This file contains one line per completed job in the Slurm format. The fields are
COMPLETED | 1 |
FAILED | 0 |
TIMEOUT | 0 |
NODE_FAIL | 0 |
CANCELLED | 5 |
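The table above reads as a mapping from Slurm job states to numeric status codes. As a minimal sketch, assuming these numbers are the status codes written to the converted log, the mapping could be expressed as a lookup table. The names `SLURM_TO_STATUS` and `status_code` are hypothetical, and returning -1 for an unlisted state is an assumption, not something the table specifies.

```python
# Mapping taken from the table above: Slurm job state -> numeric status code.
SLURM_TO_STATUS = {
    "COMPLETED": 1,
    "FAILED": 0,
    "TIMEOUT": 0,
    "NODE_FAIL": 0,
    "CANCELLED": 5,
}


def status_code(slurm_state: str) -> int:
    """Return the status code for a Slurm job state.

    Assumption: states not listed in the table map to -1 ("unknown").
    """
    return SLURM_TO_STATUS.get(slurm_state.strip().upper(), -1)
```

For example, `status_code("TIMEOUT")` gives 0, the same code as a failed job, so timeouts and failures cannot be told apart after conversion.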
The conversion was done by a log-specific parser in conjunction with a more general converter module.
A flurry is a burst of very high activity by a single user. The filters used to remove the initial section and the five flurries that were identified are:
submitted before 18 Dec 2006 (1434 jobs)
user=4 and job>3873 and job<5926 (2038 jobs)
user=28 and job>20616 and job<21532 (887 jobs)
user=7 and job>21547 and job<22295 (709 jobs)
user=19 and job>40338 and job<51898 (6783 jobs)
user=66 and job>22438 and job<56102 (4703 jobs)
Note that the filters were applied to the original log, and unfiltered jobs remain untouched. As a result, in the filtered logs job numbering is not consecutive. Moreover, due to the fact that the whole initial part of the log is discarded, the start time indication in the header comments is also wrong.
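The five flurry filters listed above can be sketched as a simple predicate over job number and user ID. This is an illustration and not the archive's actual filtering tool. The names `FLURRIES`, `is_flurry`, and `parse_job_and_user` are hypothetical. The parser assumes a whitespace-separated log in which the job number is field 1, the user ID is field 12, and comment lines start with `;`. The date-based filter for the initial section is left out, since it depends on timestamps rather than on job and user alone.

```python
# Each rule is (user, low, high) and matches jobs with low < job < high,
# mirroring the strict inequalities in the filter list above.
FLURRIES = [
    (4, 3873, 5926),
    (28, 20616, 21532),
    (7, 21547, 22295),
    (19, 40338, 51898),
    (66, 22438, 56102),
]


def is_flurry(job: int, user: int) -> bool:
    """True if the job falls inside one of the five flurry rules."""
    return any(u == user and lo < job < hi for u, lo, hi in FLURRIES)


def parse_job_and_user(line: str):
    """Return (job, user) from a data line, or None for comments and blanks.

    Assumption: job number is field 1 and user ID is field 12.
    """
    line = line.strip()
    if not line or line.startswith(";"):
        return None
    fields = line.split()
    return int(fields[0]), int(fields[11])


def remove_flurries(lines):
    """Yield the lines that survive the flurry rules; comments pass through."""
    for line in lines:
        parsed = parse_job_and_user(line)
        if parsed is None or not is_flurry(*parsed):
            yield line
```

Because surviving lines are passed through unchanged, job numbers keep their original values, which is why numbering in a filtered log is not consecutive.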
Further information on flurries and the justification for removing them can be found in:
File LLNL-Atlas-2006-2.1-cln.swf