NPACI JOBLOG Job Trace Repository V1.0

Maintained by Victor Hazlewood
Created May 16, 2000
Last Modified May 17, 2000

Welcome to the JOBLOG Repository!

This Job Trace Repository is brought to you by the HPC Systems group of the San Diego Supercomputer Center (SDSC), the leading-edge site of the National Partnership for Advanced Computational Infrastructure (NPACI). JOBLOG job traces are currently available for the 128-node IBM SP system at SDSC and include data from May 1, 1998 until April 30, 2000. This IBM SP system is expected to be retired from production on May 27, 2000; therefore, this repository will have May 1998 to May 2000 data available for use.

A word about privacy. The privacy of our users is very important. In this version each user's Login Name and Account have been encrypted to protect their privacy. All Login Names and Accounts are encrypted with the same process, which ensures that the same Login Name or Account always yields the same encrypted string.

Below is the description of the data returned by the JOBLOG Job Trace Repository.

Table/View Name: JOBLOG

Any field that cannot be reported is represented by a -1. All times are reported in seconds unless otherwise noted.

Name                   Datatype   Reference/Description
---------------------  ---------  ------------------------------------------------------------
LOGIN_NAME             Text       User's login name from the /etc/passwd file.
ACCOUNT                Text       User's account name. The account/contract/grant to charge
                                  the user's usage to.
QUEUE_DATE             Number     The date the job was queued to the batch system, as an
                                  integer number of seconds since the Epoch in GMT.
                                  -1 for non-batch jobs.
START_DATE             Number     The date the job was started by the system, as an integer
                                  number of seconds since the Epoch in GMT.
END_DATE               Number     The date the job was completed by the system, as an integer
                                  number of seconds since the Epoch in GMT.
QUEUE                  Text       Queue name.
CPU_TIME               Number     CPU usage. CPU time is for all processes of the job. This
                                  value is the number of seconds as an integer.
WALLCLOCK              Number     Wall clock time which elapsed while the request was running
                                  on a CPU. This does not include queue wait time, system
                                  down time, or the period when the request was suspended,
                                  checkpointed, or held, if possible. This value is the
                                  number of seconds as an integer. For clusters where a node
                                  is exclusively allocated, the wallclock is multiplied by
                                  the number of processors, yielding wallclock processor
                                  hours. On an IBM SP system this is therefore actually the
                                  wallclock node hours, or "wallclock * number of cpus".
SU                     Number     Total charge for this job in System Billing Units. This
                                  value is in seconds as an integer.
NODES                  Number     Cumulative sum of all processors allocated to the job.
MAXPAR                 Number     Maximum node partition. Largest number of processors
                                  allocated to parallel applications within the job.
NUMMPPJOBS             Number     Number of parallel applications run in this job.
MAXMEMORY              Number     Memory high water mark for the entire job.
MEMORY                 Number     Memory usage in Kcore-hours.
I_O                    Number     I/O usage in megabytes transferred.
DISK                   Number     Disk charge in units defined by CPU: disk blocks or other.
CONNECT_TIME           Number     Connect time for an interactive session.
QWAIT                  Number     Queue wait time for batch jobs.
EXPF                   Number     Expansion factor: (QWAIT + WALLCLOCK) / WALLCLOCK. This
                                  version of the expansion factor gives an idea of whether
                                  queue times are proportional to job size; presumably you
                                  want to avoid long queue times for small jobs.
PRIORITY               Number     Priority weight value.
APP_NAME               Text       Job or application name. If available, the name of the
                                  process which took the largest percentage of time of the
                                  overall job or session; otherwise the batch job name, if
                                  available.
JOB_ID                 Number     Job id or session id from the originating system. Usually
                                  different from QUEUE_ID. Master process id (most UNIX
                                  systems) or job id (UNICOS).
QUEUE_ID               Text       Queueing system identification code: NQS id, LoadLeveler
                                  cluster id, or LSF id, for example.
BATCH_SUBMISSION_DATE  Date       The date the job was submitted to the queuing system, in
                                  human-readable form.
COMPLETION_DATE        Date/Time  The date the job was completed, in human-readable form.
REQUESTED_TIME         Number     Amount of resource time requested at queue submission time:
                                  wallclock time for parallel jobs or CPU time for vector/SMP
                                  systems.
REQUESTED_MEM          Number     Amount of memory requested at job submission time.
REQUESTED_PROCS        Number     Number of processors requested at job submission time.
JOB_COMP_STATUS        Number     Number representing the completion status of the job:
                                    1 - job completed successfully
                                    2 - job canceled
                                    3 - other/unknown
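
The QUEUE_DATE, START_DATE, and END_DATE fields hold seconds since the Epoch in GMT, and any field that cannot be reported holds -1. A minimal sketch of decoding such a value in Python follows. The function name `epoch_to_utc` and the sample timestamp are illustrative assumptions, not part of the repository.

```python
from datetime import datetime, timezone
from typing import Optional

MISSING = -1  # JOBLOG reports -1 for any field that cannot be reported


def epoch_to_utc(seconds: int) -> Optional[datetime]:
    """Convert a JOBLOG date field (seconds since the Epoch, GMT) to a datetime.

    Returns None for the -1 sentinel, e.g. QUEUE_DATE of a non-batch job.
    """
    if seconds == MISSING:
        return None
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Hypothetical values, chosen only to illustrate the conversion.
start = epoch_to_utc(893980800)
queued = epoch_to_utc(-1)
print(start.isoformat() if start else "not reported")
print(queued.isoformat() if queued else "not reported")
```

Mapping -1 to `None` keeps the sentinel from being read as a real timestamp one second before the Epoch.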
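
The EXPF formula, (QWAIT + WALLCLOCK) / WALLCLOCK, can be recomputed from the other fields as a sanity check. The sketch below is one possible way to do so, assuming both inputs are in seconds. The function name `expansion_factor` is hypothetical, and the description does not say whether the repository computes EXPF from per-job elapsed time or from the processor-multiplied WALLCLOCK, so recomputed values may differ from the stored ones.

```python
from typing import Optional

MISSING = -1  # JOBLOG reports -1 for any field that cannot be reported


def expansion_factor(qwait: int, wallclock: int) -> Optional[float]:
    """Return (QWAIT + WALLCLOCK) / WALLCLOCK, or None if it cannot be computed."""
    if qwait == MISSING or wallclock == MISSING or wallclock <= 0:
        # A missing input or a zero-length run has no meaningful expansion factor.
        return None
    return (qwait + wallclock) / wallclock


# Hypothetical job: waited 30 minutes in the queue, then ran for one hour.
print(expansion_factor(1800, 3600))
```

A job that starts immediately has an expansion factor of 1.0. Larger values mean the queue wait was long relative to the run time.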