The file pbs2swf.tgz contains all the files that compose the pbs2swf utility. These are:
# | file | description |
1 | pbs2swf.pl | The conversion script. Uses the modules below. |
2 | ConversionConfig.pm | Global configuration variables ("constants") that are used throughout. |
3 | ConversionLog.pm | Everything related to creating the conversion summary report. |
4 | ParseArgv.pm | Parses the command-line arguments of pbs2swf.pl and sets ConversionConfig.pm accordingly. |
5 | ParsePBS.pm | Performs the actual parsing of the PBS logs. |
6 | PrintSWF.pm | Prints the data parsed by ParsePBS.pm in SWF format. |
zcat LPC-EGEE-2004-0old.pbs.gz LPC-EGEE-2004-0ce1.pbs.gz LPC-EGEE-2004-0ce2.pbs.gz | pbs2swf.pl \
    \
    --output=l_lpc \
    \
    --proc_used=1,started \
    --proc_req=1,all \
    --executable=-1,all,overwrite \
    \
    --mem_req.type=physical \
    \
    --anonymize.partition=clrglop195.in2p3.fr:1 \
    --anonymize.partition=clrce01.in2p3.fr:2 \
    --anonymize.partition=clrce02.in2p3.fr:3 \
    \
    --anonymize.queue=test:1 \
    --anonymize.queue=short:2 \
    --anonymize.queue=long:3 \
    --anonymize.queue=day:4 \
    --anonymize.queue=infinite:5 \
    --anonymize.queue=batch:6 \
    \
    --anonymize.gid=dteam:1 \
    --anonymize.gid=dteam005:1 \
    --anonymize.gid=biomed:2 \
    --anonymize.gid=biomgrid:2 \
    \
    --Computer="3GHz Pentium-IV Xeon Linux Cluster" \
    --Installation="LPC (Laboratoire de Physique Corpusculaire)" \
    --Installation="Part of the LCG (Large hadron collider Computing Grid project)" \
    --Information="http://www.cs.huji.ac.il/labs/parallel/workload/l_lpc.html" \
    --Information="JSSPP'05 - Workload Analysis of a Cluster in a Grid Environment" \
    --Acknowledge="Emmanuel Medernach - medernac AT clermont.in2p3.fr" \
    --Conversion="Dan Tsafrir - dants AT cs.huji.ac.il" \
    --MaxNodes="70 (dual)" \
    --MaxProcs=140 \
    --TimeZoneString="Europe/Paris" \
    --MaxRuntime=259200 \
    --AllowOveruse=False \
    --Queues="Queues enforce a runtime limit on the jobs that populate them." \
    --Queues="See URL in 'Information' for details." \
    --Partitions="One small partition, later replaced by two disjoint partitions." \
    --Partitions="See URL in 'Information' for details." \
    --Note="Jobs are always serial."
option flag | meaning | details |
--proc_used=1,started --proc_req=1,all |
Number of requested (all jobs) and used (started jobs) processors is set to 1. | The size of all the jobs in the LPC log is 1. However, some PBS records are missing this data. We therefore decide that the number of requested processors (proc_req) of all the jobs is 1. The same holds for used processors (proc_used), but only for jobs that actually started to run. Jobs that were canceled before this point are always assigned proc_used=0 by the pbs2swf.pl script. |
--executable=-1,all,overwrite |
Set the executable of all jobs in the SWF version to be undefined (-1). | This data is actually available for all the started jobs (hence we overwrite it), but it is meaningless, because it specifies the names of the PBS submittal scripts rather than the names of actual applications. And so, almost 88% of the jobs specify "STDIN" as their executable name. Another 6% specify "test.job", and another 6% are jobs canceled before they started, so their executable name is missing from the PBS log altogether. This leaves only tens of jobs, which usually also have names like "test1.job", "job.sh", etc. |
--mem_req.type=physical |
SWF data regarding requested memory is associated with physical (rather than virtual) memory. | By default, pbs2swf.pl prefers extracting data from the PBS log that is associated with virtual memory. However, no such data is available in the LPC log, whereas some data specifying requested physical memory is in fact available (but only for 480 jobs). |
--anonymize.partition=* |
Explicitly associate PBS partitions with SWF codes that reflect the chronological order in which they were defined. | For example, the earliest partition is the 'old' one (clrglop195.in2p3.fr), and so it is set to be partition number 1. If SWF codes were not assigned explicitly, they would have been assigned arbitrarily by pbs2swf.pl. |
--anonymize.queue=* |
Explicitly associate PBS queues with SWF codes such that the bigger the code, the longer the jobs that may populate the queue. | For example, the 'test' queue has the smallest limit on the requested runtime of the jobs that may populate it, and so it is set to be queue number 1. |
--anonymize.gid=* |
Unite PBS groups that appear different but are actually the same. | For example, PBS jobs associated with groups 'dteam' and 'dteam005' actually originate from the same group (which is indeed collectively referred to as 'dteam' in [medernach05]). And so, they are both explicitly assigned the same SWF group code, 1. |
Others | Some predefined SWF header fields. | Including only those that pbs2swf.pl cannot compute by itself (those that can be computed may not be given as command-line options). |
The synopsis of pbs2swf.pl is:
./pbs2swf.pl <options> [PBS log(s)]
If no PBS files are given, pbs2swf.pl will attempt to read the PBS log(s) from stdin. The pbs2swf.pl script generates two files; see the --output option below.
The following is a description of all the available options ('mandatory' options must appear in the command line; 'multiple' options may appear more than once in the command line):
category | option flag | mandatory | multiple | default | details |
Defaults to SWF fields | wait |
Synopsis of the values associated with these flags:
<default_value>,<all|started|canceled>[,overwrite]
There is an option for each SWF field that can possibly be affected by the user's choice. For example, --proc_req=10,all means that all jobs for which the requested-processors value was unobtainable from the PBS log will be assigned 10 in this field. If --proc_req=10,all,overwrite is given, then 10 will be assigned as the requested-processors value of all the jobs, regardless of whether the associated data exists in the original PBS logs or not. Finally, if --proc_req=10,canceled is given, then only jobs that were canceled before they started are affected (note that once again, 'overwrite' is optional). Similarly, using 'started' will only affect jobs that actually started. Time fields are specified in seconds, and memory fields are specified in KB. |
proc_req | ||||||
cpu_req | ||||||
mem_req | ||||||
uid | ||||||
gid | ||||||
executable | ||||||
queue | ||||||
partition | ||||||
runtime |
The synopsis of the values associated with these flags is:
<default_value>,<started>[,overwrite]
This is similar in every respect to the group of flags explained above, but these fields only have meaning for jobs that actually started. And so, --runtime=3600,started means that all started jobs missing runtime data in the original PBS logs will be assigned a 1-hour runtime. |
proc_used | ||||||
cpu_used | ||||||
mem_used | ||||||
Attributes of SWF fields | mem_used.type | virtual | The values associated with 'type' attributes are either 'physical' or 'virtual'. The values associated with 'quantity' attributes are either 'per_job' or 'per_process' (the resource was consumed by a single process; to know the amount of resource consumed by the entire job one must multiply this by the job's size). For example, --mem_used.quantity=per_job means that the associated SWF column specifies the aggregate amount of memory used by all the processes composing the job. | |||
mem_req.type | virtual | |||||
mem_used.quantity | per_job | |||||
mem_req.quantity | per_job | |||||
cpu_used.quantity | per_process | |||||
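The <default_value>,<scope>[,overwrite] semantics described above can be sketched as follows. This is a minimal Python illustration of the rule only; the job representation and function name are hypothetical and not part of pbs2swf.pl (which is written in Perl):

```python
# Sketch of the <default>,<scope>[,overwrite] semantics (hypothetical names).

def apply_default(job, field, default, scope, overwrite=False):
    """Assign `default` to job[field] per the flag semantics.

    scope: 'all' affects every job, 'started' only jobs that began
    execution, 'canceled' only jobs canceled before starting.
    Without 'overwrite', data already present in the PBS log is kept.
    """
    in_scope = (
        scope == "all"
        or (scope == "started" and job["started"])
        or (scope == "canceled" and not job["started"])
    )
    if in_scope and (overwrite or job.get(field) is None):
        job[field] = default
    return job

# --proc_req=10,all : fill in missing values only
job = {"started": True, "proc_req": None}
apply_default(job, "proc_req", 10, "all")                  # proc_req becomes 10

# --proc_req=10,all,overwrite : replace even existing values
job2 = {"started": True, "proc_req": 4}
apply_default(job2, "proc_req", 10, "all", overwrite=True)  # proc_req becomes 10
```

This mirrors why the example conversion uses --proc_used=1,started together with --proc_req=1,all: the 'started' scope leaves canceled jobs at proc_used=0.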
Anonymizing PBS values | anonymize.uid | √ |
Synopsis of the values associated with these flags:
<PBS_value>:<SWF_code>
The pbs2swf.pl script arbitrarily replaces every PBS string representing a user/group/executable/queue/partition with an SWF code (but does so consistently; that is, once a PBS value is associated with an arbitrary SWF code, that code will always be used to represent this PBS value). These options give the converter control over how the anonymization is actually performed (which codes are used for which PBS names). And so, for example, --anonymize.queue=short:1 means that the PBS 'short' queue will be represented by 1 in the resulting SWF file. See the pbs2swf - Example to understand why these options can be useful. |
anonymize.gid | √ | |||||
anonymize.executable | √ | |||||
anonymize.queue | √ | |||||
anonymize.partition | √ | |||||
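The consistent-but-arbitrary code assignment just described, combined with explicit --anonymize.* presets, might look like the following Python sketch. This is an assumed reconstruction of the behavior, not the actual Perl implementation:

```python
# Sketch of consistent anonymization: explicit --anonymize.* presets win;
# previously unseen PBS values get the smallest unused positive code.

def make_anonymizer(presets=None):
    mapping = dict(presets or {})          # e.g. {"dteam": 1, "dteam005": 1}

    def code_for(pbs_value):
        if pbs_value not in mapping:
            used = set(mapping.values())
            code = 1
            while code in used:            # smallest unused positive code
                code += 1
            mapping[pbs_value] = code
        return mapping[pbs_value]          # same value -> same code, always

    return code_for

# Presets from the example conversion: unite 'dteam'/'dteam005' and
# 'biomed'/'biomgrid' under shared codes.
gid = make_anonymizer({"dteam": 1, "dteam005": 1, "biomed": 2, "biomgrid": 2})
gid("dteam")     # -> 1 (preset)
gid("atlas")     # -> 3 (first unused code; 'atlas' is a made-up group name)
gid("atlas")     # -> 3 again (consistent)
```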
Predefined SWF header fields | Computer | √ | √ | <short machine description> e.g. "P-III Linux cluster" | ||
Installation | √ | √ | <location and name of machine> e.g. "SDSC - Blue Horizon" | |||
Information | √ | √ | <where to find additional info> usually a URL and possibly a paper reference | |||
Acknowledge | √ | √ | <name+email of supplier of PBS data> | |||
Conversion | √ | √ | <name+email of converter and possibly additional conversion info> | |||
TimeZoneString | √ | <verbal time zone of PBS log> a file which is (usually) found in /usr/share/zoneinfo/, e.g. US/Alaska | ||||
MaxNodes | √ | <int> [comment] e.g. "72 (dual CPU)" | ||||
MaxProcs | √ | <int> [comment] e.g. "144" | ||||
MaxRuntime | <seconds> administrative max allowed runtime | |||||
MaxMemory | <KB> administrative max allowed memory | |||||
AllowOveruse | <bool> can jobs use more resource(s) than requested? | |||||
Queues | √ | <verbal information about queues> | ||||
Partitions | √ | <verbal information about partitions> | ||||
Note | √ | <any important note> | ||||
Other | output | √ | <prefix name for the result file e.g., "l_sdsc_sp2"> In this example, pbs2swf.pl generates both l_sdsc_sp2.swf (the actual conversion), and l_sdsc_sp2.conversion.txt (reporting various statistics and problems). Here's an example of a conversion summary file. | |||
help | print a help message | |||||
debug | <0|1> 1 means a 19th field will be added to the resulting SWF file, holding the original PBS ID of the job |
record type | meaning |
A | Job was aborted by server |
B | Beginning of reservation period |
C | Job was check-pointed and held |
D | Job was deleted by request (record contains requestor=user@host) |
E | Job ended (terminated execution) (record contains all the data needed for SWF) |
F | Resources reservation period finished |
K | Scheduler/server requested removal of reservation |
Q | Job entered a queue; record for each move between queues; record contains queue=name |
R | Job was rerun |
S | Job execution started |
flow | count | example PBS job ID | comment |
QSE | 223838 | 16443.clrglop195.in2p3.fr | "normal" jobs [5] |
QQQSE | 1 | 10610.clrce02.in2p3.fr | |
QSDE | 2361 | 43671.clrce01.in2p3.fr | started jobs that were canceled (reached [5] through [4]) |
QSD...DE | 7 | 33836.clrce01.in2p3.fr | |
QD | 14362 | 33885.clrglop195.in2p3.fr | jobs that were canceled before they were started ([1]->[3]) |
QDD | 4 | 43569.clrce02.in2p3.fr | |
QA...A | 7 | 49921.clrce02.in2p3.fr | |
Q | 25 | 36406.clrglop195.in2p3.fr | jobs for which there's no E record because it occurred after the available PBS log ends |
QS | 10 | 49927.clrce02.in2p3.fr | |
QSD | 16 | 78876.clrce01.in2p3.fr | |
QSEQSE | 2106 | 835.clrglop195.in2p3.fr | ID wraparound (as explained above); in QSEE, part of the log is also missing between the two E's (another event that accompanied the wraparound) |
QSEQSDE | 14 | 2202.clrglop195.in2p3.fr | |
QSEE | 1 | 290.clrglop195.in2p3.fr | |
QSRD | 2 | 5393.clrce02.in2p3.fr | missing E record (unknown why) |
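The flow strings tabulated above are obtained by concatenating, per job ID, the record types in the order they appear in the log. A minimal Python sketch (the function is illustrative, not part of the utility):

```python
# Sketch: compute per-job "flow" strings (e.g. QSE, QD) from a PBS log.

def job_flows(records):
    """records: iterable of (job_id, record_type) pairs in log order."""
    flows = {}
    for job_id, rtype in records:
        flows[job_id] = flows.get(job_id, "") + rtype
    return flows

# A "normal" job (QSE) followed by a job canceled before starting (QD):
log = [("16443.clrglop195.in2p3.fr", t) for t in "QSE"] + \
      [("33885.clrglop195.in2p3.fr", t) for t in "QD"]
job_flows(log)["16443.clrglop195.in2p3.fr"]   # -> 'QSE'
```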
date_time;record_type;id_string;message_text
where:
date_time | mm/dd/yyyy hh:mm:ss |
id_string | Either the job identifier (job ID) or a PBS reservation identifier (seems irrelevant to SWF). In the LPC log, this may look like so: 468.clrce01.in2p3.fr (the server's name is the "partition" and the serial number is the PBS job ID within the partition). |
record_type | The record type as described above. |
message_text | The content depends on the record_type. The message text format is blank separated keyword=value fields (the relevant pairs are discussed below). |
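A record in this format can be parsed as sketched below. The sample line is constructed for illustration (its field values are not from the actual log), and the parsing follows only the description above:

```python
# Sketch: parse one PBS accounting record — a semicolon-separated header
# followed by blank-separated keyword=value pairs in the message text.

def parse_record(line):
    date_time, rtype, id_string, message = line.split(";", 3)
    fields = {}
    for token in message.split():
        if "=" in token:
            key, value = token.split("=", 1)   # values may contain '='
            fields[key] = value
    return {"date_time": date_time, "type": rtype,
            "id": id_string, "fields": fields}

# Constructed example line (values are illustrative):
rec = parse_record(
    "05/12/2004 10:21:05;E;468.clrce01.in2p3.fr;"
    "user=grid001 group=dteam queue=short Exit_status=0")
rec["fields"]["queue"]   # -> 'short'
```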
SWF field | PBS field(s) [listed in preference order] | conversion details |
job | - | Jobs are sorted by arrival order (major) and partition SWF code (minor). If jobs are still "equal" (arrived simultaneously at the same partition), we use the serial number from the PBS ID (see above) to break the tie. Jobs are then assigned IDs in order: 1,2,3,... |
submit | ctime | By 10.12.5 of the PBS administration guide: 'ctime' is the "Time in seconds [since the epoch] when a job was created (first submitted)." |
wait | start-ctime | By 10.12.5 of the PBS administration guide: 'start' is the "Time in seconds [since the epoch] when the job execution started". |
runtime | resources_used.walltime, etime-ctime | By 11.20.1 of the PBS administration guide: "Use the walltime attribute rather than wall time calculated by subtracting the job start time from end time. The walltime resource attribute does not accumulate when a job is suspended for any reason". However, some records are missing 'resources_used.walltime' (in the LPC log these are the 352 jobs with Exit_status=-4, i.e. jobs that died on a signal). In this case, for lack of a better alternative, the runtime is computed by subtracting the job's 'ctime' from its 'etime'. |
proc_used | resources_used.ncpus, resources_used.nodect, resources_used.nodes, Resource_List.ncpus, Resource_List.nodect, Resource_List.nodes | |
cpu_used | resources_used.pcput, resources_used.cput | By 4.8 of the PBS administration guide: 'pcput' is the "per_process maximum CPU time (i.e. for any single process in the job)" and 'cput' is the "Maximum aggregated CPU time required by all processes in job". We can convert between 'pcput' and 'cput' by dividing/multiplying by the number of processors used, if available. Note, however, that if this is not available, and one PBS job uses 'pcput' while another uses 'cput', then one of these data fields will be lost in the conversion (as we cannot mix them in one SWF column). |
mem_used | resources_used.pvmem, resources_used.vmem, resources_used.pmem, resources_used.mem | By 4.8 of the PBS administration guide. |
proc_req | Resource_List.ncpus, Resource_List.nodect, Resource_List.nodes | See conversion specification of 'proc_used'. |
cpu_req | Resource_List.walltime | See conversion specification of 'runtime'. |
mem_req | Resource_List.pvmem, Resource_List.vmem, Resource_List.pmem, Resource_List.mem | See conversion specification of 'mem_used'. |
status | Exit_status | By 10.12.5 of the PBS administration guide: "The exit status of the job: If the value is less than 10000 (decimal) it is the exit value of the top level process of the job, typically the shell. If the value is greater than 10000, the top process exited on a signal whose number is given by subtracting 10000 from the exit value." We assume that the meaning of exit status values is the same as that of the shell. Thus, we consider Exit_status=0 as normal termination (SWF status=1) and other values as indicating failure (SWF status=0, unless the job was canceled, as indicated by a D/A record, in which case the SWF status is 5). |
uid | user | By 10.12.5 of the PBS administration guide: user=username is "The user name under which the job executed." |
gid | group | By 10.12.5 of the PBS administration guide: group=groupname is "The group name under which the job executed." |
executable | jobname | By 10.12.5 of the PBS administration guide: jobname=job_name is "The name of the job." |
queue | queue | By 10.12.5 of the PBS administration guide: queue=queue_name is "The name of the queue in which the job executed." |
partition | id_string | A job ID string may look like this: 123.par.cnn.com. The server in this string serves as the partition identifier. |
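The Exit_status-to-SWF-status rule in the table above can be sketched as follows (a hypothetical Python helper, mirroring the stated mapping only):

```python
# Sketch of the Exit_status -> SWF status mapping described above:
# 0 means completed (1), cancellation (D/A record) means 5, anything
# else means failed (0).

def swf_status(exit_status, canceled=False):
    if canceled:
        return 5          # canceled, as indicated by a D or A record
    if exit_status == 0:
        return 1          # normal termination
    return 0              # failure (incl. exit on a signal: value > 10000)

swf_status(0)                    # -> 1
swf_status(10011)                # -> 0 (died on signal 11)
swf_status(271, canceled=True)   # -> 5
```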
[[hours:]minutes:]seconds[.milliseconds]
So, for example, '3600' is interpreted as seconds, but times may also be specified like '10:15:30' (seconds = 10*3600 + 15*60 + 30 = 36930).
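Parsing this time format reduces to folding the colon-separated components in base 60, as this Python sketch shows (the helper name is hypothetical):

```python
# Sketch: convert PBS's [[hours:]minutes:]seconds[.milliseconds] to seconds.

def pbs_time_to_seconds(text):
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)   # fold components in base 60
    return seconds

pbs_time_to_seconds("3600")       # -> 3600.0
pbs_time_to_seconds("10:15:30")   # -> 36930.0
```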
integer[suffix]
(e.g. 500kb, 12mb) where:
none | bytes |
b|w | bytes or words |
kb|kw | kilo bytes or words (1024) |
mb|mw | mega bytes or words (1024^2 = 1,048,576) |
gb|gw | giga bytes or words (1024^3 = 1,073,741,824) |
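Converting these sizes to the KB units used by the SWF memory fields can be sketched as below. Note the word size assumed here (8 bytes) is an assumption for illustration only; PBS treats the word size as machine-dependent:

```python
# Sketch: convert PBS's integer[suffix] size format to KB.
WORD_BYTES = 8  # assumption; the actual word size is machine-dependent

SCALE = {"": 1, "b": 1, "kb": 1024, "mb": 1024**2, "gb": 1024**3,
         "w": WORD_BYTES, "kw": 1024 * WORD_BYTES,
         "mw": 1024**2 * WORD_BYTES, "gw": 1024**3 * WORD_BYTES}

def pbs_size_to_kb(text):
    num = "".join(ch for ch in text if ch.isdigit())   # leading integer
    suffix = text[len(num):].lower()                   # optional suffix
    return int(num) * SCALE[suffix] / 1024.0

pbs_size_to_kb("500kb")   # -> 500.0
pbs_size_to_kb("12mb")    # -> 12288.0
```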