This log contains several months' worth of accounting records from a large Blue Gene/P system called Intrepid. Intrepid is a 557 TF, 40-rack Blue Gene/P system deployed at the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. The system comprises 40,960 quad-core nodes (163,840 cores in total), along with associated I/O nodes, storage servers, and an I/O network. It debuted as No. 3 in the TOP500 supercomputer list released in June 2008 and was ranked No. 13 in the list released in November 2010.
Intrepid has been in full production since the beginning of 2009. The system is used primarily for scientific and engineering computing. The vast majority of the use is allocated to awardees of the DOE Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. For more information about the system at ANL, see URL http://www.alcf.anl.gov/resources/storage.php.
The workload log from ANL Intrepid was graciously provided by
Susan Coghlan (firstname.lastname@example.org) from ALCF at ANL and Narayan Desai
(email@example.com) from MCS at ANL.
It was converted to SWF and made available by Wei Tang
(firstname.lastname@example.org) from Illinois Institute of Technology.
If you use this log in your work, please cite the following paper:
System Environment
Intrepid (Blue Gene/P) – the ALCF production machine for open science research
The log contains the first 8 months of workload on the 40-rack production Intrepid. Each rack houses 1024 nodes, representing 4096 processor cores and 2 TB of memory. Like other Blue Gene/P systems, Intrepid groups nodes into partitions. Each job is executed in a separate partition. In 8 racks the minimal partition size is 64 nodes (256 cores). In the rest the minimal size is 512 nodes (2048 cores). Partitions of fewer than 512 nodes are used only for development jobs. Scheduling is performed with the Cobalt resource management system. For more information about Cobalt, see URL http://trac.mcs.anl.gov/projects/cobalt/.
In parallel to the job log available here, there is also a RAS log on the Computer Failure Data Repository. This enables the joint analysis of how failures affect jobs.
The available file contains one line per completed job in the SWF format. The valid fields are:
1 Job number
2 Submit time (in seconds)
3 Wait time (in seconds)
4 Running time (in seconds)
5 Number of allocated processors
8 Requested number of processors
9 Requested running time (in seconds)
12 User ID
15 Queue number
Field 5 is the number of processors allocated to the job, which is greater than or equal to field 8 and corresponds to the partition size. All told, 30,948 jobs in the log got more processors than they requested.
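The valid fields listed above can be extracted with a short script. The following is a minimal sketch, assuming the usual SWF layout of 18 whitespace-separated fields per job line, with header and comment lines starting with ";". The names Job, parse_swf and count_overallocated are illustrative, and the two lines in SAMPLE are made-up records, not data from the log.

```python
from typing import Iterable, Iterator, NamedTuple


class Job(NamedTuple):
    job_id: int        # field 1
    submit: int        # field 2, seconds
    wait: int          # field 3, seconds
    run: int           # field 4, seconds
    alloc_procs: int   # field 5
    req_procs: int     # field 8
    req_time: int      # field 9, seconds
    user: int          # field 12
    queue: int         # field 15


def _num(text: str) -> int:
    # Go through float() so that values written as "100.0" are tolerated.
    return int(float(text))


def parse_swf(lines: Iterable[str]) -> Iterator[Job]:
    """Yield one Job per data line, skipping blank and ';' comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        f = line.split()
        # SWF field numbers are 1-based, Python indices are 0-based.
        yield Job(_num(f[0]), _num(f[1]), _num(f[2]), _num(f[3]),
                  _num(f[4]), _num(f[7]), _num(f[8]), _num(f[11]),
                  _num(f[14]))


def count_overallocated(jobs: Iterable[Job]) -> int:
    """Count jobs that were allocated more processors than they requested."""
    return sum(1 for j in jobs if j.alloc_procs > j.req_procs)


# Made-up sample records for illustration only.
SAMPLE = """\
; Comment lines start with a semicolon
1 0 10 100 2048 -1 -1 2048 600 -1 -1 1 -1 -1 0 -1 -1 -1
2 5 20 50 4096 -1 -1 1024 300 -1 -1 2 -1 -1 1 -1 -1 -1
"""

if __name__ == "__main__":
    print(count_overallocated(parse_swf(SAMPLE.splitlines())))
```

To apply the same count to the real log, pass an open file handle to parse_swf instead of the sample lines.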
When allocating nodes, the Cobalt scheduler chooses the smallest partition that can accommodate the job. This has two aspects. First, the partition must have enough processors. Rounding the number of processors up to an available partition size leads to fragmentation. Second, the partition must have enough memory. Specifically, if a job needs more than the 512 MB available per core, the price is to allocate more nodes and leave some of the cores idle. In particular, quite a few jobs get a full node for each process, so the allocated processors are 4 times as many as the requested ones. In some cases this difference was extreme; for example, there are jobs that requested 40,960 processors but were allocated the full 163,840 processors in the machine. Such allocations provide indirect information about memory requirements.
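The two constraints can be illustrated with a small calculation. This is a rough sketch and not Cobalt's actual algorithm: it assumes one process per requested processor, 4 cores and 2048 MB per node (4 x 512 MB, as stated above), and a hypothetical list of partition sizes (powers of two from 64 nodes up, plus the full 40,960-node machine). The function names are illustrative.

```python
import math

CORES_PER_NODE = 4        # quad-core nodes
MEM_PER_NODE_MB = 2048    # 4 cores x 512 MB per core

# Assumed partition sizes in nodes; the real configuration may differ.
PARTITION_SIZES = [64, 128, 256, 512, 1024, 2048, 4096,
                   8192, 16384, 32768, 40960]


def nodes_needed(processes: int, mem_per_process_mb: int) -> int:
    """Nodes required so that every process gets a core and enough memory."""
    procs_per_node = min(CORES_PER_NODE, MEM_PER_NODE_MB // mem_per_process_mb)
    if procs_per_node < 1:
        raise ValueError("a single process does not fit in one node's memory")
    return math.ceil(processes / procs_per_node)


def smallest_partition(nodes: int, sizes=PARTITION_SIZES) -> int:
    """Smallest partition (in nodes) that holds the requested node count."""
    for size in sorted(sizes):
        if size >= nodes:
            return size
    raise ValueError("request exceeds the largest partition")


def allocated_cores(processes: int, mem_per_process_mb: int) -> int:
    """Cores in the partition chosen for the job under the assumptions above."""
    nodes = nodes_needed(processes, mem_per_process_mb)
    return smallest_partition(nodes) * CORES_PER_NODE


if __name__ == "__main__":
    # 40,960 processes that each need a full node's memory fill the machine.
    print(allocated_cores(40960, 2048))
    # The same job at 512 MB per process packs 4 processes per node.
    print(allocated_cores(40960, 512))
```

Under these assumptions, a ratio of 4 between fields 5 and 8 suggests that each process needed more than 1024 MB, since two processes could otherwise have shared a node. Smaller ratios are ambiguous, because rounding up to a partition size also inflates the allocation.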