Parallel Workloads Archive: ANL Intrepid

The ANL Intrepid log

System:	Blue Gene/P (Intrepid) at ANL
Duration:	Jan 2009 to Sept 2009
Jobs:	68,936

This log contains several months' worth of accounting records from a large Blue Gene/P system called Intrepid. Intrepid is a 557 TF, 40-rack Blue Gene/P system deployed at the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. The system comprises 40,960 quad-core nodes (163,840 cores in total), together with associated I/O nodes, storage servers, and an I/O network. It debuted as No. 3 in the TOP 500 supercomputer list released in June 2008 and was ranked No. 13 in the list released in November 2010.

Intrepid has been in full production since the beginning of 2009. The system is used primarily for scientific and engineering computing. The vast majority of its time is allocated to awardees of the DOE Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program. For more information about the system at ANL, see http://www.alcf.anl.gov/resources/storage.php.

The workload log from ANL Intrepid was graciously provided by Susan Coghlan (smc@alcf.anl.gov) from ALCF at ANL and Narayan Desai (desai@mcs.anl.gov) from MCS at ANL. It was converted to SWF and made available by Wei Tang (wtang6@iit.edu) from Illinois Institute of Technology. If you use this log in your work, please cite the following paper:
W. Tang, Z. Lan, N. Desai, D. Buettner, and Y. Yu, “Reducing Fragmentation on Torus-Connected Supercomputers”. In Proc. IEEE Intl. Parallel & Distributed Processing Symp., pp. 828-839, May 2011.

Downloads:

ANL-Intrepid-2009-1.swf	0.9 MB gz	converted log
failure data	40 MB zip	RAS log available from the CFDR

(May need to click with right mouse button to save to disk)

System Environment

Intrepid (Blue Gene/P) – the ALCF production machine for open science research

40,960 quad-core nodes
163,840 cores available for computation
80 terabytes memory (2GB per node, 512MB per core)
557 teraflops
640 additional I/O nodes

The log contains the first 8 months' workload on the 40-rack production Intrepid. Each rack houses 1024 nodes, representing 4096 processor cores and 2TB of memory. Like other Blue Gene/P systems, Intrepid groups nodes into partitions. Each job is executed in a separate partition. In 8 of the racks the minimal partition size is 64 nodes (256 cores). In the remaining racks the minimal size is 512 nodes (2048 cores). Partitions of fewer than 512 nodes are used only for development jobs. Scheduling is performed with the Cobalt resource management system. For more information about Cobalt, see http://trac.mcs.anl.gov/projects/cobalt/.
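The figures quoted above are mutually consistent, which can be checked with a few lines of arithmetic. The sketch below is only a sanity check; the constant names are illustrative, and it assumes binary units (1 TB = 1024 GB, 1 GB = 1024 MB), which is what makes the stated totals come out exactly.

```python
# Sanity check of the system figures quoted in the text.
# Assumption: binary units (1 TB = 1024 GB, 1 GB = 1024 MB).
RACKS = 40
NODES_PER_RACK = 1024
CORES_PER_NODE = 4
MEM_PER_NODE_GB = 2

nodes = RACKS * NODES_PER_RACK                          # 40,960 nodes
cores = nodes * CORES_PER_NODE                          # 163,840 cores
total_mem_tb = nodes * MEM_PER_NODE_GB / 1024           # 80 TB
mem_per_core_mb = MEM_PER_NODE_GB * 1024 // CORES_PER_NODE  # 512 MB
rack_cores = NODES_PER_RACK * CORES_PER_NODE            # 4096 cores per rack
rack_mem_tb = NODES_PER_RACK * MEM_PER_NODE_GB / 1024   # 2 TB per rack


def partition_cores(partition_nodes: int) -> int:
    """Number of cores in a partition of the given number of nodes."""
    return partition_nodes * CORES_PER_NODE


if __name__ == "__main__":
    print(nodes, cores, total_mem_tb, mem_per_core_mb)
    print(partition_cores(64), partition_cores(512))
```

The two minimal partition sizes mentioned in the text, 64 and 512 nodes, correspond to 256 and 2048 cores respectively.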

In parallel to the job log available here, there is also a RAS log on the Computer Failure Data Repository. This enables the joint analysis of how failures affect jobs.

Papers Using this Log:

This log was used in the following papers:
[tang10] [tang11] [zheng11] [niu12] [krakov12] [kumar12] [tang13] [shih13] [zakay13] [deb13] [yang13] [zhou13] [deng13b] [rajbhandary13] [kumar14] [zakay14] [zakay14b] [feitelson14] [dorier14] [carastans17] [ntakpe17] [wang18] [hai20]

Log Format

The original log is not available.

The available file contains one line per completed job in the SWF format. The valid fields are:

1 Job number
2 Submit time (in seconds)
3 Wait time (in seconds)
4 Run time (in seconds)
5 Number of allocated processors
8 Requested number of processors
9 Requested running time (in seconds)
12 User ID
15 Queue Number
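The field list above can be turned into a small reader. The following is a minimal sketch, not an official tool: it assumes the usual SWF layout of whitespace-separated fields with header and comment lines starting with `;`, and with -1 marking unknown values. The names `Job`, `parse_swf_line`, and `read_swf` are made up for this illustration.

```python
import gzip
from typing import Iterator, NamedTuple, Optional


class Job(NamedTuple):
    """The SWF fields that are valid in this log (1-based field numbers)."""
    job_id: int       # field 1
    submit: int       # field 2, seconds
    wait: int         # field 3, seconds
    runtime: int      # field 4, seconds
    alloc_procs: int  # field 5
    req_procs: int    # field 8
    req_time: int     # field 9, seconds
    user: int         # field 12
    queue: int        # field 15


def parse_swf_line(line: str) -> Optional[Job]:
    """Parse one SWF line; return None for comments, blanks, or short lines."""
    line = line.strip()
    if not line or line.startswith(";"):
        return None
    fields = line.split()
    if len(fields) < 15:
        return None

    def num(i: int) -> int:
        # Convert the 1-based SWF field number i; tolerate values like "12.0".
        return int(float(fields[i - 1]))

    return Job(num(1), num(2), num(3), num(4), num(5),
               num(8), num(9), num(12), num(15))


def read_swf(path: str) -> Iterator[Job]:
    """Yield jobs from a plain or gzip-compressed SWF file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        for line in fh:
            job = parse_swf_line(line)
            if job is not None:
                yield job
```

A typical use would be `jobs = list(read_swf("ANL-Intrepid-2009-1.swf.gz"))`, where the exact name of the downloaded file is an assumption.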

Conversion Notes

The converted log is available as ANL-Intrepid-2009-1.swf. The conversion from the original format to SWF was done subject to the following.

30,948 jobs got more processors than they requested; see usage notes below. 12 jobs got fewer processors than they requested.
Requested time is the wallclock limit, not a precise estimate.
Normally, the runtime (Field 4) should be no larger than the requested time (Field 9). In the log, 12,241 jobs got more runtime than they requested, and in 9,096 cases the extra runtime was larger than 1 min. A very small number of jobs (~0.1%) got much more runtime than they requested; this was caused by control system failures.
The log includes all jobs that finished before Sept 1, 2009, 23:59 (GMT-6). Thus some of the jobs submitted on Sept 1 are not included in the log.
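The runtime counts in the notes above can be recomputed from fields 4 and 9. The sketch below shows one way to do so; the function name `overrun_counts` and the 60-second threshold parameter are choices made for this illustration, and records with unknown values (-1) are skipped as an assumption about how such records should be treated.

```python
from typing import Iterable, Tuple


def overrun_counts(pairs: Iterable[Tuple[int, int]],
                   slack_seconds: int = 60) -> Tuple[int, int]:
    """Count jobs whose runtime exceeds the requested time.

    pairs: (runtime, requested_time) tuples, i.e. SWF fields 4 and 9.
    Returns (jobs over the limit, jobs over by more than slack_seconds).
    """
    over = 0
    over_slack = 0
    for runtime, req_time in pairs:
        if runtime < 0 or req_time < 0:
            continue  # unknown value in SWF; skipped here by assumption
        extra = runtime - req_time
        if extra > 0:
            over += 1
            if extra > slack_seconds:
                over_slack += 1
    return over, over_slack
```

Applied to the (runtime, requested time) pairs of the whole log, the two returned numbers would be expected to correspond to the 12,241 and 9,096 jobs reported above, although this has not been verified here.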

Usage Notes

Field 8 is the requested number of processors which is provided by the user. It need not correspond to a partition size. This is most probably the number of processors the job will actually use.

Field 5 is the number of processors allocated to the job. It corresponds to the partition size and is normally larger than or equal to field 8 (the 12 exceptions are mentioned in the conversion notes above). All told, 30,948 jobs in the log got more processors than they requested.

When allocating nodes, the Cobalt scheduler chooses the smallest partition that can accommodate the job. This has two aspects. First, the partition should obviously have enough processors. Rounding the number of processors up to a possible partition size leads to fragmentation. Second, the partition should have enough memory. Specifically, if a job needs more than the 512 MB available per core, the price is that more nodes are allocated and some of the cores are left idle. In particular, quite a few jobs get a full node for each process, so the allocated processors are 4 times as many as the requested ones. In some cases this difference was extreme; for example, there are jobs that requested 40,960 processors but were allocated the full 163,840 processors in the machine. Such allocations provide indirect information about memory requirements.
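The relation between fields 8 and 5 can be explored with a simple classification of each job. The sketch below is illustrative only: the category names are invented here, and treating an exact factor of 4 as "one process per node" is an inference from the description above, not something recorded in the log.

```python
from collections import Counter
from typing import Iterable, Tuple

CORES_PER_NODE = 4  # quad-core Blue Gene/P nodes


def classify(req_procs: int, alloc_procs: int) -> str:
    """Classify a job by requested (field 8) vs. allocated (field 5) processors."""
    if req_procs <= 0 or alloc_procs <= 0:
        return "unknown"
    if alloc_procs < req_procs:
        return "fewer"
    if alloc_procs == req_procs:
        return "exact"
    if alloc_procs == CORES_PER_NODE * req_procs:
        # Consistent with one process per node; an inference, not a logged fact.
        return "one_process_per_node"
    return "rounded_up"


def idle_fraction(req_procs: int, alloc_procs: int) -> float:
    """Fraction of the allocated processors that the job did not request."""
    return (alloc_procs - req_procs) / alloc_procs


def summarize(pairs: Iterable[Tuple[int, int]]) -> Counter:
    """Count jobs per category for (requested, allocated) pairs."""
    return Counter(classify(req, alloc) for req, alloc in pairs)
```

For the extreme case mentioned above, `classify(40960, 163840)` falls in the one-process-per-node category and `idle_fraction(40960, 163840)` is 0.75, meaning three quarters of the allocated cores were not requested.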

The Log in Graphics

File ANL-Intrepid-2009-1.swf

Graphs available: weekly cycle, daily cycle, burstiness and active users, job size and runtime histograms, job size vs. runtime scatterplot, utilization, offered load, performance.
