Parallel Workloads Archive: PIK IPLEX

The Potsdam Institute for Climate Impact Research (PIK) IBM iDataPlex Cluster log

System: 320-node IBM iDataPlex Cluster
Duration: April 2009 through July 2012
Jobs: 742,964

This log contains more than 3 years' worth of accounting records from the 320-node IBM iDataPlex cluster at the Potsdam Institute for Climate Impact Research (PIK) in Germany. For more information about this installation, see http://www.pik-potsdam.de/services/it/hpc

The log starts from when the machine was installed and first put into use, so the initial part may not include representative workload data.

The workload log from the PIK IPLEX was graciously provided by Ciaron Linstead (linstead@pik-potsdam.de), who also helped with background information and interpretation. If you use this log in your work, please use a similar acknowledgment.

Downloads:

PIK-IPLEX-2009-0.tgz   156 MB (gzipped)   original monthly logs
PIK-IPLEX-2009-1.swf    10 MB (gzipped)   converted log

There is no cleaned version of this log yet.

Papers Using this Log:

This log was used in the following papers:
[skowron13] [zakay14] [feitelson14] [lic14]

System Environment

Each of the cluster's 320 nodes is configured with 2 quad-core processors, for a total of 8 cores per node and 2560 cores in the whole system. Each node also has 32 GB of memory, shared by its 8 cores.

The nodes are connected by an InfiniBand DDR interconnect. They are divided into two network domains of 160 nodes each, and jobs cannot span both domains. The largest job observed in the log used 128 nodes (1024 cores).

The system has 800 TB total disk space.

Scheduling is performed by LoadLeveler with the backfilling scheduler option. Nodes may be shared by several jobs or allocated exclusively to a single job. A job requests non-shared access either because it uses all the cores (that is, it runs 8 tasks on the node) or because it uses all the physical memory (a single task that needs a lot of memory gets the whole node and leaves 7 cores idle). The system does not run more than one user process per core.
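
To illustrate the backfilling policy, here is a minimal sketch of EASY-style backfilling in Python. It is not the actual LoadLeveler implementation; the job attributes, the single free-core pool, and using only the "finishes before the reservation" backfill condition are simplifying assumptions.

    from dataclasses import dataclass

    @dataclass
    class Job:
        name: str
        cores: int        # cores requested
        est_runtime: int  # user's runtime estimate, in seconds

    def easy_backfill(queue, free_cores, running, now=0):
        """Decide which queued jobs to start now.

        `running` is a list of (estimated_finish_time, cores) pairs
        for jobs already executing. Returns the jobs to launch.
        """
        queue = list(queue)
        started = []
        # FCFS: start jobs from the head of the queue while they fit.
        while queue and queue[0].cores <= free_cores:
            job = queue.pop(0)
            free_cores -= job.cores
            started.append(job)
        if not queue:
            return started
        # The head job does not fit, so give it a reservation: the
        # earliest time by which enough running jobs will have freed
        # their cores.
        head = queue[0]
        avail = free_cores
        reservation = now
        for finish, cores in sorted(running):
            avail += cores
            if avail >= head.cores:
                reservation = finish
                break
        # Backfill: a later job may jump ahead if it fits in the cores
        # free right now AND is estimated to finish before the
        # reservation, so the head job is never delayed.
        for job in queue[1:]:
            if job.cores <= free_cores and now + job.est_runtime <= reservation:
                free_cores -= job.cores
                started.append(job)
        return started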

Of the 32 GB of memory on each node, 6 GB are reserved for the operating system, leaving 26 GB for user processes (28672000 KB). If 8 processes run on the node, each can get 3500 MB (3584000 KB).
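
The per-process figure follows directly from the per-node one; a small illustrative computation (not part of any conversion tool):

    # Per-node memory available to user processes, in KB (from the
    # system configuration quoted above).
    NODE_USER_MEM_KB = 28672000
    CORES_PER_NODE = 8

    # With one user process per core, each process gets an equal share.
    per_process_kb = NODE_USER_MEM_KB // CORES_PER_NODE
    print(per_process_kb)          # 3584000 KB
    print(per_process_kb // 1024)  # 3500 MB, taking 1 MB = 1024 KB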

Log Format

The original log is a set of files, one per month, generated by the LoadLeveler llsummary -l command.

The files contain a multi-line stanza for each job, where each line is a field:value pair. Jobs may further be composed of several steps, which are in effect a sequence of jobs that are executed one after the other. If a job includes multiple steps, there will be a separate stanza describing each one.
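
As a sketch of how such files might be parsed (the blank-line stanza separator and any field names are assumptions; the actual llsummary -l labels may differ):

    def parse_stanzas(path):
        """Parse a file of multi-line stanzas, one per job step.

        Each non-blank line is taken to be a 'Field: value' pair, and
        stanzas are assumed to be separated by blank lines.
        """
        stanzas = []
        current = {}
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line:
                    if current:
                        stanzas.append(current)
                        current = {}
                    continue
                field, _, value = line.partition(":")
                current[field.strip()] = value.strip()
        if current:
            stanzas.append(current)
        return stanzas

Steps belonging to the same job can then be grouped by whatever field carries the job identifier before converting to one record per step or per job.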

Conversion Notes

The converted log is available as PIK-IPLEX-2009-1.swf. The conversion from the original format to SWF was done by a log-specific parser in conjunction with a more general converter module.
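
The Standard Workload Format itself is simple: header lines start with ';', and every job (or job step) is a single line of 18 whitespace-separated fields. A minimal reader sketch in Python, using the field order from the SWF definition:

    from collections import namedtuple

    # The 18 SWF fields, in order.
    SwfJob = namedtuple("SwfJob", [
        "job", "submit", "wait", "runtime", "procs", "avg_cpu", "mem",
        "req_procs", "req_time", "req_mem", "status", "user", "group",
        "app", "queue", "partition", "prev_job", "think_time",
    ])

    def read_swf(path):
        """Yield one SwfJob per data line, skipping ';' header comments."""
        with open(path) as f:
            for line in f:
                line = line.strip()
                if not line or line.startswith(";"):
                    continue
                yield SwfJob(*map(float, line.split()))

    # For example, count the jobs and find the largest allocation:
    # jobs = list(read_swf("PIK-IPLEX-2009-1.swf"))
    # print(len(jobs), max(j.procs for j in jobs))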

The Log in Graphics

File PIK-IPLEX-2009-1.swf

The archive page shows plots of the weekly cycle, daily cycle, burstiness and active users, job size and runtime histograms, a job size vs. runtime scatterplot, utilization, offered load, and performance.

