Parallel Workloads Archive: SHARCNET

The SHARCNET log

System: Set of 10 clusters in Ontario, Canada
Duration: December 2005 through January 2007
Jobs: 1,195,242

This log contains up to a year's worth of accounting records from the SHARCNET clusters installed at several academic institutions in Ontario, Canada. SHARCNET is not really a grid, because jobs mostly run locally at the site where they were submitted. For more information about this installation, see http://www.sharcnet.ca.

The MaxProcs and MaxNodes noted in the log are summed over all the clusters. Because different clusters became active at different times, the utilization cannot be calculated reliably. This also implies that it may be inadvisable to use the whole log as is for simulations. The numbers of jobs and users are totals for all the clusters.
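One way to see when each cluster became active is to group the converted log by partition before using it. The following is a minimal sketch, not part of the archive's tooling. It assumes the standard 18-field SWF layout (header lines start with `;`, field 2 is the submit time, field 16 is the partition number) and further assumes that the partition number identifies the cluster. The function name `summarize_partitions` is hypothetical.

```python
PARTITION_FIELD = 15  # zero-based index of SWF field 16 (partition number)
SUBMIT_FIELD = 1      # zero-based index of SWF field 2 (submit time, seconds)


def summarize_partitions(lines):
    """Return {partition: (job_count, first_submit, last_submit)}.

    Assumption: each partition number corresponds to one cluster.
    """
    summary = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and SWF header comments
        fields = line.split()
        partition = int(fields[PARTITION_FIELD])
        submit = int(fields[SUBMIT_FIELD])
        count, first, last = summary.get(partition, (0, submit, submit))
        summary[partition] = (count + 1, min(first, submit), max(last, submit))
    return summary
```

Passing the open, decompressed SWF file to `summarize_partitions` would give, per partition, the number of jobs and the span of submit times. Partitions whose first submit time is much later than the start of the log are the ones that make whole-log utilization figures unreliable.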

In order to make the data more useful, a file containing the jobs on the Whale cluster is provided separately. This is the largest and most highly loaded cluster in the system.

The SHARCNET workload log was graciously provided by John Morton (john@sharcnet.ca) and Clayton Chrusch (chrusch@sharcnet.ca), who also helped with background information and interpretation. If you use this log in your work, please use a similar acknowledgment.

Downloads:

SHARCNET-2005-0             31 MB gz   original log
SHARCNET-2005-2.swf         18 MB gz   converted log
SHARCNET-Whale-2006-2.swf   9 MB gz    converted log, only Whale cluster
SHARCNET-2005-1.swf         18 MB gz   OLD VERSION of converted log (replaced 7 Dec 2011)

Papers Using this Log:

This log was used in the following papers: [feitelson07a] [amar08a] [amar08b] [thebe09] [yuan11] [carvalho12] [di12] [krakov12] [skowron13] [bacso14] [feitelson14] [jackson14]

System Environment

The 10 clusters making up SHARCNET are:
number  name       processors  nodes
1       bruce      128         32 x 4 x Opteron
2       narwhal    1068        267 x 4 x Opteron dual core
3       tiger      128         32 x 4 x Opteron
4       bull       384         96 x 4 x Opteron
5       megaladon  128         32 x 4 x Opteron
6       dolphin    128         32 x 4 x Opteron
7       requin     1536        768 x 2 x Opteron
8       whale      3072        768 x 4 x Opteron
9       zebra      128         32 x 4 x Opteron
10      bala       128         32 x 4 x Opteron
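As a quick consistency check, the table above can be totalled. The sketch below copies the processor and node counts from the table. The variable names are illustrative, and whether these totals equal the MaxProcs and MaxNodes values in the log header has not been verified here.

```python
# (name, processors, nodes, processors per node), copied from the table above
CLUSTERS = [
    ("bruce", 128, 32, 4),
    ("narwhal", 1068, 267, 4),
    ("tiger", 128, 32, 4),
    ("bull", 384, 96, 4),
    ("megaladon", 128, 32, 4),
    ("dolphin", 128, 32, 4),
    ("requin", 1536, 768, 2),
    ("whale", 3072, 768, 4),
    ("zebra", 128, 32, 4),
    ("bala", 128, 32, 4),
]

# Each row should satisfy processors == nodes * processors per node.
for name, procs, nodes, per_node in CLUSTERS:
    assert procs == nodes * per_node, name

total_procs = sum(procs for _, procs, _, _ in CLUSTERS)
total_nodes = sum(nodes for _, _, nodes, _ in CLUSTERS)
print(total_procs, total_nodes)  # 6828 2091
```

Whale alone accounts for 3072 of the 6828 processors, which is consistent with its description as the largest cluster.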

Jobs are submitted to each cluster using a set of queues. These queues have different limits on job sizes and run times. The configuration of queues for each cluster is different.

Note that all clusters use quad-processor nodes, except requin, which uses dual-processor nodes. However, processors are allocated individually, so several different jobs may be running on the same node.
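To illustrate how per-processor allocation leads to node sharing, the sketch below packs jobs onto quad nodes one processor at a time. This is a hypothetical in-order placement chosen only for illustration; the actual placement policy of the SHARCNET schedulers is not described here, and the function name `place_jobs` is an assumption.

```python
def place_jobs(job_sizes, procs_per_node=4):
    """Assign processors to jobs in order, filling nodes one after another.

    Returns {node_index: [job_id, ...]} with one entry per allocated processor.
    """
    placement = {}
    next_proc = 0  # index of the next free processor across all nodes
    for job_id, size in enumerate(job_sizes):
        for _ in range(size):
            node = next_proc // procs_per_node
            placement.setdefault(node, []).append(job_id)
            next_proc += 1
    return placement


# Three jobs needing 3, 2, and 3 processors on quad nodes
print(place_jobs([3, 2, 3]))  # {0: [0, 0, 0, 1], 1: [1, 2, 2, 2]}
```

In this example job 1 spans both nodes and shares each of them with another job, which is the situation the note above describes.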

When started, jobs normally run to completion. However, the clusters support a "test" queue which has priority: jobs in this queue may cause running jobs to be suspended, and run in their place.

Log Format

The original log is available as SHARCNET-2005-0.

This file contains one line per completed job with the following comma-separated fields:

Conversion Notes

The converted log is available as SHARCNET-2005-2.swf. The conversion from the original format to SWF was done subject to the following. The differences between conversion 2 (reflected in SHARCNET-2005-2.swf) and conversion 1 (reflected in SHARCNET-2005-1.swf) are

The conversion was done by a log-specific parser in conjunction with a more general converter module.

The Log in Graphics

File SHARCNET-2005-2.swf

Graphs available: weekly cycle, daily cycle, burstiness and active users, job size and runtime histograms, job size vs. runtime scatterplot, activity in different partitions.

File SHARCNET-Whale-2006-2.swf

Graphs available: weekly cycle, daily cycle, burstiness and active users, job size and runtime histograms, job size vs. runtime scatterplot, utilization, offered load, performance.

