Parallel Workloads Archive: Intel Netbatch

The Intel Netbatch logs

System: Intel Netbatch Grid
Duration: November 2012
Jobs: 48,821,850

These logs contain one month's worth of accounting records from the Intel Netbatch grid. This grid is composed of multiple clusters (called physical pools) in different locations around the world, each with tens of thousands of nodes. The data comes from four such pools: three on the US west coast and one in Israel.

The Intel Netbatch workload logs were graciously provided by Ohad Shai, Edi Shmueli, and Nir Antebi from Intel. If you use this log in your work, please use a similar acknowledgment. You can also cite the following paper where they were introduced:

Ohad Shai, Edi Shmueli, and Dror G. Feitelson, “Heuristics for resource matching in Intel's compute farm”. In Job Scheduling Strategies for Parallel Processing, Walfredo Cirne and Narayan Desai (eds.), Springer-Verlag, 2013.

Downloads:

Intel-NetbatchA-2012-0 592 MB gz original log
Intel-NetbatchB-2012-0 475 MB gz original log
Intel-NetbatchC-2012-0 602 MB gz original log
Intel-NetbatchD-2012-0 422 MB gz original log
Intel-NetbatchA-2012-1 142 MB gz converted log
Intel-NetbatchB-2012-1 139 MB gz converted log
Intel-NetbatchC-2012-1 126 MB gz converted log
Intel-NetbatchD-2012-1 93 MB gz converted log

Papers Using this Log:

This log was used in the following papers:
[shai13] [wang18]

System Environment

The Intel Netbatch grid is composed of multiple physical pools with different numbers of nodes, typically tens of thousands. Exact details have not been disclosed.

The vast majority of jobs are serial. However, there are also some parallel MPI jobs. Scheduling is done by the in-house Netbatch system.

Log Format

The original logs are available as Intel-NetbatchX-2012-0 (where "X" is A, B, C, or D for the four different pools).

These files are in CSV format. They contain one line per completed job with the following comma-separated fields:

Conversion Notes

The converted logs are available as Intel-NetbatchX-2012-1 (where "X" is A, B, C, or D for the four different pools). The conversion from the original format to SWF was done subject to the following. The conversion was done by a log-specific parser in conjunction with a more general converter module.
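
As an illustration of working with the converted logs, the following is a minimal Python sketch of a streaming SWF reader. It assumes the usual Standard Workload Format layout (header comment lines starting with ";", then one job per line with 18 whitespace-separated fields, and -1 for unknown values). The names SwfJob, open_swf, and parse_swf are hypothetical helpers, and the local file name in the usage comment is an assumption about how the download was saved.

```python
import gzip
from typing import Iterable, Iterator, NamedTuple, TextIO


class SwfJob(NamedTuple):
    """A subset of the 18 SWF fields (hypothetical helper type)."""
    job_id: int           # field 1
    submit_time: int      # field 2, seconds from the start of the log
    wait_time: int        # field 3, seconds
    run_time: int         # field 4, seconds
    allocated_procs: int  # field 5
    user_id: int          # field 12


def open_swf(path: str) -> TextIO:
    """Open a plain or gzip-compressed SWF file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def _to_int(text: str) -> int:
    # Tolerate values written with a decimal point.
    return int(float(text))


def parse_swf(lines: Iterable[str]) -> Iterator[SwfJob]:
    """Yield one SwfJob per data line, skipping ';' comments and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        f = line.split()
        if len(f) < 18:
            continue  # malformed line; skipped in this sketch
        yield SwfJob(_to_int(f[0]), _to_int(f[1]), _to_int(f[2]),
                     _to_int(f[3]), _to_int(f[4]), _to_int(f[11]))


# Usage (file name is an assumption):
# with open_swf("Intel-NetbatchA-2012-1.swf.gz") as fh:
#     serial = sum(1 for job in parse_swf(fh) if job.allocated_procs == 1)
```

Reading the file as a stream rather than loading it whole is a deliberate choice here, since each pool's log holds millions of jobs.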

Usage Notes

The log from pool D appears to have a couple of flurries. In other pools some users are much more active than others, but this is not concentrated in a short time span.
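
One rough way to look for such flurries is to count jobs per user per day and flag users who dominate a day's submissions. The sketch below is only a heuristic under stated assumptions: it takes (submit_time, user_id) pairs with times in seconds from the start of the log, and the thresholds share and min_jobs are arbitrary illustrative values, not criteria used by the archive.

```python
from collections import Counter

SECONDS_PER_DAY = 86400


def jobs_per_user_day(jobs):
    """Count jobs for each (user_id, day index) pair.

    `jobs` is an iterable of (submit_time, user_id) tuples.
    """
    counts = Counter()
    for submit_time, user_id in jobs:
        counts[(user_id, submit_time // SECONDS_PER_DAY)] += 1
    return counts


def candidate_flurries(jobs, share=0.5, min_jobs=1000):
    """Return sorted (day, user_id, job_count) triples for users who
    submitted at least `min_jobs` jobs and at least `share` of all jobs
    on that day. Both thresholds are hypothetical."""
    per_user_day = jobs_per_user_day(jobs)
    per_day = Counter()
    for (user_id, day), n in per_user_day.items():
        per_day[day] += n
    return sorted(
        (day, user_id, n)
        for (user_id, day), n in per_user_day.items()
        if n >= min_jobs and n >= share * per_day[day]
    )
```

A user who is merely very active over the whole month would show up on many days with a moderate share, while a flurry would appear as a few days with an unusually high count.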

The Logs in Graphics

Note: the zig-zag pattern in the utilization and offered load graphs is an artifact of showing only the maximal and minimal values for each day. The utilization is the number (and not the fraction) of concurrently allocated cores.
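
To make the definition concrete, the following Python sketch computes the daily minimum and maximum number of concurrently allocated cores from a list of jobs. It is an illustration only, not the code used to produce the graphs. It assumes each job is given as (start_time, run_time, cores) with times in seconds from the start of the log, where the start time would be the submit time plus the wait time in SWF terms; the function name is hypothetical.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


def daily_core_extremes(jobs):
    """Return {day_index: (min_cores, max_cores)} of concurrently
    allocated cores.

    `jobs` is an iterable of (start_time, run_time, cores) tuples.
    Days before the first event or after the last one are not reported.
    """
    # Net change in allocated cores at each time stamp.
    deltas = defaultdict(int)
    for start, run, cores in jobs:
        if run <= 0 or cores <= 0:
            continue  # ignore jobs with missing or zero values
        deltas[start] += cores
        deltas[start + run] -= cores

    extremes = {}
    level = 0    # cores allocated just after the previous event
    prev = None  # time of the previous event
    for t in sorted(deltas):
        if prev is not None:
            # `level` held over [prev, t); record it for each day touched.
            first_day = prev // SECONDS_PER_DAY
            last_day = (t - 1) // SECONDS_PER_DAY
            for day in range(first_day, last_day + 1):
                lo, hi = extremes.get(day, (level, level))
                extremes[day] = (min(lo, level), max(hi, level))
        level += deltas[t]
        prev = t
    return extremes
```

Plotting the minimum and maximum of each day in sequence, as alternating points, would give the kind of zig-zag line described in the note.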

File Intel-NetbatchA-2012-1.swf

[Graphs: weekly cycle; daily cycle; burstiness and active users; job size and runtime histograms; job size vs. runtime scatterplot; utilization; offered load]

File Intel-NetbatchB-2012-1.swf

[Graphs: weekly cycle; daily cycle; burstiness and active users; job size and runtime histograms; job size vs. runtime scatterplot; utilization; offered load]

File Intel-NetbatchC-2012-1.swf

[Graphs: weekly cycle; daily cycle; burstiness and active users; job size and runtime histograms; job size vs. runtime scatterplot; utilization; offered load]

File Intel-NetbatchD-2012-1.swf

[Graphs: weekly cycle; daily cycle; burstiness and active users; job size and runtime histograms; job size vs. runtime scatterplot; utilization; offered load]

