The ParPar Project

The goal of the ParPar project is to provide a platform for research
on parallel systems.
Its premise is that such a system can be built entirely in software,
based on off-the-shelf components.
Thus the hardware base is a set of PCs with a fast local area network,
and the software is based on a Unix kernel on each node.
The main software components and their interactions are depicted in
the following figure:

- Masterd - The master daemon.
This is the heart of the system.
It is responsible for configuration management, resource management,
etc.
There is one masterd in the system, running on the host node.
- Noded - The node daemon.
It is responsible for local activities on the node, e.g. the spawning
of sproc's.
There is one on each node.
- Master Control - A GUI used to interact with the masterd, in order to control the
system (e.g. shutdown) and obtain information (e.g. about load conditions).
- Job Rep - A GUI used to interact with the masterd in order to submit a job.
It also connects directly to the sproc's in order to display standard
I/O.
- sproc - A "sub-process", i.e. a constituent of a parallel job (or xproc, for
"extended process").
For more information, or if you want to do some work on this project,
contact Dror Feitelson at
feit@cs.huji.ac.il.
You can also download the
full design document (currently in
version 0.2, at 77 pages and 140 KB; we're working on an updated version).
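As a rough illustration of the component roles listed above (one masterd for the whole system, one noded per node, and sproc's as the constituents of a job), here is a toy in-process model. It is only a sketch and not ParPar's actual code: the class names mirror the component names, but the methods (`spawn`, `submit`, `load`), the one-sproc-per-node placement policy, and the node names are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Sproc:
    """One constituent process of a parallel job (hypothetical model)."""
    job_id: int
    rank: int
    node: str


@dataclass
class Noded:
    """Per-node daemon: handles local activities such as spawning sprocs."""
    node: str
    sprocs: List[Sproc] = field(default_factory=list)

    def spawn(self, job_id: int, rank: int) -> Sproc:
        sproc = Sproc(job_id=job_id, rank=rank, node=self.node)
        self.sprocs.append(sproc)
        return sproc


class Masterd:
    """Single master daemon: tracks the nodes and places jobs on them."""

    def __init__(self, node_names: List[str]) -> None:
        # One noded per node, keyed by node name.
        self.nodeds: Dict[str, Noded] = {n: Noded(n) for n in node_names}
        self.next_job_id = 0

    def submit(self, size: int) -> List[Sproc]:
        # Assumed placement policy: one sproc per node, in name order.
        if size > len(self.nodeds):
            raise ValueError("job needs more nodes than are configured")
        job_id = self.next_job_id
        self.next_job_id += 1
        chosen = sorted(self.nodeds)[:size]
        return [self.nodeds[name].spawn(job_id, rank)
                for rank, name in enumerate(chosen)]

    def load(self) -> Dict[str, int]:
        # The kind of information a control GUI might ask for.
        return {name: len(d.sprocs) for name, d in self.nodeds.items()}
```

Under the assumed policy, `Masterd(["n0", "n1", "n2"]).submit(2)` places ranks 0 and 1 on nodes n0 and n1, and `load()` then reports one sproc on each of those nodes and none on n2.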
Publications
- Dror G. Feitelson, Anat Batat, Gabriel Benhanokh, David Er-El,
Yoav Etsion, Avi Kavas, Uri Lublin, and Marc A. Volovic,
ParPar Design Document Version 0.3.
Perpetually in preparation (and version 0.2 is outdated).
- Dror G. Feitelson, Anat Batat, Gabriel Benhanokh, David Er-El,
Yoav Etsion, Avi Kavas, Tomer Klainer, Uri Lublin, and Marc A. Volovic,
The ParPar System: A Software MPP.
In High Performance Cluster Computing, Vol 1: Architectures and
Systems, Rajkumar Buyya (Ed.), pp. 754-770, Prentice-Hall, 1999 (108 KB).
Translated
into Chinese.
- Dror G. Feitelson,
Exception Propagation in the ParPar System.
Manuscript, 1998 (51 KB).
- A. Batat and D. G. Feitelson,
``Gang Scheduling with Memory
Considerations''.
In 14th Intl. Parallel and Distributed Processing Symp.,
pp. 109-114, May 2000. (115 KB)
©Copyright 2000 by IEEE.
Definitive version available from the
IEEE Computer Society Digital Library.
Original extended version available as
Technical Report 99-33 (161 KB).
- Yoav Etsion, Mickael Raizman, and Dror G. Feitelson,
Topology and Routing in Clusters: From Theory to
Practice.
Technical Report 99-??, Dec 1999 (266 KB).
- Avi Kavas, David Er-El, and Dror G. Feitelson,
``Using multicast to pre-load jobs on the ParPar cluster''.
Parallel Computing 27(3) pp. 315-327, Feb 2001.
Preliminary version available as Technical Report
98-14, Inst. Computer Science, The Hebrew University of Jerusalem,
Dec 1998 (54 KB).
- Y. Etsion and D. G. Feitelson,
``User-Level Communication
in a System with Gang Scheduling''.
In 15th Intl. Parallel and Distributed Processing Symp.,
Apr 2001.
©Copyright 2001 by IEEE,
but the definitive version has not been made available on-line yet!
Original extended version available as
Technical Report 2000-39 (110 KB).
- A. Kavas and D. G. Feitelson,
``Comparing Windows NT, Linux, and QNX as the Basis for Cluster Systems''.
Concurrency and Computation -- Practice and Experience
13(15) pp. 1303-1332, Dec 2001.
Definitive version available from
Wiley InterScience.
Original version available as a Technical
Report (75 KB).
- D. G. Feitelson and T. Klainer,
``XML, Hyper-media, and Fortran I/O''.
In High Performance Mass Storage and Parallel I/O:
Technologies and Applications,
H. Jin, T. Cortes, and R. Buyya (Eds.), pp. 633-644, IEEE Press and Wiley, 2001. (105 KB)
- Y. Wiseman and D. G. Feitelson,
``Paired Gang Scheduling''.
IEEE Trans. Parallel & Distributed Syst., 14(6),
pp. 581-592, Jun 2003. (932 KB)
©Copyright 2003 by IEEE.
Definitive version available from the
IEEE Computer Society Digital Library.
We also have a ParPar photo album (about 145 KB
in total).
Acknowledgements
- Dror Feitelson - general design and miscellaneous pieces of
implementation
- Marc Volovic - initial design and implementation of infrastructure
- Anat Batat - job representative, memory considerations in gang scheduling
- Uri Lublin - gang scheduling support
- Dudi Er-El - RDGM design and implementation
- Avi Kavas - RDGM design and implementation, file transfer, study
of base kernels
- Gabi Benhanokh - general support, reimplementation of masterd in
C++
- Yoav Etsion - integration of FM, integration of communication and
gang scheduling
- Noam Lotner - general support
- Dan Tsafrir - gang scheduling extension for evolving jobs
- Niva Aldema - parallel debugger interface
- Hani Mazar - master control interface
- Mickael Raizman - FM routing and network topology
- Eitan Frachtenberg - flexible gang scheduling and dynamic
identification of gangs
- Tomer Klainer - parallel I/O to partitioned matrices with logical
caching
- Avraham Fraenkel - SGML-based parallel file system