Ex1 Practical: Script PARallel CaLculator : Part 1
Background:
At times a list of complex weakly interdependent computations has to
be performed. In this exercise you'll implement a version of such
parallel script calculator. Given a list of calculations your program
will perform them in parallel using multi-process paradigm.
Assignment:
Your assignment is to build a program sparcl that will implement the following service:
- Given a file name of a file containing a list of computations in
the format specified below, sparcl will
execute them in the order of appearance. Depending on the dependency
of required arguments, computations will be executed in complete
parallel (equivalent to 'background' of shell) or effectively
sequential ('foreground') modes.
The precise requirements are as following:
- Usage: sparcl <script_file_name>
- The calculation file specified by the sparcl's argument should conform to the following format:
- Each individual computation has the form:
<reg><whitespace><fun><whitespace>
<arg1><whitespace>...<whitespace><arg2>
- <reg> is one of the predefined variable names: R0,...,R9, T0,...,T3, A0, A1
- Arguments of calculation <arg_i> can be either <reg> name or explicit numbers
- Functions <fun> that should be recognized by sparcl are:
- Immediate: = (assignment), < (output to stdout - cout)
that require one argument and,
* (multiplication), / (division), + (sum), - (subtraction)
each requiring two arguments
- Extended:
- fact: factorial - one argument
- input: get a number - no arguments
- ilock: program waits indefinitely - no argument
- lcount: count 'ilock' processes - no argument
- lfree: removes (kills) all 'ilock' processes - one argument
- All extended functions are external programs, the return value of
the functions is equal to exit status of the respective process.
-
Each register R0,...,R9, T0,...,T3, A0, A1 has a state either busy or free.
- When the sparcl reads a command line it use the following
rules to do one of the following:
- Wait until all the registers that appear in the command line have
a free state.
- If the command is an internal command, then execute the command
and move to the next line.
- If the command is an external command, then mark the target
register in the command line with the busy state, translate argument
registers into numbers (we use pass by value semantics), spawn a child
process that executes the command, and move to the next line. sparcl
should identify when an external command ends, and should change the
state of the respective target register busy to free.
- If a computation is not blocked it should be performed - it is
possible that several extended functions will be running
simultaneously
- All computations have to be performed in maximally parallel way
(all extended functions are run initially in the background mode)
- All external function programs are placed at
~os03/www/Ex/Ex1/
- All computed values, as well as type of registers, are double
Input examples:
R1 = 0
R2 input
R0 + R1 R2
R1 = 5
R2 fact 3
R4 + R2 1
R5 * 5 R1
T0 = R1
R1 = R2
R2 = T0
Possible design pattern:
- The following design pattern is given as an example of a generic design pattern for your convenience.
You may choose to use any design you want (not necessarily this one). We will accept any design as long as you keep your design clear, simple and well documented.
- The main process (sparcl program) will hold a 'busy vector' of
process numbers, indexed by register numbers. Element 'r' of this
vector will be process ID of the background process that was produced by a
command that uses register 'r' as a target register. This process is termed the blocking process of register 'r'.
- Upon parsing of a command line with target register 'r', sparcl
will produce (if necessary) a background process and write it's
process ID to the 'busy vector' at index of 'r'
- Sparcl will check for dependencies before running a background
process. That is, whether there is any argument for the parsed command
which is currently busy (blocked by a process).
That is checking
'busy vector' for each register that exists in the parsed command line.
- If register 'r' needed by the command is 'busy', sparcl will
actively (in foreground) wait for the blocking process of
register 'r'. Notice that there may be more then one 'busy'
argument, thus sparcl will need to wait for more then one
process. This waiting procedure may be done sequentially.
- Upon termination of a blocking process, SIGCHLD will be
generated. Sparcl will include a SIGCHLD signal handler, which will
remove process IDs from 'busy vector', thus returning the register to the free state.
- Thus sparcl general loop may look something like this:
- Read next command
- Parse the command
- Check dependency of parsed command arguments on previous command target registers - in 'busy vector'
- If there are - actively wait for termination of all processes blocking the needed registers - using 'busy vector' information
- If all registers are clear (not blocked)
- If the command is immediate perform it.
- If the command is external - run it as a background process and mark command's target register as busy - put the process ID at the respective place in 'busy vector' (what should be done first? run a process or mark busy? does it matter?)
- Read next command.....
Notice the following extreme cases:
- New command has same target register as the previous one:
R0 fact 50
R0 = 4
In this case sparcl will run 'R0 fact 50' in background and then after
parsing 'R0 = 4' it will stop and wait for the process produced for
'R0 fact 50'.
- It is possible that immediately after checking if register Rk is
busy, sparcl will receive SIGCHLD signal generated by the blocking
process PID of the very same Rk register. In this case signal handler will remove the blocking process PID from 'busy vector', but after return from signal handler sparcl will still actively wait for PID.
....
R0 fact 50
~~~~~ generate process 1567 for example
~~~~~ write it's number into 'busy vector' at index of R0
....
R1 = R0
~~~~~~Check if R0 is busy - yes by process 1567
~~~~~~SIGCHLD of 1567 arrives
~~~~~~SIGCHLD handler removes 1567 from 'busy vector' at index of R0
~~~~~~SIGCHLD handler returns
~~~~~~sparcl attempts to wait for 1567, since it already checked 'busy vector' and holds positive answer
....
It is possible, but notice that if it happens active wait by sparcl
will immediately return, since process 1567 does not exists any longer
(or if it does in the system, is not a child of sparcl)
Recommendations and additional requirements:
- You are required to use fork() and execve() system
calls. However, you are advised to use the simplified version of
execve(), execvp(). This system call is well suited for specifying the
variable number of parameters and it automatically takes care of the
environment variables inherited from the real shell. Otherwise you
have to refer to the environment variables using the global variable
environ in order to allow relative command names.
- You are NOT allowed to use the library function system() instead
of fork/exec. This is because system() itself forks a new shell that
forks the specified command. Using system() is absolutely prohibited
in this exercise
- You may assume that the user always specifies a correct input with
respect to the format requirements specified above. However, user may
specify non-existent command name, or wrong parameters. Your program
should print an appropriate error message. We require that you use
perror("") library function in order to print out the standard system
error messages for the errors concerned with the system calls
failures. In this case calculation should proceed, but the result is
undefined.
- Notice that access to 'busy vector' is essentially a signal
critical section. That is sensitive to signal arrival during the
operation. Thus we recommend you to protect necessary section in your
program by signal blocking.
- The return value of the sparcl program should be 0 upon success and >0
upon failure
- Note that empty or pure white space(s) command is not an error,
and should be ignored and sparcl should simply proceed to the next
command.
- Return value of output command '<' is always 0 (zero). And it
generates single double number output to stdout(cout) ending with
newline '\n'(endl)
- Example (but not exemplary) parser can be found at ~os/www/Ex/Ex1
- The forementioned parser can handle all correct inputs
System Calls and Functions List
- fork();
- execvp();
- wait()/wait3()/wait4();
- signal();
- exit();
- open()/close()/read()/write();
- dup()/dup2();
- fopen()/feof()/fscanf()/fprintf();
You may use stdio library functions in order to read user's input and
to print the messages. You are advised to use string library
functions, such as strtok() to manipulate strings.
Bibliography
- Process control (fork(), exec(), wait*()) Stevens Chapter 8 (8.1
.. 8.8)
- File I/O (open(), read(), write(), close(), dup(), dup2()) Stevens
Chapter 3 (3.1 .. 3.8, 3.12)
- Standard I/O Library (fopen(), .....) Stevens Chapter 5 (5.1 .. 5.6)
Submit
Submit a tar file on-line containing the following:
- README file with
- your logins
- explanations about your implementation that may be useful for bodek.
- Your sparcl source file(s)
- Your Makefile to compile the program
- Printed out Collaboration Declaration
Do NOT submit a hardcopy of the program
Important
The exercise in not hard, but it is time consuming since you
have to learn a lot of new material. Start early!
Due date: 13.03.03
Grading notice:
The grade will be computed governed by the following division:
- 25% of the grade will be given for a program that runs correctly
on input files containing no external commands nor blocking
dependencies
- Another 30% of the grade will be given for a program that runs
correctly on input files containing both types of commands, but no
blocking dependencies
- Final 45% of the grade will be given for a program that runs
correctly on input files containing both types of commands and
blocking dependencies
Good luck!
To the course home page