Operating Systems, Spring 2003
Exercise 3 Programming: Buffer Cache
Subject: User-Level Buffer Cache Library
Due Date: 12.06.2003 NOTE: there will be NO
EXTENSIONS to this deadline
This exercise is DIFFICULT. Start early.
Introduction
Files containing databases and programs are becoming larger and
larger. Efficient management of multiple file access becomes an imperative
issue of operating systems. Most of the time the problem is solved using a
buffering/caching technique.In this exercise you will practice the paging
concepts you've learned in class and produce an efficient file access
manager.
You are required to implement a basic functionality of Buffer Cache (see
Bach, The Design of the UNIX Operating System, Chapter 3) for multi-thread processes. This will be done in
a form of a library, that will provide basic file access functions as an
interface, while behind it a buffering scheme will be implemented.
You will implement a Buffer Cache consisting of a set of blocks. The number
of blocks and the size of each block is determined at the time of Buffer Cache
initialization.
Buffer Cache logically divides an accessed file into blocks starting from
the beginning of the file. Assume that size of the block is S then
starting from the first byte of the file the blocks will be formed by the byte
intervals: [0,...,S-1],[S,...,2S-1],... Notice that the last block may be not
completely filled by the file, since file size may be not a multiplicative of
the block size. Buffer Cache will still allocate a complete block, however it
will keep the notion in its internal structures that only a fraction of the
block contains valid data.
When a user accesses data from a file, the Buffer Cache
first checks if a copy of the appropriate data exists in blocks held in the
main memory. Otherwise it copies the blocks (perhaps several of them if such
need arises) from the disk. Least Recently Used policy is applied if another
block needs to be evicted from the pool in the main memory.
Notice that eviction of a block that contains data different from the
originally copied from the disk, has to be written back to the disk
Library (libbufc.a) interface
- bufc_init - will initialize all internal structures needed by the
library for the buffer cache management. You are guaranteed that the function
will be called before any thread creation.
- bufc_open - this function will provide similar functionality to system call
open, while creating all internal structures and
gathering information about file, that will be needed later for Buffer Cache
management
- bufc_read - this function will provide same functionality as system call read. However, accessed blocks of the file will be
dynamically managed by the Buffer Cache library.
- bufc_write - this function will provide the same functionality as system
call write. Yet, as in the case with library
function bufc_read, accessed file blocks will be managed by Buffer Cache
library.
- bufc_lseek - this function provides functionality of system call
lseek. Notice that this function does not cause any
blocks being brought to or evicted from the buffer cache pool.
- bufc_close - this function will provide the functionality of system call
close. Moreover, it will clean up, if needed, the
structures created as a consequence of a call to bufc_open library function.
- bufc_shutdown - will write-back all changed file blocks and
destroy all structures created by bufc_init function for buffer cache
management. You are guaranteed that this function will be called after all
threads finished running. However, you are not guaranteed that all files that
were opened by the threads using bufc_open will be closed by the threads.
Buffer Cache Semantics
In this exercise the following semantics will apply to buffer cache:
- Buffer Cache is a pool of file blocks kept in memory, as a form of cache,
to minimize disk accesses per file operation.
- The blocks contained in the Buffer Cache are managed using LRU replacement
policy with complete associativity: the replacement policy applies to all
blocks at once, without any additional grouping.
- Buffer Cache provides atomicity of read/write/lseek operations - threads
may simultaneously call for any mixture of read/write/lseek operations that
effect the same block of the same file, in this case for any two
read/write/lseek operations A,B either A will begin and complete
before B will begin, or B will begin and complete before A
will begin.
- Notice that several read/write/lseek operations may occur simultaneously on
the same file if they access different blocks, without any degree of
interference. (Can you explain why?)
- LRU policy is implemented over blocks. It creates a (virtual) queue of
blocks inside Buffer Cache pool. The blocks are ordered inside the queue from
the most recently used block (the head of the queue) to the least recently used
(the tail of the queue).
- All changes is LRU queue state have to be logged using structure and
functions provided in LRU state library (liblru_state) (header file). Notice that calls to LRU state library functions are an integral
part of LRU queue update operations.
The link above contains a limited (though major) version of
the LRU state library at the moment. Later on you'll be provided an extended
set of features that will constitute the LRU state data set, and your Buffer
Cache library will have to provide.
- Replacement of a changed block by LRU algorithm (was used for writing) will
cause write-back, the block will be written back to the disk into the
respective place in the file on disk.
- bufc_close library function will cause write-back operation in
case no other thread keeps the same file open.
- You may assume that length of bufc_read,bufc_write array will not
exceed overall size of all blocks in the buffer cache pool.
- You may assume that no NON-BLOCKING read/write operations will be used.
Background reading
- File Management functions (open, lseek, unlink, read, write, close)
- Directory Management functions (mkdir, rmdir)
- File/Directory Status functions (stat, fstat)
- File system status functions (statfs, fstatfs)
- Virtual Memory concepts
- File System Concepts
Submission
Submit tar file named ex3.tar containing:
- README file containing your logins as usual.
- Source code.
- Makefile - your makefile should generate a library file named: libbufc.a . Make sure that the tar file can be extracted and that
the extracted files do compile.
Guidelines:
- Run everything on local machine (workstation you are using)
- Avoid collisions: do not use public machines (i.e. inferno, bagel...)
- Design your program carefully before you start writing.
- Always check the return value of a system call wherever applicable.
- Test you code thoroughly - write test programs - cross test program with other groups (swap tests, NOT programs)
- Use asserts and debug printouts, but make sure to remove all asserts and any debug output from the library before submission.
A library function should not have side-effects such as debug output or exit calls.
To the course home page