Experimental Systems Lab Projects

The following list suggests some research topics, but you are also invited to initiate your own. Projects can be done at all levels: "Avoda Mudrechet" for third year students, lab project for M.Sc. Students, or M.Sc. and Ph.D. research (projects marked with an * are especially ill-defined and/or long-term research).

If you are interested in doing a project in our lab, contact Dror Feitelson at .

    Workloads and Their Effect

  1. Workload resampling

    Performance evaluations typically require the use of several related workloads, e.g. in the interest of calculating confidence intervals. The project is to achieve this by taking one real workload trace, partitioning it into basic units (e.g. user sessions), and then re-grouping them in different ways to create the desired workloads. In particular, this can also be used to modify the workload in a controlled manner -- e.g. change the load, change the locality, and maybe even improve the stationarity. The main question is how to do this correctly, and especially how to guarantee that stability is maintained. For example, resampling may be combined with user-based modeling with feedback. Given the basic mechanism, there are many applications that can be investigated, e.g. extending traces, mixing them, and changing their properties as mentioned above.

  2. * Workload analysis by composition

    This is related to the previous project, based on inspiration from Michal Irani's work on image and video processing. The idea there is that a lot of cool things can be done based on inherent redundancy in visual data, where patches of the image are similar to other patches. We want to apply the same ideas to workload data, and do things like

  3. Realistic memory reference models

    memory reference models like the stack model try to mimic the correct correlation structure, but ignore other structure that exists in real reference traces. The project is to create a more realistic model, based on components such as repeated use, scans, and random accesses.

    Another deficiency of the stack model is that it leads to a "crawling phase shift": occasionally an address from deep in the stack is selected, and it then becomes popular. A more realistic model would include phase shifts explicitly, and change the popularities of a whole set of addresses collectively.

    A deeper challenge is to figure out whether all this is important for evaluating real systems. One idea: real phase shifts may be important when evaluating processors with multiple issue and out of order execution, because they can overlap multiple requests for data items from the new locality.

  4. * Simulations with heavy-tailed distributions

    Various parameters of real-world workloads are heavy tailed, meaning that there is a non-negligible probability of very large values. An example is Unix process lengths: most are short, but some are extremely long. Simulating system behavior under such conditions may not converge, and in any case is very sensitive to the few largest samples, which are largely random. The project is to help develop and evaluate methods to overcome this difficulty, based on truncated models of the distributions. This means that we artificially remove some of the most problematic samples in a controlled way, to try and characterize their effect.

    Global Scheduling for Virtual Machines

    The main idea in this project is to first identify the system bottleneck, and then perform the scheduling on that device. In particular, it may be that the CPU is not the bottleneck, and therefore scheduling it should not be the main concern. This approach is motivated in a short paper we published. We are also working already on a generic scheduler called RSVT that governs the allocation of a certain resource to competing VMs.

    Specific sub-projects include

  5. Implementing a monitring facility to identify the system bottleneck.
  6. Complete the implementation of RSVT for different devices. This includes RSVT for I/O, and extending the handling of CPU to SMP machines and multithreaded or forking processes.
  7. Extend RSVT to support allocations to users or groups rather than to single processes.
  8. Create a global scheduler that uses the monitoring facility to decide which instance of RSVT to use at any given time.
  9. * Figure out how to add memory allocation to this framework.
  10. We also have theoretical results regarding off-line allocations in a multi-bottleneck setting. An interesting extension is to simulate how these can be approximated on-line.

    Parallel Job Scheduling

  11. Evaluations as a function of load

    Schedulers are often evaluated by showing how they perform as a function of the load. Thus it is required to generate workloads with varying load levels. The question is how to do this correctly; maybe all workload parameters need to come from distributions conditioned on the desired load level? An intriguing alternative is to try and analyze a single realistic run and extract data about different load conditions that occurred in it. And how can we decide which methodology is more representative?

  12. Clairvoyant scheduling

    One of the constraints on schedulers is that they need to operate in an on-line manner without knowing the future. But how much does knowing the future help? The project is to try and design a scheduler that exploits future knowledge, in the context of parallel job scheduling. This will shed light on the question of how much we should invest in being able to predict the future.

    Software Engineering and Open Source

  13. Assessing success of open source projects

    There is a lot of data about many open-source projects on SourceForge. The question is how to use it to classify projects, and in particular, to identify successful ones. We have identified several patterns of how downloads change with time that can give some clues (see paper). The project is to automate the identification of these patterns, and apply them to the full SourceForge database. In particular, this will enable an extension where we track how the status of a project changes with time.

  14. Globals in operating systems and common coupling

    globals are widely used in operating systems, and cause potentially problematic coupling between modules. The project is to compare the use of globals in different systems and try to analyze the justification of their design. This includes the identification of globals, e.g. using the idea that all instances of the same struct are potentially global. Another part of this project is to assess the degree of common coupling between modules (that is, coupling due to access to shared globals), and how it changes with time in a large project such as Linux.

  15. Release activity in Linux

    Linux (and many other large systems) follows a perpetual development lifecycle model, with a continuous backbone of development activity that is interrupted at times to produce a new stable release (see our paper characterizing this process). It is conjectured that the release activity is related to improving the code and the systems stability, but can this be quantified using code metrics? The project is to characterize what happens both in the preperation before a release and in the period following it, and how it differs from normal development on one hand and continued maintenance on the other.

  16. Up-to-time-view of code

    A developer that enters a long term software project suffers an inherent disadvantage: the code he sees is the result of a long evolutionary process, and is therefore harder to understand than it was to the developers that actually followed the process of its creation. This project is about creating a tool to reconstruct historical views of the code, and assessing whether it indeed helps developers understand the code better. This may be based on file history flow graphs.

  17. Characterizing the functional growth of Linux

    System growth is typically measured in LOC or number of modules. But in an operating system like Linux one may be able to measure functional growth. We have already shown that the growth in number of system calls is slowing down considerably, but the growth in configuration options is accelerating (see paper). The project is to extend this to other metrics, such as flags to system calls and ioctl calls.

  18. * Programmer productivity

    It has been claimed that programmers may have very different levels of competence, but this is rarely quantified. This project will attempt to derive a quantification of the relative contribution of different developers to a project, by studying data about the development of open source projets.

    Other Topics

  19. * Self tuning with genetic algorithms

    Many years ago we published a paper on self tuning a system using genetic algorithms. Some years later someone actually implemented genetic algorithms in the Linux kernel! question is, can we really use this effectively?