Using Bayesian Networks to Analyze Gene Expression Data
Project Goal
Massive amounts of gene expression data are accumulating rapidly, driven by
the development of biotechnologies such as array-based hybridization.
Our goal is to develop algorithms and computational tools that can extract
meaningful information about gene regulation and function from this data.
We use methods for learning Bayesian networks to recover the structure of
regulatory interactions between the different genes.
We present here some preliminary results.
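To make the idea concrete, the sketch below shows how a Bayesian network over discretized expression levels factorizes a joint distribution: each gene gets a conditional probability table given its parents in the graph. The three gene names, the two-level ("low"/"high") discretization, and all probabilities are invented for illustration; none of them come from our data or results.

```python
from itertools import product

# Hypothetical chain GeneA -> GeneB -> GeneC (illustrative names only).
parents = {"GeneA": (), "GeneB": ("GeneA",), "GeneC": ("GeneB",)}

# cpt[gene][parent_levels] = P(gene is "high" | parent levels).
# All numbers are made up for this example.
cpt = {
    "GeneA": {(): 0.3},
    "GeneB": {("low",): 0.1, ("high",): 0.8},
    "GeneC": {("low",): 0.2, ("high",): 0.7},
}


def joint_probability(assignment):
    """P(assignment) as the product of each gene's conditional probability."""
    prob = 1.0
    for gene, gene_parents in parents.items():
        p_high = cpt[gene][tuple(assignment[p] for p in gene_parents)]
        prob *= p_high if assignment[gene] == "high" else 1.0 - p_high
    return prob


# Enumerate every low/high assignment; the probabilities should sum to 1.
all_assignments = [
    dict(zip(parents, levels))
    for levels in product(("low", "high"), repeat=len(parents))
]
total = sum(joint_probability(a) for a in all_assignments)
```

Structure learning works in the opposite direction: given only observed expression levels, it searches for a `parents` map of this kind that explains the data well.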
Presentations
Using Bayesian Networks to Analyze Gene Expression Data - an overview of our project.
A tutorial on Bayesian Networks
An interactive tour of our results
We present here the results of our learning methods on data from the yeast
cell cycle analysis project published by Spellman et al. (1998) in
Molecular Biology of the Cell. We thank the lab at Stanford for making
this data available, and the lab members for their courteous help.
We applied our methods to a data set of 800 genes that were found to be regulated
by the cell cycle. Spellman et al. grouped these genes into 8 clusters
(see their figure).
We learned our genetic network using no prior knowledge or assumptions
("Tabula Rasa"). We present an interactive tour of the learned networks.
All 800 genes
Instructions:
- The learned genetic network is displayed in the upper left window.
  It is possible to scroll to different parts of the network.
- Each node represents the expressed RNA level of some gene.
- The edges represent relations between these genes. The edges are
  colored by level of confidence in the Markov relation (see below).
- Place the mouse over a node to see the ORF name and protein name (if known).
- Place the mouse over an edge to see the level of confidence for this relation.
- Clicking on a node displays additional information about the ORF in the
  bottom frame. This includes links to tables showing the strength of
  relations between the ORF and other genes in the network. They include:
  - Markov Relations - shows all genes that are direct relatives of
    the queried gene.
  - Order (Before) - shows all genes that appear before the queried
    gene in the learned network. This implies they might be (partly)
    causally responsible for the expression level of this gene, perhaps
    through a chain of other genes.
  - Order (After) - shows all genes that appear after the queried gene
    in the learned network. This implies that their expression level
    might be causally affected by the level of this gene, again perhaps
    through a chain of interactions involving other genes.
  - Edges - shows all genes that share an edge with the queried
    gene. The edge direction is marked as:
    - ->OtherGene - if the edge points from the queried gene to
      OtherGene.
    - OtherGene-> - if the edge points from OtherGene to the queried
      gene.
    - --OtherGene - if the direction of the edge cannot be determined.
  These relations appear in the right-hand frame. The numbers (between
  0.0 and 1.0), as well as the size of the bars on the right, indicate the
  level of confidence in each relation.
  There is also a link to the SGD entry of the queried ORF.
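As a rough illustration of how the tables above relate to a learned network, the sketch below computes the Markov, Order (Before) and Order (After) relations for a queried gene from a set of directed edges, and scores each relation by the fraction of networks in which it holds. This is a sketch under stated assumptions, not the code behind this site. It assumes that "direct relatives" means the Markov blanket (parents, children, and other parents of the children) and that confidence values come from several networks learned on resampled data. The gene names G1 to G4, the four example networks, and all function names are hypothetical.

```python
from collections import defaultdict


def reachable(edges, start, forward=True):
    """Genes reachable from `start` following edges forward (or backward)."""
    step = defaultdict(set)
    for parent, child in edges:
        if forward:
            step[parent].add(child)
        else:
            step[child].add(parent)
    seen, stack = set(), [start]
    while stack:
        for nxt in step[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    seen.discard(start)
    return seen


def order_before(edges, gene):
    """Genes that appear before `gene` (its ancestors)."""
    return reachable(edges, gene, forward=False)


def order_after(edges, gene):
    """Genes that appear after `gene` (its descendants)."""
    return reachable(edges, gene, forward=True)


def markov_relatives(edges, gene):
    """Assumed reading of 'direct relatives': the Markov blanket of `gene`."""
    parents = {p for p, c in edges if c == gene}
    children = {c for p, c in edges if p == gene}
    co_parents = {p for p, c in edges if c in children and p != gene}
    return parents | children | co_parents


def confidence(networks, relation, gene):
    """Fraction of networks in which each other gene is related to `gene`."""
    counts = defaultdict(int)
    for edges in networks:
        for other in relation(edges, gene):
            counts[other] += 1
    return {other: n / len(networks) for other, n in counts.items()}


def edge_label(gene, other, directed, undirected):
    """Render an edge relative to the queried gene, as in the Edges table."""
    if (gene, other) in directed:
        return f"->{other}"
    if (other, gene) in directed:
        return f"{other}->"
    if frozenset((gene, other)) in undirected:
        return f"--{other}"
    return None


# Four hypothetical networks, each a set of (parent, child) edges.
networks = [
    {("G1", "G2"), ("G2", "G3")},
    {("G1", "G2"), ("G4", "G3")},
    {("G2", "G1"), ("G2", "G3"), ("G4", "G3")},
    {("G1", "G2"), ("G2", "G3"), ("G4", "G3")},
]

before = confidence(networks, order_before, "G2")
after = confidence(networks, order_after, "G2")
markov = confidence(networks, markov_relatives, "G2")
```

For the queried gene G2, `before` gives G1 a confidence of 0.75, since G1 precedes G2 in three of the four example networks.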
Papers
Using Bayesian Networks to Analyze Expression Data.
N. Friedman, I. Nachman, and D. Pe'er. (A technical report describing this work. Submitted to RECOMB 2000.)
Learning Bayesian Network Structure from Massive Datasets: The "Sparse Candidate" Algorithm.
N. Friedman, I. Nachman, and D. Pe'er. (UAI 99)
Data Analysis with Bayesian Networks: A Bootstrap Approach.
N. Friedman, M. Goldszmidt, and A. Wyner. (UAI 99)
Who we are
Nir Friedman
Iftach Nachman
Dana Pe'er
In collaboration with
Michal Linial, Life Sciences, Hebrew University
Moises Goldszmidt, SRI International