News
9 July, 2009 - The puma software paper is published and is a featured article on the BMC Bioinformatics website.
13 June, 2007 - The TIGRA project, which builds on work done in PUMA, has been funded by the EPSRC.
27 April, 2007 - The puma package has been released through Bioconductor.
Introduction
We are developing probabilistic models for the analysis of microarray data.
These models allow us to include various sources of noise and uncertainty, be
they biological or technical, within a unifying analysis framework. We are
currently working on three main themes:
1. We are developing methods to extract accurate measurements from the
   probe-level analysis of Affymetrix arrays. By taking a probabilistic
   perspective, we can associate these measurements with a level of technical
   measurement error while integrating out probe-specific effects.
2. We are developing a suite of methods which take the technical measurement
   error from theme 1 and propagate it through higher-level analyses. We have
   done this for a number of important applications: we have published methods
   for identifying differentially expressed genes from replicated and
   unreplicated experiments, for dimensionality reduction using principal
   component analysis, and for model-based clustering of genes.
3. We are developing latent variable models for inferring the activity and
   effect of transcription factors from time-series gene expression data. We
   have developed discrete-time state-space models for genome-wide inference,
   e.g. integrating gene expression data with ChIP-chip data, and we have
   developed continuous-time Gaussian process models for small ODE systems of
   transcriptional regulation. This work is now being continued in the TIGRA project.
In all these cases we have developed practical methods for parameter estimation
and probabilistic inference. Our methods are implemented in a number of freely
available software packages.
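The idea of carrying probe-level measurement error into a later analysis step can be illustrated with a deliberately simplified sketch. It assumes that each replicate gives a log-expression estimate with a known technical variance, and that there is no additional biological variability between replicates; under those assumptions the replicates can be combined by precision weighting. The function name `combine_replicates` and its inputs are hypothetical and are not taken from the puma package, whose models are richer than this.

```python
def combine_replicates(means, variances):
    """Combine replicate estimates by precision (inverse-variance) weighting.

    Assumption for illustration: each replicate is an independent Gaussian
    measurement of the same underlying log-expression level, with a known
    technical variance. Returns the combined mean and its variance.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")

    # A noisy replicate (large variance) gets a small weight.
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)

    combined_mean = sum(p * m for p, m in zip(precisions, means)) / total_precision
    combined_variance = 1.0 / total_precision
    return combined_mean, combined_variance


if __name__ == "__main__":
    # Two replicates of one gene; the second is measured less precisely.
    mean, variance = combine_replicates([8.0, 10.0], [1.0, 3.0])
    print(mean, variance)
```

The point of the sketch is that the combined estimate leans towards the more reliable replicate and keeps a variance of its own, which a downstream analysis can use in turn.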
Publications
R. Pearson, X. Liu, G. Sanguinetti, M. Milo, N.D. Lawrence, M. Rattray (2009)
puma: a Bioconductor package for Propagating Uncertainty in Microarray Analysis,
BMC Bioinformatics, 10:211. (Journal link)
X. Liu, K. Lin, B. Andersen, M. Rattray (2007)
Including probe-level uncertainty in model-based gene expression clustering,
BMC Bioinformatics, 8:98. (Journal link)
G. Sanguinetti, N.D. Lawrence, M. Rattray (2006)
Probabilistic inference of transcription factor concentrations and gene-specific regulatory activities,
Bioinformatics, 22(22):2775-2781. (Journal link)
(errata)
N.D. Lawrence, G. Sanguinetti, M. Rattray (2006)
Modelling transcriptional regulation using Gaussian processes,
to appear in NIPS2006. (download)
X. Liu, M. Milo, N.D. Lawrence, M. Rattray (2006)
Probe-level measurement error improves accuracy in detecting differential gene expression,
Bioinformatics, 22(17):2107-2113. (Journal link) (errata)
G. Sanguinetti, M. Rattray, N.D. Lawrence (2006)
Identifying submodules of cellular regulatory networks,
Accepted for CMSB 2006 (Computational Methods in Systems Biology). (pre-print version)
G. Sanguinetti, M. Rattray, N.D. Lawrence (2006)
A probabilistic dynamical model for quantitative inference of the regulatory mechanism of transcription,
Bioinformatics, 22(14):1753-1759. (Journal link)
M. Rattray, X. Liu, G. Sanguinetti, M. Milo, N.D. Lawrence (2006)
Propagating uncertainty in microarray data analysis,
Briefings in Bioinformatics, 7(1):37-47. (Journal link) (errata)
G. Sanguinetti, M. Milo, M. Rattray, N.D. Lawrence (2005)
Accounting for probe-level noise in principal component analysis of microarray data,
Bioinformatics, 21(19):3748-3754. (Journal link)
X. Liu, M. Milo, N.D. Lawrence, M. Rattray (2005)
A tractable probabilistic model for Affymetrix probe-level analysis across multiple chips,
Bioinformatics, 21(18):3637-3644. (Journal link)
M. Milo, A. Fazeli, M. Niranjan, N.D. Lawrence (2003)
A probabilistic model for the extraction of expression levels from oligonucleotide arrays,
Biochemical Society Transactions, 31:1510-1512. (Journal link)
Presentations
Guido Sanguinetti, A probabilistic dynamical model for quantitative
inference of the regulatory mechanism of transcription,
Mathematical and Statistical Aspects of Molecular Biology (MASAMB)
16th annual meeting, Conway Institute, University College Dublin,
Ireland, April 2006. (ppt)
Magnus Rattray, Propagating measurement uncertainty in microarray data
analysis, Royal Statistical Society Bioinformatics Event, Manchester,
UK, 12 October 2005. (ppt)
Magnus Rattray, Novel probabilistic models for propagating
uncertainty in microarray data analysis,
Mathematical and Statistical Aspects of Molecular Biology (MASAMB)
15th annual meeting, Rothamsted Research, Harpenden, Hertfordshire,
UK, March 2005.
Posters
Xuejun Liu, A probabilistic model for Affymetrix probe-level data analysis,
8th International Meeting of the Microarray Gene Expression Data
Society (MGED 8), Bergen, Norway, September 2005. Won 2nd prize in the poster competition. (pdf)
Guido Sanguinetti, Accounting for probe-level noise in Principal Component Analysis of microarray data,
European Conference on Computational Biology (ECCB), Madrid, Spain, 2005. (pdf)
Software
Please note that the R packages mmgmos, pplr and pumaclust are no longer supported. All the functionality of these packages is now included in the puma package, which is still being actively supported and developed. The links for mmgmos, pplr and pumaclust can still be used to access historic versions of these packages.
puma: a Bioconductor R package incorporating mmgmos, pumaDE (a multi-factorial extension to pplr), pumaClust and pumaPCA (an R implementation of NPPCA). (Bioconductor link - recommended) (historic versions)
mmgmos: an R package to estimate expression levels, and the confidence in
those estimates, for multiple arrays of the same type of Affymetrix GeneChip
using the multi-chip modified gamma Model for Oligonucleotide Signal
(multi-mgMOS) and the modified gamma Model for Oligonucleotide Signal (mgMOS). (download)
pplr: an R package to include probe-level measurement error in the variance
estimates of gene expression levels and to detect up-regulated genes by
calculating the probability of positive log-ratio (PPLR). It implements both
the MAP approximation and variational inference. (download)
pplr-matlab: a Matlab implementation of pplr using MCMC. (download)
pumaclust: an R package to include probe-level measurement error in model-based clustering of gene expression data. (download)
NPPCA: a Matlab toolbox to implement noise propagation in PCA for microarray data. (download)
Chipdyno: a Matlab toolbox implementing a dynamical model of transcriptional regulation that integrates ChIP
or regulatory motif data with microarray data (publication of method currently under review). (download)
ChipVar: a Matlab toolbox implementing a variational inference approach to inferring transcription factor
concentrations and activities (publication of method currently under review). (download)
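To make the probability of positive log-ratio (PPLR) used by pplr and pumaDE more concrete, here is a minimal sketch. It assumes that a gene's log-expression level in each of two conditions is summarised by an independent Gaussian with a mean and a variance, so that the log-ratio is also Gaussian. The function name `pplr_gaussian` and its arguments are hypothetical; this illustrates the quantity only and is not the code or the interface of any of the packages listed above.

```python
import math


def pplr_gaussian(mean_a, var_a, mean_b, var_b):
    """Probability that condition A's log-expression exceeds condition B's.

    Assumption for illustration: the two levels are independent Gaussians,
    so their difference is Gaussian with mean (mean_a - mean_b) and
    variance (var_a + var_b).
    """
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be strictly positive")

    diff_sd = math.sqrt(var_a + var_b)
    z = (mean_a - mean_b) / diff_sd

    # Standard normal CDF evaluated at z, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Same means: no evidence either way, so the probability is one half.
    print(pplr_gaussian(5.0, 1.0, 5.0, 1.0))
    # Same difference in means, but noisier measurements give a less extreme value.
    print(pplr_gaussian(7.0, 0.5, 5.0, 0.5))
    print(pplr_gaussian(7.0, 2.0, 5.0, 2.0))
```

Values near 1 suggest up-regulation in condition A, values near 0 suggest down-regulation, and values near 0.5 indicate that the data do not distinguish the two conditions. Larger measurement variances pull the value towards 0.5, which is how probe-level uncertainty tempers the ranking of genes.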
Funding
This work is supported by a BBSRC award to N. Lawrence and M. Rattray,
"Improved Processing of microarray data using probabilistic models".