Open Access Methodology article

A Poisson hierarchical modelling approach to detecting copy number variation in sequence coverage data

Nuno Sepúlveda1,2*, Susana G Campino3, Samuel A Assefa1,3, Colin J Sutherland1,4, Arnab Pain5 and Taane G Clark1

Author Affiliations

1 London School of Hygiene and Tropical Medicine, London, UK

2 Center of Statistics and Applications, University of Lisbon, Lisbon, Portugal

3 Wellcome Trust Sanger Institute, Hinxton, UK

4 Department of Clinical Parasitology, Hospital for Tropical Diseases, London, UK

5 King Abdullah University of Science and Technology, Thuwal, Saudi Arabia

BMC Genomics 2013, 14:128  doi:10.1186/1471-2164-14-128

Published: 26 February 2013

Additional files

Additional file 1:

Skewness and kurtosis of empirical coverage distributions.

Format: PDF Size: 667KB

This file can be viewed with: Adobe Acrobat Reader

Open Data

Additional file 2:

Statistical model comparison between the Poisson, Poisson-Gamma, and Poisson-Lognormal distributions. The Poisson and Poisson-Lognormal models were compared with the Poisson-Gamma model using the Deviance Information Criterion (DIC) [30] and Bayes factors (BF). For the DIC, we calculated the ratio between the DIC of the Poisson-Gamma model and that of each remaining model. The BFs were estimated as the log-ratio between the corresponding prior predictive probabilities via the BIC-MC estimator [31].

Format: PDF Size: 898KB

Additional file 3:

Expected and empirical cumulative coverage distributions. Expected coverage distributions refer to the corresponding posterior predictive distributions for the set of all 100-bp windows used in the analysis.

Format: PDF Size: 943KB

Additional file 4:

Limits for CNV detection used on each sample as a function of the underlying GC content. CNV detection limits were determined according to the posterior predictive probability distribution of the Poisson-Gamma model (the best model for every data set under analysis).

Format: PDF Size: 783KB
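
As a rough illustration of how GC-dependent detection limits of this kind could be computed, the sketch below uses the standard result that a Poisson count whose rate follows a Gamma distribution is marginally negative binomial. It is a minimal sketch under stated assumptions, not the authors' implementation: it plugs in fixed Gamma parameters instead of averaging over their posterior, it takes the limits as the equal-tailed interval with probability γ, and the function names and the per-GC parameter values are hypothetical.

```python
import math


def nb_pmf(k, shape, rate):
    """P(Y = k) when Y | lam ~ Poisson(lam) and lam ~ Gamma(shape, rate)."""
    p = rate / (rate + 1.0)
    log_pmf = (math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
               + shape * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def nb_quantile(q, shape, rate):
    """Smallest count k such that P(Y <= k) >= q (requires 0 < q < 1)."""
    k = 0
    cdf = nb_pmf(0, shape, rate)
    while cdf < q:
        k += 1
        cdf += nb_pmf(k, shape, rate)
    return k


def detection_limits(shape, rate, gamma=0.99):
    """Equal-tailed limits (assumption) containing probability gamma."""
    lower = nb_quantile((1.0 - gamma) / 2.0, shape, rate)
    upper = nb_quantile((1.0 + gamma) / 2.0, shape, rate)
    return lower, upper


def classify(coverage, shape, rate, gamma=0.99):
    """Label a window by comparing its coverage with the limits."""
    lower, upper = detection_limits(shape, rate, gamma)
    if coverage < lower:
        return "putative deletion"
    if coverage > upper:
        return "putative amplification"
    return "within limits"


# Hypothetical (shape, rate) values per GC-content bin, for illustration only.
params_by_gc = {0.10: (4.0, 0.10), 0.20: (5.0, 0.10), 0.30: (6.0, 0.10)}

for gc, (shape, rate) in sorted(params_by_gc.items()):
    print(gc, detection_limits(shape, rate))
```

Calling `classify(coverage, shape, rate)` with the parameters of a window's GC bin would then label that window; in a full Bayesian treatment the limits would instead come from the posterior predictive distribution, which averages over the uncertainty in `shape` and `rate`.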

Additional file 5:

A large amplification detected between the PFL1125w and PFL1160w genes in the 3D7 reference genome data using the Poisson-Gamma model.

Format: PDF Size: 844KB

Additional file 6:

CNVs larger than 500 bp detected using the Poisson-Gamma model (γ=99%), the FREEC software, and the cn.MOPS approach.

Format: PDF Size: 52KB

Additional file 7:

Comparison between hits detected by the Poisson-Gamma model and the FREEC software.

Format: PDF Size: 43KB

Additional file 8:

Ternary diagrams plotting the joint proportions of hits shared between, and detected exclusively by, the Poisson-Gamma (PG) model, the FREEC software, and cn.MOPS.

Format: PDF Size: 2.7MB