Methodology article

Accurate and exact CNV identification from targeted high-throughput sequence data

Alex S Nord1*, Ming Lee1,2, Mary-Claire King1,2 and Tom Walsh1,2*

Author Affiliations

1 Department of Genome Sciences, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195-7720, USA

2 Department of Medicine, University of Washington, 1959 NE Pacific Street, Seattle, WA, 98195-7720, USA

BMC Genomics 2011, 12:184  doi:10.1186/1471-2164-12-184

Published: 12 April 2011

Additional files

Additional file 1:

Figure S1. Histograms showing distributions for count and variation of coverage and ratio data. A) Median raw coverage: median coverage across samples for each base. B) SD raw coverage: standard deviation for raw coverage generated for each lane (8 samples), median SD across 8 lanes plotted. C) SD normalized coverage: standard deviation for normalized coverage generated for each lane (12 samples), median value across 8 lanes plotted. D) S:N normalized coverage: signal to noise ratio (mean/SD) for normalized coverage for each lane, median value across 8 lanes plotted. E) S:N corrected coverage: signal to noise ratio (mean/SD) for normalized coverage corrected for GC-content and bait capture bias for each lane, median value across 8 lanes plotted. F) Ratio: sample compared to lane median corrected normalized coverage for all bases, data from 10 representative samples plotted.
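
The ratio in panel F can be illustrated with a minimal sketch. It assumes the lane median is taken per base across all samples in the lane, which is one reading of the caption; the function name `ratio_to_lane_median` and the dictionary layout are hypothetical and not taken from the article.

```python
from statistics import median

def ratio_to_lane_median(lane_coverage, sample):
    """Per-base ratio of one sample to the lane median (assumed per-base median).

    lane_coverage maps a sample name to its list of corrected, normalized
    per-base coverage values; all lists must have the same length.
    """
    tracks = list(lane_coverage.values())
    # Median across samples at each base position.
    lane_median = [median(column) for column in zip(*tracks)]
    return [
        cov / med if med > 0 else float("nan")  # avoid division by zero
        for cov, med in zip(lane_coverage[sample], lane_median)
    ]

lane = {"a": [10, 20, 30], "b": [20, 20, 30], "c": [30, 20, 60]}
ratios = ratio_to_lane_median(lane, "c")
```

Under this reading, a ratio near 1 means the sample matches the rest of the lane at that base, while sustained departures from 1 are candidate copy-number changes.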

Format: PDF Size: 311KB

Additional file 2:

Figure S2. Simulated sensitivity estimate based on CNV size and signal-to-noise ratio of data. Data simulated for 1 Mb of sequence with one true CNV of length 50 bp, 100 bp, 200 bp, 500 bp, or 1000 bp. Random noise introduced in sample coverage data at a level corresponding with given signal-to-noise ratio. 100 replications run at signal-to-noise ratios of one to ten for each CNV size. Sensitivity is the proportion of runs in which the CNV was correctly identified. No false positives were identified when signal-to-noise ratio was greater than two (data not plotted).
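
A simulation of this kind might be sketched as follows. This is not the authors' code: the caption does not describe the detection step, so the sketch swaps in a simple sliding-window mean with the 0.6 and 1.4 ratio cut-offs. The Gaussian noise model, the duplication ratio of 1.5, the 50 bp window, the shortened default sequence length, and all function names are assumptions made for illustration.

```python
import random

def simulate_ratio(seq_len, cnv_start, cnv_len, snr, rng, cnv_ratio=1.5):
    """Per-base ratio track: mean 1.0 outside the CNV, cnv_ratio inside."""
    sd = 1.0 / snr  # S:N is mean/SD and the baseline mean is 1.0
    return [
        rng.gauss(cnv_ratio if cnv_start <= i < cnv_start + cnv_len else 1.0, sd)
        for i in range(seq_len)
    ]

def flagged_windows(ratio, window=50, low=0.6, high=1.4):
    """Start positions of windows whose mean ratio falls outside [low, high]."""
    prefix = [0.0]
    for value in ratio:
        prefix.append(prefix[-1] + value)
    return [
        start
        for start in range(len(ratio) - window + 1)
        if not low <= (prefix[start + window] - prefix[start]) / window <= high
    ]

def estimate_sensitivity(cnv_len, snr, reps=100, seq_len=20_000, window=50, seed=0):
    """Proportion of replicates with a flagged window fully inside the CNV.

    Requires window <= cnv_len; seq_len is kept well below 1 Mb for speed.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        cnv_start = rng.randrange(0, seq_len - cnv_len + 1)
        ratio = simulate_ratio(seq_len, cnv_start, cnv_len, snr, rng)
        last_inside = cnv_start + cnv_len - window
        hits += any(cnv_start <= s <= last_inside for s in flagged_windows(ratio, window))
    return hits / reps
```

Calling `estimate_sensitivity(cnv_len, snr)` for each CNV size and for signal-to-noise ratios from one to ten would produce a grid comparable in shape to the one plotted, although the values depend entirely on the assumed detector.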

Format: PDF Size: 139KB

Additional file 3:

Supplemental table S1.

Format: XLS Size: 21KB

Additional file 4:

Figure S3. Comparison of true positive signal and false positive signal. True positive refers to bases within confirmed CNVs, whereas false positive refers to bases with ratio values < 0.6 or > 1.4, but where no CNV could be confirmed. Histograms show distribution of: A) S:N (signal to noise: mean/SD), and B) z-score ((value-mean)/SD) for true positive versus false positive bases.
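
The two statistics named in the caption can be written out as a short sketch. The formulas follow the caption; the use of the sample standard deviation (`statistics.stdev`) and the function names are assumptions, since the caption does not say which SD estimator was used or over which set of values it was computed.

```python
from statistics import mean, stdev

def signal_to_noise(values):
    """S:N as defined in the caption: mean divided by SD."""
    return mean(values) / stdev(values)

def z_score(value, values):
    """z-score as defined in the caption: (value - mean) / SD."""
    return (value - mean(values)) / stdev(values)

coverage = [8.0, 10.0, 12.0]   # made-up reference values for illustration
sn = signal_to_noise(coverage)
z = z_score(14.0, coverage)
```

A base with a high S:N and a large absolute z-score departs from a stable reference, which is the kind of separation between true and false positive bases that the histograms compare.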

Format: PDF Size: 256KB