Accurate variant detection across non-amplified and whole genome amplified DNA using targeted next generation sequencing
1 Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
2 RainDance Technologies, Inc. Lexington, Massachusetts, USA
3 Institute of Biochemistry, Christian-Albrechts-University, Kiel, Germany
4 First Medical Clinic, University Hospital, Kiel, Schleswig-Holstein, Germany
BMC Genomics 2012, 13:500 doi:10.1186/1471-2164-13-500
Published: 20 September 2012
Additional file 1:
Table S2. An Overview of the RDT 384 Member Panel. Table includes individual tabs describing the amplicons, primers and gff.
Format: PDF Size: 529KB
Additional file 2:
Figure S1. RainDance Genomic DNA Template Droplet and Primer Droplet Merge. PCR droplets are generated on the RDT 1000 instrument (RainDance Technologies, Inc., USA). A single sample requires the gDNA template mix, the RainDance 384 Member Primer Library, and four consumables: an RDT 1000 Chip, an RDT 1000 Template input/output vial, an RDT 1000 Collect input/output vial, and a PCR tube strip (Axygen Scientific). The RDT 1000 instrument generates each PCR droplet by pairing a single gDNA template droplet with a single primer droplet. The paired droplets flow past an electrode embedded in the chip and are instantly merged, creating the PCR droplet. All of the resulting PCR droplets are automatically dispensed as an emulsion into a single PCR tube and transferred to a standard thermal cycler for PCR amplification.
Figure S2. (A and B) Comparison of DNA Fragment Distribution of RainDance 384 Member Primer Panel Bioanalyzer Trace with RainDance 384 Member Primer Library Predicted Amplicon Profile. The DNA fragment distribution of the RainDance 384 Member Primer Panel Bioanalyzer trace (2A) is compared with the predicted profile (2B). The amplicon profile obtained from the Agilent Bioanalyzer (2A) closely matches the predicted histogram distribution (2B).
Figure S3. Removal of Primer-Dimer Peaks Using Agencourt AMPure Kit. (A) A typical electropherogram showed a primer-dimer peak between 60 and 70 bp (~68 bp). (B) The primer-dimer peak, along with unincorporated dNTPs, primers, salts and other contaminants, was removed after purification of the SOLiD Fragment Library (Library 759L is shown as an example) using the standard procedure of the Agencourt AMPure Kit (Beckman Coulter Genomics).
Figure S4. Coverage Uniformity Across all the Barcoded and Pooled Samples Before (pB) and After (pA) Emulsion PCR. A comparable distribution of average depth of coverage (ADoC) across libraries pooled before and after emulsion PCR (emPCR) is shown. Barcoded libraries pooled after emPCR (792 (pA)) showed more uniform ADoC. Samples assigned to barcode 4 (yellow point) showed the lowest ADoC.
Format: PDF Size: 152KB
Additional file 3:
Table S1. Overview of Sample Processing: RainDance Sequence Enrichment, SOLiD Sequencing Library Construction and Sample Indexing, Emulsion PCR and Sequencing. The samples were enriched on the RDT 1000, followed by shearing and standard library construction according to the SOLiD 3.0 protocol (Applied Biosystems/Life Technologies, USA). Six individual libraries were prepared using the enriched gDNA products and indexed with barcodes 1 to 6 (Library IDs 759L, 760L, 761L, 762L, 763L and 764L). Three additional libraries were prepared using the enriched WGA products and indexed with barcodes 9 to 11 (Library IDs 765L, 766L and 767L). To test the performance of each step, the non-amplified samples were pooled before and after the library preparation process. Pre-library preparation pools were created by combining equimolar portions of the six individual gDNA samples; the pooled samples were then processed as single samples (Library IDs 768_1L and 768_2L were duplicate libraries that contained the six gDNA samples). Post-library preparation pools were created by combining equimolar portions of each individual sample after library preparation. One post-library preparation pool was generated before emPCR (Emulsion ID 770em: gDNA samples 1, 2, 3, 4 and 6) and one after emPCR (Emulsion ID 792em: gDNA samples 1, 2, 3, 4, 5 and 6). The resulting PCR products were sequenced on a SOLiD 3.0 system using 50 bp fragment libraries.
Table S3. CLC Bio SNP Detection Parameters. The following parameters were used for SNP detection in the CLC bio Genomics Workbench (version 3.0): maximum coverage: 50000; maximum gap and mismatch count: 2; minimum average quality: 15; minimum central quality: 20; minimum coverage: 5; minimum variant frequency (%): 10.0; variant count threshold: 50000; window length: 11. In the non-pooled samples, a SNP with a non-reference allele frequency of 10-90% was considered heterozygous, and a SNP with a non-reference allele frequency of >90% was considered homozygous.
Table S4. Coverage Metrics CLC bio Genomics Workbench (version 5.1).
Table S6. Sequence Data Generated Using 454 FLX and Illumina of the Same Target Regions (172 kb/384 exons).
Table S7. Sample Multiplexing Calculation for the RDT 384 Member Panel and SOLiD Sequencing Platform.
Format: XLSX Size: 158KB
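The genotype rule stated with Table S3 for non-pooled samples can be illustrated with a minimal sketch. This is not the CLC bio Genomics Workbench implementation: the function name `classify_genotype`, the "no_call" label, and the treatment of the boundary values (coverage of exactly 5 and frequencies of exactly 10% and 90% are taken as inclusive) are assumptions made for illustration, and the quality and window filters from Table S3 are not modelled.

```python
# Hypothetical sketch of the Table S3 genotype rule for non-pooled samples.
# Thresholds come from the table; boundary handling is an assumption.
MIN_COVERAGE = 5          # minimum coverage
MIN_VARIANT_FREQ = 10.0   # minimum variant frequency (%)
HOM_THRESHOLD = 90.0      # >90% non-reference allele frequency = homozygous


def classify_genotype(coverage: int, variant_reads: int) -> str:
    """Return 'heterozygous', 'homozygous' or 'no_call' for one position."""
    if coverage < MIN_COVERAGE:
        return "no_call"  # too few reads to call a SNP
    freq = 100.0 * variant_reads / coverage  # non-reference allele frequency (%)
    if freq < MIN_VARIANT_FREQ:
        return "no_call"  # below the minimum variant frequency
    if freq > HOM_THRESHOLD:
        return "homozygous"
    return "heterozygous"  # 10-90% non-reference allele frequency


if __name__ == "__main__":
    for cov, var in [(100, 50), (100, 95), (100, 5), (4, 4)]:
        print(cov, var, classify_genotype(cov, var))
```

The pooled libraries would need different thresholds, since allele frequencies in a pool of several individuals do not map onto the 10-90% and >90% bands.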
Additional file 4:
Table S5. An Overview of All SNPs and Genotypes Detected. Genotypes from non-barcoded pooled samples. Table includes both inferred and non-inferred genotypes.
Format: XLSX Size: 75KB