Open Access Highly Accessed Research article

Sequencing and analysis of a South Asian-Indian personal genome

Ravi Gupta1, Aakrosh Ratan2, Changanamkandath Rajesh1, Rong Chen3, Hie Lim Kim2, Richard Burhans2, Webb Miller2, Sam Santhosh1, Ramana V Davuluri4, Atul J Butte5, Stephan C Schuster2,6*, Somasekar Seshagiri7* and George Thomas1*

Author Affiliations

1 SciGenom Labs Pvt Ltd., Plot 43A, SDF 3rd Floor CSEZ, Kakkanad, Cochin, Kerala, 682037, India

2 Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, 310 Wartik Lab, University Park, Pennsylvania, 16802, USA

3 Personalis, 1350 Willow Road, Suite 202, Menlo Park, CA, 94025, USA

4 Center for Systems The Wistar Institute, Philadelphia, PA, 19104, USA

5 Division of Systems Medicine, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA

6 Singapore Centre on Environmental Life Sciences Engineering, Nanyang Technological University, 60 Nanyang Drive, SBS-01N-27, Singapore, 637551, Singapore

7 Department of Molecular Biology, Genentech Inc, 1 DNA Way, South San Francisco, CA, 94080, USA


BMC Genomics 2012, 13:440  doi:10.1186/1471-2164-13-440

Published: 31 August 2012

Additional files

Additional file 1:

Table S1. SNPs and indels in (A) Gene, Regulatory and Enhancer regions, (B) Repeat class and family. Table S2. Non-synonymous SNPs in the SAIF genome. Table S3. SNPs predicted to be damaging by SIFT. Table S4. (A) In-frame short indels and (B) short frameshift indels in the SAIF genome. Table S5. Short indels predicted by SIFT to lead to nonsense-mediated decay (NMD). Table S6. SAIF SNP comparison. Table S7. Novel SNPs and indels in the SAIF genome. Table S8. SAIF SNPs represented in OMIM. Table S9. SAIF SNPs annotated using SNPedia. Table S10. Pharmacogenomically relevant variants in the SAIF genome.

Format: PDF Size: 812KB


Additional file 2:

Figure S1. Pathway analysis of synonymous SNPs. Pathway enrichment analysis was performed using the DAVID program. The enriched KEGG pathways identified (FDR ≤ 0.05) are reported.

Figure S2. Protein domain positions and nonsense SNP location in the MMP28 protein.

Figure S3. Phylogenetic relationship of the SAIF mt genome. The tree on the left shows the phylogenetic relationships of human mt macro-haplogroups. The tree on the right is a Neighbor-Joining tree of the U and K haplogroups, constructed using 210 complete mt genome sequences obtained from the GenBank database, including the SAIF mt genome (highlighted in red). The SAIF mitochondrial genome clustered with the U1 branch and was closely related to the U1a3 haplogroup. A comparison of the SAIF mt genomic sequence against the U1a3 sequence (GenBank accession # AY714038) revealed 14 nucleotide differences between the two genomes.

Figure S4. Coalescence time estimations for the U haplogroup. The coalescence time for the U mt haplogroup was estimated by BEAST analysis [97]. A total of 313 complete mt genome sequences obtained from GenBank, representative of each macro-haplogroup and each U1 haplogroup, were used in the analysis. We calibrated our time to most recent common ancestor (TMRCA) estimates using the published estimate of 660 kya for the separation of the Homo sapiens and Neanderthal mt lineages [98] and the 194.3 ± 32.55 kya TMRCA estimate for the global mtDNA genome tree [99]. The BEAST analysis was run with the HKY substitution model, a strict molecular clock model, an exponential population growth tree prior, an MCMC chain length of 2M, and a 10% burn-in. The coalescence times for the U haplogroup and the U1a haplogroup were estimated to be 86 kya and 46 kya, respectively.

Figure S5. Relative genetic risk of SAIF in comparison to the GIH population represented in HapMap III. We used a set of disease SNPs measured in both SAIF and GIH, and recalculated the LR for SAIF and for each of 101 GIH individuals. For each disease, a histogram of the individuals in each risk range is shown. The SAIF individual had a higher genetic risk than 80% of GIH individuals for eight diseases.

Figure S6. Contribution of individual SNPs to the overall risk for uterine leiomyoma. For an explanation of the symbols and other parameters in the graph, refer to Figure 8.

Figure S7. Contribution of individual SNPs to the overall risk for asthma. For an explanation of the symbols and other parameters in the graph, refer to Figure 8.

Figure S8. Contribution of individual SNPs to the overall risk for obesity. For an explanation of the symbols and other parameters in the graph, refer to Figure 8.
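The comparison described for Figure S5 (one individual's LR ranked against 101 reference individuals) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the assumption that a combined LR is the product of independent per-SNP LRs, and all numeric values are hypothetical and chosen only for illustration.

```python
import math

def combined_lr(per_snp_lrs):
    """Combine per-SNP likelihood ratios by multiplication.

    Assumption (not stated in the source): SNPs are treated as independent,
    so the overall LR is the product of the per-SNP LRs.
    """
    return math.prod(per_snp_lrs)

def fraction_below(subject_lr, reference_lrs):
    """Fraction of reference individuals whose LR is strictly lower than the subject's."""
    lower = sum(1 for lr in reference_lrs if lr < subject_lr)
    return lower / len(reference_lrs)

# Hypothetical per-SNP LRs for one individual and one disease.
saif_lr = combined_lr([1.2, 0.9, 1.5])

# Hypothetical combined LRs for 101 reference individuals.
gih_lrs = [0.51 + 0.02 * i for i in range(101)]

share = fraction_below(saif_lr, gih_lrs)
print(f"Subject LR {saif_lr:.2f} exceeds {share:.0%} of the reference panel")
```

Under this sketch, a disease would count toward the "higher than 80% of GIH" tally whenever `share` is above 0.8 for that disease.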

Format: XLSX Size: 19.3MB
