Research article

Sequencing and analysis of a South Asian-Indian personal genome

Ravi Gupta1, Aakrosh Ratan2, Changanamkandath Rajesh1, Rong Chen3, Hie Lim Kim2, Richard Burhans2, Webb Miller2, Sam Santhosh1, Ramana V Davuluri4, Atul J Butte5, Stephan C Schuster2,6*, Somasekar Seshagiri7* and George Thomas1*

Author affiliations

1 SciGenom Labs Pvt Ltd., Plot 43A, SDF 3rd Floor CSEZ, Kakkanad, Cochin, Kerala, 682037, India

2 Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, 310 Wartik Lab, University Park, Pennsylvania, 16802, USA

3 Personalis, 1350 Willow Road, Suite 202, Menlo Park, CA, 94025, USA

4 Center for Systems The Wistar Institute, Philadelphia, PA, 19104, USA

5 Division of Systems Medicine, Department of Pediatrics, Stanford University School of Medicine, Stanford, CA, USA

6 Singapore Centre on Environmental Life Sciences Engineering, Nanyang Technological University, 60 Nanyang Drive, SBS-01N-27, Singapore, Singapore, 637551

7 Department of Molecular Biology, Genentech Inc, 1 DNA Way, South San Francisco, CA, 94080, USA

Citation and License

BMC Genomics 2012, 13:440 doi:10.1186/1471-2164-13-440

Published: 31 August 2012



With over 1.3 billion people, India is estimated to contain three times more genetic diversity than does Europe. Next-generation sequencing technologies have facilitated the understanding of diversity by enabling whole genome sequencing at greater speed and lower cost. While genomes from people of European and Asian descent have been sequenced, only recently has a single male genome from the Indian subcontinent been published at sufficient depth and coverage. In this study we have sequenced and analyzed the genome of a South Asian Indian female (SAIF) from the Indian state of Kerala.


We identified over 3.4 million SNPs in this genome, including 89,873 private variations. Comparison of the SAIF genome with several published personal genomes revealed that this individual shared ~50% of the SNPs with each of these genomes. Analysis of the SAIF mitochondrial genome showed that it was closely related to the U1 haplogroup, which has been previously observed in Kerala. We assessed the SAIF genome for SNPs with health and disease consequences and found that the individual was at a higher risk for multiple sclerosis and a few other diseases. In analyzing SNPs that modulate drug response, we found a variation that predicts a favorable response to metformin, a drug used to treat diabetes. SNPs predictive of adverse reaction to warfarin indicated that the SAIF individual is not at risk for bleeding if treated with typical doses of warfarin. We also report the presence of several other SNPs of medical relevance.
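The genome-to-genome comparison summarized above comes down to set operations on variant calls. The following is a minimal sketch of that idea, not the authors' pipeline: the `Snp` key (chromosome, position, alternate allele), the function names `shared_fraction` and `private_snps`, and the toy data are all assumptions made for illustration. It also assumes that "private" means absent from every comparison set, which may differ from the definition used in the study.

```python
from typing import Dict, Set, Tuple

# Assumed key for a SNP: (chromosome, 1-based position, alternate allele).
Snp = Tuple[str, int, str]


def shared_fraction(query: Set[Snp], other: Set[Snp]) -> float:
    """Fraction of the query genome's SNPs that also occur in another genome."""
    if not query:
        return 0.0
    return len(query & other) / len(query)


def private_snps(query: Set[Snp], others: Dict[str, Set[Snp]]) -> Set[Snp]:
    """SNPs in the query genome that appear in none of the comparison genomes."""
    seen_elsewhere: Set[Snp] = set().union(*others.values())
    return query - seen_elsewhere


# Toy data only; these are not variants from the SAIF genome.
query = {("chr1", 100, "A"), ("chr1", 250, "T"), ("chr2", 75, "G"), ("chrX", 40, "C")}
others = {
    "genome_a": {("chr1", 100, "A"), ("chr2", 75, "G"), ("chr3", 10, "T")},
    "genome_b": {("chr1", 100, "A"), ("chr1", 250, "T")},
}

for name, snps in others.items():
    print(f"{name}: {shared_fraction(query, snps):.0%} of query SNPs shared")
print("private:", sorted(private_snps(query, others)))
```

In practice the SNP sets would be loaded from variant call files, and matching would have to account for reference build and allele representation, which this sketch ignores.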


This is the first study to report the whole genome sequence of a female from the state of Kerala in India. The availability of this genome and its variants will further aid studies aimed at understanding genetic diversity, identifying clinically relevant changes, and assessing disease burden in the Indian population.

Keywords: Indian genome; Personal genomics; Whole genome sequencing