Research article

The rhesus macaque is three times as diverse but more closely equivalent in damaging coding variation as compared to the human

Qiaoping Yuan1, Zhifeng Zhou1, Stephen G Lindell1, J Dee Higley2, Betsy Ferguson3, Robert C Thompson4, Juan F Lopez5, Stephen J Suomi6, Basel Baghal1, Maggie Baker1, Deborah C Mash7, Christina S Barr1* and David Goldman1*

Author Affiliations

1 Laboratory of Neurogenetics, National Institute on Alcohol Abuse and Alcoholism, NIH, Bethesda, MD, 20892, USA

2 Laboratory of Clinical and Translational Studies, NIAAA, Bethesda, MD, 20892, USA

3 Oregon National Primate Research Center, Oregon Health and Sciences University, 505 NW 185th Ave, Beaverton, OR, 97006, USA

4 Department of Psychiatry, University of Michigan, Ann Arbor, MI, 48104, USA

5 Mental Health Research Institute, University of Michigan Medical Center, 3064 NSL, 1103 East Huron Street, Ann Arbor, MI, 48104, USA

6 Laboratory of Comparative Ethology, National Institute of Child Health and Human Development, NIH, Poolesville, MD, 20837, USA

7 Department of Neurology, University of Miami School of Medicine, Miami, FL, 33136, USA

BMC Genetics 2012, 13:52  doi:10.1186/1471-2156-13-52

Published: 29 June 2012



As a model organism in biomedicine, the rhesus macaque (Macaca mulatta) is the most widely used nonhuman primate. Although a draft genome sequence was completed in 2007, there has been no systematic genome-wide comparison of genetic variation between this species and humans. Comparative analysis of functional and nonfunctional diversity in this highly abundant and adaptable nonhuman primate could inform its use as a model for human biology, and could reveal how variation in population history and size alters patterns and levels of sequence variation in primates.


We sequenced the mRNA transcriptome and H3K4me3-marked DNA regions in hippocampus from 14 humans and 14 rhesus macaques. Using equivalent methodology and sampling spaces, we identified 462,802 macaque SNPs, most of which were novel and disproportionately located in the functionally important genomic regions we had targeted in the sequencing. At least one SNP was identified in each of 16,797 annotated macaque genes. Accuracy of macaque SNP identification was conservatively estimated to be >90%. Comparative analyses using SNPs equivalently identified in the two species revealed that the rhesus macaque has approximately three times the SNP density and average nucleotide diversity of the human. Based on this level of diversity, the effective population size of the rhesus macaque is approximately 80,000, in contrast to an effective population size of less than 10,000 for humans. Across five categories of genomic regions, intergenic regions had the highest SNP density and average nucleotide diversity, and coding sequences (CDS) the lowest, in both humans and macaques. Although there are more coding SNPs (cSNPs) per individual in macaques than in humans, the dN/dS ratio is significantly lower in the macaque. Furthermore, the number of damaging nonsynonymous cSNPs (those predicted by PolyPhen-2 to have damaging effects on protein function) in the macaque is more closely equivalent to that of the human.
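
As a rough illustration of how an effective population size can be derived from nucleotide diversity, the sketch below uses the standard neutral-theory relation π = 4·Ne·μ for a diploid population. It is not the authors' pipeline: the function names, the allele counts, the number of callable sites, and the mutation rate are all illustrative assumptions rather than values from the study.

```python
from typing import Iterable, Tuple


def nucleotide_diversity(site_counts: Iterable[Tuple[int, int]], n_sites: int) -> float:
    """Average pairwise differences per site (pi).

    site_counts: one (alt_allele_count, chromosomes_sampled) pair per SNP.
    n_sites: total number of callable sites, variant or not (assumed known).
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    total = 0.0
    for alt, n in site_counts:
        if n < 2:
            continue  # heterozygosity is undefined for a single chromosome
        p = alt / n
        # Unbiased per-site heterozygosity for a biallelic SNP.
        total += 2.0 * p * (1.0 - p) * n / (n - 1)
    return total / n_sites


def effective_population_size(pi: float, mu: float) -> float:
    """Solve pi = 4 * Ne * mu for Ne (diploid, neutral, equilibrium assumptions)."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return pi / (4.0 * mu)


if __name__ == "__main__":
    # Hypothetical inputs: 14 diploid individuals give 28 chromosomes per site.
    toy_snps = [(14, 28), (1, 28), (5, 28)]
    pi = nucleotide_diversity(toy_snps, n_sites=1000)
    # Hypothetical per-generation mutation rate; not taken from the paper.
    ne = effective_population_size(pi, mu=1e-8)
    print(f"pi = {pi:.6f}, Ne = {ne:.0f}")
```

Under this relation, Ne scales linearly with π for a fixed μ, so any estimate depends heavily on the mutation rate assumed for each species.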


This large panel of newly identified macaque SNPs enriched for functionally significant regions considerably expands our knowledge of genetic variation in the rhesus macaque. Comparative analysis reveals that this widespread, highly adaptable species is approximately three times as diverse as the human but more closely equivalent in damaging variation.

Keywords: Rhesus macaque; Human; Single nucleotide polymorphism; Diversity; Comparative genomics