Comparing self-reported ethnicity to genetic background measures in the context of the Multi-Ethnic Study of Atherosclerosis (MESA)
1 Department of Biostatistical Sciences, Wake Forest University School of Medicine, Winston-Salem, North Carolina 27157, USA
2 Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA
3 Department of Epidemiology, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA
4 Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
5 Division of Internal Medicine and Department of Epidemiology, University School of Medicine, Baltimore, MD 21287, USA
6 Department of Radiology and Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
BMC Genetics 2011, 12:28 doi:10.1186/1471-2156-12-28. Published: 4 March 2011
Questions remain regarding the utility of self-reported ethnicity (SRE) in genetic and epidemiologic research. It is not clear whether conditioning on SRE provides adequate protection from inflated type I error rates due to population stratification and admixture. We address this question using data from the Multi-Ethnic Study of Atherosclerosis (MESA), which enrolled individuals from four self-reported ethnic groups. We assess the agreement between SRE and genetic-based measures of ancestry (GBMA), and conduct simulation studies based on observed MESA data to evaluate the performance of each measure under various conditions.
Four clusters were identified using 96 ancestry informative markers. Three of these clusters were well delineated, but 30% of the self-reported Hispanic-Americans were misclassified. We also found that conditioning on MESA SRE yields type I error rates consistent with the nominal levels. More extensive simulations revealed that this finding is likely due to the multi-ethnic nature of MESA. Finally, we describe situations where SRE may perform as well as a GBMA in controlling the effect of population stratification and admixture in association tests.
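The effect of conditioning on a group label can be illustrated with a minimal simulation sketch. This is not the study's simulation design: the two-group setup, the allele frequencies, the phenotype means, and the function names `assoc_pvalue` and `type1_error` are all assumptions chosen for illustration, and the label here matches ancestry perfectly, which real SRE need not do. The marker has no effect on the phenotype, so every rejection is a false positive.

```python
import math
import numpy as np

def assoc_pvalue(y, g, covar=None):
    """Two-sided p-value for the genotype slope in a linear regression.

    Uses a normal approximation to the t statistic, which is reasonable
    for the large sample sizes simulated here.
    """
    n = len(y)
    cols = [np.ones(n), g]
    if covar is not None:
        cols.append(covar)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    z = beta[1] / math.sqrt(cov[1, 1])
    return math.erfc(abs(z) / math.sqrt(2))

def type1_error(n_per_group=500, n_sims=300, alpha=0.05, seed=0):
    """Empirical type I error with and without adjusting for the group label."""
    rng = np.random.default_rng(seed)
    freqs = (0.1, 0.6)   # hypothetical allele frequencies per group
    means = (0.0, 0.5)   # hypothetical phenotype means per group
    label = np.repeat([0.0, 1.0], n_per_group)  # stand-in for SRE
    naive = adjusted = 0
    for _ in range(n_sims):
        # Genotypes depend only on group; phenotype depends only on group.
        g = np.concatenate(
            [rng.binomial(2, f, n_per_group) for f in freqs]
        ).astype(float)
        y = np.concatenate([rng.normal(m, 1.0, n_per_group) for m in means])
        naive += assoc_pvalue(y, g) < alpha
        adjusted += assoc_pvalue(y, g, label) < alpha
    return naive / n_sims, adjusted / n_sims

if __name__ == "__main__":
    unadjusted_rate, adjusted_rate = type1_error()
    print(f"unadjusted: {unadjusted_rate:.3f}, adjusted: {adjusted_rate:.3f}")
```

Under these assumptions the unadjusted test rejects far more often than the nominal level, while the adjusted test stays close to it. Replacing the perfect label with a noisy one would be one way to explore how misclassification erodes that protection.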
The performance of SRE as a control variable in genetic association tests is more nuanced than previously thought, and SRE may have more value than it is currently credited with, especially when smaller replication studies are being considered in multi-ethnic samples.