Open Access Methodology article

Detecting autozygosity through runs of homozygosity: A comparison of three autozygosity detection algorithms

Daniel P Howrigan1,2*, Matthew A Simonson1,2 and Matthew C Keller1,2

Author Affiliations

1 Department of Psychology, University of Colorado at Boulder, 1416 Broadway, Boulder, CO, 80301, USA

2 Institute for Behavioral Genetics, University of Colorado at Boulder, 1480 30th St., Boulder, CO, 80303, USA

BMC Genomics 2011, 12:460. doi:10.1186/1471-2164-12-460

Published: 23 September 2011

Abstract

Background

A central aim of studying runs of homozygosity (ROHs) in genome-wide SNP data is to detect the effects of autozygosity (stretches of the two homologous chromosomes within the same individual that are identical by descent) on phenotypes. However, it is unknown which current ROH detection program, and which set of parameters within a given program, is optimal for differentiating ROHs that are truly autozygous from ROHs that are homozygous at the marker level but vary at unmeasured variants between the markers.

Method

We simulated 120 Mb of sequence data in order to know the true state of autozygosity. We then extracted common variants from this sequence to mimic the properties of SNP platforms and performed ROH analyses using three popular ROH detection programs, PLINK, GERMLINE, and BEAGLE. We varied detection thresholds for each program (e.g., prior probabilities, lengths of ROHs) to understand their effects on detecting known autozygosity.
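Because the true autozygous segments are known in simulated data, detected ROHs can be scored against them. The sketch below is one plausible way to do that scoring and is not taken from the article: the function names, the half-open base-pair intervals, and the choice of sensitivity and precision as summary measures are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int]  # half-open interval [start_bp, end_bp)


def overlap_bp(a: List[Segment], b: List[Segment]) -> int:
    """Total base pairs shared by two segment lists.

    Assumes the segments within each list do not overlap one another.
    """
    a, b = sorted(a), sorted(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:
            total += end - start
        # Advance whichever segment finishes first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def detection_summary(true_segs: List[Segment],
                      called_segs: List[Segment]) -> Dict[str, float]:
    """Hypothetical scoring: share of true autozygous bp that was called
    (sensitivity) and share of called bp that is truly autozygous (precision)."""
    shared = overlap_bp(true_segs, called_segs)
    true_bp = sum(end - start for start, end in true_segs)
    called_bp = sum(end - start for start, end in called_segs)
    return {
        "sensitivity": shared / true_bp if true_bp else 0.0,
        "precision": shared / called_bp if called_bp else 0.0,
    }


# Toy example with made-up coordinates.
true_segments = [(0, 100), (200, 300)]
called_segments = [(50, 150), (250, 260)]
summary = detection_summary(true_segments, called_segments)
```

Repeating such a summary across detection thresholds would show how each setting trades missed autozygosity against false calls.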

Results

Within the optimal thresholds for each program, PLINK outperformed GERMLINE and BEAGLE in detecting autozygosity from distant common ancestors. PLINK's sliding window algorithm worked best when using SNP data pruned for linkage disequilibrium (LD).
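To make the idea of calling ROHs from SNP genotypes concrete, the following is a deliberately simplified sketch. It is not PLINK's sliding window algorithm, nor that of GERMLINE or BEAGLE: it simply reports uninterrupted stretches of homozygous genotypes that pass a minimum SNP count and a minimum physical length. The genotype coding (0, 1, 2 copies of an allele, with 1 meaning heterozygous), the function name, and the threshold values are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def find_rohs(genotypes: Sequence[int],
              positions: Sequence[int],
              min_snps: int = 3,
              min_length_bp: int = 200) -> List[Tuple[int, int, int]]:
    """Return (start_bp, end_bp, n_snps) for each run of homozygous SNPs.

    genotypes: allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous).
    positions: base-pair positions, sorted ascending, same length as genotypes.
    A run ends at the first heterozygous call; no heterozygotes are tolerated.
    """
    if len(genotypes) != len(positions):
        raise ValueError("genotypes and positions must have the same length")

    rohs: List[Tuple[int, int, int]] = []

    def close_run(first: int, last: int) -> None:
        # Keep the run only if it passes both thresholds.
        n_snps = last - first + 1
        length = positions[last] - positions[first]
        if n_snps >= min_snps and length >= min_length_bp:
            rohs.append((positions[first], positions[last], n_snps))

    run_start = None
    for i, g in enumerate(genotypes):
        if g != 1:                      # homozygous call extends or opens a run
            if run_start is None:
                run_start = i
        elif run_start is not None:     # heterozygous call closes the run
            close_run(run_start, i - 1)
            run_start = None
    if run_start is not None:           # run reaching the end of the data
        close_run(run_start, len(genotypes) - 1)
    return rohs


# Toy data: ten SNPs spaced 100 bp apart.
toy_genotypes = [0, 2, 2, 1, 0, 0, 0, 0, 1, 2]
toy_positions = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
toy_rohs = find_rohs(toy_genotypes, toy_positions)
```

Real programs add features this sketch omits, such as tolerance for occasional heterozygous or missing calls, and realistic thresholds are far larger than the toy values used here.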

Conclusion

Our results provide both general and specific recommendations for maximizing autozygosity detection in genome-wide SNP data, and should apply equally well to research on whole-genome autozygosity burden or to research on whether specific autozygous regions are predictive using association mapping methods.