Open Access Methodology article

A genetic algorithm based method for stringent haplotyping of family data

Francois Besnier1* and Örjan Carlborg1,2

Author Affiliations

1 Linnaeus Centre for Bioinformatics, Uppsala University, SE-75124 Uppsala, Sweden

2 Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, SE-750 07 Uppsala, Sweden

BMC Genetics 2009, 10:57  doi:10.1186/1471-2156-10-57

Published: 17 September 2009



The linkage phase, or haplotype, is an extra level of information that, in addition to genotype and pedigree, can be useful for reconstructing the inheritance pattern of the alleles in a pedigree and for computing, for example, Identity By Descent (IBD) probabilities. If a haplotype is provided, the precision of the estimated IBD probabilities increases, as long as the haplotype is estimated without errors. For IBD estimation it is therefore important to use only haplotypes that are strongly supported by the available data, so that erroneous linkage phases do not introduce new errors.


We propose a genetic algorithm based method for haplotype estimation in family data that includes a stringency parameter. This parameter lets the user decide the error tolerance level when inferring the parental origin of the alleles, which is a novel feature compared to existing methods for haplotype estimation. We show that a high stringency produces haplotype data with few errors, whereas a low stringency provides haplotype estimates in most situations, but with an increased number of errors.
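The trade-off controlled by a stringency parameter can be illustrated with a minimal sketch. This is not the authors' genetic algorithm: it assumes that some upstream procedure has already produced, for each marker, a hypothetical support value between 0 and 1 for one of the two possible phases, and it only shows how a threshold turns such values into phase calls or non-inferred markers. The function name `call_phases` and the example support values are invented for illustration.

```python
from typing import List, Optional


def call_phases(support: List[float], stringency: float) -> List[Optional[int]]:
    """Turn per-marker support values into phase calls.

    support[i] is an assumed (hypothetical) support in [0, 1] for phase 0
    at marker i; the support for phase 1 is taken to be 1 - support[i].
    A phase is called only when the better-supported phase reaches the
    stringency threshold; otherwise the marker is left non-inferred (None).
    """
    calls: List[Optional[int]] = []
    for s in support:
        best = max(s, 1.0 - s)
        if best >= stringency:
            calls.append(0 if s >= 0.5 else 1)
        else:
            calls.append(None)
    return calls


# Illustrative support values only; these are not results from the paper.
support = [0.99, 0.80, 0.55, 0.02, 0.40]

strict = call_phases(support, stringency=0.9)   # fewer calls, better supported
lenient = call_phases(support, stringency=0.5)  # a call at every marker

print(strict)
print(lenient)
```

With the high threshold, only the two markers with near-unanimous support receive a phase and the rest stay non-inferred; with the low threshold, every marker receives a phase, including the weakly supported ones that are most likely to be wrong.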


Because our haplotyping method includes a stringency criterion, the user can keep the error rate at a level suitable for the particular study: one can select anything from haplotyped data with a very small proportion of errors and a higher proportion of non-inferred haplotypes, to data with phase estimates for every marker when haplotype errors are tolerable. This choice makes the method more flexible and useful in a wide range of applications, as it can fulfil different requirements regarding the tolerance for haplotype errors or uncertain marker phases.