Open Access Research article

My sister's keeper?: genomic research and the identifiability of siblings

Christopher A Cassa1,2*, Brian Schmidt2, Isaac S Kohane1,3 and Kenneth D Mandl1,3

Author Affiliations

1 Children's Hospital Informatics Program at the Harvard-MIT Division of Health Sciences and Technology, Boston, MA, USA

2 Clinical Decision Making Group, CSAIL, Massachusetts Institute of Technology, Cambridge, MA, USA

3 Harvard Medical School, Boston, MA, USA

BMC Medical Genomics 2008, 1:32  doi:10.1186/1755-8794-1-32

Published: 25 July 2008



Genomic sequencing of SNPs is increasingly prevalent, though the amount of familial information these data contain has not been quantified.


We provide a framework for measuring the risk to siblings of a patient's SNP genotype disclosure, and demonstrate that sibling SNP genotypes can be inferred with substantial accuracy.
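As a rough illustration of how a sibling's genotype at a single biallelic SNP might be inferred from a patient's genotype, the sketch below enumerates parental genotype pairs and conditions on the patient's genotype. It is not the authors' implementation: it assumes Hardy-Weinberg proportions and random mating for the parents, Mendelian transmission, and no genotyping error, and the names `sibling_posterior`, `hwe_prior` and `child_dist` are hypothetical.

```python
from itertools import product

# "A" is the major allele, "a" the minor allele (labels chosen for this sketch).
GENOTYPES = ("AA", "Aa", "aa")

# Probability that a parent with the given genotype transmits the minor allele.
MINOR_TRANSMIT = {"AA": 0.0, "Aa": 0.5, "aa": 1.0}


def hwe_prior(maf):
    """Genotype frequencies under Hardy-Weinberg for minor allele frequency maf."""
    p = 1.0 - maf
    return {"AA": p * p, "Aa": 2 * p * maf, "aa": maf * maf}


def child_dist(mother, father):
    """Mendelian genotype distribution of one child given both parental genotypes."""
    m, f = MINOR_TRANSMIT[mother], MINOR_TRANSMIT[father]
    return {"AA": (1 - m) * (1 - f), "Aa": m * (1 - f) + (1 - m) * f, "aa": m * f}


def sibling_posterior(patient_genotype, maf):
    """P(sibling genotype | patient genotype), summing over parental genotype pairs."""
    prior = hwe_prior(maf)
    joint = {g: 0.0 for g in GENOTYPES}
    for mother, father in product(GENOTYPES, repeat=2):
        dist = child_dist(mother, father)
        # Weight of this parental pair given that it produced the patient.
        weight = prior[mother] * prior[father] * dist[patient_genotype]
        for g in GENOTYPES:
            joint[g] += weight * dist[g]
    total = sum(joint.values())
    return {g: v / total for g, v in joint.items()}


if __name__ == "__main__":
    print(sibling_posterior("AA", 0.20))
```

Under these assumptions, a patient who is homozygous for the major allele at a SNP with minor allele frequency 0.20 gives a probability of 0.81 that a full sibling carries the same genotype. This is a per-SNP model value, not the empirical accuracy reported from HapMap data.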


Extending this inference technique, we determine that a very low number of matches at commonly varying SNPs is sufficient to confirm sib-ship, demonstrating that published sequence data can reliably be used to derive sibling identities. Using HapMap trio data, at SNPs with a minor allele frequency ≤ 0.20 where one child is homozygous for the major allele (N = 452,684; 65.1%), we achieve 91.9% inference accuracy for sibling genotypes.
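One way to see why a small number of matching SNPs could be informative about sib-ship is a log-likelihood ratio that compares "full siblings" against "unrelated individuals" across SNPs. The sketch below is an illustration only, not the procedure used in the article: it assumes Hardy-Weinberg proportions, independent SNPs (no linkage disequilibrium), and error-free genotypes, and all function names are hypothetical.

```python
import math
from itertools import product

GENOTYPES = ("AA", "Aa", "aa")  # "a" denotes the minor allele in this sketch
MINOR_TRANSMIT = {"AA": 0.0, "Aa": 0.5, "aa": 1.0}


def hwe(maf):
    """Hardy-Weinberg genotype frequencies for minor allele frequency maf."""
    p = 1.0 - maf
    return {"AA": p * p, "Aa": 2 * p * maf, "aa": maf * maf}


def offspring(mother, father):
    """Mendelian genotype distribution of a child of the two parents."""
    m, f = MINOR_TRANSMIT[mother], MINOR_TRANSMIT[father]
    return {"AA": (1 - m) * (1 - f), "Aa": m * (1 - f) + (1 - m) * f, "aa": m * f}


def sib_pair_probability(g1, g2, maf):
    """Joint probability that two full siblings have genotypes g1 and g2."""
    prior = hwe(maf)
    total = 0.0
    for mother, father in product(GENOTYPES, repeat=2):
        dist = offspring(mother, father)
        total += prior[mother] * prior[father] * dist[g1] * dist[g2]
    return total


def log_likelihood_ratio(observations):
    """Sum of log LR (siblings vs unrelated) over (g1, g2, maf) tuples, 0 < maf < 1."""
    total = 0.0
    for g1, g2, maf in observations:
        prior = hwe(maf)
        unrelated = prior[g1] * prior[g2]
        total += math.log(sib_pair_probability(g1, g2, maf) / unrelated)
    return total


if __name__ == "__main__":
    snps = [("aa", "aa", 0.20), ("Aa", "Aa", 0.30), ("AA", "AA", 0.20)]
    print(log_likelihood_ratio(snps))
```

In this model a shared homozygous minor genotype at a SNP with minor allele frequency 0.20 contributes a likelihood ratio of 9, while a shared homozygous major genotype at the same SNP contributes only about 1.27, so matches on less common genotypes carry most of the evidence.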


These findings demonstrate that substantial discrimination and privacy risks arise from use of inferred familial genomic data.