Open Access Methodology article

Analysis of multiple phenotypes in genome-wide genetic mapping studies

Chen Suo1,2*, Timothea Toulopoulou3,4,5, Elvira Bramon6,7, Muriel Walshe6,7, Marco Picchioni6,7,8, Robin Murray6,7 and Jurg Ott1

Author Affiliations

1 Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China

2 Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden

3 Department of Psychology, The University of Hong Kong, Hong Kong, Hong Kong

4 State Key Laboratory of Brain and Cognitive Sciences, The University of Hong Kong, Hong Kong, Hong Kong

5 King's College London, King's Health Partners, Department of Psychosis Studies, Institute of Psychiatry, London, UK

6 Institute of Psychiatry, King’s College, London, UK

7 St Andrew’s Academic Centre, King’s College London, Northampton, UK

8 St Andrew’s Academic Centre, King’s College London, Northampton, UK

BMC Bioinformatics 2013, 14:151  doi:10.1186/1471-2105-14-151

Published: 2 May 2013

Abstract

Background

Complex traits may be defined by a range of different criteria. Performing analyses solely on a final, clinically dichotomized affected/unaffected variable would discard information carried by the underlying phenotypes.

Results

We assess the performance of four alternative approaches for the analysis of multiple phenotypes in genetic association studies. We describe the four methods in detail and discuss their relative theoretical merits and disadvantages. Using simulation, we demonstrate that principal component analysis (PCA) provides the greatest power both when phenotypes are correlated and when the number of phenotypes is large. The multivariate approach had low type I error only with independent phenotypes or small numbers of phenotypes. Our application of the four methods to schizophrenia data provides converging evidence of their relative performance.

Conclusions

Based on power analysis of simulated data and testing of experimental data, we conclude that PCA, which creates a single variable as a linear combination of all the traits, performs optimally. We propose that our comparison will provide insight into the properties of the methods and help researchers choose an appropriate strategy in future experimental studies.
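As an illustration of what "one variable based on a linear combination of all the traits" can look like in practice, the following is a minimal sketch, not the authors' implementation. It assumes the first principal component of the standardized phenotypes is used as the combined trait and that association with a single marker is summarized by a simple correlation-based t statistic. The function names (`first_pc_score`, `simulate`), the toy simulation and all parameter values are hypothetical and chosen only for the example.

```python
import numpy as np


def first_pc_score(phenotypes):
    """Return (scores, weights) for the first principal component.

    phenotypes: array of shape (n_individuals, n_traits).
    Assumes no trait is constant (otherwise standardization divides by zero).
    """
    X = np.asarray(phenotypes, dtype=float)
    # Standardize each trait so that no trait dominates because of its scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    weights = vt[0]
    # The combined trait is a linear combination of the standardized traits.
    return Z @ weights, weights


def simulate(n=500, k=5, effect=0.3, seed=0):
    """Toy data (hypothetical): one marker and k correlated phenotypes."""
    rng = np.random.default_rng(seed)
    genotype = rng.binomial(2, 0.3, size=n).astype(float)  # 0, 1 or 2 alleles
    shared = rng.normal(size=n)  # shared factor induces correlation among traits
    noise = rng.normal(size=(n, k))
    phenotypes = effect * genotype[:, None] + shared[:, None] + noise
    return genotype, phenotypes


genotype, phenotypes = simulate()
scores, weights = first_pc_score(phenotypes)

# Correlation between genotype and the combined trait, converted to a
# t statistic with n - 2 degrees of freedom. The sign of a principal
# component is arbitrary, so only the magnitude is meaningful here.
r = np.corrcoef(genotype, scores)[0, 1]
n = len(genotype)
t_stat = r * np.sqrt((n - 2) / (1 - r**2))

print("PC1 weights:", np.round(weights, 3))
print("abs(t statistic):", round(abs(float(t_stat)), 3))
```

Standardizing before the decomposition is a design choice in this sketch: it makes the weights reflect the correlation structure of the traits rather than their measurement units.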

Keywords:
Multiple phenotypes; Statistical method; Genetic mapping