
Open Access Correspondence

Data preparation techniques for a perinatal psychiatric study based on linked data

Fenglian Xu1*, Lisa Hilder1, Marie-Paule Austin2 and Elizabeth A Sullivan1

Author Affiliations

1 Perinatal and Reproductive Epidemiology Research Unit, School of Women and Children’s Health, University of New South Wales, Randwick, NSW 2031, Australia

2 Perinatal & Women's Mental Health Unit, St John of God Health Care & School of Psychiatry, University of New South Wales, Burwood, NSW 2134, Australia


BMC Medical Research Methodology 2012, 12:71  doi:10.1186/1471-2288-12-71

Published: 8 June 2012



The use of population-based linked data has increased in recent years, yet few publications describe how linked data are prepared. This paper describes methods for merging data, calculating the statistical variable (SV), recoding psychiatric diagnoses and summarizing hospital admissions for a perinatal psychiatric study.


The data preparation techniques described in this paper are based on linked birth data from the New South Wales (NSW) Midwives Data Collection (MDC), the Register of Congenital Conditions (RCC), the Admitted Patient Data Collection (APDC) and the Pharmaceutical Drugs of Addiction System (PHDAS).


The master dataset is a meaningfully linked dataset that includes all, or the major, study data collections. It can be used to improve data quality and to calculate the SV, and it can be tailored for different analyses. To identify hospital admissions in the periods before pregnancy, during pregnancy and after birth, a statistical variable of time interval (SVTI) needs to be calculated. The paper describes the methods and SPSS syntax for building a master dataset, calculating the SVTI, recoding the principal diagnoses of mental illness and summarizing hospital admissions.
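The idea of a time-interval variable can be illustrated with a minimal sketch. This is written in Python rather than the SPSS syntax the paper provides, and it is not the authors' implementation. The record layout, the field names (`mother_id`, `admit_date`, `birth_date`, `gestation_weeks`), the function names and the rule that pregnancy starts at the birth date minus gestational age are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical birth records, keyed by an assumed linkage identifier.
births = {
    "M1": {"birth_date": date(2006, 3, 1), "gestation_weeks": 40},
}

# Hypothetical hospital admission records (many per mother).
admissions = [
    {"mother_id": "M1", "admit_date": date(2005, 1, 10)},
    {"mother_id": "M1", "admit_date": date(2005, 12, 1)},
    {"mother_id": "M1", "admit_date": date(2006, 3, 11)},
    {"mother_id": "M9", "admit_date": date(2006, 1, 1)},  # no linked birth
]


def classify_admission(admit_date, birth_date, gestation_weeks):
    """Return (days from birth to admission, period label).

    Assumption: pregnancy start is estimated as the birth date minus
    the gestational age; an admission on the birth date itself is
    counted as "during pregnancy".
    """
    pregnancy_start = birth_date - timedelta(weeks=gestation_weeks)
    interval_days = (admit_date - birth_date).days
    if admit_date < pregnancy_start:
        period = "before pregnancy"
    elif admit_date <= birth_date:
        period = "during pregnancy"
    else:
        period = "after birth"
    return interval_days, period


def summarize(births, admissions):
    """Count admissions per mother and period, skipping unlinked records."""
    counts = {}
    for adm in admissions:
        birth = births.get(adm["mother_id"])
        if birth is None:
            continue  # admission without a matching birth record
        _, period = classify_admission(
            adm["admit_date"], birth["birth_date"], birth["gestation_weeks"]
        )
        counts.setdefault(adm["mother_id"], Counter())[period] += 1
    return counts


summary = summarize(births, admissions)
```

Here `summary["M1"]` holds one admission in each of the three periods. Keeping the signed interval in days alongside the period label lets the same variable support other window definitions, such as the first year after birth, without being recomputed.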


Preparing linked data, including building the master dataset and calculating the SV, can improve data quality and make the data more useful for analysis.

Keywords: Data preparation; Method; Psychiatric study; Australia