Effect of RNA quality on transcript intensity levels in microarray analysis of human post-mortem brain tissues
Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, Biberach an der Riss, Germany
BMC Genomics 2008, 9:91 doi:10.1186/1471-2164-9-91
Published: 25 February 2008
Large-scale gene expression analysis of post-mortem brain tissue offers unique opportunities for investigating genetic mechanisms of psychiatric and neurodegenerative disorders. However, the microarray data analysis associated with these studies is a challenging task. In this publication we address the issue of data derived from low-quality RNA and the corresponding data analysis strategies.
A detailed analysis of the effects of post-chip RNA quality on the measured abundance of transcripts is presented. Affymetrix GeneChip data (HG-U133_AB and HG-U133_Plus_2.0) derived from ten different brain regions were investigated. Post-chip RNA quality, assessed by the 5'/3' ratio of housekeeping genes, was found to introduce pronounced systematic noise into the measured transcript expression levels. According to this study, RNA quality effects have: 1) a "random" component, which is introduced by the technology, and 2) a systematic component, which depends on the features of the transcripts and probes. Random components mainly account for numerous negative correlations of low-abundance transcripts. These negative correlations are not reproducible and are mainly introduced by an increased relative level of noise. Three major contributors to the systematic noise component were identified: the first is the probe set distribution, the second is the length of mRNA species, and the third is the stability of mRNA species. Positive correlations reflect the 5'-end to 3'-end direction of mRNA degradation, whereas negative correlations result from the compensatory increase in stable and 3'-end probed transcripts. Systematic components affect the expressed transcripts by introducing irrelevant gene correlations and can strongly influence the results of the main experiment. A linear model correcting the effect of RNA quality on measured intensities was introduced.
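The idea of a linear correction for RNA quality can be illustrated with a minimal sketch. The abstract does not give the exact form of the model, so the code below is only an assumption: for a single transcript, it fits an ordinary least-squares line of log-scale intensity against a per-sample quality score (for example a 5'/3' ratio) and subtracts the fitted trend while keeping the transcript mean. The function names `fit_slope` and `correct_for_quality` are hypothetical and do not come from the paper.

```python
from statistics import mean


def fit_slope(quality, values):
    """Least-squares slope of transcript values regressed on the quality score."""
    q_bar, v_bar = mean(quality), mean(values)
    sxx = sum((q - q_bar) ** 2 for q in quality)
    if sxx == 0:
        # All samples share the same quality score: no trend can be estimated.
        return 0.0
    sxy = sum((q - q_bar) * (v - v_bar) for q, v in zip(quality, values))
    return sxy / sxx


def correct_for_quality(quality, values):
    """Remove the fitted linear quality trend, preserving the transcript mean.

    Assumption: one quality score per sample and log-scale intensities.
    """
    slope = fit_slope(quality, values)
    q_bar = mean(quality)
    return [v - slope * (q - q_bar) for q, v in zip(quality, values)]


# Illustrative (made-up) values: intensity falls as the quality score rises.
quality_scores = [1.0, 2.0, 3.0, 4.0]
log_intensity = [8.0, 7.5, 7.0, 6.5]
corrected = correct_for_quality(quality_scores, log_intensity)
```

In this toy input the intensities depend perfectly on the quality score, so the corrected values collapse onto the transcript mean. With real data only the quality-related part of the variation would be removed, and any biological signal confounded with RNA quality would be removed along with it.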
In addition, the contribution of a number of pre-mortem and post-mortem attributes to the overall detected RNA quality effect was investigated. Brain pH, duration of the agonal stage, post-mortem interval before sampling, and donor's age at death were found, within the limits considered, to make no significant contribution.
Basic conclusions for data analysis in expression profiling studies are as follows: 1) testing for RNA quality dependency should be included in the preprocessing of the data; 2) investigating inter-gene correlation without regard to RNA quality effects could be misleading; 3) data normalization procedures relying on housekeeping genes either do not influence the correlation structure (if 3'-end intensities are used) or increase it for negatively correlated transcripts (if 5'-end or median intensities are included in the normalization procedure); 4) sample sets should be matched with regard to RNA quality; 5) RMA preprocessing is more sensitive to the RNA quality effect than MAS 5.0.
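Conclusion 1 above, testing for RNA quality dependency during preprocessing, could be approached as in the following sketch. It computes the Pearson correlation between each transcript and a per-sample quality score and reports the transcripts whose absolute correlation reaches a cutoff. This is an illustration under assumptions, not the procedure used in the study: the function names, the data layout (a dictionary of transcript name to values) and the cutoff of 0.7 are all hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


def flag_quality_dependent(quality, expression, threshold=0.7):
    """Return {transcript: r} for transcripts with |r| >= threshold.

    `threshold` is an arbitrary illustrative cutoff, not a value from the paper.
    """
    flagged = {}
    for name, values in expression.items():
        r = pearson_r(quality, values)
        if abs(r) >= threshold:
            flagged[name] = r
    return flagged


# Illustrative (made-up) data: geneA tracks the quality score, geneB does not.
quality_scores = [1.0, 2.0, 3.0, 4.0]
expression = {
    "geneA": [8.0, 7.5, 7.0, 6.5],
    "geneB": [5.0, 5.2, 4.9, 5.1],
}
flagged = flag_quality_dependent(quality_scores, expression)
```

A screen of this kind only detects linear dependence on the chosen quality score. With many transcripts, a formal significance test with multiple-testing correction would be a more defensible choice than a fixed cutoff.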