Open Access Research article

Identification of proprotein convertase substrates using genome-wide expression correlation analysis

Hannu Turpeinen1,2, Sampo Kukkurainen2,3, Kati Pulkkinen1,2, Timo Kauppila1,2,3, Kalle Ojala4, Vesa P Hytönen2,3 and Marko Pesu1,2,5*

Author affiliations

1 Immunoregulation, Institute of Biomedical Technology, FI-33014 University of Tampere, Finland

2 BioMediTech, Tampere, Finland

3 Protein Dynamics, Institute of Biomedical Technology, University of Tampere, Finland

4 Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland

5 Centre for Laboratory Medicine, Tampere University Hospital, Finland

Citation and License

BMC Genomics 2011, 12:618  doi:10.1186/1471-2164-12-618

Published: 20 December 2011



Subtilisin/kexin-like proprotein convertase (PCSK) enzymes have important regulatory functions in a wide variety of biological processes. PCSKs proteolytically process their substrates at a target sequence that contains the basic amino acids arginine and lysine, which results in functional maturation of the target protein. In vitro assays have shown significant biochemical redundancy between the seven family members, but the phenotypes of PCSK-deficient mice and of patients carrying an inactive PCSK allele argue that each enzyme has a specific biological function. Modeling the structures of individual PCSK enzymes has offered little insight into the specificity determinants. However, previous studies have shown that the expression of a PCSK and its target molecule can be coordinated. Here, we have surveyed the putative PCSK target proteins using genome-wide expression correlation analysis and cleavage site prediction algorithms.
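As a rough illustration of what a basic-residue target sequence looks like, the sketch below scans a protein sequence for the simplified motif Arg-X-(Lys/Arg)-Arg, which is commonly cited as a minimal furin-type site. This motif, the function name `find_candidate_sites`, and the example sequence are assumptions made for illustration only; the cleavage site prediction algorithms used in the study are not described here and are likely more elaborate than a regular expression.

```python
import re

# Assumed, simplified motif: Arg - any residue - Lys/Arg - Arg.
# A lookahead is used so that overlapping matches are also reported.
MOTIF = re.compile(r"(?=(R.[KR]R))")


def find_candidate_sites(sequence):
    """Return 1-based positions of the final Arg of each motif match,
    i.e. the residue after which cleavage would be expected."""
    return [m.start() + 4 for m in MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequence, not taken from any real protein.
print(find_candidate_sites("AAARAKRSS"))
```

A motif match only marks a candidate; whether the site is accessible and actually cleaved has to be established experimentally.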


We first performed a genome-wide gene expression correlation analysis for all PCSK enzymes. PCSKs were found to cluster differently based on the strength of correlations. The screen for putative PCSK target proteins showed a significant enrichment (p-values from 1.2e-4 to < 1.0e-10) of putative targets among the most positively correlating genes for most PCSKs. Interestingly, there was no enrichment of putative targets among the genes that correlated positively with the biologically redundant PCSK7, whereas PCSK5 showed an inverse correlation. PCSKs also showed a highly variable degree of shared target genes, as identified by expression correlation and cleavage site prediction. Multiple alignments of the putative targets were used to pinpoint the residues important for substrate recognition. Finally, we validated our approach and biochemically identified PAPPA1 and ADAMTS6 as novel targets of FURIN proteolytic activity.
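The screening idea can be sketched in a few lines: rank all genes by their expression correlation with a given PCSK, then ask whether predicted targets are over-represented among the top-ranked genes. The sketch below assumes Pearson correlation and a one-sided hypergeometric test; the abstract does not state which correlation measure or enrichment statistic was used, and the function names and toy expression values are hypothetical.

```python
from math import comb, sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / sqrt(sxx * syy)


def top_correlated(expression, query, n_top):
    """Genes with the highest positive correlation to the query gene."""
    scores = {g: pearson(expression[query], v)
              for g, v in expression.items() if g != query}
    return sorted(scores, key=scores.get, reverse=True)[:n_top]


def enrichment_p(top_genes, predicted_targets, universe_size):
    """One-sided hypergeometric p-value for seeing at least the observed
    number of predicted targets among the top genes. Assumes every
    predicted target belongs to the gene universe."""
    n, K = len(top_genes), len(predicted_targets)
    k = len(set(top_genes) & set(predicted_targets))
    total = comb(universe_size, n)
    return sum(comb(K, i) * comb(universe_size - K, n - i)
               for i in range(k, min(K, n) + 1)) / total


# Hypothetical toy data: one query gene and four other genes.
expression = {
    "FURIN": [1, 2, 3, 4],
    "A": [2, 4, 6, 8],
    "B": [4, 3, 2, 1],
    "C": [1, 3, 2, 4],
    "D": [5, 5, 5, 5],
}
top = top_correlated(expression, "FURIN", 2)
p = enrichment_p(top, {"A", "C"}, universe_size=4)
```

With real data the universe would be all genes on the expression platform, and the p-values would need correction for testing several PCSKs.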


Most PCSK enzymes display a strong positive expression correlation with their predicted target proteins in our genome-wide analysis. We also show that an expression correlation screen combined with cleavage site prediction analysis can be used to identify novel bona fide target molecules for PCSKs. Exploring the positively correlating genes can thus offer additional insights into the biology of proprotein convertases.