This article is part of the supplement: Eighth International Conference on Bioinformatics (InCoB2009): Bioinformatics

Open Access Proceedings

Protein subcellular localization prediction of eukaryotes using a knowledge-based approach

Hsin-Nan Lin1,2,3, Ching-Tai Chen1,2,3, Ting-Yi Sung2, Shinn-Ying Ho3 and Wen-Lian Hsu2*

Author Affiliations

1 Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, Taiwan, Republic of China

2 Bioinformatics Lab., Institute of Information Science, Academia Sinica, Taipei, Taiwan, Republic of China

3 Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan, Republic of China

BMC Bioinformatics 2009, 10(Suppl 15):S8  doi:10.1186/1471-2105-10-S15-S8

Published: 3 December 2009

Additional files

Additional file 1:

ngLOC dataset. The file contains the whole ngLOC dataset. Each row starting with '>' gives a protein name, and the following row gives its localization site. Localization sites are numbered from 1 to 10, denoting Cytoplasm (CYT), Cytoskeleton (CSK), Endoplasmic Reticulum (END), Extracellular (EXC), Golgi Apparatus (GOL), Lysosome (LYS), Mitochondria (MIT), Nuclear (NUC), Plasma Membrane (PLA), and Peroxisome (POX). The ngLOC dataset can also be downloaded from http://bio-cluster.iis.sinica.edu.tw/kbloc/DataSet.htm.

Format: TXT Size: 391KB
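
The two-row layout described above can be read with a few lines of Python. This is a minimal sketch, not the authors' code: the function name `parse_ngloc` and the sample protein names are hypothetical, and the handling of a site row that holds more than one number (for multi-localized proteins) is an assumption, since the description does not say how such rows are written.

```python
import re

# Site codes 1-10, in the order listed in the file description.
SITE_CODES = {
    1: "CYT", 2: "CSK", 3: "END", 4: "EXC", 5: "GOL",
    6: "LYS", 7: "MIT", 8: "NUC", 9: "PLA", 10: "POX",
}

def parse_ngloc(lines):
    """Yield (protein_name, [site abbreviations]) from an iterable of text lines."""
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A row starting with '>' carries the protein name.
            name = line[1:].strip()
        elif name is not None:
            # Assumption: the site row holds one or more integer codes.
            codes = [int(tok) for tok in re.findall(r"\d+", line)]
            yield name, [SITE_CODES[c] for c in codes]
            name = None

# Hypothetical sample in the described layout.
sample = [">PROT_A", "1", ">PROT_B", "8"]
records = list(parse_ngloc(sample))
```

For the real file, the same function could be applied to an open file handle, for example `parse_ngloc(open(path))`, where `path` points to the downloaded TXT file.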

Open Data

Additional file 2:

KnowPredsite prediction results for single-localized proteins using leave-one-out cross validation. Each row is the prediction result for one protein sequence. Columns A and B give the protein name and the localization site annotation, respectively. Columns C to L are the confidence scores for each localization site. Columns N to Q are the Top 1 to Top 4 accuracies.

Format: CSV Size: 2.1MB
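
To illustrate how a Top k accuracy relates to the ten confidence scores, the sketch below ranks the sites by score and checks whether the annotated site is among the k best. It rests on several assumptions not stated in the file description: that the score columns follow the site order 1 to 10 given for Additional file 1, that a higher score means higher confidence, and that the annotation is compared as a site abbreviation. The function names and the sample rows are hypothetical.

```python
# Assumed column order for the ten confidence scores (columns C to L).
SITES = ["CYT", "CSK", "END", "EXC", "GOL", "LYS", "MIT", "NUC", "PLA", "POX"]

def top_k_sites(scores, k):
    """Return the k sites with the highest confidence scores, best first."""
    ranked = sorted(zip(SITES, scores), key=lambda pair: pair[1], reverse=True)
    return [site for site, _ in ranked[:k]]

def top_k_accuracy(rows, k):
    """rows: list of (annotated_site, scores). Fraction whose annotation is in the top k."""
    hits = sum(1 for site, scores in rows if site in top_k_sites(scores, k))
    return hits / len(rows)

# Two made-up rows: the second is only recovered at k = 2.
rows = [
    ("CYT", [0.6, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.3, 0.0, 0.0]),
    ("NUC", [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1, 0.4, 0.0, 0.0]),
]
top1 = top_k_accuracy(rows, 1)
top2 = top_k_accuracy(rows, 2)
```

With these sample rows, `top1` is 0.5 and `top2` is 1.0, which shows why the accuracy can only stay equal or grow as k increases.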

Open Data

Additional file 3:

KnowPredsite prediction results for single-localized proteins using ten-fold cross validation. The column definitions are the same as those for Additional file 2.

Format: CSV Size: 2.1MB

Open Data

Additional file 4:

KnowPredsite prediction results for multi-localized proteins using leave-one-out cross validation. Each row is the prediction result for one protein sequence. Columns A to L are the same as in Additional file 2. Columns N to Q are the Top 1 to Top 4 accuracies based on the "at least one correct" criterion. Columns S to U are the Top 2 to Top 4 accuracies based on the "both correct" criterion.

Format: CSV Size: 215KB
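
The two criteria named above can be sketched as follows. This is one plausible reading of the criterion names, not a definition taken from the file: "at least one correct" is taken to mean that any annotated site appears among the top k predictions, and "both correct" that every annotated site does, which is why it is only meaningful from k = 2 upward. The function names and the sample ranking are hypothetical.

```python
def at_least_one_correct(annotated, ranked, k):
    """True if any annotated site is among the k best-ranked predictions."""
    return any(site in ranked[:k] for site in annotated)

def both_correct(annotated, ranked, k):
    """True if every annotated site is among the k best-ranked predictions."""
    return all(site in ranked[:k] for site in annotated)

# Made-up example: a protein annotated with two sites and a ranked prediction list.
annotated = {"CYT", "NUC"}
ranked = ["CYT", "MIT", "NUC", "PLA"]

hit_top1 = at_least_one_correct(annotated, ranked, 1)
both_top2 = both_correct(annotated, ranked, 2)
both_top3 = both_correct(annotated, ranked, 3)
```

Here the protein counts as a hit at Top 1 under the first criterion, but under the second criterion it is missed at Top 2 and only counted at Top 3.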

Open Data

Additional file 5:

KnowPredsite prediction results for multi-localized proteins using ten-fold cross validation. The column definitions are the same as those for Additional file 4.

Format: CSV Size: 215KB

Open Data