This article is part of the supplement: International Workshop on Computational Systems Biology: Approaches to Analysis of Genome Complexity and Regulatory Gene Networks
Research
Parameterization of disorder predictors for large-scale applications requiring high specificity by using an extended benchmark dataset
1 Biomolecular Function Discovery Division, Bioinformatics Institute (BII), Agency for Science Technology and Research (A*STAR), 30 Biopolis Street, #07-01, Matrix, 138671, Singapore
2 Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, 117543, Singapore
3 School of Computer Engineering (SCE), Nanyang Technological University (NTU), 50 Nanyang Drive, 637553, Singapore
BMC Genomics 2010, 11(Suppl 1):S15 doi:10.1186/1471-2164-11-S1-S15
Published: 10 February 2010
Additional files
Additional file 1:
SL dataset. The SL dataset comprises DisProt r4.5 sequences re-annotated to cover both short and long disordered regions, as well as ordered ones. The file is in FASTA format: each amino acid sequence is given in single-letter code and is preceded by a one-line header that starts with the symbol ">". The annotation of disordered and ordered regions follows the DisProt convention: disordered regions are denoted by the symbol "#" and ordered ones by the symbol "&", each followed by the start and end residues of the region (e.g. #1-10 &11-70 #71-100, where residues 1 to 10 and 71 to 100 are disordered and residues 11 to 70 are ordered).
Format: TXT Size: 269KB
Additional file 2:
Remark 465 dataset. The Remark 465 dataset comprises the sequences from DisProt r4.5 in which at least one structural domain was found. Residues listed under Remark 465 in the PDB are annotated here as disordered. Consequently, the Remark 465 dataset comprises mainly short disordered regions. The file is in FASTA format: each amino acid sequence is given in single-letter code and is preceded by a one-line header that starts with the symbol ">". The annotation of disordered and ordered regions follows the DisProt convention: disordered regions are denoted by the symbol "#" and ordered ones by the symbol "&", each followed by the start and end residues of the region (e.g. #1-10 &11-70 #71-100, where residues 1 to 10 and 71 to 100 are disordered and residues 11 to 70 are ordered).
Format: TXT Size: 185KB
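The region annotation used in Additional files 1 and 2 can be illustrated with a short Python sketch. It is not part of the supplement. The function names parse_regions and residue_labels are hypothetical, and the sketch assumes only that the annotation is available as a string of tokens such as "#1-10 &11-70 #71-100"; where exactly these tokens sit in each file should be checked against the files themselves. Residues not covered by any region are marked "?".

```python
import re
from typing import List, Tuple

# One region token: "#" (disordered) or "&" (ordered), then start-end.
# Assumption: tokens look like "#1-10" or "&11-70", as in the example
# given in the file descriptions.
REGION_RE = re.compile(r"([#&])(\d+)-(\d+)")

Region = Tuple[str, int, int]  # (state, start, end), 1-based inclusive


def parse_regions(annotation: str) -> List[Region]:
    """Extract (state, start, end) tuples from an annotation string."""
    regions = []
    for symbol, start, end in REGION_RE.findall(annotation):
        state = "disordered" if symbol == "#" else "ordered"
        regions.append((state, int(start), int(end)))
    return regions


def residue_labels(regions: List[Region], length: int) -> str:
    """Expand regions into one label per residue: D, O, or ? if unannotated."""
    labels = ["?"] * length
    for state, start, end in regions:
        mark = "D" if state == "disordered" else "O"
        # Clamp to the sequence bounds so malformed ranges cannot wrap around.
        for pos in range(max(start, 1), min(end, length) + 1):
            labels[pos - 1] = mark
    return "".join(labels)


if __name__ == "__main__":
    example = "#1-10 &11-70 #71-100"
    regions = parse_regions(example)
    labels = residue_labels(regions, 100)
    print(regions)
    print(labels.count("D"), "disordered;", labels.count("O"), "ordered")
```

A per-residue label string of this kind is convenient when comparing the annotation position by position with the output of a disorder predictor.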
Additional file 3:
Supplementary Table and Figures 1 and 2.
Format: DOC Size: 330KB


