An optimized procedure greatly improves EST vector contamination removal
* Corresponding author: Pei-Ing Hwang pihwang@gate.sinica.edu.tw
† Equal contributors
1 Bioinformatics Core Laboratory, Agricultural Biotechnology Research Center, Academia Sinica, Taipei, Taiwan
2 Lab of Mathematics in Biology, Institute of Statistical Sciences, Academia Sinica, Taipei, Taiwan
BMC Genomics 2007, 8:416 doi:10.1186/1471-2164-8-416
Published: 13 November 2007
Additional files
Additional file 1:
BLAST evaluation of the vector trimming results obtained with the three trimming programs using either vector form. This Excel file shows the BLAST analysis results after applying the filtering criteria used throughout this report. The name of each worksheet indicates the bioinformatic program and the vector form used for trimming. The worksheet "column descript" describes what each column name represents. This file contains all the source data used to derive Tables 3 and 4 in the main text.
Format: XLS Size: 635KB
Additional file 2:
Effect of vectors on vector trimming performance by three programs. Same as Table 4 in the main text, except that the number of incompletely trimmed ESTs is reported instead of the percentage.
Format: PDF Size: 29KB
Additional file 3:
Artifact vector trimming found with ESTs from dbEST at NCBI. The error rate of dbEST, with emphasis on vector contamination, was investigated by running BLAST on ESTs randomly sampled from dbEST at NCBI, either against UniVec (worksheet "601_UniVec") or against the sequences of their cloning vectors (worksheet "601_22vector"). The Excel file shows the BLAST results filtered according to the criteria described in Methods. Note that in worksheet "601_22vector", only the 35,363 EST sequences that were cloned into the 22 most prevalent vectors were used for BLAST analysis (see Methods for details). The worksheet "col des" provides a description of each column.
Format: XLS Size: 1.2MB
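For readers who want to reproduce this kind of filtering on their own data, the following is a minimal sketch of how tabular BLAST hits of ESTs against vector sequences might be filtered. It is not the authors' procedure: it assumes the standard 12-column tabular BLAST output, and the thresholds `MIN_IDENTITY` and `MIN_LENGTH`, the function names, and the example hits are all hypothetical placeholders. The actual filtering criteria are those described in Methods.

```python
import csv
import io

# Standard 12 columns of tabular BLAST output.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Hypothetical thresholds for illustration only; not the paper's criteria.
MIN_IDENTITY = 90.0  # minimum percent identity of the hit
MIN_LENGTH = 20      # minimum alignment length in bases


def filter_hits(handle, min_identity=MIN_IDENTITY, min_length=MIN_LENGTH):
    """Return the hits that pass both thresholds, as a list of dicts."""
    kept = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        if (float(hit["pident"]) >= min_identity
                and int(hit["length"]) >= min_length):
            kept.append(hit)
    return kept


def contaminated_ests(hits):
    """Return the sorted, de-duplicated EST identifiers among the kept hits."""
    return sorted({hit["qseqid"] for hit in hits})


# Made-up example hits (EST identifiers and vector names are placeholders).
example = (
    "EST001\tvectorA\t98.50\t45\t0\t0\t1\t45\t100\t144\t1e-18\t80.5\n"
    "EST002\tvectorA\t85.00\t40\t6\t0\t1\t40\t10\t49\t1e-05\t40.1\n"
    "EST003\tvectorB\t100.00\t12\t0\t0\t300\t311\t5\t16\t0.5\t24.3\n"
)
hits = filter_hits(io.StringIO(example))
print(contaminated_ests(hits))
```

With these placeholder thresholds, only the first hit is retained: the second fails the identity cutoff and the third fails the length cutoff.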
