Flowchart of EST sequence processing for contig analysis and GenBank submission. Bad trace data, failed sequence reads, empty vectors, and short transcripts were removed from 11520 sequence reads to obtain 9696 high-quality ESTs. The ESTs were assembled into 2894 contigs, which were then searched with 5 different BLAST algorithms to obtain 6669 best hits for 1695 contigs with scores ≤ 1e-05. Hits with the smallest score were selected for contig analysis to evaluate weak and significant similarities to NCBI entries and to evaluate the distribution of organisms with the closest similarity. Contigs were split into their respective ESTs for submission to GenBank, and the best hits from all BLAST algorithms were attached to facilitate further analysis.
Borchardt et al. BMC Genomics 2010 11:4 doi:10.1186/1471-2164-11-4
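The hit-selection step in the flowchart can be illustrated with a minimal Python sketch. It assumes that the reported scores are BLAST E-values, that selection is done per contig, and that hits are already parsed into simple records. The `Hit` record, the function name `best_hit_per_contig`, the algorithm labels, and the example data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record layout, not the authors' format)."""
    contig: str
    algorithm: str
    subject: str
    evalue: float


# Cutoff from the caption: only hits with scores <= 1e-05 are kept.
E_VALUE_CUTOFF = 1e-5


def best_hit_per_contig(hits: Iterable[Hit],
                        cutoff: float = E_VALUE_CUTOFF) -> Dict[str, Hit]:
    """Keep, for each contig, the hit with the smallest E-value at or below the cutoff."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.evalue > cutoff:
            continue  # too weak to count as a hit
        current = best.get(hit.contig)
        if current is None or hit.evalue < current.evalue:
            best[hit.contig] = hit
    return best


# Made-up example data; algorithm and subject names are placeholders.
example_hits = [
    Hit("contig_1", "blastn", "subject_A", 1e-20),
    Hit("contig_1", "blastx", "subject_B", 1e-50),
    Hit("contig_2", "tblastx", "subject_C", 1e-3),
]
selected = best_hit_per_contig(example_hits)
```

In this sketch `contig_2` receives no hit because its only candidate exceeds the cutoff, mirroring how only a subset of the assembled contigs (1695 of 2894) obtained hits.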