Bioinformatics pipeline for EST clustering, assembly and annotation. We generated 16,112 reads (9,214 from JD and 6,998 from JG). After trimming, 13,249 reads were retained for clustering. Clustering produced 7,283 valid clusters, which were aligned against the GenBank non-redundant protein database (NR), the Arabidopsis thaliana predicted proteome (At) and the Ricinus communis predicted proteome (Rc).
Costa et al. BMC Genomics 2010 11:462 doi:10.1186/1471-2164-11-462