Table 1

Summary of assembly and EST data

Number of reads                                                  582,650
Total bases                                                      181 Mb
Average read length after MIRA (bp)                              312
Number of contigs                                                75,407
Average contig length (bp)                                       509
Range of contig length (bp)                                      40-3,400
Number of singletons                                             3,071
Number of contigs with 2 reads                                   29,206
Number of contigs with > 2 reads                                 43,130
Contigs with BLASTx matches (E-value ≤ 10⁻⁶)                     18,407
*Remaining contigs with additional matches (E-value ≤ 10⁻²)      3,616
Contigs determined by ESTscan                                    17,402
**Total number of transcripts                                    39,425
**Total number of putatively translated amino-acid sequences     42,073


*Contigs without BLASTx matches at an E-value cut-off of 10⁻⁶ were queried again with BLASTx at an E-value cut-off of 10⁻².

**The total number of amino-acid sequences exceeds the number of transcripts because a contig can have more than one annotated protein hit.
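
The counts in Table 1 appear to be additive, although the table does not say so explicitly: singletons plus the two contig classes sum to the number of contigs, and the two BLASTx tiers plus the ESTscan contigs sum to the total number of transcripts. The minimal sketch below checks that arithmetic. The dictionary keys and the function name `check_table` are hypothetical labels chosen for illustration; only the numbers come from the table.

```python
# Values copied from Table 1; the key names are illustrative assumptions.
TABLE_1 = {
    "singletons": 3_071,
    "contigs_2_reads": 29_206,
    "contigs_gt2_reads": 43_130,
    "contigs_total": 75_407,
    "blastx_e6": 18_407,   # BLASTx matches, E-value <= 1e-6
    "blastx_e2": 3_616,    # remaining contigs matched at E-value <= 1e-2
    "estscan": 17_402,     # contigs determined by ESTscan
    "transcripts_total": 39_425,
    "amino_acid_seqs": 42_073,
}


def check_table(t):
    """Return the recomputed sums and the surplus of protein sequences."""
    # Assumption: singletons are counted among the contigs.
    contig_sum = t["singletons"] + t["contigs_2_reads"] + t["contigs_gt2_reads"]
    # Assumption: transcripts are the sum of the three annotation routes.
    transcript_sum = t["blastx_e6"] + t["blastx_e2"] + t["estscan"]
    # Surplus attributed in the footnote to contigs with more than one protein hit.
    surplus = t["amino_acid_seqs"] - t["transcripts_total"]
    return contig_sum, transcript_sum, surplus


contig_sum, transcript_sum, surplus = check_table(TABLE_1)
print(contig_sum == TABLE_1["contigs_total"])          # True
print(transcript_sum == TABLE_1["transcripts_total"])  # True
print(surplus)                                         # 2648
```

Under these assumptions, 2,648 amino-acid sequences are in excess of the transcript count, which is the difference the second footnote accounts for.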

Bettencourt et al. BMC Genomics 2010 11:559   doi:10.1186/1471-2164-11-559
