(A) SAGE library statistics: Summary statistics of the 24 SAGE libraries analyzed in this study. Mapping information was based on the May 10, 2006 version of SAGEGenie. In total, over 3,000,000 SAGE tags were sequenced, with over 110,000 unique tags represented after the exclusion of super singleton tags (tags that have a count of 1 in a single library only). Approximately 75% of these 110,000 unique tags, potentially representing as many unique transcripts, mapped to an annotated UniGene cluster. As multiple SAGE tags frequently map to the same UniGene cluster, we identified a total of 25,653 distinct UniGene clusters within our dataset, approximately 68% of which represent previously characterized genes. Notably, 25% of the unique tags had no mapping, suggesting that much transcript information is currently unknown. (B) Transcriptome Venn diagram: Venn diagram of the transcriptomes of current, former and never smokers. Reported is the number of tags expressed in every library of a group at a raw tag count greater than or equal to 2, representing the tags that are constitutively expressed in each set. Nearly 2,000 SAGE tags, mapping to over 1,700 genes, are common to all 24 SAGE libraries. The lower number of never smokers may have contributed to the higher number of preferentially expressed transcripts in this group.
Chari et al. BMC Genomics 2007 8:297 doi:10.1186/1471-2164-8-297
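The two definitions used in the caption (super singleton tags, and tags constitutively expressed in a group at a raw count of at least 2) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the data layout (a dictionary of per-library tag counts), the function names, and the toy tags and library names are all assumptions made for illustration only.

```python
from collections import defaultdict


def remove_super_singletons(libraries):
    """Drop tags whose only occurrence is a count of 1 in a single library.

    `libraries` is an assumed layout: {library_name: {tag: raw_count}}.
    """
    total = defaultdict(int)
    for counts in libraries.values():
        for tag, count in counts.items():
            total[tag] += count
    # A summed count of 1 means the tag was seen once, in one library only.
    super_singletons = {tag for tag, n in total.items() if n == 1}
    return {
        name: {t: c for t, c in counts.items() if t not in super_singletons}
        for name, counts in libraries.items()
    }


def constitutive_tags(libraries, group, min_count=2):
    """Tags with raw count >= min_count in every library of `group`."""
    per_library = [
        {t for t, c in libraries[name].items() if c >= min_count}
        for name in group
    ]
    return set.intersection(*per_library) if per_library else set()


# Hypothetical toy data, not taken from the study.
toy = {
    "current_1": {"AAAA": 5, "CCCC": 1, "GGGG": 3},
    "current_2": {"AAAA": 2, "GGGG": 1},
    "never_1": {"AAAA": 4, "TTTT": 2, "GGGG": 2},
}
filtered = remove_super_singletons(toy)
current = constitutive_tags(filtered, ["current_1", "current_2"])
never = constitutive_tags(filtered, ["never_1"])
shared = current & never      # central region of the Venn diagram
never_only = never - current  # tags preferentially expressed in one group
```

In this toy layout a group with fewer libraries imposes fewer conditions on the intersection, so it tends to retain more tags, which is consistent with the caption's remark about the never smoker group.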