The computational pipeline for identifying de novo genes and duplicated new genes in S. cerevisiae. All S. cerevisiae genes with BLASTP hits only in S. cerevisiae were collected, and genes with possible homologs were then excluded. The remaining genes were classified as de novo genes or duplicated new genes. In the last filtering step, genes with short promoters, poorly aligned promoters, or shared core promoters were excluded, leaving 34 de novo genes and 13 duplicated new genes for subsequent analyses in this study. The numbers on the left denote the number of genes remaining after each step. The pie chart illustrates the proportion of genes used in our analysis: the darker parts (34 and 13) show the numbers of genes retained, and the lighter parts (22 and 33) show the numbers of genes discarded.
Tsai et al. BMC Genomics 2012 13:717 doi:10.1186/1471-2164-13-717
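The final promoter filter and the pie-chart proportions described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Candidate` record, its field names, and both thresholds are hypothetical, since the caption does not give the actual cutoffs. Only the counts (34, 13, 22, 33) come from the caption.

```python
from dataclasses import dataclass

# Hypothetical cutoffs: the caption does not state the values actually used.
MIN_PROMOTER_LEN = 100       # assumed minimum promoter length (bp)
MIN_ALIGNMENT_SCORE = 0.5    # assumed minimum promoter alignment quality


@dataclass
class Candidate:
    """Hypothetical record for a gene that reached the last filtering step."""
    name: str
    promoter_len: int
    alignment_score: float
    shares_core_promoter: bool


def passes_promoter_filter(gene: Candidate) -> bool:
    """Keep a gene only if its promoter is long enough, aligns well,
    and its core promoter is not shared with another gene."""
    return (
        gene.promoter_len >= MIN_PROMOTER_LEN
        and gene.alignment_score >= MIN_ALIGNMENT_SCORE
        and not gene.shares_core_promoter
    )


def retained_fraction(retained: int, discarded: int) -> float:
    """Fraction of candidates that survive the filter."""
    return retained / (retained + discarded)


# Counts reported in the caption.
retained_total = 34 + 13     # darker parts of the pie chart
discarded_total = 22 + 33    # lighter parts of the pie chart
overall_fraction = retained_fraction(retained_total, discarded_total)

# Made-up example genes, used only to exercise the filter.
examples = [
    Candidate("geneA", 400, 0.9, False),   # passes all three criteria
    Candidate("geneB", 50, 0.9, False),    # promoter too short
    Candidate("geneC", 400, 0.2, False),   # poor promoter alignment
    Candidate("geneD", 400, 0.9, True),    # shares a core promoter
]
kept = [g.name for g in examples if passes_promoter_filter(g)]
```

With the caption's counts, 47 of the 102 candidates entering the last step are retained. A per-class fraction would require knowing which discarded count belongs to which class, which the caption implies only through the order in which the numbers are listed.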