Table 5

Y gene homologs
| Y gene | Homolog [1] | Maximum percent identity [2] | E-value [2] | Homolog putative function |
| --- | --- | --- | --- | --- |
| sYG1 | AGAP003896 | 35% | 1e-39 | Bobby sox (bbx) HMG-box transcription factor |
| sYG2 | No homology found | - | - | - |
| sYG3 | AGAP000048 | 37% | 2e-13 | Adenomatous polyposis coli protein |
| gYG1 | AGAP005574 | 47% | 1e-16 | Unknown |
| | AGAP011774 | 36% | 9e-08 | Unknown |
| | AGAP001079 [3] | 89% [3] | 0.034 [3] | Unknown |
| gYG2 | AGAP011734 [4] | 38% [4] | 1e-5 [4] | Unknown |
| gYG3 | AGAP012527 | 33% | 2e-26 | General transcription factor II repeat domain |

The annotated homologs of the Y genes provide clues to putative functions of the Y genes we discovered.

1. Homologs were identified using blastx against the An. gambiae protein database (AgamP3.6 Gene Build) and the NCBI non-redundant protein database. Not surprisingly, the best matches are to An. gambiae sequences because the queries are from An. gambiae and An. stephensi. The An. gambiae genes AGAP003896 and AGAP000048 are orthologous to the autosomal or X paralogs of sYG1 and sYG3, respectively. The other homologs listed in column 2 are either distantly related or only share partial overlap. In cases where the Y genes match more than one related homolog, the homolog with the best e-value is shown. The three homologs shown for gYG1 are unrelated.

2. Maximum percent identities and e-values were obtained by blastx against the An. gambiae protein database (AgamP3.6 Gene Build). For Y genes that have confirmed introns (sYG1, gYG1, and gYG2), transcript sequences were used as the query.

3. The match to AGAP001079 is very short at the amino acid level, although the nucleotide identity is 92 percent, and 98 percent in two longer fragments (e-value of 6×10^-18). The sense strand of gYG1 is reverse-complementary to AGAP001079.

4. The sense strand of gYG2 is reverse-complementary to AGAP011734.
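
The rule of reporting the homolog with the best e-value when a query matches several related homologs can be illustrated with a minimal sketch. It assumes the blastx results were saved in BLAST tabular format (`-outfmt 6` with the default twelve columns), which the source does not state. The query and subject names in the example rows are hypothetical and are not taken from the table, and the functions `parse_tabular` and `best_hit_per_query` are illustrative helpers, not part of any published pipeline.

```python
from typing import Dict, Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    percent_identity: float
    evalue: float


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse BLAST tabular output (assumed: -outfmt 6, default columns).

    Default column order: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        yield Hit(fields[0], fields[1], float(fields[2]), float(fields[10]))


def best_hit_per_query(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep, for each query, the hit with the lowest e-value."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query)
        if current is None or hit.evalue < current.evalue:
            best[hit.query] = hit
    return best


# Hypothetical rows for illustration only (not data from the study).
example = [
    "queryA\tsubject1\t35.0\t300\t190\t5\t1\t900\t10\t310\t1e-39\t150",
    "queryA\tsubject2\t31.0\t120\t80\t3\t1\t360\t5\t125\t2e-08\t55",
    "queryB\tsubject3\t37.0\t150\t90\t4\t1\t450\t20\t170\t2e-13\t70",
]
best = best_hit_per_query(parse_tabular(example))
for query, hit in best.items():
    print(query, hit.subject, hit.percent_identity, hit.evalue)
```

Note that this selection only makes sense among related homologs. Unrelated matches, such as the three listed for gYG1, would need to be reported separately rather than collapsed to a single best hit.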

