Table 4

Five of the six Y genes share high nucleotide identity with non-Y sequences
Y gene¹   Non-Y paralog                Identity (E-value)²
sYG1      AsA-bbx³                     95% (0)
sYG2      None⁴                        Not applicable
sYG3      An. stephensi AGAP000048⁵    94.8% (0)
gYG1      3L:34084227–34084914⁶        94% (0)
gYG2      3L:34084227–34084914⁷        94% (0)
gYG3      2L:5229785–5230624⁸          89% (0)

The most similar non-Y sequences to the Y genes in the An. stephensi and An. gambiae genomes.

1. Query sequences are the nucleotide sequences of the Y genes shown in Additional file 3.
2. blastn is used to obtain e-values. Percent identities reflect the entire sequence except for gYG1 and gYG2.
3. AsA-bbx is orthologous to An. gambiae AGAP003896/7. AsA-bbx is an autosomal paralog of sYG1.
4. sYG2 has no similar sequence in the rest of the An. stephensi genome except for a 55 bp repetitive region.
5. sYG3 is closely related to an annotated gene in An. stephensi, which is orthologous to An. gambiae AGAP000048.
6. gYG1 appears to be a composite of different fragments that share high similarities to different un-annotated non-Y sequences in An. gambiae. Some fragments are repeated in the genome. Shown in this table is a 688 base fragment that is 94 percent identical to sequences in chromosome 2R, 3L as well as unmapped scaffolds.
7. gYG2 and gYG1 overlap, which is why the best non-Y match is identical between the two Y genes. gYG2 and gYG1 are transcribed from opposite strands.
8. gYG3 is most similar to an un-annotated 2L sequence and an unmapped sequence in An. gambiae.
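The 688 base length quoted in note 6 can be checked against the coordinates listed in the table. The following is a minimal Python sketch, not part of the original analysis. It assumes the coordinates are 1-based and inclusive, which the table does not state, and the helper name `span_length` is hypothetical.

```python
import re

# Coordinates copied from Table 4 (spaces and en dashes normalised).
# Assumption: positions are 1-based and inclusive; the table does not say.
HITS = {
    "gYG1": "3L:34084227-34084914",
    "gYG2": "3L:34084227-34084914",
    "gYG3": "2L:5229785-5230624",
}


def span_length(region: str) -> int:
    """Return the inclusive length of an 'arm:start-end' region string."""
    # Tolerate stray spaces and en dashes as they appear in the extracted table.
    cleaned = region.replace(" ", "").replace("\u2013", "-")
    match = re.fullmatch(r"(\w+):(\d+)-(\d+)", cleaned)
    if match is None:
        raise ValueError(f"unrecognised region: {region!r}")
    start, end = int(match.group(2)), int(match.group(3))
    return end - start + 1


for gene, region in HITS.items():
    print(gene, region, span_length(region))
```

Under that assumption the gYG1 and gYG2 match spans 688 positions, which agrees with note 6, and the gYG3 match spans 840 positions.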

Hall et al. BMC Genomics 2013 14:273   doi:10.1186/1471-2164-14-273
