Method for defining start and stop codon coordinates. The thick black bar indicates the location of the original BLAST HSP, and the thick grey bar indicates the gene coordinates reported by YGAP. M and asterisk (*) represent the locations of all possible start (ATG) and stop (TAA/TAG/TGA) codons in the same frame as the HSP. The start codon is chosen by searching around the beginning of the HSP as follows: (A) If the HSP (or the upstream HSP, in the case where a pair of HSPs is being considered) begins with a methionine codon, no change is made to the starting coordinate. (B) If the HSP does not begin with methionine, the ORF is extended to the furthest upstream methionine. (C) If during extension a stop codon is encountered before reaching a methionine, the software instead searches for a leading methionine within the first 45 nucleotides of the HSP. (D) If no suitable starting methionine is found using these steps, the original coordinates of the HSP are kept and the gene is tagged for manual inspection. Stop codons are found by walking downstream from the HSP, unless there is a stop codon within the HSP (in which case the HSP is trimmed accordingly).
Proux-Wéra et al. BMC Bioinformatics 2012 13:237 doi:10.1186/1471-2105-13-237
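The start and stop codon rules (A) to (D) described in the caption can be sketched in Python. This is a minimal illustration and not the YGAP source code. The function names `choose_start` and `choose_stop`, the 0-based forward-strand coordinates, and the handling of cases the caption leaves open are all assumptions: reaching the start of the sequence without finding a methionine is treated like case (C), the first in-frame ATG within the 45-nucleotide window is taken, and the stop search begins at the first codon of the HSP.

```python
STOPS = {"TAA", "TAG", "TGA"}
START = "ATG"


def choose_start(seq, hsp_start, window=45):
    """Return (start, needs_review) for a forward-strand HSP.

    seq       -- uppercase DNA string
    hsp_start -- 0-based index of the first base of the HSP's first codon
    window    -- number of nucleotides inside the HSP searched in case (C)
    """
    # (A) The HSP already begins with a methionine codon.
    if seq[hsp_start:hsp_start + 3] == START:
        return hsp_start, False

    # (B) Walk upstream codon by codon, remembering the furthest ATG
    # seen before a stop codon (or the sequence start) ends the walk.
    furthest = None
    pos = hsp_start - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOPS:
            break
        if codon == START:
            furthest = pos
        pos -= 3
    if furthest is not None:
        return furthest, False

    # (C) No upstream ATG: look for one in the first `window` nucleotides
    # of the HSP (assumption: the first in-frame ATG is used).
    for pos in range(hsp_start + 3, hsp_start + window, 3):
        if seq[pos:pos + 3] == START:
            return pos, False

    # (D) Nothing suitable: keep the HSP start and flag for manual inspection.
    return hsp_start, True


def choose_stop(seq, hsp_start):
    """Return the index just past the first in-frame stop codon, or None.

    Scanning from the HSP start covers both cases in the caption: a stop
    inside the HSP trims it, otherwise the walk continues downstream.
    """
    for pos in range(hsp_start, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOPS:
            return pos + 3
    return None
```

For example, calling `choose_start("TAACCCAAAGGGATGCCCTAA", 6)` hits the upstream stop codon before any methionine and so falls through to case (C), selecting the ATG at index 12 inside the HSP.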