Open Access Methodology article

Genome-wide computational prediction of tandem gene arrays: application in yeasts

Laurence Despons1*, Philippe V Baret2, Lionel Frangeul3, Véronique Leh Louis1, Pascal Durrens4 and Jean-Luc Souciet1

Author Affiliations

1 Université de Strasbourg, CNRS UMR7156, F-67083 Strasbourg, France

2 Université Catholique de Louvain, B-1348, Louvain-la-Neuve, Belgium

3 Institut Pasteur, Plate-forme Intégration et analyse génomique, F-75015 Paris, France

4 Université Bordeaux 1, CNRS UMR5800, LaBRI INRIA Bordeaux Sud-Ouest (MAGNOME), F-33405 Talence, France


BMC Genomics 2010, 11:56  doi:10.1186/1471-2164-11-56

Published: 21 January 2010

Additional files

Additional file 1:

Tabular data S1 - List of coding sequences (CDSs) that constitute tandem gene arrays (TGAs) in the nine hemiascomycete yeast genomes. Additional file 1 is a Microsoft Excel spreadsheet (.xls) listing all tandemly arrayed genes identified in the nine yeast genomes analyzed. All genetic elements are designated using the systematic nomenclature adopted in the "Génolevures" projects [42], except for YAR062W, which is a pseudogene (TGA no. 49). Chromosomal coordinates are given for each gene locus.

Format: XLS; Size: 121 KB


Open Data

Additional file 2:

Table S1 - Impact of the data curation steps on the count of TGAs. Three data curation steps were performed to distinguish CDSs belonging to a TGA from false positive CDSs. First, tagged CDSs (those carrying a "relic" tag) were curated automatically by analyzing the FTB score of the CDSs at positions n+2 and n-2. Second, the remaining tagged CDSs were curated manually. Finally, CDSs in which minisatellites were detected by the EMBOSS programs equicktandem or etandem were curated manually.

Format: PDF; Size: 80 KB


Additional file 3:

Table S2 - Parameters influencing the results of the in silico TGA detection method. We measured the influence of two parameters on the number of TGAs and tagged CDSs identified before the manual data curation step. The first parameter, used in the score calculation step, is the length of the chromosomal regions located upstream and downstream of each CDS, expressed as n times the CDS length. The second parameter is the FTB score threshold used to select CDSs belonging to a TGA during the TGA extraction step.
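The two parameters described for Table S2 can be illustrated with a minimal sketch. It is not the authors' implementation: the `CDS` record, the function names, the coordinate convention (1-based, inclusive, "upstream" taken as lower coordinates regardless of strand), the clipping at chromosome ends, and the use of `>=` for the threshold comparison are all assumptions made for illustration. The FTB score itself is treated as a precomputed value, since its calculation is not described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CDS:
    """Hypothetical record for one coding sequence."""
    name: str
    start: int        # 1-based, inclusive (assumed convention)
    end: int          # 1-based, inclusive
    ftb_score: float  # assumed to be computed elsewhere


def flanking_regions(cds: CDS, n: int, chrom_length: int) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return (upstream, downstream) windows, each n times the CDS length.

    Windows are clipped to the chromosome. If the CDS touches a chromosome
    end, the corresponding window is empty (its start exceeds its end).
    """
    flank = n * (cds.end - cds.start + 1)
    upstream = (max(1, cds.start - flank), cds.start - 1)
    downstream = (cds.end + 1, min(chrom_length, cds.end + flank))
    return upstream, downstream


def select_candidates(cdss: List[CDS], threshold: float) -> List[CDS]:
    """Keep CDSs whose score reaches the threshold (>= is an assumption)."""
    return [c for c in cdss if c.ftb_score >= threshold]


if __name__ == "__main__":
    # Made-up coordinates and scores, for illustration only.
    genes = [CDS("geneA", 1001, 1300, 0.8), CDS("geneB", 5000, 5599, 0.2)]
    print(flanking_regions(genes[0], n=2, chrom_length=10_000))
    print([c.name for c in select_candidates(genes, threshold=0.5)])
```

Varying `n` changes how much neighbouring sequence is considered around each CDS, while varying `threshold` changes how many CDSs are retained, which are the two effects the table reports.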

Format: PDF; Size: 83 KB
