Open Access Highly Accessed Research article

High quality de novo sequencing and assembly of the Saccharomyces arboricolus genome

Gianni Liti (1*), Alex N Nguyen Ba (2,3), Martin Blythe (4), Carolin A Müller (5), Anders Bergström (1), Francisco A Cubillos (5,6), Felix Dafhnis-Calas (5), Shima Khoshraftar (2), Sunir Malla (4), Neel Mehta (5), Cheuk C Siow (5), Jonas Warringer (7), Alan M Moses (2,3), Edward J Louis (5) and Conrad A Nieduszynski (5)

Author Affiliations

1 Institute of Research on Cancer and Ageing of Nice (IRCAN), CNRS UMR 7284 - INSERM U1081, Université de Nice Sophia Antipolis, 06107, NICE Cedex 2, France

2 Department of Cell & Systems Biology, University of Toronto, M5S 2J4, Toronto, Canada

3 Centre for the Analysis of Genome Evolution and Function, University of Toronto, M5S 3B2, Toronto, Ontario, Canada

4 DeepSeq, Centre for Genetics and Genomics, Queen’s Medical Centre, University of Nottingham, NG7 2UH, Nottingham, UK

5 Centre for Genetics and Genomics, Queen’s Medical Centre, University of Nottingham, NG7 2UH, Nottingham, UK

6 Current address: INRA, UMR1318, Institut Jean-Pierre Bourgin, F-78000, Versailles, France

7 Department of Chemistry and Molecular Biology, University of Gothenburg, 41390, Gothenburg, Sweden

BMC Genomics 2013, 14:69  doi:10.1186/1471-2164-14-69

Published: 31 January 2013

Additional files

Additional file 1: Table S1:

Sequence homology of the small scaffolds. Each of the 18 small scaffolds (<10 kb) was compared to the S. cerevisiae genome using BLAST. For each small scaffold the name, size and sequence homology are listed.

Format: DOCX Size: 33KB

Open Data

Additional file 2: Figure S1:

SOLiD coverage across the 16 chromosomes of S. arboricolus, used to detect regions of elevated copy number. There is one plot for each chromosome, with the x-axis representing the chromosomal coordinate and the y-axis (on a log2 scale) representing sequence coverage as a measure of copy number (normalized by the genome-wide average). SOLiD reads were mapped to the S. arboricolus assembly using BFAST [51]. Reads with multiple equally good top-scoring mapping locations were assigned at random to one of them. The depth of coverage of reads mapping to the assembly was calculated in 1 kb windows along the chromosomes. Each step on the vertical axis corresponds to one unit on the logarithmic scale. Vertical bars in orange mark the locations of gaps in the assembly, and bars in blue mark the locations of tandem repeat tracts longer than 30 bp, as predicted by Tandem Repeats Finder [52].
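The coverage calculation described in this legend can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the input layout (read start positions per chromosome), the function names `pick_location` and `windowed_log2_coverage`, and the simplification of attributing each read entirely to the window containing its start are all assumptions made for illustration. It shows only the three steps named in the legend: random assignment of multiply mapped reads, depth per 1 kb window, and log2 of depth normalized by the genome-wide average.

```python
import math
import random

WINDOW = 1000  # 1 kb windows, as stated in the legend


def pick_location(candidates, rng):
    """Assign a read with several equally good hits to one of them at random."""
    return rng.choice(candidates)


def windowed_log2_coverage(read_starts, chrom_lengths, read_length, window=WINDOW):
    """Return {chrom: [log2(window depth / genome-wide mean depth), ...]}.

    read_starts   -- hypothetical input: {chrom: [0-based read start, ...]}
    chrom_lengths -- {chrom: length in bp}
    Simplification: a read's bases are all counted in the window holding its start.
    """
    depth = {}
    total_bases = 0
    for chrom, length in chrom_lengths.items():
        n_windows = math.ceil(length / window)
        bases = [0] * n_windows
        for start in read_starts.get(chrom, []):
            bases[start // window] += read_length
        # The last window of a chromosome may be shorter than `window`.
        sizes = [min(window, length - i * window) for i in range(n_windows)]
        depth[chrom] = [b / s for b, s in zip(bases, sizes)]
        total_bases += sum(bases)

    genome_mean = total_bases / sum(chrom_lengths.values())
    return {
        chrom: [math.log2(d / genome_mean) if d > 0 else float("-inf") for d in depths]
        for chrom, depths in depth.items()
    }


if __name__ == "__main__":
    # Toy data: one window with twice the average depth, two with half.
    lengths = {"chrA": 2000, "chrB": 1000}
    starts = {"chrA": [10] * 40 + [1500] * 10, "chrB": [200] * 10}
    print(windowed_log2_coverage(starts, lengths, read_length=50))
```

On this scale a value of 1 corresponds to twice the genome-wide average depth and a value of -1 to half of it, which is why a duplicated region would stand out as a plateau near 1.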

Format: PDF Size: 494KB

Open Data

Additional file 3: File S1:

FASTA file containing the protein sequences of novel genes in S. arboricolus.

Format: FASTA Size: 1KB

Open Data

Additional file 4: Figure S2:

Five possible placements of S. arboricolus within the sensu stricto complex. We find that the third placement is supported by the data (main manuscript Figure 5). S. cer: S. cerevisiae; S. par: S. paradoxus; S. mik: S. mikatae; S. kud: S. kudriavzevii; S. arb: S. arboricolus; and S. bay: S. bayanus.

Format: PDF Size: 301KB

Open Data

Additional file 5: Table S2:

Environments used in the phenotyping screen. The classification “carbon utilization” indicates that 2% glucose was replaced with the indicated carbon source; the classification “nitrogen utilization” indicates that 0.5% ammonium sulfate was replaced with the indicated nitrogen source at nitrogen-limiting concentrations. In all nitrogen utilization experiments, two consecutive pre-cultures were performed to deplete internal nitrogen stores: the first with nitrogen-limiting amounts of ammonium, the second with nitrogen-limiting amounts of the indicated nitrogen source. # = pre-cultures were performed in medium similar to the experimental medium to deplete internal stores of the molecule.

Format: DOCX Size: 23KB

Open Data