Open Access Research article

Analysis of high-identity segmental duplications in the grapevine genome

Giuliana Giannuzzi1, Pietro D'Addabbo1, Marica Gasparro2, Maurizio Martinelli3,4, Francesco N Carelli1, Donato Antonacci2* and Mario Ventura1,5*

Author Affiliations

1 Department of Biology, University of Bari, Bari 70126, Italy

2 Agricultural Research Council, Research Unit for Table Grapes and Wine Growing in Mediterranean Environment (CRA-UTV), Turi (BA) 70010, Italy

3 National Institute of Nuclear Physics (INFN), Bari 70126, Italy

4 Department of Physics, University of Bari, Bari 70126, Italy

5 Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA

BMC Genomics 2011, 12:436  doi:10.1186/1471-2164-12-436

Published: 26 August 2011

Additional files

Additional file 1:

FISH results of a tandem-duplicated clone. FISH signals on grapevine metaphase chromosomes and interphase nucleus of the VV40024H153B02 tandem-duplicated BAC clone.

Format: PDF Size: 635KB

This file can be viewed with: Adobe Acrobat Reader

Open Data

Additional file 2:

FISH results of 100 random clones from the VVPN40024 library. The table lists the clone ID, chromosome mapping and FISH result of 100 random BAC clones from the VVPN40024 library. Mapping information on the 12X grapevine genome assembly was obtained from http://urgi.versailles.inra.fr/cgi-bin/gbrowse/vitis_12x_pub/. OEA, one-end anchored.

Format: XLS Size: 27KB

This file can be viewed with: Microsoft Excel Viewer

Additional file 3:

Identity percentages between end sequences of the tandem-duplicated BACs. Identity percentages between end sequences of the five BAC clones from plate 153 showing the 4-cluster pattern in FISH experiments.

Format: XLS Size: 20KB

Additional file 4:

Supplemental Note. The note reports the test of five different statistical models to identify the most appropriate one describing the WSSD coverage data.

Format: PDF Size: 2.1MB

Additional file 5:

Comparison of three repeat masking settings. The table shows the percentages of WSSD-negative (green), borderline (gray) and positive (red) 5 kub windows among all windows in the grapevine genome and in five subgroups, defined by the percentage of masked sequence in the window. The percentages are given for the three repeat masking settings "div10_low", "nodiv_low" and "nodiv_N".

Format: XLS Size: 30KB

Additional file 6:

WSSD coverage of Vitis vinifera chromosomes. The graphs illustrate the WSSD coverage of all Vitis vinifera chromosomes and were produced using the Circos tool. WSSD-negative, borderline and positive windows are represented by green, gray and red bars, respectively. The last segment of the chrUn sequence lacks WSSD coverage values because it is composed of blocks, separated by gaps, that are too short for WSSD coverage to be calculated on 5 kub windows.

Format: PDF Size: 4.7MB

Additional file 7:

Segmental duplication data of single, undefined and duplicated clones. The table lists the number and percentage of clones containing segmental duplications, and the average and standard deviation values of gap number, for single, undefined and duplicated clones designated according to FISH results.

Format: XLS Size: 18KB

Additional file 8:

Distribution size of duplicated intervals. The graph shows the number of duplicated intervals according to their size.

Format: PDF Size: 37KB

Additional file 9:

Duplicated intervals identified in the Vitis vinifera PN40024 genome. The table lists all the duplicated intervals identified in the Vitis vinifera PN40024 genome and reports their chromosome position, size and average WSSD coverage.

Format: XLS Size: 227KB

Additional file 10:

GC and repeat content of the whole genome and segmentally duplicated regions. The table compares the GC level and the repeat content, as defined by the RepeatMasker tool, between the grapevine whole genome and the identified segmentally duplicated regions.

Format: XLS Size: 20KB

Additional file 11:

Fully and partially duplicated gene content in grapevine chromosomes. The table lists the number of total genes, number of fully and partially duplicated genes (full and part dup genes), chromosome gene density, density of fully and partially duplicated genes in duplicated regions (full and part dup gene density), and ratio between full and part dup gene density and chr gene density (ratio of densities), for each chromosome, all nonrandom chromosomes, all random chromosomes, all nonrandom and random chromosomes, and for the whole genome. Gene density is calculated as the average number of genes present in 100 kb of genomic sequence.

Format: XLS Size: 30KB

Additional file 12:

Gene-containing duplicated intervals in grapevine chromosomes. The table lists the number and fraction of duplicated intervals with fully duplicated genes and with fully and/or partially duplicated genes, in each grapevine chromosome, all nonrandom chromosomes, all random chromosomes, all nonrandom and random chromosomes, and in the whole genome.

Format: XLS Size: 30KB

Additional file 13:

Summary of InterPro scan results for the whole genome and its segmentally duplicated portion. The table lists the number of genes, number of InterPro domains, number of InterProScan matches, number of genes with at least one InterPro domain, number of genes with no InterPro domain, percentage of genes with at least one InterPro domain, and percentage of genes with no InterPro domain, for the whole genome and its segmentally duplicated portion.

Format: XLS Size: 26KB

Additional file 14:

InterPro domains identified in grapevine proteins and their enrichment factor in duplicated vs. unique genes. The table lists the InterPro domains identified in proteins encoded by the grapevine genome, their occurrence in unique and duplicated genes, and their enrichment factor in duplicated vs. unique genes. The InterPro domains are sorted by enrichment factor in descending order.

Format: XLS Size: 564KB

Additional file 15:

Top 100 duplicated genes. The table lists the 100 genes embedded in regions with the highest read depth values. It reports their chromosome location, the mean WSSD coverage of the region and their InterPro domain content. The table also indicates whether the genes have homologs in the chloroplast and/or mitochondrial genomes.

Format: XLS Size: 37KB