Open Access Highly Accessed Research article

Transcriptome deep-sequencing and clustering of expressed isoforms from Favia corals

Shaadi F Pooyaei Mehr1,2*, Rob DeSalle2, Hung-Teh Kao3, Apurva Narechania2, Zhou Han4, Dan Tchernov5, Vincent Pieribone4 and David F Gruber1,2,6

Author Affiliations

1 The Graduate Center, Molecular, Cellular and Developmental Biology, City University of New York, New York, NY 10065, USA

2 American Museum of Natural History, Sackler Institute of Comparative Genomics, New York, NY 10024, USA

3 Department of Psychiatry and Human Behavior, Division of Biology and Medicine, Warren Alpert Medical School, Brown University, Providence, RI 02912, USA

4 John B. Pierce Laboratory, Cellular and Molecular Physiology, Yale University, New Haven, CT 06519, USA

5 Marine Biology Department, The Leon H. Charney School of Marine Sciences, University of Haifa, Mount Carmel, Haifa 31905, Israel

6 Department of Natural Sciences, City University of New York, Baruch College, Box A-0506, 17 Lexington Avenue, New York, NY 10010, USA

BMC Genomics 2013, 14:546  doi:10.1186/1471-2164-14-546

Published: 12 August 2013

Additional files

Additional file 1: File S1:

Parameters and commands used in this manuscript.

Format: DOCX Size: 142KB

Open Data

Additional file 2: Files S2:

BlastP parsed output files against the N. vectensis proteome for samples Fav1 and Fav2, with an E-value cutoff of 2e-30.

Format: TXT Size: 10.2MB

Additional file 3: Files S3:

BlastP parsed output files against the N. vectensis proteome for samples Fav1 and Fav2, with an E-value cutoff of 2e-30.

Format: TXT Size: 8.8MB

Additional file 4: Files S4:

TRIBE-MCL input files.

Format: TXT Size: 394KB

Additional file 5: File S5:

TRIBE-MCL input files.

Format: TXT Size: 402KB

Additional file 6: Files S6:

Homologous protein clusters (TRIBE-MCL) output for sample Fav1 and Fav2.

Format: TXT Size: 286KB

Additional file 7: File S7:

Homologous protein clusters (TRIBE-MCL) output for sample Fav1 and Fav2.

Format: TXT Size: 286KB

Additional file 8: Table S1:

Completeness metrics for the two samples compared to N. vectensis and A. digitifera.

Format: TIFF Size: 729KB

Additional file 9: Files S8:

GO, KOG and InterPro annotations for homologous protein clusters in Fav1.

Format: XLSX Size: 522KB

Additional file 10: Figure S1:

Distribution of Fav1 transcript clusters in different GO categories.

Format: TIFF Size: 286KB

Additional file 11: Files S9:

FASTA files of the cDNA regions encoding non-symbiont annotated ORFs in Fav1 and Fav2.

Format: FAS Size: 14MB

Additional file 12: File S10:

FASTA files of the cDNA regions encoding non-symbiont annotated ORFs in Fav1 and Fav2.

Format: FAS Size: 13MB

Additional file 13: File S11:

Alignment of Fav1 and Fav2 Cytb nucleotide sequences, including other Favia species.

Format: PHY Size: 10KB

Additional file 14: File S12:

Alignment of Fav1 and Fav2 COI nucleotide sequences, including other Favia species.

Format: PHY Size: 12KB

Additional file 15: File S13:

Alignment of Fav1 and Fav2 28S nucleotide sequences, including other Favia species.

Format: ALN Size: 11KB

Additional file 16: Figure S2:

Maximum likelihood tree of three loci (COI, Cytb, 28S). The data matrix was generated from 15 Favia species plus Fav1 and Fav2. Nucleotide sequences were aligned using ClustalW2 with default parameters, the three-locus matrix was generated using FASconCAT, and the tree was constructed using RAxML (see Methods). Montastrea cavernosa was selected as the out-group.

Format: TIFF Size: 3.2MB

Additional file 17: Figure S3:

Amino acid sequence alignment of full-length fluorescent protein isoforms.

Format: TIFF Size: 3.2MB

Additional file 18: Figure S4:

Maximum likelihood tree, constructed using RAxML, of 156 known fluorescent proteins, including our 11 newly identified sequences. The tree shows the relationships of the major groups of known fluorescent proteins. Major lineages cluster together, although Ctenophora and Hydrozoa do not form a monophyletic group. Color key: within class Anthozoa, orders Ceriantharia (orange), Actiniaria (red), Pennatulacea (dark green) and Scleractinia (black); Hydrozoa (purple); Copepoda (light green); Ctenophora (blue); Chordata (turquoise blue), the most basal group. Newly identified sequences are colored blue within Scleractinia. The alignment was bootstrapped 1,000 times, and B. floridae was the out-group.

Format: TIFF Size: 7.2MB

Additional file 19: File S14:

Alignment of 156 known fluorescent proteins, including the 11 newly identified FP sequences.

Format: PHYLIP Size: 56KB

Additional file 20: File S15:

Search results in the Trans-ABySS and Trinity assembly outputs for contigs homologous to Fav1 s23Contig16657-5, which was produced by ABySS and CAP3.

Format: TXT Size: 5KB

Additional file 21: Figure S5:

Read-to-contig alignment. 75 bp read alignments to the coding region of s23Contig16657-5, 1,377 bp total length.

Format: TIFF Size: 1.5MB

Additional file 22: Files S16:

RPKM measurement for all annotated cDNA regions from Fav1 and Fav2.

Format: TXT Size: 452KB

Additional file 23: File S17:

RPKM measurement for all annotated cDNA regions from Fav1 and Fav2.

Format: TXT Size: 462KB

Additional file 24: File S18:

Protocol for preparing samples for sequencing of mRNA.

Scripts:

Script 1: Perl script for performing blast search.

Script 2: Perl script for pre-clustering the blast parsed file.

Script 3: Perl script to calculate RPKM for the assembled file.

Script S1: Perl script to shuffle short read sequences.

Script S2: Perl script to measure the N50 statistics.

Script S3: Unix shell script to remove Fasta files shorter than a threshold.

Script S4: Generate the sub-Fasta file.

Script S5: Extract the cDNA sequences corresponding to ORF files.

Format: PDF Size: 137KB
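For readers unfamiliar with the two statistics named in the script list (N50 and RPKM), the following is a minimal Python sketch of their standard definitions. It is not the authors' Perl code from Additional file 24, whose exact inputs and options are not shown here; the function names `n50` and `rpkm` and their parameters are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")


def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads per kilobase of region per million mapped reads.

    read_count: reads aligned to the region (hypothetical input)
    region_length_bp: region length in base pairs
    total_mapped_reads: all mapped reads in the sample
    """
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)
```

As a worked case under these definitions, 500 reads aligned to a 1,000 bp region in a sample with 2,000,000 mapped reads gives an RPKM of 250.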