Email updates

Keep up to date with the latest news and content from BMC Genomics and BioMed Central.

Open Access Highly Accessed Research article

Deep sequencing-based transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus reveals insight into the immune-relevant genes in marine fish

Li-xin Xiang12, Ding He12, Wei-ren Dong12, Yi-wen Zhang12 and Jian-zhong Shao12*

Author affiliations

1 College of Life Science, Zhejiang University, Hangzhou 310058, China

2 Key lab for Cell and Gene Engineering of Zhejiang province, Hangzhou 310058, China

For all author emails, please log on.

Citation and License

BMC Genomics 2010, 11:472  doi:10.1186/1471-2164-11-472

The electronic version of this article is the complete one and can be found online at:

Received:7 May 2010
Accepted:13 August 2010
Published:13 August 2010

© 2010 Xiang et al; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



Systematic research on fish immunogenetics is indispensable in understanding the origin and evolution of immune systems. This has long been a challenging task because of the limited number of deep sequencing technologies and genome backgrounds of non-model fish available. The newly developed Solexa/Illumina RNA-seq and Digital gene expression (DGE) are high-throughput sequencing approaches and are powerful tools for genomic studies at the transcriptome level. This study reports the transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus using RNA-seq and DGE in an attempt to gain insights into the immunogenetics of marine fish.


RNA-seq analysis generated 169,950 non-redundant consensus sequences, among which 48,987 functional transcripts with complete or various length encoding regions were identified. More than 52% of these transcripts are possibly involved in approximately 219 known metabolic or signalling pathways, while 2,673 transcripts were associated with immune-relevant genes. In addition, approximately 8% of the transcripts appeared to be fish-specific genes that have never been described before. DGE analysis revealed that the host transcriptome profile of Vibrio harveyi-challenged L. japonicus is considerably altered, as indicated by the significant up- or down-regulation of 1,224 strong infection-responsive transcripts. Results indicated an overall conservation of the components and transcriptome alterations underlying innate and adaptive immunity in fish and other vertebrate models. Analysis suggested the acquisition of numerous fish-specific immune system components during early vertebrate evolution.


This study provided a global survey of host defence gene activities against bacterial challenge in a non-model marine fish. Results can contribute to the in-depth study of candidate genes in marine fish immunity, and help improve current understanding of host-pathogen interactions and evolutionary history of immunogenetics from fish to mammals.


Since it is a representative population of lower vertebrates serving as an essential link to early vertebrate evolution, fish is believed to be an important model in various developmental and comparative evolutionary studies [1-3]. Fish immunogenetics has received considerable attention due to its essential role in understanding the origin and evolution of immune systems [4-6]. Further, it is also beneficial in the creation of immune-based therapy of severe fish diseases. Great progress in bioinformatics and genome projects in model organisms, including human (Homo sapiens), mouse (Mus musculus), frog (Xenopus laevis), chicken (Gallus gallus), and zebrafish (Danio rerio), has led to the emergence of studies focusing on the identification and characterization of immune-related genes in teleost fish based on comparative genomics. These have provided preliminary observations on fish immunogenetics and evolutionary history of immune systems from lower vertebrates to mammals [7,8]. However, large-scale identification of immune-related genes at the genome or transcriptome levels in fish was seen in limited species (e.g. Danio rerio) due to the inadequate number of high-throughput deep sequencing technologies available [9,10]. This is an even more difficult problem in non-model fish species with totally unknown genome sequences.

Recently developed RNA deep sequencing technologies, such as Solexa/Illumina RNA-seq and Digital gene expression (DGE), have dramatically changed the way immune-related genes in fish are identified because these technologies facilitate the investigation of the functional complexity of transcriptomes [11,12]. RNA-Seq refers to whole transcriptome shotgun sequencing wherein mRNA or cDNA is mechanically fragmented, resulting in overlapping short fragments that cover the entire transcriptome. DGE is a tag-based transcriptome sequencing approach where short raw tags are generated by endonuclease. The expression level of virtually all genes in the sample is measured by counting the number of individual mRNA molecules produced from each gene. Compared with DGE analysis, the RNA-Seq approach is more powerful for unraveling transcriptome complexity, and for identification of genes, structure of transcripts, alternative splicing, non-coding RNAs, and new transcription units. In contrast, the DGE protocol is more suitable and affordable for comparative gene expression studies because it enables direct transcript profiling without compromise and potential bias, thus allowing for a more sensitive and accurate profiling of the transcriptome that more closely resembles the biology of the cell [9,13-17]. These two technologies have been used in transcriptome profiling studies for various applications, including cellular development, cancer, and immune defence of various organisms [10,18-29]. However, they have not been used in immunogenetic analysis of marine fish species.

Japanese sea bass (Lateolabrax japonicus) is an economically important marine species widely cultured in fisheries worldwide. Various diseases caused by bacterial and viral pathogens plague this species [30]. High mortality is associated with infection with Vibrio harveyi, a typical gram-negative pathogen of a wide range of marine animals. Infection results in a variety of vibriosis, a common aquatic animal disease associated with high mortality throughout the world [31]. In L. japonicus, V. harveyi infection leads to bacterial septicaemia with muscle ulcer as well as subcutaneous and gastroenteritic haemorrhage.

The present study is the first to conduct a transcriptome profiling analysis of V. harveyi-challenged L. japonicus using RNA-seq and DGE to gain deep insight into the immunogenetics of marine fish. Bacteria-challenged L. japonicus is expected to be a model system for studying bacterial immunity in marine fish. Further, a global survey of anti-bacterial immune defence gene activities in marine fish can contribute to the in-depth investigation of candidate genes in fish immunity. Results are also expected to improve current understanding of host-pathogen interactions and evolutionary history of immunogenetics from fish to mammals.


Aligning raw sequencing reads to non-redundant consensus

Approximately 34.59 and 33.03 million 75-bp pair end (PE) raw reads (submitted to GEO database, Association No. GSE21721) from the head kidney and spleen tissues of bacteria- and mock-challenged fish, respectively, were generated using Solexa/Illumina RNA-seq deep sequencing analysis. Repetitive, low-complexity, and low-quality reads were filtered out prior to assembly of sequence reads for non-redundant consensus. Using Grape software, reliable reads were assembled into contigs, which were then compared with all PE reads. Overlap of PE reads with two contigs was taken to indicate that the contigs are short segments of a scaffold. Reads were used for gap-filling of these scaffolds to generate final scaffold sequences. Using tgicl and cap3 software programs, scaffold sequences were assembled into clusters that were then analysed for consensus. A total of 150,125 and 140,330 non-redundant consensus sequences, ranging from 100 to 2,000-bp, were generated from each group. Then, consensus sequences were merged for DGE analysis. Removal of partial overlapping sequences yielded 169,950 non-redundant consensus sequences (Table 1). These sequences provide abundant data on healthy and infected conditions, thus allowing for better reference of immune-relevant genes.

Table 1. Distribution of non-redundant consensus sequences

Annotation of all non-redundant consensus sequences

BLASTX and ESTscan software analysis of 169,950 non-redundant consensus sequences revealed that about 48,987 have reliable coding sequences (CDs) [32,33]. CD-containing consensus sequences have high potential for translation into functional proteins and most of them translated to proteins with more than 100 aa. Comparison with the Nr and Swissprot databases revealed that 44,842 consensus sequences had good comparability with known gene sequences in existing species. Annotation of the 44,842 sequences using GO and COG databases yielded good results for approximately 16,469 consensus sequences and 9,545 putative proteins [34,35]. GO-annotated consensus sequences belonged to the biological process, cellular component, and molecular function clusters and distributed among more than 50 categories (Figure 1), including biochemistry, metabolism, growth, development, apoptosis, and immune defence. Similarly, COG-annotated putative proteins were classified functionally into at least 25 molecular families, including cellular structure, biochemistry metabolism, molecular processing, signal transduction, gene expression, and immune defence, that correspond to the categories observed in GO analysis (Figure 2). The KEGG database was used to analyse potential involvement of the consensus sequences in cellular metabolic pathways (Figure 3) [36]. Among the 44,842 consensus sequences, 24,496 can be grouped into seven categories comprised of 219 known metabolic or signalling pathways, including cellular growth, differentiation, apoptosis, migration, endocrine, and various immune-relevant signalling or metabolic pathways (Table 2).

thumbnailFigure 1. GO annotations of non-redundant consensus sequences. Best hits were aligned to the GO database, and 16,469 transcripts were assigned to at least one GO term. Most consensus sequences were grouped into three major functional categories, namely biological process, cellular component, and molecular function.

thumbnailFigure 2. COG annotations of putative proteins. All putative proteins were aligned to the COG database and can be classified functionally into at least 25 molecular families.

thumbnailFigure 3. KEGG categories of non-redundant consensus sequences. All non-redundant consensus sequences were annotated using KEGG Automatic Annotation Server for pathway information, and about 24,496 consensus sequences were annotated. The categories GIP and EIP stand for genetic information processing and environmental information processing, respectively.

Table 2. Annotation of non-redundant consensus sequences

Annotation of immune-relevant genes and pathways

To gain deep insight into the molecular biology of immune systems in L. japonicus, the immune-relevant genes, metabolic and signalling pathways were analysed. Approximately 2,673 consensus sequences were found to be homologous to known immune-relevant genes in other species (Table 3, Additional file 1, Table S1), including the most important elements of innate and adaptive immunity, such as pattern recognition receptors (mannose binding proteins, scavenger receptors, Toll-like receptors, and lipopolysaccharide-binding proteins), inflammatory cytokines and receptors (IL-1β, IL-10, IL-12, IL-6, IL-8, TNF-α, IFN-γ, TGF-β, VEGF, IL-1, and TNF receptors), immunoglobins (IgM and IgD), transcriptional factors (NF-kB, Ik-B, AP-1, transcription factor 4, chorion specific transcription factor, and transcriptional enhancer factor 5), complement components (C1q, C1r, C1 s, C2, C4, MASP, Bf, Df, and CR1-5), leukocyte differentiation antigens (CD3, CD4, CD8, and CD80/86), antigen presenting and processing molecules (MHC I, MHC II, TAP1, TAP2, TAP2B, proteosome subunit, PA28α/β, and constant chain protein li), regulatory molecules involved in immune cellular proliferation, differentiation, and apoptosis (growth differentiation factor 11, proliferation associated protein 2G4, apoptotic peptidase activating factor 1, and anti-apoptotic protein NR13), and other molecules involved in immune response (hepcidin, lectin, lysozyme, antimicrobial peptide NK-lysin, chemokines, ferritin, RAG1, and RAG2).

Table 3. Immune-relevant genes/homologues in L. japonicus

Additional file 1. Table S1: Details on immune-relevant genes/homologues in L. japonicas.

Format: XLS Size: 844KB Download file

This file can be viewed with: Microsoft Excel ViewerOpen Data

KEGG analysis revealed that approximately 2,082 consensus sequences were significantly enriched in various known immune-relevant metabolic or signalling pathways (Figure 3). These suggest a considerable conservation of immune-relevant genes and pathways between L. japonicus and mammals. Conserved genes and pathway members might include Toll-like receptors (TLR1-9 and TLR13) and corresponding adaptors in mammals (MyD88, TRAM, FADD, CASP8, IRAK4, TRAF6, TAK1, MKK4/7, JNK, and AP-1) and in other fish species (TLR5a/b, TLR14 and TLR21-23) (Figure 4); T cell receptors (TCRα/β) and corresponding signalling transducers (Zap70, Lyn, Fyb, SHP1, CD3, CD28, CD45, LCK, PAK, CDC42, Vav, CARD11, PAG, ITAM, LAT, PIP2, SLP-76, MAP3Ks, IKKs, CBL, NCK, LAT, GRB2, CARMA1, NFAT, MALT1, GRB2, JNK, MKKs, PAK, Ras, Raf1, and MEKK1) (Figure 5); B cell receptors and downstream factors (B-cell antigen receptor complex-associated protein α/β chain, CD81, CD22, and BCL-10); key molecules involved in antigen presenting and processing pathways (MHC class I and II, Hsp70, Hsp90, and Calnexin); members of complement and coagulation pathways (classical pathway members C1q, C1r, C1 s, C2, and C4, MBL pathway members MBL and MASP, alternative pathway members Bf and Df, and complement receptor type 1); and members involved in FcεR I signalling pathway (PI3K and SYK), leukocyte migration (lymphotactin, Rho GTPase-activating protein 5, RhoH, integrin alpha-M, and leukaemia inhibitory factor receptor), and natural killer-mediated cytotoxicity (natural killer cell receptor 2B4, natural killer cell protease 1, natural killer cell enhancement factor, natural killer-tumour recognition sequence, and natural killer cell stimulatory factor 2). Furthermore, a number of consensus genes involved in cellular adhesion, energy production, and amino acid (arginine and praline) metabolisms were also conserved between fish and mammals. These genes are indirectly related to immune responses in mammals. For instance, L-arginine metabolism has been proven to be related to phagocytosis of macrophages, which eventually led to the discovery of NO signalling molecule [37]. Thus, the involvement of these consensus genes in metabolic pathways provides the basis for further identification of the biological functions of candidate genes in fish immune responses.

thumbnailFigure 4. Putative TLR signal pathway. Putative TLR signal pathway of L. japonicus was constructed based on knowledge of TLR signalling in mammalian species. However, most interactions have to be confirmed experimentally.

thumbnailFigure 5. Putative TCR signal pathway. Putative TCR signal pathway of L. japonicus was constructed based on knowledge of TCR signalling in mammalian species. However, most interactions require experimental confirmation.

Digital gene expression profile analysis after bacterial challenge

Solexa/Illumina DGE analysis was performed to identify the genes involved in L. Japonicus response to bacterial challenge [12]. A total of 3.44 and 3.22 million raw tags (submitted to GEO database, Association No. GSE21712) of the mRNAs extracted from head kidney and spleen of the mock- and bacteria-challenged groups, respectively, were identified by base calling [38,39]. After transformation of raw sequences into clean tags by data-processing steps using bio-perl scripts, approximately 0.33 and 0.27 million high quality non-redundant tags were obtained in both groups (Figure 6). Gene annotation was performed by tag mapping analysis using the 169,950 non-redundant consensus sequences from RNA-seq-based transcriptome analysis as reference transcript database. Results showed that 71.41% and 74.53% of all distinct tags can be mapped to the entire reference database (sense or anti-sense) in both groups. Out of the 26,394 sense strands and 23,790 anti-sense strands detected in the mock-challenged group, about 36,782 (75.1%) sense or anti-sense strands were mapped by the tags. In contrast, about 34,840 (71.1%) sense or anti-sense genes were mapped out of the 23,359 sense strands and 21,046 anti-sense strands in the infected group (Figure 6). Among the detectable expressed consensus sequences, 9,643 genes had successful annotations. Mapping results are summarized in Additional file 2, Table S2 and Additional file 3, Figure S1.

thumbnailFigure 6. Distribution of tags and gene expression between experimental and control groups. The black and gray columns indicate the distribution of tags and gene expression in bacteria- and mock-challenged groups, respectively. The distribution of tags matches the distribution of genes for both groups. Furthermore, an increase in tags or gene expression is accompanied by a decrease in the frequencies of tags or genes.

Additional file 2. Table S2: Summary of tag mapping in DGE analysis for experimental and control groups.

Format: XLS Size: 28KB Download file

This file can be viewed with: Microsoft Excel ViewerOpen Data

Additional file 3. Figure S1: Tag abundance for mock- (A) and bacteria- (B) challenged group. Normalised tag copy number was calculated by dividing tag counts for each gene with the total number of tags generated for each library and are presented per one million transcripts. PM and 1 MM stand for perfect match and 1 miss match, respectively.

Format: TIFF Size: 6.5MB Download fileOpen Data

Strict Bayesian algorithm was used in differential DGE analysis in order to consider the differences in library size for differential selection between the two differentially expressed gene libraries [40]. Soap2 software was used to map all measured tags to the corresponding assembled consensus sequences [41]. P ≤ 0.01 and absolute value of log2Ratio≥1 were used as the threshold of significant differences in gene expression. Approximately 1,224 CD-containing consensus sequences mapped by 19,548 differentially expressed tags exhibited significant differences after bacterial challenge. Among them, about 376 consensus sequences were significantly up-regulated (including 61 most over-expressed sequences that increased by more than 100-fold), while 848 consensus sequences were significantly down-regulated (including 10 most down-regulated sequences that decreased by 10-100-times) (Figure 7, Additional file 4, Table S3). The distribution of the significant changes detected is illustrated in a volcano plot (Figure 8), where the statistical significance of each consensus was plotted against fold change. Sequences with the highest average differences between the bacteria- and mock-challenged groups (far right and left of the plot) also had the smallest false discovery rate (FDR) values. Analysis of the CD-containing consensus sequences revealed that they all had FDR values less than 0.1, with the highest value being 0.063.

thumbnailFigure 7. Differential expression analyses of tags and consensus sequences by DGE. The expression level for each tag and consensus was included in the volcano plot (Figures A and B, respectively). 'Not DETs' and 'Not DEGs' indicate 'not detected expression tags' and 'not detected expression genes', respectively. For Figures A and B, the x-axis contains Log10 of transcript per million of the bacteria-challenged group and the y-axis indicates Log10 of transcript per million of the mock-challenged group. Limitations are based on P < 0.01, and the absolute value of Log2 (B/A) is greater than 1.

thumbnailFigure 8. Volcano plot of differentially expressed consensus sequences. For every consensus, the ratio of expression levels in the bacteria-challenged group over that in the mock-challenged group was plotted against the -log error rate. The horizontal line indicates the significance threshold (0.01 FDR), and the vertical lines indicate the twofold change threshold.

Additional file 4. Table S3: Summary of differentially expressed CD-containing consensus sequences.

Format: XLS Size: 557KB Download file

This file can be viewed with: Microsoft Excel ViewerOpen Data

Functional annotation of consensus sequences was performed to define the differentially regulated sequences more precisely. Majority of the 1,183 sequences annotated were related to factors involved in various immune-relevant pathways, such as TLR pathways (TLR1/3/13/18/21, BTK, IRAK1, IP2, NR2C2, and UBE2n), TCR signalling pathways (TCR beta chain, Zap70, LCK, SHP1, CARMA1, Vav, NFAT, GRB2, MALT1, NCK, and Raf1), antigen presenting and processing relevant pathways (MHC class II transactivator, CASP1, and CASP8), TGF-β signalling pathway (TGF-β receptor type-2, Serine/threonine-protein phosphatase 4, serine/threonine kinase receptor, BMPs, and SMAD7), and various inflammatory cytokines and receptor relevant pathways (IL18R, IL12R β-2 chain, C-C motif chemokine 7, C-C motif chemokine 19, TGFβ-1-induced transcript 1 protein, TNR superfamily member 1A, and TNFα-induced protein 2). There were annotations in several other biological processes that may indirectly participate in immune response, such as the cell cycle; DNA replication, transcription, and translation; metabolisms of carbohydrates, amino acids, and lipids; and activation of ATPase family members, transcription elongation factor B, membrane transport protein, NADH dehydrogenase, NAD kinase, nucleolar protein 6, tyrosine protein kinase, ribosomal protein L32, nuclear receptor, and replication initiator 1. Among the 61 most over-expressed transcripts and the 10 most down-regulated transcripts, enrichments of factors involved in metabolic or signalling pathways that have not been linked to immune responses previously, such as cytoskeleton regulation, calcium signalling pathway, MAPK signalling pathway, aimnocyl-tRNA biosynthesis, and methionine, glutamate, and aminosugar metabolism, were detected. Highly responsive consensus sequences are shown in Table 4.

Table 4. Representatives of putative immune relevant genes/homologs as predicted by DGE

Putative novel immune/stress response genes

Among the differentially expressed transcripts, more than 1,183 transcripts were well annotated, whereas approximately 41 transcripts had low sequence homology to known sequences in public databases, suggesting that they might be putative novel immune-relevant genes in L. japonicus involved in the response to bacterial challenge. Among these novel sequences, 13 were differentially regulated by more than 100-fold, implying that they were strongly infection-responsive genes. ProDom analysis identified one HSP domain- and one protein kinase domain-containing sequence [42]. SignalP and TMHMM programs distinguished 24 sequences with signal peptides or transmembrane domains, suggesting that they are cytokines and transmembrane proteins, respectively [43,44]. Observations presented in Additional file 5, Table S4 can provide guidance for further identification. In-depth functional studies of these novel sequences may benefit the exploration of potential marine fish-specific immune-relevant genes for application in the control of fish diseases.

Additional file 5. Table S4: Summary of putative novel immune/stress response consensus. Each consensus was analyzed using ProDom, SignalP, and TMHMM programs.

Format: XLS Size: 28KB Download file

This file can be viewed with: Microsoft Excel ViewerOpen Data

Experimental validation of consensus sequences

To validate the integrity of RNA-seq results, representative consensus sequences with complete encoding regions, such as hepcidin, Myf5, SNARE, and two IL-8-like CXC chemokine family members, were selected for experimental cloning and sequencing analyses by RT-PCR (Additional file 6, Table S5). All experimentally examined genes matched the RNA-seq-generated sequences perfectly. One of the two IL-8-like CXC chemokines was newly discovered by this study. The two IL-8-like CXC chemokine family members were identified through phylogenetic analysis. Both sequences conserved the four cysteine residues (C46-C48-C74-C91) that are the hallmarks of IL-8 CXC chemokines and can be found throughout the vertebrate IL-8 family (Additional file 7, Figure S2 and Additional file 8, Figure S3). This demonstrates the reliability of RNA-seq results and indicates the necessity for further identification of immune-related genes in L. japonicus.

Additional file 6. Table S5: Primers used for validation analysis.

Format: XLS Size: 23KB Download file

This file can be viewed with: Microsoft Excel ViewerOpen Data

Additional file 7. Figure S2: Clustal W analysis for all IL-8-like CXC chemokines across all vertebrates.

Format: PDF Size: 149KB Download file

This file can be viewed with: Adobe Acrobat ReaderOpen Data

Additional file 8. Figure S3: Phylogenetic analysis for all IL-8-like CXC chemokines across all vertebrates. A phylogenetic tree was constructed using the maximum likelihood method to show the relationship between L. japonicus IL-8 and other known vertebrate IL-8-like CXC-chemokines. Local bootstrap percentages were obtained after 10000 replications.

Format: TIFF Size: 2.9MB Download fileOpen Data


The transcriptome is the complete repertoire of expressed RNA transcripts in a cell. Its characterization is essential in deciphering the functional complexity of the genome and in obtaining a better understanding of cellular activities in organisms, including growth, development, disease, and immune defence. The definition of the transcriptome has long been a challenging task. Traditionally, global gene expression analysis has relied mostly on several approaches, including RNA hybridisation on high-density arrays, whole-genome tiling arrays, expressed sequence tag (EST), serial analysis of gene expression (SAGE), and SAGE-derived technologies, which include massively parallel signature sequencing (MPSS) and polony multiplex analysis of gene expression. However, these approaches have several inherent limitations. For example, the array-based approaches allow detection of specific sequences only and capture the transcriptome while ignoring splice-junction information or alternative splicing events. The EST approach provides only partial sequences of individual cDNA clones, is sensitive to cloning biases, and is associated with high costs and difficulties in data analysis. SAGE and MPSS are also costly and cannot be used for splicing events [9]. Thus, the newly developed Solexa/Illumina RNA-seq and DGE high-throughput deep sequencing approaches have dramatically changed how functional complexity of the transcriptome can be studied. These approaches overcome many of the inherent limitations of traditional systems, making the detection of alternative splicing events and low-abundance transcripts possible. They have been applied recently to several species, such as yeast, Arabidopsis, Chlamydomonas, Zebrafish, Drosophila, Caenorhabditis, and human, for different purposes [9,10,18-29].

In this study, the transcriptome profile analysis of bacteria-challenged L. japonicus was conducted through these two approaches in an attempt to gain deep insights into the immunogenetics of a marine species. As expected, a large set of transcriptional sequences with complete or differing lengths of encoding regions was generated. KEGG analysis showed that more than 52% of transcripts are enrichment factors involved in approximately 219 known metabolic or signalling pathways, including cellular growth, differentiation, apoptosis, migration, endocrine, and immune system processes. Further, more than 8% of transcripts represent novel fish-specific genes that have never been described previously. Detailed analysis of immune-relevant genes and pathways showed that more than 2,673 transcripts are homologous to known immune-relevant genes, whereas approximately 2,082 transcripts can be enriched in various immune-relevant metabolic or signalling pathways. Challenging the fish with V. harveyi resulted in large alterations of the host transcriptome profile, including significant up- or down-regulation of 1,224 transcripts, among which 41 sequences might be novel immune-relevant genes in fish. In addition, several other biological processes that have not been linked to host immunity before, such as the metabolism of carbohydrates, amino acids, and lipids; activation of ATPase, NADH dehydrogenase, NAD kinase, and tyrosine protein kinase; and up-regulation of nuclear receptors, replication initiators, and ribosomal proteins, were found to be dramatically involved in host immune response. These significantly regulated transcripts might represent strong infection-responsive genes in L. japonicus, and reflect a number of immune activities during fish defence against bacterial challenge. The transcriptome profiling data sets obtained in this study provide strong basis for future genetic research in marine fish and support further in-depth genome annotation in vertebrates. Future molecular and functional characterisation of infection-responsive genes could lead to global identification of immune-relevant genes and infection markers in marine fish.

At present, transcriptome analysis in fish relies mainly on the EST approach [45,46]. Although there have been an increasing number of ESTs sequenced in a large number of libraries in various fish species, including rainbow trout (Oncorhynchus mykiss), Atlantic salmon (Salmo salar), medaka (Oryzias latipes), and zebrafish (Danio rerio), the immune-relevant transcriptional profiling data sets obtained from fish are still insufficient. Recently, DGE- and microarray-based transcriptome profiling studies performed in zebrafish revealed that zebrafish and its developing embryo are useful in vivo models for the identification of host determinants of responses to bacterial infection [9,10]. However, transcriptional information on immune responses to infection in a non-model marine fish remains elusive. Therefore, the large set of immune-relevant genes and their role in responses to bacterial challenge in L. japonicus presented in this study may largely improve knowledge on fish immunogenetics in other analytical systems. The present study also demonstrates the advantages of new deep sequencing approaches for gene discovery, thus providing new leads for functional studies of candidate genes involved in host-bacteria interactions. The RNA-Seq and DGE analyses conducted in this study were found to complement each other well. RNA-Seq was very effective in unravelling transcriptome complexity, and can detect a large set of genes, including numerous low-expressing genes or novel genes. DEG data can be merged with RNA-Seq data sets, indicating an affordable method for comparative gene expression study. Thus, RNA-Seq was initially performed in this study to provide strong reference transcriptome database for subsequent DGE analysis.

Emerging hallmark components and the cells necessary for innate and adaptive immunity in higher vertebrates have been identified in fish [47,48]. This was the basis for the widely accepted notion that innate and adaptive immunity was established in teleosts about 470 million years ago. However, the exact molecular and cellular basis of immune systems in teleosts remains poorly understood. The precise regulatory mechanisms underlying the innate and adaptive immunity of teleosts remain vague due to the limited immune-relevant genetic information available in fish. The present work on the definition of high-throughput transcriptome data set of the immune system of L. japonicus may contribute greatly to better understanding of the molecular and cellular activities involved in fish immunity. Results unexpectedly showed that the fish immune system is more complex than previously thought. On one hand, the substantial amount of immune-relevant genes involved in metabolic and signalling pathways and the induction of genes encoding cell surface receptors, signalling intermediates, transcription factors, and inflammatory mediators show a clear conservation of mechanisms detected in other vertebrate models, including humans. On the other hand, a large set of novel immune response genes and infection markers that have never been linked previously to immune responses in other vertebrate systems was identified in L. japonicus, indicating the existence of numerous fish-specific immune activities during early vertebrate evolution.

For instance, the TLR family is the most important class of pattern recognition receptors that play crucial roles in mediating immune responses to pathogenic microorganisms [8,49,50]. Triggering of TLRs by ligands leads to the recruitment of adaptor proteins, resulting in the activation of a range of transcription factors, such as NF-κB, activator protein 1 (AP-1), and IFN regulatory factors (IRFs), through distinct signalling pathways. This eventually leads to the downstream activation of proinflammatory cytokines and receptors, such as IFN-α/β, TNF-α, IL-2, IL-6, IL-8, IL10, CD40, CD86, and MIP1α. To date, 13 TLRs (TLR1-13), at least five adaptor proteins (MyD88, Mal/TIRAP, TIR domain-containing adaptor protein, TRIF/TICAM1, TRAM/TICAM2, and SARM), and numerous downstream effectors have been described in mammals and humans. In the present study, a series of TLRs and corresponding adaptor proteins and downstream effectors were identified in L. japonicus. The identified TLRs include the majority seen in mammals and humans (TLR1-13), and four TLRs (TLR14, TLR18, TLR21, and TLR23) seen in fish species. Adaptor proteins and downstream effectors identified include the majority known in mammals and humans, including MYD88, BTK, TOLLIP, FADD, HMGB1, HRAS, HSPD1, CASP8, MAPK8IP3, PELI1, RIPK2, SARM1, TICAM2, TIRAP, EIF2AK2, IRAK1, IRAK2, MAP3K7, MAP3K7IP1, NR2C2, PPARA, PRKRA, TRAF6, UBE2N, and UBE2V1. These adaptor proteins and downstream effectors have been found to be well enriched in various known TLR signalling pathways. Downstream transcriptional factors and pro-inflammatory cytokines mediated by these pathways, including NF-κB, JNK/p38, NF/IL6, IRF, IFN-α/β, TNF-α, IL-2, IL-6, IL-8, and IL-10, was also be identified successfully. These suggest that TLR mechanisms are conserved from fish to mammals throughout vertebrate evolution. A putative draft of TLR signalling pathways in L. japonicus based on knowledge of TLR signalling in mammalian species was constructed (Figure 4). However, TLR signalling pathways in fish might be more intricate compared with those in mammalian species because of the novel TLRs (TLR14, TLR18, TLR21, and TLR23). An in-depth study of novel TLRs will improve understanding of fish-specific innate immunity in early vertebrates and even the complete evolutionary history of TLR-based innate immunity. DGE analysis revealed that TLR-1, -3, -13, -18, -21 and their signalling intermediates (Rac1, AKT, CASP8, IRAK1, TRAM, IRAK1, IKK alpha/beta, IRF7, and STAT1) were up- or down-regulated dramatically at different levels in the pathway upon bacterial challenge (P ≤ 0.01). This provides evidence that both conserved (TLR-1, -3, -13) and fish-specific (TLR-18, -21) TLR-based immunity participates in fish defence against bacterial challenge.

The innate immune system is generally believed to represent the evolutionarily ancient aspect of vertebrate immunity. As a representative of lower vertebrates, fish is suggested to possess stronger innate immune responses. However, fish adaptive immunity might be more primitive because of limited immunoglobulins and hallmark components necessary for adaptive immunity identified in this species [7]. In recent years, several hallmarks for T and B cells (TCR, BCR, CD3, CD4, and CD8), antigen presenting and processing molecules (MHCI, MHCII, and DC-SIGN/CD209), co-stimulatory factors (CD80/86, CD83, CD154, and CD40), and immunoglobulins (IgM, IgD, and IgZ/T) have been identified in teleost fish, thus providing preliminary evidence that the adaptive immune system might also be well-established in fish. However, the precise molecular and cellular bases and mechanisms underlying teleost adaptive immunity are still uncharacterised and require further immunogenetic studies. The present study successfully identified a large number of adaptive immune-relevant components homologous to those in higher vertebrates, providing abundant data sets for insights into the characterisation and origin of adaptive immunity in early vertebrates. Data sets imply that adaptive immunity in teleost fish seems to be much more complicated than previously believed. The basic components and signalling pathways necessary for adaptive immunity exist in fish, and a majority showed clear conservation between fish and mammals. For instance, T cell receptor (TCR) signalling pathways regulate T cell activation, one of the most important processes in adaptive immunity [51]. Majority of the four types of TCRs (TCR-α/β, -γδ) and numerous signalling transducers (Zap70, Lyn, LCK, SHP1, CD3, ITAM, LAT, Fyb, SLP-76, CBL, NCK, LAT, GRB2, CARMA1, NFAT, AP1, MALT1, and GRB2) discovered in humans and mammals can be identified in L. japonicus. DGE analysis showed that a number of TCR signalling pathway members, including TCR beta chain, Zap70, LCK, SHP1, CARMA1, Vav, NFAT, GRB2, MALT1, NCK, and Raf1, are induced significantly after bacterial challenge (P ≤ 0.01). These pathway members largely contribute to the proliferation and activation of T cells in mammals, thus suggesting that TCR signalling mechanisms underlying the T cell activation might be conserved between teleost fish and mammals. A putative draft of TCR signalling pathways based on knowledge of pathways known in mammals was constructed (Figure 5). Future studies on these pathways are expected to not only enrich current knowledge on fish immunology but also contribute to better understanding of the evolutionary history of adaptive immunity.


This study investigated the transcriptome profile of bacteria-challenged L. japonicus using Solexa/Illumina RNA-seq and DGE deep sequencing technologies. The substantial amount of transcripts obtained provides a strong basis for future genomic research on marine fish and supports in-depth genome annotation in vertebrates. Globally identified immune-candidate genes, infection markers, and putative signalling pathways in L. japonicus revealed that the immune system of fish may be much more complex than previously believed. A considerable amount of immune-relevant genes and pathways in fish showed significant similarity to vertebrate models, suggesting that mechanisms underlying the innate and adaptive immunity in fish may be conserved in higher vertebrates. In addition, a large set of novel immune response genes that have never been linked previously to immune responses in other vertebrate systems indicate the existence of numerous fish-specific immune events during early evolution. This suggests that innate and adaptive immunity might be well established in teleost marine fish. Findings provide deep insight into the immunogenetics of fish species, which can be clinically applied in the therapy of fish diseases. They also contribute to a better understanding of the evolutionary history of innate and adaptive immunity from fish to mammals.


Experimental fish

One-year-old Japanese sea bass of both sexes, weighing 48.6 ± 2.5 g, were obtained from the fishery institute of Zhejiang, China. They were kept in running aerated seawater (salinity 30) at 25°C and fed with commercial pellet food at a daily ration of 0.7% body weight. All fish were maintained in the laboratory for at least two weeks prior to experimental use to allow for acclimatisation and evaluation of overall fish health. Only healthy fish, as determined by general appearance and level of activity, were used in the experiment.

Bacterial strain

Wild-type marine fish-virulent V harveyi strain (96-915), a pathogen for bacterial septicaemia in L. japonicas, was maintained in the laboratory. It was cultured in Thiosulfate Citrate Bile Salts Sucrose at 27°C overnight. The desired number of cells was adjusted to 5 × 108 CFU/ml. Cells were inactivated with 5% formalin at 27°C overnight before thorough washing with sterile PBS (pH 7.0). They were re-suspended in PBS prior to use.

Bacterial challenge and RNA preparation

Fish in the experimental groups were inoculated intraperitoneally with 0.2 ml of V harveyi at 1 × 108 CFU per fish. In parallel, fish in the control groups were administrated with 0.2 ml of mock PBS. Both groups were kept under conditions as described above. At seven days post-challenge, fish were sacrificed after anaesthesia, and tissues from the head kidney and spleen were collected. Tissue samples from 15 fishes were mixed for RNA preparation. Total RNA was isolated using a TRIzol reagent (Gibco BRL) following the manufacturer's instructions and treated with RNase free DNase I (Qiagen). RNA concentrations were measured using a spectrophotometer and integrity was ensured through analysis on a 1.5% (w/v) agarose gel.

Sample Preparation for RNA-seq

After RNA extraction, poly-A-containing mRNAs were purified using oligo-dT-attached magnetic beads and fragmented into small pieces using divalent cations under elevated temperature. Cleaved RNA fragments were copied into first strand cDNA using reverse transcriptase and random primers. This was followed by second strand cDNA synthesis using DNA polymerase I and RNase H. These cDNA fragments underwent end repair process, addition of a single 'A' base, and ligation of adapters. Products were subsequently purified and amplified through PCR to create the final cDNA libraries.

Transcriptome analysis

Transcriptome sequencing was conducted using Solexa/Illumina RNA-seq. Four fluorescently labelled nucleotides and a specialised polymerase were used to determine the clusters base by base in parallel. The 75-bp raw PE reads (submitted to GEO database, Association No. GSE21721) were generated by the Illumina Genome Analyzer II system. Raw reads were then assembled into non-redundant consensus sequences using Grape, tgicl, and CAP3 softwares [52,53]. All sequences were examined for possible sequencing errors. Adaptor sequences were trimmed using the Cross_Match software in the Phrap package webcite[54]. Short sequences (< 100 bp in length) were removed using custom Perl program [55]. The resulting high-quality sequences were assembled into sequence contigs with the TGICL program, which creates an assembly using CAP3. Sequence homology searches were performed using local BLASTall programs against sequences in NCBI non-redundant (nr) protein database and the Swissprot database (E-value < 1e-10). Genes were tentatively identified according to the best hits against known sequences. Assembled consensus sequences were used to determine the GO term, COG term, and were analyzed further using KEGG.

DGE-tag profiling

DGE analysis included sample preparation and sequencing. Sequence tag preparation was performed using the Digital Gene Expression Tag Profile Kit (Illumina) according to the manufacturer's instructions. Briefly, 6 μg total RNA was used for mRNA purification using oligo-dT magnetic bead adsorption and oligo-dT was used to guide reverse transcription for double-stranded cDNA synthesis. The generation of 5' ends of tags was done using endonuclease NlaIII, which recognizes and cuts off the CATG sites on cDNA. cDNA fragments with 3' ends were purified through magnetic bead precipitation, and Illumina adapter 1 was added to the 5' ends. The junction of Illumina adapter 1 and CATG site was the recognition site of MmeI, which cuts 17 bp downstream of the CATG site, producing tags with adapter 1. After removal of 3' fragments with magnetic bead precipitation, the 21-bp unique tags with adaptor 1 were purified and ligated to adaptor 2 to form a cDNA tag library. These adapter-ligated cDNA tags were enriched after 15 cycles of linear PCR amplification. The resulting 85-bp fragments were purified by 6% TBE polyacrylamide gel electrophoresis. Fragments were then digested and the single-chain molecules were fixed onto the Solexa Sequencing Chip (flowcell). Sequencing by synthesis was performed using the Illumina Genome Analyzer II system according to the manufacturer's protocols. Image analysis, base calling, generation of raw 17-bp tags, and tag counting were performed using the Illumina pipeline. Raw data (tag sequences) were deposited in the GEO database under submission number GSE21712.

Aligning DGE tags to reference transcriptome data set

Clean tags and count number of DGE libraries from bacteria- and mock-challenged groups were collected and summarised using custom Bio-perl scripts. All tags were mapped to the reference transcriptome generated by RNA-seq. To monitor mapping events on both strands, both sense and complementary antisense sequences were included in the mapping process. Only perfect matches over the entire 21-bp length of the 17-bp tag plus the 4-bp NlaIII recognition site were allowed. This study was limited to tags that mapped to ORFs only and cannot show tags that mapped to mRNA with long 3'UTRs.

Identification of differentially expressed genes

Rigorous algorithms were developed to identify differentially expressed genes between two samples. The correlation of the detected count numbers between parallel libraries was assessed statistically by calculating the Pearson correlation. In addition to the P value, FDR was manipulated to determine differentially expressed genes [56]. Assuming that R differentially expressed genes have been selected, S genes really show differential expression, whereas the other V genes are false positives. If the error ratio Q = V/R must remain below a cutoff (5%), FDR should not exceed 0.05. In this research, P ≤ 0.01, FDR ≤ 0.1, and the absolute value of log2Ratio≥1 were used as threshold to assess the significance of gene expression difference. More stringent criteria with smaller FDR and bigger fold-change values can be used to identify differentially expressed genes.

Experimental validation

Representative consensus sequences with complete ORFs (hepcidin, Myf5, SNARE, syntaxin, and IL-8-like homologues) generated by RNA-seq were selected for experimental cloning and sequencing validation. The cDNAs of these genes were amplified by RT-PCR using the primers shown in Supplemental Table 6. All PCR products were purified using Gel Extraction Kit (Qiagen), cloned into pUCm-T vector (TaKaRa), and sequenced on MegaBACE 1000 Sequencer (GE) using the DYEnamic ET Dye Terminator Cycle Sequencing Kit (Pharmacia). Protein sequence alignments were generated using the Cluster W program (version 1.83). The phylogenies of protein sequences were estimated using MEGA 3.0 with the neighbour-joining method.


Most used abbreviations

ABCF: ATP-binding cassette subfamily F; AICDA: Activation-induced cytidine deaminase; AP-1: Activator protein 1; ASB: Ankyrin repeat and SOCS box protein; ASP: Apoptosis-stimulating of p53 protein; BTK: Bruton's tyrosine kinase; CAMP: Calmodulin-regulated spectrin-associated protein; CARD: Caspase recruitment domain; CBLB: E3 ubiquitin-protein ligase CBL-B; CDC: Cell division cycle; CDK: Cyclin-dependent kinase; CEBP: CCAAT/enhancer-binding protein; CFLAR: CASP8 and FADD-like apoptosis regulator; c-FOS: Cellular Proto-oncogene; CHUK: Conserved helix-loop-helix ubiquitous kinase; CNTFR: Ciliary neurotrophic factor receptor; CRADD: Death domain-containing protein CRADD; CRLF: Cytokine receptor-like factor; CSF: Macrophage/Granulocyte Colony-stimulating factor; CUB: First found in C1r/uEGF/bone morphogenetic protein; CYBB: Cytochrome b-245 heavy chain; CYFIP: Cytoplasmic FMR1 interacting protein; DEDD: Death effector domain-containing protein; DFD: Death-fold domain including CARD/DED/DEATH; DGCR: DiGeorge syndrome critical region; DMBT: Deleted in malignant brain tumors; DOCK: Dedicator of cytokinesis; DRAM: Damage-regulated autophagy modulator; Dscam: Down syndrome cell adhesion molecule; EBI: Epstein-Barr virus induced G-protein coupled receptor; EGF: Epidermal growth factor; EGR: Early growth response protein; EIF: Eukaryotic translation initiation factor; ELK: ETS domain-containing protein; EMAP: Echinoderm microtubule-associated protein; EMILIN: Elastin microfibril interfase located proteIN; ERCC: DNA-repair protein complementing XP-G cells homolog; FADD: Fas-associating death domain-containing protein; FAM: Family with sequence similarity; FGF: Fibroblast growth factor; FIMAC: Factor I membrane attack complex; FLT: Fms-related tyrosine kinase; FND: Fibronectin type III domain-containing protein; GALNAC4S-6ST: N-acetylgalactosamine 4-sulfate 6-O-sulfotransferase GALNAC4S6ST; GATA: Trans-acting T-cell specific transcription factor; GDF: Growth/differentiation factor; GFI: Growth factor independent; GPR: G-protein coupled receptor; HELLS: Lymphoid-specific helicase encoded; HLA: Human leukocyte antigen; HMG: High mobility group protein; HSP: Heat shock protein; HTR: HEAT repeat-containing protein; ICAM: Intercellular adhesion molecule; ICE: Interleukin Catalytic Enzyme; IFRD: Interferon-related developmental regulator; IGBP: Immunoglobulin-binding protein; IKKA: Inhibitor of nuclear factor κ-B kinase subunit α; IMPDH/IMDH: Inosine-5'-monophosphate dehydrogenase; IRAK: Interleukin-1 receptor-associated kinase; IRG: Immune-responsive gene; ITCH: Itchy homolog E3; JAK: Janus kinase; JAG: Protein jagged-1; JNK: c-Jun N-terminal kinase; KI13B: Kinesin-like protein KIF13B; KLF: Krueppel-like factor; LAG: Lymphocyte activation gene; LIFR: Leukemia inhibitory factor receptor; LRP: Prolow-density lipoprotein receptor-related protein; LRR: Leucine-rich repeat; MACPE: Membrane attack complex/perforin; MAF: Macrophage activating factor; MALT: Mucosa-associated lymphoma translocation protein; MAPK: Mitogen-activated protein kinase; MASP: Mannose binding lectin associated serine protease; MBL: Mannose binding lectin; MCA: Multisynthetase complex auxiliary component; MHC: Major histocompatibility complex; MIF: Macrophage migration inhibitory factor; Myd88: Myeloid differentiation primary response gene 88; NALP: NACHT; LRR and PYD domains-containing protein; NCK: Cytoplasmic protein; NF κ: Nuclear factor κ; NFAT: Nuclear factors of activated T cells; NOD: Nucleotide-binding oligomerization domain; NOS: Nitric oxide synthase; NR2C: Nuclear receptor subfamily 2 group C; OSGI: Oxidative stress-induced growth inhibitor; PAMP: Pathogen-associated molecular pattern; PARC: p53-associated parkin-like cytoplasmic protein; PAWR: PRKC apoptosis WT1 regulator protein; PCGF: Polycomb group RING finger protein; PEA15: Phosphoprotein enriched in astrocytes 15 gene; PGRP: Peptidoglycan recognition protein; PHLA: Pleckstrin homology-like domain family A; PIGR: Polymeric immunoglobulin receptor; PPARA: Peroxisome proliferator-activated receptor α; PRR: PAMP recognition receptor; PSMA: Proteasome subunit α type; PSMB: Proteasome subunit β type; PSMC: Protease 26 S subunit; ATPase; PSMD: Proteasome 26 S subunit; non-ATPase; PSME: Proteasome activator complex subunit; PTAFR: Platelet-activating factor receptor; PYCARD: PYD and CARD domain containing protein; PYD: Domain in pyrin; RAC: Ras-related C3 botulinum toxin substrate; REL: C-Rel proto-oncogene protein; RFX5: Regulatory factor × 5; RGC: Response gene to complement protein; RGS: Regulator of G-protein signaling; RHOA: Ras homolog gene family A; RIPK: Receptor-interacting serine/threonine-protein kinase; SARM: SAM and TIR containing; SCYE: Small inducible cytokine subfamily E; SFTPD/SP-D: Surfactant; pulmonary-associated protein D; SIGIRR: Single Ig IL-1-related receptor; SLC20A-1/2: Sodium-dependent phosphate transporter 1/2; SMAD: Mothers against decapentaplegic; SNED: Sushi nidogen and EGF-like domain-containing protein; SOCS: Suppressor of cytokine signaling; TANK: TRAF associated NF-κ-B activator; TAPBP: TAP binding protein; TBX: T-box transcription factor; TGFA: Transforming growth factor -associated protein; TICAM: Toll-like receptor adapter molecule; TIRAP: Toll/interleukin-1 receptor domain-containing adapter protein; TMED: Transmembrane emp24 domain-containing protein; TNF(TNR): Tumor necrosis factor (receptor); TOLLIP: Toll-interacting protein; TPR: Tetratrico peptide repeat; TRADD: TNFRSF1A-associated death domain protein; TRAF: TNF-receptor-associated factor; TRIM: Tripartite motif-containing protein; TSP1: Thrombospondin type 1 repeats; TYK: Tyrosine-protein kinase; UBE: Ubiquitin-conjugating enzyme; UCRP/CRP: Ubiquitin cross-reactive protein; VEGF: Vascular endothelial growth factor; WAS: Wiskott-Aldrich syndrome; WSC: Cell wall integrity and stress response component; WWP: WW domain-containing proteins; XPA: DNA-repair protein complementing XP-A cells.

Authors' contributions

LXX and DH conceived and designed the study, participated in the bioinformatics analysis, and drafted the manuscript. WRD and YWZ performed the experiments and designed the tables. JZS conceptualized the project, reviewed the manuscript, and provided guidance. All authors read and approved the final manuscript.


We acknowledge the Beijing Genomics Institute at Shenzhen for its assistance in original data processing and related bioinformatics analysis. We are thankful to professor Guo-liang Wang of Ningbo University for providing us with Vibrio harveyi culture. We would also like to thank Guang-ping Liu, Hui-hui Liu, and Jian-qiu Zou for their help in data and figure processing. This work was supported by grants from Hi-Tech Research and Development Program of China (863) (2008AA09Z409), the National Basic Research Program of China (973) (2006CB101805), the National Natural Science Foundation of China (30871936, 30571423), and the Science and Technology Foundation of Zhejiang Province (2006C12038, 2006C23045, 2006C12005, 2007C12011).


  1. Kocher TD: Adaptive evolution and explosive speciation: the cichlid fish model.

    Nat Rev Genet 2004, 5(4):288-298. PubMed Abstract | Publisher Full Text OpenURL

  2. Venkatesh B: Evolution and diversity of fish genomes.

    Curr Opin Genet Dev 2003, 13(6):588-592. PubMed Abstract | Publisher Full Text OpenURL

  3. Kasahara M, Naruse K, Sasaki S, Nakatani Y, Qu W, Ahsan B, Yamada T, Nagayasu Y, Doi K, Kasai Y, et al.: The medaka draft genome and insights into vertebrate genome evolution.

    Nature 2007, 447(7145):714-719. PubMed Abstract | Publisher Full Text OpenURL

  4. Peatman E, Liu Z: Evolution of CC chemokines in teleost fish: a case study in gene duplication and implications for immune diversity.

    Immunogenetics 2007, 59(8):613-623. PubMed Abstract | Publisher Full Text OpenURL

  5. Zou J, Tafalla C, Truckle J, Secombes CJ: Identification of a second group of type I IFNs in fish sheds light on IFN evolution in vertebrates.

    J Immunol 2007, 179(6):3859-3871. PubMed Abstract | Publisher Full Text OpenURL

  6. Jin HJ, Shao JZ, Xiang LX, Wang H, Sun LL: Global identification and comparative analysis of SOCS genes in fish: insights into the molecular evolution of SOCS family.

    Mol Immunol 2008, 45(5):1258-1268. PubMed Abstract | Publisher Full Text OpenURL

  7. Flajnik MF, Kasahara M: Origin and evolution of the adaptive immune system: genetic events and selective pressures.

    Nat Rev Genet 2010, 11(1):47-59. PubMed Abstract | Publisher Full Text OpenURL

  8. Leulier F, Lemaitre B: Toll-like receptors--taking an evolutionary approach.

    Nat Rev Genet 2008, 9(3):165-178. PubMed Abstract | Publisher Full Text OpenURL

  9. Hegedus Z, Zakrzewska A, Agoston VC, Ordas A, Racz P, Mink M, Spaink HP, Meijer AH: Deep sequencing of the zebrafish transcriptome response to mycobacterium infection.

    Mol Immunol 2009, 46(15):2918-2930. PubMed Abstract | Publisher Full Text OpenURL

  10. Stockhammer OW, Zakrzewska A, Hegedus Z, Spaink HP, Meijer AH: Transcriptome profiling and functional analyses of the zebrafish embryonic innate immune response to Salmonella infection.

    J Immunol 2009, 182(9):5641-5653. PubMed Abstract | Publisher Full Text OpenURL

  11. Wang Z, Gerstein M, Snyder M: RNA-Seq: a revolutionary tool for transcriptomics.

    Nat Rev Genet 2009, 10(1):57-63. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  12. Anisimov SV: Serial Analysis of Gene Expression (SAGE): 13 years of application in research.

    Curr Pharm Biotechnol 2008, 9(5):338-350. PubMed Abstract | Publisher Full Text OpenURL

  13. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq.

    Nat Methods 2008, 5(7):621-628. PubMed Abstract | Publisher Full Text OpenURL

  14. Sultan M, Schulz MH, Richard H, Magen A, Klingenhoff A, Scherf M, Seifert M, Borodina T, Soldatov A, Parkhomchuk D, et al.: A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome.

    Science 2008, 321(5891):956-960. PubMed Abstract | Publisher Full Text OpenURL

  15. Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M: The transcriptional landscape of the yeast genome defined by RNA sequencing.

    Science 2008, 320(5881):1344-1349. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  16. Morozova O, Marra MA: Applications of next-generation sequencing technologies in functional genomics.

    Genomics 2008, 92(5):255-264. PubMed Abstract | Publisher Full Text OpenURL

  17. t Hoen PA, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RH, de Menezes RX, Boer JM, van Ommen GJ, den Dunnen JT: Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms.

    Nucleic Acids Res 2008, 36(21):e141. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  18. Wang B, Guo G, Wang C, Lin Y, Wang X, Zhao M, Guo Y, He M, Zhang Y, Pan L: Survey of the transcriptome of Aspergillus oryzae via massively parallel mRNA sequencing.

    Nucleic Acids Res 2010, in press. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  19. Levin JZ, Berger MF, Adiconis X, Rogov P, Melnikov A, Fennell T, Nusbaum C, Garraway LA, Gnirke A: Targeted next-generation sequencing of a cancer transcriptome enhances detection of sequence variants and novel fusion transcripts.

    Genome Biol 2009, 10(10):R115. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  20. Tang F, Barbacioru C, Nordman E, Li B, Xu N, Bashkirov VI, Lao K, Surani MA: RNA-Seq analysis to capture the transcriptome landscape of a single cell.

    Nat Protoc 2010, 5(3):516-535. PubMed Abstract | Publisher Full Text OpenURL

  21. Asmann YW, Klee EW, Thompson EA, Perez EA, Middha S, Oberg AL, Therneau TM, Smith DI, Poland GA, Wieben ED, et al.: 3' tag digital gene expression profiling of human brain and universal reference RNA using Illumina Genome Analyzer.

    BMC Genomics 2009, 10:531. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  22. Zhang G, Guo G, Hu X, Zhang Y, Li Q, Li R, Zhuang R, Lu Z, He Z, Fang X, et al.: Deep RNA sequencing at single base-pair resolution reveals high complexity of the rice transcriptome.

    Genome Res 2010, 20:646-654. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  23. David JP, Coissac E, Melodelima C, Poupardin R, Riaz MA, Chandor-Proust A, Reynaud S: Transcriptome response to pollutants and insecticides in the dengue vector Aedes aegypti using next-generation sequencing technology.

    BMC Genomics 2010, 11(1):216. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  24. Liu S, Li D, Li Q, Zhao P, Xiang Z, Xia Q: MicroRNAs of Bombyx mori identified by Solexa sequencing.

    BMC Genomics 2010, 11:148. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  25. Yassour M, Kaplan T, Fraser HB, Levin JZ, Pfiffner J, Adiconis X, Schroth G, Luo S, Khrebtukova I, Gnirke A, et al.: Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing.

    Proc Natl Acad Sci USA 2009, 106(9):3264-3269. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  26. Veitch NJ, Johnson PC, Trivedi U, Terry S, Wildridge D, MacLeod A: Digital gene expression analysis of two life cycle stages of the human-infective parasite, Trypanosoma brucei gambiense reveals differentially expressed clusters of co-regulated genes.

    BMC Genomics 2010, 11:124. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  27. Morrissy AS, Morin RD, Delaney A, Zeng T, McDonald H, Jones S, Zhao Y, Hirst M, Marra MA: Next-generation tag sequencing for cancer gene expression profiling.

    Genome Res 2009, 19(10):1825-1835. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  28. Wang Y, Brahmakshatriya V, Zhu H, Lupiani B, Reddy SM, Yoon BJ, Gunaratne PH, Kim JH, Chen R, Wang J, et al.: Identification of differentially expressed miRNAs in chicken lung and trachea with avian influenza virus infection by a deep sequencing approach.

    BMC Genomics 2009, 10:512. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  29. Chen X, Li Q, Wang J, Guo X, Jiang X, Ren Z, Weng C, Sun G, Wang X, Liu Y, et al.: Identification and characterization of novel amphioxus microRNAs by Solexa sequencing.

    Genome Biol 2009, 10(7):R78. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  30. Xie ZY, Hu CQ, Zhang LP, Chen C, Ren CH, Shen Q: Identification and pathogenicity of Vibrio ponticus affecting cultured Japanese sea bass, Lateolabrax japonicus (Cuvier in Cuvier and Valenciennes).

    Letters in Applied Microbiology 2007, 45(1):62-67. PubMed Abstract | Publisher Full Text OpenURL

  31. Austin B, Zhang XH: Vibrio harveyi: a significant pathogen of marine vertebrates and invertebrates.

    Lett Appl Microbiol 2006, 43(2):119-124. PubMed Abstract | Publisher Full Text OpenURL

  32. Iseli C, Jongeneel CV, Bucher P: ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences.

    Proc Int Conf Intell Syst Mol Biol 1999, 138-148. PubMed Abstract OpenURL

  33. Ye J, McGinnis S, Madden TL: BLAST: improvements for better sequence analysis.

    Nucleic Acids Res 2006, (34 Web Server):W6-9. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  34. Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, Wang J, Li S, Li R, Bolund L: WEGO: a web tool for plotting GO annotations.

    Nucleic Acids Res 2006, (34 Web Server):W293-297. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  35. Tatusov RL, Galperin MY, Natale DA, Koonin EV: The COG database: a tool for genome-scale analysis of protein functions and evolution.

    Nucleic Acids Res 2000, 28(1):33-36. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  36. Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, Katayama T, Kawashima S, Okuda S, Tokimatsu T, et al.: KEGG for linking genomes to life and the environment.

    Nucleic Acids Res 2008, (36 Database):D480-484. PubMed Abstract | PubMed Central Full Text OpenURL

  37. Rassaf T, Kleinbongard P, Kelm M: The L-arginine nitric oxide pathway: avenue for a multiple-level approach to assess vascular function.

    Biol Chem 2006, 387(10-11):1347-1349. PubMed Abstract | Publisher Full Text OpenURL

  38. Ewing B, Hillier L, Wendl MC, Green P: Base-calling of automated sequencer traces using phred. I. Accuracy assessment.

    Genome Res 1998, 8(3):175-185. PubMed Abstract | Publisher Full Text OpenURL

  39. Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities.

    Genome Res 1998, 8(3):186-194. PubMed Abstract | Publisher Full Text OpenURL

  40. Audic S, Claverie JM: The significance of digital gene expression profiles.

    Genome Res 1997, 7(10):986-995. PubMed Abstract | Publisher Full Text OpenURL

  41. Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, Wang J: SOAP2: an improved ultrafast tool for short read alignment.

    Bioinformatics 2009, 25(15):1966-1967. PubMed Abstract | Publisher Full Text OpenURL

  42. Bru C, Courcelle E, Carrere S, Beausse Y, Dalmar S, Kahn D: The ProDom database of protein domain families: more emphasis on 3D.

    Nucleic Acids Res 2005, (33 Database):D212-215. PubMed Abstract | PubMed Central Full Text OpenURL

  43. Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0.

    J Mol Biol 2004, 340(4):783-795. PubMed Abstract | Publisher Full Text OpenURL

  44. Chen Y, Yu P, Luo J, Jiang Y: Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT.

    Mamm Genome 2003, 14(12):859-865. PubMed Abstract | Publisher Full Text OpenURL

  45. Borchardt T, Looso M, Bruckskotten M, Weis P, Kruse J, Braun T: Analysis of newly established EST databases reveals similarities between heart regeneration in newt and fish.

    BMC Genomics 2010, 11:4. PubMed Abstract | BioMed Central Full Text | PubMed Central Full Text OpenURL

  46. Lo L, Zhang Z, Hong N, Peng J, Hong Y: 3640 unique EST clusters from the medaka testis and their potential use for identifying conserved testicular gene expression in fish and mammals.

    PLoS One 2008, 3(12):e3915. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  47. Lin AF, Xiang LX, Wang QL, Dong WR, Gong YF, Shao JZ: The DC-SIGN of zebrafish: insights into the existence of a CD209 homologue in a lower vertebrate and its involvement in adaptive immunity.

    J Immunol 2009, 183(11):7398-7410. PubMed Abstract | Publisher Full Text OpenURL

  48. Gong YF, Xiang LX, Shao JZ: CD154-CD40 interactions are essential for thymus-dependent antibody production in zebrafish: insights into the origin of costimulatory pathway in helper T cell-regulated adaptive immunity in early vertebrates.

    J Immunol 2009, 182(12):7749-7762. PubMed Abstract | Publisher Full Text OpenURL

  49. Meijer AH, Gabby Krens SF, Medina Rodriguez IA, He S, Bitter W, Ewa Snaar-Jagalska B, Spaink HP: Expression analysis of the Toll-like receptor and TIR domain adaptor families of zebrafish.

    Mol Immunol 2004, 40(11):773-783. PubMed Abstract | Publisher Full Text OpenURL

  50. Oshiumi H, Tsujita T, Shida K, Matsumoto M, Ikeo K, Seya T: Prediction of the prototype of the human Toll-like receptor gene family from the pufferfish, Fugu rubripes, genome.

    Immunogenetics 2003, 54(11):791-800. PubMed Abstract | Publisher Full Text OpenURL

  51. Meeker ND, Smith AC, Frazer JK, Bradley DF, Rudner LA, Love C, Trede NS: Characterization of the zebrafish T cell receptor beta locus.

    Immunogenetics 62(1):23-29. PubMed Abstract | Publisher Full Text OpenURL

  52. Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, et al.: TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets.

    Bioinformatics 2003, 19(5):651-652. PubMed Abstract | Publisher Full Text OpenURL

  53. Huang X, Madan A: CAP3: A DNA sequence assembly program.

    Genome Res 1999, 9(9):868-877. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  54. de la Bastide M, McCombie WR: Assembling genomic DNA sequences with PHRAP.

    Curr Protoc Bioinformatics 2007., Chapter 11(Unit11 14) PubMed Abstract | Publisher Full Text OpenURL

  55. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, Fuellen G, Gilbert JG, Korf I, Lapp H, et al.: The Bioperl toolkit: Perl modules for the life sciences.

    Genome Res 2002, 12(10):1611-1618. PubMed Abstract | Publisher Full Text | PubMed Central Full Text OpenURL

  56. Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I: Controlling the false discovery rate in behavior genetics research.

    Behav Brain Res 2001, 125(1-2):279-284. PubMed Abstract | Publisher Full Text OpenURL