Open Access Research article

Gene discovery by genome-wide CDS re-prediction and microarray-based transcriptional analysis in phytopathogen Xanthomonas campestris

Lian Zhou1, Frank-Jörg Vorhölter2, Yong-Qiang He3, Bo-Le Jiang3, Ji-Liang Tang3, Yuquan Xu1, Alfred Pühler2* and Ya-Wen He1*

Author Affiliations

1 National Center for Molecular Characterization of GMOs and State Key Laboratory of Microbial Metabolism, School of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China

2 Universität Bielefeld, CeBiTec, Universitätsstr.25, D-33615 Bielefeld, Germany

3 State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Nanning 530004, China


BMC Genomics 2011, 12:359  doi:10.1186/1471-2164-12-359

Published: 12 July 2011



One of the major tasks of the post-genomic era is "reading" genomic sequences in order to extract all the biological information they contain. Although a wide variety of techniques is used to solve the gene-finding problem and a number of prokaryotic gene-finding programs are available, gene recognition in bacteria is far from always straightforward.


This study reports a thorough search for new CDSs in the two published Xanthomonas campestris pv. campestris (Xcc) genomes. First, putative CDSs encoded in the two genomes were re-predicted using three gene finders, which identified 2850 putative new CDSs. Second, similarity searching was conducted and 278 CDSs were found to have homologs in other bacterial species. Third, oligonucleotide microarray and RT-PCR analysis identified 147 CDSs with detectable mRNA transcripts. Finally, in-frame deletion and subsequent phenotype analysis confirmed that Xcc_CDS002, encoding a novel SIR2-like domain protein, is involved in virulence and that Xcc_CDS1553, encoding an ArsR family transcription factor, is involved in arsenate resistance.
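The re-prediction step can be illustrated with a minimal sketch, assuming that each gene finder's output and the existing annotation have already been parsed into coordinate records. It flags predictions that are absent from the annotation and records which tools support each one. The tool names, the `CDS` record, and the choice to match CDSs by strand and stop position are illustrative assumptions, not the procedure used in the study.

```python
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class CDS(NamedTuple):
    """Hypothetical record for one predicted or annotated CDS."""
    start: int   # coordinate of the first base of the start codon
    stop: int    # coordinate of the last base of the stop codon
    strand: str  # "+" or "-"


def cds_key(cds: CDS) -> Tuple[str, int]:
    # Assumption: two records with the same strand and stop position are
    # treated as the same gene, even if the predicted start codons differ.
    return (cds.strand, cds.stop)


def novel_candidates(
    annotated: Iterable[CDS],
    predictions_by_tool: Dict[str, Iterable[CDS]],
) -> Dict[Tuple[str, int], Set[str]]:
    """Map each CDS key missing from the annotation to the tools that predict it."""
    known = {cds_key(c) for c in annotated}
    novel: Dict[Tuple[str, int], Set[str]] = {}
    for tool, predictions in predictions_by_tool.items():
        for cds in predictions:
            key = cds_key(cds)
            if key not in known:
                novel.setdefault(key, set()).add(tool)
    return novel


# Toy input: one annotated CDS and the output of three placeholder gene finders.
annotated = [CDS(100, 400, "+")]
predictions = {
    "finder_a": [CDS(130, 400, "+"), CDS(900, 500, "-")],
    "finder_b": [CDS(880, 500, "-")],
    "finder_c": [],
}
candidates = novel_candidates(annotated, predictions)
for key, tools in sorted(candidates.items()):
    print(key, sorted(tools))
```

In this toy input, the prediction that shares its stop position with the annotated CDS is treated as already known, while the minus-strand prediction is reported as a candidate supported by two tools. Candidates of this kind would then go on to the similarity-search and transcript-detection steps.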


Despite the sophisticated approaches available for genome annotation, many cellular transcripts have so far remained unidentified in the Xcc genomes. A combined strategy involving bioinformatic, postgenomic and genetic approaches yielded a reliable list of 306 new CDSs and a more thorough understanding of some cellular processes.

Xanthomonas campestris; CDS re-prediction; microarray analysis; new CDS