Computational cloning of drug target genes of a parasitic nematode, Oesophagostomum dentatum
1 Department of Veterinary Pathology, College of Veterinary Medicine, Iowa State University, Ames, IA 50011, USA
2 Department of Biomedical Sciences, College of Veterinary Medicine, Iowa State University, Ames, IA 50011, USA
BMC Genetics 2013, 14:55 doi:10.1186/1471-2156-14-55
Published: 18 June 2013
Gene identification and sequence determination are critical requirements for many biological, genomic, and bioinformatic studies. With the advent of next-generation sequencing (NGS) technologies, such determinations are predominantly accomplished in silico for organisms whose genome is known or for which substantial gene sequence information exists. Without detailed genomic or gene information, in silico sequence determination is not straightforward, and full coding sequence determination typically involves time- and labor-intensive PCR-based amplification and cloning methods.
An improved method was developed to determine full-length gene coding sequences in silico using de novo assembly of RNA-Seq data. The scheme improves upon initial contigs in two ways: contig-to-gene identification by BLAST nearest-neighbor comparison, and single-contig refinement by iterative binning and assembly of reads. Application of the iterative method produced the gene identification and full coding sequence for 9 of 12 genes, and improved the sequence of 3 of the 12 genes, targeted by the benzimidazole, macrocyclic lactone, and nicotinic agonist classes of anthelminthic drugs in the swine nodular parasite Oesophagostomum dentatum. By comparison, the initial optimized assembly with Velvet identified full coding sequences for only 2 genes.
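The single-contig refinement step can be illustrated with a toy sketch. It is not the authors' pipeline: the function names (`bin_reads`, `extend_once`, `refine_contig`), the k-mer binning rule, and the greedy overlap extension are all assumptions made for illustration, and the greedy extension merely stands in for a real reassembly of the binned reads with an assembler such as Velvet. The idea shown is only the loop itself: bin reads that match the current contig, reassemble, and repeat until the contig stops growing.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def bin_reads(contig, reads, k):
    """Binning (assumed rule): keep reads sharing at least one k-mer with the contig."""
    seed = kmers(contig, k)
    return [r for r in reads if kmers(r, k) & seed]


def extend_once(contig, read, min_overlap):
    """Toy stand-in for reassembly: merge one read onto either end of the contig."""
    if read in contig:
        return contig
    # Try the longest possible overlap first, down to min_overlap.
    for n in range(min(len(contig), len(read)), min_overlap - 1, -1):
        if contig.endswith(read[:n]):      # read extends the 3' end
            return contig + read[n:]
        if contig.startswith(read[-n:]):   # read extends the 5' end
            return read[:-n] + contig
    return contig


def refine_contig(seed, reads, k=8, min_overlap=8, max_rounds=20):
    """Iterate binning and extension until the contig no longer changes."""
    contig = seed
    for _ in range(max_rounds):
        new = contig
        for read in bin_reads(contig, reads, k):
            new = extend_once(new, read, min_overlap)
        if new == contig:
            break
        contig = new
    return contig


# Hypothetical data: a made-up 30-base "gene", four overlapping reads,
# one unrelated read, and a partial starting contig from the middle.
gene = "ATGGCGTACCTTAGGCTGAAGTCCGATTGA"
reads = [gene[0:12], gene[6:18], gene[12:24], gene[18:30], "CCCCCCCCCCCC"]
seed = gene[10:20]

refined = refine_contig(seed, reads, k=5, min_overlap=5)
print(len(seed), len(refined), refined == gene)
```

In this toy case the first round bins only the two reads that overlap the seed; the lengthened contig then recruits the outer reads in the next round, which is why iteration helps where a single pass would not. Real data would also need handling of reverse complements, sequencing errors, and repeats, none of which is attempted here.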
Our reiterative methodology provides a simplified pipeline for determining longer gene sequences in silico from next-generation sequence data for any nematode for which detailed genetic or gene information is lacking. The method significantly improved upon an initial Velvet assembly of RNA-Seq data that yielded only 2 full-length sequences. The identified coding sequences for the 11 target genes enable further examinations, including (i) the use of recombinant target protein in functional assays aimed at a better understanding of the mechanism of drug resistance, and (ii) comparative genomic and transcriptomic assessments between parasite isolates that exhibit varied drug sensitivities.