The HTS barcode checker pipeline, a tool for automated detection of illegally traded species from high-throughput sequencing data
1 Naturalis Biodiversity Center, Darwinweg 4, 2333 CR Leiden, The Netherlands
2 Dutch Customs Laboratory, Kingsfordweg 1, 1043 GN, Amsterdam, The Netherlands
3 University of Applied Sciences Leiden, Zernikedreef 11, 2333 CK, Leiden, The Netherlands
4 Leiden University, Faculty of Science, Einsteinweg 55, 2333 CK, Leiden, The Netherlands
BMC Bioinformatics 2014, 15:44 doi:10.1186/1471-2105-15-44. Published: 6 February 2014
Mixtures of internationally traded organic substances can contain parts of species protected by the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). These mixtures often raise the suspicion of border control and customs offices, which can lead to confiscation, for example in the case of Traditional Chinese medicines (TCMs). High-throughput sequencing of DNA barcoding markers obtained from such samples provides insight into the species constituents of mixtures, but manual cross-referencing of results against the CITES appendices is labor-intensive. Matching DNA barcodes against NCBI GenBank using BLAST may yield misleading results, both false positives, due to incorrectly annotated sequences, and false negatives, due to spurious taxonomic re-assignment. Incongruence between the taxonomies of CITES and NCBI GenBank can result in erroneous estimates of illegal trade.
The HTS barcode checker pipeline is an application for automated processing of sets of ‘next generation’ barcode sequences to determine whether these contain DNA barcodes obtained from species listed on the CITES appendices. This analytical pipeline builds upon and extends existing open-source applications for BLAST matching against the NCBI GenBank reference database and for taxonomic name reconciliation. In a single operation, reads are converted into taxonomic identifications matched with names on the CITES appendices. By inclusion of a blacklist and additional names databases, the HTS barcode checker pipeline prevents false positives and resolves taxonomic heterogeneity.
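The filtering logic described above can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the `Hit` record, the `flag_cites_hits` function, the 97% identity threshold, the genus-level fallback match, and the placeholder accessions are all assumptions made for illustration. The sketch assumes BLAST matches have already been parsed into records. It then drops matches to blacklisted reference sequences, maps synonyms to the names used in a CITES list, and reports the reads that match a listed name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One parsed BLAST match (hypothetical record layout)."""
    read_id: str
    accession: str   # accession of the matched reference sequence
    taxon: str       # taxon name reported for the reference sequence
    identity: float  # percent identity of the alignment


def normalize(name):
    """Lower-case a taxon name and collapse runs of whitespace."""
    return " ".join(name.strip().lower().split())


def flag_cites_hits(hits, blacklist, synonyms, cites_names, min_identity=97.0):
    """Return (read_id, reconciled_name) pairs for hits matching a CITES name.

    blacklist    -- accessions of reference sequences known to be mis-annotated
    synonyms     -- mapping from alternative names to the names used by CITES
    cites_names  -- names listed on the CITES appendices (species or genus)
    min_identity -- illustrative cut-off, not a value taken from the paper
    """
    cites = {normalize(n) for n in cites_names}
    syn = {normalize(k): normalize(v) for k, v in synonyms.items()}
    flagged = []
    for hit in hits:
        if hit.identity < min_identity:
            continue  # weak match, ignore
        if hit.accession in blacklist:
            continue  # guards against false positives from bad annotations
        name = normalize(hit.taxon)
        name = syn.get(name, name)  # reconcile taxonomic heterogeneity
        genus = name.split(" ")[0]
        if name in cites or genus in cites:
            flagged.append((hit.read_id, name))
    return flagged


# Toy data: the accessions are placeholders, not real GenBank records.
hits = [
    Hit("read1", "ACC0001", "Panthera tigris", 99.2),
    Hit("read2", "ACC0002", "Panthera tigris", 99.0),        # blacklisted reference
    Hit("read3", "ACC0003", "Felis tigris", 98.5),           # older synonym
    Hit("read4", "ACC0004", "Glycyrrhiza uralensis", 99.8),  # not in the toy list
    Hit("read5", "ACC0005", "Panthera tigris", 90.0),        # below the threshold
]
flagged = flag_cites_hits(
    hits,
    blacklist={"ACC0002"},
    synonyms={"Felis tigris": "Panthera tigris"},
    cites_names={"Panthera tigris"},
)
print(flagged)
```

In this toy run only `read1` and `read3` are reported: the blacklisted and low-identity matches are discarded, and the synonym is resolved to the listed name before comparison.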
The HTS barcode checker pipeline can detect and correctly identify DNA barcodes of CITES-protected species from reads obtained from TCM samples in just a few minutes. The pipeline facilitates and improves molecular monitoring of trade in endangered species, and can aid in safeguarding these species from extinction in the wild. The HTS barcode checker pipeline is available at https://github.com/naturalis/HTS-barcode-checker.