Open Access Software

CLOTU: An online pipeline for processing and clustering of 454 amplicon reads into OTUs followed by taxonomic annotation

Surendra Kumar1*, Tor Carlsen1, Bjørn-Helge Mevik2, Pål Enger2, Rakel Blaalid1, Kamran Shalchian-Tabrizi1 and Håvard Kauserud1*

Author Affiliations

1 Microbial Evolution Research Group (MERG), Department of Biology, University of Oslo, P.O. Box 1066 Blindern, N-0316 Oslo, Norway

2 Centre of Information Technology, University of Oslo, Norway


BMC Bioinformatics 2011, 12:182  doi:10.1186/1471-2105-12-182

Published: 20 May 2011



The use of high-throughput sequencing for exploring biodiversity places high demands on bioinformatics applications for automated data processing. Here we introduce CLOTU, an online, open-access pipeline for processing 454 amplicon reads. CLOTU is designed to be user-friendly and flexible, since different datasets require different types of analyses.


In CLOTU, the user can filter out low-quality sequences; trim tags, primers and adaptors; cluster sequence reads; and run BLAST against the NCBI nr database or a customized database in a high-performance computing environment. The resulting data can be browsed in a user-friendly manner and easily forwarded to downstream analyses. Although CLOTU is specifically designed for analyzing 454 amplicon reads, other types of DNA sequence data can also be processed. A fungal ITS sequence dataset generated by 454 sequencing of environmental samples is used to demonstrate the utility of CLOTU.
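The filtering and trimming steps described above can be illustrated with a minimal Python sketch. This is not CLOTU's code: the function names, the default thresholds, and the rule that a read must begin with an exact tag-plus-primer match are all assumptions made for illustration only.

```python
from typing import List, Optional


def trim_read(read: str, tag: str, primer: str) -> Optional[str]:
    """Strip a leading tag and forward primer (hypothetical exact-match rule).

    Returns None when the read does not start with tag + primer.
    """
    prefix = tag + primer
    if not read.startswith(prefix):
        return None
    return read[len(prefix):]


def passes_filter(seq: str, min_length: int = 150, max_ambiguous: int = 0) -> bool:
    """Keep a sequence only if it is long enough and has few non-ACGT bases."""
    ambiguous = sum(1 for base in seq if base not in "ACGT")
    return len(seq) >= min_length and ambiguous <= max_ambiguous


def process_reads(
    reads: List[str],
    tag: str,
    primer: str,
    min_length: int = 150,
    max_ambiguous: int = 0,
) -> List[str]:
    """Trim each read, then drop reads that fail the quality filter."""
    kept = []
    for read in reads:
        trimmed = trim_read(read.upper(), tag, primer)
        if trimmed is not None and passes_filter(trimmed, min_length, max_ambiguous):
            kept.append(trimmed)
    return kept
```

In this sketch, a read that lacks the expected tag and primer, contains an ambiguous base such as N, or falls below the minimum length after trimming is discarded; the surviving sequences would then be passed on to clustering and BLAST.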


CLOTU is a flexible and easy-to-use bioinformatics pipeline that includes different options for filtering, trimming, clustering and taxonomic annotation of high-throughput sequence reads. Some of these options are not included in comparable pipelines. CLOTU is implemented on a Linux computer cluster and is freely accessible to academic users through the Bioportal web-based bioinformatics service.