Email updates

Keep up to date with the latest news and content from BMC Research Notes and BioMed Central.

Open Access Research article

454 antibody sequencing - error characterization and correction

Ponraj Prabakaran12, Emily Streaker12, Weizao Chen1 and Dimiter S Dimitrov1*

Author Affiliations

1 Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), Frederick, MD 21702-1201, USA

2 Basic Research Program, Science Applications International Corporation-Frederick, Inc., NCI-Frederick, Frederick, MD 21702, USA

For all author emails, please log on.

BMC Research Notes 2011, 4:404  doi:10.1186/1756-0500-4-404

Published: 12 October 2011

Abstract

Background

454 sequencing is currently the method of choice for sequencing of antibody repertoires and libraries containing large numbers (106 to 1012) of different molecules with similar frameworks and variable regions which poses significant challenges for identifying sequencing errors. Identification and correction of sequencing errors in such mixtures is especially important for the exploration of complex maturation pathways and identification of putative germline predecessors of highly somatically mutated antibodies. To quantify and correct errors incorporated in 454 antibody sequencing, we sequenced six antibodies at different known concentrations twice over and compared them with the corresponding known sequences as determined by standard Sanger sequencing.

Results

We found that 454 antibody sequencing could lead to approximately 20% incorrect reads due to insertions that were mostly found at shorter homopolymer regions of 2-3 nucleotide length, and less so by insertions, deletions and other variants at random sites. Correction of errors might reduce this population of erroneous reads down to 5-10%. However, there are a certain number of errors accounting for 4-8% of the total reads that could not be corrected unless several repeated sequencing is performed, although this may not be possible for large diverse libraries and repertoires including complete sets of antibodies (antibodyomes).

Conclusions

The experimental test procedure carried out for assessing 454 antibody sequencing errors reveals high (up to 20%) incorrect reads; the errors can be reduced down to 5-10% but not less which suggests the use of caution to avoid false discovery of antibody variants and diversity.