Open Access Methodology article

Hepatitis C virus whole genome position weight matrix and robust primer design

Ping Qiu1*, Xiao-Yan Cai2, Luquan Wang1, Jonathan R Greene1 and Bruce Malcolm3

Author Affiliations

1 Bioinformatics Group and Discovery Technology Department, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA

2 Bioanalytical Department, Schering-Plough Research Institute, 1011 Morris Avenue, Union, New Jersey 07083, USA

3 Antiviral Therapy Department, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, New Jersey 07033, USA

BMC Microbiology 2002, 2:29  doi:10.1186/1471-2180-2-29

Published: 25 September 2002

Abstract

Background

The high degree of sequence heterogeneity found in Hepatitis C virus (HCV) isolates makes robust nucleic acid-based assays difficult to generate. Polymerase chain reaction-based techniques require efficient and specific sequence recognition, so generating robust primers capable of recognizing a wide range of isolates is a difficult task.

Results

A position weight matrix (PWM) and a consensus sequence were built for each region of HCV and subsequently assembled into a whole genome consensus sequence and PWM. For each of the 10 regions, the number of occurrences of each base at a given position was compiled. These counts were converted to frequencies that were used to calculate log odds scores. Using over 100 complete and 14,000 partial HCV genomes from GenBank, a consensus HCV genome sequence was generated along with a PWM reflecting heterogeneity at each position. The PWM was used to identify the most conserved regions for primer design.
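The count, frequency, and log odds steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`build_pwm`, `most_conserved_window`), the pseudocount of 1, the uniform background frequency of 0.25, the base-2 logarithm, and the use of mean consensus-base frequency as a conservation score are all assumptions made for illustration. The input is assumed to be a set of already aligned sequences of equal length.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_pwm(aligned_seqs, pseudocount=1.0, background=0.25):
    """Return (consensus, freqs, log_odds) for aligned, equal-length sequences.

    pseudocount and background are illustrative assumptions, not values
    taken from the article.
    """
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")

    consensus, freqs, log_odds = [], [], []
    for pos in range(length):
        # Count occurrences of each base at this position;
        # gaps and ambiguity codes are skipped.
        counts = Counter(s[pos] for s in aligned_seqs if s[pos] in BASES)
        total = sum(counts.values()) + pseudocount * len(BASES)
        # Convert counts to frequencies (with a pseudocount to avoid log(0)).
        col_freq = {b: (counts[b] + pseudocount) / total for b in BASES}
        freqs.append(col_freq)
        # Log odds score of each base relative to the assumed background.
        log_odds.append({b: math.log2(col_freq[b] / background) for b in BASES})
        # Consensus base is the most frequent one at this position.
        consensus.append(max(BASES, key=lambda b: col_freq[b]))
    return "".join(consensus), freqs, log_odds


def most_conserved_window(freqs, width):
    """Return (start, score) of the window with the highest mean
    consensus-base frequency; a simple stand-in for a conservation scan."""
    top = [max(col.values()) for col in freqs]
    scores = [sum(top[i:i + width]) / width
              for i in range(len(top) - width + 1)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Toy alignment (hypothetical data, not HCV sequences).
toy = ["ACGTAC", "ACGTTC", "ACGAAC", "ACGTAG"]
consensus, freqs, log_odds = build_pwm(toy)
start, score = most_conserved_window(freqs, width=3)
print(consensus, start, round(score, 3))
```

In this toy alignment the first three columns are invariant, so the scan selects the window starting at position 0. A real primer search would also need to weigh melting temperature, primer length, and 3'-end conservation, which this sketch does not attempt.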

Conclusions

This approach allows rapid identification of conserved regions for robust primer design and is broadly applicable to sets of genomes with all levels of genetic heterogeneity.