Surveillance designed to detect changes in the type-specific distribution of HPV in cervical intraepithelial neoplasia grade 3 (CIN-3) is necessary to evaluate the effectiveness of the Australian vaccination programme on cancer causing HPV types. This paper develops a protocol that eliminates the need to calculate required sample size; sample size is difficult to calculate in advance because HPV’s true type-specific prevalence is imperfectly known.
A truncated sequential sampling plan that collects a variable sample size was designed to detect changes in the type-specific distribution of HPV in CIN-3. Computer simulation to evaluate the accuracy of the plan at classifying the prevalence of an HPV type as low (< 5%), moderate (5-15%), or high (> 15%) and the average sample size collected was conducted and used to assess its appropriateness as a surveillance tool.
The plan classified the proportion of CIN-3 lesions positive for an HPV type very accurately, with >90% of simulations correctly classifying a simulated data-set with known prevalence. Misclassifying an HPV type of high prevalence as being of low prevalence, arguably the most serious kind of potential error, occurred < 0.05 times per 100 simulations. A much lower sample size (21–22 versus 40–48) was required to classify samples of high rather than low or moderate prevalence.
Truncated sequential sampling enables the proportion of CIN-3 due to an HPV type to be accurately classified using small sample sizes. Truncated sequential sampling should be used for type-specific HPV surveillance in the vaccination era.
Infection with human Papillomavirus (HPV) is recognised as the main cause worldwide of both cervical cancer [1-3] and its precursor lesions, cervical intraepithelial neoplasia grades 2 and 3 (CIN-2, 3) . In 2007, following the development of highly efficacious vaccines against oncogenic HPV subtypes 16 and 18 [1,5,6], Australia became the first country in the world to embark on a national vaccination program [7,8], and other countries have now begun various forms of program.
Ultimately, the success of such programs will be measured by the extent to which they reduce cancer incidence, but shorter term changes in benign and pre-cancerous lesions caused by HPV can be used to assess their impact . In Australia, reductions have been reported in the incidence of CIN-3,  and genital warts . Warts are caused by HPV types 6 and 11, which the quadrivalent vaccine protects against in addition to HPV 16 and 18 .
Key indicators in the early evaluation of an HPV vaccination program are provided by monitoring the distribution of HPV types. It is important both to track the expected decline in prevalence of the types that the vaccine protects against, and to monitor the prevalence of oncogenic types that are not the target of current vaccines . However, planning surveys of HPV type distribution is challenging because current prevalences are only known imprecisely, so sample size estimates cannot be made with any confidence. It is therefore difficult to plan logistical and resource requirements. In order to address this problem, we developed a sequential approach to survey design, with particular application to the monitoring of HPV type distribution in CIN-3.
The sequential sampling approach to monitoring the prevalence of particular characteristics in a population has been used extensively in agriculture and industry, as well as in public health, where lot quality assurance sampling methods are particularly important [11-18]. We developed an approach based on the truncated sequential sampling method used for monitoring drug resistance in HIV infection . Truncated sequential sampling differs from the Wald sampling method used in agriculture and industry by stopping (truncating) sampling when a classification of moderate prevalence can be made; normally, sequential sampling only allows classifications of high and/ or low prevalence [12,15,17]. The underlying strategy is to start with a limited number of specimens (Nmin), make an initial estimate of prevalence of the characteristic of interest (in this case, the proportion positive for a specific HPV type), and then continue to obtain specimens and recalculate the prevalence estimate until predetermined statistical criteria have been met.
The specific criteria are that the estimate can be classified as coming from a population with a low, moderate or high prevalence of an HPV type with an acceptably low probability of being a false positive or a false negative . Real world data are used to define thresholds used to classify prevalence as low, moderate or high . Stopping rules, based on these thresholds and the probability of a false positive or false negative are used to determine whether a classification of high or low prevalence can be made (sampling should stop) with an acceptable probability of error or whether sampling should continue. If a predefined maximum sample size is collected and a classification of high or low prevalence has not been made, prevalence is classified as moderate.(Nmax) Stopping rules are commonly represented visually as pairs of lines on a graph delineating an area which, if breached, mean that sampling should be stopped and a classification made (see Figure 1) .
Figure 1. Graphical representation of the truncated sequential sampling plan. Classifications of high and low prevalence are based on breaching the upper or lower stop lines after the starting sample size (Nmin) has been collected. Sampling continues when the proportion of the sample positive for an HPV type falls between the lines. A classification of moderate prevalence is made and sampling is stopped (truncated) when the maximum sample size (Nmax) has been collected.
In equations [1-3]α and β are the nominal probabilities of a false positive and false negative, p1 and p2 are thresholds defining low and high prevalence and q1 and q2 are equal to 1 - p1 and 1 - p2. The maximum sample size, Nmax, used for truncation is determined arbitrarily .
Applying the strategy to CIN-3
The sequential sampling approach can be used in any population, but in the context of evaluating the impact of HPV vaccination, it is most relevant for CIN-3. Monitoring HPV prevalence in normal cervical specimens, or lower grade CIN, is known to detect a high proportion of infections that are transient or lesions that will regress , and are therefore unlikely to cause cancer, while the time to develop invasive cancer is too long to detect changes due to vaccination.
We used thresholds of (a) less than 5% for an HPV type, designating low prevalence; ( b) 5-15% positive, designating moderate prevalence; and ( c) greater than 15% positive, designating high prevalence. In Australia only HPV 16 may occur in more than 15% of CIN-2 and 3; other types are evenly divided between those associated with ≤5% (HPV 52, 39, 68, 66 and 82) and 5 -15% (HPV 18, 33, 31, 58 and 73) prevalence in CIN-2 and 3 specimens . We employed nominal false positive (α) and false negative (β) probabilities of 0.01 and 0.025, exceeding commonly used standards of α = 0.05 (statistical significance) and 1-β = 0.8 (statistical power).
Based on prior experience in other fields, a starting sample size (Nmin) of 20 specimens was considered large enough to make a classification in many cases. The choice of starting sample size does not affect the eventual classification of prevalences . We used a maximum sample size for truncation of three times the minimum sample size (Nmax = 60). Myatt and Bennett  trialled maximum sample sizes of 25–60 when testing the truncated sampling method, and found that a sample size of 47 was adequate for a plan using similar thresholds, so a maximum sample size of 60 was expected to provide an appropriate balance between accuracy and sampling intensity.
Assessment of the sampling plan for CIN-3
We used simulation to assess how well the sequential sampling approach worked in classifying HPV types as follows. First, we generated synthetic data representing samples of CIN-3 cases, each with a different proportion positive for an HPV type). We generated 1,000 data sets under two different scenarios: one in which all prevalences between 0.00 and 1.00 were equally likely (uniform distribution); and another in which the most likely prevalence (mode) was 0.05, or 5%, with a linear decay to 0.00 at the lower limit and 1.00 at the upper limit (triangular distribution). The use of an upper limit of 1.00 is legitimate because very large proportions of some samples may be positive for high prevalence types – HPV 16 in particular has been detected in up to 95% of specimens in some studies . The use of a triangular distribution is intended to simulate the most frequent sampled value for the prevalence of an HPV type being low (0.05). At present most types other than HPV 16 and 18 occur in 0-10% of samples so this distribution might capture the most likely value for positivity to most types [2,3]. We then used the sampling plan to simulate the process of sequential sampling 10,000 times from each of the data sets. Each one of these 10,000 implementations of the sampling plan is referred to individually as a resembling iteration. The code for simulating the implementation of the sampling plan was originally written in R ( http://www.r-project.org webcite) but for this paper the code was translated into plain C for improved efficiency.
For each of the 1,000 data sets, we summarised the result of the 10,000 resampling iterations by first calculating the proportion of times the prevalence was classified as low, medium or high (designated PClow, PCmod and PChigh) and the sample size required to make each classification. PClow, PCmod and PChigh were then plotted against the true prevalence in each data set, to show the relationship between the probability of classifying a data-set as being either of low, moderate or high positivity and the true proportion positive in the data set. Variation in the sample size collected was summarised using tables. Graphs were created using SigmaPlot 8.0 (Systat, Richmond, CA, USA).
The rate at which incorrect classifications were made over all sampling iterations was also calculated. Incorrect classifications were sub-categorised as incorrect classifications of low, moderate or high prevalence. Gross misclassifications are those in which a sample with truly high prevalence is classified as having low prevalence, or vice versa . The rate at which each of these types of gross misclassification occurred was also recorded and the results tabulated.
Results and discussion
The sequential sampling strategy resulted in correct classification of HPV prevalence in > 90% of resampling iterations (see Table 1). Figure 2 shows that over 10,000 resampling iterations on a data-set in which the proportion positive for HPV type was 0.001 (i.e. clearly low prevalence) or 0.55 (clearly high prevalence), the sampling strategy always led to a correct classification. In a more difficult case, for example where the positive proportion was very close to the high prevalence threshold (0.149), the sequential strategy still performed very well, returning a classification of moderate prevalence on ~23% of occasions, of high prevalence on ~77% and of low prevalence almost never. The results shown in Figure 2 differed little whether the plan was employed on data-sets across which HPV positivity was distributed uniformly or where positivity was most frequently around 5% (triangular distribution). Therefore, for clarity of interpretation of Figure 2, only curves and scatter from the uniform distribution scenario are presented.
Table 1. Rate and type of incorrect classifications during all sampling iterations
Figure 2. Probability of classifying prevalence as low, moderate, or high over 10,000 resampling iterations over1,000 data sets with true prevalence (thexaxis) sampled from a uniform distribution. The dashed vertical lines represent the classification thresholds for low and high prevalence.
The most serious kind of gross misclassification is arguably when a plan classified a sample as having a “low” prevalence when it was in fact “high”. This type of error occurred very rarely (less than 0.05 times per 100 resampling iterations – see Table 1). Serious classification errors occurred more often under the scenario where prevalences were clustered around 0.05 (triangular distribution) than under the uniform scenario because a higher proportion of data sets had HPV positivity between 0 - 20% because of the peaked nature of the sampling distribution (35% vs.18%). These types of errors were still extremely rare however, also occurring less than 0.05 times per 100 resampling iterations (see Table 1).
A much higher sample size was required to classify samples with low prevalence than high prevalence (see Table 2). This is not unexpected given the negative intercept of the lower stop line (see Figure 1), which mandates that if prevalence is low a substantial number of samples are required to determine this. The sample size required to make a classification of moderate prevalence could obviously be reduced by arbitrarily lowering the value of Nmax, but little would be gained by doing so as it would result in more opportunities of misclassifying samples of low prevalence as having moderate prevalence (inspect Figure 1). These results suggest that the plan could be most efficiently employed by using it only to monitor HPV types that are more common.
Table 2. Mean sample size collected over 10,000 resampling bouts from 1,000 simulated data sets with uniformly distributed HPV positivity, and over 10,000 resampling bouts from 1,000 data sets with a triangular distribution of HPV positivity
In practical applications of this method, pathology services could prepare specimens from cervical samples and send them to a suitable reference laboratory . After analysis at the reference laboratory, proportions positive for given HPV types would be recorded; on reaching the starting sample size, the number positive for an HPV type would be compared to the stop chart in Figure 1, and a determination made about whether more specimens were required. Though a strict application of sequential sampling would require a single additional specimen be analysed if required, it would be more practical to increment by five. This would not fundamentally undermine the sequential sampling approach, but it could reduce its efficiency by resulting in more samples being analysed than required.
The use of the resampling method for validation has some affinities with how the plan might be used in reality, as the plan might well be employed multiple times on one sample to classify positivity of multiple HPV types without a loss of precision, providing the risk of infection with one HPV type is not affected by infection with another, as appears to be the case [20,21]. It should be noted that random sampling is not assumed or required in sequential sampling [14,15].
Truncated sequential sampling represents a practical scheme for conducting type-specific surveillance of HPV in pre-cancerous lesions. It eliminates the need to accurately calculate a priori sample size and makes no assumptions about current HPV type distribution. During resampling, it classified prevalence of an HPV type with > 90% accuracy. This is not unexpected as the plans utilised a far higher statistical power (nominal type II errors of 2.5%) than implied by the 20% type II error rate often used as a benchmark in sample size calculations. A much lower average sample size (21–22 versus up to 48) was required to accurately classify high prevalence samples. Truncated sequential sampling represents a practical, efficient method of conducting HPV type-specific surveillance in the absence of accurate, prior information about type specific prevalence, and is especially efficient at classifying the likely prevalence of more common HPV types.
Linkage Project (LP0883831) (DGR, DJP, AMAS, JK, and EKW) includes contributions from CSL Biotherapies Ltd (which distributes the quadrivalent HPV vaccine in Australia) and Victorian Cytology Service Inc, which monitors cervical screening in Victoria and the National HPV Vaccination Register. BD has received honoraria from CSL Biotherapies and is a member of an advisory board for GlaxoSmithKline and CSL Biotherapies. AJH has no conflicts of interest to declare.
EKW devised the research idea, conducted and analysed the computer simulations, and wrote the manuscript. AJH, AMAS, DJP and DGR assisted with the development of methods and edited the manuscript. JK and BD provided epidemiological advice on the biology of HPV infection and current surveillance protocols and edited the manuscript.
This paper was funded from the following sources: the Australian Government Department of Health and Ageing; Australian Research Council (ARC) Linkage Project (LP0883831) which included contributions from CSL Ltd and Victorian Cytology Service Inc. The views expressed in this publication do not necessarily represent the position of the Australian Government.
The authors would like to thank Dr Mark Myatt for providing useful feedback during the review of this manuscript.
The FUTURE I/II Study Group: Four year efficacy of prophylactic human Papillomavirus quadrivalent vaccine against low grade cervical, vulvar, and vaginal intraepithelial neoplasia and anogenital warts: randomised controlled trial.
Goldie SJ, Grima D, Kohli M, Wright TC, Weinstein M, Franco E: A comprehensive natural history model of HPV infection and cervical cancer to estimate the clinical impact of a prophylactic HPV-16/18 vaccine.
Villa LL, Costa RLR, Petta CA, Andrade RP, Paavonen J, Iversen O-E, Olsson S-E, Hoye J, Steinwall M, Riis-Johannessen G, Andersson-Ellstrom A, Elfgren K, von Krogh G, Lehtinen M, Malm C, Tamms GM, Giacoletti K, Lupinacci L, Railkar R, Taddeo FR, Bryan J, Esser MT, Sings HL, Saah AJ, Barr E: High sustained efficacy of a prophylactic quadrivalent human Papillomavirus types 6/11/16/18 L1 virus-like particle vaccine through 5 years of follow-up.
Brown DR, Kjaer SK, Sigurdsson K, Iversen O-E, Hernandez-Avila M, Wheeler CM, Perez G, Koutsky LA, Tay EH, Garcia P, Ault KA, Garland SM, Leodolter S, Olsson S-E, Tang GWK, Ferris DG, Paavonen J, Steben M, Bosch FX, Dillner J, Joura EA, Kurman RJ, Majewski S, Muñoz N, Myers ER, Villa LL, Taddeo FJ, Roberts C, Tadesse A, Bryan J, Lupinacci LC, Giacoletti KED, Sings HL, James M, Hesley TM, Barr E: The impact of quadrivalent human papillomavirus (HPV; Types 6, 11, 16, and 18) L1 virus-like particle vaccine on infection and disease due to oncogenic nonvaccine HPV types in generally HPV-naive women aged 16–26 years.
Donovan B, Franklin N, Guy R, Grulich AE, Regan DG, Ali H, Wand H, Fairley CK: Quadrivalent human Papillomavirus vaccination and trends in genital warts in Australia: analysis of national sentinel surveillance data.
Entomol Exp Appl 2009, 130:282-289. Publisher Full Text
World Health Stat Q 1991, 44:133-139. PubMed Abstract
World Health Stat Q 1991, 44:115-132. PubMed Abstract
Suppl J Royal Stat Soc 1946, 8:1-26. Publisher Full Text
Antivir Ther 2008, 13(Suppl 2):37-48. PubMed Abstract
Obstet Gynecol Surv 2009, 64:306-308. Publisher Full Text
Vaccarella S, Franceschi S, Snijders P, Herrero R, Meijer CJLM, Plummer M, the IARC HPV Prevalence Surveys Study Group: Concurrent infection with multiple human papillomavirus types: pooled analysis of the IARC HPV prevalence surveys.
Chaturvedi A, Katki H, Hildesheim A, Rodriguez AC, Quint W, Schiffman M, van Doorn L-J, Porras C, Wacholder S, Gonzalez P, Sherman ME, Herrero R: Human papillomavirus infection with multiple types: pattern of coinfection and risk of cervical disease.
The pre-publication history for this paper can be accessed here: