Research article

Assessing risk of breast cancer in an ethnically South-East Asia population (results of a multiple ethnic groups study)

Fei Gao1,2,3*, David Machin5, Khuan-Yew Chow6, Yu-Fan Sim1, Stephen W Duffy7, David B Matchar3,4, Chien-Hui Goh1 and Kee-Seng Chia8

Author Affiliations

1 Division of Clinical Trials and Epidemiological Sciences, National Cancer Centre Singapore, 11 Hospital Drive, Singapore 169610

2 National Heart Centre Singapore, 17 Third Hospital Drive Avenue, Singapore 168752

3 Health Services & Systems Research, Duke-NUS Graduate Medical School, 8 College Road, Singapore, 169857

4 Department of Medicine, Duke University Medical Center, 2400 Pratt Street, Durham, NC, 27705, USA

5 Medical Statistics Unit, School of Health and Related Research, University of Sheffield, Regents Court, 30 Regent Street, Sheffield, S1 4DA, UK

6 National Registry of Diseases Office, Health Promotion Board, Ministry of Health, Singapore, 168937

7 Wolfson Institute of Preventive Medicine, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, Charterhouse Square, London, EC1M 6BQ, UK

8 Centre for Molecular Epidemiology, National University of Singapore, Singapore, 138671


BMC Cancer 2012, 12:529  doi:10.1186/1471-2407-12-529


Received: 18 October 2011
Accepted: 8 November 2012
Published: 19 November 2012

© 2012 Gao et al.; licensee BioMed Central Ltd.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.



Gail and others developed a model (GAIL) using age-at-menarche, age-at-birth of first live child, number of previous benign breast biopsy examinations, and number of first-degree-relatives with breast cancer as well as baseline age-specific breast cancer risks for predicting the 5-year risk of invasive breast cancer for Caucasian women. However, the validity of the model for projecting risk in South-East Asian women is uncertain. We evaluated GAIL and attempted to improve its performance for Singapore women of Chinese, Malay and Indian origins.


Data from the Singapore Breast Screening Programme (SBSP) were used. Motivated by lower breast cancer incidence in many Asian countries, we utilised race-specific invasive breast cancer and other cause mortality rates for Singapore women to produce GAIL-SBSP. By using risk factor information from a nested case-control study within SBSP, alternative models incorporating first fewer and then additional risk factors were developed. Their accuracy was assessed by comparing the expected cases (E) with the observed (O) through the ratio E/O and its 95% confidence interval (CI), and the respective concordance statistics were estimated.


From 28,883 women, GAIL-SBSP predicted 241.83 cases during the 5-year follow-up while 241 were reported (E/O=1.00, CI=0.88 to 1.14). Except for women who had two or more first-degree-relatives with breast cancer, satisfactory prediction was present in almost all risk categories. This agreement was reflected in Chinese and Malay, but not in Indian women. We also found that a simplified model (S-GAIL-SBSP) including only age-at-menarche, age-at-birth of first live child and number of first-degree-relatives performed similarly, with an associated concordance statistic of 0.5997. Taking account of body mass index and parity did not improve the calibration of S-GAIL-SBSP.


GAIL can be refined by using national race-specific invasive breast cancer rates and mortality rates for causes other than breast cancer. A revised model containing only three variables (S-GAIL-SBSP) provides a simpler approach for projecting absolute risk of invasive breast cancer in South-East Asian women. Nevertheless, its role in counseling individual women regarding their risk of breast cancer remains problematic and needs to be validated in independent data.


The best-known statistical model available for predicting an individual woman’s chance of developing breast cancer is that derived using information from regularly screened Caucasian women from the USA participating in the Breast Cancer Detection Demonstration Project (BCDDP) [1]. This model uses age-at-menarche, age-at-birth of first live child, number of previous benign breast biopsy examinations, and number of first-degree-relatives with breast cancer as well as baseline age-specific breast cancer risks, to provide a predicted probability of invasive or in situ breast cancer development. Subsequently, the baseline hazard was modified using invasive breast cancer rates from the National Cancer Institute’s Surveillance, Epidemiology and End Results (SEER) program from 1983-7 to obtain the model we term GAIL [2].

GAIL is well calibrated among Caucasian women who received annual screening [1-3]. Although derived from a particular group of Caucasian women, GAIL also permits projections for women with differing characteristics including those of other ethnic groups. However, because of the wide variation in international breast cancer rates and the risk factors associated with breast cancer, GAIL may not always perform well [4-7]. For example, Kaur et al. [5] concluded that GAIL only applied to their subpopulation of women who had received screening mammograms and is not readily applicable to all American-Indian and Alaska-Native women. Similar conclusions were reached for women from the Czech Republic [6] and Italy [7].

Because breast cancer rates are higher for Caucasian than African-American women over 40 years, and the reverse for younger women, Gail et al. [8] amended GAIL to account for this racial difference using data from African-American women participating in the Women’s Contraceptive and Reproductive Experiences (CARE) Study. Further, this modified model, termed CARE by Gail et al. [8], is more parsimonious in that age-at-birth of first live child and its interaction with the number of affected first-degree-relatives are no longer included. CARE fits the Women’s Health Initiative Studies [8] data well, with 350 cases observed and 323 expected, but underpredicts risk in African-American women with previous breast biopsy examinations.

Breast cancer rates are increasing throughout Asia and it is the leading cancer among Singaporean women [9], although the incidence rate is markedly lower than that for Caucasian women and the etiology differs, particularly in an earlier age-of-onset. It is also likely that only a small proportion of Asian women have received regular mammograms, given the coverage of available screening programs. Thus, it is important to recognize the limitations of breast cancer prediction models when counseling women of different ethnic groups. The aim of this paper is to examine and modify models of 5-year invasive breast cancer risk in participants of the Singapore Breast Screening Programme (SBSP) to account specifically for women of Chinese, Malay and Indian origins.


Components of the Gail Models

To estimate the probability of invasive breast cancer in women using Gail models several components for the calculations have to be determined, the values of which depend on the specific women concerned. If, apart from those aged 0-19 years and greater than 85 years, the age range is divided into 15 equal divisions of 5 years, with the end of the age-group j indexed by τj (Table  1) then, for an individual at age a=τj-1 within a particular relative risk, rij, the probability of developing breast cancer by age a+5 is given, following Gail (1989, Equation 6) [1], by

[Equation (1)]


where i refers to the binary split of the current age (AGECAT=0,1) at 50 years.

Table 1. Age-specific breast cancer incidence rates per 100,000 women years, Bj, and competing mortality rates per 100,000 women years, cj, established for Caucasian women in the USA when developing GAIL and the comparative values for Singapore as a whole, and for the three main ethnic groups used in risk calculation of GAIL-SBSP and modified GAIL-SBSP

In a USA context, the important risk factors, and their category weightings, for the development of relative risk, rij, include the current age, age-at-menarche (AGEMEN), age-at-first-live-birth (AGEFLB) (nulliparous coded 2), number of first-degree-relatives with breast cancer (NUMREL), and number of previous benign breast biopsies (NBIOPS), presence of atypical hyperplasia (ATYPICAL) and interaction terms (AGEFLB×NUMREL and NBIOPS×AGECAT) [1,2].

In equation (1), bj=Bj[1–AR(AGECAT)] is the baseline age-specific composite breast cancer rate for age-group j, Bj is the age-specific breast cancer incidence rate and AR is the attributable risk in the broader age category within which j falls. When developing GAIL, AR for the USA Caucasian population was found to be approximately constant in those less than 50 years at AR(<50)=0.4771, and for older women at AR(50+)=0.4736.

For an individual in risk group i of age a=τj-1, the probability of remaining breast cancer free up to the age, τj, is estimated by S(τj)=S(τj − 1)exp(−bjrijΔ), where Δ is the 5-year width of the age-group. In addition, the age-specific hazard cj of dying of other causes is assumed to be the same for all subjects in the age-group j. The probability of surviving competing risks up to the end of the age-group j, τj, is estimated by C(τj)=C(τj − 1)exp(−cjΔ), where C(0)=1.
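As a sketch only, the quantities defined above can be combined for a single 5-year interval starting at a=τj-1. The form below is a reconstruction from the stated definitions, assuming piecewise-constant hazards within each age-group as in Gail (1989) [1]; it is not a quotation of equation (1), and Bj and cj from Table 1 would first need to be divided by 100,000 to give rates per woman-year.

```latex
P(a,\, a+\Delta \mid r_{ij})
  \;=\; \frac{b_j\, r_{ij}}{b_j\, r_{ij} + c_j}
        \Bigl[\, 1 - \exp\bigl\{-\Delta\,(b_j\, r_{ij} + c_j)\bigr\} \Bigr],
  \qquad \Delta = 5 \text{ years}.
```

The first factor is the share of the total hazard that is due to breast cancer, and the bracketed term is the probability that either event (breast cancer or death from another cause) occurs within the interval.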

The Fortran program BCPTCARE of the National Cancer Institute calculates equation (1), for given values rij, by combining these with data providing information on a, Bj and cj.
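The calculation can be illustrated with a short Python sketch. This is not the BCPTCARE program; it assumes a single 5-year interval with constant hazards, and the function name `five_year_risk` and the input values are hypothetical, not taken from Table 1.

```python
import math


def five_year_risk(B_j, c_j, rr, ar, delta=5.0):
    """Probability of invasive breast cancer within one age interval.

    B_j, c_j : incidence and competing mortality rates per 100,000 woman-years
    rr       : relative risk r_ij for the woman's risk-factor profile
    ar       : attributable risk for her broad age category
    delta    : interval width in years (assumed to be 5)
    """
    b = (B_j / 1e5) * (1.0 - ar)  # baseline hazard b_j = B_j[1 - AR]
    h1 = b * rr                   # breast cancer hazard for this woman
    h2 = c_j / 1e5                # hazard of dying of other causes
    total = h1 + h2
    if total == 0.0:
        return 0.0
    # Share of the total hazard due to breast cancer, multiplied by the
    # probability that either event occurs within the interval.
    return (h1 / total) * (1.0 - math.exp(-delta * total))


# Hypothetical inputs for illustration only
risk = five_year_risk(B_j=200.0, c_j=500.0, rr=1.5, ar=0.4771)
print(f"5-year risk: {risk:.4f}")
```

Summing these probabilities over all women in a group gives the expected number of cases, E, for that group.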

Data sources


SBSP recruited 29,193 female permanent residents and citizens of Singapore, including 24,609 ethnically Chinese, 1,630 Malay and 1,434 Indian, from 01 October 1994 to 28 February 1997. Women were eligible if they had no previously diagnosed cancer (except non-melanoma skin cancer), no mammography within the past year and no biopsy within the last 6 months. Prior to mammography, all attendees completed a questionnaire covering demographics; reproductive and family histories; smoking; and menopausal hormone therapy use [10,11].

Included in the risk evaluation were those who were disease-free (including 33 in situ) at the time of breast cancer screening. In order to focus on incident breast cancer, women were included only if they were followed up and remained alive without disease (5 in situ) for the next 3 years. As the prevalent breast cancers are not included, the study women have a lower absolute risk than the general female population [12]. Thus the 'clock' was started 3 years from the date of their negative screen and, amongst these women, those who developed invasive breast cancer in the following 4-8 year period are the designated cases. Any women with in situ disease who then developed an invasive cancer were considered as invasive in the year of this latter diagnosis. Women with unknown age-at-menarche or date-of-diagnosis were excluded.

The study was approved by the Singhealth Institutional Review Board (2008/468/B) and National Cancer Centre Institutional Review Board (NC08-041). As this was a large population-based study, with full anonymity of all data, direct consent from the participants was waived.

Nested case-control study

To study risk factors for breast cancer, a nested case-control study was conducted within SBSP in 2006 [13]. Women who were screened-positive (including those with in situ disease) or developed invasive breast cancer before 2006 were defined as case patients. Control subjects were selected from those who did not have a breast cancer diagnosis at the time of study. These were matched to cases by 5-year age-at-entry group, calendar year of entry into SBSP and ethnicity. Data from these women were used to build a model to project absolute invasive breast cancer risk.

Follow-up and breast cancer ascertainment

Breast cancer incidence and death status for SBSP participants were notified as either detected through SBSP or subsequently through record linkage with the Singapore Cancer Registry (SCR). All whose death status was not captured were assumed alive at 1 March 2009. SCR includes all cases of cancer occurring in citizens and permanent residents (population near 5 million) between 1968 and 2008. Annual invasive breast cancer cases and annual non-breast cancer deaths are obtained from SCR and annual population numbers from Singapore Resident Population report (2003-2007) [14]. These were used to calculate average race-specific estimates of Bj, and cj for the period 2003-2007 (Table  1, Singaporean).

Statistical analyses

To estimate the probability of invasive breast cancer for a different population one can assume that all the components necessary are already contained in GAIL. That is, the regression coefficients, β, (Table 2, BCDDP), together with Bj and cj, remain as those specified when formulating that model (Table 1, Caucasian women). The calculated probabilities can be applied to an age-specific group of interest to provide the expected number of cases, E. This can then be compared to the actual number of cases observed, O, with a ratio of E/O = 1 indicating perfect agreement within that age category. The corresponding 95% confidence interval (CI) is (E/O) × exp(±1.96 × √(1/O)) [2,3]. If k age categories are concerned then, under the null hypothesis, ∑(O−E)²/E follows a χ² distribution with k degrees of freedom [8].
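The calibration ratio and its interval can be sketched in a few lines of Python. The interval form used here, (E/O) × exp(±1.96 × √(1/O)), is an assumption, chosen because it reproduces the intervals reported in this paper; the function name `eo_ratio_ci` is hypothetical.

```python
import math


def eo_ratio_ci(expected, observed, z=1.96):
    """Return E/O with an approximate 95% confidence interval.

    Assumes the interval (E/O) * exp(+/- z * sqrt(1/O)), which treats the
    observed count O as the only source of sampling variation.
    """
    ratio = expected / observed
    half_width = z * math.sqrt(1.0 / observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)


# Overall GAIL-SBSP figures quoted in the text: E = 241.83, O = 241
ratio, lower, upper = eo_ratio_ci(241.83, 241)
print(f"E/O = {ratio:.2f}, CI = {lower:.2f} to {upper:.2f}")
```

Applied to the Indian subgroup (E=10.16, O=17), the same function gives limits that round to 0.37 and 0.96.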

Table 2. Breast cancer risk factors, and associated regression coefficients (β), used in developing the alternative Gail-based models

In a preliminary investigation, we noted that applying GAIL unchanged to Singaporean women substantially overestimated the number of invasive breast cancer cases: (O=241, E=401.54, E/O=1.67). Consequently our study aim was modified, to one investigating features of the local population which might influence the ultimate predictions while preserving the Gail-based modelling approach.

Breast cancer rates and competing mortality rates are much lower for Singaporeans of all ethnic groups than Caucasians in those over the age of 30; particularly so in those > 70 years (Table 1, Singaporean). To take into account such differences, which may influence the expected value E [7,8,15], we formulated GAIL-SBSP using the regression coefficients and AR derived from the BCDDP cohort (Table 2) combined with the average race-specific estimates of Bj and cj for the period 2003-2007 for Singaporean women (Table 1).

In validating GAIL-SBSP we were unable to classify SBSP participants with respect to either history of previous benign biopsy or atypical hyperplasia status. Consequently those ever having previous benign breast biopsies [11] were categorized as a single biopsy, and atypical hyperplasia was categorized as unknown.

As it is known that the etiological factors for breast cancer vary according to, for example, ethnicity and/or geographical location of the women, the risk factors concerned and/or their weightings in the established Gail model may require some modification. To explore this, we used subjects from the nested case-control study. Relative odds were obtained by use of multiple logistic regression with the same independent variables and coding as GAIL-SBSP (Table 2). A simplified model (S-GAIL-SBSP) with only three variables – age-at-menarche, age-at-birth of first live child and number of first-degree-relatives with breast cancer – was identified. In contrast, in order to explore whether adding other risk factors could predict invasive breast cancer with improved accuracy, the extended model (E-GAIL-SBSP) added ethnicity (ETHNICITY), parity (PARITY), smoking (SMOKING), body mass index (BMI), use of hormonal replacement therapy (HRT), use of oral contraception (OC) and waist-to-hip ratio (WHR). The body mass index was categorized as <23.0, 23.0–27.4 and ≥27.5 kg/m² (coded as 0, 1 or 2) following the World Health Organization (WHO) guideline for Asian populations [16]. To avoid missing any potentially important predictors, P<0.05 was used for statistical significance to select variables for multivariate modeling. Finally, E-GAIL-SBSP was created by taking only those variables with prognostic significance into the model.
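The BMI coding described above can be sketched as a small helper. The function name `bmi_category` is hypothetical, and the sketch assumes that values between 27.4 and 27.5 kg/m² (which the stated categories do not cover) fall in the middle category.

```python
def bmi_category(bmi):
    """Code BMI (kg/m^2) as 0, 1 or 2 using the cut-points quoted in the text.

    0: below 23.0, 1: 23.0 up to (but not including) 27.5, 2: 27.5 or above.
    Placing 27.4 < BMI < 27.5 in category 1 is an assumption.
    """
    if bmi < 23.0:
        return 0
    if bmi < 27.5:
        return 1
    return 2


codes = [bmi_category(value) for value in (21.8, 23.0, 27.4, 27.5, 31.2)]
print(codes)
```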

Model discriminatory accuracy was measured by the age-specific concordance statistic, using a logistic regression model of breast cancer status on the estimated risks. Thus each model was assessed by use of the area under the receiver operating characteristics curve (AUC) created by computing sensitivity and specificity [17]. The CI was based on the standard normal approximation. The average of the age-specific concordances used weights proportional to the number of women in each age group [18]. The variance for the average age-specific concordance was the sum, over the age groups, of the weight squared multiplied by the estimated variance of the age-specific concordance estimate. Age-groups with no cases are excluded from the calculations.
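The weighting scheme described above can be sketched as follows. The function name `weighted_concordance` and all input values are hypothetical; age-specific AUCs and their variances are taken as given, and groups without cases are assumed to have been dropped beforehand.

```python
import math


def weighted_concordance(aucs, variances, n_women, z=1.96):
    """Average age-specific AUCs with weights proportional to group size.

    The variance of the average is the sum over age groups of the squared
    weight times the variance of that group's AUC estimate; the interval
    uses the standard normal approximation.
    """
    total = sum(n_women)
    weights = [n / total for n in n_women]
    average = sum(w * a for w, a in zip(weights, aucs))
    variance = sum(w * w * v for w, v in zip(weights, variances))
    half_width = z * math.sqrt(variance)
    return average, (average - half_width, average + half_width)


# Hypothetical age-specific estimates for two equally sized age groups
average, (ci_low, ci_high) = weighted_concordance(
    aucs=[0.58, 0.60], variances=[0.0004, 0.0004], n_women=[100, 100]
)
print(f"AUC = {average:.3f} (CI {ci_low:.3f} to {ci_high:.3f})")
```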



Of the 29,193 women in SBSP, 28,883 were available for the 5-year risk assessment. A total of 241 invasive cases were observed and these are categorized by ethnicity and age-group together with the numbers predicted by GAIL-SBSP (Table  3). In total GAIL-SBSP predicted 241.83 cases (E/O=1.00, CI=0.88 to 1.14) – suggesting good model calibration (goodness-of-fit, P=0.957). This satisfactory prediction was also seen within all age groups (goodness-of-fit, P=0.092). This agreement was reflected in Chinese and Malay, but not in the relatively few Indian women as 17 cases were observed and only 10.16 predicted (E/O=0.60, CI=0.37 to 0.96).

Table 3. Comparison of the expected cases (E) of invasive breast cancer predicted by each respective model, to the observed cases (O) in the Singapore Breast Screening Programme (SBSP) cohort for each ethnic group

In general, predictions were good amongst the various risk categories (Table 4). However, among women who had two or more first-degree-relatives with breast cancer, the numbers were underpredicted (E/O=0.18, CI=0.04 to 0.71) while for those with no history there was excellent calibration (E/O=1.02, CI=0.90 to 1.16).

Table 4. Comparison of the expected cases (E) of invasive breast cancer predicted by each respective model, to the observed cases (O) in the Singapore Breast Screening Programme (SBSP) cohort by risk factor category


To estimate the relative risk function, we analyzed 439 invasive breast cancer cases (121 diagnosed at screening and 318 subsequently) and 1,198 controls from the nested case-control study (Table 5). As far as possible, those risk factors identified for GAIL were initially used to estimate the regression coefficients, which are reported in Table 2 (GAIL-SBSP (FULL)). Using the same model structure, there are some substantial differences from the coefficients derived for GAIL, and a good deal of instability when estimating the interaction terms. As a consequence, the simplified model S-GAIL-SBSP, including only age-at-menarche, age-at-birth of first live child and number of first-degree-relatives with breast cancer, was derived to obtain the relative risks (RR) (Table 2, S-GAIL-SBSP). The corresponding RRs for each of the risk categories are given in Table 6 where they are compared with those used in GAIL (Table 6, BCDDP). Omitting age and number of previous benign breast biopsies and the interactions did not degrade the fit of the model (P=0.359).

Table 5. Baseline characteristics and odds ratios (OR) of invasive breast cancer in the nested case-control study

Table 6. Prevalence of breast cancer risk factors in the nested case-control study within Singapore Breast Screening Programme (SBSP) cohort and relative risks (RR) from SBSP and the Breast Cancer Detection and Demonstration Project (BCDDP)

The differences in the RRs between the nested case-control study and BCDDP are largest in the groups where the number of first-degree-relatives with breast cancer is two or more.

The simplified model with only three variables – age-at-menarche, age-at-birth of first live child and number of first-degree-relatives with breast cancer – utilized the modified RRs and predicted 241.80 cases (E/O=1.00, CI=0.88 to 1.14) (Table 3). The satisfactory prediction was seen in all ethnic groups although among Indian women, as was the case for GAIL-SBSP, the calibration was not entirely consistent across all age-groups.

Again similar to GAIL-SBSP, S-GAIL-SBSP predictions were relatively close amongst the various risk categories (Table  4). However, the model underestimated the observed incidence of breast cancer for women who had a biopsy although this was not statistically significant (E/O=0.67, CI=0.44 to 1.01) while for those without a biopsy there was a very good calibration (E/O=1.04, CI=0.91 to 1.19).


To determine whether other risk factors could improve S-GAIL-SBSP performance, the effects of the Gail model risk factors were re-estimated in a multiple logistic regression that used subjects from the nested case-control study. We expanded the model by including ethnicity, parity, smoking, BMI, use of hormonal replacement therapy, use of oral contraception and waist-to-hip ratio to estimate the regression coefficients (Table  5). In addition to age-at-menarche, age-at-birth of first live child and number of first-degree-relatives with breast cancer, both parity and BMI were significantly associated with the probability of invasive breast cancer.

Following Gail et al. [8], the ARs necessary to convert Singaporean age-specific invasive breast cancer rates to baseline rates for SBSP women were calculated. In order to match the follow-up period of the SBSP participants, this was based on invasive cases diagnosed over 1993-2002 [9,19] and over 2003-2007 [SCR unpublished]. Estimates of AR were 0.5356 for those younger than 50 years and 0.5397 for older women. These ARs and modified RRs were used to re-evaluate equation (1) and hence formulated E-GAIL-SBSP together with Singapore race-specific estimates of Bj and cj (Table 1).
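As a worked step, and assuming the definition bj=Bj[1–AR] given earlier applies unchanged, the AR estimates scale the age-specific incidence rates as shown below. The USA values quoted for GAIL are included for comparison; the Bj themselves differ between the two populations (Table 1).

```latex
\begin{aligned}
\text{Singapore, } <50 \text{ years:}\quad   & b_j = B_j\,(1 - 0.5356) = 0.4644\,B_j \\
\text{Singapore, } \ge 50 \text{ years:}\quad & b_j = B_j\,(1 - 0.5397) = 0.4603\,B_j \\
\text{GAIL (USA), } <50 \text{ years:}\quad   & b_j = B_j\,(1 - 0.4771) = 0.5229\,B_j \\
\text{GAIL (USA), } \ge 50 \text{ years:}\quad & b_j = B_j\,(1 - 0.4736) = 0.5264\,B_j
\end{aligned}
```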

Overall, E-GAIL-SBSP predicted 289.47 cases (E/O=1.20; CI=1.06 to 1.36) and so was not able to satisfactorily capture the number of cases among the various risk categories (goodness-of-fit, P=0.004). Moreover, E-GAIL-SBSP statistically significantly underestimated the number of cases among women with two or more first-degree-relatives with breast cancer (E/O=0.24, CI=0.06 to 0.97).

Comparison between the three models

In the calibration, the SBSP participants were divided into deciles of 5-year invasive breast cancer risks predicted by GAIL-SBSP, S-GAIL-SBSP and E-GAIL-SBSP, respectively. These predicted rates are compared with those observed in Figure 1. Thus, for example, GAIL-SBSP underpredicted in the seventh, ninth and tenth deciles, there were generally closer predictions with S-GAIL-SBSP except in the sixth decile, and there was a considerable overprediction in the tenth decile with E-GAIL-SBSP. Clearly the addition of BMI and parity to E-GAIL-SBSP did not materially improve calibration.
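The decile comparison described above can be sketched as follows. The function name `decile_calibration` and the toy data are hypothetical; the sketch assumes one predicted 5-year risk and one 0/1 outcome per woman, and splits the sorted cohort into ten groups of near-equal size.

```python
def decile_calibration(predicted, outcomes):
    """Return (mean predicted risk, observed proportion) per risk decile."""
    pairs = sorted(zip(predicted, outcomes))  # order women by predicted risk
    n = len(pairs)
    rows = []
    for d in range(10):
        chunk = pairs[d * n // 10:(d + 1) * n // 10]
        if not chunk:
            continue  # fewer than ten women: skip empty groups
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_predicted, observed))
    return rows


# Toy data: 20 women with increasing predicted risk; the last two are cases
predicted = [i / 100 for i in range(1, 21)]
outcomes = [0] * 18 + [1, 1]
rows = decile_calibration(predicted, outcomes)
for decile, (mean_predicted, observed) in enumerate(rows, start=1):
    print(decile, round(mean_predicted, 3), round(observed, 3))
```

Plotting the two columns against decile number gives a display of the kind described for Figure 1.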

Figure 1. Comparison of observed breast cancer risk with that predicted by each model. The horizontal axis shows the grouping by deciles of risk and the vertical axis the observed and corresponding predicted risk.

The unweighted average concordance statistics were very similar, and not statistically significantly different, with AUC = 0.6098 (CI=0.57 to 0.65), 0.5997 (CI=0.56 to 0.64), and 0.6162 (CI=0.58 to 0.65) for GAIL-SBSP, S-GAIL-SBSP and E-GAIL-SBSP, respectively (Figure  2). In addition, the estimated age-specific AUC of S-GAIL-SBSP for the intervals from 50 to 74 years were modest except in the oldest group; specifically, 0.5766 (CI=0.53 to 0.62) for those aged 50 - 59, 0.5838 (CI=0.53 to 0.64) for 60 - 69 years, and 0.8938 (CI=0.70 to 1.00) for 70 - 74 years.

Figure 2. Receiver operating characteristic and corresponding area-under-the-curve (AUC) for the breast cancer risks predicted by GAIL-SBSP, S-GAIL-SBSP and E-GAIL-SBSP, respectively.


Although first developed by MH Gail and his associates [1] some 25 years ago, the GAIL model continues to play an important role in predicting the 5-year risk of invasive breast cancer. Thus Schonfeld et al. [15] have shown that GAIL remains well calibrated in more recent cohorts. However, although refinements have been made, application of the underlying methodology to non-Caucasian women has been limited, and the available evidence suggests that further modification may be needed for its use in women of other ethnic groups. Singapore has a population which is predominantly of Chinese ethnicity but also includes those of Malay and Indian descent, and has completed a large mammographic screening study involving 29,193 randomly selected women, follow-up of whom enables invasive breast cancer rates to be determined. Thus the very different breast cancer rates, and etiological risk factors varying in their presence and magnitude when compared to other populations, enable the GAIL model itself to be tested and variants (if relevant) to be established.

Retaining the GAIL model structure, but applying Singapore national and race-specific invasive breast cancer and other cause mortality rates to develop GAIL-SBSP, resulted in absolute risk projections that worked reasonably well as assessed by the comparison of observed and expected cases across all age groups and amongst the majority of risk categories (Tables 3 & 4). In total 241 cases were recorded while GAIL-SBSP predicted 241.83 (E/O=1.00, CI=0.88 to 1.14). However, the model underpredicted for the very few women who had two or more first-degree-relatives with breast cancer. Prediction was satisfactory for Chinese and Malay women but not for Indian women, for whom the model underpredicted (17 cases observed, 10.16 predicted). Since the Malay and Indian groups each comprise only about 5% of the population studied, the accuracy of these specific predictions requires further investigation in residents of the Malaysian Peninsula and the Indian Sub-Continent. Nevertheless, the results for the Chinese women suggest that GAIL can be improved for South-East Asian populations by using local (and/or updated) estimates of incidence and competing mortality rates.

Although the performance of GAIL-SBSP is in general satisfactory, at least amongst Singapore-Chinese women, one might anticipate that taking into account the implications of the different health systems and etiological factors may produce improved prediction. For example, only 10% of the subjects in the SBSP cohort had had a mammogram in the previous year and this may have an impact on the apparent natural history of disease. More generally, women from much of the Asia-Pacific region do not receive regular mammogram screening [20]. Also, some factors included in GAIL may have different consequences in an Asian-Pacific population due to genetic predisposition, geographic, or other influences. To explore these aspects we initially used the same risk factors and coding that were in the original model of GAIL [1,2] to estimate relative risks with subjects from a case-control study, nested within the SBSP cohort (GAIL-SBSP (FULL)), and to compare these estimates with those from the BCDDP (Table 2). Finally we derived a simplified model (S-GAIL-SBSP) with fewer risk factors and also an extended one (E-GAIL-SBSP) with additional risk factors included, both incorporating local baseline race-specific breast cancer and other mortality rates.

S-GAIL-SBSP, which included age-at-menarche, age-at-birth of first live child and number of first-degree-relatives with breast cancer, was well calibrated in the total SBSP cohort and across most subgroups (Tables 3 & 4). It was not surprising that ‘ever having previous benign breast biopsy’ was not included in this revised prediction for Singaporean women, as this reflects a specific health care delivery system in which biopsies were not common, as is the case for the majority of Asian women (including Chinese-Occidental migrants) [20,21], although an increasing proportion of these women now receive mammographic screening [22,23]. Other evidence for the use of simpler, but more targeted, models has been provided by predictions of estrogen receptor-positive breast cancer in postmenopausal women in the USA [24]. However, the concordance from GAIL-SBSP (AUC = 0.61) is relatively low. This is similar to previous validation studies in non-Asian populations which have recorded an AUC between 0.56 and 0.60 for the GAIL model for Caucasian women [3], for the modified GAIL (CARE) an average age-specific AUC of 0.555 in African-American women [8], and 0.614 in Asian- and Pacific-Islander-American women [25]. Thus a good model with a higher discriminatory accuracy, in addition to good calibration, is needed [26]. Unfortunately, given the low relative risks associated with most established non-modifiable breast cancer risk factors, it is unlikely that any prediction model will have a much higher discriminatory accuracy [3,27].

The modest concordance suggests that additional factors prognostic for outcome may be required. In this respect we found that BMI and parity were independent predictors of risk. Their inclusion in E-GAIL-SBSP, with modified AR and RRs derived from the SBSP cohort, marginally improved the discriminatory power (AUC = 0.62) but substantially overestimated the predicted breast cancer cases in, for example, the highest decile (Figure 1). One reason for this is that the ARs calculated may be inappropriate, possibly due to the true risk factor prevalence by age not taking a binary form with a cut at 50 years and/or influencing Asian women in different ways. Using the original AR values of GAIL made no improvement. Overestimates may also be a consequence of over-fitting a model with many risk factors based on only a modest number of cases and controls [28]. Similar mixed results were observed when the use of hormonal replacement therapy, oral contraception, smoking and waist-to-hip ratio were investigated (unreported analyses).

It has been suggested that the addition of mammographic breast density could provide improved discriminatory power for the GAIL model for Caucasian women [29,30] as the density is associated with an increased risk of breast cancer [13,31]. Further, women with Tabar IV [32] parenchymal patterns amongst the SBSP cohort also have a significantly higher risk of breast cancer when compared to those with the remaining patterns (odds ratio=2.30, CI=1.14 to 4.63). However, those screened for this in the SBSP study were too few in number for us to validate any model that incorporates this risk factor.

The GAIL-SBSP, S-GAIL-SBSP and E-GAIL-SBSP models should always be applied with caution, or avoided, for certain specific populations, as is true for GAIL itself. For example, although large, SBSP was essentially confined (95%) to women aged 50 years or more; the models can be applied to younger women, but further validation is needed. Further, we started the “clock” three years after negative screening, which implies the SBSP-based models are pertinent to women thought to be free of breast cancer. A woman who has just had a negative breast examination and mammogram, as the United Kingdom breast screening programme has shown, has about one third the absolute risk of breast cancer in the following three years [33]. Nevertheless, to establish the risks definitively, validation studies from a more representative sample of South-East Asian women in regular follow-up are required.

The strengths of this study include the use of a predominantly postmenopausal group of women from three ethnic groups, Chinese, Malay and Indian, drawn from a very large screening program in which more than 29,000 women were randomly chosen to participate. Further, since neither Asian nor Chinese-Occidental (born in or migrated to the West) women were systematically included in the development of the Gail-based models [1,2], we believe this is the first attempt to validate and modify the basic GAIL model for ethnically diverse women living in an Asian region. Earlier studies have explored those of Chinese-Occidental origin who had migrated when aged less than 21 years (N=216) or had been residents in a western country for 10 years or less (N=421) [21] and Asian- and Pacific-Islander-American women [25].

Limitations of using SBSP data for individual absolute risk predictions include the inability of the nested case-control study to estimate elaborate models with sufficient precision. Also, our validation data included relatively small numbers of breast cancer cases, especially amongst Singapore-Malay and -Indian women and those with two or more affected first-degree relatives. Furthermore, as with retrospective studies in general, the level of ascertainment of incident cases is of concern. However, Singapore is a small island where all citizens and permanent residents are registered in the population registry with a unique registration number. In addition, cancer notification is mandatory, which enables near-complete ascertainment of breast cancer incidence by linkage of SBSP participants with the Singapore Cancer Registry database.


In conclusion, we found that among South-East Asian postmenopausal women, the GAIL-type model could be refined using race-specific estimates of invasive breast cancer incidence and other-cause mortality rates. A model that includes age-at-menarche, age-at-birth of first live child and number of first-degree-relatives with breast cancer appears to provide a simpler approach for projecting absolute risk of invasive breast cancer in South-East Asian women. Nevertheless, its role in counseling individual women regarding their risk of breast cancer remains problematic and needs to be validated in independent data.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

FG, DM, SWD, DBM and KSC conceived the study and participated in its design. FG, DM, DBM, CHG and YFS participated in data analysis and interpretation. FG, DM, KYC, YFS and CHG drafted the manuscript. All authors read and approved the final manuscript.


The authors thank Ng Eng Heng, who successfully conducted the SBSP, and Bee Guat Lee and Anthony Wong from the National Registry of Diseases Office (NRDO) for their invaluable assistance with the identification of breast carcinoma incidence and the survival status updates.

The authors also thank M H Gail and co-workers for releasing the FORTRAN code (BCPTCARE.for) on April 27, 2009.


The study was supported by a research grant from the National Medical Research Council Singapore (NMRC/NIG/0048/2008).


  1. Gail MH, Brinton LA, Byar DP, Corle DK, Green SB, Schairer C, Mulvihill JJ: Projecting individualized probabilities of developing breast cancer for white females who are being examined annually.

    J Natl Cancer Inst 1989, 81:1879-1886.

  2. Costantino JP, Gail MH, Pee D, Anderson S, Redmond CK, Benichou J, Wieand HS: Validation studies for models projecting the risk of invasive and total breast cancer incidence.

    J Natl Cancer Inst 1999, 91:1541-1548.

  3. Rockhill B, Spiegelman D, Byrne C, Hunter DJ, Colditz GA: Validation of the Gail et al. Model of Breast Cancer Risk Prediction and Implications for Chemoprevention.

    J Natl Cancer Inst 2001, 93:358-366.

  4. Spiegelman D, Colditz GA, Hunter D, Hertzmark E: Validation of the Gail et al. model for predicting individual breast cancer risk.

    J Natl Cancer Inst 1994, 86:600-607.

  5. Kaur JS, Roubidoux MA, Sloan J, Novotny P: Can the Gail Model be useful in American Indian and Alaska Native populations?

    Cancer 2004, 100:906-912.

  6. Novotny J, Pecen L, Petruzelka L, Svobodnik A, Dusek L, Danes J, Skovajsova M: Breast cancer risk assessment in the Czech female population – an adjustment of the original Gail model.

    Breast Cancer Res Treat 2006, 95:29-35.

  7. Boyle P, Mezzetti M, Vecchia CL, Franceschi S, Decarli A, Robertson C: Contribution of three components to individual cancer risk predicting breast cancer risk in Italy.

    Eur J Cancer Prev 2004, 13:183-191.

  8. Gail MH, Costantino JP, Pee D, Bondy M, Newman L, Selvan M, Anderson GL, Malone KE, Marchbanks PA, McCaskill-Stevens W, Norman SA, Simon MS, Spirtas R, Ursin G, Bernstein L: Projecting individualized absolute invasive breast cancer risk in African American women.

    J Natl Cancer Inst 2007, 99:1782-1792.

  9. Seow A, Koh WP, Chia KS, Shi LM, Lee HP, Shanmugaratnam K:

    Trends In Cancer Incidence in Singapore 1968 – 2002: Singapore Cancer Registry Report No. 6. 2004.

  10. Ng EH, Ng FC, Tan PH, Low SC, Chiang G, Tan KP, Seow A, Emmanuel S, Tan CH, Ho GH, Ng LT, Wilde CC, Singapore Breast Cancer Screening Project Working Committee, Ministry of Health, Singapore, et al.: Results of intermediate measures from a population-based, randomized trial of mammographic screening prevalence and detection of breast carcinoma among Asian women - The Singapore Breast Screening Project.

    Cancer 1998, 82:1521-1528.

  11. Ng EH, Gao F, Ji CY, Ho GH, Soo KC: Risk factors for breast carcinoma in Singaporean Chinese Women: the role of central obesity.

    Cancer 1997, 80:725-731.

  12. Gao F, Chia KS, Ng FC, Ng EH, Machin D: Interval cancers following breast cancer screening in Singaporean women.

    Int J Cancer 2002, 101:475-479.

  13. Wong CS, Lim GH, Gao F, Jakes RW, Offman J, Chia KS, Duffy SW: Mammographic density and its interaction with other breast cancer risk factors in an Asian population.

    Br J Cancer 2011, 104:871-874.

  14. Singapore Department of Statistics:

    Singapore Resident Population, 2003 – 2007 (February 2008).

  15. Schonfeld SJ, Pee D, Greenlee RT, Hartge P, Lacey JV Jr, Park Y, Schatzkin A, Visvanathan K, Pfeiffer RM: Effect of changing breast cancer incidence rates on the calibration of the Gail model.

    J Clin Oncol 2010, 28:2411-2417.

  16. WHO expert consultation: Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies.

    Lancet 2004, 363(9403):157-163.

  17. Gail MH, Pfeiffer RM: On criteria for evaluating models of absolute risk.

    Biostatistics 2005, 6:227-239.

  18. Decarli A, Calza S, Masala G, Specchia C, Palli D, Gail MH: Gail Model for prediction of absolute risk of invasive breast cancer: independent evaluation in the Florence-European prospective investigation into cancer and nutrition cohort.

    J Natl Cancer Inst 2006, 98:1686-1693.

  19. Chia KS, Seow A, Lee HP, Shanmugaratnam K:

    Cancer Incidence in Singapore 1993-1997. Singapore Cancer Registry Report No. 5. 2000.

  20. Kwong A, Cheung PS, Wong AY, Hung GT, Lo G, Tsao M, Chan EW, Wong T, Ma M: The acceptance and feasibility of breast cancer screening in the East.

    Breast 2008, 17:42-50.

    Epub 2007 Aug 27

  21. Tam CY, Martin LJ, Hislop G, Hanley AJ, Minkin S, Boyd NF: Risk factors for breast cancer in postmenopausal Caucasian and Chinese-Canadian women.

    Breast Cancer Res 2010, 12:R2.

    Epub 2010 Jan 6

  22. Yeoh KG, Chew L, Wang SC: Cancer screening in Singapore, with particular reference to breast, cervical and colorectal cancer screening.

    J Med Screen 2006, 13(Suppl 1):S14-S19.

  23. Shin HR, Joubert C, Boniol M, Hery C, Ahn SH, Won YJ, Nishino Y, Sobue T, Chen CJ, You SL, Mirasol-Lumague MR, Law SC, Mang O, Xiang YB, Chia KS, Rattanamongkolgul S, Chen JG, Curado MP, Autier P: Recent trends and patterns in breast cancer incidence among Eastern and Southeastern Asian women.

    Canc Causes Contr 2010, 21:1777-1785.

  24. Chlebowski RT, Anderson GL, Lane DS, Aragaki AK, Rohan T, Yasmeen S, Sarto G, Rosenberg CA, Hubbell FA, Women's Health Initiative Investigators: Predicting risk of breast cancer in postmenopausal women by hormone receptor status.

    J Natl Cancer Inst 2007, 99:1695-1705.

  25. Matsuno RK, Costantino JP, Ziegler RG, Anderson GL, Li H, Pee D, Gail MH: Projecting Individualized Absolute Invasive Breast Cancer Risk in Asian and Pacific Islander American Women.

    J Natl Cancer Inst 2011, 103:951-961.

  26. Gail MH: Personalized estimates of breast cancer risk in clinical practice and public health.

    Stat Med 2011, 30:1090-1104.

  27. Bondy ML, Newman LA: Assessing breast cancer risk: evolution of the Gail Model.

    J Natl Cancer Inst 2006, 98:1172-1173.

  28. Steyerberg EW, Vickers AJ, Cook NR, Gerds T, Gonen M, Obuchowski N, Pencina MJ, Kattan MW: Assessing the performance of prediction models: a framework for traditional and novel measures.

    Epidemiology 2010, 21:128-138.

  29. Chen J, Pee D, Ayyagari R, Graubard B, Schairer C, Byrne C, Benichou J, Gail MH: Projecting absolute invasive breast cancer risk in white women with a model that includes mammographic density.

    J Natl Cancer Inst 2006, 98:1215-1226.

  30. Barlow WE, White E, Ballard-Barbash R, Vacek PM, Titus-Ernstoff L, Carney PA, Tice JA, Buist DS, Geller BM, Rosenberg R, Yankaskas BC, Kerlikowske K: Prospective breast cancer risk prediction model for women undergoing screening mammography.

    J Natl Cancer Inst 2006, 98:1204-1214.

  31. Boyd NF, Guo H, Martin LJ, Sun L, Stone J, Fishell E, Jong RA, Hislop G, Chiarelli A, Minkin S, Yaffe MJ: Mammographic density and the risk and detection of breast cancer.

    N Engl J Med 2007, 356:227-236.

  32. Jakes RW, Duffy SW, Ng FC, Gao F, Ng EH: Mammographic parenchymal patterns and risk of breast cancer at and after a prevalence screen in Singaporean women.

    Int J Epidemiol 2000, 29:11-19.

  33. Bennett RL, Sellars SJ, Moss SM: Interval cancers in the NHS breast cancer screening programme in England, Wales and Northern Ireland.

    Br J Cancer 2011, 104:571-577.

    Epub 2011 Feb 1
