BMC Medical Research Methodology - Latest Articles
http://www.biomedcentral.com/bmcmedresmethodol/
The latest research articles published by BMC Medical Research Methodology
2015-05-01T00:00:00Z

Efficiency and effectiveness evaluation of an automated multi-country patient count cohort system
Background:
With the increase in clinical trial costs over recent decades, the design of feasibility studies has become an essential process to reduce avoidable and costly protocol amendments. This design includes timelines, targeted sites and budget, together with a list of eligibility criteria that potential participants need to match. The present work was designed to assess the value of obtaining potential study participant counts using an automated patient count cohort system for large multi-country and multi-site trials: the Electronic Health Records for Clinical Research (EHR4CR) system.
Methods:
The evaluation focuses on the accuracy of the patient counts and the time invested to obtain them using the EHR4CR platform compared to the current questionnaire-based process. This evaluation will assess the patient counts from ten clinical trials at two different sites. In order to assess the accuracy of the results, the numbers obtained following the two processes need to be compared to a baseline number, the “alloyed” gold standard, which was produced by a manual check of patient records.
Results:
The patient counts obtained using the EHR4CR system were more accurate than those obtained following the current process in three of the evaluated trials, whereas in six other trials the current process counts were more accurate. In two of the trials, both processes had counts within the gold standard’s confidence interval. In terms of efficiency, the EHR4CR protocol feasibility system saved approximately seven calendar days in the process of obtaining patient counts compared to the current manual process.
Conclusions:
At the current stage, electronic health record data sources need to be enhanced with better structured data so that these can be re-used for research purposes. With this kind of data, systems such as the EHR4CR are able to provide accurate, objective patient counts more efficiently than the current methods. Additional research using both structured and unstructured data search technology is needed to assess the value of unstructured data and to compare the amount of effort needed for data preparation.
http://www.biomedcentral.com/1471-2288/15/44
Iñaki Soto-Rey, Benjamin Trinczek, Yannick Girardeau, Eric Zapletal, Nadir Ammour, Justin Doods, Martin Dugas, Fleur Fritz
BMC Medical Research Methodology 2015, 15:44
2015-05-01T00:00:00Z
doi:10.1186/s12874-015-0035-9

A Wild Bootstrap approach for the selection of biomarkers in early diagnostic trials
Background:
In early diagnostic trials, particularly in biomarker studies, the aim is often to select diagnostic tests among several methods. In case of metric, discrete, or even ordered categorical data, the area under the receiver operating characteristic (ROC) curve (denoted by AUC) is an appropriate overall accuracy measure for the selection, because the AUC is independent of cut-off points.
Methods:
For the selection of biomarkers, the individual AUCs are compared with a pre-defined threshold. To keep the overall coverage probability or the multiple type-I error rate, simultaneous confidence intervals and multiple contrast tests are considered. We propose a purely nonparametric approach for the estimation of the AUCs with the corresponding confidence intervals and statistical tests. This approach uses the correlation among the statistics to account for multiplicity. For small sample sizes, a Wild-Bootstrap approach is presented. It is shown that the corresponding intervals and tests are asymptotically exact.
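As an illustration of the nonparametric AUC estimate mentioned above, the following minimal sketch computes the Mann-Whitney form of the AUC, counting ties as one half. It shows only the point estimate that would be compared with a threshold; the simultaneous confidence intervals and the Wild-Bootstrap procedure are not implemented here. The function name and the biomarker values are hypothetical.

```python
def auc_mann_whitney(diseased, healthy):
    """Nonparametric AUC estimate: P(diseased > healthy) + 0.5 * P(tie)."""
    if not diseased or not healthy:
        raise ValueError("both groups need at least one observation")
    score = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                score += 1.0   # diseased value ranks above healthy value
            elif d == h:
                score += 0.5   # ties count as one half
    return score / (len(diseased) * len(healthy))


# Hypothetical biomarker values, for illustration only.
cases = [2.9, 3.4, 4.1, 3.8]
controls = [2.1, 3.0, 2.5, 3.4]
print(auc_mann_whitney(cases, controls))
```

Because the estimate depends only on the ordering of the observations, it applies to metric, discrete, and ordered categorical data alike.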
Results:
Extensive simulation studies indicate that the derived Wild-Bootstrap approach keeps and exploits the nominal type-I error rate best, even for high accuracies and small sample sizes. The strength of the correlation, the type of covariance structure, a skewed distribution, and a moderately imbalanced case-control ratio do not have any impact on the behavior of the approach. A real data set illustrates the application of the proposed methods.
Conclusion:
We recommend the new Wild Bootstrap approach for the selection of biomarkers in early diagnostic trials, especially for high accuracies and small sample sizes.
http://www.biomedcentral.com/1471-2288/15/43
Antonia Zapf, Edgar Brunner, Frank Konietschke
BMC Medical Research Methodology 2015, 15:43
2015-05-01T00:00:00Z
doi:10.1186/s12874-015-0025-y

Meta-analysis of incidence rate data in the presence of zero events
Background:
When summary results from studies of counts of events in time contain zeros, the study-specific incidence rate ratio (IRR) and its standard error cannot be calculated because the log of zero is undefined. This poses problems for the widely used inverse-variance method that weights the study-specific IRRs to generate a pooled estimate.
Methods:
We conducted a simulation study to compare the inverse-variance method of conducting a meta-analysis (with and without the continuity correction) with alternative methods based on either Poisson regression with fixed intervention effects or Poisson regression with random intervention effects. We manipulated the percentage of zeros in the intervention group (from no zeros to approximately 80 percent zeros), the levels of baseline variability and heterogeneity in the intervention effect, and the number of studies that comprise each meta-analysis. We applied these methods to an example from our own work in suicide prevention and to a recent meta-analysis of the effectiveness of condoms in preventing HIV transmission.
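To make the zero-event problem concrete, here is a minimal sketch of inverse-variance pooling of log incidence rate ratios. It assumes the usual large-sample standard error sqrt(1/a + 1/b) for the log IRR and a continuity correction of 0.5 added to both event counts when either arm has zero events; the abstract does not state which correction was used, so that value, the function names, and the study data are assumptions for illustration.

```python
import math


def log_irr(events_trt, time_trt, events_ctl, time_ctl, correction=0.5):
    """Return (log IRR, standard error) for one study.

    If either arm has zero events, `correction` is added to both event
    counts (an assumed convention) so that the logarithm is defined.
    """
    if events_trt == 0 or events_ctl == 0:
        if correction <= 0:
            raise ValueError("log IRR is undefined with zero events")
        events_trt += correction
        events_ctl += correction
    rate_ratio = (events_trt / time_trt) / (events_ctl / time_ctl)
    se = math.sqrt(1.0 / events_trt + 1.0 / events_ctl)
    return math.log(rate_ratio), se


def pool_inverse_variance(studies):
    """Fixed-effect inverse-variance pooled log IRR and its standard error."""
    estimates = [log_irr(*study) for study in studies]
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


# Hypothetical studies: (events_trt, time_trt, events_ctl, time_ctl).
studies = [(0, 120.0, 4, 115.0), (3, 200.0, 7, 210.0), (1, 90.0, 2, 95.0)]
pooled_log_irr, pooled_se = pool_inverse_variance(studies)
print(math.exp(pooled_log_irr), pooled_se)
```

The correction changes both the estimate and its weight for any study with a zero count, which is one route by which bias can enter the pooled result as the share of zeros grows.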
Results:
As the percentage of zeros in the data increased, the inverse-variance method of pooling data showed increased bias and reduced coverage. Estimates from Poisson regression with fixed intervention effects also displayed evidence of bias and poor coverage, due to their inability to account for heterogeneity. Pooled IRRs from Poisson regression with random intervention effects were unaffected by the percentage of zeros in the data or the amount of heterogeneity.
Conclusion:
Inverse-variance methods perform poorly when the data contain zeros in either the control or intervention arms. Methods based on Poisson regression with random effect terms for the variance components are very flexible and offer substantial improvement.
http://www.biomedcentral.com/1471-2288/15/42
Matthew Spittal, Jane Pirkis, Lyle Gurrin
BMC Medical Research Methodology 2015, 15:42
2015-04-30T00:00:00Z
doi:10.1186/s12874-015-0031-0

Dichotomisation using a distributional approach when the outcome is skewed
Background:
Dichotomisation of continuous outcomes has been rightly criticised by statisticians because of the loss of information incurred. However, to communicate a comparison of risks, dichotomised outcomes may be necessary. Peacock et al. developed a distributional approach to the dichotomisation of normally distributed outcomes, allowing the presentation of a comparison of proportions with a measure of precision which reflects the comparison of means. Many common health outcomes are skewed, so the distributional method for the dichotomisation of continuous outcomes may not apply.
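The core idea of the normal distributional approach can be sketched as follows: rather than counting observations below a cut-off, the proportion is derived from the fitted normal distribution. This is a minimal sketch that assumes two groups sharing a common standard deviation; it shows point estimates only, not the precision measure or the skew-normal extension, and the function names and birthweight figures are hypothetical.

```python
from statistics import NormalDist


def proportion_below(mean, sd, cutoff):
    """Proportion of a normal(mean, sd) population falling below `cutoff`."""
    return NormalDist(mu=mean, sigma=sd).cdf(cutoff)


def difference_in_proportions(mean_a, mean_b, sd, cutoff):
    """Difference in proportions below `cutoff`, assuming a common SD."""
    return proportion_below(mean_a, sd, cutoff) - proportion_below(mean_b, sd, cutoff)


# Hypothetical birthweight example (grams), for illustration only.
print(proportion_below(3400.0, 500.0, 2500.0))
print(difference_in_proportions(3300.0, 3400.0, 500.0, 2500.0))
```

Because the proportions are functions of the means and the standard deviation, the comparison of proportions inherits its information from the comparison of means.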
Methods:
We present a methodology to obtain dichotomised outcomes for skewed variables, illustrated with data from several observational studies. We also report the results of a simulation study which tests the robustness of the method to deviation from normality and assesses the validity of the newly developed method.
Results:
The review showed that the pattern of dichotomisation varied between outcomes. Birthweight, blood pressure and BMI can either be transformed to normal, so that normal distributional estimates for a comparison of proportions can be obtained, or, better, the skew-normal method can be used. For gestational age, no satisfactory transformation is available and only the skew-normal method is reliable. The normal distributional method is also reliable when there are small deviations from normality.
Conclusions:
The distributional method with its applicability for common skewed data allows researchers to provide both continuous and dichotomised estimates without losing information or precision. This will have the effect of providing a practical understanding of the difference in means in terms of proportions.
http://www.biomedcentral.com/1471-2288/15/40
Odile Sauzet, Mercy Ofuya, Janet Peacock
BMC Medical Research Methodology 2015, 15:40
2015-04-24T12:00:00Z
doi:10.1186/s12874-015-0028-8

Can obtaining informed consent alter self-reported drinking behaviour? A methodological experiment
Background:
Informed consent is the foundation of the ethical conduct of health research. Obtaining informed consent may unwittingly interfere with the data collected in research studies, particularly if they concern sensitive behaviours that participants are requested to report on. To address gaps in evidence on such research participation effects, we conducted a methodological experiment evaluating the impact of the informed consent procedure on participants’ reporting behaviour, specifically on their self-report of drinking behaviour as measured by the Alcohol Use Disorders Identification Test (AUDIT).
Methods:
A two-arm, double-blinded randomised controlled trial was used. University students present in London student unions at the time of recruitment were contacted in two phases (an initial run-in phase followed by the main phase). Those providing positive responses to three verbal questions were recruited: 1) “are you a student?”; 2) “do you drink alcohol?”; 3) “would you like to take part in a brief health survey, which will take around 5 minutes?”. Participants received one of two envelopes by chance, with the sequence generated by an online random sequence generator. One contained the participant information sheet, informed consent form and the AUDIT questionnaire (the intervention group), while the other contained only the AUDIT questionnaire (the comparator group). The primary outcome was the mean AUDIT score, which ranges from 0 to 40. The secondary outcome was the proportion of participants in each group scoring 8 or more on the AUDIT, the threshold score for hazardous and harmful drinking warranting intervention.
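The two outcomes defined above can be computed per group as in the following minimal sketch. The score range (0 to 40) and the threshold of 8 come from the abstract; the function name and the example scores are hypothetical.

```python
from statistics import mean

AUDIT_MIN, AUDIT_MAX = 0, 40   # score range stated in the abstract
HAZARDOUS_THRESHOLD = 8        # scores of 8 or more indicate hazardous drinking


def audit_outcomes(scores):
    """Return (mean AUDIT score, proportion scoring at or above the threshold)."""
    if any(s < AUDIT_MIN or s > AUDIT_MAX for s in scores):
        raise ValueError("AUDIT scores must lie between 0 and 40")
    proportion = sum(s >= HAZARDOUS_THRESHOLD for s in scores) / len(scores)
    return mean(scores), proportion


# Hypothetical scores for one group, for illustration only.
print(audit_outcomes([3, 12, 8, 5, 21, 7]))
```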
Results:
A total of 380 participants were successfully recruited, resulting in 190 participants in each group, of whom 378 were included in the final analysis. There was no evidence of any statistically significant difference between groups in the primary outcome. A statistically significant difference in the secondary outcome was found in the run-in phase only, and not in the main phase or overall. Moreover, between-group outcome differences between the two phases suggest an important influence of setting on reporting behaviour.
Conclusions:
There is no strong evidence that completion of informed consent itself alters self-reporting behaviour with regard to alcohol, though the effect of setting needs to be studied further.
http://www.biomedcentral.com/1471-2288/15/41
Lambert Felix, Patrick Keating, Jim McCambridge
BMC Medical Research Methodology 2015, 15:41
2015-04-24T00:00:00Z
doi:10.1186/s12874-015-0032-z

Bayesian estimation of a cancer population by capture-recapture with individual capture heterogeneity and small sample
Background:
Cancer incidence and prevalence estimates are necessary to inform health policy, to predict public health impact and to identify etiological factors. Registers have been used to estimate the number of cancer cases. To be reliable and useful, cancer registry data should be complete. Capture-recapture is a method for estimating the number of cases missed, originally developed in ecology to estimate the size of animal populations. Capture-recapture methods in cancer epidemiology involve modelling the overlap between lists of individuals using log-linear models. These models rely on the assumptions of independence of sources and equal catchability between individuals, which are unlikely to be satisfied in a cancer population, as severe cases are more likely to be captured than simple cases.
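The basic capture-recapture idea can be illustrated with the simplest two-list case. The sketch below uses Chapman's bias-corrected version of the Lincoln-Petersen estimator, which is a deliberately simpler technique than the log-linear or Mth models discussed in this article and relies on exactly the independence and equal-catchability assumptions that the article questions. The function names and counts are hypothetical.

```python
def chapman_estimate(n_list1, n_list2, n_both):
    """Chapman's two-list estimate of total population size.

    n_list1, n_list2: number of cases found on each list
    n_both: number of cases found on both lists
    Assumes independent lists and equal capture probability for all cases.
    """
    if n_both > min(n_list1, n_list2):
        raise ValueError("overlap cannot exceed the size of either list")
    return (n_list1 + 1) * (n_list2 + 1) / (n_both + 1) - 1


def completeness(n_registered, n_estimated):
    """Share of the estimated population that a register has captured."""
    return n_registered / n_estimated


# Hypothetical counts, for illustration only.
total = chapman_estimate(n_list1=300, n_list2=250, n_both=200)
print(total, completeness(300, total))
```

When severe cases are more likely to appear on every list, the overlap is inflated and an estimator of this kind tends to understate the total, which motivates models that allow individual heterogeneity.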
Methods:
To estimate the cancer population and the completeness of the cancer registry, we applied Mth models, which rely on parameters that influence capture such as time of capture (t) and individual heterogeneity (h), and compared the results with those obtained with classical log-linear models and the sample coverage approach. For three sources collecting breast and colorectal cancer cases (Histopathological cancer registry, hospital Multidisciplinary Team Meetings, and cancer screening programmes), individual heterogeneity is suspected in the cancer population due to age, gender, screening history or presence of metastases. Individual heterogeneity is difficult to analyse, as classical log-linear models usually pool it with between-“list” dependence. We applied Bayesian Model Averaging, which can be used with small samples without asymptotic assumptions, unlike the maximum likelihood estimation procedure.
Results:
Cancer population estimates were based on the results of the Mh model, with an averaged estimate of 803 cases of breast cancer and 521 cases of colorectal cancer. In the log-linear model, the estimates were 791 cases of breast cancer and 527 cases of colorectal cancer according to the retained models (729 and 481 histological cases, respectively).
Conclusions:
We applied Mth models and Bayesian population estimation to a small sample of a cancer population. An advantage of Mth models applied to cancer datasets is the ability to explore individual factors associated with capture heterogeneity, as the assumption of equal capture probability is unlikely to hold. Mth models and Bayesian population estimation are well suited for capture-recapture in a heterogeneous cancer population.
http://www.biomedcentral.com/1471-2288/15/39
Laurent Bailly, Jean Daurès, Brigitte Dunais, Christian Pradier
BMC Medical Research Methodology 2015, 15:39
2015-04-24T00:00:00Z
doi:10.1186/s12874-015-0029-7

Comparing denominator degrees of freedom approximations for the generalized linear mixed model in analyzing binary outcome in small sample cluster-randomized trials
Background:
A small number of clusters and large variation in cluster sizes are common in cluster-randomized trials (CRTs) and are often the critical factors affecting the validity and efficiency of statistical analyses. F tests are commonly used in the generalized linear mixed model (GLMM) to test intervention effects in CRTs. The most challenging issue for the approximate Wald F test is the estimation of the denominator degrees of freedom (DDF). Some DDF approximation methods have been proposed, but their small sample performances in analysing binary outcomes in CRTs with few heterogeneous clusters are not well studied.
Methods:
The small sample performances of five DDF approximations for the F test are compared and contrasted under CRT frameworks with simulations. Specifically, we illustrate how the intraclass correlation (ICC), sample size, and the variation of cluster sizes affect the type I error and statistical power when different DDF approximation methods in GLMM are used to test intervention effect in CRTs with binary outcomes. The results are also illustrated using a real CRT dataset.
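A simulation of this kind needs clustered binary data with a chosen ICC and unequal cluster sizes. The following minimal sketch generates such data from a random-intercept logistic model, assuming the latent-scale definition ICC = sigma^2 / (sigma^2 + pi^2 / 3); the abstract does not say which ICC definition or data-generating model the authors used, so these choices, the alternating arm allocation, and all names are assumptions. Fitting the GLMM and applying the DDF approximations is not shown.

```python
import math
import random


def random_intercept_variance(icc):
    """Random-intercept variance giving the requested latent-scale ICC."""
    if not 0.0 <= icc < 1.0:
        raise ValueError("icc must be in [0, 1)")
    return icc / (1.0 - icc) * math.pi ** 2 / 3.0


def simulate_crt(cluster_sizes, icc, baseline_logit=0.0, effect=0.0, seed=0):
    """Return rows (cluster_id, arm, outcome) for one simulated CRT.

    Clusters are allocated to arms alternately (a simplification);
    `effect` is the intervention effect on the log-odds scale.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(random_intercept_variance(icc))
    rows = []
    for cluster_id, size in enumerate(cluster_sizes):
        arm = cluster_id % 2
        cluster_effect = rng.gauss(0.0, sigma)   # shared by all members
        logit = baseline_logit + effect * arm + cluster_effect
        prob = 1.0 / (1.0 + math.exp(-logit))
        for _ in range(size):
            rows.append((cluster_id, arm, 1 if rng.random() < prob else 0))
    return rows


# Ten hypothetical clusters of unequal size, null intervention effect.
data = simulate_crt([12, 40, 8, 25, 60, 15, 33, 9, 50, 20], icc=0.05)
print(len(data), sum(outcome for _, _, outcome in data))
```

Repeating this generation many times under a null effect and recording how often each test rejects is the usual way to estimate a type I error rate.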
Results:
Our simulation results suggest that the Between-Within method maintains the nominal type I error rates even when the total number of clusters is as low as 10 and is robust to variation in cluster sizes. The Residual and Containment methods have inflated type I error rates when the cluster number is small (<30), and the inflation becomes more severe with increased variation in cluster sizes. In contrast, the Satterthwaite and Kenward-Roger methods can provide tests with very conservative type I error rates when the total cluster number is small (<30), and the conservativeness becomes more severe as variation in cluster sizes increases. Our simulations also suggest that the Between-Within method is statistically more powerful than the Satterthwaite or Kenward-Roger method in analysing CRTs with heterogeneous cluster sizes, especially when the cluster number is small.
Conclusion:
We conclude that the Between-Within denominator degrees of freedom approximation method for F tests should be recommended when the GLMM is used in analysing CRTs with binary outcomes and few heterogeneous clusters, due to its type I error properties and relatively higher power.
http://www.biomedcentral.com/1471-2288/15/38
Peng Li, David Redden
BMC Medical Research Methodology 2015, 15:38
2015-04-23T12:00:00Z
doi:10.1186/s12874-015-0026-x

Permutation-based variance component test in generalized linear mixed model with application to multilocus genetic association study
Background:
In many medical studies the likelihood ratio test (LRT) has been widely applied to examine whether the random effects variance component is zero within the mixed effects models framework. However, little work has been done on likelihood-ratio-based variance component tests in generalized linear mixed models (GLMM), where the response is discrete and the log-likelihood cannot be computed exactly. Before applying the LRT for the variance component in GLMM, several difficulties need to be overcome, including the computation of the log-likelihood, the parameter estimation and the derivation of the null distribution for the LRT statistic.
Methods:
To overcome these problems, in this paper we make use of the penalized quasi-likelihood algorithm and calculate the LRT statistic based on the resulting working response and the quasi-likelihood. A permutation procedure is used to obtain the null distribution of the LRT statistic. We evaluate the permutation-based LRT via simulations and compare it with the score-based variance component test and the tests based on the mixture of chi-square distributions. Finally, we apply the permutation-based LRT to multilocus association analysis in a case-control study, where the problem can be investigated under the framework of a logistic mixed effects model.
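The permutation step can be sketched generically: a test statistic is recomputed on data with shuffled labels to build an empirical null distribution. The sketch below is not the authors' procedure; it uses an absolute difference in group means as a stand-in for the LRT statistic, which in the article is computed from the penalized quasi-likelihood fit. All names and data are hypothetical.

```python
import random


def permutation_p_value(statistic, values, labels, n_perm=999, seed=0):
    """One-sided permutation p-value for `statistic(values, labels)`.

    Labels are shuffled `n_perm` times; the p-value is the share of
    permuted statistics at least as large as the observed one, with the
    observed statistic counted once so the p-value is never zero.
    """
    rng = random.Random(seed)
    observed = statistic(values, labels)
    shuffled = list(labels)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(values, shuffled) >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_perm + 1)


def abs_mean_difference(values, labels):
    """Stand-in statistic: absolute difference in means between labels 1 and 0."""
    group1 = [v for v, g in zip(values, labels) if g == 1]
    group0 = [v for v, g in zip(values, labels) if g == 0]
    return abs(sum(group1) / len(group1) - sum(group0) / len(group0))


# Hypothetical data, for illustration only.
values = [1, 2, 3, 4, 5, 11, 12, 13, 14, 15]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(permutation_p_value(abs_mean_difference, values, labels))
```

Because the null distribution is built from the data themselves, no asymptotic distribution for the statistic has to be derived.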
Results:
The simulations show that the permutation-based LRT can effectively control the type I error rate, while the score test is sometimes slightly conservative and the tests based on mixtures cannot maintain the type I error rate. Our studies also show that the permutation-based LRT has higher power than these existing tests and still maintains reasonably high power even when the random effects do not follow a normal distribution. The application to GAW17 data also demonstrates that the proposed LRT has a higher probability of identifying the association signals than the score test and the tests based on mixtures.
Conclusions:
In the present paper the permutation-based LRT was developed for the variance component in GLMM. The LRT outperforms existing tests and has reasonably high power under various scenarios; additionally, it is conceptually simple and easy to implement.
http://www.biomedcentral.com/1471-2288/15/37
Ping Zeng, Yang Zhao, Hongliang Li, Ting Wang, Feng Chen
BMC Medical Research Methodology 2015, 15:37
2015-04-22T12:00:00Z
doi:10.1186/s12874-015-0030-1

BMC Medical Research Methodology reviewer acknowledgement 2014
Contributing reviewers
The editors of BMC Medical Research Methodology would like to thank all our reviewers who have contributed to the journal in Volume 14 (2014).
http://www.biomedcentral.com/1471-2288/15/36
Giulia Mangiameli
BMC Medical Research Methodology 2015, 15:36
2015-04-15T00:00:00Z
doi:10.1186/s12874-015-0003-4

The heterogeneity statistic I² can be biased in small meta-analyses
Background:
Estimated effects vary across studies, partly because of random sampling error and partly because of heterogeneity. In meta-analysis, the fraction of variance that is due to heterogeneity is estimated by the statistic I². We calculate the bias of I², focusing on the situation where the number of studies in the meta-analysis is small. Small meta-analyses are common; in the Cochrane Library, the median number of studies per meta-analysis is 7 or fewer.
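For orientation, the sketch below computes I² from Cochran's Q using the standard definition I² = max(0, (Q − df) / Q), which the abstract does not spell out and is assumed here. It also estimates the mean of I² by simulation when there is no true heterogeneity, under the simplifying assumption of equal, known within-study variances. This is a Monte Carlo check, not the exact Mathematica calculation the article describes, and the function names are hypothetical.

```python
import random


def i_squared(effects, variances):
    """I-squared from Cochran's Q with inverse-variance weights, truncated at 0."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q)


def mean_i_squared_no_heterogeneity(n_studies, n_sim=5000, seed=0):
    """Average I-squared over simulated meta-analyses with no true heterogeneity.

    Every study estimates the same true effect (0) with variance 1, so any
    positive I-squared reflects sampling error alone.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        effects = [rng.gauss(0.0, 1.0) for _ in range(n_studies)]
        total += i_squared(effects, [1.0] * n_studies)
    return total / n_sim


print(mean_i_squared_no_heterogeneity(7))
```

The truncation at zero is what makes the average positive when the true value is zero: negative values of (Q − df) / Q are replaced by 0, while positive values are kept.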
Methods:
We use Mathematica software to calculate the expectation and bias of I².
Results:
I² has a substantial bias when the number of studies is small. The bias is positive when the true fraction of heterogeneity is small, but the bias is typically negative when the true fraction of heterogeneity is large. For example, with 7 studies and no true heterogeneity, I² will overestimate heterogeneity by an average of 12 percentage points, but with 7 studies and 80 percent true heterogeneity, I² can underestimate heterogeneity by an average of 28 percentage points. Biases of 12–28 percentage points are not trivial when one considers that, in the Cochrane Library, the median I² estimate is 21 percent.
Conclusions:
The point estimate I² should be interpreted cautiously when a meta-analysis has few studies. In small meta-analyses, confidence intervals should supplement or replace the biased point estimate I².
http://www.biomedcentral.com/1471-2288/15/35
Paul von Hippel
BMC Medical Research Methodology 2015, 15:35
2015-04-14T00:00:00Z
doi:10.1186/s12874-015-0024-z