TY - JOUR
T1 - Sample size of surgical randomized controlled trials
T2 - a lack of improvement over time
AU - Ahmed Ali, Usama
AU - ten Hove, Joren R.
AU - Reiber, Beata M.
AU - van der Sluis, Pieter C.
AU - Besselink, Marc G.
N1 - Publisher Copyright: © 2018 Elsevier Inc. Copyright 2019 Elsevier B.V., All rights reserved.
PY - 2018/8/1
Y1 - 2018/8/1
N2 - Background: Interpretation of randomized controlled trials (RCTs) without a significant difference regarding the primary outcome (negative RCTs) is frequently challenging because of concerns about sample size and thus statistical power. We aimed to assess the adequacy of sample size and corresponding power of surgical RCTs. Methods: We previously identified all surgical RCTs available in PubMed in two distinct years a decade apart (1999 and 2009). For all “negative” trials, we estimated whether the sample size of the trial was appropriate to detect a difference in the primary outcome measure. The main outcome measure was a sufficient sample size to detect large, medium, and small treatment effects. We also performed a post hoc power analysis based on the actual observed effect difference. Results: A total of 228 negative RCTs (74 in 1999 and 121 in 2009) were included. The median sample size was 76 (± 222) and 80 (± 163) in 1999 and 2009, respectively. Reporting of a sample size calculation increased from 40% in 1999 to 54% in 2009 (P = 0.02). The proportion of studies adequately powered to detect large (57% versus 68%), medium (26% versus 25%), or small (8% versus 7%) differences did not differ significantly between 1999 and 2009, respectively. To reach sufficient power, the required increases in sample size were 130%, 240%, and 1032% for large, medium, and small differences, respectively. Reporting a sample size calculation was the only independent predictor of adequate power. Conclusions: Despite slight improvement in the reporting of a sample size calculation, about a third of surgical trials remain underpowered to demonstrate differences that are likely to be clinically significant. Increased attention from researchers, medical ethics boards, and journal editors is required to reduce potentially wasted resources on underpowered trials.
KW - Power
KW - Sample size
KW - Trials
UR - http://www.scopus.com/inward/record.url?scp=85044109201&partnerID=8YFLogxK
U2 - 10.1016/j.jss.2018.02.014
DO - 10.1016/j.jss.2018.02.014
M3 - Article
AN - SCOPUS:85044109201
SN - 0022-4804
VL - 228
SP - 1
EP - 7
JO - Journal of Surgical Research
JF - Journal of Surgical Research
ER -