Fragility Index (Walsh et al, 2014)

Author: Birinder Giddey
Reviewer: Chris Nickson

Journal Club 019

Walsh M, Srinathan SK, McAuley DF, et al. The statistical significance of randomized controlled trial results is frequently fragile: a case for a Fragility Index. Journal of clinical epidemiology. 67(6):622-8. 2014. [pubmed] [free full text]


  • The Fragility Index is the minimum number of patients whose status would have to change from non-event to event for a statistically significant result to become non-significant (P ≥ 0.05). Could this metric be a useful interpretive tool for assessing the “fragility” of the results of a randomised controlled trial (RCT)?




  • Post-hoc analysis of selected RCTs published in major medical journals


  • 1273 abstracts were reviewed for eligibility
  • 399 published trials met eligibility criteria
  • Inclusion criteria for RCTs:
    • Published in selected major medical journals: NEJM, The Lancet, JAMA, Annals of Internal Medicine and the BMJ
    • Published between Jan 2004 and Dec 2010
    • Parallel arm or two by two factorial design RCTs
    • 1:1 ratio to intervention and control
    • At least one dichotomous or time-to-event outcome reported as significant (P < 0.05 or 95% CI excluding the null value)
  • Exclusion criteria
    • Non-inferiority trials


  • Fragility Index was calculated as follows:
    • The results of each trial were represented in a two-by-two contingency table. In the group with the smaller number of events, one non-event was changed to an event (adding an event and subtracting a non-event so that the group size stayed constant).
    • The P-value was recalculated using Fisher’s Exact Test
    • This was repeated until the P-value was ≥ 0.05
    • The number of additional events required was called the Fragility Index
  • Fragility Index was analyzed as follows:
    • correlated with trial characteristics, including sample size and total number of outcomes
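
The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study authors' code: the function names, the hand-rolled two-sided Fisher's exact test (summing the probabilities of all tables no more likely than the observed one) and the example counts are assumptions made for this sketch.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]].

    Rows are the trial arms; columns are events and non-events.
    """
    row1, row2 = a + b, c + d
    col1 = a + c  # total events across both arms
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in arm 1 with fixed margins
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more likely than the observed one
    # (the small tolerance guards against floating-point ties)
    p = sum(q for q in map(prob, range(lo, hi + 1))
            if q <= p_observed * (1 + 1e-7))
    return min(p, 1.0)


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Number of non-event-to-event flips needed to reach P >= alpha.

    Returns None if the result is not significant to begin with
    (an assumption of this sketch; the index is only described for
    significant results).
    """
    # Flips are made in the arm with fewer events
    if events_1 > events_2:
        events_1, n_1, events_2, n_2 = events_2, n_2, events_1, n_1

    def p_value(e1):
        return fisher_exact_two_sided(e1, n_1 - e1,
                                      events_2, n_2 - events_2)

    if p_value(events_1) >= alpha:
        return None

    flips = 0
    while events_1 + flips < n_1 and p_value(events_1 + flips) < alpha:
        flips += 1
    return flips


# Hypothetical trial: 0/10 events in one arm, 5/10 in the other
print(fragility_index(0, 10, 5, 10))  # prints 1
```

In this made-up example the original P-value is about 0.03, and changing a single patient from non-event to event raises it to about 0.14, so the Fragility Index is 1.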


  • Trial characteristics:
    • Median sample size: 682
    • Median number of events: 112
    • Median Fragility Index = 8
    • 25% of trials had a Fragility Index ≤ 3
  • Trials with higher Fragility Index (i.e. ‘less fragile’) had:
    • smaller P-values
    • larger number of events
    • larger sample size
  • Trials with lower Fragility Index (i.e. ‘more fragile’) had:
    • poor or unclear allocation concealment




  • The Fragility Index is a simple metric that encompasses important trial characteristics such as sample size and the event rate (and hence study power)
  • The Fragility Index may be useful because many clinicians are unlikely to have substantial training in probability and statistics, and incorrect interpretation of P-values and confidence intervals appears to be widespread.
  • Fragility Index may identify trials at high risk of ‘medical reversal’ when further studies of the same intervention are performed.
    • The study authors give the example of the LIMIT-2 study published in the Lancet in 1992, which found improved 28-day survival from IV magnesium after acute MI with p=0.04. However, the Fragility Index was only 1. Three years later, ISIS-4 was published. This study had a larger sample size (58,000 patients) and a mortality benefit was no longer found.
  • A paper recently published in CCM found that the Fragility Index of ICU trials is low (median of 2, with 40% of trials having a Fragility Index of 1 or less!), suggesting that much of our evidence base is weak.
  • The results of a trial should be viewed with particular skepticism if loss to follow-up exceeds the Fragility Index, as the unknown outcomes of the patients lost could be enough to overturn the significant result


  • Fragility Index has limitations:
    • applies to trials with 1:1 randomisation
    • cannot be applied to continuous data, requires dichotomous outcomes
    • use in time-to-event analysis may not be appropriate.
  • Could this simply be another tool for clinicians to use improperly?
  • Will the Fragility Index really help swing the balance of belief, or will clinicians just interpret results according to their pre-existing biases?


  • Clinicians are at risk of over-interpreting the significance of RCT findings when their results hinge on the occurrence of very few events. The Fragility Index promises to be a useful tool to guard against this, by indicating the number of events required to make a statistically significant result nonsignificant.


  • LITFL CCC — Fragility Index
  • Ridgeon EE, Young PJ, Bellomo R, Mucchetti M, Lembo R, Landoni G. The Fragility Index in Multicenter Randomized Controlled Critical Care Trials. Critical care medicine. 2016. [pubmed]
