A perfect trial would require minimal statistical tools to assist in its analysis. Such a trial would be so large that the sample approached the true likeness of the broader population it intended to emulate, and the risk of sampling error would be minimal. The confidence intervals surrounding the point estimates would be so narrow that one could simply view the results laid bare of any statistical judgment. But such trials are logistical impossibilities. And so we are forced to lean on statistics in order to quantify the potential influence sampling error has on the observed results. At times these very tools, intended to add clarity, serve only to further distort our interpretation of the underlying truth.
Such is the case of the IRIS (Sellick Interest in Rapid Sequence Induction) trial. Published in JAMA Surgery, Birenbaum et al examined the use of cricoid pressure in surgical patients undergoing rapid sequence induction (RSI) (1). At 10 centers across France, the authors enrolled 3,471 adult patients requiring endotracheal intubation for any type of surgical procedure who required RSI because they were considered to have a full stomach (< 6 hours of fasting) or had at least 1 risk factor for pulmonary aspiration. Patients were randomized to receive either true cricoid pressure, defined as an expected pressure equivalent to 30 newtons applied using the first 3 fingers on the cricoid cartilage, or a sham procedure. Operators were trained on the correct application of the procedure before being permitted to participate in the study.
While this study was technically negative, the authors reported that the sham cricoid procedure failed to demonstrate non-inferiority when compared to the true Sellick technique. This is most likely an error in the authors' statistical analysis rather than any clinical benefit of cricoid pressure. The rate of the primary endpoint, pulmonary aspiration (detected at the glottis level during laryngoscopy or by tracheal aspiration just after tracheal intubation), was essentially identical in both groups. It occurred in 10 patients (0.6%) in the Sellick group and in 9 patients (0.5%) in the sham group. The rates of suspected pneumonia within 24 hours of intubation (0.9% vs 0.6%), aspiration pneumonia (0.2% vs 0.2%), and severe pneumonia (0.1% vs 0.1%) were similarly close.
The only element that was noticeably different between the groups was the difficulty of intubation. Patients randomized to the cricoid pressure group had a higher incidence of grade 3 and 4 Cormack and Lehane views. In addition, the maneuver was interrupted more frequently in the cricoid pressure group, and releasing the pressure more often improved the view. The cricoid pressure group required longer times to intubation and more frequently experienced intubations exceeding 30 seconds. And while the incidence of difficult tracheal intubation did not reach statistical significance, it was numerically higher in the cricoid pressure group (72 vs 51).
From a frequentist perspective this was a negative trial, owing to the form of hypothesis testing used in the primary analysis. The authors used a non-inferiority trial design, which asks a different question than the traditional superiority trials to which we are accustomed. Rather than presenting a null hypothesis that states there is no difference between the groups, the non-inferiority design operates under the assumption that the novel intervention is inferior to the standard treatment. The alternative hypothesis states that the novel intervention is no worse than the standard treatment by more than a prespecified margin. In order to reject the null hypothesis, the novel treatment must demonstrate near-equivalent efficacy within a degree of certainty. This means that both the point estimate and the surrounding 95% confidence interval must fall on the favorable side of an a priori selected non-inferiority margin (2,3). In this case the authors designated the true cricoid pressure group as the established approach and the sham maneuver as the novel comparator. In doing so, Birenbaum et al designed a trial in which cricoid pressure could not fail: at worst, the sham group would be found to be non-inferior to the traditional approach. The authors chose a non-inferiority margin of 50% worse than the cricoid group, or a relative risk of 1.5. Based on previous literature, they predicted the primary endpoint would occur in 2.8% of patients in the cricoid pressure arm, meaning the sham group could have an aspiration rate no greater than 4.2% and still be considered non-inferior.
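The margin arithmetic and the decision rule described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the only inputs are figures quoted in this post (the 1.5 relative risk margin, the anticipated 2.8% event rate, the observed 0.6% rate in the cricoid arm, and the upper bound of the reported confidence interval).

```python
# Relative non-inferiority margin chosen by the trial authors (relative risk of 1.5).
MARGIN_RR = 1.5


def max_acceptable_rate(reference_rate: float, margin_rr: float = MARGIN_RR) -> float:
    """Event rate in the novel arm that sits exactly at the margin (hypothetical helper)."""
    return reference_rate * margin_rr


def is_noninferior(ci_upper_rr: float, margin_rr: float = MARGIN_RR) -> bool:
    """Non-inferiority is declared only if the upper CI bound of the RR stays below the margin."""
    return ci_upper_rr < margin_rr


planned = max_acceptable_rate(0.028)   # anticipated 2.8% aspiration rate
observed = max_acceptable_rate(0.006)  # 0.6% actually seen in the cricoid pressure arm

print(f"threshold at the anticipated rate: {planned:.1%}")
print(f"threshold at the observed rate:    {observed:.1%}")
print("non-inferior given reported upper bound of 2.38:", is_noninferior(2.38))
```

Because the margin is relative, the absolute room it allows shrinks along with the event rate: the same 1.5 multiplier that permitted 4.2% at the planning stage permits less than 1% once the reference rate is 0.6%.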
The actual rate of aspiration events in the cricoid pressure group was far lower than the authors anticipated (0.6%). The rate of aspiration in the sham group was numerically lower at 0.5% and clinically equivalent, but because aspiration events were so scarce, the confidence interval surrounding this outcome was wider than anticipated. Although the relative risk of aspiration fell in favor of the sham group at 0.90, the confidence interval surrounding this point estimate crossed the non-inferiority margin of 1.5 (95% CI, 0.33-2.38).
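To see why so few events produce such a wide interval, here is a rough sketch of a relative risk with a Wald interval on the log scale. Two things are assumptions rather than facts from this post: the arm sizes (a near-even split of the 3,471 enrolled patients) and the interval method itself, which may not be the one the authors used. The result should therefore be expected to approximate, not reproduce, the published 0.33-2.38.

```python
import math


def relative_risk_ci(events_new, n_new, events_ref, n_ref, z=1.96):
    """Relative risk (new vs reference) with a Wald interval computed on the log scale."""
    rr = (events_new / n_new) / (events_ref / n_ref)
    # Standard error of log(RR); it is dominated by the 1/events terms when events are rare.
    se_log_rr = math.sqrt(1 / events_new - 1 / n_new + 1 / events_ref - 1 / n_ref)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# ASSUMPTION: arm sizes are not given in this post; a near-even split of 3,471 is used.
N_SHAM, N_CRICOID = 1736, 1735

# 9 aspiration events in the sham arm vs 10 in the cricoid pressure arm.
rr, lower, upper = relative_risk_ci(9, N_SHAM, 10, N_CRICOID)

print(f"RR = {rr:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
print("upper bound exceeds the 1.5 margin:", upper > 1.5)
```

With only 19 events in total, the standard error is driven almost entirely by the 1/9 and 1/10 terms, so even a point estimate below 1 comes with an upper bound well above the margin.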
We utilize Frequentist statistics as a tool to estimate the risk of sampling error in any given cohort. But at times its single-minded dichotomous temperament limits our ability to interpret what is otherwise in plain sight. In the case of the IRIS trial, from a Frequentist perspective, we are unable to demonstrate the non-inferiority of sham cricoid pressure. But a simple inspection of the results demonstrates that true aspiration events are uncommon, and cricoid pressure fails to prevent them. In addition, in our futile attempts to prevent this rarity we actively thwart our own efforts at securing an airway. The applicability of these results to the ED or ICU patient population, where the risk of aspiration is much higher, is unclear. But given the obvious harm demonstrated in this study, the onus should now fall on us as clinicians to demonstrate the utility of cricoid pressure. Until then the maneuver should be considered nothing more than a melancholy specimen of medical nostalgia.
Sources Cited
- Birenbaum A, Hajage D, Roche S, et al. Effect of Cricoid Pressure Compared With a Sham Procedure in the Rapid Sequence Induction of Anesthesia: The IRIS Randomized Clinical Trial. JAMA Surg. Published online October 17, 2018. doi:10.1001/jamasurg.2018.3577
- Kaji AH, Lewis RJ. Noninferiority Trials: Is a New Treatment Almost as Effective as Another? JAMA. 2015;313(23):2371-2.
- Kaul S, Diamond GA. Good Enough: A Primer on the Analysis and Interpretation of Noninferiority Trials. Ann Intern Med. 2006;145:62-69.
Thank you for writing this article. I think it will go some way to correcting the misconceptions around this study, which is already being widely reported on social media as showing a positive result. There is one error in your summary of the paper which is actually very important. You state that the study included unfasted (full-stomach) patients; that’s true, but the study also enrolled patients who were fasted and having elective surgery but had one risk factor for aspiration. These risk factors were defined so broadly that, at least in Australia, many of these patients would not have…
Thanks for your thoughtful response, Jon. I have updated the inclusion criteria to reflect what you correctly stated. I will have to disagree with you on your second point. Yes, if you view evidence strictly from a Frequentist perspective then you are restricted to viewing these results as negative. But some of us are not prisoners of the dichotomous analysis one is restricted to when utilizing Frequentist analyses. Rather, if you view these results from a Bayesian perspective they appear far different. Bayesian statistics do not seek to make a dichotomous declaration that leads to the rejection or acceptance of a…
Thanks for your reply, Rory. The thing I dispute is the presumption that we can just ‘eyeball’ this data and draw your conclusion. The Bayesian methodology does not free us from the requirement to use statistical tests. Quite the opposite: “The Bayesian approach to inference is characterized by the explicit use of probability distributions to draw inferences” and “provides the mathematical machinery needed to combine the information in these two distributions to obtain the posterior distribution…” (1). “Bayes factors provide a coherent approach to determining whether non-significant results support a null hypothesis over a theory, or whether the data…
I doubt meta-analysis is a useful next step – it would be the first step to ‘meta-inflation’ (when the number of meta-analyses exceeds the number of useful trials…). Overall, I agree with Rory. The evidence base is poor, the situation uncertain, and there is nothing compelling to persuade people at either end of the spectrum of belief. I am not a proponent of cricoid pressure, but I’m not going to get upset if people perform it as part of a reasoned approach. I encourage people with opposing views to do the same when people choose not to perform this procedure.…
The Bayes Factor (or likelihood ratio) will be very, very close to 1. Assuming one’s prior probability distribution was centred around “no benefit of cricoid”, the posterior probability distribution will still be centred around this point, just somewhat narrower.