The Adventure of the Impassable Stone

As medical skeptics, we have a tendency to revel in the negative study. We bemoan the p-value's tendency to underestimate the risk of type I error and cite frequentist statistics' history of getting it wrong almost as often as it gets it right. Despite these nihilistic inclinations, it is important that we are equally vigilant in identifying circumstances in which the risk of type II error is high. A number of recent trials examining the use of medical expulsive therapy (MET) in ureteral colic illustrate the risk of such errors.

The first of these trials, published by Pickard et al in The Lancet in May 2015, examined both alpha blocker (tamsulosin 0.4 mg) and calcium channel blocker (nifedipine 30 mg) therapy in patients with CT-confirmed ureterolithiasis (1). The authors randomized 1137 patients with stones 10 mm or less to receive tamsulosin, nifedipine, or placebo. Patients were excluded if they presented with obvious signs of sepsis, had significant renal failure (GFR < 30), or required immediate invasive therapy as determined by the treating physician.

The authors found no significant difference in their primary outcome, the rate of spontaneous passage at 4 weeks, among those randomized to the tamsulosin, nifedipine, and placebo arms. Spontaneous stone passage, defined as the absence of need for intervention to assist stone passage during the 4-week follow-up, occurred in 307 (81%), 304 (80%), and 303 (80%) patients respectively. There were also no significant differences in the need for pain medication, the number of days pain medication was required, or the visual analog scale (VAS) score of patients' pain at 4 weeks (1). By all accounts this was an impressively negative trial.

A second study was published online in July 2015 in Annals of Emergency Medicine. Like the Pickard et al trial, this trial by Furyk et al examined the effects of MET in patients with CT-confirmed ureterolithiasis (2). The authors randomized patients with stones 10 mm or less located in the distal ureter to either MET with 0.4 mg of tamsulosin or placebo. Patients were excluded if they demonstrated signs of infection or presented with a compromised GFR. As in the previous study, the authors found no statistically significant difference in the number of patients who experienced stone passage at 28 days (87.0% and 81.9% in the tamsulosin and placebo groups respectively) (2). We now have two high-quality RCTs demonstrating that the use of MET is not beneficial in the management of acute ureteral colic. This should conceivably end the debate regarding the utility of alpha blockade for ureteral colic.

And yet, despite what at first glance appears to be convincing evidence, neither of these trials addresses the pressing question regarding MET. The majority of patients in both trials had stones less than 5 mm in diameter. Most small stones will pass without difficulty (6,7). As these trials demonstrate, it is extremely difficult to show a statistically significant difference in an undifferentiated cohort of renal colic patients. The real question is, does MET work in patients with stones greater than 5 mm in diameter? Can these trials definitively demonstrate a lack of utility of MET in these patients?

To examine this question appropriately, we must first define statistical power. Power is the ability of a trial to detect a statistically significant difference between two groups when a true difference exists (3). It is the probability of avoiding a false negative (type II error), essentially the trial's sensitivity. Traditionally, an acceptable statistical power has been set at 80% or 90%. Stated in the abstract, such a number is nebulous; statistical power becomes far easier to understand when it is expressed in quantifiable measures.
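As a rough illustration of the definition above, the sketch below approximates power for a comparison of two proportions using the normal approximation. The passage rates of 50% and 60% are hypothetical values chosen only for illustration and are not taken from either trial, and the function names are our own. The formulas are the standard textbook approximation, not necessarily the method either group of authors used.

```python
from math import sqrt, ceil
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def power_two_proportions(p_control, p_treated, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    # Standard error of the difference, using each arm's own variance.
    se = sqrt(p_control * (1 - p_control) / n_per_group
              + p_treated * (1 - p_treated) / n_per_group)
    z_crit = _norm.inv_cdf(1 - alpha / 2)
    shift = abs(p_treated - p_control) / se
    # Probability that the test statistic falls beyond either critical value.
    return _norm.cdf(shift - z_crit) + _norm.cdf(-shift - z_crit)


def n_per_group_for_power(p_control, p_treated, power=0.90, alpha=0.05):
    """Approximate patients per arm needed to reach the requested power."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_beta = _norm.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance
                / (p_treated - p_control) ** 2)


# Hypothetical passage rates (assumed for illustration only).
P_CONTROL, P_TREATED = 0.50, 0.60

for n in (50, 200, 800):
    print(n, round(power_two_proportions(P_CONTROL, P_TREATED, n), 2))

print(n_per_group_for_power(P_CONTROL, P_TREATED, power=0.90))
```

The point of the loop is simply that the same true difference is detected far more reliably as the number of patients per arm grows, which is why a subgroup of a trial has much less power than the full cohort.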

The Pickard et al trial based its sample size calculation on the ability to detect a 10% absolute difference between the tamsulosin group and its comparators with a power of 90% (1). In practical terms, if the observed difference between the tamsulosin group and its comparators were zero (p=1.0), the trial would be unable to confidently rule out an absolute difference as large as 6%. Conversely, if the trial did find a 10% improvement in patients randomized to alpha blockade, the true effect size could be as low as 4% or as high as 16%. The first scenario is very nearly what they found: the 95% confidence interval surrounding the 1% absolute risk reduction (ARR) in patients randomized to receive tamsulosin was -4.4% to 6.9%. In the subset of patients with stones greater than 5 mm in width, however, Pickard et al observed an absolute difference of 10% in the rate of stone passage at 4 weeks in favor of those randomized to receive tamsulosin. This difference did not reach statistical significance.

It is important to note that power is a prospective concept, calculated before the results of a study are known. To retrospectively state that a trial is underpowered once the results are known is somewhat disingenuous. The claim that the observed difference is true and failed to reach statistical significance only because of an inadequate sample size may in fact be correct, but it cannot be justified by the data alone. Any post-hoc power calculation performed on such a data set will inevitably demonstrate a limited ability to differentiate a true difference from the null hypothesis (4). Once the trial results are obtained, post-hoc power calculations should be avoided; the confidence intervals surrounding the point estimates offer a more honest interpretation of the data (3). In this case, we are unable to differentiate a 10% difference in stone passage from no effect: the 95% confidence interval ranged from -2.8% to 23.6% (1). Clearly this trial was not designed to answer the question of whether MET is beneficial in patients with large-diameter ureteral stones.
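The 6% figure quoted for the Pickard design can be reproduced with a short calculation, in the spirit of the predicted confidence interval approach of reference 3. A trial powered to detect a difference D is sized so that D equals (z_alpha + z_beta) standard errors, while the 95% confidence interval extends z_alpha standard errors on either side of the observed difference. The ratio of the two gives the expected half-width of the interval as a fraction of D. The sketch below assumes a two-sided alpha of 0.05 and a standard error that is the same under the null and alternative hypotheses; the function name is our own.

```python
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def ci_halfwidth_ratio(power, alpha=0.05):
    """Expected CI half-width as a fraction of the design difference."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = _norm.inv_cdf(power)           # about 1.28 for 90% power
    return z_alpha / (z_alpha + z_beta)


design_difference = 0.10  # absolute difference the trial was powered to detect
half_width = ci_halfwidth_ratio(0.90) * design_difference

# Roughly 0.06: an observed difference of zero leaves +/- 6% unexcluded,
# and an observed difference of 10% spans roughly 4% to 16%.
print(f"{half_width:.3f}")
```

At 90% power the ratio is about 0.6, and at 80% power it is about 0.7, so a trial designed around a given difference can never, by itself, exclude differences much smaller than that design value.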

The results of the Furyk trial are even more compelling. Though the primary endpoint was the overall proportion of patients with stone passage at 28 days, the authors powered their study for an entirely different question: the rate of stone passage in patients with larger stone diameters (5-10 mm). The authors calculated they would require 98 patients with stones greater than 5 mm to detect a 20% difference in stone passage with 80% power (2). This means that if no difference had been observed, the authors would have been unable to exclude a difference as large as 14%. While their primary outcome was negative, in the subgroup of patients this study was powered to examine, the authors found a 22.4% absolute difference in the rate of stone passage at 28 days. The confidence interval surrounding this point estimate ranged from 3.1% to 41.6%. Although it is unwise to make claims of significance based on a secondary endpoint with such a wide confidence interval, it is equally unfair to use these data to disprove a hypothesis that this trial was not designed to refute.
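For readers who want to compare the two large-stone subgroup results on the same footing, the sketch below back-calculates an approximate standard error and z statistic from each reported 95% confidence interval. It assumes the intervals are symmetric normal-approximation intervals, which may not match the methods the trial authors actually used, so the output is a reading aid and not a reanalysis. The function name is our own.

```python
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def summarize_reported_ci(lower, upper, level=0.95):
    """Recover estimate, SE, z and two-sided p from a symmetric CI."""
    z_crit = _norm.inv_cdf(0.5 + level / 2)
    estimate = (lower + upper) / 2        # midpoint of the interval
    se = (upper - lower) / (2 * z_crit)   # half-width divided by z_crit
    z = estimate / se
    p_two_sided = 2 * (1 - _norm.cdf(abs(z)))
    return estimate, se, z, p_two_sided


# Intervals as reported in the text, in percentage points.
reported = [
    ("Pickard, stones > 5 mm", -2.8, 23.6),
    ("Furyk, stones 5-10 mm", 3.1, 41.6),
]

for label, lower, upper in reported:
    estimate, se, z, p = summarize_reported_ci(lower, upper)
    print(f"{label}: estimate={estimate:.1f}, SE={se:.1f}, "
          f"z={z:.2f}, p={p:.3f}")
```

The midpoints land close to the point estimates quoted in the text, which suggests the symmetry assumption is reasonable here. The wide standard errors in both subgroups are the practical expression of the limited power discussed above, and none of this removes the usual caution about secondary endpoints.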

We are all aware of the hazards of subgroup analyses, and yet it is important to be honest in our skepticism. This should in no way be viewed as an endorsement of MET or of the necessity of obtaining imaging to identify a subgroup of patients who may benefit from tamsulosin. On the contrary, these trials demonstrate that for the majority of patients presenting to the Emergency Department with renal colic, MET provides little additional benefit above symptomatic treatment. But a trial can only answer the question it was designed to ask. Neither of these trials was built to confidently address whether MET is beneficial in patients presenting with larger stones. Earlier trials examining this question are either so confounded by non-blinding and selection bias as to make them uninterpretable, or lack the statistical power needed to confidently address the effects of MET in patients with larger stones (5). We are left with statistical and philosophical uncertainty regarding the utility of alpha-blockers in acute ureteral colic. We will continue to exist in this state of ambiguity until we have a study sufficiently powered to ask whether MET is efficacious in patients with large ureteral stones. Many would love to discard alpha-blockers for renal colic onto our ever-growing pile of medical impotencies, but given the current state of the literature, this renouncement would be premature and unjust.

Sources Cited:

1. Pickard R, Starr K, Maclennan G, et al. Medical expulsive therapy in adults with ureteric colic: a multicentre, randomised, placebo-controlled trial. Lancet. 2015.
2. Furyk JS, et al. Distal ureteric stones and tamsulosin: a double-blind, placebo-controlled, randomized, multicenter trial. Ann Emerg Med. Published online July 17, 2015.
3. Goodman SN, Berlin JA. The use of predicted confidence intervals when planning experiments and the misuse of power when interpreting results. Ann Intern Med. 1994;121(3):200-6.
4. Goodman SN. A comment on replication, P-values and evidence. Stat Med. 1992;11:875-9.
5. Campschroer T, Zhu Y, Duijvesz D, et al. Alpha-blockers as medical expulsive therapy for ureteral stones. Cochrane Database Syst Rev. 2014:CD008509.
6. Coll DM, Varanelli MJ, Smith RC. Relationship of spontaneous passage of ureteral calculi to stone size and location as revealed by unenhanced helical CT. AJR Am J Roentgenol. 2002;178:101-103.
7. Miller OF, Kane CJ. Time to stone passage for observed ureteral calculi: a guide for patient education. J Urol. 1999;162:688-690 (discussion 690-691).