PulmCrit - Correlation confounded by timing: One reason we believe bad therapies are awesome

example 1: anti-emetics

Let’s get started with an RCT comparing droperidol versus ondansetron for the treatment of nausea in the emergency department. Meek et al. randomized 144 nauseated patients to receive either droperidol or ondansetron.¹ The rates of symptom improvement were indistinguishable (75% vs 80%). So, you might conclude that these drugs are equally efficacious.

And you would be wrong in reaching that conclusion. These authors also included a placebo group, which had essentially the same rate of symptomatic improvement (76%)! So the correct conclusion is that droperidol and ondansetron are equally worthless in treating nausea in the emergency department! This illustrates the importance of placebo controls.

Let’s dig a bit further into this. One possible interpretation here is that the placebo effect is extremely powerful – 75% effective! That’s possible… but, my guess is that something additional is going on here.

Nausea naturally waxes and wanes over time. Let’s imagine that, without any treatment, the severity of nausea over time is shown below. Nausea comes and goes in waves:

Now, let’s imagine that we treat the patient with IV ondansetron PRN for severe nausea. We will observe this pattern:

The symptom severity is exactly the same here (the ondansetron is having no effect at all). However, it sure looks as though the ondansetron is working! Every time we give the IV ondansetron, the symptoms improve shortly thereafter. A casual observer would conclude that ondansetron is incredibly effective.

This illustrates what might be called correlation confounded by timing. Administration of ondansetron correlates with symptomatic improvement. However, the ondansetron isn’t causing symptomatic improvement. Timing is a confounding factor here – we give the ondansetron only when the symptoms are very severe, at a time when the symptoms are about to abate anyway.
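For readers who like to see this in numbers, here is a minimal sketch of the same idea. Everything in it is an assumption made up for illustration: the sine-wave nausea score, the 2-hour period, the dosing threshold of 9/10, and the 30-minute follow-up window (none of these come from the Meek trial). The "drug" in the simulation does nothing at all; it is simply given whenever nausea becomes severe.

```python
import math

def nausea_severity(t_min, period_min=120.0):
    """Hypothetical untreated nausea score (0-10) that comes and goes in waves."""
    return 5.0 + 5.0 * math.sin(2.0 * math.pi * t_min / period_min)

def prn_dosing(threshold=9.0, follow_up_min=30, total_min=720):
    """Give an inert drug each time severity rises past the threshold.

    Returns (number of doses, number of doses followed by improvement).
    The drug never changes nausea_severity, so any "response" is timing alone.
    """
    doses = 0
    improved = 0
    was_above = False
    for t in range(total_min):
        now = nausea_severity(t)
        is_above = now >= threshold
        if is_above and not was_above:  # new episode of severe nausea: give a dose
            doses += 1
            if nausea_severity(t + follow_up_min) < now:
                improved += 1
        was_above = is_above
    return doses, improved

def improvement_at_arbitrary_times(follow_up_min=30, total_min=720):
    """Fraction of all minutes at which nausea is lower follow_up_min later."""
    better = sum(
        nausea_severity(t + follow_up_min) < nausea_severity(t)
        for t in range(total_min)
    )
    return better / total_min

if __name__ == "__main__":
    doses, improved = prn_dosing()
    print(f"PRN doses followed by improvement: {improved}/{doses}")
    print(f"Improvement at arbitrary times: {improvement_at_arbitrary_times():.0%}")
```

In this toy model every PRN dose is followed by improvement within 30 minutes, whereas only about half of arbitrary time points are. The difference comes entirely from when the dose is given, not from anything the dose does.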

Incidentally, this may explain why it’s so hard to show a benefit from ondansetron or droperidol. The natural history of nausea is to improve over time. To prove benefit, any drug would need to accelerate this improvement substantially. In the context of this fluctuating background, proving efficacy is difficult (a moderate signal of efficacy could easily be lost).

example 2: fluid boluses

Let’s start off with a patient who is having episodic hypotension. There are many physiologic reasons a patient could have episodic hypotension. The cardiovascular system has auto-regulatory mechanisms aimed at restoring blood pressure. Therefore, it’s easy to imagine how an unstable patient could have bouts of hypotension with subsequent improvements (as compensatory mechanisms kick into gear).

Now, let’s imagine that our default approach to managing hypotension is to administer a fluid bolus. Let’s further imagine that the fluid boluses have no effect on our patient (indeed, RCTs show that fluid boluses often have minimal effect on blood pressure). Even though the fluid has no effect, it will still appear that the fluid boluses are causing the blood pressure to increase:

Time is again a confounding factor. Fluid is given only when the blood pressure is low – at a time when the blood pressure is likely to rebound on its own. This artificially generates a pattern wherein administration of fluid correlates with an improvement in blood pressure.
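The same thing can be sketched with a noisy, self-correcting blood pressure. This is a rough illustration only: the mean-reverting model, the mean MAP of 70 mm Hg, the bolus trigger of 60 mm Hg, and the parameter names are all assumptions invented for the example, not physiology taken from any study. The "bolus" has no effect on the simulated pressure.

```python
import random

def simulate_map(n_steps=5000, mean_map=70.0, persistence=0.8, noise_sd=4.0, seed=0):
    """Hypothetical mean arterial pressure series that drifts back toward mean_map."""
    rng = random.Random(seed)
    series = [mean_map]
    for _ in range(n_steps - 1):
        prev = series[-1]
        # Each step keeps part of the previous deviation and adds random noise.
        series.append(mean_map + persistence * (prev - mean_map) + rng.gauss(0.0, noise_sd))
    return series

def change_after_bolus(series, trigger=60.0, follow_up=3):
    """Average MAP change follow_up steps after each inert bolus.

    A bolus is "given" at every step where MAP is below the trigger.
    It never alters the series, so the change reflects natural rebound only.
    """
    changes = [
        series[i + follow_up] - series[i]
        for i in range(len(series) - follow_up)
        if series[i] < trigger
    ]
    if not changes:
        return 0.0, 0
    return sum(changes) / len(changes), len(changes)

if __name__ == "__main__":
    mean_change, n_boluses = change_after_bolus(simulate_map())
    print(f"{n_boluses} inert boluses, average MAP change afterwards: {mean_change:+.1f} mm Hg")
```

Because boluses are triggered only at low readings, the average change afterwards comes out positive even though the bolus does nothing. This is regression toward the mean, dressed up as a treatment effect.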

My practice has been shifting away from the use of fluid boluses to the use of vasopressor infusions. Using vasopressors unmasks this phenomenon rather nicely. Let’s imagine that instead of using fluid boluses, we place the patient on a vasopressor infusion when the blood pressure is low. The resulting pattern will look like this:

Episodic low blood pressure is transiently treated with norepinephrine. The norepinephrine infusion is then weaned off when the blood pressure returns to normal. The overall result is that norepinephrine is turned on only for short periods of time.

With norepinephrine, it becomes obvious that the blood pressure is fluctuating over time (the patient is requiring norepinephrine only for very brief periods of time). It is evident that in between episodes of hypotension, the patient is maintaining their own blood pressure (because the norepinephrine is off). This is unlike the situation with fluid boluses, where the provider will typically conclude that the blood pressure remains adequate in between episodes of hypotension because of the previous fluid bolus.

example 3: toxicology case reports/series

Literature about toxicology generally focuses on flashy antidotal therapies. However, for the vast majority of patients with intoxication, basic supportive care is sufficient (e.g. airway protection if necessary, establishment of euvolemia, electrolyte management, and perhaps a bit of vasopressor support). The trajectory of illness could be represented as follows – patients often get a bit worse, then they recover:

Now, let's suppose that we treat this patient with a series of treatments intended to counteract the intoxication. The resulting pattern may look like this:

New treatments will often be layered upon one another, until the patient recovers. This will create the following misperceptions:

Treatments used first will be perceived as ineffective (Treatment A above).
The treatment initiated last will be perceived as effective (Treatment C above). Indeed, this treatment succeeded in “curing” a very sick patient who was “refractory” to several previous treatments!
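Here is a small sketch of how layering produces these misperceptions. It is a toy model under stated assumptions: the severity curve (peaking at 12 hours), the treatment start times (4, 8, and 12 hours), and the 4-hour judgment window are all hypothetical, and the treatments are added on a fixed schedule for simplicity. None of the three treatments affects the curve.

```python
import math

def severity(t_hr, peak_hr=12.0):
    """Hypothetical untreated course: worsens until peak_hr (score 10), then recovers."""
    return 10.0 * (t_hr / peak_hr) * math.exp(1.0 - t_hr / peak_hr)

def layer_treatments(names=("A", "B", "C"), first_hr=4.0, interval_hr=4.0, window_hr=4.0):
    """Start one inert treatment every interval_hr and judge each by what follows.

    A treatment "looks effective" if severity is lower window_hr after it starts.
    No treatment changes the severity curve.
    """
    looks_effective = {}
    for i, name in enumerate(names):
        start = first_hr + i * interval_hr
        looks_effective[name] = severity(start + window_hr) < severity(start)
    return looks_effective

if __name__ == "__main__":
    for name, effective in layer_treatments().items():
        verdict = "looks effective" if effective else "looks ineffective"
        print(f"Treatment {name}: {verdict}")
```

With these made-up numbers, Treatments A and B look ineffective and Treatment C looks effective, purely because C happened to be started at the peak of an illness that was about to resolve.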

The unfortunate reality is that no causality can be inferred from these reports. For example:

It's possible that none of the treatments are having any impact at all (the patient may simply be improving over time on their own, as the intoxicant is metabolized).
It's possible that initial treatments (Treatment A) are in fact effective (perhaps the clinical course would have deteriorated more severely, if Treatment A hadn't been utilized).
It's possible that the last treatment (Treatment C) is merely added on at a time when all the prior therapies are finally starting to work – so this treatment may be adding nothing.

Over-interpretation of these case reports may lead to a systematic bias in favor of newer, experimental treatments:

Standard therapies will always be utilized first – at a time-point when the patient is sickest. This may tend to cause the standard treatments to appear least effective.
Experimental therapies will always be utilized last – at a time-point when the patient may be recovering anyway. This may cause the experimental therapy to appear most effective.

conclusions

Correlation confounded by timing is likely endemic in medicine. This will occur whenever patients have transient instability which we respond to (an extremely common occurrence). Often, patients would have improved on their own following these short-lived fluctuations (the body has numerous homeostatic mechanisms which pull things back toward equilibrium). However, our bias will always be to assume that if the patient improved, it is a direct causal effect of our intervention.

Correlation confounded by timing is a dangerous phenomenon, because it will reinforce our belief in whatever therapies we are using. The casual observer will see that every time an anti-emetic is given, the nausea goes away. Or after giving a fluid bolus, hemodynamics usually improve. Or a new antidotal therapy works for intoxications that were refractory to standard therapies. This reinforces our belief in therapies which are garbage.

Avoiding correlation confounded by timing is not easy. The following techniques may be helpful:

Be conservative about believing that a therapy caused improvement (particularly if the patient’s condition was very dynamic before the therapy was started).
Be familiar with high-quality literature undergirding various therapies (if such literature exists).
Be mindful of how different approaches to the same clinical problem may be successful. For example, Clinician A treats urosepsis with 6 liters of fluid, whereas Clinician B treats urosepsis with 2 liters of fluid followed by norepinephrine – and both strategies are usually very successful. It may then be inferred that the 6-liter resuscitation isn’t truly mandatory. (This obviously isn’t the ideal approach to research, but it’s sometimes all we have.)

reference

1. Meek R, Mee M, Egerton-Warburton D, et al. Randomized Placebo-controlled Trial of Droperidol and Ondansetron for Adult Emergency Department Patients With Nausea. Acad Emerg Med. 2019;26(8):867-877. https://www.ncbi.nlm.nih.gov/pubmed/30368981.