So, Gilead’s first RCT on remdesivir was just published, and it’s very interesting.1 Gilead’s, you say? Yep. The study was designed, monitored, analyzed, and written by Gilead.
Before getting into the study, let’s take a moment and think about what Gilead’s first RCT could look like. Gilead knows more about remdesivir than anyone (they built it). So, their RCT ought to be a tour de force. More than anyone else, they ought to know which patients to select, what dose to use, which endpoints to evaluate, and how to present the results. If remdesivir were highly effective, this study should have been a slam dunk.
preamble: some interesting bits about the study design
Since the study is designed by Gilead, the design itself reveals some interesting bits about what the company truly believes about remdesivir. Some issues bear particular mention.
(1) Lack of placebo control
The most unusual aspect of this study is lack of a placebo arm. Why? One might imagine roughly two possibilities:
- Excess optimism: Gilead assumed that remdesivir would be a wonder-drug that obviously works, thereby obviating the need for a large placebo-controlled trial. Or perhaps Gilead assumed that the Wang et al. study would establish the efficacy of remdesivir (a study which was actually rather neutral).2 Either way, Gilead was banking on the assumption that the drug would clearly work, so all they needed to do was establish the appropriate dose.
- Lack of confidence: Gilead didn’t have much confidence that remdesivir is the cure, but they were hoping that it would get accepted anyway (out of haste and desperation). In this scenario, testing remdesivir against placebo would risk exposing remdesivir as ineffective.
My guess is scenario #1. Regardless, lack of placebo controls makes this study difficult to interpret.
(2) Changing the primary endpoint
The original primary endpoint of the study was normalization of temperature and oxygen saturation through day 14. This was changed to assessment of clinical status using a 7-point ordinal scale on March 15 (before data was available). An ordinal scale is a more sensitive metric for small improvements in outcome. This switch suggests that Gilead may have been recognizing that remdesivir was less effective than they initially thought.
(3) Lack of secondary endpoints
Most RCTs suffer from the opposite problem – excessive secondary endpoints (leading to statistical problems due to the likelihood that one secondary endpoint is positive due to chance alone).
This study design has only a single pre-specified secondary endpoint (a safety endpoint). This is quite unusual. In a way, it suggests that perhaps Gilead was avoiding really kicking the tires here – they didn’t want to look too closely into what was going on with this study.
(4) Exclusion of patients with GFR <50 ml/min
It remains controversial whether remdesivir is nephrotoxic. Prior studies have quietly excluded patients with GFR <30 ml/min (an exclusion criterion which has been largely ignored by the NIH guideline recommendations).
This study excluded any patients with the slightest hint of renal dysfunction (GFR <50 ml/min). This suggests that Gilead is not confident that remdesivir is safe for patients with renal dysfunction (even for patients with borderline renal dysfunction, i.e., GFR 30-50 ml/min).
subjects
This is a multi-center, open-label, phase 3 trial comparing two regimens of remdesivir among hospitalized COVID-19 patients (either 5 or 10 days of therapy).
Key inclusion criteria were:
- Oxygen saturation 94% or lower on room air
- Radiologic evidence of pneumonia
- PCR assay within four days of randomization
- Not intubated or on ECMO, nor in multi-organ failure
- AST or ALT not above 5 times the upper limit of normal
- Glomerular filtration rate >50 ml/min (by the Cockcroft-Gault equation)
- Age >11 years old
- Women included only if not pregnant. Both men and women were required to use contraception (if relevant).
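For readers unfamiliar with the renal cutoff above, here is a minimal sketch of the standard Cockcroft-Gault estimate of creatinine clearance. The function names, parameter names, and the example patient are hypothetical illustrations, not taken from the trial protocol.

```python
def cockcroft_gault(age_years, weight_kg, serum_creatinine_mg_dl, female):
    """Estimate creatinine clearance (ml/min) with the Cockcroft-Gault equation.

    CrCl = (140 - age) * weight / (72 * serum creatinine), times 0.85 if female.
    """
    if serum_creatinine_mg_dl <= 0:
        raise ValueError("serum creatinine must be positive")
    clearance = (140 - age_years) * weight_kg / (72 * serum_creatinine_mg_dl)
    return clearance * 0.85 if female else clearance


def meets_renal_criterion(clearance_ml_min, cutoff_ml_min=50.0):
    """Return True if the estimate is above the study's stated cutoff (>50 ml/min)."""
    return clearance_ml_min > cutoff_ml_min


# Hypothetical patient: 60-year-old man, 72 kg, creatinine 1.0 mg/dl.
example_clearance = cockcroft_gault(60, 72, 1.0, female=False)
print(example_clearance, meets_renal_criterion(example_clearance))
```

Note that the estimate depends on age and weight as well as creatinine, so an elderly, lightweight patient can fall below the cutoff despite a creatinine value that looks normal.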
Here is where things start getting controversial. The two groups are fairly similar, although slightly more patients in the 10-day group required intubation or noninvasive respiratory support (69 vs. 53):
How big of a difference is this? It’s debatable:
- Using a Fisher’s exact test, 69/197 vs. 53/200 isn’t statistically significant (p = 0.08).
- Using a Wilcoxon rank sum test, the difference is statistically significant (p = 0.02).
- Incidentally, the use of any statistical test here is arguably invalid (we know that the patients were randomized, so we should already know that the null hypothesis is true!).
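The Fisher’s exact comparison above can be checked by hand. The following is a minimal sketch of a two-sided Fisher’s exact test in plain Python, using the counts quoted in this post (69/197 vs. 53/200). The function name and the "sum all tables no more likely than the observed one" convention are my own choices for illustration, not necessarily what the study authors used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def table_probability(k):
        # Probability that the first row contains k of the col1 "events".
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    observed = table_probability(a)
    smallest = max(0, col1 - row2)
    largest = min(row1, col1)
    p_value = sum(
        table_probability(k)
        for k in range(smallest, largest + 1)
        if table_probability(k) <= observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# 10-day group: 69 of 197 needed more support; 5-day group: 53 of 200.
p = fisher_exact_two_sided(69, 197 - 69, 53, 200 - 53)
print(f"Fisher's exact p = {p:.3f}")
```

This treats the comparison as a simple yes/no split; a rank-based test across all categories of baseline status uses more of the information, which is one reason the two tests can disagree.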
efficacy endpoints
Patients receiving longer courses of remdesivir did worse by a variety of different metrics:
Are these differences significant? Well, that’s debatable. In an unadjusted analysis, patients in the 5-day group did better. However, an adjusted analysis based on initial illness severity shows no statistically significant difference:
In a randomized controlled trial, randomization should ideally eliminate baseline differences between patient groups. So generally, adjustment based on baseline variables is unnecessary. However, adjustment of RCTs based on baseline differences is occasionally performed. For example, this might be appropriate in the following situations:
- Recruitment of an adequate sample size is difficult. Pre-planned adjustment for baseline characteristics could help remove confounding variables, thereby improving the power of the study.
- There is an unexpected difference in baseline characteristics between groups, due to bad luck. Post-hoc adjustment could be used to estimate the impact of this imbalance.
Use of an adjusted statistical analysis here seems a little dubious. It feels a bit like patients receiving remdesivir for longer courses did worse, so the authors are covering this up with some statistical wizardry.
post-hoc subgroup analysis
Post-hoc subgroups were evaluated to see if there might be any patient population where giving more remdesivir could be helpful. Well, run enough statistical tests on enough subgroups and…
Of patients on invasive mechanical ventilation, those treated with 10 days of remdesivir had lower mortality (7/41 vs. 10/25, p=0.048). There are a few reasons that this analysis isn’t valid. First, considering the multiplicity of comparisons in this post-hoc subgroup analysis, a p-value of 0.048 isn’t exciting. Second, these subgroups were generated based on clinical status on day #5 – five days after patients had started therapy! They essentially re-drew the starting line for the race, several days into the study! This is wild – you shouldn’t initiate a therapy, wait several days for some patients to deteriorate, and then initiate a subgroup analysis.
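To make the multiplicity point concrete, here is a minimal sketch of how quickly the chance of at least one false-positive result grows with the number of tests. It assumes independent tests at a threshold of 0.05, and the numbers of subgroups shown are hypothetical (the actual number of comparisons isn’t stated here).

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test p-value threshold that keeps the overall error rate near alpha."""
    return alpha / num_tests


# Hypothetical numbers of post-hoc subgroup comparisons.
for num_tests in (1, 5, 10, 20):
    rate = familywise_error_rate(num_tests)
    threshold = bonferroni_threshold(num_tests)
    print(f"{num_tests:2d} tests: P(any false positive) = {rate:.2f}, "
          f"Bonferroni threshold = {threshold:.4f}")
```

Under these assumptions, a p-value of 0.048 would not clear a Bonferroni-style threshold as soon as more than one subgroup is tested.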
safety endpoints
Patients treated with longer courses of remdesivir had higher rates of serious adverse events, especially renal failure:
The authors attempt to explain away these differences on the basis of baseline imbalance between the two patient groups (again). That’s possible, but the creatinine values were essentially identical at baseline. Furthermore, even when performing an adjusted analysis which takes into account baseline differences in disease severity, there was still a significant increase in serious adverse events among patients receiving longer courses of remdesivir:
Prior RCTs on remdesivir in COVID-19 have not reported increased rates of renal failure, so this could very well be a statistical anomaly. However, it remains concerning.
summary
- This is a trial designed, monitored, and written by Gilead. In some ways, the design of the trial and its missing parts are more notable than what is actually reported in the study (e.g., secondary endpoints, viral load data).
- Patients treated with longer courses of remdesivir (10 days vs. 5 days) had worse outcomes. It’s unclear whether this is due to baseline imbalance between the groups, or toxicity from remdesivir.
- Patients were included in the study only if they had a GFR >50 ml/min, suggesting that Gilead might lack confidence regarding whether remdesivir is safe in patients with renal dysfunction. There were higher rates of kidney injury among patients receiving longer courses of remdesivir.
- If a decision is made to use remdesivir, it should be limited to a 5-day course.
- Lack of a placebo group prevents this study from evaluating whether or not remdesivir works. However, the study’s construction and results do raise some red flags. Given that prior placebo-controlled RCTs of remdesivir have failed to demonstrate durable clinical benefit, further placebo-controlled trials are required prior to concluding that remdesivir provides meaningful benefit in COVID-19.
related data on remdesivir
- NIAID ACTT-1 trial
  - PulmCrit review
  - EMNerd: The Case of the Partial Cohort
  - TheBottomLine review (Fraser Magee)
  - REBEL-EM review (Salim Rezaie)
  - First10EM review (Justin Morgenstern)
- Wang trial in Lancet
  - PulmCrit review
  - First10EM review (Justin Morgenstern)
  - Intensive review (Matthew Durie)
- NEJM “compassionate use” study
- COVID AKI chapter at NephJC (multi-author awesomeness).
Image Credit: Photo by Marvin Esteve on Unsplash
references
- 1. Goldman JD, Lye DCB, Hui DS, et al. Remdesivir for 5 or 10 Days in Patients with Severe Covid-19. N Engl J Med. Published online May 27, 2020. doi:10.1056/nejmoa2015301
- 2. Wang Y, Zhang D, Du G, et al. Remdesivir in adults with severe COVID-19: a randomised, double-blind, placebo-controlled, multicentre trial. The Lancet. Published online May 2020:1569-1578. doi:10.1016/s0140-6736(20)31022-9
This is niggling about the only drug that is saving lives. Does the commenter have anything better?
There are 4 other antivirals being used and investigated by other countries and the global community. Who uses what is quite an interesting data point. Just about everything can be administered more easily than IV remdesivir, and earlier, during the viral replication stage. NIH is tunnel-visioned on the shiny new object (the only trial it started early on; the same goes for vaccines, with a single trial). Avoid looking only at the US.
https://docs.google.com/spreadsheets/d/1w2OP04n18YQ48OIEpxas7eaU7uToOF22IAGtcozNCpE/edit?usp=sharing
What evidence do you have to support that remdesivir saves any lives?
I am a nurse at Jamaica Hospital, located on the Van Wyck Expressway in Jamaica, NY, where a lot of Covid patients were brought in. I saw the effects of remdesivir on patients who were almost dying of Covid and were ready for intubation: two hours after a dose of remdesivir, a patient was able to breathe freely, and in two days he was discharged. For those who want to know the effectiveness of remdesivir, the doctors and nurses who were there inside the ICU would be the best sources. Do not deny the patients…
I understand that you believe in what you’ve seen personally, but thank the heavens medicine follows science and not what people believe!
It troubles me that 1/3 of severe Covid patients are being found to be in renal failure, and they excluded GFRs under 50 in this study. It is very shaky not to evaluate, or at least release, viral load effects. If I were on a vent in an ICU, I would rather gamble and take leronlimab, which does drop viral load, inhibits cytokine storm, and brings balance to the immune system, with data to show it. Monoclonal antibodies to the rescue!
excellent detailed review, Josh. am a little perturbed by some of the comments below. it almost seems that passions fly high in some conversations regarding COVID, almost as though we were discussing abortion or religious beliefs. but as long as it’s done respectfully, it’s good to see.
still, thank you for your frequent, detailed posts.
tom
Adjusting for important baseline covariates is not heresy, even with balanced baselines… It helps account for explainable variation and boosts power. Andrew Althouse, stat editor for circ intvn, has commented on this in the past. So that part doesn’t bother me. The rest is fair critique.
excellent critical analysis. I am a pharmacist and I am also very skeptical about (1) the effectiveness and (2) the cost-effectiveness of remdesivir, especially now that we know that steroids work (against the advice of the CDC!). seems like a cash grab.
Wonderful statistical and emphatic analysis. It’s obvious the manufacturer doctors every aspect of the study to get approval. But remdesivir must have been given to several thousand patients by now. How come we don’t have a simple summary of the outcomes of the patients who were given the drug? Over a large sample size, we should know the effectiveness instead of relying on ad hoc doctors’ comments or poor, biased, profit-centric trials done by the manufacturer. So we are not able to coordinate global or even national tracking?