PulmCrit – Bad news for sepsis-3.0: qSOFA fails validation

October 1, 2016 by Josh Farkas 18 Comments

Sepsis 3.0 replaced the SIRS criteria with a new risk-stratification tool, qSOFA. qSOFA was initially developed within the Sepsis-3 publication itself. Until now, qSOFA has never been validated. The value of qSOFA vs. SIRS remains controversial.

Churpek 2016: qSOFA, SIRS, and early warning scores for detecting clinical deterioration in infected patients outside the ICU.

This was a study of 30,677 patients in the emergency department and ward at the University of Chicago who were suspected of having infection (defined as anyone cultured and started on IV antibiotics). Electronic records were retrospectively analyzed to calculate SIRS, qSOFA, and two risk-stratification scores (MEWS and NEWS). These scores were compared to a primary outcome of in-hospital mortality and a combined outcome of mortality or ICU admission.

MEWS and NEWS are risk-stratification scores, designed and validated to identify patients at risk for deterioration (tables below). They are fairly similar, with NEWS being newer:

The overall performance of these tests is best reflected by the area under the receiver-operator curve. NEWS consistently came out ahead. In predicting mortality or ICU transfer (arguably the most relevant outcome), qSOFA and SIRS were similar (1):
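As a rough illustration of what the area under the receiver-operator curve measures, the sketch below computes it by brute force: the fraction of (patient with outcome, patient without outcome) pairs in which the patient with the outcome had the higher score, with ties counted as half. The function name `auroc` and the toy data are assumptions made up for this example; none of the numbers come from Churpek et al.

```python
def auroc(scores, outcomes):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores   -- one risk score per patient (higher = sicker)
    outcomes -- 1 if the patient had the outcome, 0 otherwise
    """
    positives = [s for s, y in zip(scores, outcomes) if y == 1]
    negatives = [s for s, y in zip(scores, outcomes) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient with and one without the outcome")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # score ranks the sicker patient higher
            elif p == n:
                wins += 0.5   # tie: the score cannot tell them apart
    return wins / (len(positives) * len(negatives))


# Made-up toy data, NOT taken from the study.
toy_scores = [0, 1, 1, 2, 2, 3]
toy_outcomes = [0, 0, 1, 0, 1, 1]
toy_auc = auroc(toy_scores, toy_outcomes)
print(round(toy_auc, 3))
```

A value of 0.5 means the score ranks patients no better than a coin flip, and 1.0 means it ranks every patient with the outcome above every patient without it.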

The figure below shows the test sensitivity versus percent of patients with a positive screen for various cutoffs of different tests, using the combined outcome (mortality or ICU transfer). qSOFA and SIRS have similar overall test performance, the main difference being that SIRS is more sensitive whereas qSOFA is more specific. MEWS and NEWS have superior performance:

These differences are clinically meaningful. For example, at an equal level of specificity, NEWS achieved a sensitivity that was 13% higher than qSOFA:
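To make the sensitivity/specificity tradeoff concrete, here is a minimal sketch on made-up data. The function name `sens_spec`, the scores, and the outcomes are all illustrative assumptions, not values from the study. It shows that raising the cutoff for a "positive" screen on the same score buys specificity at the cost of sensitivity.

```python
def sens_spec(scores, outcomes, cutoff):
    """Sensitivity and specificity when 'score >= cutoff' counts as a positive screen.

    Assumes at least one patient with and one without the outcome.
    """
    pairs = list(zip(scores, outcomes))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)  # true positives
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)   # missed deteriorations
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)   # true negatives
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)  # false alarms
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patients: a score from 0 to 3 and whether they deteriorated (1) or not (0).
toy_scores = [0, 0, 1, 1, 2, 2, 3, 3]
toy_outcomes = [0, 0, 0, 1, 0, 1, 1, 1]

for cutoff in (1, 2, 3):
    sens, spec = sens_spec(toy_scores, toy_outcomes, cutoff)
    print(f"cutoff >= {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

Comparing two scores "at an equal level of specificity" amounts to picking, for each score, the cutoff that gives that specificity and then reading off the sensitivity.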

qSOFA is an insensitive and late indicator of deterioration

This study focused mostly on the highest test score before ICU transfer, rather than the test score at the point in time when infection was first suspected. This data won't apply to the bedside clinician who is interpreting the test score at a single time-point. For example, the sensitivity of a single test score will be lower than the sensitivity of the worst score before ICU transfer.
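The reason is simple arithmetic: a patient's worst score is at least as high as their score at any single time-point, so a screen based on the worst score can only catch the same patients or more. A minimal sketch, using invented serial scores for four hypothetical patients who all went on to deteriorate (the data and the cutoff of 2 are assumptions for illustration only):

```python
def sensitivity(flags):
    """Fraction of deteriorating patients flagged as positive."""
    return sum(flags) / len(flags)


# Each inner list: hypothetical serial scores for one patient who later deteriorated.
serial_scores = [
    [0, 1, 2],
    [1, 1, 1],
    [0, 0, 2],
    [2, 2, 3],
]
CUTOFF = 2  # assumed threshold for a positive screen

# Screen using only the first score (e.g. when infection was first suspected).
first_positive = [scores[0] >= CUTOFF for scores in serial_scores]
# Screen using the worst score recorded before deterioration.
worst_positive = [max(scores) >= CUTOFF for scores in serial_scores]

sens_first = sensitivity(first_positive)
sens_worst = sensitivity(worst_positive)
print(sens_first, sens_worst)
```

In this toy example the first score flags one of the four patients, whereas the worst score flags three.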

This figure explores when these indices turned positive. The cumulative likelihood that a patient would meet various criteria at any point in their hospitalization is shown, prior to the time of death or ICU transfer. The sensitivity of qSOFA is poor, especially >12 hours before deterioration (<40%).

These results shouldn't be too surprising

This study suggests that compared to SIRS, qSOFA has increased specificity at the cost of decreased sensitivity. Both tests have similar global performance. This was previously discussed on this blog and comes as no surprise:

This study also shows that the NEWS score out-performs qSOFA. qSOFA is a rudimentary risk-stratification score, so it isn't surprising that it would be out-performed by more sophisticated risk-stratification scores (e.g. APACHE-II). NEWS is a more detailed score than qSOFA, which integrates a greater amount of vital information:

Indeed, if you look very carefully at the NEWS score, it actually contains the components of the qSOFA score (figure below). Since qSOFA is effectively a pared-down version of the NEWS score, it makes sense that qSOFA shouldn't perform as well.
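For reference, qSOFA itself can be written in a few lines. The sketch below assumes the thresholds given in the Sepsis-3 publication (respiratory rate of 22/min or more, systolic blood pressure of 100 mmHg or less, and altered mentation taken here as a Glasgow Coma Scale below 15). These thresholds are not restated in this post, and the function and parameter names are invented for illustration. All three inputs are bedside observations that NEWS also scores, alongside several others.

```python
def qsofa(resp_rate, systolic_bp, gcs):
    """qSOFA points (0 to 3), using thresholds assumed from the Sepsis-3 publication.

    resp_rate   -- breaths per minute
    systolic_bp -- mmHg
    gcs         -- Glasgow Coma Scale (15 = normal mentation)
    """
    points = 0
    if resp_rate >= 22:     # tachypnea
        points += 1
    if systolic_bp <= 100:  # low systolic blood pressure
        points += 1
    if gcs < 15:            # altered mentation
        points += 1
    return points


# Hypothetical patient: tachypneic and hypotensive but alert.
example = qsofa(resp_rate=24, systolic_bp=95, gcs=15)
print(example, "positive" if example >= 2 else "negative")
```

Each input is collapsed to a single yes/no point, which is one way to see why a graded score built on more vital signs has more information to work with.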

Where does this leave Sepsis 3.0?

qSOFA plays a central role in the Sepsis 3.0 definition:

This study challenges the Sepsis-3 definition for several reasons:

qSOFA has a poor sensitivity
qSOFA is a late indicator of deterioration
qSOFA is inferior to the NEWS score (despite the NEWS score being based on data which is equally easy to obtain at the bedside)

Sepsis-3 still hasn't been widely adopted into hospital protocols or Medicare guidelines. Widespread change of such protocols requires an enormous expenditure of time and money (e.g. re-writing policies, educating staff). Such change can only be justified by a definition which has been successfully validated. Failure of qSOFA to be validated suggests that additional evidence is required before considering adoption of Sepsis-3.

Should we follow the lead of our colleagues in the UK?

These organizations don't recommend qSOFA as a primary evaluation for sepsis. Instead, the screening test for sepsis in the UK is the NEWS score. They may be far ahead of us on this one. Notably, the NEWS score has been successfully validated and deployed on a large scale.

If the NEWS score is used, it should be understood that it is not a test for sepsis. NEWS is a global risk-stratification tool which identifies patients who are critically ill from any disease. Thus, an elevated NEWS score should prompt a thoughtful evaluation for any potential life-threat. Increased utilization of risk-stratification tools such as NEWS could facilitate early recognition of sepsis as well as other critical illnesses (cardiogenic shock, pulmonary embolism, hemorrhage, etc.).

Churpek et al. 2016 is the first study to attempt validation of qSOFA.
qSOFA and SIRS have similar overall performance in predicting the combined outcome of death or ICU transfer. qSOFA has a higher specificity, but this comes as a tradeoff for lower sensitivity.
qSOFA is <40% sensitive for detecting a patient who will die or need ICU transfer in 12 hours.
qSOFA is consistently out-performed by the NEWS score, a more sophisticated bedside risk-stratification tool (figure below).
This study doesn't support the Sepsis-3 definition. Further evidence is needed before considering the widespread adoption of qSOFA and Sepsis-3. The British approach using the NEWS score appears superior.

Addendum: After posting this, I received the following poster from Drs James Price and Narani Sivayoham of St. George’s University Hospital in London. Similar to Churpek et al., their group found a very low sensitivity for qSOFA:

Addendum 2: This is in response to a discussion with Jon-Emile (see below).

Related material

Churpek MM et al. qSOFA, SIRS, and early warning scores for detecting clinical deterioration in infected patients outside the ICU. E-pub ahead of print in Am J Respir Crit Care Med.
Blogs on Sepsis-3.0
- Top 10 problems with sepsis 3.0 (Pulmcrit)
- Sepsis 3.0 with Merv Singer, Additional thoughts with Cliff Deutschman (EMCrit)
- Sepsis 3.0 (Rebel EM)
- Sepsis 3.0 – No thank you (First 10 in EM)
- Sepsis definitions and diagnosis (LITFL)
- Sepsis: Redefined (FOAMcast)
- Batman, the sofa, and the latest sepsis definitions (St Emlyns)
- Sepsis isn't a disease (Intensive Care Network)
- Critique of Sepsis-III (Deranged Physiology)
Debate in the literature
- UK Sepsis Trust toolkit & statement regarding Sepsis 3.0, also see interim statement about sepsis 3.0.
- New sepsis criteria: A change we should not make (American College of Chest Physicians)
- The misapplication of severity-of-illness scores towards clinical decision making. Moskowitz A et al. AJRCCM 2016.
- Change is not necessarily progress: Revision of the sepsis definition should be based on new scientific insights. Cortes-Puch et al. AJRCCM 2016

Notes

It might have been more appropriate to designate the primary outcome as the combined outcome of mortality or ICU transfer. The real goal of identifying patients with sepsis is to transfer them to the ICU (and prevent mortality), therefore it would be logical to include ICU transfer in the primary outcome.

Author

Josh Farkas

Josh is the creator of PulmCrit.org. He is an associate professor of Pulmonary and Critical Care Medicine at the University of Vermont.

18 Comments

Frank Farkash

7 years ago

I know it takes longer and more data to calculate, but where does the full SOFA score fall in the reliability/utility here? Argument being that by the time you can calculate the full score it may be too late?

Author

Josh Farkas

7 years ago

Reply to Frank Farkash

Trying to design the perfect risk-stratification tool is incredibly difficult, which explains why hundreds of these tools exist and there is no consensus about which one is best (MEWS, NEWS, SOFA, APACHE-II, etc.). It looks like qSOFA is too simple, and NEWS is a substantial improvement on it. The question, then, is whether a more complex tool would be even better than NEWS? Adding some lab values can improve the performance of a risk stratification tool. One neat example of this is the NEWS-L score, which is the NEWS score plus the lactate level in mM. Addition of this single… Read more »

Melissa Sheridan

7 years ago

Reply to Josh Farkas

Is the NEWS-L validated? I am drafting a prospectus for a research project that hypothesizes that the use of a screening tool will increase early identification of sepsis and initiate goal-directed therapy faster. I was planning on the use of the qSOFA, until I came across this information that it is not a valid tool.

Allison T.

6 years ago

Reply to Josh Farkas

Hi, I am seeking help for my DNP project. I was planning on using the qSOFA score for early identification of high-risk home patients with suspected infection. Unfortunately, I am not allowed to use a point-of-care lactate, and therefore thought the qSOFA would be the most direct for identification. However, after reading a few recent studies from 2016/2017 I am second-guessing myself. Please share your insights. Thank you!

What's Your Job?

Nurse Practitioner

Melissa Sheridan

6 years ago

Reply to Allison T.

I am currently a DNP student working on my DPI project, which is on sepsis and the use of qSOFA in triage. All of the current research emphasizes early identification, which is pertinent to early goal-directed therapy. The problem is we don’t have a standardized screening tool. There are multiple tools in the literature but they have not been validated or replicated. qSOFA seems to be the only one thus far. There are a lot of mixed emotions about the use of qSOFA versus the SIRS criteria, as one may be overly sensitive but not specific and… Read more »

What's Your Job?

Nurse practitioner

Ingunn Granum

6 years ago

Reply to Josh Farkas

Hi!
In Sweden we use a triage/decision tool guide in the ED and ambulance called RETTS in which the recommended basic labtests include lactate from the second lowest triage level. The vital parameters match NEWS scores and combines with emergency symptoms and signs to establish a medical statistical level of potential lethal illness.
The system is well validated in Europe. http://www.predicare.se

What's Your Job?

physician ED

Jon-Emile

7 years ago

Nice analysis Josh. I know that a lot is made about SIRS versus qSOFA/SOFA because it is a part of the algorithm in Sepsis 3.0, but the heart of the matter – to my eye – is the actual redefinition of sepsis [i.e. now as a life-threatening organ dysfunction caused by a dysregulated response to infection]. This is – essentially – the old ‘severe sepsis’ which makes some sense because this is the threshold that had to have been met to be enrolled in EGDT, ARISE, ProCESS and ProMISe. So the question, I think, we really must ask is which… Read more »

Author

Josh Farkas

7 years ago

Reply to Jon-Emile

Thanks, Jon-Emile, that’s an interesting perspective. I agree with you that SOFA probably has similar performance to NEWS or MEWS. Yu 2014 found that SOFA and MEWS had statistically similar performance (https://ccforum.biomedcentral.com/articles/10.1186/cc13947). NEWS does have an advantage compared to SOFA, which is that it can be performed immediately at the bedside of any patient without requiring any labs. This may be a major advantage, particularly when triaging patients in the ED who have no labs or evaluating a ward patient who hasn’t had a fresh set of labs in a while. My guess is that different settings might require different… Read more »

Jon-Emile

7 years ago

Reply to Josh Farkas

Thanks for the Yu paper, I will check it out.

I have been thinking about this a lot lately; I’m curious to know your opinion, however, on just the redefinition …

Do you think that sepsis 3.0 should now include/require end-organ dysfunction [in effect, become the old ‘severe sepsis’]?

Would it be crazy to simply re-define sepsis 3.0 as:

“presumed infection [however the clinician wants to define this, which often inherently includes SIRS] + evidence of end-organ dysfunction [however the clinician wants to define this, which inherently includes much of NEWS, MEWS, and SOFA]”?

thanks

Author

Josh Farkas

7 years ago

Reply to Jon-Emile

I don’t think that’s crazy at all. I put a diagram above (see “addendum 2”) showing how I would define sepsis. No definition is perfect, but this could be a reasonable approach to sepsis & sick patients. Often sepsis ends up being a diagnosis of exclusion. That’s probably OK as long as you’ve exercised due diligence in excluding alternative diagnoses. One useful aspect of this definition (and illness-severity tools like the NEWS score) is that they may help us identify patients with critical illness in general, not just patients with sepsis. One possible problem with this definition is that it… Read more »

Michal

7 years ago

Hey Josh,
Maybe a bit off topic, but how do you actually measure respiratory rate in your practice?
I find this parameter quite cumbersome to obtain and yet a lot of these scores rely heavily on this number.
We don't routinely measure RR at triage or at examination, nor do we have a standardized way to do so. We note if the pt is tachypneic, where the cutoff is around 30/min… I wonder if there is a simple, reliable and reproducible way to do this.

Thank you.

Michal Pisar
Emergency department
Zlin, Czech republic

Author

Josh Farkas

7 years ago

Reply to Michal

Our monitors record respiratory rate, although at times they seem to err on the high side. For a patient in the ICU, I will generally trend the respiratory rate based on the monitor (which is transcribed into the vital signs). When evaluating a patient in the ED for possible ICU transfer, if in doubt I will actually measure respiratory rate manually, by counting breaths for 30 seconds and doubling it (either watching the patient or listening with a stethoscope). That may sound silly, but it’s a very important parameter and it’s often not measured accurately. So the short answer is… Read more »

Nick Barnett

7 years ago

Great review Josh. As an intensive care clinician, Sepsis 3.0 just doesn’t ring right – not clinically and probably not even from a public health perspective, in terms of identifying patients early and treating early even if that means over-treating. It feels like something of a noble failure as its intentions were clearly well motivated. Two quick questions: What are the ‘efficiency curves’? They look like a mirror-image of a ROC curve and are presumably some variation thereof. Hadn’t encountered these before! What is your view of the Eamonn Raith presentation and supposed validation of qSOFA at ESICM congress –… Read more »

Eamon Raith

7 years ago

Reply to Nick Barnett

Hi Nick, With regards to our ‘supposed validation of qSOFA’, I would emphasise that: 1. We report the AUROC for qSOFA, SOFA and SIRS in ICU patients admitted with infection. 2. Our study was not designed to examine the use of these scores outside the ICU (so has limited utility in the EM environment). 3. We do not ‘validate’ qSOFA in quite the way I think you mean. In fact, we make the point in our presentation: “Using a qSOFA Score ≥2 or ≥2 SIRS Criteria in patients admitted to ICU with a diagnosis of infection – Misses people who… Read more »

Derek Louey

7 years ago

In the ED or wards, I would use a score with the highest sensitivity i.e. SIRS to triage the at-risk patients and begin early meaningful intervention e.g. antibiotics, fluids, surgical consultation

Thereafter, I would continuously apply another tool that provided a trigger for escalation beyond ward therapy e.g. needing inotropic support, RRT, ventilation.

Josh, could you further unpack the statement “The real goal of identifying patients with sepsis is to transfer them to the ICU (and prevent mortality),”

At what point during the course of their disease do you think these patients ought to be transferred?

Gleb Esin

7 years ago

Reply to Derek Louey

The main idea of a new definition of sepsis is the identification of patients at high risk. Not all patients with SIRS will have high risks. Therefore, in my opinion, it is not fundamentally how you define risks (for example, UK management), the main thing is that a patient with an infection and high risks is a sepsis patient. If we are careful, the authors suggest using qSOFA, when the use of a full SOFA is not possible. It’s like using D-Dimer to diagnose PE: a positive result does not guarantee 100% availability of PE, but high risks and requires… Read more »

Steinar Konradsen

6 years ago

Nice analysis, but I think using the algorithm shown in addendum 2 will take much too long to be able to give septic patients antibiotics etc. within one hour after suspicion. Obtaining CXR/CT etc. often delays the diagnosis, and in the face of serious illness, treatment decisions often have to be made before these test results are available.

What's Your Job?

Specialist in family medicine, working at a community hospital

caroline barnes

6 years ago

I see that NEWS2 was released for use as of Dec 2017. Your thoughts? It looks very promising and is approved for use in Europe.

What's Your Job?

Informatics Nurse Specialist
