Oximetry is fundamental to critical care. Consequently, even small biases in pulse oximetry measurements could have real clinical impact (especially when multiplied across innumerable measurements among thousands of patients).
Racial bias in pulse oximetry was the subject of two studies, in 2005 and 2007. The topic was then largely ignored for the next 13 years. I was vaguely aware of this issue, but given the lack of modern research on it, I had assumed that it was a technical glitch restricted to older pulse oximeters. A fresh publication in the New England Journal of Medicine shows that I was wrong. Let’s start by reviewing the evidence.
Evidence Review: A tale of three studies
Bickler et al. 2005: Effects of skin pigmentation on pulse oximeter accuracy at low saturation
These authors at the University of California in San Francisco compared the performance of three pulse oximeters to ABG analysis among 10 volunteers with dark skin pigmentation and 11 volunteers with light skin pigmentation.1 Volunteers underwent placement of an indwelling radial arterial catheter. Subsequently, they inhaled gas with various FiO2 levels via a mouthpiece. The performance of the pulse oximeters was compared to ABG results across a range of oxygen saturations.
At lower oxygen saturations, bias emerges wherein the pulse oximeters overestimate the oxygen saturation in people with dark skin. Performance of the different pulse oximeters varies substantially. For example:
- For the Nonin Onyx, only a small amount of bias emerges at saturation levels <80%. This amount of bias is clinically insignificant (e.g., a saturation of 72% vs. 75% has a similar clinical implication – it’s really bad). Additionally, clinicians are often aware that pulse oximetry is less accurate at very low saturation, so they are unlikely to make clinical decisions on the basis of a saturation difference of 72% versus 75%.
- The Nellcor N-595 overestimates saturation among people with dark skin by ~2% at saturations <90%. This may affect clinical management (e.g., a saturation of 89% vs. 91% may be interpreted differently).
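"Bias" in these comparisons is simply the average difference between the pulse oximeter reading (SpO2) and the ABG saturation (SaO2) within a given saturation range. A minimal sketch of that arithmetic follows. The readings, the group labels, and the `mean_bias` helper are all invented for illustration and are not data or code from the study.

```python
from statistics import mean

# Hypothetical paired readings: (group, SpO2 %, SaO2 % from ABG).
# These values are invented for illustration only.
readings = [
    ("dark", 91.0, 89.0),
    ("dark", 86.0, 84.0),
    ("dark", 97.0, 96.5),
    ("light", 89.0, 89.0),
    ("light", 85.0, 84.5),
    ("light", 97.0, 97.0),
]

def mean_bias(pairs, group, sao2_low, sao2_high):
    """Mean of (SpO2 - SaO2) for one group, limited to sao2_low <= SaO2 < sao2_high.

    A positive result means the oximeter overestimates saturation.
    Returns None if no readings fall in the requested range.
    """
    diffs = [spo2 - sao2 for g, spo2, sao2 in pairs
             if g == group and sao2_low <= sao2 < sao2_high]
    return mean(diffs) if diffs else None

# Bias within the 80-90% range, where the studies found the problem.
dark_bias = mean_bias(readings, "dark", 80, 90)
light_bias = mean_bias(readings, "light", 80, 90)
print(dark_bias, light_bias)
```

Restricting the calculation to a saturation range matters, because bias that is negligible at normal saturations can become apparent only at lower ones.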
This study was rigorously performed, with excellent internal validity. However, there are some limitations involved in generalizing the results to other contexts. Only people with extremes of skin pigmentation were studied (light versus dark) – whereas skin pigmentation isn’t actually a binary variable. Most subjects were men.
Feiner et al. 2007: Dark skin decreases the accuracy of pulse oximeters at low oxygen saturation: The effects of oximeter probe type and gender
This is a follow-up study performed by the same group of investigators using the same design: volunteers underwent radial artery catheterization and inhaled gas with varying FiO2.2 The performance of different devices was compared to ABG results.
This study replicated the results of the 2005 study and also expanded upon it:
- More patients were enrolled (36 total), with the inclusion of more women.
- Three groups were included: Dark, light, and intermediate skin pigmentation.
- Different types of finger sensors were evaluated (clip-on versus adhesive sensors).
The key findings from the original 2005 study were confirmed. Specifically:
- The Nellcor system overestimates saturation by ~2% among people with dark skin within a saturation range of 80-90%. This is clinically problematic.
- The Nonin system with a clip-on sensor performs well, with a clinically insignificant amount of bias.
Some additional findings are notable:
- People with intermediate skin pigmentation experience an amount of bias which often falls between that seen in people with light skin and that seen in people with dark skin.
- Adhesive sensors performed worse than clip-on sensors.
- Bias was a greater problem in women than men, and in people with lower hemoglobin levels. The effects of sex and hemoglobin levels were not statistically independent, suggesting that the effect of sex might be mediated by women tending to have lower hemoglobin levels.
- The Masimo system overestimates saturation in people with dark skin using the clip-on sensor, but not using the adhesive sensor. In fact, the Masimo system underestimates oxygen saturation among all patients at low oxygen saturation when using the adhesive sensor! This is quite unexpected, emphasizing that results obtained using one device cannot be extrapolated to different devices.
This study leads us to a surprising conclusion: the amount of bias varies in a seemingly haphazard fashion between various devices and probes. Some devices are capable of accurately measuring oxygen saturation across people with varying skin tone (e.g., the Nonin oximeter using a clip-on probe). However, many devices suffer from unacceptable amounts of bias within clinically relevant ranges.
This is a rigorous and well-designed study, but it still has limitations. Perhaps most importantly, the study is limited to healthy volunteers. It remains unknown whether bias could be amplified in patients with anemia or poor perfusion.
Sjoding et al. 2020: Racial bias in pulse oximetry measurement
This is a retrospective study involving data collected at the University of Michigan in 2020 and data from a multicenter ICU database collected in 2014-2015.3 Measurements of pulse oximetry were compared with ABG analysis obtained within 10 minutes. Patients were categorized based on identification as Black or White.
Findings from the University of Michigan cohort are shown above. Among patients saturating >92% based on pulse oximetry, occult hypoxemia (saturation <88% based on arterial blood gas analysis) was found in 12% of Black patients versus 4% of White patients. Similar discrepancies were also obtained in the multicenter ICU database.
An interesting pattern emerges if we rearrange this figure in order to compare Black patients with White patients whose pulse oximetry is 2% lower (figure below). There seems to be a consistent bias wherein pulse oximetry reads about 2% higher in Black patients than in White patients with similar ABG values. For example, Black patients with a pulse oximetry of 91% have similar ABG values compared to White patients with a pulse oximetry reading of 89%.
The finding that Black patients have a bias of +2% in the oxygen saturation value is consistent with prior studies above by Bickler et al. and Feiner et al. These prior studies often detected a ~2% bias when using the Nellcor oximeters. Nellcor is a common provider of oximeters, so I wonder whether the devices used in this study were actually Nellcor oximeters (the study doesn’t report this).
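The occult hypoxemia rate quoted above is a simple proportion: among paired readings with a pulse oximetry value >92%, the fraction whose ABG saturation is <88%. Here is a rough sketch of that calculation, using the two thresholds mentioned in the text. The paired values and the `occult_hypoxemia_rate` function are hypothetical and are not taken from the study.

```python
def occult_hypoxemia_rate(pairs, spo2_min=92.0, sao2_cutoff=88.0):
    """Fraction of pairs with SpO2 > spo2_min whose ABG SaO2 is < sao2_cutoff.

    pairs: iterable of (SpO2 %, SaO2 %) tuples.
    Returns None when no pair has SpO2 above spo2_min.
    """
    eligible = [(spo2, sao2) for spo2, sao2 in pairs if spo2 > spo2_min]
    if not eligible:
        return None
    missed = sum(1 for _, sao2 in eligible if sao2 < sao2_cutoff)
    return missed / len(eligible)

# Hypothetical (SpO2, SaO2) pairs, invented for illustration only.
pairs = [(94, 87), (95, 93), (93, 91), (96, 95), (90, 86)]
rate = occult_hypoxemia_rate(pairs)
print(rate)
```

Computing this proportion separately for each patient group is what allows a comparison such as 12% versus 4%.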
This is a seminal study which will change the way we think about pulse oximetry among patients with darker skin. Most importantly, it demonstrates that oximetry bias persists to the current day, when using oximeters which are commonly employed in the ICU. It also provides some quantitative information about the amount of bias – suggesting that it is often ~2%.
A major limitation of the study is that pulse oximetry and ABG were separated by up to 10 minutes in the context of critically ill patients. Oxygen saturation often fluctuates rapidly in ICU patients, which will tend to increase the differences between pulse oximetry and ABG values. For example, there were commonly differences of >10% between pulse oximetry and ABG oxygen saturation among White patients – this degree of variation is well beyond the inaccuracy of pulse oximetry, so it likely reflects dynamic variation in patients over time. Consequently, this design cannot allow for precise determination of the sensitivity or specificity of any given oxygen saturation cutoff to detect hypoxemia. Thus, the rates of missing occult hypoxemia (12% vs. 4%) have relative importance (e.g., 12% >> 4%, revealing the presence of bias), but the exact “12%” and “4%” figures taken alone may be inaccurate. To determine precise sensitivity and specificity, simultaneous measurement of oxygen saturation and ABG is required.
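For readers who want to see what "sensitivity and specificity of a cutoff" means in concrete terms, here is a minimal sketch. It assumes truly simultaneous SpO2/SaO2 pairs (which this study did not have), defines hypoxemia as an ABG saturation <88% as above, and uses a hypothetical pulse oximetry cutoff of 92%. The data and the function name are invented for illustration.

```python
def sensitivity_specificity(pairs, spo2_cutoff, sao2_cutoff=88.0):
    """Sensitivity and specificity of 'SpO2 <= spo2_cutoff' for detecting
    hypoxemia, defined here as ABG SaO2 < sao2_cutoff.

    pairs: iterable of simultaneous (SpO2 %, SaO2 %) tuples.
    Either value is None if its denominator is zero.
    """
    tp = fn = tn = fp = 0
    for spo2, sao2 in pairs:
        hypoxemic = sao2 < sao2_cutoff   # truth, according to the ABG
        flagged = spo2 <= spo2_cutoff    # what the pulse oximeter suggests
        if hypoxemic and flagged:
            tp += 1
        elif hypoxemic:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity

# Hypothetical simultaneous pairs, invented for illustration only.
pairs = [(90, 86), (93, 87), (91, 90), (95, 94), (96, 95), (89, 85)]
sens, spec = sensitivity_specificity(pairs, spo2_cutoff=92)
print(sens, spec)
```

If the two measurements are separated in time, a patient whose saturation changed between them is counted as a device error, which is why these figures cannot be estimated precisely from the retrospective data.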
Where do we go from here?
Overall perspective: A minefield of potential pitfalls
It’s obviously essential to provide accurate and prompt evaluation for all patients, regardless of skin color. However, achieving that goal may be tricky. For example, four potential pitfalls are as follows:
- Overestimation of oxygen saturation. This is currently the problem. Patients with dark skin may have their oxygen saturation overestimated by ~2%. This could lead to inadequate therapy. For example, patients with COVID who should be admitted and treated with dexamethasone might be sent home. This is the most disturbing and potentially dangerous pitfall.
- Underestimation of oxygen saturation. This is not a problem currently, but it conceivably could become a problem if we were to overcorrect for the overestimation described above. For example, imagine that everyone starts subtracting 2% from the measured oxygen saturation among patients with dark skin. If the pulse oximeter being used was actually accurate, this correction could result in underestimation of oxygen saturation. Underestimating the oxygen saturation could lead to unnecessarily aggressive therapies – which could also cause harm. More is not necessarily better in medicine – so overcorrecting for overestimation could also lead to poor care.
- Excessive reliance on ABG analysis. Suppose we just decide not to pay attention to pulse oximetry in patients with dark skin, but use ABG analysis instead. This may also result in poor care. ABG testing is painful and invasive. Results from ABGs are often delayed, interfering with timely therapy. Finally, a need for frequent ABGs may lead to the placement of an indwelling arterial line, which can cause a substantial amount of blood loss due to serial phlebotomy.
- Underutilization of pulse oximetry. Although pulse oximetry appears biased in patients with darker skin pigmentation when using some devices, the trends in oxygen saturation should remain accurate. Therefore, continuous pulse oximetry remains a useful tool to track clinical progress in all patients. Wholly discarding the use of pulse oximetry in patients with darker skin pigmentation could cause enormous harm.
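The overcorrection pitfall comes down to simple arithmetic, sketched below. The functions and the numbers are hypothetical. The point is only that a blanket 2% subtraction cancels the error on a device that overestimates by 2%, but creates a new 2% error on a device that was accurate to begin with.

```python
def corrected_reading(spo2, assumed_bias=2.0):
    """Apply a blanket correction by subtracting an assumed bias (in % points)."""
    return spo2 - assumed_bias

def residual_error(true_sao2, device_bias, assumed_bias=2.0):
    """Error left after correction, in % points.

    Positive: saturation is still overestimated.
    Negative: saturation is now underestimated.
    """
    measured = true_sao2 + device_bias  # what the device displays
    return corrected_reading(measured, assumed_bias) - true_sao2

# Hypothetical patient with a true saturation of 90%.
biased_device = residual_error(90.0, device_bias=2.0)    # correction cancels the bias
accurate_device = residual_error(90.0, device_bias=0.0)  # correction introduces an error
print(biased_device, accurate_device)
```

Since the actual bias of a given bedside device is usually unknown, a fixed correction cannot be assumed to be safe.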
Some ideas about possible approaches to this conundrum are explored below. However, right now there is no perfect solution.
#1) Pulse oximeters should be required to be accurate in patients with dark skin
If we can take a step back for a moment, it’s important to recognize a very basic truth: The problem lies with the pulse oximeters, not with skin pigmentation. Some pulse oximeters are capable of discerning four different types of hemoglobin in White patients (including carboxyhemoglobin and methemoglobin) by simultaneously measuring the absorption at four different wavelengths. Thus, it is definitely possible to design a pulse oximeter which is accurate regardless of skin color. In fact, one brand (Nonin) appears to have already solved this problem.
Most pulse oximeters have probably been calibrated using light-skinned individuals, with the assumption that skin pigment does not matter.1
The ultimate fix for racial bias in pulse oximetry will be to require that pulse oximeters function accurately regardless of skin color. This sort of a macro-level fix will require years, so I won’t dwell on it too much. However, it’s essential to recognize that dismantling systemic racism starts from the ground up. The equitable solution is to insist that our medical devices work for all people – rather than designing a system optimized for White patients and then subjecting everyone else to MacGyvered work-arounds.
#2) Current devices should be tested to quantify bias
Until medical devices can be fixed, the next-best strategy could be to perform a modern-day study of the bias involved in different pulse oximeters. This would essentially involve replicating Feiner et al. using a full variety of modern pulse oximeters.
Publishing data regarding the accuracy of different devices would have major benefits, specifically:
- Clinicians could know how much bias to expect from their specific pulse oximeters.
- Companies would be pressured to develop more accurate pulse oximeters.
- Companies that developed accurate pulse oximeters would be rewarded for their efforts.
#3) What should we do right now?
The first step is to understand that current monitors may often overestimate oxygen saturation in patients with dark skin by ~2%. This seems to be a common phenomenon among ICU monitors, given its occurrence in both datasets within Sjoding et al. Unfortunately, it remains unclear precisely which devices this applies to.
Consequently, a grey zone may exist for patients with dark skin who have a measured pulse oximetry of ~90%. If the precise saturation is critical to medical management, ABG measurement may be needed. However, it must also be recognized that increasing ABG utilization could potentially be a disservice to some patients (e.g., by delaying medical therapy). Ultimately, patient management must be thoughtfully personalized, taking all aspects of the clinical picture into account.
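One way to picture the grey zone: if a device might overestimate by up to ~2%, then a displayed value could correspond to a true saturation anywhere from 2% below the reading up to the reading itself. The sketch below flags readings for which that interval straddles a decision threshold. The 90% threshold, the 2% maximum bias, and the function name are illustrative assumptions, not a validated clinical rule.

```python
def in_grey_zone(spo2, threshold=90.0, max_bias=2.0):
    """True if the reading is at or above the threshold, but the true saturation
    could fall below it if the device overestimates by up to max_bias % points."""
    return spo2 >= threshold and (spo2 - max_bias) < threshold

# Hypothetical readings, invented for illustration only.
for reading in (89.0, 91.0, 93.0):
    print(reading, in_grey_zone(reading))
```

A reading already below the threshold is not flagged, because it is concerning regardless of any bias.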
Using portable Nonin Onyx pulse oximeters might be another option to consider (available through Amazon for under $200). Although these devices haven’t been validated recently, the company seems to be aware of issues regarding pulse oximetry and skin pigmentation. Nonin has previously promoted their “PureSAT oximetry technology” to cope with the problem. So, it might be a reasonable assumption that their current devices are at least as good as prior clip-on devices (which were validated to be accurate regardless of skin pigmentation in 2005 and 2007). The reuse of oximeters may pose some infection-control problems, but these may be avoided by using a clear plastic bag to prevent contact between the patient’s finger and the oximeter.4
- Pulse oximetry may overestimate oxygen saturation among patients with dark skin, often by ~2% when using hospital bedside monitors. This may have substantial treatment implications for patients with borderline oxygen saturation (e.g., saturation ~90%).
- Bias in pulse oximeters varies substantially between different devices. Available evidence suggests that Nonin clip-on pulse oximeters seem to be accurate regardless of skin color.
- Trending pulse oximetry over time should remain an accurate tool to monitor clinical progress.
- Further studies defining the performance of different devices among diverse populations are urgently needed.
- It is possible to design pulse oximeters which function well regardless of skin color. Fully dismantling the systemic racism of pulse oximetry will require demanding devices which are universally accurate.
- Racial bias with pulse oximetry? Salim Rezaie, RebelEM
Conflicts of Interest: Never
- 1. Bickler P, Feiner J, Severinghaus J. Effects of skin pigmentation on pulse oximeter accuracy at low saturation. Anesthesiology. 2005;102(4):715-719. doi:10.1097/00000542-200504000-00004
- 2. Feiner J, Severinghaus J, Bickler P. Dark skin decreases the accuracy of pulse oximeters at low oxygen saturation: the effects of oximeter probe type and gender. Anesth Analg. 2007;105(6 Suppl):S18-S23. doi:10.1213/01.ane.0000285988.35174.d9
- 3. Sjoding M, Dickson R, Iwashyna T, Gay S, Valley T. Racial Bias in Pulse Oximetry Measurement. N Engl J Med. 2020;383(25):2477-2478. doi:10.1056/NEJMc2029240
- 4. Cheung P, Hardman J, Whiteside R. The effect of a disposable probe cover on pulse oximetry. Anaesth Intensive Care. 2002;30(2):211-214. doi:10.1177/0310057X0203000215