In the era before the ubiquitous use of bedside ultrasound, BNP and its derivative natriuretic peptides were, at best, a mediocre test that added little to clinical judgment. In today’s world of sonographic abundance, they simply add noise to our already deafening workflow.
Despite a wealth of evidence demonstrating natriuretic peptides’ lack of clinical utility, their use has become an abundant and reflexive component in the workup of suspected acute decompensated heart failure. While consistently failing to adequately lend diagnostic guidance in patients where clinical uncertainty is present, in the eyes of many, natriuretic peptides have remained a viable diagnostic pathway, simply for lack of a better option.
In a recent publication by Pivetta et al, in CHEST (1), the authors remind us that when presented with a diagnostic question it is important to select a test capable of providing the answer. Authors enrolled 1,005 patients presenting to the Emergency Department with acute dyspnea. Patients were excluded if they had an obvious cause of symptoms clearly unrelated to acute decompensated heart failure (trauma), or if there was no Emergency Physician present with ultrasound expertise (defined as > 40 completed scans). Patients underwent a standardized workup including history, physical exam, EKG and arterial blood gas (ABG), after which the Emergency Physician was asked to categorize the presentation as acute decompensated heart failure or non-cardiac in origin. After this, they performed a standardized point of care ultrasound (POCUS) examination that consisted of a 6-zone scanning protocol. Diffuse interstitial syndrome (DIS) was defined as the presence of two or more zones with three or more B-lines on bilateral lung fields. The final diagnosis was determined by a review of each patient’s hospital course performed by an Emergency Physician and Cardiologist, who were blinded to the POCUS findings (1).
Of the 1,005 patients enrolled, 463 (46%) were given a final diagnosis of acute decompensated heart failure. Agreement between the two physicians determining this gold standard was excellent; they disagreed on only 3.5% of cases. The treating physician’s ability to clinically differentiate cardiac from non-cardiac causes of the presenting dyspnea was exceptionally good, with a sensitivity and specificity of 85.3% and 90% respectively. In fact the performance of POCUS alone, though numerically better (sensitivity of 90.5% and specificity of 93.5%), did not differ statistically from the physician’s intrinsic diagnostic capabilities. Although each performed well in isolation, combining the clinical and sonographic exams significantly improved on either alone: the sensitivity and specificity of the physician’s judgment plus lung US were 97% and 97.4% respectively. More important for the purposes of this post was how this combination performed compared with the natriuretic peptides. Of the 1,005 patients, 486 had a natriuretic peptide drawn. Its ability to differentiate cardiac causes of dyspnea was worse than the unassisted judgment of the treating physician, with a sensitivity and specificity of 85% and 67.1% respectively (the threshold for a positive test was prospectively set at 400 pg/mL for BNP, and at 450, 900, and 1,800 pg/mL for NT-pro-BNP in patients < 50 years old, between 50 and 75 years old, and > 75 years of age, respectively) (1).
This study is far from perfect. It was a prospective observational study that did not enroll consecutive patients, required an Emergency Physician competent in the use of bedside US, and only obtained natriuretic peptide assays in approximately 50% of the cohort (1). And yet despite these obvious flaws, this trial serves to illustrate an important point in the interpretation of diagnostic test results. In the Emergency Department we function in varying degrees of uncertainty. We are constantly being shown a single cross section of a disease process and asked to predict its subsequent velocity and acceleration. We are expected to perform the impossible task of calculating the slope of a line with only one point of data. We estimate these slopes in the form of risk. The greater the risk, the stronger the force acting to overcome our intrinsic inertia. There is a certain probability above which the risk of pathology is high enough to compel further investigation. Below this test threshold, the probability of disease and its accompanying burdens are not worth further diagnostic consideration. Conversely, there are cases where the probability of disease is so high that the treatment threshold has already been crossed, and further diagnostic studies are incapable of lowering the risk enough to justify withholding the necessary interventions (2). As Emergency Physicians we exist in the gray zone, the area between the test and treatment thresholds. As such, it behooves us to utilize tests with the diagnostic capability necessary to shift the post-test probability toward either extreme of the continuum (Fig 1).
Using the more traditional test characteristics, sensitivity and specificity, it is very difficult to intuit how a particular test result will shift an individual patient’s probability of disease. Through the use of a two-by-two table we are able to determine how often a patient with the disease in question is correctly identified by a positive test result (sensitivity) and how often a patient without the disease has a negative test result (specificity). But this retrospective evaluation defines a test’s performance from the perspective of a population in which the final diagnosis is already known (3). It does little to prospectively predict the risk of an individual patient with a specific test result. In contrast, the likelihood ratio (LR) is a prospective mathematical concept describing a diagnostic test’s ability to alter a patient’s risk. Essentially, an LR is the percentage of patients with the disease who will have a specific test result, divided by the percentage of patients without the disease who will have the same test result (4).
A negative LR (-LR) is the probability that patients with the disease will have a negative test result, divided by the probability that patients without the disease will have a negative test result. The positive LR (+LR) is its counterpart: the probability that patients with the disease will have a positive test result, divided by the probability that patients without the disease will have a positive test result. LRs greater than one shift the post-test probability towards the treatment threshold, and LRs less than one shift it in the opposite direction, towards the test threshold. The marker of a useful test is one that will consistently move the post-test probability out of this zone of uncertainty. Typically a positive LR of 10 and a negative LR of 0.1 are considered the minimal levels for diagnostic utility. A positive LR below 10 or a negative LR above 0.1 will not consistently shift the post-test probability above the treatment threshold or below the test threshold (4,5) (Fig 3).
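For readers who want to see the arithmetic, the definitions above reduce to two short formulas in terms of sensitivity and specificity. The following is a minimal Python sketch; the function name `likelihood_ratios` is ours, and the inputs are the rounded sensitivities and specificities quoted in this post rather than the study's raw counts, so the results may differ somewhat from the LRs reported by the study authors.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (+LR, -LR) for a test with a positive/negative result.

    +LR = P(positive | disease) / P(positive | no disease)
        = sensitivity / (1 - specificity)
    -LR = P(negative | disease) / P(negative | no disease)
        = (1 - sensitivity) / specificity
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    return positive_lr, negative_lr


# Rounded summary figures quoted in this post (an approximation,
# not the underlying two-by-two tables from the study).
quoted = {
    "clinical judgment": (0.853, 0.90),
    "natriuretic peptides": (0.85, 0.671),
}

for name, (sens, spec) in quoted.items():
    pos, neg = likelihood_ratios(sens, spec)
    print(f"{name}: +LR = {pos:.2f}, -LR = {neg:.2f}")
```

Note how a test can post a respectable sensitivity and still carry a weak +LR when its specificity is poor: the denominator of the +LR is the false positive rate.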
Pivetta et al illustrated that when the Emergency Physician is confident in their clinical diagnosis, they consistently identify the presence or absence of decompensated heart failure. In these cases clinical judgment alone has correctly placed the patient either below the test threshold or above the treatment threshold, and further diagnostic studies are not required. In the remainder of patients, where clinical judgment is insufficient, the positive and negative LRs of the natriuretic peptides (2 and 0.2 respectively) are too weak to reliably shift the post-test probability out of this zone of uncertainty. Conversely, in the spectrum of patients where clinical judgment was unable to correctly differentiate decompensated heart failure from other causes of dyspnea, lung ultrasound was exceptionally useful. Pivetta et al found that when POCUS was used to augment clinical judgment, the positive and negative LRs were effectively diagnostic (22.3 and 0.03 respectively) (1).
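To make the difference between these LRs concrete, the shift they produce can be worked out with Bayes' theorem in its odds form (post-test odds = pre-test odds × LR), the same calculation the Fagan nomogram (5) performs graphically. Below is a minimal Python sketch. The function name `post_test_probability` is ours, and the 50% pre-test probability is a hypothetical "gray zone" patient chosen purely for illustration; it is not a figure from the study.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability (Bayes' theorem, odds form)."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Hypothetical patient in the zone of uncertainty (assumed, not from the study).
PRE_TEST = 0.50

# LRs as quoted in this post.
tests = {
    "natriuretic peptide, positive": 2.0,
    "natriuretic peptide, negative": 0.2,
    "clinical judgment + lung US, positive": 22.3,
    "clinical judgment + lung US, negative": 0.03,
}

for label, lr in tests.items():
    print(f"{label}: {post_test_probability(PRE_TEST, lr):.1%}")
```

Under that assumed 50% starting point, a positive or negative natriuretic peptide moves the probability only to roughly 67% or 17%, whereas the combined clinical and sonographic assessment moves it to roughly 96% or 3%. Whether either result crosses a threshold depends on where the individual clinician sets the test and treatment thresholds.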
The vast majority of the time, the Emergency Physician is more than capable of clinically identifying patients presenting in acute decompensated heart failure. In the few cases that pose a diagnostic dilemma, natriuretic peptides provide no additional diagnostic guidance. Bedside ultrasound is a swift, non-invasive tool with likelihood ratios robust enough to shift post-test probability to a clinically relevant degree. Now is the time to speak frankly about natriuretic peptides. They are diagnostic clutter, another lab value flagged as abnormal that must be acknowledged before being discarded as unhelpful. Natriuretic peptides add noise to an already uncertain baseline, making it only more difficult to detect the signal through the thunderous cacophony that is diagnostic uncertainty.
- Pivetta E, Goffi A, Lupia E, et al. Lung ultrasound-implemented diagnosis of acute decompensated heart failure in the Emergency Department – A SIMEU multicenter study. Chest. 2015.
- Pauker SG, Kassirer JP. The threshold approach to clinical decision making. N Engl J Med. 1980;302(20):1109-17.
- Altman DG, Bland JM. Diagnostic tests. 1: Sensitivity and specificity. BMJ. 1994;308(6943):1552.
- Deeks JJ, Altman DG. Diagnostic tests 4: likelihood ratios. BMJ. 2004;329(7458):168-9.
- Fagan TJ. Letter: Nomogram for Bayes theorem. N Engl J Med. 1975;293:257.