
Shortcuts Are Bad – Even for AI


Research shows artificial intelligence models rely on shortcuts for detecting COVID-19.

Relying on shortcuts for work isn’t always a great idea – and that holds true for artificial intelligence (AI) with chest X-rays and COVID-19, as well.

New research from the University of Washington (UW) shows that AI, like humans, has a tendency to lean on shortcuts when detecting disease from these scans. If the tools are deployed clinically, investigators said, the result could be diagnostic errors that impact real patients.

The team, led by Paul G. Allen School of Computer Science & Engineering doctoral students Alex DeGrave, who is also a medical student in the UW Medical Scientist Training Program, and Joseph Janizek, also a UW medical student, showed that algorithms used during the pandemic predicted whether someone was COVID-19-positive by relying on text markers and patient positioning specific to each dataset rather than on actual medical pathology or clinically significant indicators.

The team published their results May 31 in Nature Machine Intelligence.

“A physician would generally expect a finding of COVID-19 from an X-ray to be based on specific patterns in the image that reflect disease processes,” DeGrave said. “But, rather than relying on those patterns, a system using shortcut learning might, for example, judge that someone is elderly and, thus, infer that they are more likely to have the disease because it is more common in older patients. The shortcut is not wrong per se, but the association is unexpected and not transparent. And, that could lead to an inappropriate diagnosis.”

By depending on these shortcuts, the team said, the algorithms are far less likely to work outside of the setting where they were trained and tested. Models that rely on shortcuts have a greater chance of failure when deployed at a different facility, opening the door to incorrect diagnoses and improper treatment.


It’s the lack of transparency – the “black box” phenomenon associated with AI – that contributes heavily to the problem, the team said. Suspecting that shortcut learning was at work in the models touted as detecting COVID-19, they conducted their own training and testing to assess the models’ trustworthiness. Specifically, they focused on “worst-case confounding,” a situation that gives rise to shortcuts when an AI tool lacks sufficient training data to learn the underlying pathology of a disease.

“Worst-case confounding is what allows an AI system to just learn to recognize datasets instead of learning any true disease pathology,” Janizek said. “It’s what happens when all of the COVID-19-positive cases come from a single dataset while all of the negative cases are in another. And, while researchers have come up with techniques to mitigate associations like this in cases where those associations are less severe, these techniques don’t work in situations where you have a perfect association between an outcome, such as COVID-19 status, and a factor like the data source.”
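To make the idea concrete, here is a minimal sketch, written for this article rather than taken from the study’s code, of how worst-case confounding lets a model score perfectly without learning any pathology. The feature names are hypothetical; the “source artifact” column stands in for dataset-specific cues such as burned-in text markers or patient positioning.

# Minimal sketch of worst-case confounding (hypothetical data, not the study's code).
# Every COVID-19-positive case comes from "hospital A" and every negative case from
# "hospital B", so a model can reach perfect training accuracy by recognizing the
# source alone.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 200

# One column encodes a source-specific artifact (e.g., a laterality marker or
# border style); the remaining columns are noise standing in for lung findings.
source_artifact = np.concatenate([np.ones(n // 2), np.zeros(n // 2)])  # 1 = hospital A
pathology_noise = rng.normal(size=(n, 10))                             # uninformative here
X = np.column_stack([source_artifact, pathology_noise])
y = source_artifact.copy()  # labels perfectly aligned with source: worst-case confounding

clf = LogisticRegression().fit(X, y)
print("Apparent accuracy:", clf.score(X, y))  # close to 1.0, yet no pathology was learned

Because the label and the data source coincide exactly, nothing within this data can separate “which hospital” from “which diagnosis,” which is why the usual mitigation techniques fail here.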

For their study, the team trained several deep convolutional neural networks on chest X-ray images, replicating the dataset setup used in existing papers. Testing then proceeded in two steps: each model was evaluated on an internal set of images from the initial dataset that had been held out from training, and then on a second, external dataset meant to represent a new hospital system.
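A rough sketch of that two-step evaluation, assuming a PyTorch setup with hypothetical model and data-loader names (this is not the study’s actual code):

# Sketch of the two-step evaluation: score a trained model on (1) a held-out
# internal split of the training dataset and (2) an external dataset from a
# different hospital system. The model and loaders are hypothetical placeholders.
import torch

def accuracy(model, loader, device="cpu"):
    """Fraction of correct predictions over a DataLoader of (image, label) pairs."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.numel()
    return correct / total

# internal_acc = accuracy(model, internal_test_loader)   # same source as training data
# external_acc = accuracy(model, external_test_loader)   # new hospital system
# generalization_gap = internal_acc - external_acc       # a large gap suggests shortcuts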

Their analysis showed that once the models were removed from their original setting, where they performed well, they floundered: in outside environments, accuracy fell by half. DeGrave and Janizek’s team called this the “generalization gap,” asserting that confounding factors were behind the models’ predictive success on the initial dataset.

The team also applied explainable AI techniques in an attempt to pinpoint the image features that played the largest role in each model’s predictions. They saw similar results when they tried to reduce confounding by training the models on a second dataset in which both COVID-19-positive and COVID-19-negative images were pulled from similar sources.
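One common family of explainable AI techniques is attribution, or saliency, mapping, which highlights the pixels that most influence a prediction. The study applied its own suite of attribution methods; the sketch below shows only a simple gradient-based variant, with a hypothetical trained model and a single preprocessed X-ray tensor assumed as inputs.

# Gradient-based saliency map: |d(predicted-class score) / d(pixel)| for one X-ray.
# `model` and `xray` (a preprocessed (C, H, W) tensor with no gradient history) are
# assumed to exist; this is an illustrative stand-in, not the study's method.
import torch

def saliency_map(model, xray):
    model.eval()
    x = xray.unsqueeze(0).requires_grad_(True)      # add batch dimension, track gradients
    logits = model(x)
    pred_class = logits.argmax(dim=1).item()
    logits[0, pred_class].backward()                # gradient of the top score w.r.t. pixels
    return x.grad.abs().squeeze(0)                  # per-pixel attribution map

If the brightest regions of such a map fall on image borders, burned-in text markers or patient positioning cues rather than the lungs, that is a warning sign that the model is keying on dataset artifacts instead of pathology.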

These results, the team said, shed light on the degree to which high-performance medical AI systems can depend on undesirable, unreliable shortcuts for disease detection. They also contradict the belief that confounding isn’t as much of a danger when datasets come from similar sources. Fortunately, DeGrave said, it is unlikely that the models tested in this study were deployed clinically, largely because providers still use RT-PCR for COVID-19 diagnosis rather than chest X-rays.

Overall, said senior author Su-In Lee, Ph.D., associate professor in the Allen School, the findings from this study underscore the fundamental role explainable AI will play in ensuring the safety and efficacy of these models in medical decision-making.

Janizek agreed.

“Our findings point to the importance of applying explainable AI techniques to rigorously audit medical AI systems,” he said. “If you look at a handful of X-rays, the AI system might appear to behave well. Problems only become clear once you look at many images. Until we have methods to more efficiently audit these systems using a greater sample size, a more systematic application of explainable AI could help researchers avoid some of the pitfalls we identified with the COVID-19 models.”
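As a rough illustration of the many-image audit Janizek describes, one could average attribution maps over an entire test set and check whether attention consistently lands outside the lungs. The sketch below reuses the saliency_map helper above, with a hypothetical test loader; it is an assumption about how such an audit might look, not the study’s procedure.

# Average per-pixel attribution over many X-rays to look for systematic shortcuts.
# `saliency_map` is the helper sketched earlier; `loader` is a hypothetical DataLoader.
def mean_saliency(model, loader, device="cpu"):
    total, count = None, 0
    for images, _labels in loader:
        for xray in images:
            s = saliency_map(model, xray.to(device))
            total = s if total is None else total + s
            count += 1
    return total / count  # bright bands at edges or over text suggest shortcut learning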

