Commentary | Videos | March 14, 2026

Pertinent Insights on Evaluating the Value of AI Models in Radiology, Part 1

Author(s): Jeff Hall

In the first of a two-part interview, David Larson, M.D., M.B.A., and Jason Poff, M.D., discussed a new study examining the use of a structured pre-deployment method for evaluating the value of a portfolio of AI models for radiology.

Recognizing the challenges with assessing the capabilities of artificial intelligence (AI) models to help improve efficiency and enhance detection in radiology, researchers recently developed and evaluated the use of a structured pre-deployment analysis for ascertaining the adjunctive value of a portfolio of 13 AI models.

For the prospective study, recently published in the American Journal of Roentgenology, the researchers used the structured analysis to evaluate the value of various Aidoc AI triage models for conditions ranging from incidental pulmonary embolism (PE) to intracranial hemorrhage. While the study involved assessment of imaging from 88,645 exams over a two-year period, study co-author Jason Poff, M.D., emphasized in a recent Diagnostic Imaging interview that the principles of the pre-deployment evaluation method can be applied at a smaller scale.

“We often found that we had a very good sense of an AI model, just by looking at 10, 15, 20 real patient experiences, and if you focus on, in particular, those examples where the AI says one thing, but the radiologist says another. That is really the sweet spot for telling you where this tool is going to benefit you and your patients, and where you might need to be cautious,” noted Dr. Poff, the director of clinical AI for Radiology Partners, who is in private practice in Greensboro, N.C.

To bolster the evaluation of the AI models, the researchers used metrics such as the enhanced detection rate and the gain to pain ratio (GPR), which assesses the tradeoff between improved detection and the possibility of increased false positives with adjunctive AI.

David Larson, M.D., M.B.A., the lead author of the study, said the GPR evaluates the trustworthiness of a given AI model for radiologists.

“It’s a metric of trust, both acceptance and trust in general. If you are getting all kinds of false positives (with an AI model), chances are that it's not going to help you very much so you have a low gain to pain ratio. Not only will (radiologists) not enjoy (the AI model), they won't trust it and that's part of the burden,” said Dr. Larson, the executive vice chair of the Department of Radiology at the Stanford University School of Medicine and medical director of performance improvement at Stanford Health Care.

(Editor’s note: For related content, see “Assessing the Potential Impact of Agentic AI in Radiology: An Interview with Nina Kottler, MD,” “FDA Clears CT-Based AI Triage Platform from Aidoc” and “Can a Next-Generation Diagnostic Viewer Improve Efficiency with Radiology Workflows?”)

