Regardless of experience level, radiologists are likely to be affected by automation bias when utilizing adjunctive artificial intelligence (AI) for mammography interpretation, according to newly published research.
A new prospective study found a significant decline in correct Breast Imaging Reporting and Data System (BI-RADS) assessments by radiologists of all experience levels when a purported AI modality suggested an incorrect BI-RADS category, leading the authors to conclude that “all radiologists … can be subject to automation bias.”
In the study, recently published in Radiology, 27 radiologists interpreted 50 mammograms with a purported adjunctive AI system while researchers assessed radiologist performance, the degree of bias in BI-RADS scoring, and radiologists’ confidence in their own BI-RADS assessments. Eleven radiologists were classified as inexperienced (a mean of five months of experience interpreting mammograms), 11 as moderately experienced (a mean of 13 months) and five as very experienced (a mean of 129.6 months).
When the AI system suggested the correct BI-RADS category, the researchers saw “no significant difference” in assessments among inexperienced radiologists (a mean of 79 percent correct BI-RADS assessments), moderately experienced radiologists (81.3 percent) and very experienced radiologists (82.3 percent).
However, correct radiologist BI-RADS scoring declined significantly when the AI system suggested an incorrect BI-RADS category: a nearly 60 percentage-point decrease for inexperienced radiologists (19.8 percent correct), over a 56 percentage-point decrease for moderately experienced radiologists (24.8 percent correct) and over a 36 percentage-point decline for very experienced radiologists (45.5 percent correct).
“Inexperienced, moderately experienced, and very experienced radiologists were worse at assigning the correct (BI-RADS) scores for cases in which the purported AI suggested an incorrect BI-RADS category. These results suggest that all radiologists, regardless of expertise, can be subject to automation bias,” wrote lead study author Thomas Dratsch, M.D., who is affiliated with the Institute of Diagnostic and Interventional Radiology at the University of Cologne in Germany, and colleagues.
The researchers also noted that, on a 10-point Likert scale, inexperienced radiologists rated the accuracy of the AI system at a median of nine and their own BI-RADS assessment skills at a median of two.
“ … Inexperienced radiologists were less confident in their own BI-RADS ratings compared with moderately and very experienced readers, which may potentially make them more vulnerable to following incorrect suggestions by AI,” suggested Dratsch and colleagues.
In an accompanying editorial, Pascal A.T. Baltzer, M.D., Ph.D., said the research from Dratsch and colleagues “highlights the need for caution when implementing AI-assisted breast imaging without proper training and knowledge of its reliability.”
To mitigate possible automation bias in breast imaging, Dr. Baltzer emphasized ongoing training, performance benchmarking and continuous feedback for radiologists. Ensuring appropriate validation and transparency with adjunctive AI is also critical, according to Dr. Baltzer, a consultant radiologist in breast imaging with the Department of Biomedical Imaging and Image-guided Therapy at the Medical University of Vienna and executive board member of the European Society of Breast Imaging.
Regarding study limitations, the researchers acknowledged that they did not compare adjunctive use of the AI-based system with radiologist performance without AI. Dratsch and colleagues also noted that they did not consider other factors that may have contributed to radiologist assessments and focused only on mammograms with standard BI-RADS ratings.