New research demonstrates that artificial intelligence (AI) significantly enhances overall sensitivity for breast cancer detection on traditional screening mammography exams, including cases involving interval and future breast cancers.
For the retrospective study, recently published in Lancet Digital Health, researchers compared single radiologist reading, double radiologist reading, stand-alone AI assessment (Transpara version 1.7.0, ScreenPoint Medical) and the combination of AI with single radiologist evaluation in a cohort of 42,100 women who had a total of 42,236 2D screening mammography exams.
The study authors found that adjunctive AI offered the highest sensitivity (60.2 percent) in comparison to double reading by radiologists (51.7 percent), stand-alone AI (48.6 percent) and single radiologist assessment (46.9 percent). Adjunctive AI also provided specificity (95.8 percent) comparable to that of double reading (97.7 percent), single radiologist interpretation (97.7 percent) and stand-alone AI (97.8 percent), according to the researchers.
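To make the trade-off concrete, the reported figures can be compared as percentage-point gaps. The short Python sketch below is only an illustration: it uses the sensitivity and specificity values quoted above, and the dictionary layout and the `gaps_vs` helper are hypothetical names introduced here, not part of the study.

```python
# Sensitivity and specificity (percent) as reported in the article.
# The structure and helper name below are illustrative assumptions.
reported = {
    "adjunctive AI": {"sensitivity": 60.2, "specificity": 95.8},
    "double reading": {"sensitivity": 51.7, "specificity": 97.7},
    "single radiologist": {"sensitivity": 46.9, "specificity": 97.7},
    "stand-alone AI": {"sensitivity": 48.6, "specificity": 97.8},
}


def gaps_vs(reference, metric):
    """Percentage-point gap between the reference strategy and each other strategy."""
    ref_value = reported[reference][metric]
    return {
        name: round(ref_value - values[metric], 1)
        for name, values in reported.items()
        if name != reference
    }


sensitivity_gaps = gaps_vs("adjunctive AI", "sensitivity")
specificity_gaps = gaps_vs("adjunctive AI", "specificity")

for name in sensitivity_gaps:
    print(f"adjunctive AI vs {name}: "
          f"sensitivity {sensitivity_gaps[name]:+.1f} points, "
          f"specificity {specificity_gaps[name]:+.1f} points")
```

On these figures, adjunctive AI is 8.5 percentage points more sensitive than double reading and 1.9 percentage points less specific.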
“This study shows that AI detection of breast cancer in population-based mammography screening is comparable with double human reading. AI misses some breast cancers that are recalled by human-assessment but detects a similar number of breast cancers otherwise missed by the interpreting radiologists,” wrote lead study author Suzanne L. van Winkel, MSc, who is affiliated with the Medical Imaging Department at Radboud University Nijmegen Medical Center in Nijmegen, the Netherlands, and colleagues.
Among the true-positive cases, interval cancers accounted for 8.3 percent of cases detected with adjunctive AI and 9.9 percent of cases detected with stand-alone AI in comparison to 1.8 percent for single radiologist interpretation and 1.3 percent for double reading, according to the researchers.
The study authors also pointed out larger percentages of future breast cancer detection with adjunctive AI (10.9 percent) and stand-alone AI (11.7 percent) in contrast to 2.2 percent for single radiologist evaluation and 1.7 percent for double reading.
Three Key Takeaways
- Adjunctive AI improves sensitivity. Combining AI with a single radiologist significantly increased breast cancer detection sensitivity (60.2 percent) compared to double reading (51.7 percent) or a single radiologist alone (46.9 percent), while maintaining comparable specificity.
- Earlier detection of interval and future cancers. Adjunctive AI identified a higher proportion of interval cancers (8.3 percent) and future cancers (10.9 percent) than radiologists, suggesting the potential to detect clinically relevant cancers earlier.
- Detection of higher-risk tumors. Among AI-detected cases missed by radiologists, over a quarter were invasive cancers and 16.6 percent involved tumors larger than 20 mm, underscoring AI’s potential value in catching more aggressive disease.
Examining AI detection of interval cancer and future breast cancer cases missed by radiologists, the researchers noted that 26.7 percent involved invasive breast cancer and 16.6 percent had tumors larger than 20 mm in diameter.
“These findings suggest that a substantial number of women who were later diagnosed with interval cancers or future breast cancers might have benefited from earlier AI detection and highlight that extended follow-up might demonstrate that AI recalls, initially deemed to be false positives due to lack of agreement with human readers, can ultimately prove clinically relevant,” emphasized van Winkel and colleagues.
Regarding study limitations, the authors acknowledged incomplete or unavailable data on tumor characteristics and suggested that the second reviewing radiologist may have been aware of the initial reviewer’s results. The researchers also acknowledged the use of a single AI software product and noted that the recall rate for double reading was derived from consensus decisions by both reviewing radiologists.