In six reader studies on digital mammography, stand-alone artificial intelligence (AI) had a pooled sensitivity of 80.8 percent versus 72.4 percent for radiologist assessment. In seven historic cohort studies, pooled sensitivity was 75.8 percent for stand-alone AI versus 72.6 percent for radiologist interpretation of digital mammography.
A new meta-analysis of over 1.1 million breast cancer screening exams in over 497,000 women suggests that stand-alone artificial intelligence (AI) may be as effective as, or more effective than, radiologists in assessing digital mammography.
For the meta-analysis, recently published in Radiology, researchers reviewed 13 studies (including six reader studies and seven historic cohort studies) comparing AI and radiologist interpretation of digital mammography as well as four studies comparing AI assessment to digital breast tomosynthesis (DBT). The researchers analyzed data from a total of 1,108,328 breast screening examinations in 497,091 women, according to the study.
Overall, across the reviewed studies, the researchers found a pooled sensitivity rate of 80.6 percent for stand-alone AI in comparison to 73.6 percent for radiologist assessment. Overall pooled specificity rates were 89.6 percent for radiologists and 85.7 percent for stand-alone AI, according to the study.
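Sensitivity and specificity, the two metrics pooled throughout the meta-analysis, are simple ratios of screening outcomes. The sketch below uses hypothetical counts (not data from the study) chosen only so the results mirror the pooled percentages reported above:

```python
# Hypothetical counts, for illustration only -- the study reports pooled
# percentages, not raw confusion-matrix counts.

def sensitivity(true_pos, false_neg):
    """Fraction of exams with cancer that were correctly flagged."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of cancer-free exams that were correctly cleared."""
    return true_neg / (true_neg + false_pos)

# e.g. 806 of 1,000 cancers flagged -> 80.6 percent sensitivity (the pooled
# AI figure above, matched here only by construction of this toy example)
ai_sensitivity = sensitivity(806, 194)      # 0.806
radiologist_specificity = specificity(896, 104)  # 0.896
```

Note that these two rates move independently: an AI system can gain sensitivity (fewer missed cancers) while losing specificity (more false recalls), which is exactly the pattern reported in the pooled estimates.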
For the six reader studies on digital mammography, the study authors noted that stand-alone AI had a pooled sensitivity 8.4 percentage points higher than that of radiologists (80.8 percent vs. 72.4 percent) and a specificity 4.5 percentage points lower (76.9 percent vs. 81.6 percent). In the seven historic cohort studies of stand-alone AI and digital mammography, the researchers noted slightly higher pooled sensitivity for stand-alone AI (75.8 percent vs. 72.6 percent for radiologists) and nearly equivalent pooled specificity (95.6 percent vs. 96.4 percent for radiologists).
“Our meta-analysis of studies on the stand-alone performances of artificial intelligence (AI) for interpretation of digital mammography … shows that current algorithms perform on par with, if not better than, the average performance of breast radiologists,” wrote meta-analysis co-author Ritse M. Mann, M.D., Ph.D., who is affiliated with the Department of Medical Imaging at Radboud University Medical Center in Nijmegen and the Department of Radiology at the Netherlands Cancer Institute in Amsterdam, the Netherlands, and colleagues.
The study authors emphasized that the small number of DBT studies (four) included in the meta-analysis was not sufficient to draw concrete conclusions about stand-alone AI in DBT interpretation. However, in the pooled estimates from these studies, the researchers did note that stand-alone AI had an overall 11 percent higher area under the receiver operating characteristic curve (AUC) and a nearly 11 percent higher sensitivity rate as well as an 18.5 percent lower specificity rate in comparison to radiologists.
Pointing out that acceptable recall rates range from 2 to 3 percent in northern Europe to as high as 12 percent in the United States, the study authors noted that differences in the AI operating thresholds chosen for sensitivity and specificity can contribute to the variability among the reviewed studies.
“Setting a threshold for AI is an important choice when implementing AI in practice, and the optimum may be different for different use cases,” maintained Mann and colleagues. “For instance, an AI threshold at high sensitivity might be used when AI is used for triaging when all recalls are still performed by humans whereas a threshold for higher specificity may be essential when AI is used as an additional (or independent) reader.”
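The trade-off the authors describe can be sketched directly: moving the operating threshold on an AI system's suspicion score shifts performance between the two use cases they name. The scores and labels below are synthetic, chosen only to make the trade-off visible; real systems output vendor-specific scores:

```python
# Illustrative sketch (not from the study): how the operating threshold on an
# AI suspicion score trades sensitivity against specificity.

def recall_rates(scores, labels, threshold):
    """Return (sensitivity, specificity) when exams scoring >= threshold are recalled."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Synthetic data: label 1 = cancer, 0 = cancer-free
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.3, 0.2, 0.1, 0.1]

# Low threshold: triage-style operating point -- misses nothing, recalls more
sens_low, spec_low = recall_rates(scores, labels, 0.25)    # (1.0, 0.5)
# High threshold: independent-reader-style operating point -- fewer false recalls
sens_high, spec_high = recall_rates(scores, labels, 0.65)  # (0.5, ~0.83)
```

Lowering the threshold can only raise (or preserve) sensitivity and lower (or preserve) specificity, which is why, as the authors note, the optimal operating point depends on whether AI triages exams for human review or acts as an additional reader.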
In regard to study limitations, the researchers noted that AI assessments had no effect on actual radiology workflows, as all of the reviewed studies were non-interventional. Mann and colleagues also acknowledged a lack of access to individual patient data from the reviewed studies. While the meta-analysis included AI systems from multiple companies, the authors said the meta-regression analysis did not permit assessment of individual AI systems. The researchers also acknowledged that AI vendors participated in the study design for seven of the 16 studies reviewed in the meta-analysis and that predefined thresholds from AI vendors were utilized in three of the studies.