For DBT breast cancer screening, 47 percent of false positives flagged only by radiologists involved masses whereas 40 percent of false positives flagged only by AI involved benign calcifications, according to research presented at the recent American Roentgen Ray Society (ARRS) conference.
While a recent study revealed a 10 percent false positive rate for both AI software and unassisted radiologist assessment of digital breast tomosynthesis (DBT) exams, there were significant differences in the nature of the false positive findings, according to a poster presentation at the American Roentgen Ray Society (ARRS) conference.
For the retrospective study, researchers reviewed data from the use of the AI software (Transpara v1.7.1, ScreenPoint Medical) for 3,183 DBT screening exams to compare false positive findings between the AI software and radiologists. The study authors acknowledged differences between the AI false positive and radiologist false positive cohorts with respect to mean patient age (60 vs. 53).
For the 304 false positive cases flagged only by the AI software, 40 percent involved benign calcifications, 13 percent involved asymmetries and 12 percent represented benign post-surgical changes.1
Benign calcifications accounted for 40 percent of the findings flagged only by AI software in a recent study comparing false positives of radiologists and AI in digital breast tomosynthesis (DBT) screening. Examples of benign calcifications only flagged by the AI software (shown above) include dystrophic, round calcifications (A), prominent skin calcifications (B) and very prominent vascular calcifications (C). (Images courtesy of ARRS.)
Of the 308 false positive findings flagged only by radiologists, the study authors noted that masses were involved in 47 percent of cases, followed by asymmetries (19 percent) and indeterminate calcifications (15 percent).1
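As a rough check on the roughly 10 percent false positive rate cited for each reader type, the reported counts can be divided by the number of screening exams. The short Python sketch below does this under one assumption that the poster summary does not spell out: that the rate is calculated over all 3,183 exams and is based on the AI-only and radiologist-only counts. The constant names and the `share` helper are illustrative only.

```python
# Counts as reported in the article.
TOTAL_EXAMS = 3183          # DBT screening exams reviewed
AI_ONLY_FP = 304            # false positives flagged only by the AI software
RADIOLOGIST_ONLY_FP = 308   # false positives flagged only by radiologists


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Assumption: the denominator is the full screening cohort.
ai_only_pct = share(AI_ONLY_FP, TOTAL_EXAMS)
radiologist_only_pct = share(RADIOLOGIST_ONLY_FP, TOTAL_EXAMS)

print(f"AI-only false positives: {ai_only_pct} percent of exams")
print(f"Radiologist-only false positives: {radiologist_only_pct} percent of exams")
```

Under that assumption, both figures come out just under 10 percent, which is consistent with the rounded rate reported for AI and for unassisted radiologists.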
“ … AI was more likely to flag benign calcifications, asymmetries and benign post-surgical changes, and these findings (occurred) more than 50 percent of the time … compared to the radiologists who tended to flag masses, asymmetries and indeterminate calcifications more often,” noted lead study author Tara Shahrvini, an MD/MBA candidate at the David Geffen School of Medicine at the University of California-Los Angeles (UCLA), and colleagues.
The researchers noted that Asian women (16 percent vs. 9 percent) and African American women (14 percent vs. 8 percent) accounted for higher percentages of false positives in the AI cohort than in the unassisted radiologist cohort.1
In contrast, reviewing radiologists had higher percentages of false positives in women with dense breasts, with the study authors citing a 37 percent false positive rate in BI-RADS category C cases (vs. 22 percent for the AI software) and a 14 percent false positive rate in BI-RADS category D cases (vs. 5 percent for AI).1
In cases that were flagged by both AI and unassisted radiologists, the researchers pointed out a 39 percent rate of biopsy recommendations and pathology-confirmed high-risk lesions in 44 percent of those cases. However, they also noted that overlapping findings between AI and unassisted radiologist interpretation occurred in only 1.4 percent of the larger DBT screening cohort.1
“Given the minimal overlap between AI and radiologist FPs, these findings suggest the potential for a synergistic interpretation by both AI and radiologists to decrease the recall rate in real-world practice,” maintained Shahrvini and colleagues.
Reference
1. Shahrvini T, Wood EJ, Joines MM, et al. Radiologist versus artificial intelligence false positives in digital breast tomosynthesis. Presented at the American Roentgen Ray Society (ARRS) conference, April 27-May 1, 2025, San Diego. Available at: https://www2.arrs.org/am25/. Accessed May 7, 2025.