In a prospective study of over 55,000 women who had screening mammography, researchers found that double reading by a radiologist and artificial intelligence (AI) was non-inferior to double reading by two radiologists in detecting breast cancer.
Can artificial intelligence (AI) provide a viable alternative for second reading of screening mammography?
In a new prospective study, recently published in The Lancet Digital Health, researchers reviewed data from 55,581 women who had mammography screening from April 1, 2021 to June 9, 2022 at a Swedish hospital in order to assess the capabilities of AI (Lunit Insight MMG version 1.1.6, Lunit) as a second reader for mammography exams. The study authors compared double reading by two radiologists to double reading by one radiologist and AI, single reading by AI alone, and triple reading by two radiologists and AI.
According to the study, mammography findings were deemed abnormal for 6,002 women, with 1,716 women recalled for further investigation after consensus discussions. Of the 269 cases of diagnosed breast cancer, the researchers noted that 200 patients had invasive breast cancer and 63 women had ductal carcinoma in situ.
The study authors found that double reading by two radiologists diagnosed 250 cases of breast cancer in comparison to 261 cases of breast cancer detected by double reading with one radiologist and AI. Single AI reading diagnosed 246 of the breast cancer cases and was deemed non-inferior to double radiologist reading. Triple reading (with two radiologists and AI) detected breast cancer in 269 cases and was deemed superior by the researchers to the double reading by two radiologists.
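As a rough back-of-the-envelope comparison (not the study's own statistical analysis, which used prespecified non-inferiority margins), the reported detection counts can be put side by side like this; the dictionary and labels below are illustrative, built only from the figures in the article:

```python
# Cancer detection counts reported in the article, out of 269 total
# diagnosed cancers (triple reading detected all of them).
TOTAL_CANCERS = 269
detected = {
    "two radiologists": 250,
    "one radiologist + AI": 261,
    "AI alone": 246,
    "two radiologists + AI": 269,
}

# Express each strategy relative to the standard double reading
# by two radiologists, the study's comparator arm.
baseline = detected["two radiologists"]
for strategy, n in detected.items():
    print(f"{strategy}: {n}/{TOTAL_CANCERS} cancers detected, "
          f"{n / baseline:.3f}x the comparator")
```

On these raw counts, one radiologist plus AI detected about 4 percent more cancers than two radiologists, and AI alone about 1.6 percent fewer, which is consistent with the non-inferiority findings described above.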
The use of AI in double mammography reading led to a 21 percent increase in abnormal findings, according to the study authors. However, they pointed out that subsequent consensus discussions, which took medical history into account with review of mammography and AI findings, reduced the recall rate by 4 percent in comparison to double reading by two radiologists.
“Thus, the consensus discussion was effective in ensuring that the higher abnormal interpretation rate for AI plus one radiologist did not translate into an increased recall rate. … In a screening population of 100,000 women, replacing one radiologist with AI would save 100,000 radiologist reads while increasing consensus discussions by 1,562. Even if the consensus discussions would take five times longer than an independent read, the workload reduction would be considerable,” wrote lead study author Karin Dembrower, M.D., the head physician in the Department of Breast Radiology at Capio Sankt Görans Hospital in Stockholm, Sweden, and colleagues.
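The workload arithmetic in the quote above can be sketched as follows. This is a minimal illustration of the authors' own figures; the five-fold time cost per consensus discussion is their hypothetical assumption, not a measured value:

```python
# Back-of-the-envelope workload estimate from the figures quoted
# by the study authors for a screening population of 100,000 women.
women_screened = 100_000
reads_saved = women_screened      # one radiologist read per woman replaced by AI
extra_consensus = 1_562           # additional consensus discussions reported
consensus_cost_in_reads = 5       # assumption: a discussion takes 5x a read

net_saving = reads_saved - extra_consensus * consensus_cost_in_reads
print(f"Net workload saving: {net_saving:,} read-equivalents")
# 100,000 - (1,562 * 5) = 92,190 read-equivalents
```

Even under that pessimistic assumption about discussion length, the net saving remains over 90 percent of the reads replaced, which is the "considerable" reduction the authors describe.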
The researchers also noted that the 11 radiologists who participated in the study had a median experience of 17 years.
While the triple reading approach had a slightly higher detection rate for breast cancer in comparison to double reading by two radiologists, the study authors noted that it came at the cost of 50 percent more consensus discussions and a 5 percent higher recall rate.
“The additional cost in terms of workload for radiologists and worry for women must be weighed against the incremental increase in cancer detection,” cautioned Dembrower and colleagues.
In regard to study limitations, the researchers acknowledged that basing the threshold for AI abnormality detection on data from retrospective studies may not be optimal, and that subsequent calibration may be necessary to obtain a viable abnormality threshold in clinical practice. The single-arm paired design of the study also prevented comparison of interval breast cancer rates between the different reader strategies assessed, according to Dembrower and colleagues.