AI Algorithm Matches Accuracy of Average Radiologist

August 27, 2020

Artificial intelligence algorithm can identify the same percentage of women with breast cancer as most radiologists.

Artificial intelligence (AI) – at least one algorithm – is ready to pinpoint which women have breast cancer without additional oversight or interpretation needed from the radiologist. That’s according to one research team that compared the performance of several AI tools.

In a study published today in JAMA Oncology, investigators from Karolinska Institutet in Sweden tested and compared the accuracy of three AI algorithms designed to identify breast cancer based on previously captured mammograms. According to study author Fredrik Strand, a radiologist and researcher in the oncology-pathology department, this is the first independent comparison of multiple AI algorithms.

“We conducted the study in order to find out how far the algorithms have developed and whether there is any difference between the available systems,” he said. “The results show that, in principle, the best algorithm is ready for use and that there is a significant difference between the various algorithms on the market.”

In particular, he explained, one algorithm rose above the rest, equaling the accuracy of the average radiologist.

For this study, the team examined mammograms from 8,805 women between the ages of 40 and 74 who completed breast cancer screening between 2008 and 2015. Of that group, 739 received a breast cancer diagnosis either at the time they were screened or within 12 months.

Based on their results, the most effective algorithm, which was developed using 72,000 cancer images and 680,000 normal images, correctly diagnosed the same percentage of women as the average practicing radiologist, he said. According to their analysis, the algorithm had 81.9-percent sensitivity, 96.6-percent specificity, and an area under the curve of 0.956 for the detection of cancers at screening or within 12 months. The team also simulated an operating point that corresponded to 88.6-percent sensitivity and 88.9-percent specificity, results that compared favorably with the U.S. Breast Cancer Surveillance Consortium benchmarks of 86.9-percent sensitivity and 88.9-percent specificity.

Screening Performance Benchmarks for AI Algorithms

Benchmark     Algorithm 1    Algorithm 2    Algorithm 3
Specificity   96.6 percent   96.6 percent   96.7 percent
Sensitivity   81.9 percent   67.0 percent   67.4 percent
Accuracy      96.5 percent   96.4 percent   96.5 percent
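
The sensitivity and specificity figures listed above differ sharply between algorithms while the accuracy figures barely move. One way to see why is that, when cancers are rare among the women screened, overall accuracy is driven almost entirely by specificity. The short Python sketch below illustrates this with the reported sensitivity and specificity values. The 0.7-percent prevalence is a hypothetical figure chosen for illustration, not a number from the article, and the function and variable names are likewise made up for this example; it is not a reconstruction of the study's own calculation.

```python
def accuracy(sensitivity, specificity, prevalence):
    """Overall accuracy as a prevalence-weighted mix of sensitivity and specificity.

    Sensitivity applies to the fraction of women with cancer (prevalence);
    specificity applies to everyone else.
    """
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# (sensitivity, specificity) pairs as reported for the three algorithms.
ALGORITHMS = {
    "Algorithm 1": (0.819, 0.966),
    "Algorithm 2": (0.670, 0.966),
    "Algorithm 3": (0.674, 0.967),
}

# ASSUMPTION: a hypothetical cancer prevalence of 0.7 percent among screened
# women. This value is not stated in the article.
ASSUMED_PREVALENCE = 0.007


def summarize(prevalence=ASSUMED_PREVALENCE):
    """Return each algorithm's accuracy in percent at the given prevalence."""
    return {
        name: 100.0 * accuracy(sens, spec, prevalence)
        for name, (sens, spec) in ALGORITHMS.items()
    }


if __name__ == "__main__":
    for name, acc in summarize().items():
        print(f"{name}: {acc:.1f} percent accuracy")
```

Under that assumed prevalence, a gap of roughly 15 percentage points in sensitivity changes accuracy by only about a tenth of a point, which is why sensitivity and specificity are more informative than accuracy when comparing screening tools.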

Strand’s team also concluded that combining the interpretation of one radiologist with the highest-performing AI algorithm produced better results than combining two radiologists’ image evaluations.

The outcome of this study underscores the findings from another article Strand’s group published recently in The Lancet Digital Health that showed an AI algorithm could successfully sort mammography images into groups that require further radiologist attention and those that AI can accurately assess without overlooking cancers. The team now plans to investigate how AI can improve on the current breast screening system of having two radiologists interpret a mammogram and discuss any disagreements.

Massachusetts General Hospital director of breast imaging Constance Dobbins Lehman, M.D., Ph.D., pointed to the need for advancement in this area in an accompanying invited commentary. Applying the power of AI to categorizing screening mammograms quickly and effectively can open the door to more widely available and affordable breast cancer screenings for a much larger population of women.

“In the continued evolution of AI applied to improve human health, it is time to move beyond simulation and reader studies and enter the critical phase of rigorous, prospective clinical evaluation,” said Lehman, who is also a professor of radiology at Harvard Medical School. “The need is great and a more rapid pace of research in this domain can be partnered with safe, careful, and effective testing in prospective clinical trials. The authors are to be commended for providing data that support this next critical phase of discovery.”