Current Insights on AI, Breast Cancer Screening and the FDA

In a recently published article, researchers from Yale University discuss the pros and cons of current FDA regulations as they apply to the clearance and use of adjunctive artificial intelligence (AI) software with conventional breast cancer screening modalities such as mammography.

Is there enough scrutiny of artificial intelligence (AI) software prior to clearance by the Food and Drug Administration (FDA) for adjunctive use in breast cancer screening?

Despite the FDA clearance in recent years of several AI products to help identify suspicious breast lesions and facilitate mammography triage, researchers suggested in a recent review, published in JAMA Internal Medicine, that questions remain about data sources, clinical outcome measures and external validation.

Here are a few takeaways from their review of the research leading to FDA clearance for nine AI-related products for breast cancer screening between January 1, 2017, and December 31, 2021.

1. All of the clearances for the AI products were based on retrospective analysis of previously existing databases. Only six of the nine products had multicenter studies to support their use, and for four of the products, the supporting research lacked information about external validation, according to the review.

2. While sensitivity rates, specificity rates and area under the curve (AUC) were commonly reported performance measures for the AI products, the researchers found that none of the research for these products reported clinical outcome measures such as interval cancer detection or cancer stage at diagnosis.

3. Research for seven of the nine AI products relied on enriched data sets that may contain a higher number of true positive cases than data from a typical patient population that radiologists would see in practice, according to the researchers.

“Although they are convenient and increase statistical power, enriched data sets do not reflect the typical spectrum and prevalence of breast cancer and may produce an AUC different from what would be observed in typical conditions,” wrote study co-author Ilana B. Richman, M.D., MHS, an assistant professor at the Yale University School of Medicine, and colleagues.

4. The review authors also cautioned against overreliance on high sensitivity rates when assessing AI software. While enhanced sensitivity can facilitate earlier diagnosis of breast cancer in some cases, Richman and colleagues noted that it can also contribute to an increased number of false-positive tests and unnecessary biopsy procedures.
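The arithmetic behind points 3 and 4 can be sketched briefly. The calculation below is a minimal, hypothetical illustration — the sensitivity, specificity, and prevalence figures are assumptions chosen for clarity, not numbers from the review — showing why a test that looks strong on an enriched data set can still generate far more false positives than true positives at typical screening prevalence:

```python
# Hypothetical illustration of screening arithmetic; the numeric
# values below are assumptions, not figures from the JAMA review.

def screening_counts(n, prevalence, sensitivity, specificity):
    """Return (true positives, false positives) for a screened cohort."""
    cancers = n * prevalence
    healthy = n - cancers
    tp = cancers * sensitivity          # cancers correctly flagged
    fp = healthy * (1 - specificity)    # healthy patients incorrectly flagged
    return tp, fp

n = 10_000  # hypothetical cohort size

# Typical screening population: roughly 5 cancers per 1,000 exams (assumed).
tp, fp = screening_counts(n, prevalence=0.005, sensitivity=0.90, specificity=0.90)
ppv = tp / (tp + fp)  # positive predictive value

# Enriched data set: same test, assumed 30% cancer prevalence.
tp_e, fp_e = screening_counts(n, prevalence=0.30, sensitivity=0.90, specificity=0.90)
ppv_e = tp_e / (tp_e + fp_e)

print(f"Screening prevalence: {tp:.0f} TP vs {fp:.0f} FP, PPV = {ppv:.1%}")
print(f"Enriched data set:    {tp_e:.0f} TP vs {fp_e:.0f} FP, PPV = {ppv_e:.1%}")
```

Under these assumed numbers, the same 90%-sensitive, 90%-specific tool yields roughly 995 false positives for every 45 true positives in a typical screening cohort (PPV near 4%), while the enriched set makes it look far more precise — which is the spectrum and prevalence distortion the review authors describe.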

5. While acknowledging the successful use of convolutional neural networks for a variety of imaging assessment tools, Richman and colleagues pointed out that AI products may miss aggressive breast cancers and potentially overdiagnose “indolent breast cancers that may not affect women in their lifetimes.”

6. The study authors also emphasized the need for improved racial, ethnic, and clinical diversity in the training sets used to develop adjunctive AI algorithms for breast cancer screening. Richman and colleagues pointed out that the majority of the training sets utilized in the development of AI breast cancer screening algorithms were based on mammogram results from White women. The study authors also emphasized that training sets should reflect clinical diversity, such as the breast cancer risks associated with age and breast density.

7. To address the challenge of reviewing dynamic, adaptive AI algorithms within the FDA’s current regulatory framework, Richman and colleagues noted that the agency has proposed a voluntary Software Pre-Cert Pilot Program. The program could streamline AI product review by vetting the companies that develop AI software and their development processes, and by requiring post-clearance assessment of “real-world performance analytics.” However, Richman and colleagues raised the concern that this program may emphasize process assessment over initial assessment of the AI products themselves.

“ … Without trials of these products, we may implement technology based on evidence that incorporates bias and without a full understanding of the potential risks and benefits,” maintained Dr. Richman and colleagues. “Pragmatic trial designs, such as stepped wedge trials, point-of-care trials, or other designs emphasizing clinician- or practice-level randomization may address some clinical trial challenges.”