Deep learning models trained on a dataset lacking racial diversity could hinder the detection of pathology in underrepresented minority patients.
A study presented at the Radiological Society of North America (RSNA) 2021 Annual Meeting demonstrates the importance of using racially diverse datasets while training artificial intelligence (AI) systems to ensure fair outcomes.
“As the rapid development of deep learning in medicine continues, there are concerns of potential bias when interpreting radiological images,” the authors wrote. “As future medical AI systems are approved by regulators, it is crucial that model performance on different racial/ethnic groups is shared to ensure that safe and fair systems are being implemented.”
The findings were presented by Brandon Price, a medical student at Florida State University College of Medicine in Tallahassee.
Many studies have shown that deep learning systems can be subjective in how they interpret data. Bias is often introduced unintentionally through the training data, or a racial or ethnic group is undersampled, which can cause susceptible models to develop bias. In this study, the researchers investigated how a deep learning model trained on a dataset lacking racial diversity could impede the detection of pathology in underrepresented minority patients.
The researchers used a dataset of more than 300,000 chest X-ray images with 14 labeled findings. Because sample sizes for other races and ethnicities were low, only images of Black and White patients were included. One training dataset included only White patients, and the other comprised 26% Black and 74% White patients. The distribution of labeled findings was kept equal between the two datasets, and a DenseNet model was trained on each dataset 25 times. The area under the receiver operating characteristic curve (ROC-AUC) and sensitivity, at a specificity threshold of 0.75, were compared for each of the 14 labeled findings.
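The two metrics named above can be illustrated with a minimal sketch. The abstract does not describe the researchers' evaluation code, so everything here is an assumption: the function names `roc_auc` and `sensitivity_at_specificity` are hypothetical, the toy labels and scores are invented for illustration, and "a specificity threshold of 0.75" is read as choosing the lowest score cutoff that keeps specificity at or above 0.75.

```python
# Illustrative sketch only; not the study's evaluation code.
# All names and data below are hypothetical.

def roc_auc(labels, scores):
    """ROC-AUC as the probability that a random positive case
    scores higher than a random negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(labels, scores, target_specificity=0.75):
    """Highest sensitivity among cutoffs whose specificity is at least
    target_specificity. A case is called positive if score >= cutoff."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    best = 0.0
    # Try every observed score as a cutoff, plus one above all scores.
    for cutoff in sorted(set(scores)) + [float("inf")]:
        specificity = sum(n < cutoff for n in neg) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(p >= cutoff for p in pos) / len(pos)
            best = max(best, sensitivity)
    return best


# Toy example for a single finding: 1 = finding present, 0 = absent.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.5, 0.6, 0.3, 0.4, 0.7, 0.9]

print(roc_auc(labels, scores))                           # 0.75
print(sensitivity_at_specificity(labels, scores, 0.75))  # 0.5
```

In a setup like the one described, these two numbers would be computed per finding for each trained model on the test set, and the resulting values from the two training conditions would then be compared.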
Compared with the model trained on only White patients, the model trained on the diverse dataset achieved significantly better ROC-AUC performance in identifying six of the 14 labeled findings in a test dataset of only Black patients (P < 0.05). The model trained on the diverse dataset also showed significantly higher sensitivity for six of the 14 labeled findings on a test dataset of only Black patients (P < 0.05).
“As more AI systems are developed, it is imperative that they are fair and perform equally well with groups that have been historically underserved,” the authors wrote.