Study results show DCNN can use imaging biomarkers to predict sex, potentially confounding the accurate prediction of disease.
A deep convolutional neural network (DCNN) can differentiate between males and females on chest X-ray, according to new research.
Based on prior clinical experience, DCNNs have been shown to be biased against men or women when trained on datasets without balanced sex representation. Until now, though, it has been unknown whether the algorithms can use visual biomarkers beyond a radiologist's perception to accurately predict sex.
In a poster presented during the Society for Imaging Informatics in Medicine (SIIM) 2021 Virtual Annual Meeting, David Li from the University of Ottawa detailed the predictive efficacy of DCNNs trained on datasets that included equal numbers of males and females.
“DCNNs trained on two large chest X-ray datasets accurately predicted sex on internal and external test data with similar heatmaps across DCNN architectures and datasets,” he said. “These findings support the notion that DCNNs can leverage imaging biomarkers to predict sex and potentially confound the accurate prediction of disease on chest X-rays and contribute to biased models.”
To test the DCNNs' ability to predict sex, as well as to evaluate the visual biomarkers used, Li gathered chest X-ray data from the Stanford CheXpert and National Institutes of Health (NIH) ChestX-ray14 datasets, comprising 224,316 and 112,120 scans from two heterogeneous patient populations, respectively. Using random under-sampling, the data volume was reduced to 97,560 images balanced at 50 percent male and 50 percent female.
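The random under-sampling step described above can be sketched as follows. This is a minimal illustration of the general technique, not the study's actual code; the function name `undersample_balance` and the toy scan IDs are hypothetical.

```python
import random

def undersample_balance(items, labels, seed=42):
    """Randomly under-sample the majority class so every class appears
    in equal numbers -- a common way to reach a 50/50 sex balance
    before training. Returns a shuffled list of (item, label) pairs."""
    rng = random.Random(seed)
    by_class = {}
    for item, label in zip(items, labels):
        by_class.setdefault(label, []).append(item)
    n = min(len(group) for group in by_class.values())  # minority-class size
    balanced = []
    for label, group in by_class.items():
        balanced.extend((item, label) for item in rng.sample(group, n))
    rng.shuffle(balanced)
    return balanced

# Toy example: 6 "male" and 4 "female" scan IDs -> 4 of each after balancing.
scans = [f"scan{i}" for i in range(10)]
sexes = ["M"] * 6 + ["F"] * 4
balanced = undersample_balance(scans, sexes)
```

Under-sampling discards majority-class images rather than duplicating minority-class ones, which is why the combined pool shrinks from 336,436 scans to 97,560.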
Overall, the dataset was split into 70-percent training, 10-percent validation, and 20-percent test sets. Li used multiple DCNN architectures pre-trained on ImageNet – Inception-V3, ResNet-18, ResNet-50, and VGG-19 – for transfer learning. The models were also validated externally across datasets.
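A 70/10/20 split like the one reported can be sketched with the standard library alone. This is an illustrative sketch, not the study's pipeline; the function name `split_dataset` and the fixed seed are assumptions.

```python
import random

def split_dataset(items, fracs=(0.7, 0.1, 0.2), seed=0):
    """Shuffle, then partition into train/validation/test subsets
    using the stated proportions. A fixed seed keeps the split
    reproducible across runs."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(fracs[0] * n)
    n_val = int(fracs[1] * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# 1,000 placeholder image IDs split 700 / 100 / 200.
train, val, test = split_dataset(range(1000))
```

In practice such splits are usually done per patient rather than per image, so that scans from one patient never appear in both the training and test sets.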
According to his analysis, on the internal test set, DCNNs trained on both datasets reached an area under the curve (AUC) ranging from 0.98 to 0.99. External validation showed a peak cross-dataset performance of 0.94 for the VGG19-Stanford model and 0.95 for the InceptionV3-NIH model.
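The AUC metric quoted here has a simple interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch of that rank-based (Mann-Whitney) formulation, assuming binary labels and real-valued scores:

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic:
    the fraction of positive/negative pairs where the positive
    outscores the negative; ties count as half a win."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfectly separated scores give an AUC of 1.0; the study's models
# reached 0.98-0.99 internally, i.e. near-perfect ranking.
result = auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
```

An AUC of 0.5 would mean the model ranks cases no better than chance, so values of 0.94-0.99 indicate the sex signal in the images is strong and generalizes across datasets.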
Additionally, heatmaps showed similar attention areas across model architectures and datasets, localized to the mediastinal and upper rib regions as well as the lower chest and diaphragmatic regions.