May 15, 2026

MRI-Based AI Models Show Significant Decline in External Validation for Pre-Op Prediction Grading of HCC

In a new meta-analysis of MRI-based AI models for predicting high-grade hepatocellular carcinoma, researchers noted a 10 percentage point decline in AUC between internal validation and external validation.

While noting the potential of MRI-based AI models for pre-op evaluation of hepatocellular carcinoma (HCC) grading, a new meta-analysis showed declining sensitivity, specificity and AUC for these models in external validation assessments.

For the meta-analysis, recently published in the European Journal of Radiology, researchers compared internal and external validation of MRI-based AI models for prognostic HCC grading based on a review of 18 studies.

In internal validation testing, the meta-analysis authors noted that the MRI-based AI models offered a 78 percent pooled sensitivity, 80 percent specificity and an 85 percent AUC for predicting high-grade HCC.

However, external validation assessment revealed an eight percentage point decline in sensitivity (to 70 percent), a six percentage point decline in specificity (to 74 percent) and a 10 percentage point decline in AUC (to 75 percent), according to the researchers.
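The gaps between internal and external validation are absolute differences in percentage points, and the relative declines are somewhat larger. As a minimal sketch, the arithmetic can be checked in a few lines of Python using only the pooled figures quoted in this article; the `decline` helper and the dictionary names are illustrative and not part of the published analysis.

```python
# Pooled estimates as reported in the article, in percent.
internal = {"sensitivity": 78, "specificity": 80, "AUC": 85}
external = {"sensitivity": 70, "specificity": 74, "AUC": 75}


def decline(metric):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = internal[metric] - external[metric]
    relative = 100 * absolute / internal[metric]
    return absolute, round(relative, 1)


for metric in internal:
    points, percent = decline(metric)
    print(f"{metric}: down {points} points ({percent} percent relative decline)")
```

Under these figures, the 10-point AUC drop corresponds to a relative decline of roughly 12 percent from the internal validation value.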

“Our findings confirm that while MRI-AI models demonstrate favorable discriminatory ability within internal validation cohorts, their performance declines significantly in external validation, highlighting substantial challenges in cross-center generalization,” noted lead study author Langshan Yang, MD, who is affiliated with the Department of Hepatobiliary Surgery in the General Surgery Center at Zhujiang Hospital and Southern Medical University in Guangzhou, China, and colleagues.

The meta-analysis authors pointed out that deep learning models offered a pooled sensitivity 16 percentage points higher than that of machine learning models (88 percent vs. 72 percent).

Three Key Takeaways

• Limited generalizability of MRI-AI models. MRI-based AI models for HCC grading show good performance in internal validation (AUC 85 percent), but clinically meaningful drops in sensitivity, specificity, and AUC occur with external validation, underscoring challenges in cross-institutional reliability.

• Deep learning improves sensitivity but not overall robustness. Deep learning models demonstrate higher sensitivity than traditional machine learning (88 percent vs. 72 percent), likely due to better capture of intratumoral heterogeneity, but the lack of a significant specificity advantage and limited external validation data may restrict clinical adoption.

• Heterogeneity and study design limit clinical translation. Predominantly retrospective data, variable definitions of high-grade HCC, and inconsistent imaging/segmentation methods contributed to the heterogeneity of the reviewed studies, highlighting the need for standardization and prospective multicenter validation before routine clinical use of the MRI-based AI models.

“(Deep learning architectures) capture more subtle patterns of intratumoral heterogeneity, such as cellular density variations, nuclear atypia, and microvascular infiltration patterns, which are often difficult to quantify using traditional radiomic features,” added Yang and colleagues.

However, the researchers noted there was no significant difference in specificity between deep learning and machine learning models, and that only eight of the reviewed studies assessed deep learning models.

(Editor’s note: For related content, see “Meta-Analysis Examines MRI-Based AI for Predicting Microvascular Invasion in Hepatocellular Carcinoma,” “Study Suggests Merits of PSMA PET/MRI for Detecting HCC in LI-RADS 3 Cases” and “Multicenter Study Affirms Value of Updated AASLD Criteria for Surveillance of Hepatocellular Carcinoma.”)

Regarding limitations of the meta-analysis, the authors acknowledged that all of the reviewed studies were retrospective and that the definition of high-grade HCC varied across studies. The researchers also suggested that a lack of assessment of possible confounding factors, such as segmentation methodologies and image acquisition parameters, may have contributed to the significant heterogeneity among the included studies.


