Machine-learned models provided substantial gains in model sensitivity with slight loss of specificity.
A natural language processing (NLP) system identified lumbar spine findings in radiology reports, with its machine-learned models providing substantial gains in sensitivity over its rule-based models, according to a study published in the journal Academic Radiology.
Researchers from several states sought to evaluate an NLP system built with open-source tools for identification of lumbar spine imaging findings related to low back pain on magnetic resonance and x-ray radiology reports from four health systems.
The researchers used a limited data set sampled from lumbar spine imaging reports of a prospectively assembled cohort of adults. A total of 871 reports were randomly selected from 178,333 available reports; 413 were x-ray reports and 458 were MR reports.
Using standardized criteria, four spine experts annotated the presence of 26 findings, where 71 reports were annotated by all four experts and 800 were each annotated by two experts. The researchers calculated inter-rater agreement and finding prevalence from annotated data. The annotated data was randomly split into development (80%) and testing (20%) sets. The researchers developed an NLP system from both rule-based and machine-learned models. The system was validated using accuracy metrics such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUC).
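For readers unfamiliar with the agreement statistic, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators labeling the same reports for a single finding. It is not the study's code; the function name `cohens_kappa` and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's own label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels: 1 = finding present, 0 = finding absent.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 2))  # 0.58
```

In this toy case the annotators agree on 8 of 10 reports, but because about half of that agreement would be expected by chance, kappa comes out near 0.58.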
The results showed the multirater annotated dataset achieved inter-rater agreement of Cohen's kappa > 0.60 (substantial agreement) for 25 of 26 findings, with finding prevalence ranging from 3% to 89%. In the testing sample, rule-based and machine-learned predictions had comparable average specificity (0.97 and 0.95, respectively). The machine-learned approach had a higher average sensitivity (0.94, compared to 0.83 for rule-based) and a higher overall AUC (0.98, compared to 0.90 for rule-based).
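As a rough illustration of the metrics reported here, the sketch below computes sensitivity, specificity, and AUC for a single finding from reference labels and model scores. It is an assumption-laden toy example, not the researchers' evaluation code: the function names, the 0.5 decision threshold, and the data are all hypothetical, and the study reports averages across findings rather than a single-finding value.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = finding present)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (the Mann-Whitney formulation of ROC AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical reference-standard labels and model scores for one finding.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)           # 0.75 0.75
print(auc(y_true, scores))  # 0.9375
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can trade a little specificity for sensitivity and still post a higher AUC.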
The researchers concluded that their NLP system performed well in identifying the 26 lumbar spine findings, as benchmarked by reference-standard annotation by medical experts. Machine-learned models provided substantial gains in model sensitivity with slight loss of specificity, and overall higher AUC.