Pediatric Thyroid Nodules on Ultrasound: Deep Learning Model and TI-RADS Show Higher Sensitivity than Radiologist Assessment

The retrospective study of patients 21 years of age or younger found that a deep learning algorithm and use of the American College of Radiology’s Thyroid Imaging Reporting and Data System (TI-RADS) each had a sensitivity more than 26 percentage points higher than that of radiologist assessment for differentiating between malignant and benign thyroid nodules on ultrasound.

Emerging research suggests that employing a deep learning algorithm could be beneficial in differentiating between malignant and benign thyroid nodules on ultrasound in children and young adults.

In a recently published retrospective study in the American Journal of Roentgenology, researchers reviewed data from 139 patients 21 years of age or younger who had a thyroid nodule on ultrasound. Looking at the sensitivity and specificity of differentiating between malignant and benign thyroid nodules, the study authors subsequently compared independent radiologist assessment to use of the American College of Radiology’s Thyroid Imaging Reporting and Data System (TI-RADS) and a previously developed deep learning algorithm.

In contrast to a mean 58.3 percent sensitivity for independent impressions from radiologists, the study authors noted an 87.5 percent sensitivity for the deep learning algorithm and an 85.1 percent sensitivity for use of the TI-RADS classification.

“Given the deep learning algorithm’s relatively high sensitivity, the algorithm may be useful to help identify potentially malignant nodules for further radiologist evaluation at institutions that are currently relying solely on radiologists’ overall impression without the use of ACR TI-RADS,” wrote Maciej Mazurowski, Ph.D., an associate professor at Duke University and the scientific director of the Duke Center for Artificial Intelligence in Radiology, and colleagues.

However, the researchers noted that the specificity of the deep learning algorithm (36.1 percent) was significantly lower than that of the TI-RADS classification (mean of 50.6 percent) and that of the overall independent impressions from radiologists (mean of 79.9 percent). In light of these findings, further training of a deep learning system and subsequent validation in a cohort of children with thyroid nodules are necessary prior to clinical use of a deep learning model in this patient population, emphasized Mazurowski and colleagues.
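
To make the trade-off concrete, the reported figures can be lined up against the radiologist means. The short Python sketch below is only an illustration: it uses the percentages quoted in this article, and the names `reported` and `gap_vs_radiologists` are hypothetical helpers, not anything taken from the study.

```python
# Percentages as quoted in this article. The radiologist values and the
# TI-RADS specificity are reported as means across readers.
reported = {
    "radiologists": {"sensitivity": 58.3, "specificity": 79.9},
    "TI-RADS": {"sensitivity": 85.1, "specificity": 50.6},
    "deep learning": {"sensitivity": 87.5, "specificity": 36.1},
}


def gap_vs_radiologists(method, metric):
    """Percentage-point difference between a method and the radiologist mean."""
    return round(reported[method][metric] - reported["radiologists"][metric], 1)


for method in ("TI-RADS", "deep learning"):
    sens_gap = gap_vs_radiologists(method, "sensitivity")
    spec_gap = gap_vs_radiologists(method, "specificity")
    print(f"{method}: sensitivity {sens_gap:+.1f} points, "
          f"specificity {spec_gap:+.1f} points")
```

On these numbers, the sensitivity gain for each approach comes with a larger loss in specificity relative to radiologists, which is the trade-off the authors weigh in the remarks that follow.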

That said, the study authors note the potential of a deep learning model that could offer a more objective, reproducible approach to pediatric thyroid nodule assessment in settings where physicians with specific pediatric training are not available.

Mazurowski and colleagues also note that sensitivity in diagnosing thyroid nodules is particularly important in the pediatric population.

“In adults, high specificity in the diagnostic evaluation of thyroid nodules is important given the desire to limit unnecessary biopsies. However, in children, a thyroid cancer that is small or has low aggressiveness has a prolonged period to grow and/or metastasize. Thus, in children, high sensitivity is a priority,” maintained Mazurowski and colleagues.

In regard to study limitations, the authors acknowledged that the deep learning algorithm was trained with images of adult thyroid nodules. The researchers pointed out that the reviewing radiologists were only asked to provide their impressions of whether a thyroid nodule was malignant or benign and were not asked for any possible recommendations for fine needle aspiration (FNA) in cases of diagnostic uncertainty. Mazurowski and colleagues conceded that radiologist impressions as well as the deep learning algorithm assessments were based upon two static greyscale images for each nodule. The high malignancy rate of 40.3 percent may have been affected by selection bias and potential referral bias in a study setting at a tertiary care facility, according to the study authors.