Free Text Reporting Difficult for Machines to Interpret

March 22, 2018
Diagnostic Imaging Staff

Free text radiology reports vary considerably in length and are difficult to interpret as a result.

Free text reporting results in extensive variability in report length and report terms used and is difficult for machines to interpret, according to a study published in the Journal of the American College of Radiology.

Researchers from the Milton S. Hershey Medical Center in Hershey, Penn., sought to quantify the variability of language in free text reports of pulmonary embolus (PE) studies and to gauge how informative free text is for predicting a PE diagnosis, using machine learning as a proxy for human understanding.

The researchers evaluated 1,133 consecutive contrast-enhanced chest CT studies performed under a PE protocol. Commercial text-mining and predictive analytics software was used to parse and describe all report text and to generate a suite of machine learning rules that sought to predict the “gold standard” radiological diagnosis of PE.
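To make the idea of parsing and describing report text concrete, here is a minimal sketch of one descriptive measure: the number of reports in which each word appears. It is not the commercial software used in the study. The function name `document_frequency` and the three report snippets are hypothetical, invented only for illustration.

```python
import re
from collections import Counter


def document_frequency(reports):
    """Count, for each word, how many reports contain it at least once."""
    counts = Counter()
    for text in reports:
        # Lowercase the text and keep alphabetic tokens; the set ensures
        # each word is counted once per report.
        words = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(words)
    return counts


# Hypothetical report snippets (not taken from the study).
reports = [
    "No pulmonary embolus identified.",
    "Filling defect in the right lower lobe consistent with pulmonary embolus.",
    "No evidence of pulmonary embolism.",
]

df = document_frequency(reports)

# Words that appear in exactly one report.
singletons = sorted(word for word, n in df.items() if n == 1)
```

Words that land in `singletons` are the kind of one-off vocabulary that gives a rule-based learner little to generalize from.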

The results demonstrated extensive variation in the length of the findings and impression sections across the reports, and that variation was only marginally associated with a positive PE diagnosis. Term usage was markedly concentrated: 20 words were used in the findings section of 93 percent of the reports, while 896 of 2,296 distinct words were each used in only one report’s impression section. In the validation set, the machine learning rules had perfect sensitivity but imperfect specificity, a low positive predictive value of 73 percent, and a misclassification rate of 3 percent.
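The performance measures quoted above all derive from a two-by-two table of predicted versus actual diagnoses. The sketch below shows the standard definitions. The counts passed in are hypothetical and are not the study's data; they are chosen only to show how perfect sensitivity can coexist with a modest positive predictive value when positive cases are uncommon.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute standard diagnostic accuracy measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        # Share of true PE cases that the rules flagged as positive.
        "sensitivity": tp / (tp + fn),
        # Share of PE-negative cases that the rules flagged as negative.
        "specificity": tn / (tn + fp),
        # Share of positive predictions that were truly PE.
        "ppv": tp / (tp + fp),
        # Share of all reports classified incorrectly.
        "misclassification_rate": (fp + fn) / total,
    }


# Hypothetical counts (not from the study): no missed cases, two false alarms.
metrics = diagnostic_metrics(tp=8, fp=2, tn=90, fn=0)
```

With these illustrative counts, no positive case is missed, so sensitivity is 1.0, yet two false positives among ten positive predictions pull the positive predictive value down to 0.8 while the overall misclassification rate stays at 0.02.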

The researchers concluded that free text reporting was difficult for the machines to interpret due to the extensive variability of report length and number of terms used. They suggested that this presents potential difficulties for human recipients in fully understanding such reports. “These results support the prospective assessment of the impact of a fully structured report template with at least some mandatory discrete fields on ease of use of reports and their understanding,” they wrote.