A voice recognition system didn't fare well in a study by researchers from Thomas Jefferson University in Philadelphia that reviewed the number of errors on signed reports from attending radiologists. A spokesperson for the dictation company said the numbers can be misleading.
Dr. Ronald Dolin and colleagues reviewed 395 consecutive reports from 41 attending radiologists. All reports were generated using PowerScribe 4.7, which had been in use for about 16 months.
Researchers classified errors as significant if they altered or obscured the meaning of the sentence in which they appeared. Dictation errors were categorized into 10 subtypes including missing or extra words, wrong words, typographical or grammatical errors, nonsense phrases with unknown meaning, and inaccuracies in dictation date.
Investigators found 239 errors in 146 reports, with at least one error in reports from 40 of the 41 attending radiologists. Significant errors accounted for 17% of the total. Twenty-two radiologists had at least one significant error on a report, while five had two or more. Nearly 70% of insignificant errors involved wrong, missing, or extra words. Dolin concluded that periodic audits of reports may identify specific error patterns and prompt radiologists to adjust their dictation practices to prevent future significant errors.
"It's important to note that 249 out of the 395 reports had no errors," said Peter Durlach, senior vice president of healthcare marketing and product strategy at Nuance, which sells PowerScribe.
Durlach said the product has an accuracy rate of 97% and that Dolin's results actually reflect a 1.2% error rate at the word level. He estimated the average report length at 50 words, which puts the total word count for the 395 reports at 19,750. Dividing the 239 errors by that total yields an error rate of 1.2%.
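As a quick sanity check, the arithmetic behind both figures can be reproduced in a few lines of Python. Note that the 50-words-per-report number is Durlach's stated estimate, not a measured value from the study.

```python
# Back-of-the-envelope check of the word-level error rate Durlach describes.
reports = 395
avg_words_per_report = 50                      # Durlach's estimate, not measured
total_words = reports * avg_words_per_report   # 19,750 words

errors = 239
word_error_rate = errors / total_words
print(f"Word-level error rate: {word_error_rate:.1%}")        # ~1.2%

reports_with_errors = 146
report_error_rate = reports_with_errors / reports
print(f"Reports with at least one error: {report_error_rate:.0%}")  # ~37%
```

The same numbers explain the discrepancy Durlach describes: 37% of reports contained an error, but measured per word, the rate is roughly 1.2%.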
"Some accounts of the study reported a double-digit error rate," Durlach said. "Overall, Dolin found that 37% of the reports had at least one error. Some people mistakenly interpreted that as an overall 37% error rate."
Manufacturers of voice recognition software should implement a confidence threshold below which any uncertain word or phrase would be highlighted, said Dr. William Morrison, director of musculoskeletal imaging at TJU. He lamented that these systems lack even grammar checks, which could reduce errors.
"When we had humans doing this, they'd flag anything that went below a certain confidence level of understanding. You want the program to do the same," Morrison said.
Durlach said the company is aggressively working to improve its flagging algorithms.