Significant errors appear in voice recognition reports

June 1, 2007

A voice recognition system didn't fare well in a study by researchers from Thomas Jefferson University in Philadelphia that reviewed the number of errors on signed reports from attending radiologists. A spokesperson for the dictation company said the numbers can be misleading.

Dr. Ronald Dolin and colleagues reviewed 395 consecutive reports from 41 attending radiologists. All reports were generated using PowerScribe 4.7, which had been in use for about 16 months.

Researchers classified errors as significant if they altered or obscured the meaning of the sentence in which they appeared. Dictation errors were categorized into 10 subtypes including missing or extra words, wrong words, typographical or grammatical errors, nonsense phrases with unknown meaning, and inaccuracies in dictation date.

Investigators found 239 errors in 146 reports. They identified at least one error in reports from 40 of the 41 attending radiologists. Significant errors accounted for 17% of the total. Twenty-two radiologists had at least one significant error on a report, while five had two or more. Nearly 70% of insignificant errors were the wrong word or missing or extra words. Dolin concluded that a periodic audit of reports may identify specific error patterns and alert radiologists to alter their dictation practices to prevent future significant errors.

"It's important to note that 249 out of the 395 reports had no errors," said Peter Durlach, senior vice president of healthcare marketing and product strategy at Nuance, which sells PowerScribe.

Durlach said the product has an accuracy rate of 97% and that Dolin's results actually reflect a 1.2% error rate per word. To arrive at that figure, he estimated the average report length at 50 words, which puts the total word count for the 395 reports at 19,750. He then divided the number of errors (239) by that total to get an error rate of 1.2%.

"Some accounts of the study reported a double-digit error rate," Durlach said. "Overall, Dolin found that 37% of the reports had at least one error. Some people mistakenly interpreted that as an overall 37% error rate."
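The two figures in dispute measure different things, and a short calculation makes the gap concrete. The sketch below uses only the counts reported in the study plus Durlach's own estimate of 50 words per report, which is an assumption rather than a measured value; the variable names are illustrative.

```python
# Counts reported in the study
total_reports = 395
reports_with_errors = 146
total_errors = 239

# Assumption: Durlach's estimate of about 50 words per report (not measured)
avg_words_per_report = 50

# Per-word view: errors divided by the estimated total word count
total_words = total_reports * avg_words_per_report
per_word_error_rate = total_errors / total_words

# Per-report view: share of reports containing at least one error
per_report_error_rate = reports_with_errors / total_reports
error_free_reports = total_reports - reports_with_errors

print(f"Estimated total words: {total_words}")
print(f"Errors per word: {per_word_error_rate:.1%}")
print(f"Reports with at least one error: {per_report_error_rate:.1%}")
print(f"Error-free reports: {error_free_reports}")
```

Under these assumptions the per-word rate comes out near 1.2% and the per-report rate near 37%. Both describe the same 239 errors, and the per-word figure moves with whatever average report length is assumed.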

Manufacturers of voice recognition software should implement a confidence level below which any uncertain phrase or word would be highlighted, said Dr. William Morrison, director of musculoskeletal imaging at TJU. He lamented the fact that these systems do not even have grammar checks, which could reduce errors.

"When we had humans doing this, they'd flag anything that went below a certain confidence level of understanding. You want the program to do the same," Morrison said.
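The kind of confidence-based flagging Morrison describes could look roughly like the sketch below. It is purely illustrative and does not reflect how PowerScribe or any Nuance product works: the function name, the `[[ ]]` markers, the 0.80 threshold, and the sample confidence scores are all hypothetical, and it assumes the recognizer exposes a per-word confidence value between 0 and 1.

```python
def flag_uncertain_words(scored_words, threshold=0.80):
    """Wrap each word whose confidence falls below the threshold in [[ ]]."""
    flagged = []
    for word, confidence in scored_words:
        if confidence < threshold:
            # Mark the word so the radiologist reviews it before signing
            flagged.append(f"[[{word}]]")
        else:
            flagged.append(word)
    return " ".join(flagged)


# Hypothetical recognizer output: (word, confidence) pairs
sample = [("no", 0.98), ("acute", 0.95), ("fracture", 0.62), ("identified", 0.91)]
print(flag_uncertain_words(sample))
```

With this made-up input, only "fracture" falls below the threshold and would be highlighted for review. Choosing the threshold is a trade-off: set too high, it flags so many words that reviewers ignore the markers; set too low, it misses the errors it is meant to catch.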

Durlach said the company is aggressively working on improving flagging algorithms.
