A voice recognition system didn't fare well in a study by researchers from Thomas Jefferson University in Philadelphia that reviewed the number of errors on signed reports from attending radiologists. A spokesperson for the dictation company said the numbers can be misleading.
Dr. Ronald Dolin and colleagues reviewed 395 consecutive reports from 41 attending radiologists. All reports were generated using PowerScribe 4.7, which had been in use for about 16 months.
Researchers classified errors as significant if they altered or obscured the meaning of the sentence in which they appeared. Dictation errors were categorized into 10 subtypes including missing or extra words, wrong words, typographical or grammatical errors, nonsense phrases with unknown meaning, and inaccuracies in dictation date.
Investigators found 239 errors in 146 reports. They identified at least one error in reports from 40 of the 41 attending radiologists. Significant errors accounted for 17% of the total. Twenty-two radiologists had at least one significant error on a report, while five had two or more. Nearly 70% of insignificant errors involved a wrong word or missing or extra words. Dolin concluded that a periodic audit of reports may identify specific error patterns and prompt radiologists to alter their dictation practices to prevent future significant errors.
"It's important to note that 249 out of the 395 reports had no errors," said Peter Durlach, senior vice president of healthcare marketing and product strategy at Nuance, which sells PowerScribe.
Durlach said the product has an accuracy rate of 97% and that Dolin's results actually reflect a 1.2% error rate. To arrive at that figure, he estimated the average length of the reports to be 50 words. He then divided the number of errors (239) by the estimated total word count (395 reports x 50 words = 19,750) and arrived at an error rate of 1.2%.
"Some accounts of the study reported a double-digit error rate," Durlach said. "Overall, Dolin found that 37% of the reports had at least one error. Some people mistakenly interpreted that as an overall 37% error rate."
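The two rates being contrasted here measure different things: one is errors per word, the other is the share of reports with at least one error. A minimal Python sketch of the arithmetic follows, using the counts reported in the study and Durlach's 50-words-per-report figure, which is his estimate rather than a measured value. The variable names are illustrative only.

```python
# Counts reported in the study
total_reports = 395
reports_with_errors = 146
total_errors = 239

# Durlach's estimate of average report length (an assumption, not measured)
est_words_per_report = 50

# Per-word rate: errors divided by the estimated total word count
total_words = total_reports * est_words_per_report
per_word_error_rate = total_errors / total_words

# Per-report rate: share of reports containing at least one error
per_report_error_rate = reports_with_errors / total_reports

print(f"Estimated total words: {total_words}")
print(f"Per-word error rate: {per_word_error_rate:.1%}")
print(f"Reports with at least one error: {per_report_error_rate:.0%}")
```

Under these assumptions the per-word rate comes to about 1.2% and the per-report rate to about 37%. The per-word figure depends directly on the assumed report length: a shorter average report would raise it, a longer one would lower it.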
Manufacturers of voice recognition software should implement a confidence level below which any uncertain phrase or word would be highlighted, said Dr. William Morrison, director of musculoskeletal imaging at TJU. He lamented the fact that these systems do not even have grammar checks, which could reduce errors.
"When we had humans doing this, they'd flag anything that went below a certain confidence level of understanding. You want the program to do the same," Morrison said.
Durlach said the company is aggressively working on improving flagging algorithms.
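The confidence-threshold flagging that Morrison describes could look something like the Python sketch below. It is purely illustrative: it assumes the recognizer provides a confidence score between 0 and 1 for each word, which the article does not confirm for PowerScribe, and the function name, the 0.85 threshold, the bracket markers, and the sample scores are all hypothetical.

```python
from typing import List, Tuple


def flag_uncertain(words: List[Tuple[str, float]], threshold: float = 0.85) -> str:
    """Mark every word whose confidence falls below the threshold.

    `words` is a list of (word, confidence) pairs; both the input shape
    and the default threshold are assumptions made for this sketch.
    """
    out = []
    for word, confidence in words:
        # Low-confidence words are wrapped in [? ... ?] for the reviewer
        out.append(f"[?{word}?]" if confidence < threshold else word)
    return " ".join(out)


# Made-up transcript with made-up confidence scores
sample = [("no", 0.98), ("acute", 0.95), ("fracture", 0.62), ("identified", 0.91)]
print(flag_uncertain(sample))  # no acute [?fracture?] identified
```

A lower threshold flags fewer words and risks missing errors, while a higher one flags more and adds to the radiologist's review burden, so in practice the cutoff would need tuning.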