A little background noise improves accuracy of automated speech recognition reporting

November 26, 2007

The soft, white sound of an air conditioner forcing air into a radiology reading room may be music to the virtual ear of speech recognition reporting systems and improve accuracy during automated radiology report transcription.

Common sense suggests that more background noise would produce more errors, but informatics researchers at the VA Maryland Healthcare System learned by studying ambient white noise that the opposite is true under specific conditions. Dr. Jonah Zwemer presented results of their study Sunday at the 2007 RSNA meeting.

Ten radiologists digitally recorded 20 reports, randomly selected from the hospital's radiology information system, over one of four levels of ambient white noise. White noise levels ranged from 41.7 to 53.9 dB. The studies covered a range of imaging modalities and applications. The recorded reports were transcribed using Scansoft Dragon NaturallySpeaking 8, a commercial speech recognition software system.

The 10.3% mean baseline transcription error rate (TER) for the lowest white noise level was significantly lower (p = 0.006) than the 11.6% mean baseline TER for dictation without background noise, Zwemer reported. Error rates were significantly higher (p

The findings can be explained by human behavior, rather than anything intrinsic to ambient noise and computer program performance.

Zwemer said that a study presented at the 2006 RSNA inspired the underlying hypothesis tested in his trial. That investigation found that automated speech recognition transcription was more accurate in the presence of some background noise than with no ambient noise at all.

The results led the VA Maryland Healthcare System group, including radiology chief Dr. Eliot Siegel, to wonder whether the phenomenon might be linked to the Lombard effect, the inherent tendency of humans to speak louder and more distinctly when speaking over background noise.

Siegel noted during the question-and-answer session after Zwemer's presentation that only a little white noise is needed to trigger the behavior. About 47 dB appears to be optimal, which corresponds to the sound of an air conditioner humming in the background, he said.

The results were persuasive enough for Siegel to introduce constant ambient noise into the specifications for his model radiology reading room of the future.