Why I Prefer Transcription to Voice Recognition Software

November 28, 2011

Voice recognition reduces costs and produces faster reports, but it can have a high recognition error rate and requires more hardware, more software, and constant training.

Based on the principles of aerodynamics, a bumblebee shouldn’t be able to fly. Yet, it does. Similarly, using physicians with decades of advanced and specialized training and education to edit text reports on a hardwired computer would seem as awkward as a bumblebee – yet voice recognition works, and it’s become a mainstay of many radiology practices. However, the traditional transcribing workflow has much elegance, ease and efficiency to commend it.

While I certainly don’t begrudge my colleagues who effectively use voice recognition, here are a few of the issues many authors and I have with this process.

Transcribers can correct simple errors of binary confusion, such as left/right, centimeter/millimeter, man/woman, and mass/no mass, as well as wrong vertebral body levels. These errors easily slip by an author rereading his own work, but stand out clearly to a reader, whether a typist or a clinician customer. Transcribers can save us from our Rick Perry moments.

Voice recognition doesn’t work with simple or complex formatting, such as boldface, italics, automatically numbered lists, bulleted lists, tables, indents, boxes, font changes, and the like. Flat mono-spaced unenhanced text that characterized the PC in the days of DOS may be, strictly speaking, an adequate commodity: not gorgeous, but (just barely) good enough for patient care. I prefer to differentiate my work product by making it more visually appealing, however, as did Steve Jobs with his first Macintosh.

In this jobless recovery, some institutions may have to lay off transcribers who are providing a useful service to hospitals, physicians and patients. Healthcare leaders talk about reducing healthcare costs and balancing healthcare budgets, but we’re balancing them on the backs of workers like transcribers. Extended unemployment benefits for transcribers laid off after transcription services are outsourced can represent a substantial expense. Why not simply keep them typing?

Using voice recognition and self-editing requires the radiologist to shift his visual attention rapidly and repeatedly, like a spectator at a tennis match, between grayscale images and black-and-white text screens. Can you say whiplash, both cervical and psychobiological?

Our brains can simultaneously speak (dictate; cranial nerves 7 and 8) and view (interpret images; cranial nerve 2). One cannot view images and type (or view or edit text) simultaneously.

The much greater error rate of radiologists-as-typists is well documented. Basma and colleagues found at least one major error in almost a quarter of the speech recognition reports (23 percent), while only 4 percent of reports generated by conventional dictation transcription contained such errors. Errors included a missing or prepended “no”, missing words, incorrect measurement units (metric, therefore off by a factor of 10!), and nonsense phrases. Their findings should not be surprising: transcribers (being trained, experienced and paid for transcribing) are better typists than radiologists. Radiology residencies perhaps need to provide more keyboarding experience, since this skill is increasingly required as voice recognition spreads.

An undisputed advantage of voice recognition is faster apparent availability of a final finished report. However, rapid turnaround can be accomplished just as well by having a smart transcriber who is typing three minutes behind my spoken word. It was letting the voice recording sit on a cassette tape overnight that introduced delays. I’ve enjoyed (insisted on) having a smart transcriber typing right behind me, but such real-time availability of a paraprofessional does indeed cost more. I value her function as an editor, formatter and error-corrector, which allows me to focus on the image and the diagnosis rather than the text on screen.

Voice recognition is extremely intensive in consumption of desktop workspace real estate, hardware and software. It requires two input devices (a keyboard and a microphone), two output devices (a screen and speakers or headphones), usually hardwired and thus tethered to a PC. Advanced voice recognition software and a low ambient noise working environment are required.

In contrast, transcription requires only a telephone (can be wireless; noise tolerant), word processing software, and a transcriber.

Many other professions and industries routinely employ editors and transcribers. Lawyers rely on secretaries and paralegals to revise their run-on sentences and nonsense phrases. Dentists and their hygienists crosscheck one another’s prophylaxis and preparations during routine cleanings and fillings. Newspaper editors aggressively vet the work of their reporters on everything from the veracity of sources to typographical errors. Novelists use their typists, editors and agents to improve the elegance, clarity, appeal and relevance of their work product.

Some see transcription as a “routine” task that should therefore be automated. Many aspects of transcriptionists’ jobs, however, are not routine and require human judgment: error correction, formatting, clarifying the unclear, telephoning reports, and interacting with people in the health information communication continuum.

And then there’s the issue of physician efficiency. When initially starting with voice recognition, radiologists’ efficiency drops by about 30 percent in terms of RVUs per hour during the first two weeks. With experience, much of this inefficiency is mitigated, and in some settings gains of 10 percent above baseline have been achieved after four weeks.

These gains above baseline throughput, however, require (a) continuously learning software, (b) extensive use of templates and standard normal reports and (c) use of structured data, wherein data fields are mapped and automatically imported and populated from other parts of the healthcare enterprise, like CT radiation exposure or ultrasound size measurements. Until the templates are in place, the structured data fields are mapped, and the interfaces are created by our IS teams, maintaining efficiency relies solely on physicians working harder and concentrating more intently on a screen document. Touch typing, voice typing, and editing are not the best and highest use of the training, education and experience of radiologists.

As you know, I am hardly a Luddite – I am an early adopter, and some of my best friends are computers. But, just because a computer can do something doesn’t mean that it should. No one would want a machine to be our lady in the parlor, or for Watson to be our golf partner. The judgment, conscientiousness, concern for patient care, experience, and just plain common sense of an interested transcriber is a valuable contribution to the healthcare process and information flow.

This newly documented high error rate of voice-recognized reporting will doubtless motivate initiatives for error reduction: reducing medical errors must continue to be of the highest priority. One way to reduce errors is for radiologists to concentrate harder and be more careful with our voice-typed reports. A second way is to enhance voice typing software, for example by building in algorithms to detect left/right and missing “no” defects. A third way, already demonstrated by Basma and colleagues, is to use transcribers.
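
To make the software option concrete, here is a minimal sketch in Python of the kind of check such an enhancement might run. It is my illustration only, not a feature of any actual voice recognition product: the function names, the split of a report into "findings" and "impression" text, and the list of terms to watch are all assumptions. It flags a side (left or right) that appears in the impression but never in the findings, and a term that follows "no" in one section but not in the other.

```python
import re

# Hypothetical helper names; no vendor's software is being described here.
SIDES = ("left", "right")


def sides_mentioned(text: str) -> set[str]:
    """Return which laterality words appear in the text (case-insensitive)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {side for side in SIDES if side in words}


def laterality_warnings(findings: str, impression: str) -> list[str]:
    """Flag a side named in the impression that the findings never mention."""
    extra = sides_mentioned(impression) - sides_mentioned(findings)
    return [f"impression mentions '{side}' but findings do not"
            for side in sorted(extra)]


def negation_warnings(findings: str, impression: str,
                      terms: tuple[str, ...]) -> list[str]:
    """Flag a term that directly follows 'no' in one section but not the other."""
    warnings = []
    for term in terms:
        any_use = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
        negated = re.compile(rf"\bno\s+{re.escape(term)}\b", re.IGNORECASE)
        in_both = any_use.search(findings) and any_use.search(impression)
        findings_neg = bool(negated.search(findings))
        impression_neg = bool(negated.search(impression))
        if in_both and findings_neg != impression_neg:
            warnings.append(f"'{term}' is negated in one section but not the other")
    return warnings


if __name__ == "__main__":
    # Made-up report text, for illustration only.
    findings = "A 2 cm mass is seen in the left breast."
    impression = "Right breast mass. Recommend biopsy."
    for warning in laterality_warnings(findings, impression):
        print(warning)
```

A check this crude would miss a great deal. "No suspicious mass", for instance, escapes the negation pattern entirely, and a report that correctly discusses both sides would pass the laterality test even if the sides were swapped. That is rather the point: the attentive human reader catches what the pattern matcher does not.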

I think we’re hard-pressed to find a more effective method of converting our medical diagnoses into an attractive, readable document than an attentive, intelligent, experienced transcriber who’s typing right on our heels and walking across the hall to clarify any inconsistencies in what we’ve just dictated. But I recognize that we live in an era that values people-replacement, machine intelligence, cost reduction, doing-more-with-less, and increasing velocity in all enterprises.

So, while my colleagues make their bumblebees fly very effectively, many authors contributing to the medical record continue to enjoy smooth sailing on the wings of the elegant secretary bird.


¹Thomas Priselac, Be careful of cuts to hospitals, Los Angeles Times, 7 October 2011.
²Sarah Basma, et al., Error Rates in Breast Imaging Reports, American Journal of Roentgenology 197:923-927, 2011.
³Personal communications, MedQuist Corporation, 2011.