Why the NEJM study on CAD is wrong

April 6, 2007

Next year, my son will begin driving the family car. Before he does, he’ll learn the rules of the road in a classroom and behind the wheel, not to mention undergoing the umpteen hours of driving I have planned for the two of us in parking lots and on side roads. Yet I know that teenagers, though just 7% of licensed drivers, suffer 14% of fatalities and 20% of all reported accidents. How do I know? Because statisticians have told me.

Insurance companies, realizing that newbie drivers - a group heavily weighted toward teenagers - have a lot to learn, charge them higher premiums. Several factors contribute to that elevated risk. Certainly, a lack of maturity is one. But so is the learning curve that comes with using any new technology, such as computer-aided detection.

This fact did not escape Dr. Ferris M. Hall, who wrote in yesterday's New England Journal of Medicine: "One possible flaw in the study by Fenton et al was the failure to assess the time it takes to adjust to computer-aided detection."

I don't know how this fact escaped the attention of the well-qualified researchers whose article in the same issue of the NEJM described CAD screening mammography as doing more harm than good. Their claims are based on data obtained from radiologists who were novices in its use.

Survey a million young people and compare the number of accidents they have behind the wheel with the number logged by an equal number of experienced drivers, and you'll find the increased risk posed by rookie drivers - not the overall risk of driving a car.

The data presented in the NEJM support the conclusion that CAD screening mammography, when performed by physicians with experience ranging from two to 25 months, increased the number of false positives, leading to more callbacks and biopsies. They do not support a sweeping generalization that CAD screening mammography in itself reduces accuracy or exposes patients to unneeded biopsies.

It is clear that the researchers did not set out to look at just the efficacy of CAD. Rather, they put together a survey that "measured factors that may affect the interpretation of mammograms (e.g., procedures used in reading the images, use of computer-aided detection, years of experience of radiologists in mammography, and number of mammograms interpreted by radiologists in the previous year)." What popped out at them was the drop in specificity after CAD implementation (90.2% to 87.2%), increase in recall rates (10.1% to 13.2%), and rise in the number of biopsies per 1000 screening mammograms (14.7 to 17.6).
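To put those percentages in absolute terms, here is a back-of-the-envelope calculation using only the figures quoted above; the per-1,000 differences are my own arithmetic, not numbers reported in the paper:

```python
# Convert the NEJM percentages into absolute differences per
# 1,000 screening mammograms. Inputs are the figures quoted in
# the article; the deltas are simple arithmetic, not study results.

spec_before, spec_after = 0.902, 0.872        # specificity
recall_before, recall_after = 0.101, 0.132    # recall rate
biopsy_before, biopsy_after = 14.7, 17.6      # biopsies per 1,000 screens

# Extra false positives per 1,000 cancer-free women screened:
extra_fp = (spec_before - spec_after) * 1000

# Extra callbacks per 1,000 screens:
extra_recalls = (recall_after - recall_before) * 1000

# Extra biopsies per 1,000 screens:
extra_biopsies = biopsy_after - biopsy_before

print(f"Extra false positives per 1,000: {extra_fp:.0f}")   # about 30
print(f"Extra callbacks per 1,000: {extra_recalls:.0f}")    # about 31
print(f"Extra biopsies per 1,000: {extra_biopsies:.1f}")    # 2.9
```

In other words, the reported drop amounts to roughly 30 additional false positives and about three additional biopsies per 1,000 women screened - the scale of harm the authors are generalizing from first-year CAD users.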

Given that these surveys examined a study period from 1998 to 2002 and that the data were published only yesterday - five years later - it seems reasonable that, considering the potential importance of the findings, the researchers could have added a few years to the database, say through 2005. This would have allowed them to stratify providers by experience at the seven institutions that implemented CAD during the study period and then compare their performance. It might then have been possible to determine if mammographers' performance improved as their experience with CAD increased and, if so, at what point they got better than they were without it. It might even have provided enough data to determine the effect of software upgrades on performance.
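The stratified comparison proposed above is straightforward in principle. This sketch bins hypothetical per-radiologist reading counts by months of CAD experience and computes specificity within each stratum; the records and bin boundaries are illustrative placeholders, not data from the study:

```python
# Sketch of a stratified analysis: bucket readings by months of
# CAD experience, then compare specificity across buckets to see
# whether performance recovers with practice. All numbers below
# are made up for illustration.

from collections import defaultdict

# (months_of_cad_experience, true_negatives, false_positives)
readings = [
    (3, 850, 150),    # early CAD use
    (12, 880, 120),
    (24, 905, 95),
    (36, 910, 90),
]

def experience_bin(months):
    """Bucket experience into the strata to compare."""
    if months < 6:
        return "0-5 mo"
    if months < 24:
        return "6-23 mo"
    return "24+ mo"

totals = defaultdict(lambda: [0, 0])  # bin -> [TN, FP]
for months, tn, fp in readings:
    bucket = experience_bin(months)
    totals[bucket][0] += tn
    totals[bucket][1] += fp

for bucket, (tn, fp) in sorted(totals.items()):
    specificity = tn / (tn + fp)
    print(f"{bucket}: specificity = {specificity:.3f}")
```

With data through 2005, this kind of grouping would show whether specificity in the most experienced stratum climbs back toward - or past - the pre-CAD baseline, which is exactly the question the published analysis leaves open.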

But that is not what the researchers did. They published a study with findings that may or may not be valid. In their NEJM paper, the researchers allude to the need for "more precise" testing.

Too bad they didn't take the time to do it themselves.