Journal publishers seek solution to missing links

June 2, 2008

Medical and scientific journal publishers increasingly use Internet-based citations to reference data and software mentioned in published papers. URLs cited in the literature, however, continue to disappear, or decay, at an alarming rate.

Dr. Mauricio Castillo, editor of the American Journal of Neuroradiology, estimates that in medicine in general, about 80% of all webpages are gone four months after their introduction.

"About 40% of URLs will eventually decay, although it is not known how many merely change location," Castillo said.

Evidence of URL decay is the familiar "404 not found" HTTP response code returned when a server cannot find a requested webpage.
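As a rough illustration of how such a check could be automated, the Python sketch below requests a URL and maps the HTTP status code to a coarse label. The function names and the label categories are assumptions made for this example, not a method described by the researchers quoted here, and a 404 alone cannot tell whether the content vanished or merely changed location.

```python
import http.client
import urllib.error
import urllib.request
from typing import Optional


def classify_status(status: Optional[int]) -> str:
    """Map an HTTP status code (or None for no response) to a rough label.

    The labels are illustrative categories chosen for this sketch.
    """
    if status is None:
        return "unreachable"  # DNS failure, timeout, refused connection
    if 200 <= status < 300:
        return "live"
    if 300 <= status < 400:
        return "moved"  # rarely seen here: urlopen follows redirects itself
    if status in (404, 410):
        return "decayed"  # "not found" or "gone"
    return "uncertain"  # e.g. 403 or 5xx; the page may still exist


def check_url(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request and classify the outcome."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError and carry the code.
        return classify_status(err.code)
    except (OSError, ValueError, http.client.HTTPException):
        # Network-level failures or a malformed URL.
        return classify_status(None)
```

Calling check_url on a cited address returns one of the labels above. Some servers reject HEAD requests, so a production checker would likely retry with GET before declaring a link dead.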

The current decay rate is no different from what it was five years ago when the problem was first identified, according to a 2008 follow-up study.

"How URL decay impacts future research is not clear, but it can't be positive," said Jonathan D. Wren, Ph.D., of the Arthritis and Immunology Research Program at the Oklahoma Medical Research Foundation.

Wren recently reexamined URL growth and decay in the literature to see what effect mitigation efforts undertaken by some journal publishers and webmasters were having (Bioinformatics 2008 24(11):1381-1385).

"Our study of decayed URLs in Medline since 2003 finds that nothing has changed, except the number of published URLs is growing exponentially," he said.

The problem is pervasive throughout medical and scientific literature. One 2006 study of newly published papers in PubMed found that 12.4% of URLs cited within the articles were inaccessible, even at the time of publication (AMIA Annu Symp Proc 2006:1019).

Preservation of website content is best done at the time of publication, according to Wren.

"That's when both authors and journals care the most and are willing to make the effort," he said.

Wren recommends that journals incorporate a program into their electronic manuscript submission systems that scans for URLs and automatically sends them to a server that takes snapshots of their content.
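The scanning step of such a program might look something like the Python sketch below, which pulls URLs out of manuscript text and prepares one snapshot request per unique address. The regular expression, the function names, and the request format are all hypothetical, and the archiving server itself is left out because the article does not specify one.

```python
import re
from typing import Dict, List

# Simple pattern for http(s) URLs; an assumption for this sketch, not a
# complete URL grammar.
URL_PATTERN = re.compile(r"""https?://[^\s<>"']+""", re.IGNORECASE)

# Characters that usually belong to the sentence, not the URL. Stripping ")"
# can truncate the rare URL that legitimately ends in a parenthesis.
TRAILING_PUNCTUATION = ".,;:!?)]}"


def extract_urls(text: str) -> List[str]:
    """Return unique URLs in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(TRAILING_PUNCTUATION)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def build_snapshot_requests(manuscript_id: str, text: str) -> List[Dict[str, str]]:
    """Prepare one request record per URL for a (hypothetical) archive server."""
    return [
        {"manuscript_id": manuscript_id, "url": url}
        for url in extract_urls(text)
    ]


sample = (
    "Data are available at http://example.org/data. "
    "The tool (https://example.org/tool) reads http://example.org/data directly."
)
requests_to_send = build_snapshot_requests("MS-0001", sample)
```

Here requests_to_send holds two records, one per distinct URL, which a submission system could then forward to whatever snapshot service the journal uses.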

Radiology- and digital imaging-related journals have reacted to the issue in different ways. The American Journal of Roentgenology requires authors during the submission process to specify when URL citations were last accessed, said Becky Haines, director of publications for the American Roentgen Ray Society.

"AJR also uses supplemental data postings with articles to reduce the need to access URLs to display software programs," she said.

The Journal of Digital Imaging also employs a supplemental material mechanism.

"We publish supplemental material on the publisher's website to assure longevity instead of allowing authors to reference their own websites," said editor Janice Honeyman-Buck, Ph.D. "We do not attempt to save external references in the publications but try to make our own publications persistent."

Max A. Viergever, DSc, editor of IEEE Transactions on Medical Imaging, said he does not consider URL decay to be a serious issue and has therefore taken no specific measures to address it.

URLs identifying articles in science and medicine will remain fairly stable if they have been cited more than twice, Castillo said. URLs posted by professional organizations, such as the American Society of Neuroradiology, tend to be more stable than those posted by individuals.

"Our circulation is about 5000, but we had 3.5 million downloaded articles last year, so I think that our URLs are not decaying that fast," he said.