Struggle against cancer inspires PACS data-mining methods

February 4, 2002

Combining PACS with data mining opens the possibility of advanced computer-assisted medical diagnosis systems. Several new tools for mining data from PACS archives are being adopted in the escalating fight against cancer.

One technique, called i2i Vision, is being developed at the Georgia Tech Research Institute (GTRI). It allows clinical researchers to search through large archives of digitized mammograms to help them learn more about patients at high risk for developing breast cancer.

The technique enables computers to sift rapidly through large databases, according to Christopher Barnes, Ph.D., a GTRI research engineer.

"Data-mining techniques capable of exploring this raw data can discover pixel patterns and correlated health factors useful to medical researchers seeking cures and clues to breast cancer," he said.

The i2i Vision system scans the pixels of digital mammograms in the database to correlate the content of the images, a process designed to discover common and indicative pixel patterns among benign and malignant tumors recorded in the image archive. Further data mining of patient records in the archive should discover other clues useful for understanding breast cancer, Barnes said.

The process doubles as a clinical tool for new breast cancer screenings. In early mammogram screening trials, it showed promising results for detecting microcalcifications.

"The approach achieved nearly 100% detection with what appears to be acceptable levels of false alarms," Barnes said. "That's where you want to be with a mammogram."

Another data-mining method, presented at the Fifth International Workshop on Intelligent Data Analysis in Medicine and Pharmacology last year in Berlin, uses decision tree induction to interpret lung cancer images.
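Decision tree induction, in general terms, builds a tree by repeatedly choosing the attribute that best separates the labeled cases. The following is a minimal sketch of that selection step using information gain, not the method presented at the workshop. The attribute names (`margin`, `size`), the labels, and the four toy cases are invented for illustration only.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in label entropy obtained by splitting on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels):
    """Attribute with the highest information gain; it becomes the tree node."""
    return max(rows[0], key=lambda a: information_gain(rows, labels, a))


# Hypothetical expert descriptions of four lesions (toy data, not from any study).
rows = [
    {"margin": "smooth", "size": "small"},
    {"margin": "smooth", "size": "large"},
    {"margin": "spiculated", "size": "small"},
    {"margin": "spiculated", "size": "large"},
]
labels = ["benign", "benign", "malignant", "malignant"]

root = best_attribute(rows, labels)
print("split on:", root)
```

A full induction procedure would apply the same selection recursively to each subset until the cases in a branch share one label or no attributes remain.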

This method of computer-assisted radiology, however, will work only if PACS databases are carefully designed to supply sufficient data for developing decision support systems, according to Petra Perner, Ph.D., director of the Institute of Computer Vision and Applied Computer Sciences in Leipzig, Germany. That aspect is rarely considered when a PACS is being implemented, she said.

"This system will not give the expected result if the PACS has not been set up in the right way," she said. "Images and expert descriptions have to be stored in a standard format."

The use of image-processing methods may be an important adjunct to facilitate spiral CT lung cancer screening, said Larry Clark, Ph.D., branch chief of the National Cancer Institute's Biomedical Imaging Program.

Image-processing algorithms have the potential to assist in lesion detection on spiral CT studies and to assess the stability or change in size of lesions on serial CT studies, he said.
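As a rough illustration of what assessing stability or change in size on serial studies can involve, the sketch below compares a lesion's estimated volume at two time points. It assumes the lesion is approximately spherical and that a diameter in millimeters has already been measured on each study. The function names and the 25% tolerance are hypothetical choices for this example, not values from NCI or any clinical guideline.

```python
from math import pi


def sphere_volume_mm3(diameter_mm):
    """Volume of a sphere with the given diameter (simplifying assumption)."""
    return pi * diameter_mm ** 3 / 6


def percent_volume_change(baseline_mm, followup_mm):
    """Percent change in estimated volume between two serial studies."""
    v0 = sphere_volume_mm3(baseline_mm)
    v1 = sphere_volume_mm3(followup_mm)
    return 100.0 * (v1 - v0) / v0


def classify(baseline_mm, followup_mm, tolerance_pct=25.0):
    """Label a lesion using a hypothetical tolerance on volume change."""
    change = percent_volume_change(baseline_mm, followup_mm)
    if change > tolerance_pct:
        return "growing"
    if change < -tolerance_pct:
        return "shrinking"
    return "stable"


print(classify(10.0, 10.5))  # small diameter change
print(classify(10.0, 12.0))  # larger diameter change
```

Because volume scales with the cube of the diameter, a modest change in measured diameter corresponds to a much larger change in volume, which is one reason consistent measurement methods matter when comparing studies.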

"Investigators developing image-processing algorithms, however, need standardized databases with which to work," Clark said. "The generation of standardized databases requires the development of consensus on many issues related to database design, accessibility, metrics, and statistical methods for evaluating image-processing algorithms."

NCI plans to establish a consortium of institutions, called the Lung Image Database Consortium (LIDC), to develop a consensus and the necessary database.