Radiologic data mining may quench thirst for medical knowledge

December 3, 2001

Publishing teaching file collections online has become popular among many radiology departments. While these efforts facilitate widespread dissemination of educational cases, the proliferation of independent teaching file repositories has created an integration problem for users who want to query multiple resources, according to an infoRAD exhibit at the RSNA meeting last week.

"Each repository is an information island, whose query mechanism, classification system, and presentation formats are isolated from the rest of the teaching file universe," said Rex M. Jakobovits, Ph.D., president of Workhost Data Solutions.

To retrieve pertinent cases from multiple repositories, a user must query each interface separately, yielding highly heterogeneous result formats. As a possible solution to the dilemma, Jakobovits is developing a teaching file document standard that facilitates interoperability across independent repositories, allowing queries to be made from a unified interface.

Other experts believe a solution lies in better skills in data mining, which MIT Technology Review magazine ranked among the ten emerging technologies that will change the world. Responding to the need to create more proficient data miners, Central Connecticut State University (CCSU) recently launched an online Master of Science program in data mining.

"Only a handful of M.S. programs currently exist in data mining worldwide, and no other university offers a program online," said Daniel Larose, Ph.D., an associate professor of statistics and director of CCSU's data mining curriculum.

In this age of HMOs and cost-benefit optimization, medical informatics represents one of the most fruitful applications of data mining, according to Larose.

"As providers seek to maximize profits in an increasingly competitive market, they're turning to the wealth of patient data available for decision support," he said. "Data mining allows providers to uncover profitable patterns and trends in their existing data warehouses that would otherwise be wasted."

While most programs stress research only and help students work toward advanced degrees, the CCSU program is designed to prepare students to perform data mining analysis on real-world data sets rather than to research some new nonlinear neural net algorithm.

"Just as with any new information technology, data mining is easy to do badly," Larose said.

For example, researchers may apply an inappropriate analysis to a data set that calls for a completely different approach, or they may derive models built on wholly specious assumptions.

"Our program requires students to understand the statistical and mathematical model structures underlying the software," he said.

Instead of viewing the field of data mining as simply a set of disparate techniques, the CCSU program aims to show how these techniques are interrelated. Medical researchers can thus be assured that their research is built on a solid statistical and mathematical foundation and does not simply reflect spurious trends in a particular data set.