Grid debuts to facilitate inter-institutional image interoperability

April 4, 2008

A grid designed to link research applications uses grid computing technologies and web resources to provide a set of programming tools for developing and deploying those applications.

The National Cancer Institute's cancer Biomedical Informatics Grid (caBIG) program reached a major milestone with the recent launch of its companion open source grid architecture, caGrid 1.0.

"This Grid provides the core enabling infrastructure for caBIG," said Scott Oster of the biomedical engineering department at Ohio State University.

The Grid is aimed primarily at application developers, Oster said. It provides core services, toolkits, and wizards for developing and deploying community services; application programming interfaces for building client applications; and reference implementations of applications and services already available in the production Grid.

A major obstacle to more effective use of information in multi-institutional settings is the lack of both interoperability among resources and mechanisms for securely accessing distributed information.

"Interoperability is an especially challenging problem in biomedical research because data sets are stored in different formats, information is represented using different terminology, and analysis applications are invoked and executed in different ways," Oster said.

The purpose of caGrid is to facilitate inter-institutional interoperability.

For instance, suppose researchers want to evaluate a computer-aided detection algorithm that screens for clinically significant lung nodules. To do so, they need to query multiple digital image repositories that contain chest CT images. The Grid enables federated access to these distributed image sources via a common data service.
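
The idea of federated access in this scenario can be illustrated with a minimal Python sketch. It is not caGrid code and does not use any caGrid API: the names ImageRecord, ImageRepository, and federated_query are hypothetical, and each repository is modeled as an in-memory list rather than a remote service. The sketch only shows the pattern of sending one query to several sites and merging the answers without first copying the data into a central store.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical metadata for one image held at one site."""
    repository: str   # name of the site that holds the image
    image_id: str
    modality: str     # e.g. "CT"
    body_part: str    # e.g. "CHEST"


class ImageRepository:
    """Hypothetical stand-in for one institution's image data service."""

    def __init__(self, name: str, records: List[ImageRecord]):
        self.name = name
        self._records = records

    def query(self, modality: str, body_part: str) -> List[ImageRecord]:
        # Each site evaluates the query against its own local store.
        return [r for r in self._records
                if r.modality == modality and r.body_part == body_part]


def federated_query(repositories: List[ImageRepository],
                    modality: str, body_part: str) -> List[ImageRecord]:
    """Send the same query to every repository and merge the results."""
    results: List[ImageRecord] = []
    for repo in repositories:
        results.extend(repo.query(modality, body_part))
    return results
```

Calling federated_query with two repositories and the arguments "CT" and "CHEST" would return the matching records from both sites in repository order, while the images themselves stay where they are hosted.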

"The advantage is there is no need to collect all the data in a centralized database, nor to host various analysis programs on a central server," Oster said.

The Grid also provides a means of handling query results, which can range in size from hundreds of megabytes to hundreds of gigabytes.

"Several ways of accessing and transferring large volumes of data have been implemented in caGrid, including WS-Transfer, WS-Enumeration, and GridFTP," Oster said.

In the example scenario, data transfer between the client program and the data service can be achieved either by WS-Enumeration, in which the client retrieves the images in the result set one by one, or by GridFTP, which transfers images in bulk from the server to the client host, Oster said.
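
The difference between the two retrieval styles can be sketched in plain Python. This is an illustration only, assuming an in-memory result set: it implements neither the WS-Enumeration protocol nor GridFTP, and the function names and the length-prefixed framing are hypothetical choices made for the example.

```python
import struct
from typing import Iterator, List


def enumerate_images(images: List[bytes]) -> Iterator[bytes]:
    """Enumeration style: the client pulls one image per step."""
    for image in images:
        yield image


def pack_bulk(images: List[bytes]) -> bytes:
    """Bulk style: frame every image into one stream.

    Each image is preceded by its length as a 4-byte big-endian integer
    (an assumed framing, chosen only for this sketch).
    """
    return b"".join(struct.pack(">I", len(img)) + img for img in images)


def unpack_bulk(blob: bytes) -> List[bytes]:
    """Client side of the bulk style: split the stream back into images."""
    images: List[bytes] = []
    offset = 0
    while offset < len(blob):
        (size,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        images.append(blob[offset:offset + size])
        offset += size
    return images
```

In the enumeration style the client can begin processing the first image before the rest have arrived, at the cost of one exchange per image. In the bulk style the whole result set moves in a single transfer, which tends to suit very large result sets.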

The Grid software is available online.