1. Merelli I, Landenna M, Milanesi L
Biological database access and integration using web services in GRID technology
Meeting: BITS 2005 - Year: 2005
Topic: Database annotation and data mining
Abstract: Data integration is a fundamental process in Bioinformatics because the enormous quantity of information available is often difficult to interpret. A high-performance platform such as GRID makes it possible to carry out large studies that improve the understanding of biological processes. Facing this challenge has highlighted the importance of a data management system that guarantees efficiency on a distributed platform. This study defines a tool for database management on GRID and implements a concrete use case in biological data integration. The core software is a Web Service that executes SQL queries, through the SOAP protocol, on a series of distributed databases hosted on different computers. A client, which can be run from a GRID Computing Element, interacts with the databases through this service. The data extracted from each local database are then integrated by the Web Server and sent to the application that requested them, optimizing communication times. This software makes it easy to run computations that involve data access and integration on GRID. The Web Service was tested during the development of an integration pipeline between two important biological databases, UNIPROT and ENSEMBL, built to reconcile the different information they hold about given protein sequences. Thanks to this distributed data access system, it was possible to conduct a systematic GRID analysis of the kinase sequence references in these databases.
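The abstract gives no implementation details, so the following is only a minimal sketch of the pattern it describes: the same SQL query is run on several databases and the rows are merged before being returned to the caller. It assumes in-memory SQLite databases as stand-ins for the remote sites and omits the SOAP transport entirely. The table layout, the site names, the accessions, and the function names `make_db` and `run_everywhere` are all hypothetical and do not come from the original work.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory SQLite database standing in for one remote site."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE protein (accession TEXT, name TEXT)")
    conn.executemany("INSERT INTO protein VALUES (?, ?)", rows)
    conn.commit()
    return conn


def run_everywhere(databases, sql, params=()):
    """Run the same SQL query on every database and merge the rows.

    Each merged row is prefixed with the name of the site it came from,
    so the caller can still tell the sources apart after integration.
    """
    merged = []
    for site, conn in databases.items():
        for row in conn.execute(sql, params):
            merged.append((site,) + tuple(row))
    return merged


# Made-up placeholder records; these are not real database entries.
databases = {
    "site_a": make_db([("ACC1", "kinase A"), ("ACC2", "phosphatase B")]),
    "site_b": make_db([("ACC3", "kinase C")]),
}

results = run_everywhere(
    databases,
    "SELECT accession, name FROM protein WHERE name LIKE ?",
    ("kinase%",),
)

for row in results:
    print(row)
```

In the system described above, the loop over `databases` would correspond to the service contacting each remote database, and the merged list would be the single response sent back to the client running on the GRID Computing Element.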