Last updated: April 23 2014 12:27:28
Task 3.1 Biomedical applications  

In this proposal, the terms bioinformatics and medical informatics denote research and development in biology and medicine that relies on advanced information and communication technologies (ICT). Because these terms have many different, complementary meanings, we use “biomedicine” to refer to all activities related to bioinformatics and medical informatics.

With the advent of the post-genomic era, the requirements for interpreting and investigating the sequenced human genome have grown far beyond those needed for its identification. Tools for pair-wise alignment, pattern scanning, multiple alignment, secondary structure prediction, etc. must deal with large databases and long CPU processing times. The applicability of the Grid is clearly shown by projects such as GPS@ (Grid Protein Sequence Analysis), a pilot application of the EGEE project.

Advances in bio-models have enabled the simulation of multi-level and coupled models of tissues, organs or even whole functional systems. However, the increase in complexity has led to a significant increase in computational requirements. Enabling the use of such models in clinical practice through the Grid will bring clear benefits for healthcare, as demonstrated by the GATE (Geant4 Application for Tomographic Emission) consortium, whose platform for accurate simulation of nuclear therapy and tomographic emission aims to reduce the morbidity and secondary effects of current practice.

Last but not least, the continuous development of Electronic Patient Records (EPR) is generating vast databases of clinical information whose benefits are not yet exploited. Exhaustive clinical records, gathered from large populations over relevant periods of time, are extremely valuable for studying the real efficiency of treatments, vaccines and drugs. Currently, most clinical studies concentrate on effectiveness (assessment of the validity of a drug for treating a specific pathology using a test population), rarely on efficacy (analysis of the cost, morbidity and secondary effects on the test population), and very few on efficiency (real analysis of the performance of a treatment on the real population, without controlling factors). The extraction of clinical knowledge from Clinical Decision Support Systems (A Clinical Decision Support System on the Grid) will strongly benefit both treatment and the coordination of distributed support systems within a Grid, and will increase the quality and accuracy of the final diagnosis and prognosis.

The EELA biomedical applications (BLAST in Grid and Phylogeny, WISDOM, and GATE) cover the well-known areas of biomedicine, namely Bioinformatics, Computational Biochemical Processes and Biomedical Models, respectively.

DESCRIPTION OF THE WORK
  • Deployment of pre-existing EGEE biomedical applications.

    The applications already described and running in EGEE will be the first to be deployed, for two main reasons: they have reached an adequate state of maturity, having been tested by hundreds of specialists for several months on the EGEE infrastructure, and the LA partners involved have shown particular interest in deploying them at their sites. Given the complexity of deploying the applications and the characteristics of the countries and entities involved in this first deployment, the first two applications to enter production will be GATE and WISDOM. New applications are being developed in the EGEE project, so the selection of subsequent pilots may vary depending on the interest of the user communities found overseas. Latin American partners will first look, with the assistance of their European counterparts, for interested user communities and will then select the most suitable applications. Installation and deployment will be performed by the European partners, who will provide the necessary assistance for customisation. Induction on the use of the applications will be given if necessary, sharing the effort between the partners involved in the dissemination tasks and the local partners.

  • Support for the identification of new applications and exploration of possible migration.

    Along with the transfer of the applications, it is also important to identify new applications that could solve problems more relevant to the new communities, and to transfer the expertise needed to start up new HealthGrid applications. These new Biomed applications are BLAST in Grid and Phylogeny. Thus, a process of identification, migration and deployment of new applications has been organised and implemented with the collaboration of the local partners. Local partners have identified problems and user communities, who will be encouraged to submit proposals for integration into the EELA infrastructure. Maturity of the application, target user community, Grid relevance and compliance with standards have been considered, among other factors, to rank the proposals. Assistance in the migration of the applications has been provided by the European and Latin American partners to the application developers in the form of direct collaboration, training or consultancy, depending on the application and the resources available. Integration of these new applications into the European EGEE infrastructure has already been proposed. Success will be measured by the number of user communities, countries and applications. International coverage of the user communities will be encouraged.

GATE (Geant4 Application for Tomographic Emission)

Radiotherapy and brachytherapy use ionizing radiation to treat cancer. Before each treatment, physicians and physicists plan the treatment using analytical planning systems and medical image data of the tumour area.
These analytical solutions simplify the problem by ignoring the differing densities of the tissues, which leads to significant inaccuracy. To treat patients with the best accuracy, Monte Carlo simulations are today the best tool for modelling and planning tumour treatment under complex requirements.
GATE is a C++ platform based on the Monte Carlo Geant4 software.
It was designed primarily, within the OpenGATE collaboration, to model nuclear medicine applications such as Positron Emission Tomography (PET) and Single Photon Emission Computed Tomography (SPECT). Its functionality, combined with its ease of use, also makes this platform suitable for radiotherapy and brachytherapy treatment planning.
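
To illustrate why tissue density matters, the following minimal sketch estimates by Monte Carlo the fraction of photons that cross a stack of tissue slabs without interacting. It is not GATE or Geant4 code: the function names, slab thicknesses and attenuation coefficients are hypothetical values chosen only for illustration, and real simulations track scattering, energy deposition and secondary particles as well.

```python
import math
import random


def transmitted_fraction(layers, n_photons=20000, seed=0):
    """Monte Carlo estimate of the fraction of photons that cross all layers.

    layers: list of (thickness_cm, attenuation_per_cm) pairs (hypothetical values).
    """
    rng = random.Random(seed)
    survived = 0
    for _ in range(n_photons):
        alive = True
        for thickness, mu in layers:
            # Free path in this layer follows an exponential law with rate mu.
            free_path = rng.expovariate(mu)
            if free_path < thickness:
                alive = False  # the photon interacts inside this layer
                break
        if alive:
            survived += 1
    return survived / n_photons


def analytic_fraction(layers):
    """Beer-Lambert reference value for the same stack of layers."""
    return math.exp(-sum(thickness * mu for thickness, mu in layers))


if __name__ == "__main__":
    # Hypothetical stack: soft tissue, a denser insert, soft tissue again.
    heterogeneous = [(2.0, 0.2), (1.0, 0.5), (3.0, 0.2)]
    # Same total thickness treated as uniform soft tissue.
    homogeneous = [(6.0, 0.2)]
    print(transmitted_fraction(heterogeneous), analytic_fraction(heterogeneous))
    print(transmitted_fraction(homogeneous), analytic_fraction(homogeneous))
```

With these assumed values, the uniform approximation predicts a transmission of about 0.30, whereas the layered stack gives about 0.22, which is the kind of discrepancy that a density-aware simulation is meant to capture.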


WISDOM (Wide In Silico Docking of Malaria)



The objective of WISDOM is the creation of new inhibitors for a family of proteins produced by Plasmodium falciparum. This protozoan parasite causes malaria, which affects around three hundred million people and kills more than four thousand people every day worldwide.
Drug resistance has emerged for all classes of antimalarials except artemisinins. The main reason is that the available drugs focus on a limited number of biological targets, which produces cross-resistance among antimalarials. There is a consensus that substantial scientific effort is needed to identify new targets for antimalarials.
The main problem is that developing new drugs with new targets is a costly and lengthy process, and the economic profit is not clear for drug manufacturers.
This application consists of the deployment of a high-throughput virtual screening platform aimed at in silico drug discovery for neglected diseases. The WISDOM platform performs high-throughput virtual docking of millions of chemical compounds, available in ligand databases, against several targets of Plasmepsin.
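
Because every ligand-target docking is independent of the others, a screening of this kind can be split into many separate Grid jobs. The sketch below shows one possible way to partition the work. The function name, the identifiers and the batch size are hypothetical, and the actual WISDOM workflow and its job submission mechanism are not described in this document.

```python
from itertools import islice, product


def docking_jobs(ligands, targets, pairs_per_job):
    """Yield lists of (ligand, target) pairs; each list would be one Grid job."""
    if pairs_per_job < 1:
        raise ValueError("pairs_per_job must be at least 1")
    pairs = product(ligands, targets)  # lazy: never holds all pairs in memory
    while True:
        batch = list(islice(pairs, pairs_per_job))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Hypothetical identifiers standing in for compound and target entries.
    ligands = [f"ligand_{i}" for i in range(10)]
    targets = ["target_A", "target_B", "target_C"]
    for number, job in enumerate(docking_jobs(ligands, targets, pairs_per_job=8)):
        print(number, len(job))
```

The batch size is a tuning choice: larger batches reduce the per-job scheduling overhead, while smaller ones make it cheaper to resubmit a job that failed.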

BLAST (Basic Local Alignment Search Tool)



One of the most important efforts in the analysis of the genome is the study of the functionality of the different genes and regions. Sequence alignments provide a powerful way to compare novel sequences with previously characterized genes. Both functional and evolutionary information can be inferred from well-designed queries and alignments.
BLAST finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches.
Finding homologous sequences is a very computationally intensive process. The size of the non-redundant databases currently available increases daily, and already exceeds a gigabyte. Searching for the alignments of a single sequence is not a costly task, but normally thousands of sequences are searched at the same time. Moreover, since the databases are periodically updated, it is convenient to periodically update the results of previous studies.
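
The notion of local similarity can be illustrated with the Smith-Waterman dynamic programming recurrence, sketched below. This is not the BLAST algorithm, which relies on faster heuristics and also computes statistical significance; the scoring values are arbitrary assumptions chosen for the example.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between sequences a and b."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current[j] = max(
                0,                              # start a new local alignment
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            )
            best = max(best, current[j])
        previous = current
    return best


if __name__ == "__main__":
    # Two toy sequences sharing the local region "ACGT".
    print(local_alignment_score("TTACGTTT", "GGACGTGG"))
```

The cost of this recurrence grows with the product of the two sequence lengths, which gives an idea of why searching thousands of queries against a large database benefits from distributing the queries over many Grid nodes.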

Phylogeny (MrBayes)



A phylogeny is a reconstruction of the evolutionary history of a group of organisms. Phylogenies are used throughout the life sciences, as they offer a structure around which to organize the knowledge and data accumulated by researchers.
The inference of phylogenies with computational methods is widely used in medical and biological research and has many important applications, such as gene function prediction, drug discovery and conservation biology. Bayesian inference is a powerful mathematical method which is implemented in the program MrBayes for estimating phylogenetic trees that are based on the a posteriori probability distribution of the trees.
Compared to other methods, Bayesian inference takes full advantage of the information contained in the alignment of DNA sequences when estimating phylogenies, and it can even make use of morphological data. The complexity of large-scale phylogeny studies represents a true computational grand challenge. Due to the nature of Bayesian inference, the simulation can become trapped in local maxima. To overcome local maxima and achieve a better estimate, the MrBayes program has to run for millions of iterations (generations), which requires a large amount of computation time.
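
Entrapment in a local maximum can be illustrated with a plain Metropolis sampler on a toy one-dimensional target with two well-separated peaks. This is only a sketch under stated assumptions: the target density, the step size and the function names are hypothetical, and the code does not reproduce the sampling scheme or the tree proposals used by MrBayes.

```python
import math
import random


def log_target(x, centre=3.0, width=0.5):
    """Log of an unnormalised density with two equal peaks at -centre and +centre."""
    a = -((x + centre) ** 2) / (2 * width ** 2)
    b = -((x - centre) ** 2) / (2 * width ** 2)
    m = max(a, b)  # log-sum-exp trick to avoid numerical underflow
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis(start, step, n_generations, seed=0):
    """Run a random-walk Metropolis chain and return the visited states."""
    rng = random.Random(seed)
    x = start
    chain = []
    for _ in range(n_generations):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept uphill moves always, downhill moves with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        chain.append(x)
    return chain


if __name__ == "__main__":
    chain = metropolis(start=-3.0, step=0.2, n_generations=2000)
    visited_right_peak = any(state > 0 for state in chain)
    print("right-hand peak visited:", visited_right_peak)
```

With a small step, a chain started in the left-hand peak is very unlikely to cross the low-probability region between the peaks within a short run, so half of the target goes unexplored. Longer runs, or several chains, are the usual remedies, which is the source of the computational cost mentioned above.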

Bioinformatics portal



The Ibero-American Portal of Bioinformatics (http://portal-bio.ula.ve), installed at the National Centre for Scientific Computation (CeCalCULA) of the Universidad de Los Andes in Venezuela, is an initiative for disseminating findings in the Bioinformatics area in Venezuela and other Spanish-speaking countries. It was developed before the EELA starting date.
This portal of portals results from the integration of several servers developed at CeCalCULA and aims to create on-line academic and research communities.
It also offers several on-line applications for registered users, the number of which is expected to increase by joining EELA. Thus, the availability of an independent Grid-enabled version integrated into the Bioinformatics Portal will provide registered users with results in a shorter time, through the BLAST application or a Grid service for the parallelised version of MrBayes.
When estimating the resources needed, it must be considered that not all users normally work on the portal simultaneously. Peak usage is estimated at around 50-100 simultaneous users. The Bioinformatics portal's own resources (10 Opteron processors and 36 GBytes of RAM) are adequate for setting up the basic services, but will not be enough to deploy a production system for mpiBLAST or MrBayes. Linking the portal to the EELA Grid is necessary to meet the estimated computational demand. The centre's 24 Mbps network connection is expected to be in place in September 2006, so performance will not be penalised if computing is moved within the Grid.

GrEMBOSS



The EMBOSS application is being ported to the Grid, creating a new application called GrEMBOSS. With this free, open-source analysis package, developed specifically for the needs of the molecular biology user community, EELA will continue working in the biomedical field and will increase its number of users. The versions to be ported are the latest releases, i.e. 4.0 and 4.1, together with the databases, which will be stored in the Storage Elements and the LFC by means of the GFAL library.
The job submission script, run from a UI, issues all the necessary commands and handles the databases and the EMBOSS tool so that the final results are obtained transparently for the user.

GAMOS / MIRaS



New tools for simulating the effects of radiation on the human body have recently joined the EELA Project. They are GAMOS and MIRaS, which complement the scope of GATE: they are also based on GEANT4 and Monte Carlo simulations, but their application is not in the field of Nuclear Medicine. Their main lines of activity are medical imaging, radiotherapy, medical computing, biomedical engineering and radiation protection of the patient. The aim is thus to offer the medical, physics, medical physics and engineering communities open tools to design their own medical imaging systems, to verify and plan the treatment of diseases, and in general to understand the physics involved in many areas of healthcare.
OBJECTIVES

The EELA proposal focuses on transferring the organisational model of the European project EGEE to the Latin American countries. It therefore shares many interests and strategies with EGEE and exploits the know-how already generated in that project. The experience and the large amount of information concerning the groups in Latin America should also be considered an important contribution in return to the EGEE project.

The objective of the Biomedical Application Identification and Support task is twofold: first, to migrate and deploy the biomedical applications already identified and migrated in EGEE, taking advantage of the existing know-how on the applications, their grid-enabling and their deployment; second, to assist in the identification, migration and deployment of new biomedical applications that could arise from the new users of the Grid.

EXPECTED RESULTS
  • The running of at least 2 EGEE biomedical applications in the EELA Pilot Testbed. These applications will be used to disseminate the Grid culture among the Latin American partners and to raise the interest of the LA Health Authorities. Given the present state of the art, GATE and WISDOM should be the first applications deployed.

  • The identification of at least two new applications that could solve problems more relevant to local LA partners, and their use in dissemination activities through the Pilot Testbed.

  • To achieve at least 3000 executions of the biomedical applications from at least 50 users.

  • To transfer at least 1 of the new EELA applications to the EGEE infrastructure.

  • To have an impact in at least 4 different countries (European + Latin American).


References

- Project EELA, Technical Annex I. Available at: http://www.eu-eela.org/

- EGEE, http://www.eu-egee.org

- EGEE BioMed Applications, http://egee-na4.ct.infn.it/biomed/applications.html

- HealthGrid collaboration, "HealthGrid White Paper", http://whitepaper.healthgrid.org/
- S. Jan et al., "GATE: a simulation toolkit for PET and SPECT", submitted to Phys. Med. Biol.

- OpenGATE collaboration, http://www-lphe.epfl.ch/~PET/research/gate/

- L. Maigne, GATE Application in EGEE http://egee-na4.ct.infn.it/biomed/gate.html

- Wide In Silico Docking Of Malaria (WISDOM), home page http://wisdom.healthgrid.org

- BLAST: http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/information3.html

- BLAST Processing service: http://www.ncbi.nlm.nih.gov/blast/

- NCBI BLAST home page http://www.ncbi.nlm.nih.gov/blast/

- mpiBLAST: Open-Source Parallel BLAST home page, http://mpiblast.lanl.gov/

- K. Lesheng, "Phylogenetic Inference Using Parallel Version of MrBayes", http://www.nus.edu.sg/comcen/svu/publications/hpc_nus/sep_2005/ParallelMrBayes.pdf

- F. Ronquist, J. P. Huelsenbeck, "MrBayes 3: Bayesian phylogenetic inference", Bioinformatics 19 12, 1572-1574 (2003)

- Requirements of the BioMed VO, http://egee-na4.ct.infn.it/requirements/

- L.Maigne et al., "Parallelization of Monte Carlo Simulations and Submission to a Grid Environment", Parallel Processing Letters journal 14 2, 177-196 (2004)

- GEANT4, http://geant4.web.cern.ch/geant4/

- The LHCb experiment, http://lhcb-public.web.cern.ch/lhcb-public/default.htm

- D. Navarro "Epidemiología de las enfermedades del tiroides en Cuba", Rev Cubana Endocrinología 15 (2004)

- J. Alert, J. Jiménez, "Tendencias del tratamiento radiante en los tumores del sistema nervioso central", Rev Cubana Med 43 2-3 (2004)

- J.L. Valenciaga Rodríguez et al., "Cáncer de tiroides en Cuba: estudio de 14 años" Revista Cubana de Endocrinología 16 3 (2005)

- National Cancer Institute (NCI) May 2005, http://clinicaltrial.gov

- SIMDAT, http://www.scai.fraunhofer.de/simdat.html

- SwissBioGRID, http://www.swissbiogrid.com

- The Swiss Institute of Bioinformatics, http://www.isb-sib.ch/

- INSTRUIRE, Auvergrid, http://www.auvergrid.fr

- CampusGRID, http://www.campusgrid.upv.es

- V. Breton, "Grid added value to fight neglected diseases", Wisdom Open Day, http://www.scai.fraunhofer.de/fileadmin/download/vortraege/wisdom/wisdom_breton.pdf

- A. Brandling-Bennet, F. Penheiro, "Infectious Diseases in Latin America and the Caribbean: Are They Really Emerging and Increasing?", Emerging Infectious Diseases 2 1, 59-61 (1996)

- National Centre for Biotechnology Information (NCBI) http://www.ncbi.nlm.nih.gov/

- Grid Protein Sequence Analysis web portal, http://gpsa.ibcp.fr

- CeCalCULA, http://www.cecalc.ula.ve

Dissemination Activities

  • Two Information Sheets
  • Madrid (Spain), EELA KoM and 1st Workshop, 1-2 February 2006
    • A presentation of the EELA Biomed Applications.
  • Mérida (Venezuela), Taller de Herramientas para Análisis de Secuencias (THAS), 27-31 March 2006
    • A presentation of the status of the EELA applications BiG and Phylogeny
  • Abarca, et al. Building a Network in Latin America: e-Infrastructure and Applications. Proceedings of the Spanish Conference on e-Science Grid Computing 1, 83-96 (2007)
  • Mérida (Venezuela), EELA 2nd Workshop, 24-25 April 2006
    • A presentation of the status of the EELA Biomed Applications.
  • Santander (Spain), Workshop On Complex Systems: New trends in Technology, 5-9 June 2006
    • A presentation of the status of the EELA Biomed and Climate Applications
  • Lisbon (Portugal), EU-LAC Summit, 28-29 April 2006
    • A demonstration of GATE in LA
  • Valencia (Spain), WISDOM and SHARE Workshop, 6 June 2006
    • A presentation of the status of the EELA plans for the Data Challenge.
  • Valencia (Spain), IV HealthGrid Conference, 7-9 June 2006
    • Participation in the WISDOM and SHARE Workshop.
    • Poster with all the EELA Biomed Applications.
    • Paper in Studies In Health Technology and Informatics 120, 397-400 (2006)
    • Talk about Technical Details of BiG.
    • Stand of the UPV with a Demo on BiG and a Poster and Shared Flyer.
    • Discussions on Biomed Collaboration
      • International Relationships Coordinator of the Latin American Bioinformatics Network.
  • Bogota (Colombia), Universidad de Los Andes, 16-19 June 2006
    • Invited seminar "Developing new drug in Malaria"
  • Itacuruçá (Brazil), EELA 3rd Workshop, 24-25 June 2006
    • A presentation of the status of the EELA Biomed Applications
  • Brussels (Belgium), ICT for BIO-Medical Sciences Meeting, 29-30 June 2006
    • A presentation of the status of the EELA Biomed Applications
    • Concertation Meeting on HealthGrid - Health Information Infrastructure and Applications.
  • Sardinia (Italy), NETTAB-Network Tools and Applications in Biology Conference, 10-13 July 2006
    • Invited Plenary Talk about the Biomedical Applications
    • Paper in Proceedings of the NETTAB conferences 6, 8-13 (2006)
  • Santiago (Chile), XXXII Conferencia Latinoamericana de Informática - CLEI, 20-25 August 2006
    • Invited Plenary Talk "HealthGrids: Challenges and Opportunities"
  • Santiago (Chile), Coordination Meeting of CYTEDGRID Project, 22-26 August 2006
    • A presentation of the status of the EELA Biomed Applications
  • Santiago (Chile), 1st EELA Conference, 4-5 September 2006
    • A presentation of the status of the EELA Biomed Applications
    • Demo in Phylogeny, Biomed Application
  • Mérida (Venezuela), Taller de Herramientas para Análisis de Secuencias (THAS), 4-8 September 2006
    • A presentation of the status of the EELA applications BiG and Phylogeny
  • Geneva (Switzerland), EGEE06 Conference, 25-29 September 2006
    • A presentation of the deployment of the EELA Applications in the SEE-GRID Regional Grids Workshop
    • Demo in Phylogeny, Biomed Application
    • A presentation of BiG in the NA4 parallel session
    • A presentation of EELA Biomed Applications in the NA4 parallel session
  • San José (Costa Rica), Centro de Biotecnologia, 9-13 October 2006
    • Invited Talk "Find new targets in Malaria"
  • Mérida (Venezuela), ULA course "Genome in Merida", 16-20 October 2006
    • Invited talk "Genome phylogenetic with MrBayes"
  • Popayán-Cauca (Colombia), 2nd International Seminar on Genomics, Proteomics and Bioinformatics, 25-27 October 2006
    • Invited talk "Computación de Alto Rendimiento en GRID. Actividades del Proyecto EELA en Bioinformática y Clima"
  • Granada (Spain), Jornadas Técnicas RedIRIS 2006 y XXII Grupos de Trabajo, 13-17 November 2006
    • A presentation of the EELA Applications
  • Lima (Peru), EELA 4th Workshop, 11-12 January 2007
    • A presentation of the status of the EELA Biomed Applications
  • Mérida (Venezuela), Taller de Herramientas para Análisis de Secuencias (THAS), 22-26 January 2007
    • A presentation of the status of the EELA applications BiG and Phylogeny
  • Bogotá (Colombia), EELA 5th Workshop, 5 March 2007
    • A presentation of the status of the EELA Applications
  • La Plata (Argentina), EELA 6th Workshop, 29-30 March 2007
    • A presentation of the status of the EELA Applications
  • Mérida (Venezuela), Taller de Herramientas para Análisis de Secuencias (THAS), 23-27 April 2007
    • A presentation of the status of the EELA applications BiG and Phylogeny
  • Geneva (Switzerland), V HealthGrid Conference, 24-27 April 2007
    • Participation in the WISDOM and SHARE Workshop.
    • Paper in Studies In Health Technology and Informatics 126, 31-36 (2007)
  • Manchester (UK), EGEE User Forum, 9-11 May 2007
    • A presentation of the EELA application BiG
    • A presentation of the EELA applications
  • Maputo (Mozambique), IST-Africa, 9-11 May 2007
    • A presentation of the EELA applications
    • Paper in Proceedings of IST-Africa Conference
  • Santiago de Compostela (Spain), IBERGRID Conference, 14-16 May 2007
    • A presentation of the EELA applications
    • Paper in Proceedings of IBERGRID Conference 1, 29-35 (2007)
  • Rio de Janeiro (Brazil), LAGrid Conference, 14-17 May 2007
    • A presentation of the EELA applications
    • Paper in Proceedings of the LAGrid conference
  • Varadero (Cuba), EELA Workshop, 29-30 May 2007
    • A presentation of the status of the EELA Applications
  • Torremolinos (Spain), Bioinformatics 2007, 11-14 June 2007
    • A presentation of the status of the EELA applications
  • Pisa (Italy), NETTAB Conference, 12-15 June 2007
    • Paper in Proceedings of the NETTAB Conference 7, 145-156 (2007)
  • Rio de Janeiro (Brazil), XXVII Congresso de la SBC, 30 Jun-06 Jul 2007
    • A presentation of the status of the EELA applications
  • Budapest (Hungary), EGEE Conference, 1-5 October 2007
    • A presentation of the status of the EELA applications
  • La Antigua (Guatemala), 7th EELA Workshop, 17 Oct 2007
    • A presentation of the status of the EELA applications
  • Mexico City (Mexico), 8th EELA Workshop, 22 Oct 2007
    • A presentation of the status of the EELA applications
  • Margarita (Venezuela), XVIII congreso internacional de parasitologia, 21-25 October 2007
    • A presentation of WISDOM and BiG applications


Links to slides presented in conferences, papers, posters, etc. can be found here:

- WP3 DOCUMENTS

- EELA DOCUMENTS
PARTICIPANTS