We investigate different models and algorithms for querying data (or resources) through possibly heterogeneous and distributed ontologies.
List of participants: F. Jouanot (Associate Professor), M.-Ch. Rousset (Professor), A. Termier (Associate Professor), G. Vargas-Solar (CR1), Ch. Collet (Professor), R. Tournaire (PhD 2007- ), S. Tandabany (PhD 2007-)
The web has profoundly changed how modern data management systems are conceived, and has forced the community to revisit the problem of querying data that are distributed, possibly heterogeneous, and poorly structured or semi-structured. This shift will be amplified by the miniaturization of storage devices connected to the network, which opens new possibilities and raises new challenges for integrating heterogeneous, decentralized and context-sensitive data. Reasoning on context and data semantics is one of the keys to addressing those challenges in a principled way. The group investigates the algorithmic issues that determine the scalability of querying data through possibly heterogeneous and distributed ontologies.
We plan to extend our work on data semantics in two main directions:
We investigate models, algorithms and tools for coordinating services with non-functional properties (contracts) and for providing access to heterogeneous data coming from services.
List of participants: Ch. Bobineau (Associate PR), Ch. Collet (PR), G. Vargas-Solar (CR1), Javier Alfonso Espinosa-Oviedo (PhD 2009- , co-direction France - Mexico), Alberto Portilla-Flores (PhD 2005-2009, co-direction France - Mexico), Victor Cuesvas-Vicenttín (PhD 2006- )
Composing services exported by different organisations is a key issue when building large-scale, data-intensive applications and systems. It becomes even more crucial for services within ubiquitous infrastructures made of heterogeneous devices, servers and applications connected through heterogeneous systems. Composition must take into account the characteristics of these ecosystems (e.g., memory, computing and network capabilities). The composition process uses this knowledge or semantics to dynamically discover and coordinate (ubiquitous) services, and then to adapt the coordination as services change or become unavailable. Another important challenge is to consider non-functional aspects and QoS (quality of service) criteria such as availability, reliability and temporal constraints, which are crucial when composing data services dynamically. In the group, we investigate models, algorithms and tools for:
We plan to extend our work on reliable and autonomic data services composition in three directions:
List of participants: Ch. Collet (Professor), M.-Ch. Rousset (Professor), Ch. Bobineau (Associate Professor), F. Jouanot (Associate Professor), A. Termier (Associate Professor), Benjamen Negrevergne (PhD 2008- )
Query optimization in distributed and dynamic systems
Accessing data concerns several dimensions of large-scale systems: number of resources, data volume and data complexity. Current systems that are large-scale in number of resources include grids, peer-to-peer networks, sensor networks, and ambient and ubiquitous environments. The most popular way to access data in these systems conveniently and efficiently is still to use declarative queries that are optimized according to system characteristics. Because these systems are highly dynamic, classical distributed query evaluation techniques are not applicable. A global view of the system is not possible: relevant data sources cannot be known a priori, and metadata useful for query evaluation are not always available. In addition, the evaluation strategy for a query has to adapt dynamically to fluctuating conditions and to users with different needs. For example, some may want to maximize performance while others may need to minimize energy consumption. Building on its previous work on adaptive query processing, the HADAS group focuses on new approaches to efficient query evaluation that take into account the needs of applications running on large-scale systems.
We plan to extend our work on efficient query evaluation in two main directions:
- Machine-learning-based adaptive query evaluation. In distributed environments where metadata are lacking, classical query evaluation techniques cannot be applied. We propose machine learning techniques that exploit measures taken during previous query executions to improve the performance of future query evaluations (case-based reasoning).
- Data and network management in dynamic ad-hoc networks. In distributed environments, queries have to be decomposed into subqueries that are evaluated on different nodes of the network. In dynamic environments, there is no knowledge about data distribution (localization and volumes). We propose to combine network and data management by viewing the whole network as a dynamic distributed database system. This work has been started in collaboration with the LIAMA in China, and promising results have already been obtained.
These two directions will be explored in the setting of the ANR Blanc 2009 project UBIQUEST and in a collaboration with CEA-LETI.
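The idea of reusing measures from previous query executions (case-based reasoning) can be illustrated with a minimal sketch. It is not the group's actual technique: the `Case` record, the feature vector, the plan names and the nearest-neighbour selection rule are all assumptions made for illustration. The recorded cost may be elapsed time, energy consumption, or any other measure the application wants to minimize.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Case:
    """One past query execution (hypothetical record layout)."""
    features: Tuple[float, ...]  # e.g. (number of sources, observed selectivity)
    plan: str                    # identifier of the evaluation plan that was used
    cost: float                  # measured cost: time, energy, ...


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def choose_plan(case_base: List[Case], features: Sequence[float],
                k: int = 3, default: str = "default_plan") -> str:
    """Pick the plan with the lowest mean cost among the k most similar cases."""
    if not case_base:
        return default  # no experience yet: fall back to a fixed plan
    nearest = sorted(case_base, key=lambda c: distance(c.features, features))[:k]
    costs: Dict[str, List[float]] = {}
    for case in nearest:
        costs.setdefault(case.plan, []).append(case.cost)
    return min(costs, key=lambda plan: sum(costs[plan]) / len(costs[plan]))


# Assumed example data: two plans observed under different conditions.
case_base = [
    Case((1.0, 0.1), "hash_join", 5.0),
    Case((1.0, 0.2), "nested_loop", 9.0),
    Case((8.0, 0.9), "nested_loop", 2.0),
]
best = choose_plan(case_base, (1.0, 0.15), k=2)
```

After each execution, the measured cost would be appended to the case base as a new `Case`, so that later choices benefit from it.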
Mining large amounts of data to extract patterns of interest
Data mining is another way to access large quantities of data, by extracting interesting patterns from them. Such patterns provide meaningful abstractions of raw data, which are thus less numerous and more appropriate for data analysis. The group works on pattern mining in complex data such as sequences, trees or graphs, which are found in many applications in chemistry (e.g. graphs representing molecules) or in bioinformatics (e.g. gene regulation networks).
The focus will be on designing and deploying parallel pattern mining algorithms on multicore processors. The DAMOCLES project, which is just starting (supported by the MSTIC pole of UJF), will investigate “DAta Mining for On Chip Low Energy systems”. This project involves the HADAS and MESCAL teams of LIG and the machine architecture team of the TIMA laboratory in Grenoble (F. Petrot, SLS team).
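To give a concrete idea of what pattern mining computes, here is a minimal sketch of level-wise (Apriori-style) frequent pattern mining. It is a deliberate simplification: it mines itemsets rather than the sequences, trees or graphs mentioned above, and the function name, the sample data and the support threshold are assumptions for illustration only. The support counts of the candidates at one level are independent of each other, which is the kind of work that could be split across cores.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def frequent_itemsets(transactions: Iterable[Iterable[str]],
                      min_support: int) -> Dict[FrozenSet[str], int]:
    """Return every itemset contained in at least min_support transactions."""
    data: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    def support(itemset: FrozenSet[str]) -> int:
        # Number of transactions that contain the whole itemset.
        return sum(1 for t in data if itemset <= t)

    items = sorted({item for t in data for item in t})
    level = [frozenset([item]) for item in items]  # candidates of size 1
    result: Dict[FrozenSet[str], int] = {}
    while level:
        # Count each candidate independently and keep the frequent ones.
        frequent = {c: s for c in level if (s := support(c)) >= min_support}
        result.update(frequent)
        keys = list(frequent)
        size = len(keys[0]) + 1 if keys else 0
        # Join step: merge frequent itemsets that differ by a single item.
        candidates = {a | b for a, b in combinations(keys, 2)
                      if len(a | b) == size}
        # Prune step: every subset one item smaller must itself be frequent.
        level = [c for c in candidates
                 if all(frozenset(sub) in frequent
                        for sub in combinations(c, size - 1))]
    return result


# Assumed sample data: five small transactions over items a, b, c.
sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
patterns = frequent_itemsets(sample, min_support=3)
```

With this sample, every single item and every pair reaches the threshold, while the triple does not and is therefore discarded.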