Methodology for Mining Comprehensible Rules from Sequential Data
Project leader | Kilian Stoffel |
Collaborator | Paul Cotofrei |
Abstract |
The purpose of this project is to respond to a real need, that of
discovering knowledge from huge data collections comprising
multiple sequences that evolve over time, by proposing a
methodology for temporal rule extraction. To obtain what we call
temporal rules, a discretisation phase that extracts events from
raw data is applied first, followed by an inference phase, where
classification trees are constructed based on these events. The
discrete and continuous characteristics of an event, according to
its definition, allow the use of statistical tools as well as of
techniques from artificial intelligence on the same data. A theoretical framework for this methodology, based on first-order temporal logic, is also defined. This formalism permits a formal definition of the main notions (event, temporal rule, constraint). The concept of a consistent linear time structure allows us to introduce the notions of general interpretation, support and confidence, the last two measures being the expression of the two analogous concepts used in data mining. These notions open the possibility of using statistical approaches in the design of algorithms for inferring higher-order temporal rules, called temporal meta-rules. The formalism is then extended to capture the concept of time granularity. To keep a unified view of the meaning of the same formula at different time scales, the usual definition of the interpretation of a predicate symbol is changed in the framework of a temporal granular logic: it now returns a degree of truth (a real value between zero and one) rather than a truth value (true or false). Finally, a probabilistic model is attached to the initial formalism to define a stochastic first-order temporal logic. Using advanced theorems from stochastic limit theory, it was possible to prove that a certain amount of dependence (called near-epoch dependence) is the highest degree of dependence that is still sufficient to induce the property of consistency. |
Keywords |
temporal data mining, formalism of temporal rules |
Web page | http://www2.unine.ch/imi/page-18327.html |
Project type | Basic research |
Research field | computer science |
Funding source | FNS (Swiss National Science Foundation) |
Status | Completed |
Project start | 1-4-2001 |
Project end | 30-9-2003 |
Allocated budget | 103068 |
Contact | Kilian Stoffel |
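
The abstract describes a discretisation phase that turns raw data into events, and rules that are evaluated through support and confidence. The following Python fragment is only an illustrative sketch of those two ideas and not the project's actual algorithm: the function names `discretise` and `rule_stats`, the three event labels and the fixed one-step lag are all assumptions made for the example, and support and confidence follow the common data-mining definitions, which the project's formal definitions may refine. The classification-tree inference phase is not shown.

```python
from typing import List, Sequence, Tuple


def discretise(values: Sequence[float], threshold: float = 0.0) -> List[str]:
    """Hypothetical discretisation: label each step-to-step change as an event."""
    events = []
    for prev, curr in zip(values, values[1:]):
        delta = curr - prev
        if delta > threshold:
            events.append("up")
        elif delta < -threshold:
            events.append("down")
        else:
            events.append("flat")
    return events


def rule_stats(events: Sequence[str], body: str, head: str,
               lag: int = 1) -> Tuple[float, float]:
    """Support and confidence of the rule 'body at time t => head at time t + lag'.

    support    = fraction of usable time points where body and head both hold
    confidence = fraction of time points with body where head also holds
    """
    n = len(events) - lag  # time points for which t + lag still exists
    if n <= 0:
        return 0.0, 0.0
    body_count = sum(1 for t in range(n) if events[t] == body)
    both_count = sum(
        1 for t in range(n) if events[t] == body and events[t + lag] == head
    )
    support = both_count / n
    confidence = both_count / body_count if body_count else 0.0
    return support, confidence


# Made-up series, used only to exercise the two functions.
series = [1.0, 2.0, 3.0, 2.5, 3.5, 4.5, 4.0, 5.0]
events = discretise(series)
support, confidence = rule_stats(events, body="up", head="up")
print(events)
print(support, confidence)
```

On this made-up series, an "up" event is followed by another "up" event in half of the cases, so the rule has confidence 0.5. A real application would use richer event definitions and rules with several conditions in the body.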