Faculty of Informatics and Statistics, Department of Information and Knowledge Engineering (DIKE)

Date and time: May 11, 2006 (10:30–12:00). Non-standard date or time!

Room: 403 NB


Frequent patterns for Natural Language Processing


  • Jan Blaťák, NLP Lab, Faculty of Informatics, Masaryk University (MUNI), Brno
  • Luboš Popelínský, NLP Lab, Faculty of Informatics, Masaryk University (MUNI), Brno

Frequent pattern mining is one of the most important tasks in descriptive data mining. Frequent patterns have also been used successfully for data preprocessing and classification.
The presentation is organized as follows. First, we define frequent patterns in both propositional and first-order logic and mention algorithms for mining them. We also define emerging and jumping emerging patterns.
Then we briefly describe RAP, a system for mining long first-order frequent patterns in multi-relational data. Next we present tRAPe, a general framework for frequent pattern mining in text. Two methods of using patterns will be described: feature construction (propositionalization) and classification based on associations (CBA).
Finally, we describe experiments with tRAPe on three tasks: information extraction from biological texts, context-sensitive text correction for English, and morphological disambiguation of Czech.
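For readers unfamiliar with the terms used above, the following is a minimal propositional sketch in Python. It is not taken from RAP or tRAPe. It assumes the usual textbook definitions: a pattern is frequent if its support reaches a threshold, it is emerging if its support grows by at least a given factor from one data set to another, and it is jumping emerging if it occurs in one data set and never in the other. All function names and the toy context-word data are hypothetical.

```python
from itertools import combinations


def support(pattern, transactions):
    """Fraction of transactions that contain every item of `pattern`."""
    if not transactions:
        return 0.0
    return sum(pattern <= t for t in transactions) / len(transactions)


def frequent_patterns(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets with support >= min_support."""
    items = set().union(*transactions) if transactions else set()
    level = [frozenset([i]) for i in items]
    size = 1
    frequent = {}
    while level:
        kept = []
        for p in level:
            s = support(p, transactions)
            if s >= min_support:
                frequent[p] = s
                kept.append(p)
        # Candidates one item larger, built only from patterns that were frequent.
        size += 1
        level = list({a | b for a, b in combinations(kept, 2) if len(a | b) == size})
    return frequent


def growth_rate(pattern, background, target):
    """Ratio of support in `target` to support in `background`."""
    s_bg = support(pattern, background)
    s_tg = support(pattern, target)
    if s_bg == 0:
        return float("inf") if s_tg > 0 else 0.0
    return s_tg / s_bg


def is_emerging(pattern, background, target, min_growth):
    return growth_rate(pattern, background, target) >= min_growth


def is_jumping_emerging(pattern, background, target):
    # Present in the target data, absent from the background data.
    return growth_rate(pattern, background, target) == float("inf")


# Hypothetical toy data: context words seen around two confusable words.
there_ctx = [{"is", "over"}, {"is", "a"}, {"was", "over"}]
their_ctx = [{"house", "is"}, {"car", "a"}, {"house", "a"}]

freq = frequent_patterns(there_ctx, min_support=0.5)
print(sorted(sorted(p) for p in freq))
print(growth_rate(frozenset({"is"}), their_ctx, there_ctx))
print(is_jumping_emerging(frozenset({"over"}), their_ctx, there_ctx))
```

The first-order patterns discussed in the talk are conjunctions of logical atoms over several relations rather than flat itemsets, so this sketch covers only the propositional case.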
Related resources:
* http://www.fi.muni.cz/kd/projects/rap/
* http://www.fi.muni.cz/kd

Downloads: slides 1 
