Faculty of Informatics and Statistics, Department of Information and Knowledge Engineering (DIKE)

Date and time: May 21, 2015 (10:30 – 12:00). Non-standard date or time!

Room: 336 RB. Non-standard venue!


RExtractor: a Robust Information Extractor


  • Vincent Kríž, ÚFAL MFF UK, Praha

A year ago, we presented our initial steps towards linguistic processing of texts to detect entities and the relations between them. This work was an essential part of the INTLIB project, whose aim is to provide a tool for querying textual documents that is more efficient and user-friendly than full-text search.
Now we present the RExtractor system, which processes input documents with natural language processing tools and subsequently queries the parsed sentences to extract a knowledge base of entities and their relations. The system's workflow is designed to be language- and domain-independent. We demonstrate RExtractor on Czech and English legal documents. In addition, we discuss RExtractor with respect to its deployment in search engines used by customers from a particular domain.

Downloads: slides 1 
