Date and time: May 21, 2015 (10:30 – 12:00).
Room: 336 RB. Non-standard venue!
RExtractor: a Robust Information Extractor
- Vincent Kríž, ÚFAL MFF UK, Praha
A year ago, we presented our initial steps towards linguistic processing of texts to detect entities and the relations between them. This work was an essential part of the INTLIB project, whose aim is to provide a tool for querying textual documents that is more efficient and user-friendly than full-text search.
Now we present the RExtractor system, which processes input documents with natural language processing tools and subsequently queries the parsed sentences to extract a knowledge base of entities and their relations. The workflow of the system is designed to be language and domain independent. We demonstrate RExtractor on Czech and English legal documents. In addition, we discuss RExtractor with respect to its deployment in search engines used by customers from a particular domain.
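To make the "parse, then query" idea concrete, here is a minimal Python sketch. It is not RExtractor's actual code or query language: the `Token` class, the `extract_relations` function, and the dependency labels `subj`, `obj`, and `root` are all hypothetical, and the parse is written by hand rather than produced by a real parser. It only illustrates how querying a dependency-parsed sentence can yield entity-relation triples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One token of a dependency-parsed sentence (hypothetical structure)."""
    idx: int     # 1-based position in the sentence
    form: str    # surface word form
    head: int    # idx of the governing token, 0 for the root
    deprel: str  # dependency label (assumed label set: root, subj, obj)


def extract_relations(sentence):
    """Return (subject, predicate, object) triples found in one parsed sentence."""
    triples = []
    for verb in sentence:
        if verb.deprel != "root":
            continue
        # Query the tree: collect direct dependents of the root verb by label.
        subjects = [t for t in sentence if t.head == verb.idx and t.deprel == "subj"]
        objects = [t for t in sentence if t.head == verb.idx and t.deprel == "obj"]
        for s in subjects:
            for o in objects:
                triples.append((s.form, verb.form, o.form))
    return triples


# Hand-written parse of the toy sentence "Provider submits report".
parsed = [
    Token(1, "Provider", 2, "subj"),
    Token(2, "submits", 0, "root"),
    Token(3, "report", 2, "obj"),
]

knowledge_base = extract_relations(parsed)
print(knowledge_base)
```

In this sketch the parser output and the query are decoupled, which is one way to read the claim of language and domain independence: under that assumption, a different language would need a different parser and a different domain would need different queries, while the surrounding workflow stays the same.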
Downloads: slides 1