Faculty of Informatics and Statistics, Department of Information and Knowledge Engineering (DIKE)

Date and time: April 5, 2012 (10:30 – 12:00).

Room: 403 NB

Presentations

Extracting Structured Data about Product and Job Offers from Semi-Structured Web

Speaker

  • Aleš Pouzar, KIZI VŠE

The presentation focuses on practical aspects of information extraction based on extraction ontologies. Several experiments were performed with the Ex system using three types of extraction knowledge: manually written rules, formatting regularities, and machine learning models. The development of extraction ontologies is demonstrated on the e-commerce and job offer domains. The goal is to obtain fine-grained structured data that can be used to populate domain ontologies or in real applications such as product comparison and job offer search. The talk also covers the advantages of the presented approach, its limitations and constraints in the selected domains, and possible solutions to them.

(Slides are in Czech.)

Downloads: slides 1 
