Faculty of Informatics and Statistics, Department of Information and Knowledge Engineering (DIKE)

Date and time: September 29, 2005 (10:30 – 12:00).

Room: 403 NB


Framework for annotation and composition of web space analysis tools


  • Vojtěch Svátek, KIZI, VŠE Praha

Various tools for analysis of the WWW space have been developed by different communities, e.g. those of Information Retrieval, Information Extraction or Document Classification. Combining different elementary tools into a more complex application would, however, require relatively deep familiarity with the tools and could only be done by manual programming.
One of the goals of the Rainbow project (see http://rainbow.vse.cz) is to partially automate this process. The tools can be characterised (and annotated) by their position in a 4-dimensional space called TODD (for "Task-Object-Data-Domain"), where each dimension is linked to a simple ontology. Construction of applications can then be eased by generic templates that capture typical tool sequences and constraints.
The approach has been partially evaluated:
1. by describing several existing web space analysis applications in terms of the TODD framework, and
2. by simulating (in Prolog) the composition of a classification application in the pornography recognition domain.
The project has been carried out in collaboration with M. Labsky and M. Vacura from UEP and A. ten Teije from VU Amsterdam.
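
To make the idea of TODD annotation and template-based composition more concrete, the following is a minimal Python sketch and not the Rainbow implementation (the evaluation described above was simulated in Prolog). Only the four dimension names come from the abstract. The tool names, dimension values, template slots and helper functions are hypothetical, and exact string matching stands in for reasoning over the per-dimension ontologies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToddAnnotation:
    """Position of a tool in the Task-Object-Data-Domain space."""
    task: str
    object: str
    data: str
    domain: str


@dataclass(frozen=True)
class Tool:
    name: str
    annotation: ToddAnnotation


def matches(annotation, slot):
    """A slot constrains some dimensions; unlisted dimensions are unconstrained."""
    return all(getattr(annotation, dim) == value for dim, value in slot.items())


def fill_template(template, tools):
    """For each slot of the template, list the names of tools that fit it."""
    return [[t.name for t in tools if matches(t.annotation, slot)]
            for slot in template]


# Hypothetical tool annotations (values invented for illustration only).
TOOLS = [
    Tool("image_classifier",
         ToddAnnotation("classification", "image", "bitmap", "pornography")),
    Tool("text_classifier",
         ToddAnnotation("classification", "document", "free_text", "pornography")),
    Tool("link_extractor",
         ToddAnnotation("extraction", "document", "html", "general")),
]

# Hypothetical template: an extraction step followed by a classification step.
TEMPLATE = [
    {"task": "extraction", "object": "document"},
    {"task": "classification", "domain": "pornography"},
]

candidates = fill_template(TEMPLATE, TOOLS)
print(candidates)
```

In this sketch a template is just an ordered list of partial constraints, so composing an application reduces to finding, for each step, the annotated tools that satisfy it. An ontology-aware version would replace the equality test in `matches` with a subsumption check.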

Downloads: slides 1 

