Date and time: November 30, 2006 (10:30 – 12:00).
Room: 403 NB
First experience with medical web crawling, navigation and information extraction in the MedIEQ Project
- Martin Labský, KIZI, VŠE Praha
The number of health information web sites and online services is increasing day by day. The quality of these web sites is known to vary widely and is difficult to assess. Organisations around the world are working to establish quality standards for the accreditation of health-related web content. However, establishing codes of conduct or ethics is not enough in the medical domain, where the quality of information delivered by medical web sites may affect the health of citizens. It is necessary to establish rating mechanisms, either through third-party accreditation or by creating portals where medical web sites are organised and characterised against certain labelling criteria. For these mechanisms to succeed, they must be equipped with technologies that automate the rating process. Examples include information extraction techniques, which allow labelled web sites to be monitored continuously and alert the labelling agency when a change affects compliance with the labelling criteria, and web crawling and spidering techniques, which allow new unlabelled web sites to be retrieved, characterised and added to a medical thematic portal.
In this talk we will present our group's first experience with developing and applying web crawling, navigation and information extraction tools within the EU-funded MedIEQ Project (http://www.medieq.org/).