1 ORPAILLEUR - Knowledge representation, reasoning. INRIA Lorraine, LORIA - Laboratoire Lorrain de Recherche en Informatique et ses Applications
2 INIST - Institut de l'information scientifique et technique

Abstract : This article presents a system that extracts knowledge from webpages by producing semantic annotations. Taking into account semantic information from the domain to annotate an element in a webpage implies solving two problems: (1) identifying the syntactic structure of this element in the webpage and (2) identifying the most specific concept, in terms of subsumption, of the ontology that will be used to annotate this element. Our approach relies on a wrapper-based machine learning algorithm combined with reasoning that makes use of the formal structure of the ontology.
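
As an illustration of the second problem only, the sketch below picks the most specific concepts from a set of candidates using the subsumption hierarchy. It is a minimal example under stated assumptions, not the system described in this article: the toy ontology `PARENTS`, its concept names, and the helper functions `ancestors` and `most_specific` are all hypothetical.

```python
# Hypothetical toy ontology: each concept maps to its direct subsumers (parents).
PARENTS = {
    "Thing": [],
    "Person": ["Thing"],
    "Researcher": ["Person"],
    "PhDStudent": ["Researcher"],
    "Document": ["Thing"],
}


def ancestors(concept, parents=PARENTS):
    """Return every concept that strictly subsumes `concept`."""
    seen = set()
    stack = list(parents[concept])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def most_specific(candidates, parents=PARENTS):
    """Keep only the candidates that do not subsume another candidate."""
    candidates = set(candidates)
    subsumers = set()
    for concept in candidates:
        subsumers |= ancestors(concept, parents)
    return candidates - subsumers
```

For example, if an element could be annotated with `Thing`, `Person`, or `Researcher`, `most_specific` keeps only `Researcher`. Candidates that are unrelated by subsumption, such as `Researcher` and `Document`, are both kept, so a real system would still need a way to choose between them.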

Keywords : Knowledge Representation, Knowledge Extraction, Machine Learning, Reasoning

Authors: Sylvain Tenier - Amedeo Napoli - Xavier Polanco - Yannick Toussaint



