Semantic Processing of Database Textual Attributes Using Wikipedia


Campana J. R., Medina J. M., Vila M. A.

9th International Conference on Flexible Query Answering Systems (FQAS 2011), Ghent, Belgium, 26-28 October 2011, vol. 7022, pp. 84-95

  • Volume number: 7022
  • City of publication: Ghent
  • Country of publication: Belgium
  • Pages: pp. 84-95

Abstract

Text attributes in databases contain rich semantic information that is seldom processed or used. This paper proposes a method to extract concepts from texts stored in databases and represent them semantically. The process relies on tools such as WordNet and Wikipedia to identify the extracted concepts and represent them as a basic ontology whose concepts are annotated with search terms. This ontology can play several roles. It can serve as a conceptual summary of an attribute's content, which in turn provides a means of navigating through that content. It can also serve as a profile for text search, using the terms associated with the ontology concepts. The ontology is built as a subset of the Wikipedia category graph, selected using several metrics. Category selection using these metrics is discussed, and an example application is presented and evaluated.
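As a rough illustration of selecting a subset of a category graph, the Python sketch below keeps only the categories that cover at least a minimum number of extracted concepts. The abstract does not name the metrics used in the paper, so this coverage count is an assumption chosen for illustration, not the authors' method. The toy data and the names `CONCEPT_CATEGORIES`, `CATEGORY_PARENTS`, `ancestors`, `category_coverage`, and `select_categories` are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data (not from the paper): extracted concept -> its categories.
CONCEPT_CATEGORIES = {
    "Paella": ["Spanish cuisine", "Rice dishes"],
    "Risotto": ["Italian cuisine", "Rice dishes"],
    "Gazpacho": ["Spanish cuisine", "Soups"],
}

# Hypothetical fragment of a category graph: category -> parent categories.
CATEGORY_PARENTS = {
    "Spanish cuisine": ["European cuisine"],
    "Italian cuisine": ["European cuisine"],
    "Rice dishes": ["Foods"],
    "Soups": ["Foods"],
    "European cuisine": ["Foods"],
    "Foods": [],
}


def ancestors(category):
    """Return the category itself plus every category reachable via parent edges."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(CATEGORY_PARENTS.get(current, []))
    return seen


def category_coverage(concept_categories):
    """Map each category to the set of concepts that fall under it."""
    coverage = defaultdict(set)
    for concept, categories in concept_categories.items():
        for category in categories:
            for ancestor in ancestors(category):
                coverage[ancestor].add(concept)
    return coverage


def select_categories(concept_categories, min_concepts=2):
    """Keep categories covering at least min_concepts concepts (assumed metric)."""
    coverage = category_coverage(concept_categories)
    return {
        category: sorted(concepts)
        for category, concepts in coverage.items()
        if len(concepts) >= min_concepts
    }


selected = select_categories(CONCEPT_CATEGORIES, min_concepts=2)
for category in sorted(selected):
    print(category, "->", selected[category])
```

In this sketch each retained category carries the concepts beneath it, which loosely mirrors the idea of ontology concepts annotated with search terms. A realistic pipeline would replace the toy dictionaries with data drawn from Wikipedia and would likely combine several metrics rather than a single threshold.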