Author(s): Luís Sarmento
Alexander Kehlenbeck
Eugénio Oliveira
Lyle Ungar
Title: An Approach to Web-Scale Named-Entity Disambiguation
Issue Date: 2009
Abstract: We present a multi-pass clustering approach to large-scale, wide-scope named-entity disambiguation (NED) on collections of web pages. Our approach uses name co-occurrence information to cluster and hence disambiguate entities, and is designed to handle NED on the entire web. We show that on web collections, NED becomes increasingly difficult as the corpus size increases, not only because of the challenge of scaling the NED algorithm, but also because new and surprising facets of entities become visible in the data. This effect limits the potential benefits for data-driven approaches of processing larger data-sets, and suggests that efficient clustering-based disambiguation methods for the web will require extracting more specialized information from documents.
Subject: Informatics, Computer and information sciences
Scientific areas: Natural sciences::Computer and information sciences
Source: Machine Learning and Data Mining in Pattern Recognition
Document Type: Article in International Conference Proceedings
Rights: openAccess
Appears in Collections: FEUP - Article in International Conference Proceedings

Files in This Item:
File         Description   Size        Format
60958.pdf                  192.43 kB   Adobe PDF

This item is licensed under a Creative Commons License.