INForum'2017 - Rui Antunes

From IEETA


Date: 2017/10/12
Title: Evaluation of word embedding vector averaging functions for biomedical word sense disambiguation
Speaker: Rui Antunes
Event: INForum
Location: Aveiro
Country: Portugal
URL: http://inforum.org.pt/INForum2017/

ABSTRACT. The biomedical lexicon contains a large amount of term ambiguity, which hinders the correct identification of concepts and reduces the accuracy of semantic indexing and information retrieval tools. Previous work on biomedical word sense disambiguation (WSD) has shown that supervised machine learning leads to better results than knowledge-based (KB) approaches. However, machine learning approaches require sufficient training data, and their generalization performance beyond the test data is not known. KB methods, on the other hand, rely on existing knowledge bases and are therefore limited mostly by the quality of those sources of information about concepts. In this work, we use word embedding vectors to complement the knowledge-base information. We represent the context of an ambiguous term by the average of the embedding vectors of the words around the term, and evaluate the impact of weighting this average by word distance. We show that this weighting improves the disambiguation accuracy of the KB approach on the reference MSH WSD data set from 0.86 to 0.88.
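
The context representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which weighting function was used, so the inverse-distance weight `1 / distance`, the window size, the toy two-dimensional embeddings, the sense names, and the cosine-similarity sense selection are all assumptions made here for illustration.

```python
import math


def context_vector(tokens, target_idx, embeddings, window=5, weighted=True):
    """Average the embeddings of the words around tokens[target_idx].

    With weighted=True each word contributes with weight 1 / distance
    (an assumed weighting, not necessarily the one used in the talk).
    Words without an embedding are skipped.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    start = max(0, target_idx - window)
    end = min(len(tokens), target_idx + window + 1)
    for i in range(start, end):
        if i == target_idx or tokens[i] not in embeddings:
            continue
        distance = abs(i - target_idx)
        w = 1.0 / distance if weighted else 1.0
        for k, value in enumerate(embeddings[tokens[i]]):
            total[k] += w * value
        weight_sum += w
    if weight_sum == 0.0:
        return total  # no known context words: zero vector
    return [x / weight_sum for x in total]


def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(context, sense_vectors):
    """Pick the sense whose vector is most similar to the context vector."""
    return max(sense_vectors, key=lambda s: cosine(context, sense_vectors[s]))


# Toy data (hypothetical): 2-d embeddings and two candidate senses of "cold".
embeddings = {
    "patient": [1.0, 0.0],
    "virus": [1.0, 0.0],
    "winter": [0.0, 1.0],
    "weather": [0.0, 1.0],
}
sense_vectors = {"common_cold": [1.0, 0.0], "cold_temperature": [0.0, 1.0]}
tokens = ["patient", "caught", "a", "cold", "virus", "during", "winter", "weather"]

plain = context_vector(tokens, 3, embeddings, weighted=False)
by_distance = context_vector(tokens, 3, embeddings, weighted=True)
print(plain, by_distance, disambiguate(by_distance, sense_vectors))
```

In this toy sentence the plain average is split evenly between the two senses, while the distance weighting lets the adjacent word "virus" dominate, so the context vector leans toward the "common_cold" sense. In a KB setting the sense vectors would be derived from the knowledge base rather than written by hand.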