Word Sense Disambiguation: A Structured Learning Perspective

Yun Zhou; School of Computer, National University of Defense Technology, Changsha 410073; Ting Wang; School of Computer, National University of Defense Technology, Changsha 410073; Zhiyuan Wang; State Key Laboratory of High Performance Computing & School of Computer, National University of Defense Technology, Changsha 410073

Computing and Informatics, Volume 34 (2016), no. 4 / Harvested from Computing and Informatics

Abstract

This paper explores the application of structured learning methods (SLMs) to word sense disambiguation (WSD). On the one hand, SLMs can encode the semantic dependencies between the polysemous words in a sentence. On the other hand, SLMs have achieved significant results in natural language processing, so applying them to WSD is a natural idea. However, the characteristics of WSD raise many theoretical and practical problems when SLMs are applied to it. Beginning with a method based on the hidden Markov model, this paper proposes, for the first time, a comprehensive and unified solution for WSD based on the maximum entropy Markov model, the conditional random field, and the tree-structured conditional random field, and reduces the time complexity and running time of the proposed methods to a reasonable level through beam search, approximate training, and parallel training. Each model update brings a performance improvement: introducing one-step dependency improves performance by 1 to 5 percent, adopting non-independent features improves it by 2 to 3 percent, and extending the underlying structure to the dependency parse tree improves it by about 1 percent. On the English all-words WSD dataset of Senseval-2004, the method based on the tree-structured conditional random field significantly outperforms the best participating system. Nevertheless, like almost all machine learning methods, SLMs suffer from data sparseness due to the scarcity of sense-tagged data. Besides improving structured learning methods according to the characteristics of WSD, another way to improve disambiguation performance is to mine disambiguation knowledge from a variety of sources, such as Wikipedia and parallel corpora, to alleviate the knowledge acquisition bottleneck of WSD.

Published: 2016-03-02
Classification: Knowledge and Information Engineering; word sense disambiguation, structured learning, hidden Markov model, conditional random field, parallelization, approximate training; 68T50, 91F20

@article{cai1478,
     author = {Yun Zhou and Ting Wang and Zhiyuan Wang},
     title = {Word Sense Disambiguation: A Structured Learning Perspective},
     journal = {Computing and Informatics},
     volume = {34},
     number = {4},
     year = {2016},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai1478}
}

Yun Zhou; School of Computer, National University of Defense Technology, Changsha 410073; Ting Wang; School of Computer, National University of Defense Technology, Changsha 410073; Zhiyuan Wang; State Key Laboratory of High Performance Computing & School of Computer, National University of Defense Technology, Changsha 410073. Word Sense Disambiguation: A Structured Learning Perspective. Computing and Informatics, Volume 34 (2016), no. 4. http://gdmltest.u-ga.fr/item/cai1478/