According to the increment of accessible text data source on the internet, it has increased the necessity of the automatic text document summarization. However, the performance of the automatic methods might be poor because the semantic gap between high level user's summary requirement and low level vector representation of machine exists. In this paper, to overcome that problem, we propose a new document summarization method using a pseudo relevance feedback based on clustering method and NMF (non-negative matrix factorization). Relevance feedback is effective technique to minimize the semantic gap of information processing, but the general relevance feedback needs an intervention of a user. Additionally, the refined query without user interference by pseudo relevance feedback may be biased. The proposed method provides an automatic relevance judgment to reformulate query using the clustering method for minimizing a bias of query expansion. The method also can improve the quality of document summarization since the summarized documents are influenced by the semantic features of documents and the expanded query. The experimental results demonstrate that the proposed method achieves better performance than the other document summarization methods.
Publié le : 2016-11-02
Classification:  Document summarization, NMF, PRF, clustering, query expansion, semantic feature
@article{cai3718,
     author = {Sun Park; Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005 and ByungRae Cha; Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005 and JongWon Kim; Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005},
     title = {Document Summarization Using NMF and Pseudo Relevance Feedback Based on K-Means Clustering},
     journal = {Computing and Informatics},
     volume = {34},
     number = {4},
     year = {2016},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai3718}
}
Sun Park; Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005; ByungRae Cha; Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005; JongWon Kim; Gwangju Institute of Science and Technology, 123 Cheomdangwagi-ro, Buk-gu, Gwangju 61005. Document Summarization Using NMF and Pseudo Relevance Feedback Based on K-Means Clustering. Computing and Informatics, Tome 34 (2016) no. 4, . http://gdmltest.u-ga.fr/item/cai3718/