Automatic generic document summarization based on unsupervised schemes is very useful because it requires no training data. Although latent semantic analysis (LSA) and non-negative matrix factorization (NMF) have been applied to determine the topics of documents, no research has addressed reducing the matrix size or speeding up the computation of the NMF method. To achieve both, this paper uses generic impressive expressions from newspapers to extract important sentences as the summary; consequently, the method needs neither stemming nor stop-word filtering. Novels are the typical documents that leave a sentimental impression on readers. Newspapers, however, leave a different kind of impression, that of new knowledge, because they inform readers about current events, informative articles and diverse features. The proposed method defines impressive expressions for newspapers and applies their measurements to the NMF method. Experiments on 100 KB of text data show that the proposed method reduces the matrix size by 80 % and makes the NMF computation 7 times faster than the original method, without degrading the relevancy of the extracted sentences.
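The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lexicon `IMPRESSIVE`, its scores, the helper names, and the column-sum sentence score are all assumptions made for illustration, and the factorization uses the standard multiplicative-update NMF rather than whatever variant the paper employs. The sketch only shows why restricting the rows of the term-by-sentence matrix to impressive expressions shrinks the matrix that NMF has to factor.

```python
import random
import re

# Hypothetical lexicon of "impressive expressions" with assumed scores;
# the paper's actual list and measurements are not given in the abstract.
IMPRESSIVE = {"surprising": 1.0, "record": 0.8, "crisis": 0.9, "breakthrough": 1.0}


def build_matrix(sentences, lexicon=None):
    """Return (terms, A), where A[i][j] weights term i in sentence j.

    With a lexicon, only its entries become rows (count times assumed score);
    without one, every distinct token becomes a row (raw count).
    """
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({t for toks in tokenized for t in toks})
    if lexicon is not None:
        vocab = [t for t in vocab if t in lexicon]
    A = [[toks.count(t) * (lexicon[t] if lexicon is not None else 1.0)
          for toks in tokenized] for t in vocab]
    return vocab, A


def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def nmf(A, k, iters=200, seed=0, eps=1e-9):
    """Factor A (m x n) as W (m x k) times H (k x n) by multiplicative updates."""
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    W = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    H = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, A), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)]
             for a in range(k)]
        Ht = transpose(H)
        num, den = matmul(A, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(m)]
    return W, H


def rank_sentences(H):
    """Order sentence indices by summed topic weight (a simplifying assumption)."""
    scores = [sum(col) for col in zip(*H)]
    return sorted(range(len(scores)), key=lambda j: -scores[j])


sentences = [
    "The central bank reported a surprising record surplus",
    "Officials met on Tuesday in the capital",
    "Analysts called the result a breakthrough after the crisis",
    "The meeting lasted two hours",
]
full_terms, A_full = build_matrix(sentences)       # one row per distinct token
terms, A = build_matrix(sentences, IMPRESSIVE)     # rows limited to the lexicon
W, H = nmf(A, k=2)
order = rank_sentences(H)                          # sentence indices, best first
```

Because the rows come only from the lexicon, no stemming or stop-word list is involved, and the cost of each NMF iteration drops with the number of rows. Sentences that contain no lexicon entry receive a zero column and therefore rank last in this sketch.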
Published: 2013-05-23
Classification: Impressive expressions, NMF methods, precision, relevancy
@article{cai1626,
     author = {Abdunabi Ubul and El-Sayed Atlam and Hiroya Kitagawa and Masao Fuketa and Kazuhiro Morita and Jun-ichi Aoe},
     title = {An Efficient Method of Summarizing Documents Using Impression Measurements},
     journal = {Computing and Informatics},
     volume = {31},
     number = {6},
     year = {2013},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai1626}
}
Abdunabi Ubul; El-Sayed Atlam; Hiroya Kitagawa; Masao Fuketa; Kazuhiro Morita; Jun-ichi Aoe (all authors: Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima). An Efficient Method of Summarizing Documents Using Impression Measurements. Computing and Informatics, Volume 31 (2013) no. 6. http://gdmltest.u-ga.fr/item/cai1626/