Text Categorization and Sorting of Web Search Results
Miloš Radovanović ; Mirjana Ivanović ; Zoran Budimac
Computing and Informatics, Volume 28 (2012) no. 1, pp. 861-893 / Harvested from Computing and Informatics
With the Internet facing the growing problem of information overload, the large volumes, weak structure and noisiness of Web data make it amenable to the application of machine learning techniques. After providing an overview of several topics in text categorization, including document representation, feature selection, and a choice of classifiers, the paper presents experimental results concerning the performance and effects of different transformations of the bag-of-words document representation and feature selection, on texts extracted from the dmoz Open Directory of Web pages. Finally, the paper describes the primary motivation for the experiments: a new meta-search engine CatS which utilizes text categorization to enhance the presentation of search results obtained from a major Web search engine.
Published: 2012-01-26
@article{cai67,
     author = {Milo\v s Radovanovi\'c and Mirjana Ivanovi\'c and Zoran Budimac},
     title = {Text Categorization and Sorting of Web Search Results},
     journal = {Computing and Informatics},
     volume = {28},
     number = {1},
     year = {2012},
     pages = {861--893},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai67}
}
Miloš Radovanović; Mirjana Ivanović; Zoran Budimac. Text Categorization and Sorting of Web Search Results. Computing and Informatics, Volume 28 (2012) no. 1, pp. 861-893. http://gdmltest.u-ga.fr/item/cai67/