Efficient retrieval of XML elements and documents is essential to the effective use of the XML format. The ranking function BM25F scores a document over several fields with potentially different degrees of importance; the chosen fields are known as selected fields, and they give substantial improvements over the baseline BM25. BM25F has performed well in past evaluations; however, two issues require additional attention. First, which elements should be treated as fields? Second, what is an appropriate weight for each field? Previously, document fields were selected manually, and the weight for each chosen field was tuned before being assigned. This paper introduces two automatic methods, one for extracting fields from document-centric XML documents and one for assigning weights to the selected fields. Our experiments show an improvement of up to 28% over BM25, and up to 15% over BM25F, at iP[0.01] based on INEX evaluations.
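For readers unfamiliar with field-weighted ranking, the following is a minimal sketch of the commonly cited "simple BM25F" formulation, in which per-field term frequencies are length-normalized, multiplied by a field weight, summed, and then saturated. It is not the double scoring method of the paper, and it does not reproduce the paper's automatic field selection or weight assignment. The function name `bm25f_score`, the field names, the weights, and the default values of `k1` and `b` are illustrative assumptions.

```python
import math


def bm25f_score(query_terms, doc_fields, field_weights, avg_field_len,
                doc_freq, num_docs, k1=1.2, b=0.75):
    """Score one document with a simple BM25F variant (illustrative sketch).

    doc_fields:     field name -> list of tokens in that field
    field_weights:  field name -> weight (hypothetical values, defaults to 1.0)
    avg_field_len:  field name -> average token count of that field in the collection
    doc_freq:       term -> number of documents containing the term
    """
    score = 0.0
    for term in query_terms:
        # Combine the per-field frequencies into one weighted pseudo-frequency.
        pseudo_tf = 0.0
        for field, tokens in doc_fields.items():
            tf = tokens.count(term)
            if tf == 0:
                continue
            # Per-field length normalization (a single b is assumed for all fields).
            norm = 1.0 - b + b * len(tokens) / avg_field_len[field]
            pseudo_tf += field_weights.get(field, 1.0) * tf / norm
        if pseudo_tf == 0.0:
            continue
        n = doc_freq.get(term, 0)
        # Non-negative BM25-style inverse document frequency.
        idf = math.log(1.0 + (num_docs - n + 0.5) / (n + 0.5))
        # Saturate the combined frequency once, after the fields are merged.
        score += idf * pseudo_tf / (k1 + pseudo_tf)
    return score


# Hypothetical two-field document and collection statistics.
doc = {"title": ["xml", "retrieval"],
       "body": ["xml", "element", "ranking", "xml"]}
avg_len = {"title": 2.0, "body": 4.0}
df = {"xml": 1}

flat = bm25f_score(["xml"], doc, {"title": 1.0, "body": 1.0}, avg_len, df, 10)
boosted = bm25f_score(["xml"], doc, {"title": 3.0, "body": 1.0}, avg_len, df, 10)
```

In this sketch, raising the weight of the hypothetical `title` field raises the score of a document whose title contains the query term, which is the effect that the choice of fields and weights controls.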
Published: 2013-05-23
Classification: Ranking strategies, indexing units, XML retrieval, BM25F
@article{cai1629,
     author = {Tanakorn Wichaiwong and Chuleerat Jaruskulchai},
     note = {Both authors: Department of Computer Science, Faculty of Science, Kasetsart University, Bangkok},
     title = {A Double Scoring Method for XML Element Retrieval},
     journal = {Computing and Informatics},
     volume = {31},
     number = {6},
     year = {2013},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai1629}
}
Tanakorn Wichaiwong; Chuleerat Jaruskulchai (both: Department of Computer Science, Faculty of Science, Kasetsart University, Bangkok). A Double Scoring Method for XML Element Retrieval. Computing and Informatics, Vol. 31 (2013), no. 6. http://gdmltest.u-ga.fr/item/cai1629/