Efficiently Using Prime-Encoding for Mining Frequent Itemsets in Sparse Data

Karam Gouda; Faculty of Computers and Informatics, Benha University; Mosab Hassaan; Faculty of Computers and Informatics, Benha University

Computing and Informatics, Volume 31 (2013), no. 6 / Harvested from Computing and Informatics

Abstract

In data mining, data representation is one of the major factors affecting the scalability of mining algorithms, and Mining Frequent Itemsets (MFI) is a problem heavily affected by it. The vertical approach is one of the successful data representations adopted for the MFI problem. Its main advantage is support for fast frequency counting via join operations. Recently, an encoding method called prime-encoding was proposed as an enhancement to the vertical approach [10]. The performance study in [10] showed that prime-encoding based vertical mining of frequent sequences outperforms other vertical and horizontal approaches in both space and time. Although sequence mining is more general than itemset mining, this paper presents a prime-encoding based vertical method for mining frequent itemsets, with new optimizations and a new re-encoding method that further reduce memory usage and improve speed. The experimental results show that prime-encoding based vertical itemset mining is suitable for high-dimensional sparse data.
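The join-based frequency counting mentioned in the abstract can be illustrated with a minimal sketch. It assumes the general idea behind prime-encoding: each fixed-size block of an item's vertical bit-vector is stored as the product of the primes assigned to its set positions, so that joining two items reduces to a greatest common divisor. The block size of 8, the function names, and the toy data are all hypothetical; the paper's actual encoding, optimizations, and re-encoding method are not reproduced here.

```python
from math import gcd

# One prime per position in a block of 8 transactions.
# The block size of 8 is an assumption made for this sketch.
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19]


def encode_block(bits):
    """Encode one block of a vertical bit-vector as the product of
    the primes at the positions where the item occurs."""
    value = 1
    for prime, bit in zip(PRIMES, bits):
        if bit:
            value *= prime
    return value


def join(a, b):
    """Intersect two encoded blocks. Each product is square-free,
    so the gcd keeps exactly the primes (positions) common to both."""
    return gcd(a, b)


def support(value):
    """Count the transactions in an encoded block, i.e. the number
    of block primes that divide the encoded value."""
    return sum(1 for prime in PRIMES if value % prime == 0)


# Hypothetical toy data: occurrence of items A and B in 8 transactions.
item_a = [1, 0, 1, 1, 0, 0, 1, 0]
item_b = [1, 1, 0, 1, 0, 0, 1, 1]

enc_a = encode_block(item_a)
enc_b = encode_block(item_b)
enc_ab = join(enc_a, enc_b)    # encoded block of the itemset {A, B}
support_ab = support(enc_ab)   # number of transactions containing both
```

In this toy case items A and B co-occur in the first, fourth, and seventh transactions, so the joined block is 2 · 7 · 17 and the support of {A, B} is 3. An empty block encodes to 1, which has support 0.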

Published: 2013-11-18
Classification: Mining frequent itemsets, prime-block encoding, sparse data

@article{cai1985,
     author = {Karam Gouda and Mosab Hassaan},
     title = {Efficiently Using Prime-Encoding for Mining Frequent Itemsets in Sparse Data},
     journal = {Computing and Informatics},
     volume = {31},
     number = {6},
     year = {2013},
     language = {en},
     url = {http://dml.mathdoc.fr/item/cai1985}
}

Karam Gouda; Faculty of Computers and Informatics, Benha University; Mosab Hassaan; Faculty of Computers and Informatics, Benha University. Efficiently Using Prime-Encoding for Mining Frequent Itemsets in Sparse Data. Computing and Informatics, Volume 31 (2013), no. 6. http://gdmltest.u-ga.fr/item/cai1985/