A rough set-based knowledge discovery process
Zhong, Ning ; Skowron, Andrzej
International Journal of Applied Mathematics and Computer Science, Volume 11 (2001), pp. 603-619 / Harvested from The Polish Digital Mathematics Library

Knowledge discovery from real-life databases is a multi-phase process consisting of numerous steps, including attribute selection, discretization of real-valued attributes, and rule induction. In this paper, we discuss a rule discovery process based on rough set theory. The core of the process is a soft hybrid induction system called the Generalization Distribution Table and Rough Set System (GDT-RS), which discovers classification rules from databases with uncertain and incomplete data. The system combines the Generalization Distribution Table (GDT) and rough set methodologies. In preprocessing, two modules, Rough Sets with Heuristics (RSH) and Rough Sets with Boolean Reasoning (RSBR), are used for attribute selection and discretization of real-valued attributes, respectively. We use a slope-collapse database as an example to show how rules can be discovered from a large, real-life database.

Published: 2001-01-01
EUDML-ID : urn:eudml:doc:207522
@article{bwmeta1.element.bwnjournal-article-amcv11i3p603bwm,
     author = {Zhong, Ning and Skowron, Andrzej},
     title = {A rough set-based knowledge discovery process},
     journal = {International Journal of Applied Mathematics and Computer Science},
     volume = {11},
     year = {2001},
     pages = {603-619},
     zbl = {0990.68139},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv11i3p603bwm}
}
Zhong, Ning; Skowron, Andrzej. A rough set-based knowledge discovery process. International Journal of Applied Mathematics and Computer Science, Volume 11 (2001) pp. 603-619. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv11i3p603bwm/

[000] Agrawal R., Mannila H., Srikant R., Toivonen H. and Verkamo A. (1996): Fast discovery of association rules, In: Advances in Knowledge Discovery and Data Mining (U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy, Eds.). — Cambridge, Massachusetts: MIT Press, pp.307–328.

[001] Bazan J.G. (1998): A comparison of dynamic and non-dynamic rough set methods for extracting laws from decision system, In: Rough Sets in Knowledge Discovery 1: Methodology and Applications (L. Polkowski, A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.321–365. | Zbl 1067.68711

[002] Bazan J.G. and Szczuka M. (2000): RSES and RSESlib—A collection of tools for rough set computations. — Proc. 2nd Int. Conf. Rough Sets and Current Trends in Computing (RSCTC-2000), Banff, pp.74–81. | Zbl 1014.68825

[003] Chmielewski M.R. and Grzymała-Busse J.W. (1994): Global discretization of attributes as preprocessing for machine learning. — Proc. 3rd Int. Workshop Rough Sets and Soft Computing, San Jose, pp.294–301. | Zbl 0949.68560

[004] Dong J.Z., Zhong N. and Ohsuga S. (1999a): Probabilistic rough induction: The GDT-RS methodology and algorithms, In: Foundations of Intelligent Systems (Z.W. Ras and A. Skowron, Eds.). — Berlin: Springer, pp.621–629.

[005] Dong J.Z., Zhong N. and Ohsuga S. (1999b): Using rough sets with heuristics to feature selection, In: New Directions in Rough Sets, Data Mining, Granular-Soft Computing (N. Zhong, A. Skowron, S. Ohsuga, Eds.). — Berlin: Springer, pp.178–187.

[006] Dougherty J., Kohavi R. and Sahami M. (1995): Supervised and unsupervised discretization of real features. — Proc. 12th Int. Conf. Machine Learning, pp.194–202.

[007] Fayyad U.M. and Irani K.B. (1992): On the handling of real-valued attributes in decision tree generation. — Machine Learning, Vol.8, pp.87–102. | Zbl 0767.68084

[008] Fayyad U.M., Piatetsky-Shapiro G. and Smyth P. (1996): From data mining to knowledge discovery: An overview, In: Advances in Knowledge Discovery and Data Mining (U. Fayyad, G. Piatetsky-Shapiro, Eds.). — Cambridge, Massachusetts: MIT Press, pp.1–36.

[009] Grzymała-Busse J.W. (1998): Applications of rule induction system LERS, In: Rough Sets in Knowledge Discovery 1: Methodology and Applications (L. Polkowski, A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.366–375. | Zbl 0940.68137

[010] Komorowski J., Pawlak Z., Polkowski L. and Skowron A. (1999): Rough sets: A tutorial, In: Rough Fuzzy Hybridization: A New Trend in Decision Making (S.K. Pal and A. Skowron, Eds.). — Singapore: Springer, pp.3–98.

[011] Lin T.Y. and Cercone N. (Eds.) (1997): Rough Sets and Data Mining: Analysis of Imprecise Data. — Boston: Kluwer.

[012] Mitchell T.M. (1997): Machine Learning. — Boston: McGraw-Hill. | Zbl 0913.68167

[013] Nguyen H. Son and Skowron A. (1995): Quantization of real value attributes. — Proc. Int. Workshop Rough Sets and Soft Computing at 2nd Joint Conf. Information Sciences (JCIS’95), Durham, NC, pp.34–37.

[014] Nguyen H. Son and Skowron A. (1997): Boolean reasoning for feature extraction problems, In: Foundations of Intelligent Systems (Z.W. Ras, A. Skowron, Eds.). — Berlin: Springer, pp.117–126.

[015] Nguyen H. Son and Nguyen S. Hoa (1998): Discretization methods in data mining, In: Rough Sets in Knowledge Discovery (L. Polkowski, A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.451–482. | Zbl 0940.68139

[016] Nguyen S.H., Nguyen H.S. and Skowron A. (1999): Decomposition of task specification problems, In: Foundations of Intelligent Systems (Z.W. Ras and A. Skowron, Eds.). — Berlin: Springer, pp.310–318.

[017] Pal S.K. and Skowron A. (Eds.) (1999): Rough Fuzzy Hybridization. — Singapore: Springer.

[018] Pawlak Z. (1982): Rough sets. — Int. J. Comp. Inf. Sci., Vol.11, pp.341–356. | Zbl 0501.68053

[019] Pawlak Z. (1991): Rough Sets, Theoretical Aspects of Reasoning about Data. — Boston: Kluwer. | Zbl 0758.68054

[020] Pawlak Z. and Skowron A. (1993): A rough set approach for decision rules generation. — Proc. Workshop W12: The Management of Uncertainty in AI at 13th IJCAI, see also: Institute of Computer Science, Warsaw University of Technology, ICS Res. Rep., 23/93, pp.1–19.

[021] Polkowski L. and Skowron A. (1996): Rough mereology: A new paradigm for approximate reasoning. — Int. J. Approx. Reasoning, Vol.15, No.4, pp.333–365. | Zbl 0938.68860

[022] Polkowski L. and Skowron A. (1999): Towards adaptive calculus of granules, In: Computing with Words in Information/Intelligent Systems 1: Foundations (L.A. Zadeh and J. Kacprzyk, Eds.). — Heidelberg: Physica-Verlag, pp.201–228. | Zbl 0949.68143

[023] Skowron A. and Rauszer C. (1992): The discernibility matrices and functions in information systems, In: Intelligent Decision Support (R. Slowinski, Ed.). — Boston: Kluwer, pp.331–362.

[024] Yao Y.Y. and Zhong N. (1999): Potential Applications of Granular Computing in Knowledge Discovery and Data Mining. — Proc. 5th Int. Conf. Information Systems Analysis and Synthesis (ISAS’99), Orlando, pp.573–580.

[025] Zhong N. and Ohsuga S. (1995): Toward a multi-strategy and cooperative discovery system. — Proc. 1st Int. Conf. Knowledge Discovery and Data Mining (KDD-95), Montreal, pp.337–342.

[026] Zhong N., Liu C. and Ohsuga S. (1997): A way of increasing both autonomy and versatility of a KDD system, In: Foundations of Intelligent Systems (Z.W. Ras and A. Skowron, Eds.). — Berlin: Springer, pp.94–105.

[027] Zhong N., Dong J.Z. and Ohsuga S. (1998): Data mining: A probabilistic rough set approach, In: Rough Sets in Knowledge Discovery, Vol.2 (L. Polkowski and A. Skowron, Eds.). — Heidelberg: Physica-Verlag, pp.127–146.

[028] Zhong N., Skowron A. and Ohsuga S. (Eds.) (1999): New Directions in Rough Sets, Data Mining, and Granular-Soft Computing. — Berlin: Springer.

[029] Zhong N., Dong J.Z. and Ohsuga S. (2000): Using background knowledge as a bias to control the rule discovery process, In: Principles of Data Mining and Knowledge Discovery (D.A. Zighed, J. Komorowski and J. Zytkow, Eds.). — Berlin: Springer, pp.691–698.