A strategy learning model for autonomous agents based on classification
Bartłomiej Śnieżyński
International Journal of Applied Mathematics and Computer Science, Volume 25 (2015), pp. 471-482 / Harvested from The Polish Digital Mathematics Library

In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative: this type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that the model can be successfully applied to strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and of reinforcement learning using the farmer-pest domain and configurations of varying complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge can be analyzed by humans, which allows tracking of the learning process.
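The learning loop sketched in the abstract can be illustrated in a few lines: the agent logs (state, action) pairs labelled by the reward they eventually produced, trains a classifier on the well-rewarded examples, and then chooses actions by classifying new states. The environment, the state attributes, and the toy nearest-neighbour learner below are illustrative stand-ins only (the paper itself uses symbolic learners in the farmer-pest domain, not this classifier).

```python
class NearestNeighbourPolicy:
    """Toy classifier: predict the action taken in the closest stored state."""

    def __init__(self):
        self.examples = []  # list of (state_vector, action) training pairs

    def train(self, labelled_episodes):
        # Keep only (state, action) pairs from episodes with positive reward:
        # these become the supervised training examples for the strategy.
        self.examples = [(s, a) for s, a, r in labelled_episodes if r > 0]

    def choose_action(self, state, default="explore"):
        if not self.examples:
            return default  # no learned knowledge yet: fall back to exploring
        dist = lambda s: sum((x - y) ** 2 for x, y in zip(s, state))
        _, best_action = min(self.examples, key=lambda e: dist(e[0]))
        return best_action


# Usage: replay a short history of (state attributes, action, delayed reward).
policy = NearestNeighbourPolicy()
history = [
    ((0, 0), "wait", -1),
    ((1, 0), "attack", 5),
    ((0, 1), "defend", 3),
]
policy.train(history)
print(policy.choose_action((1, 0)))  # -> "attack"
```

Because the learned strategy is an explicit classifier rather than a value table, its decision rules can be inspected by a human, which is the advantage the abstract points to when a symbolic representation (e.g. decision rules) is used.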

Published: 2015-01-01
EUDML-ID : urn:eudml:doc:271789
@article{bwmeta1.element.bwnjournal-article-amcv25i3p471bwm,
     author = {Bart\l omiej \'Snie\.zy\'nski},
     title = {A strategy learning model for autonomous agents based on classification},
     journal = {International Journal of Applied Mathematics and Computer Science},
     volume = {25},
     year = {2015},
     pages = {471-482},
     zbl = {1322.68203},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv25i3p471bwm}
}
Bartłomiej Śnieżyński. A strategy learning model for autonomous agents based on classification. International Journal of Applied Mathematics and Computer Science, Volume 25 (2015), pp. 471-482. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv25i3p471bwm/

[000] Airiau, S., Padgham, L., Sardina, S. and Sen, S. (2008). Incorporating learning in BDI agents, Proceedings of the ALAMAS+ALAg Workshop, Estoril, Portugal.

[001] Barrett, S., Stone, P., Kraus, S. and Rosenfeld, A. (2012). Learning teammate models for ad hoc teamwork, AAMAS Adaptive Learning Agents (ALA) Workshop, Valencia, Spain.

[002] Bazzan, A., Peleteiro, A. and Burguillo, J. (2011). Learning to cooperate in the iterated prisoner's dilemma by means of social attachments, Journal of the Brazilian Computer Society 17(3): 163-174.

[003] Bellman, R. (1957). Dynamic Programming, A Rand Corporation Research Study, Princeton University Press, Princeton, NJ.

[004] Cetnarowicz, K. and Drezewski, R. (2010). Maintaining functional integrity in multi-agent systems for resource allocation, Computing and Informatics 29(6): 947-973.

[005] Cohen, W.W. (1995). Fast effective rule induction, Proceedings of the 12th International Conference on Machine Learning (ICML'95), Tahoe City, CA, USA, pp. 115-123.

[006] Dietterich, T.G. (2000). Hierarchical reinforcement learning with the MAXQ value function decomposition, Journal of Artificial Intelligence Research 13: 227-303. | Zbl 0963.68085

[007] Gehrke, J.D. and Wojtusiak, J. (2008). Traffic prediction for agent route planning, in M. Bubak et al. (Eds.), Computational Science-ICCS 2008, Part III, Lecture Notes in Computer Science, Vol. 5103, Springer, Berlin/Heidelberg, pp. 692-701.

[008] Hernandez-Leal, P., Munoz de Cote, E. and Sucar, L.E. (2013). Learning against non-stationary opponents, Workshop on Adaptive Learning Agents, Saint Paul, MN, USA.

[009] Kaelbling, L.P., Littman, M.L. and Moore, A.W. (1996). Reinforcement learning: A survey, Journal of Artificial Intelligence Research 4: 237-285.

[010] Kazakov, D. and Kudenko, D. (2001). Machine learning and inductive logic programming for multi-agent systems, in M. Luck et al. (Eds.), Multi-Agent Systems and Applications, Springer, Berlin/Heidelberg, pp. 246-270. | Zbl 0989.68553

[011] Lin, L.-J. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching, Machine Learning 8(3-4): 293-321.

[012] Panait, L. and Luke, S. (2005). Cooperative multi-agent learning: The state of the art, Autonomous Agents and Multi-Agent Systems 11(3): 387-434.

[013] Quinlan, J. (1993). C4.5: Programs for Machine Learning, Morgan Kaufmann, San Francisco, CA.

[014] Rao, A.S. and Georgeff, M.P. (1991). Modeling rational agents within a BDI-architecture, in J. Allen, R. Fikes and E. Sandewall (Eds.), Proceedings of the 2nd International Conference on Principles of Knowledge Representation and Reasoning, Morgan Kaufmann, San Mateo, CA, pp. 473-484. | Zbl 0765.68194

[015] Rummery, G.A. and Niranjan, M. (1994). On-line Q-learning using connectionist systems, Technical report, Cambridge University, Cambridge.

[016] Russell, S.J. and Zimdars, A. (2003). Q-decomposition for reinforcement learning agents, Proceedings of the 20th International Conference on Machine Learning (ICML2003), Washington, DC, USA, pp. 656-663.

[017] Russell, S. and Norvig, P. (2009). Artificial Intelligence: A Modern Approach, 3rd Edn., Prentice-Hall, Upper Saddle River, NJ. | Zbl 0835.68093

[018] Sen, S. and Weiss, G. (1999). Learning in Multiagent Systems, MIT Press, Cambridge, MA, pp. 259-298.

[019] Shoham, Y., Powers, R. and Grenager, T. (2003). Multi-agent reinforcement learning: A critical survey, Technical report, Stanford University, Stanford, CA. | Zbl 1168.68493

[020] Singh, D., Sardina, S., Padgham, L. and Airiau, S. (2010). Learning context conditions for BDI plan selection, Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems, Toronto, Canada, pp. 325-332.

[021] Śnieżyński, B. (2013a). Agent strategy generation by rule induction, Computing and Informatics 32(5): 1055-1078.

[022] Śnieżyński, B. (2013b). Comparison of reinforcement and supervised learning methods in farmer-pest problem with delayed rewards, in C. Badica, N.T. Nguyen and M. Brezovan (Eds.), Computational Collective Intelligence, Lecture Notes in Computer Science, Vol. 8083, Springer, Berlin/Heidelberg, pp. 399-408.

[023] Śnieżyński, B. (2014). Agent-based adaptation system for service-oriented architectures using supervised learning, Procedia Computer Science 29: 1057-1067.

[024] Śnieżyński, B. and Dajda, J. (2013). Comparison of strategy learning methods in farmer-pest problem for various complexity environments without delays, Journal of Computational Science 4(3): 144-151.

[025] Śnieżyński, B. and Kozlak, J. (2005). Learning in a multi-agent approach to a fish bank game, in M. Pěchouček, P. Petta and L.Z. Varga (Eds.), Multi-Agent Systems and Applications IV, Lecture Notes in Computer Science, Vol. 3690, Springer, Berlin/Heidelberg, pp. 568-571.

[026] Śnieżyński, B., Wojcik, W., Gehrke, J.D. and Wojtusiak, J. (2010). Combining rule induction and reinforcement learning: An agent-based vehicle routing, Proceedings of the International Conference on Machine Learning and Applications, Washington, DC, USA, pp. 851-856.

[027] Sutton, R. and Barto, A. (1998). Reinforcement Learning: An Introduction (Adaptive Computation and Machine Learning), The MIT Press, Cambridge, MA.

[028] Sutton, R.S. (1990). Integrated architecture for learning, planning, and reacting based on approximating dynamic programming, Proceedings of the 7th International Conference on Machine Learning, Austin, TX, USA, pp. 216-224.

[029] Tan, M. (1993). Multi-agent reinforcement learning: Independent vs. cooperative agents, Proceedings of the 10th International Conference on Machine Learning, Amherst, MA, USA, pp. 330-337.

[030] Tuyls, K. and Weiss, G. (2012). Multiagent learning: Basics, challenges, and prospects, AI Magazine 33(3): 41-52.

[031] Watkins, C.J.C.H. (1989). Learning from Delayed Rewards, Ph.D. thesis, King's College, Cambridge.

[032] Wooldridge, M. (2009). An Introduction to MultiAgent Systems, 2nd Edn., Wiley Publishing, Chichester.

[033] Zhang, W. and Dietterich, T.G. (1995). A reinforcement learning approach to job-shop scheduling, Proceedings of the 14th International Joint Conference on Artificial Intelligence, Montreal, Canada, pp. 1114-1120.