Motor control neural models and systems theory

Doya, Kenji; Kimura, Hidenori; Miyamura, Aiko


International Journal of Applied Mathematics and Computer Science, Volume 11 (2001), pp. 77-104 / Harvested from The Polish Digital Mathematics Library


Abstract

In this paper, we introduce several system-theoretic problems brought forward by recent studies on neural models of motor control. We focus our attention on three topics: (i) the cerebellum and adaptive control, (ii) reinforcement learning and the basal ganglia, and (iii) modular control with multiple models. We discuss these subjects from both neuroscience and systems theory viewpoints with the aim of promoting interplay between the two research communities.

Published: 2001-01-01

Zbl 1065.93528

EUDML-ID : urn:eudml:doc:207506

@article{bwmeta1.element.bwnjournal-article-amcv11i1p77bwm,
     author = {Doya, Kenji and Kimura, Hidenori and Miyamura, Aiko},
     title = {Motor control neural models and systems theory},
     journal = {International Journal of Applied Mathematics and Computer Science},
     volume = {11},
     year = {2001},
     pages = {77-104},
     zbl = {1065.93528},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv11i1p77bwm}
}

Doya, Kenji; Kimura, Hidenori; Miyamura, Aiko. Motor control neural models and systems theory. International Journal of Applied Mathematics and Computer Science, Volume 11 (2001), pp. 77-104. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv11i1p77bwm/

References

[000] Albus J.S. (1971): A theory of cerebellar function. - Math. Biosci., Vol.10, pp.25-61.

[001] Astrom K.J. and Wittenmark B. (1989): Adaptive Control. - Massachusetts: Addison Wesley.

[002] Barto A.G. (1995): Adaptive critics and the basal ganglia, In: Models of Information Processing in the Basal Ganglia (Houk J.C., Davis J.L. and Beiser D.G., Eds.). - Cambridge, MA: MIT Press, pp.215-232.

[003] Barto A.G., Sutton R.S. and Anderson C.W. (1983): Neuronlike adaptive elements that can solve difficult learning control problems. - IEEE Trans. Syst. Man Cybern., Vol.13, pp.834-846.

[004] Bertsekas D.P. and Tsitsiklis J.N. (1996): Neuro-Dynamic Programming. - Belmont, MA: Athena Scientific. | Zbl 0924.68163

[005] Dayan P. (1992): The convergence of TD(λ) for general λ. - Machine Learn., Vol.8, pp.341-362. | Zbl 0773.68060

[006] Doya K. (1999): What are the computations of the cerebellum, the basal ganglia, and the cerebral cortex? - Neural Netw., Vol.12, No.7-8, pp.961-974.

[007] Doya K. (2000): Reinforcement learning in continuous time and space. - Neural Comput., Vol.12, No.1, pp.243-269.

[008] Doya K., Katagiri K., Wolpert D.M. and Kawato M. (2000a): Recognition and imitation of movement patterns by a multiple predictor-controller architecture. - Tech. Rep. Institute of Electronic, Information, and Communication Engineers, TL2000-11, pp.33-40.

[009] Doya K., Samejima K., Katagiri K. and Kawato M. (2000b): Multiple model-based reinforcement learning. - Tech. Rep. KDB-08, Kawato Dynamic Brain Project, ERATO, Japan Science and Technology Corporation. | Zbl 0997.93037

[010] Ghahramani Z. and Wolpert D.M. (1997): Modular decomposition in visuomotor learning. - Nature, Vol.386, pp.392-395.

[011] Ghez C. and Thach W.T. (2000): The cerebellum, In: Principles of Neural Science, 4th Ed. (Kandel E.R., Schwartz J.H. and Jessell T.M., Eds.). - New York: McGraw-Hill, pp.832-852.

[012] Haruno M., Wolpert D.M. and Kawato M. (1999): Multiple paired forward-inverse models for human motor learning and control, In: Advances in Neural Information Processing Systems, No.11 (Kearns M.S., Solla S.A. and Cohn D.A., Eds.). - Cambridge, MA: MIT Press, pp.31-37.

[013] Hikosaka O., Nakahara H., Rand M.K., Sakai K., Lu X., Nakamura K., Miyachi S. and Doya K. (1999): Parallel neural networks for learning sequential procedures. - Trends in Neurosci., Vol.22, No.10, pp.464-471.

[014] Houk J.C., Adams J.L. and Barto A.G. (1995): A model of how the basal ganglia generate and use neural signals that predict reinforcement, In: Models of Information Processing in the Basal Ganglia (Houk J.C., Davis J.L. and Beiser D.G., Eds.). - Cambridge, MA: MIT Press, pp.249-270.

[015] Imamizu H., Miyauchi S., Sasaki Y., Takino R., Putz B. and Kawato M. (1997): Separated modules for visuomotor control and learning in the cerebellum: A functional MRI study, In: NeuroImage: Third International Conference on Functional Mapping of the Human Brain (Toga A.W., Frackowiak R.S.J. and Mazziotta J.C., Eds.). - Copenhagen, Denmark: Academic Press, Vol.5, p.S598.

[016] Imamizu H., Miyauchi S., Tamada T., Sasaki Y., Takino R., Putz B., Yoshioka T. and Kawato M. (2000): Human cerebellar activity reflecting an acquired internal model of a new tool. - Nature, Vol.403, pp.192-195.

[017] Ito M. (1993): Movement and thought: identical control mechanisms by the cerebellum. - Trends in Neurosci., Vol.16, No.11, pp.448-450.

[018] Ito M., Sakurai M. and Tongroach P. (1982): Climbing fibre induced depression of both mossy fibre responsiveness and glutamate sensitivity of cerebellar Purkinje cells. - J. Physiol., Vol.324, pp.113-134.

[019] Kawagoe R., Takikawa Y. and Hikosaka O. (1998): Expectation of reward modulates cognitive signals in the basal ganglia. - Nature Neurosci., Vol.1, No.5, pp.411-416.

[020] Kawato M., Furukawa K. and Suzuki R. (1987): A hierarchical neural network model for control and learning of voluntary movement. - Biol. Cybern., Vol.57, pp.169-185. | Zbl 0624.92009

[021] Kawato M. and Gomi H. (1992): A computational model of four regions of the cerebellum based on feedback-error learning. - Biol. Cybern., Vol.68, pp.95-103.

[022] Marr D. (1969): A theory of cerebellar cortex. - J. Physiol., Vol.202, pp.437-470.

[023] Miyamura A. and Kimura H. (2000): Mathematical foundations of feedback error learning method. - (submitted).

[024] Montague P.R., Dayan P. and Sejnowski T.J. (1996): A framework for mesencephalic dopamine systems based on predictive Hebbian learning. - J. Neurosci., Vol.16, pp.1936-1947.

[025] Morimoto J. and Doya K. (2000): Acquisition of stand-up behavior by a real robot using hierarchical reinforcement learning. - Proc. 17th Int. Conf. Machine Learning, Vancouver, Vol.1, pp.623-630.

[026] Morse A.S. (1996): Supervisory control of families of linear set-point controllers-part 1: Exact matching. - IEEE Trans. Automat. Contr., Vol.41, No.10, pp.1413-1431. | Zbl 0872.93009

[027] Nakahara H., Doya K. and Hikosaka O. (1998): Benefit of multiple representations for motor sequence control in the basal ganglia loops. - BSIS Tech. Rep. 98-05, RIKEN Brain Science Institute.

[028] Pawelzik K., Kohlmorgen J. and Müller K.R. (1996): Annealed competition of experts for a segmentation and classification of switching dynamics. - Neural Comput., Vol.8, pp.340-356.

[029] Schultz W., Dayan P. and Montague P.R. (1997): A neural substrate of prediction and reward. - Science, Vol.275, pp.1593-1599.

[030] Shima K., Mushiake H., Saito N. and Tanji J. (1996): Role for cells in the presupplementary motor area in updating motor plans. - Proc. Nat. Acad. Sci., Vol.93, pp.8694-8698.

[031] Suri R.E. and Schultz W. (1998): Learning of sequential movements by neural network model with dopamine-like reinforcement signal. - Experim. Brain Res., Vol.121, pp.350-354.

[032] Sutton R.S. (1988): Learning to predict by the methods of temporal differences. - Machine Learn., Vol.3, pp.9-44.

[033] Sutton R.S. and Barto A.G. (1998): Reinforcement Learning. - Cambridge, MA: MIT Press.

[034] Tesauro G. (1994): TD-Gammon, a self-teaching backgammon program, achieves master-level play. - Neural Comput., Vol.6, pp.215-219.

[035] Watkins C.J.C.H. (1989): Learning from delayed rewards. - Ph.D. Thesis, Cambridge University.

[036] Wilson C.W. (1998): Basal ganglia, In: The Synaptic Organization of the Brain, 3rd Ed. (Shepherd G.M., Ed.). - New York: Oxford University Press, pp.329-375.

[037] Wolpert D.M. and Kawato M. (1998): Multiple paired forward and inverse models for motor control. - Neural Netw., Vol.11, pp.1317-1329.

[038] Zames G. (1981): Feedback and optimal sensitivity: Model reference transformations, multiplicative seminorms, and approximate inverses. - IEEE Trans. Automat. Contr., Vol.AC-26, No.2, pp.301-320. | Zbl 0474.93025