Hand gesture recognition based on free-form contours and probabilistic inference
Włodzimierz Kasprzak ; Artur Wilkowski ; Karol Czapnik
International Journal of Applied Mathematics and Computer Science, Vol. 22 (2012), pp. 437-448 / Harvested from The Polish Digital Mathematics Library

A computer vision system is described that captures color image sequences, detects and recognizes static hand poses (i.e., "letters"), and interprets pose sequences in terms of gestures (i.e., "words"). The hand object is detected with a method based on double active contours. Tracking the hand pose over a short sequence allows "modified poses" to be detected, such as diacritic letters in national alphabets. The static hand pose set corresponds to the hand signs of a thumb alphabet. Finally, by tracking hand poses over a longer image sequence, the pose sequence is interpreted in terms of gestures. Dynamic Bayesian models and their inference methods (particle filter and Viterbi search) are applied at this stage, allowing a bi-driven control of the entire system.
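
The Viterbi search named in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the two-letter state set, the probability tables, and all names (`viterbi`, `STATES`, `LOG_TRANS`, `LOG_EMIT`) are hypothetical, chosen only to show how a sequence of noisy per-frame pose labels might be decoded into the most probable letter sequence under a hidden Markov model.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            # choose the predecessor that maximizes the score of reaching s
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            scores[s] = prev[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        best.append(scores)
        back.append(pointers)
    # trace the best path backwards from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[-1][last], path


def logs(table):
    """Convert a table of probabilities to log-probabilities."""
    return {key: math.log(value) for key, value in table.items()}


# Hypothetical toy model: hidden letters "A"/"B", observed pose labels "a"/"b".
STATES = ["A", "B"]
LOG_START = logs({"A": 0.5, "B": 0.5})
LOG_TRANS = {"A": logs({"A": 0.8, "B": 0.2}), "B": logs({"A": 0.2, "B": 0.8})}
LOG_EMIT = {"A": logs({"a": 0.9, "b": 0.1}), "B": logs({"a": 0.1, "b": 0.9})}

score, path = viterbi(["a", "a", "b", "b"], STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(path)
```

With these assumed tables, a single deviating frame, as in the observation sequence `["a", "b", "a", "a"]`, is decoded as a constant letter, because the transition probabilities favor staying in the same state.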

Published: 2012-01-01
EUDML-ID : urn:eudml:doc:208120
@article{bwmeta1.element.bwnjournal-article-amcv22i2p437bwm,
     author = {W\l odzimierz Kasprzak and Artur Wilkowski and Karol Czapnik},
     title = {Hand gesture recognition based on free-form contours and probabilistic inference},
     journal = {International Journal of Applied Mathematics and Computer Science},
     volume = {22},
     year = {2012},
     pages = {437-448},
     language = {en},
     url = {http://dml.mathdoc.fr/item/bwmeta1.element.bwnjournal-article-amcv22i2p437bwm}
}
Włodzimierz Kasprzak; Artur Wilkowski; Karol Czapnik. Hand gesture recognition based on free-form contours and probabilistic inference. International Journal of Applied Mathematics and Computer Science, Vol. 22 (2012), pp. 437-448. http://gdmltest.u-ga.fr/item/bwmeta1.element.bwnjournal-article-amcv22i2p437bwm/
