"A meek endeavor to the triumph" by Sampath Jayarathna

Sunday, September 05, 2010

Reading #2: Specifying Gestures by Example (Rubine)

Comments on others:

Yue Li


Summary:

This paper describes the possibility of creating automatic gesture recognizers from example gestures, removing the need for hand coding. It also explains GRANDMA (Gesture Recognizers Automated in a Novel Direct Manipulation Architecture), a toolkit developed for rapidly adding gestures to direct manipulation interfaces, and the trainable single-stroke gesture recognizer used by GRANDMA. The paper further describes the powerful combination of gesturing and direct manipulation in a two-phase interaction technique. The author suggests a requirement of 15 training examples per gesture class, a number chosen empirically. A simple preprocessing step is applied to the input: time-stamped points within 3 pixels of the previous input point are discarded. Furthermore, the author notes that the feature set was empirically determined to work well on a number of different gesture sets. Classification is handled by a linear classifier over the features. In addition, eager recognition, which recognizes gestures as soon as they are unambiguous, and multi-finger recognition are also discussed.
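To make the preprocessing step concrete, here is a minimal sketch of the jiggle elimination described above. The function name and the (x, y, t) point representation are my own; the paper specifies only that points within 3 pixels of the previously kept point are discarded.

```python
import math

def eliminate_jiggle(points, min_dist=3.0):
    """Drop time-stamped points closer than min_dist pixels to the
    previously kept point, as in the paper's jiggle-elimination step.
    `points` is a list of (x, y, t) tuples; names are my own."""
    if not points:
        return []
    kept = [points[0]]
    for x, y, t in points[1:]:
        px, py, _ = kept[-1]
        if math.hypot(x - px, y - py) >= min_dist:
            kept.append((x, y, t))
    return kept
```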

 

Discussion:

According to the author, the GRANDMA toolkit is essentially a single-stroke gesture system, and he states that this allows shorter timeouts to be used, thereby avoiding the segmentation problem. Still, the author mentions that a gesture can end with either a timeout or a mouse button release, which makes the reader wonder: if a button-release action can delimit a gesture, why is it so hard to incorporate multi-stroke gesture recognition into the GRANDMA toolkit? In my opinion, it would be valuable to show users that even multi-stroke gesture recognition is possible with the GRANDMA toolkit, though it is faster (or less problematic and simpler) to use only single-stroke recognition.
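As a thought experiment, here is one way multi-stroke input might be segmented on top of a single-stroke recognizer: group successive strokes into one gesture whenever the pause between a button release and the next press falls under a timeout. This is purely my own hypothetical sketch, not anything GRANDMA actually provides.

```python
def group_strokes(strokes, gap_timeout=0.4):
    """Hypothetical multi-stroke segmentation. `strokes` is a list of
    (start_time, end_time, points) tuples in temporal order. Strokes
    separated by less than `gap_timeout` seconds are grouped into one
    multi-stroke gesture; a longer pause closes the gesture."""
    gestures, current = [], []
    for stroke in strokes:
        if current and stroke[0] - current[-1][1] > gap_timeout:
            gestures.append(current)  # pause too long: close gesture
            current = []
        current.append(stroke)
    if current:
        gestures.append(current)
    return gestures
```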
The author suggests that 15 training examples per gesture class are enough, and this makes the reader wonder about the word "Automated" in GRANDMA's name. The author uses it to contrast automatic gesture recognition with the need for hand coding, but in my opinion the label is questionable if the recognizer still requires some sort of manual input (example gestures) before the system is functional (just my opinion).
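For context, "automatic" here means the classifier's weights are estimated from the example gestures rather than written by hand. Below is a minimal sketch of that idea, assuming a common-covariance linear discriminant in the spirit of the paper's trainable recognizer; the variable names and the numpy formulation are my own.

```python
import numpy as np

def train_linear_classifier(examples):
    """Estimate per-class linear weights from example feature vectors.
    `examples` maps a class name to an (n_examples, n_features) array;
    roughly 15 examples per class, per the paper's suggestion."""
    means = {c: X.mean(axis=0) for c, X in examples.items()}
    # Pooled (common) covariance across all classes.
    pooled = sum(np.cov(X, rowvar=False, bias=True) * len(X)
                 for X in examples.values())
    pooled /= sum(len(X) for X in examples.values())
    inv = np.linalg.pinv(pooled)  # pinv guards against singularity
    weights = {}
    for c, mu in means.items():
        w = inv @ mu
        w0 = -0.5 * mu @ inv @ mu
        weights[c] = (w0, w)
    return weights

def classify(weights, f):
    """Pick the class whose linear evaluation w0 + w . f is largest."""
    return max(weights, key=lambda c: weights[c][0] + weights[c][1] @ f)
```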
The elimination of jiggle is another point to consider: the author applies a simple preprocessing step that removes input points within 3 pixels of the previous point. In my opinion this may obscure the user's intention as well as the actual input gesture. I believe a stronger preprocessing step, something like binarization followed by skeletonization, would be much more appealing. The choice of the specific feature set also needs to be independently verified; perhaps a sensitivity analysis, with features ranked according to it, would be a better approach. It would also be nice to see results from different gesture classifiers, for example k-NN compared against the linear classifier.
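A comparison like that is straightforward to set up today. Here is a minimal sketch assuming feature vectors have already been extracted; the data below is synthetic stand-in data of my own, not the paper's, with 13 features to match the size of the paper's feature set.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 10 gesture classes, 15 examples each, 13 features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(15, 13))
               for c in range(10)])
y = np.repeat(np.arange(10), 15)

for name, clf in [("k-NN", KNeighborsClassifier(n_neighbors=3)),
                  ("linear", LinearDiscriminantAnalysis())]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```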

Find the paper here.

1 comment:

Marty said...

You bring up some very good points. When I took a class like this during my undergrad, we had an assignment to extend Rubine to multiple stroke gestures. It is definitely a common question.

You must also remember how old this paper is. ID3 was a new algorithm that couldn't even handle continuous-valued attributes. Neural networks were just being pioneered. kNN is not a very good comparison because one of the features of the linear discriminator is the classification speed, and kNN is very slow to classify an instance.