"A meek endeavor to the triumph" by Sampath Jayarathna

Tuesday, September 07, 2010

Reading #5: Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes

Comments on Others:

Wenzhe Li

Summary
           The paper states that the "$1 recognizer" (I guess the "$1" represents how easy and cheap it is, usable with only a few lines of code) was developed to enable novice programmers to incorporate gestures into their UI prototypes. The paper includes detailed pseudocode so that programmers can easily test the gesture recognizer within their own interface designs. The authors state that gesture recognition has mostly been a topic of interest to experts in the field of AI, not to experts in HCI, who work primarily at the interaction level (I do not completely agree!). The authors therefore assume that this has limited the opportunity to slip gesture recognition into UI design. The paper also explains why ad-hoc recognizers hold applications back: they limit users' ability to define their own gestures at run time, and they do not scale when the gesture set is very large.
           The $1 recognizer is essentially limited to unistrokes, because the authors are interested in recognizing paths delineated by users interactively. The authors argue that, by comparison, the $1 recognizer is preferable because sophisticated methods like HMMs, ANNs, and statistical classifiers require extensive training before practical use and are difficult to program and debug. They note that even the popular Rubine linear classifier requires advance computation before use.
           The paper also defines 8 criteria that the recognizer must meet to stay simple (these criteria can also be read as the goals of the methodology). The algorithm itself is a 4-step process: resample the point path, rotate based on the indicative angle, scale and translate, and find the optimal angle for the best matching score (sketched below). As limitations, the authors state that the $1 algorithm is rotation, scale, and position invariant, so it cannot separate gestures that differ only in those properties (a circle and an oval look the same to it), and gestures cannot be differentiated based on time. The paper then describes the testing procedure, comparing the recognizer against the Rubine classifier and Dynamic Time Warping (DTW).
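
To make the 4-step process concrete, here is a minimal Python sketch of steps 2-4 (rotation to the indicative angle, scaling and translation, and template matching with a golden-section search over candidate rotations). The helper names, the 250-unit reference square, and the ±45°/2° search bounds follow my reading of the paper's pseudocode rather than reproducing it verbatim; step 1 (resampling) is sketched in the Discussion below.

import math

PHI = 0.5 * (math.sqrt(5) - 1)  # golden ratio, drives the angle search

def centroid(pts):
    n = len(pts)
    return (sum(x for x, _ in pts) / n, sum(y for _, y in pts) / n)

def rotate_by(pts, angle):
    # Rotate all points around the centroid by angle (radians)
    cx, cy = centroid(pts)
    c, s = math.cos(angle), math.sin(angle)
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in pts]

def rotate_to_zero(pts):
    # Step 2: rotate so the indicative angle (centroid to first point) is 0
    cx, cy = centroid(pts)
    return rotate_by(pts, -math.atan2(pts[0][1] - cy, pts[0][0] - cx))

def scale_to_square(pts, size=250.0):
    # Step 3a: non-uniform scale to a size x size box; this is exactly
    # why a circle and an oval become indistinguishable
    xs, ys = [x for x, _ in pts], [y for _, y in pts]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    return [(x * size / w, y * size / h) for x, y in pts]

def translate_to_origin(pts):
    # Step 3b: move the centroid to (0, 0)
    cx, cy = centroid(pts)
    return [(x - cx, y - cy) for x, y in pts]

def path_distance(a, b):
    # Mean pointwise Euclidean distance between two equal-length paths
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def distance_at_best_angle(pts, template, span=math.radians(45),
                           threshold=math.radians(2)):
    # Step 4: golden-section search for the rotation of pts (already
    # preprocessed, like template) that minimizes the path distance
    a, b = -span, span
    x1, x2 = PHI * a + (1 - PHI) * b, (1 - PHI) * a + PHI * b
    f1 = path_distance(rotate_by(pts, x1), template)
    f2 = path_distance(rotate_by(pts, x2), template)
    while abs(b - a) > threshold:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = PHI * a + (1 - PHI) * b
            f1 = path_distance(rotate_by(pts, x1), template)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = (1 - PHI) * a + PHI * b
            f2 = path_distance(rotate_by(pts, x2), template)
    return min(f1, f2)

A full recognizer preprocesses every stored template once with these same steps, runs distance_at_best_angle against each, and reports the closest template; the paper converts the winning distance d into a [0..1] score as 1 - d / (0.5 * size * sqrt(2)).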

Discussion
              This paper explains a simple gesture recognizer that can be easily programmed and plugged into a UI design to support gestures through the UI. The authors believe that the area of interest of HCI experts is perhaps what has kept gesture recognition out of UI designs. In my opinion, I do not completely agree with that; maybe limitations of the existing technology hindered the opportunities to do more research on HMMs, ANNs, etc. and to apply that knowledge to UI usability and development research.
           I am not sure why the authors state that HMMs require extensive sample training and complex coding and debugging. From my personal experience developing an HMM-based eye movement detection algorithm, I feel the authors were a bit misled by the literature rather than by applying HMMs to real-world applications (an HMM is reasonably fast in online applications, though it does require some training beforehand). I am also wondering about the use of N = 64 for the resampling step (sketched below); maybe a digitization would work instead (select the path and set adjacent pixels to 1 or 0 depending on the next pixel and direction).
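
For reference, the resampling step that the paper parameterizes with N could look like the following Python sketch (variable names are mine; n=64 is the value questioned above, while the Protractor paper mentioned in the comment below uses 16):

import math

def resample(points, n=64):
    # Step 1: resample a stroke to n equidistant points along its path
    path_len = sum(math.dist(points[i - 1], points[i])
                   for i in range(1, len(points)))
    interval = path_len / (n - 1)
    accum = 0.0
    pts = list(points)
    new_points = [pts[0]]
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if accum + d >= interval:
            # Interpolate the new point q on the current segment
            t = (interval - accum) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            new_points.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            accum = 0.0
        else:
            accum += d
        i += 1
    if len(new_points) < n:  # rounding can leave us one point short
        new_points.append(pts[-1])
    return new_points

Since the matching code above assumes equal-length paths, every stroke and template must pass through resample with the same n before comparison.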
           Overall, the paper presents an easy-to-read, easy-to-understand concept for a simple gesture recognizer, and I believe it is a plus for the gesture/sketch recognition community to implement it and compare it against their own algorithms (especially since the authors suggest their results are competitive with previously used, more complex algorithms such as Rubine and DTW).

Find the paper here.

1 comment:

liwenzhe said...

I am also wondering why he uses 64 points in this paper but 16 points in the Protractor paper; he does not give us any convincing evidence. He should have provided some empirical study about choosing how many points are best. Apart from its simplicity and letting novice people incorporate it into their applications even without knowledge of machine learning, I prefer other recognizers.