On the SLDev mailing list, Philip Linden asked: "Has anyone here worked with camera-based gesture recognition before? How about OpenCV? Is OpenCV the best package for extracting basic head position/gesture information from a camera image stream? Merov and I are pondering a Snowglobe project to detect head motion from simple cameras and connect it to the SL Viewer."
Philip also posted a specific project proposal for a first phase of gesture recognition. Linden Lab's goal is to have an intern work on this over the summer and achieve a first step: using a regular camera (no markers or other hardware) to enhance the avatar's visual communication by triggering gestures and head movement in response to what the camera sees. Community help with this project is welcome!
A very long thread followed: sldev thread about camera-based gesture recognition
Quick Summary: The first replies suggested hardware options beyond a normal webcam, mainly 3D cameras and remote sensors. The thread then moved on to what to expect from different device drivers, with details ranging from the ability to track eye gaze to comments about how user-friendly each device is. Later, several people described what they need from the interface, and the discussion quickly turned to generic, universal interface design. As the thread quieted down, at least a few people seemed to agree that a simple generic interface is a good starting point: one that takes input from a normal webcam, recognizes a few common gestures, and plays the related avatar animation, much as one can now type /yes or /no to make the avatar's head nod or shake. It was also mentioned that such a simple interface would not require server changes. Further discussion pointed out the distinction between SL gestures and human gestures and suggested that basic tracking, rather than a simple interface, is the low-hanging fruit.
I tried hooking up a webcam head tracking system to the joystick input of SL, via the utility faceapi2ppjoy. Worked fine, but my attempt to track the viewport based on my head position made me nauseous. ;-) -- Latha Serevi, March 2010