Custom gesture processing for Leap Motion. Part 1


During the holidays, a Leap Motion sensor fell into my hands. I had wanted to work with it for a long time, but my day job and idle pastimes kept getting in the way.

About ten years ago, when I was a schoolboy with nothing to do, I bought Igromania magazine, which came bundled with a disc full of gaming extras and shareware. The magazine had a column about useful software, and one of the programs featured there was Symbol Commander: a utility that lets you record mouse movements, recognize them later and, on recognition, perform the action assigned to that movement.

Now that contactless sensors (Leap Motion, Microsoft Kinect, PrimeSense Carmine) have matured, I had the idea of reproducing similar functionality for one of them. The choice fell on Leap Motion.

So what is needed to handle custom gestures? A gesture representation model and a data processor. The overall workflow looks like this:

Thus, the development can be divided into the following steps:

1. Designing a model for describing gestures
2. Implementing a processor for recognizing described gestures
3. Mapping each gesture to a command for the OS
4. Building a UI for recording gestures and assigning commands.

Let's start with the model.

The official Leap Motion SDK provides a set of predefined gestures. They are identified by the GestureType enumeration, which covers the swipe, circle, screen tap, and key tap gestures, plus a value for an invalid gesture.


Since I set myself the task of processing custom gestures, the model for describing them will be my own.

So what is a gesture for Leap Motion?

The SDK provides a Frame, from which you can get a FingerList: a collection describing the current state of each finger. So, to begin with, I decided to take the path of least resistance and treat a gesture as a sequence of finger movements, each along one of the axes (X, Y, or Z) in one direction (the corresponding component of the finger coordinate either increases or decreases).
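To illustrate what "movement along one axis in one direction" means in practice, here is a minimal sketch that reduces a fingertip displacement to a dominant axis and a sign. It is not part of the project: MoveAxis, MovementClassifier, and Classify are hypothetical names, and picking the axis with the largest absolute displacement is my assumption about how a free-form movement could be mapped onto this model.

```csharp
using System;

// Hypothetical names, used only for this illustration.
public enum MoveAxis { X, Y, Z }

public static class MovementClassifier
{
    // Takes the displacement of a fingertip between two samples and reports
    // the axis with the largest absolute change together with its sign:
    // +1 if the coordinate increased, -1 if it decreased, 0 if it did not move.
    public static void Classify(float dx, float dy, float dz,
                                out MoveAxis axis, out int direction)
    {
        float ax = Math.Abs(dx);
        float ay = Math.Abs(dy);
        float az = Math.Abs(dz);
        float delta;

        if (ax >= ay && ax >= az)
        {
            axis = MoveAxis.X;
            delta = dx;
        }
        else if (ay >= az)
        {
            axis = MoveAxis.Y;
            delta = dy;
        }
        else
        {
            axis = MoveAxis.Z;
            delta = dz;
        }

        direction = Math.Sign(delta);
    }
}
```

For example, a displacement of (0.5, -4, 1) is mostly a downward movement, so it is classified as the Y axis with direction -1.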

Each gesture is therefore described by a set of primitives, each consisting of:

1. The axis of finger movement
2. The direction of movement
3. The order of the primitive within the gesture
4. The number of frames during which the primitive must be performed.

For the gesture, you must also specify:

1. Name
2. Index of the finger that performs this gesture.

In this version, gestures are described in XML. As an example, here is the XML description of the well-known Tap gesture (a "click" with a finger):


This fragment defines the Tap gesture, which consists of two primitives: lowering the finger for 10 frames and then raising it again.

For the recognition library, the same model is described with the following classes:

using System.Xml.Serialization;

public class Primitive
{
    [XmlElement(ElementName = "Axis", Type = typeof(Axis))]         // axis along which the finger moves
    public Axis Axis { get; set; }
    [XmlElement(ElementName = "Direction")]                         // direction: +1 -> coordinate increases,
    public int Direction { get; set; }                              //            -1 -> coordinate decreases
    [XmlElement(ElementName = "Order", IsNullable = true)]          // position of this primitive within the gesture
    public int? Order { get; set; }
    [XmlElement(ElementName = "FramesCount")]                       // number of frames the movement must last
    public int FramesCount { get; set; }
}

public enum Axis
{
    X,
    Y,
    Z
}

public class Gesture
{
    [XmlElement(ElementName = "GestureIndex")]                     // ordinal number of the gesture
    public int GestureIndex { get; set; }
    [XmlElement(ElementName = "GestureName")]                      // name of the gesture
    public string GestureName { get; set; }
    [XmlElement(ElementName = "FingerIndex")]                      // index of the finger that performs the gesture
    public int FingerIndex { get; set; }
    [XmlElement(ElementName = "PrimitivesCount")]                  // number of primitives in the gesture
    public int PrimitivesCount { get; set; }
    [XmlArray(ElementName = "Primitives")]                         // the primitives that make up the gesture
    public Primitive[] Primitives { get; set; }
}
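To show how such a model maps to XML, here is a minimal sketch that builds a Tap gesture and serializes it with the standard XmlSerializer. The model classes are repeated in compact form so that the example compiles on its own. The concrete values are my assumptions for illustration rather than the project's actual gesture file: finger index 0, the Y axis (which points upward in the Leap Motion coordinate system), and 10 frames for the upward stroke as well. TapExample, BuildTap, and ToXml are hypothetical helper names.

```csharp
using System.IO;
using System.Xml.Serialization;

public enum Axis { X, Y, Z }

public class Primitive
{
    public Axis Axis { get; set; }
    public int Direction { get; set; }          // +1: coordinate increases, -1: decreases
    [XmlElement(IsNullable = true)]
    public int? Order { get; set; }
    public int FramesCount { get; set; }
}

public class Gesture
{
    public int GestureIndex { get; set; }
    public string GestureName { get; set; }
    public int FingerIndex { get; set; }
    public int PrimitivesCount { get; set; }
    [XmlArray(ElementName = "Primitives")]
    public Primitive[] Primitives { get; set; }
}

public static class TapExample
{
    // Tap = finger goes down along Y, then comes back up (values are assumed).
    public static Gesture BuildTap()
    {
        return new Gesture
        {
            GestureIndex = 0,
            GestureName = "Tap",
            FingerIndex = 0,
            PrimitivesCount = 2,
            Primitives = new[]
            {
                new Primitive { Axis = Axis.Y, Direction = -1, Order = 0, FramesCount = 10 },
                new Primitive { Axis = Axis.Y, Direction = 1, Order = 1, FramesCount = 10 }
            }
        };
    }

    // Serializes a gesture to an XML string.
    public static string ToXml(Gesture gesture)
    {
        var serializer = new XmlSerializer(typeof(Gesture));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, gesture);
            return writer.ToString();
        }
    }
}
```

Calling TapExample.ToXml(TapExample.BuildTap()) yields a Gesture element whose Primitives child holds two Primitive elements, one per stroke. Reading a gesture file back would go through the same serializer's Deserialize method.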

Ok, the model is ready. Let's move on to the recognition processor.

What is recognition? Since we can get the current state of the finger on every frame, recognition means checking that the successive states of the finger satisfy the specified criteria over a given period of time.

So we create a class that inherits from Leap.Listener and override its OnFrame method:

public override void OnFrame(Leap.Controller ctrl)
{
    Leap.Frame frame = ctrl.Frame();
    currentFrameTime = frame.Timestamp;
    frameTimeChange = currentFrameTime - previousFrameTime;
    // Process a frame only if enough time has passed since the last processed one.
    if (frameTimeChange > FRAME_INTERVAL)
    {
        foreach (Gesture gesture in _registry.Gestures)
        {
            // Each registered gesture is checked in its own task.
            Task.Factory.StartNew(() =>
            {
                Leap.Finger finger = frame.Fingers[gesture.FingerIndex];
                CheckFinger(gesture, finger);
            });
        }
        previousFrameTime = currentFrameTime;
    }
}

Here the finger states are checked at most once per FRAME_INTERVAL. For testing, FRAME_INTERVAL = 5000: the minimum number of microseconds between processed frames, that is, 5 ms.
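The throttling idea can be isolated from the SDK. Below is a minimal sketch of the same comparison as a small helper; FrameThrottle and ShouldProcess are hypothetical names that do not appear in the project, and timestamps are assumed to be in microseconds, as in the listing above.

```csharp
// Hypothetical helper: lets a frame through only if more than
// intervalMicros microseconds have passed since the last accepted frame.
public sealed class FrameThrottle
{
    private readonly long _intervalMicros;
    private long _previousTimestamp;   // timestamp of the last accepted frame

    public FrameThrottle(long intervalMicros)
    {
        _intervalMicros = intervalMicros;
    }

    public bool ShouldProcess(long timestampMicros)
    {
        if (timestampMicros - _previousTimestamp <= _intervalMicros)
        {
            return false;              // too soon: skip this frame
        }
        _previousTimestamp = timestampMicros;
        return true;
    }
}
```

With an interval of 5000, a frame at 10000 µs is accepted, frames at 12000 µs and 15000 µs are skipped (the difference must be strictly greater than the interval), and a frame at 15001 µs is accepted again.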

As the code shows, the recognition itself happens in the CheckFinger method. Its parameters are the gesture currently being checked and a Leap.Finger object that represents the current state of the finger.

How does recognition work?

I decided to use three containers: one holding coordinates, used to track the direction of movement (_coordinates in the code); one holding counters of frames in which the finger position satisfied the recognition conditions (_number); and one holding counters of recognized primitives, used to track progress through each gesture (_recognized).


public void CheckFinger(Gesture gesture, Leap.Finger finger)
{
    // The counter of recognized primitives is the index of the primitive expected next.
    int recognitionValue = _recognized.ElementAt(gesture.GestureIndex);
    Primitive primitive = gesture.Primitives[recognitionValue];
    CheckDirection(gesture.GestureIndex, primitive, finger);
}

public void CheckDirection(int gestureIndex, Primitive primitive, Leap.Finger finger)
{
    // Pick the fingertip coordinate along the axis of the primitive.
    float pointCoordinates = float.NaN;
    switch (primitive.Axis)
    {
        case Axis.X:
            pointCoordinates = finger.TipPosition.x;
            break;
        case Axis.Y:
            pointCoordinates = finger.TipPosition.y;
            break;
        case Axis.Z:
            pointCoordinates = finger.TipPosition.z;
            break;
    }
    if (_coordinates[gestureIndex] == INIT_COUNTER)
        _coordinates[gestureIndex] = pointCoordinates;        // first sample: remember the starting point
    else
    {
        switch (primitive.Direction)
        {
            case 1:                                            // the coordinate must increase
                if (_coordinates[gestureIndex] < pointCoordinates)
                    _coordinates[gestureIndex] = pointCoordinates;
                else
                    _coordinates[gestureIndex] = INIT_COORDINATES;   // wrong direction: start over
                break;
            case -1:                                           // the coordinate must decrease
                if (_coordinates[gestureIndex] > pointCoordinates)
                    _coordinates[gestureIndex] = pointCoordinates;
                else
                    _coordinates[gestureIndex] = INIT_COORDINATES;   // wrong direction: start over
                break;
        }
    }
    // The primitive has lasted the required number of frames: reset the frame counter.
    if (_number[gestureIndex] == primitive.FramesCount)
        _number[gestureIndex] = INIT_COUNTER;
}

public void CheckGesture(Gesture gesture)
{
    // All primitives have been recognized: reset the counter for the next run.
    if (_recognized[gesture.GestureIndex] == (gesture.PrimitivesCount - 1))
        _recognized[gesture.GestureIndex] = INIT_COUNTER;
}
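How the three kinds of state interact can be sketched for a single gesture without the SDK. This is a simplified reading of the approach, not the project's code: SingleGestureRecognizer, PrimitiveSpec, MoveAxis, and Feed are hypothetical names, the per-gesture containers are collapsed into three fields, and resetting the frame counter when the finger moves the wrong way is my assumption.

```csharp
using System.Collections.Generic;

public enum MoveAxis { X, Y, Z }

public sealed class PrimitiveSpec
{
    public MoveAxis Axis;
    public int Direction;      // +1: coordinate must increase, -1: decrease
    public int FramesCount;    // consecutive matching frames required
}

public sealed class SingleGestureRecognizer
{
    private readonly IList<PrimitiveSpec> _primitives;
    private float? _lastCoordinate;   // null until the current primitive gets its first sample
    private int _matchingFrames;      // frames that moved the right way so far
    private int _primitiveIndex;      // number of primitives already recognized

    public SingleGestureRecognizer(IList<PrimitiveSpec> primitives)
    {
        _primitives = primitives;
    }

    // Feeds one fingertip position; returns true on the frame that completes the gesture.
    public bool Feed(float x, float y, float z)
    {
        PrimitiveSpec p = _primitives[_primitiveIndex];
        float value = p.Axis == MoveAxis.X ? x : (p.Axis == MoveAxis.Y ? y : z);

        if (_lastCoordinate == null)
        {
            _lastCoordinate = value;          // starting point of the primitive
            return false;
        }

        bool rightWay = p.Direction > 0
            ? value > _lastCoordinate.Value
            : value < _lastCoordinate.Value;
        if (!rightWay)
        {
            _lastCoordinate = null;           // wrong direction: restart this primitive
            _matchingFrames = 0;
            return false;
        }

        _lastCoordinate = value;
        _matchingFrames++;
        if (_matchingFrames < p.FramesCount)
        {
            return false;
        }

        // The primitive is complete: move on to the next one.
        _lastCoordinate = null;
        _matchingFrames = 0;
        _primitiveIndex++;
        if (_primitiveIndex < _primitives.Count)
        {
            return false;
        }

        _primitiveIndex = 0;                  // whole gesture recognized: start over
        return true;
    }
}
```

For a Tap made of two Y-axis primitives of 3 frames each, feeding the Y values 100, 99, 98, 97, 96, 97, 98, 99 returns true only on the last sample: the first value of each primitive just sets the starting point, and the next three count as matching frames.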

At the moment, the gestures Tap (finger tap) and Round (circular motion) are described.

The next steps are:

1. Stabilizing recognition (yes, it is not stable at the moment; I am still weighing the options).
2. Implementing a UI application so that users can work with it comfortably.

The source code is available on GitHub.
