
Aug 4, 2012

N3 - yet another SixthSense project

Introduction

This article describes my new project, N3, which follows the idea of SixthSense (hereinafter 'SS'), a 'wearable gesture interface that augments the physical world around us with digital information'.

Motivation

I am a big fan of SS. In the TED video, the original developer Pranav showed many examples of future technology and great prospects for user interfaces. The source code of SS was open-sourced, and I tried it on my PC the other day. It worked, but it seemed to have some problems.

  • The first problem is recognition precision. SS recognizes the color markers, but the precision is not very good. SS uses the Touchless SDK to detect markers, and the precision depends entirely on this SDK, so improving it is not easy. 
  • Second, the portability of the original SS is poor, because it depends on DirectX and runs only on Windows. SS is a 'wearable' application, so I think a more cross-platform framework should be used so that SS can run on mobile devices.
  • The third problem is the structure of SS's original source code. The source code is not well structured, and it takes extra effort to add or change functionality. As an open source project, the source code itself should be attractive, I believe.
Of course, in spite of these problems, SS is a great product, and I believe the essence of SS is its idea. The idea of a wearable gesture interface that can bring the digital world into the real world is a great invention. However, to improve SS, these problems should be fixed. That drove me to make my own SS -- N3.


N3's Approach

For N3, I am using OpenCV for most of the logic, including marker recognition. OpenCV should fix both SS's recognition precision problem and its portability problem. The recognition method I came up with first is color histogram comparison. OpenCV has many convenient functions for color histogram comparison, and it is easy to tune. Currently I am using a very simple algorithm: first convert the image into HSV planes, compute a histogram from the H and S planes using cvCalcHist(), calculate the back projection image using cvCalcBackProject(), and extract the maximum matched area.
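
Here is a minimal sketch of that back-projection step, assuming the old OpenCV C API that the function names above come from. The function findMarker() and its arguments are hypothetical names, and image cleanup is omitted for brevity.

    #include <opencv/cv.h>

    // Minimal sketch of the back-projection step described above.
    // `frame` is the current camera frame, `marker` is a cropped sample
    // of the color marker; both are 8-bit BGR images.
    CvPoint findMarker(IplImage* frame, IplImage* marker)
    {
        // Convert the marker sample to HSV and keep only the H and S
        // planes, which are less sensitive to lighting than RGB.
        IplImage* hsvMarker = cvCreateImage(cvGetSize(marker), 8, 3);
        cvCvtColor(marker, hsvMarker, CV_BGR2HSV);
        IplImage* h = cvCreateImage(cvGetSize(marker), 8, 1);
        IplImage* s = cvCreateImage(cvGetSize(marker), 8, 1);
        cvSplit(hsvMarker, h, s, NULL, NULL);
        IplImage* planes[] = { h, s };

        // 2D histogram over hue (0..180) and saturation (0..255).
        int histSize[] = { 30, 32 };
        float hRange[] = { 0, 180 }, sRange[] = { 0, 255 };
        float* ranges[] = { hRange, sRange };
        CvHistogram* hist = cvCreateHist(2, histSize, CV_HIST_ARRAY, ranges, 1);
        cvCalcHist(planes, hist, 0, NULL);
        cvNormalizeHist(hist, 255);

        // Back-project the marker histogram onto the frame's H/S planes.
        IplImage* hsvFrame = cvCreateImage(cvGetSize(frame), 8, 3);
        cvCvtColor(frame, hsvFrame, CV_BGR2HSV);
        IplImage* fh = cvCreateImage(cvGetSize(frame), 8, 1);
        IplImage* fs = cvCreateImage(cvGetSize(frame), 8, 1);
        cvSplit(hsvFrame, fh, fs, NULL, NULL);
        IplImage* framePlanes[] = { fh, fs };
        IplImage* backProj = cvCreateImage(cvGetSize(frame), 8, 1);
        cvCalcBackProject(framePlanes, backProj, hist);

        // The brightest spot in the back projection is the best match.
        double minVal, maxVal;
        CvPoint minLoc, maxLoc;
        cvMinMaxLoc(backProj, &minVal, &maxVal, &minLoc, &maxLoc, NULL);
        // (Release the intermediate images here in real code.)
        return maxLoc;
    }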

Draw App Demo

Here is my draw app demo video. This is a bit different from SS's original draw app. The difference is that the projection plane is calibrated by projecting a chessboard, so that we can draw at the exact point our finger touches and it feels like drawing on the wall. For the camera calibration, cvFindHomography() is used to find the homography matrix between the coordinates of the chessboard in the projector image and those in the camera image.
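
For reference, here is a minimal sketch of that calibration step, again assuming the OpenCV C API. The function calibrate() and its arguments are hypothetical names.

    #include <vector>
    #include <opencv/cv.h>

    // Minimal sketch of the calibration step: find the projected
    // chessboard in the camera image and compute the homography that
    // maps camera coordinates to projector coordinates.
    // `projCorners` holds the known corner positions in the projector
    // image; `homography` is a preallocated 3x3 CV_64FC1 matrix.
    bool calibrate(IplImage* cameraFrame, CvPoint2D32f* projCorners,
                   CvSize patternSize, CvMat* homography)
    {
        int cornerCount = patternSize.width * patternSize.height;
        std::vector<CvPoint2D32f> camCorners(cornerCount);

        // Locate the chessboard corners in the camera frame.
        if (!cvFindChessboardCorners(cameraFrame, patternSize,
                                     &camCorners[0], &cornerCount,
                                     CV_CALIB_CB_ADAPTIVE_THRESH))
            return false;

        // Solve for the homography from camera to projector coordinates.
        CvMat camMat  = cvMat(1, cornerCount, CV_32FC2, &camCorners[0]);
        CvMat projMat = cvMat(1, cornerCount, CV_32FC2, projCorners);
        cvFindHomography(&camMat, &projMat, homography);
        return true;
    }

Once the homography is known, a fingertip position detected in the camera frame can be mapped into projector coordinates with cvPerspectiveTransform(), so the drawn stroke lands exactly under the finger.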


Gesture and Physical World Demo

One more demo is an application that recognizes a circle gesture and creates a circle in a virtual physical world. For gesture recognition, I am using Baylor Wetzel's C++ port of the $1 Unistroke Recognizer, which the original SS uses as well. In this application, we can also get a real tissue box mixed into the virtual world. To do so, I use the SURF feature detector to extract key points from a tissue box image and find them in a camera frame. For the physics engine, Box2D is used.
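
Here is a minimal sketch of that object-recognition step, assuming the OpenCV 2.x C++ API (SURF lives in the nonfree module as of 2.4). The function matchTissueBox() and the threshold value are hypothetical.

    #include <vector>
    #include <opencv2/core/core.hpp>
    #include <opencv2/features2d/features2d.hpp>
    #include <opencv2/nonfree/features2d.hpp>  // SurfFeatureDetector in OpenCV 2.4

    // Minimal sketch: extract SURF keypoints from the tissue box image,
    // then match them against the current camera frame.
    void matchTissueBox(const cv::Mat& boxImage, const cv::Mat& frame)
    {
        cv::SurfFeatureDetector detector(400);   // Hessian threshold
        cv::SurfDescriptorExtractor extractor;

        std::vector<cv::KeyPoint> boxKeys, frameKeys;
        cv::Mat boxDesc, frameDesc;
        detector.detect(boxImage, boxKeys);
        extractor.compute(boxImage, boxKeys, boxDesc);
        detector.detect(frame, frameKeys);
        extractor.compute(frame, frameKeys, frameDesc);

        // Match descriptors; clusters of good matches locate the box.
        cv::FlannBasedMatcher matcher;
        std::vector<cv::DMatch> matches;
        matcher.match(boxDesc, frameDesc, matches);
        // (Filter matches by distance, then estimate the box's position,
        //  e.g. with cv::findHomography, before handing it to Box2D.)
    }

In a real frame loop, the box image would be processed once up front and only the frame side recomputed per frame.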


Full Source Code of N3

The source code is on my GitHub. It's not well documented yet, but if someone is interested in this project, I'll put more effort into documentation.

Conclusion

This was a brief introduction to N3. It uses OpenCV as its base library and aims to improve recognition precision, portability, and source code structure. Several applications are already implemented, integrating the following technologies:

  • Color Marker Detection and Tracking
  • Camera Calibration
  • Object Recognition
  • Gesture Recognition
  • Physics Engine
I will improve these technologies and add more features so that N3 can become more useful and practical. Stay tuned!