Aug 12, 2011 10:59 AM

Augmented Reality: KinectFusion

*I guess you can't really call it a "Kinect hack" if it's Microsoft R&D guys legitimately researching a Microsoft product.

*"The system allows the user to scan a whole room and its contents within seconds." There's not a peep of audio in the clip here, but never mind, the implications speak for themselves.

Uploaded by MicrosoftResearch on Aug 11, 2011

"We present KinectFusion, a system that takes live depth data from a moving depth camera and in real-time creates high-quality 3D models. The system allows the user to scan a whole room and its contents within seconds.

"As the space is explored, new views of the scene and objects are revealed and these are fused into a single 3D model. The system continually tracks the 6DOF pose of the camera (((That would be "six degrees of freedom" of camera movement, meaning up-down, north-south, east-west, plus roll, pitch and yaw))) and rapidly builds a volumetric representation of arbitrary scenes.
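For the curious, here is a rough sketch of what a 6DOF pose amounts to in code: three numbers for position and three angles for orientation, which together map a point seen by the camera into room coordinates. The function names and the roll-pitch-yaw (Z-Y-X) angle convention are assumptions made for illustration; the abstract does not say how KinectFusion stores its pose.

```python
import math

def rotation_from_rpy(roll, pitch, yaw):
    """3x3 rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians.

    The Z-Y-X angle convention is an assumption for this sketch.
    """
    cr, sr = math.cos(roll), math.sin(roll)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cy, sy = math.cos(yaw), math.sin(yaw)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def camera_to_world(pose, point):
    """Map a point from camera coordinates into world coordinates.

    pose is (x, y, z, roll, pitch, yaw): the six degrees of freedom.
    """
    x, y, z, roll, pitch, yaw = pose
    rot = rotation_from_rpy(roll, pitch, yaw)
    trans = (x, y, z)
    # Rotate the point, then shift it by the camera position.
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + trans[i]
        for i in range(3)
    )
```

A camera sitting at (1, 2, 3) and yawed a quarter turn would place a point one unit along its own x axis at roughly (1, 3, 3) in the room.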

"Our technique for tracking is directly suited to the point-based depth data of Kinect, and requires no feature extraction or feature tracking. Once the 3D pose of the camera is known, each depth measurement from the sensor can be integrated into a volumetric representation.
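One common way to "integrate" depth readings into a volume is a truncated signed distance function: each voxel keeps a running average of how far it sits in front of (positive) or behind (negative) the nearest measured surface. The abstract does not name the representation, so treat the sketch below as an assumption, cut down to the voxels along a single camera ray. `integrate_depth`, `trunc` and the equal weighting of readings are all illustrative choices.

```python
def integrate_depth(tsdf, weights, voxel_depths, measured_depth, trunc=0.1):
    """Fuse one depth reading into the voxels along a single camera ray.

    tsdf[i] is the running average of the truncated signed distance at
    voxel i, scaled so it never exceeds 1.0; weights[i] counts how many
    readings contributed to it. Both lists are updated in place.
    """
    for i, voxel_depth in enumerate(voxel_depths):
        # Positive in front of the measured surface, negative behind it.
        sdf = measured_depth - voxel_depth
        if sdf < -trunc:
            # Hidden well behind the surface: this reading says nothing here.
            continue
        d = min(1.0, sdf / trunc)
        w = weights[i]
        # Running average over every reading that has touched this voxel.
        tsdf[i] = (tsdf[i] * w + d) / (w + 1)
        weights[i] = w + 1
```

Feeding it two slightly different readings of the same wall, say 1.0 and 1.2 metres, leaves the voxels near the wall holding averaged values, which is how noisy measurements get smoothed as the camera keeps looking.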

"We describe the benefits of this representation over mesh-based approaches. (((It's not enough that they're doing it, it's also gotta be better than some other method. But if you COMBINE six or seven registration methods, all in real-time, you might get superior results.)))

"In particular, the representation implicitly encodes predictions of the geometry of surfaces within a scene, which can be extracted readily from the volume. (((In other words it's partly fictional, which is interesting in itself – imagine household robots trawling through sets of hypotheses.)))
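To make "extracted readily from the volume" concrete: if each voxel stores a signed distance to the nearest surface (an assumption, since the abstract does not spell out the encoding), the surface is wherever that value crosses zero, and its position between two voxels can be estimated by linear interpolation. The function name `find_surface` and the one-ray setup are hypothetical.

```python
def find_surface(voxel_depths, tsdf):
    """Walk along one ray and return the depth where the stored signed
    distance first changes from positive (free space) to zero or negative
    (at or behind a surface). Returns None if no crossing is found.
    """
    for i in range(len(tsdf) - 1):
        a, b = tsdf[i], tsdf[i + 1]
        if a > 0 >= b:
            # Linear interpolation between the two voxels to find the zero.
            frac = a / (a - b)
            step = voxel_depths[i + 1] - voxel_depths[i]
            return voxel_depths[i] + frac * step
    return None
```

The interpolated answer can fall between voxel centres, so the recovered surface is finer than the voxel grid itself, which is roughly the sense in which the volume "predicts" geometry.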

"As the camera moves through the scene, new depth data can be added or removed from this volumetric representation, continually refining the 3D model acquired. (((So it oughta shape up pretty good after a year or so.)))

"We describe novel GPU-based implementations for both camera tracking and surface reconstruction. These take two well-understood methods from the computer vision and graphics literature as a starting point, defining new instantiations designed specifically for parallelizable GPGPU hardware. This allows for interactive real-time rates that have not previously been demonstrated. (((Makes one wonder what happens when you ship the hardware problem out to the cloud.)))

"We demonstrate the interactive possibilities enabled when high-quality 3D models can be acquired in real-time, including: extending multi-touch interactions to arbitrary surfaces; ((("just click on that wall there"))) advanced features for augmented reality; ((("welcome to my daemon-haunted living room"))) real-time physics simulations of the dynamic model; ((("don't fall down those stairs"))) novel methods for segmentation and tracking of scanned objects." ((("Microsoft Bing for lost shoes.")))