This page contains all of the main projects that I have worked on as a PhD candidate, as a student, as commissioned work, or in my spare time for friends or myself. You will find PDF files, videos or pictures above the description of each project, which can be accessed by clicking on the respective thumbnail images.

Computer Vision & Mixed Reality

PhD Thesis: Robust Monocular Pose Estimation of Rigid 3D Objects in Real-Time (2018)

Abstract This dissertation presents novel approaches to visual 3D object pose estimation from 2D images. The particular feature of the proposed solutions is that they operate in real-time while only requiring a single (monocular) camera. The main parts of this work describe an innovative active infrared LED marker-based system as well as a novel algorithm for passive markerless pose estimation, both developed within the course of this thesis. For the marker-based approach, two original, nearly co-planar LED patterns are proposed. These enable high-speed, single-image pose estimation of multiple markers as well as robustly avoiding common pose ambiguities. The proposed markerless method presents a novel combination of region-based and direct photometric pose estimation. It is enabled by a new numerical pose optimization strategy derived for the region-based part as well as an innovative statistical object segmentation model. The overall approach thereby significantly improved the robustness towards challenging conditions, such as dynamic lighting, cluttered backgrounds, different object appearances, occlusions and fast and complex motion, compared to the state of the art. It is furthermore the first capable of estimating the poses of multiple arbitrarily textured objects in real-time on a commodity laptop. In addition to this, a new complex dataset dedicated to the task of monocular object pose tracking has been created and made publicly available.

Article: A Region-based Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking (2018)

Abstract We propose an algorithm for real-time 6DOF pose tracking of rigid 3D objects using a monocular RGB camera. The key idea is to derive a region-based cost function using temporally consistent local color histograms. While such region-based cost functions are commonly optimized using first-order gradient descent techniques, we systematically derive a Gauss-Newton optimization scheme which gives rise to drastically faster convergence and highly accurate and robust tracking performance. We furthermore propose a novel complex dataset dedicated for the task of monocular object pose tracking and make it publicly available to the community. To our knowledge, it is the first to address the common and important scenario in which both the camera as well as the objects are moving simultaneously in cluttered scenes. In numerous experiments - including our own proposed dataset - we demonstrate that the proposed Gauss-Newton approach outperforms existing approaches, in particular in the presence of cluttered backgrounds, heterogeneous objects and partial occlusions.

Henning Tjaden, Ulrich Schwanecke, Elmar Schömer and Daniel Cremers
A Region-based Gauss-Newton Approach to Real-Time Monocular Multiple Object Tracking
In IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2018
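
To illustrate why a Gauss-Newton scheme can converge in far fewer iterations than first-order gradient descent, here is a minimal Python sketch on a toy least-squares problem. It is not the region-based cost function from the article: the one-parameter exponential model, the data, the start value and the step size are all assumptions chosen purely for illustration.

```python
import math

# Toy data from y = exp(a * x) with a = 0.7 (an assumed example, not from the article).
XS = [0.0, 0.5, 1.0, 1.5, 2.0]
TRUE_A = 0.7
YS = [math.exp(TRUE_A * x) for x in XS]


def gradient_and_gn_hessian(a):
    """Return J^T r and J^T J for the residuals r_i = exp(a * x_i) - y_i."""
    grad = 0.0
    hess = 0.0
    for x, y in zip(XS, YS):
        r = math.exp(a * x) - y      # residual
        j = x * math.exp(a * x)      # derivative of the residual w.r.t. a
        grad += j * r
        hess += j * j
    return grad, hess


def gauss_newton(a, tol=1e-10, max_iter=100):
    """Second-order style update: step = -(J^T J)^-1 J^T r."""
    for i in range(1, max_iter + 1):
        grad, hess = gradient_and_gn_hessian(a)
        step = -grad / hess
        a += step
        if abs(step) < tol:
            return a, i
    return a, max_iter


def gradient_descent(a, learning_rate=0.005, tol=1e-10, max_iter=100000):
    """First-order update with a fixed, hand-picked step size."""
    for i in range(1, max_iter + 1):
        grad, _ = gradient_and_gn_hessian(a)
        step = -learning_rate * grad
        a += step
        if abs(step) < tol:
            return a, i
    return a, max_iter


a_gn, iters_gn = gauss_newton(0.0)
a_gd, iters_gd = gradient_descent(0.0)
print(f"Gauss-Newton:     a = {a_gn:.6f} after {iters_gn} iterations")
print(f"Gradient descent: a = {a_gd:.6f} after {iters_gd} iterations")
```

Both solvers recover the same parameter, but the Gauss-Newton update needs no hand-tuned step size because it rescales the gradient by the approximate curvature J^T J.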

Paper: Real-Time Monocular Pose Estimation of 3D Objects using Temporally Consistent Local Color Histograms (2017)

Abstract We present a novel approach to 6DOF pose estimation and segmentation of rigid 3D objects using a single monocular RGB camera based on temporally consistent, local color histograms. We show that this approach outperforms previous methods in cases of cluttered backgrounds, heterogeneous objects, and occlusions. The proposed histograms can be used as statistical object descriptors within a template matching strategy for pose recovery after temporary tracking loss, e.g. caused by massive occlusion or if the object leaves the camera’s field of view. The descriptors can be trained online within a couple of seconds by moving a handheld object in front of a camera. During the training stage, our approach is already capable of recovering from accidental tracking loss. We demonstrate the performance of our method in comparison to the state of the art in different challenging experiments including a popular public data set.

Henning Tjaden, Ulrich Schwanecke and Elmar Schömer
Real-Time Monocular Pose Estimation of 3D Objects using Temporally Consistent Local Color Histograms
In Proceedings of the International Conference on Computer Vision (ICCV), 2017

Paper: A Low-Cost Mobile Infrastructure for Compact Aerial Robots Under Supervision (2017)

Abstract The availability of affordable Micro Aerial Vehicles (MAVs) opens up a whole new field of civil applications. We present an Infrastructure for Compact Aerial Robots Under Supervision (ICARUS) that realizes a scalable low-cost testbed for research in the area of MAVs starting at about $100. It combines hardware and software for tracking and computer-based control of multiple quadrotors. In combination with the usage of lightweight miniature off-the-shelf quadrotors, our system provides a testbed that can be used virtually anywhere without the need for elaborate safety measures. We give an overview of the entire system, provide some implementation details as well as an evaluation, and depict different applications based on our infrastructure, such as an Unmanned Ground Vehicle (UGV) which in cooperation with a MAV can be utilized in Search and Rescue (SAR) operations, and a multiuser interaction scenario with several MAVs.

Marc Lieser, Henning Tjaden, Robert Brylka, Lasse Löffler and Ulrich Schwanecke
A Low-Cost Mobile Infrastructure for Compact Aerial Robots Under Supervision
In Proceedings of the European Conference on Mobile Robotics (ECMR), 2017

Paper: Real-Time Monocular Segmentation and Pose Tracking of Multiple Objects (2016)

Abstract We present a real-time system capable of segmenting multiple 3D objects and tracking their pose using a single RGB camera, based on prior shape knowledge. The proposed method uses twist-coordinates for pose parametrization and a pixel-wise second-order optimization approach which lead to major improvements in terms of tracking robustness, especially in cases of fast motion and scale changes, compared to previous region-based approaches. Our implementation runs at about 50 – 100 Hz on a commodity laptop when tracking a single object without relying on GPGPU computations. We compare our method to the current state of the art in various experiments involving challenging motion sequences and different complex objects.

Henning Tjaden, Ulrich Schwanecke and Elmar Schömer
Real-Time Monocular Segmentation and Pose Tracking of Multiple Objects
In Proceedings of the European Conference on Computer Vision (ECCV), 2016
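
As a rough illustration of the twist-coordinate pose parametrization mentioned in the abstract, the following Python sketch maps a six-dimensional twist to a 4x4 rigid body transform via the exponential map (Rodrigues' formula). It is a generic textbook construction, not the paper's implementation; the function names and the split of the twist into a rotational part `w` and a translational part `v` are assumptions made for this example.

```python
import math


def hat(w):
    """Skew-symmetric matrix of a 3-vector, so that hat(w) @ p equals cross(w, p)."""
    wx, wy, wz = w
    return [[0.0, -wz, wy],
            [wz, 0.0, -wx],
            [-wy, wx, 0.0]]


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def twist_to_transform(w, v):
    """Exponential map from a twist (w, v) to a 4x4 homogeneous transform."""
    theta = math.sqrt(sum(c * c for c in w))
    if theta < 1e-4:
        # Taylor expansions avoid dividing by a tiny rotation angle.
        a = 1.0 - theta ** 2 / 6.0
        b = 0.5 - theta ** 2 / 24.0
        c = 1.0 / 6.0 - theta ** 2 / 120.0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
        c = (theta - math.sin(theta)) / theta ** 3
    k = hat(w)
    k2 = matmul3(k, k)
    eye = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    # Rotation: R = I + a*K + b*K^2
    rot = [[eye[i][j] + a * k[i][j] + b * k2[i][j] for j in range(3)]
           for i in range(3)]
    # Translation: t = (I + b*K + c*K^2) v
    vmat = [[eye[i][j] + b * k[i][j] + c * k2[i][j] for j in range(3)]
            for i in range(3)]
    t = [sum(vmat[i][j] * v[j] for j in range(3)) for i in range(3)]
    return [rot[0] + [t[0]],
            rot[1] + [t[1]],
            rot[2] + [t[2]],
            [0.0, 0.0, 0.0, 1.0]]


# A quarter turn about the z-axis combined with a translational component along x.
T = twist_to_transform([0.0, 0.0, math.pi / 2], [1.0, 0.0, 0.0])
for row in T:
    print(["%.4f" % value for value in row])
```

A parametrization of this kind is convenient for iterative optimization because a pose update has exactly six unconstrained parameters and the result is always a valid rigid transform.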

Paper: High-Speed and Robust Monocular Tracking (2015)

Abstract In this paper, we present a system for high-speed robust monocular tracking (HSRM-Tracking) of active markers. The proposed algorithm robustly and accurately tracks multiple markers at full framerate of current high-speed cameras. For this, we have developed a novel, nearly co-planar marker pattern that can be identified without initialization or incremental tracking. The pattern also encodes a unique ID to identify different markers. The individual markers are calibrated semi-automatically, thus no time-consuming and error-prone manual measurement is needed. Finally we show that the minimal spatial structure of the marker can be used to robustly avoid pose ambiguities even at large distances to the camera. This allows us to measure the pose of each individual marker with high accuracy in a vast area.

Henning Tjaden, Ulrich Schwanecke, Frédéric Stein and Elmar Schömer
High-Speed and Robust Monocular Tracking
In Proceedings of the International Conference on Computer Vision Theory and Applications (VISIGRAPP), 2015

Master Thesis: Interactive 3D Reconstruction using a Monocular RGB Camera (2012)

For my master's thesis (original title "Interaktive 3D-Rekonstruktion mit Hilfe einer RGB-Kamera") I developed an algorithm for interactively computing dense (meaning pixel-wise) 3D reconstructions from a handheld monocular RGB camera. It allows humans or robots to create detailed, textured 3D models of the currently filmed scene that appear on the screen while the camera is moving. Note that the resulting system was strongly inspired by the famous DTAM method.

The software (called SwiftSCAN 3D) was written in C++, OpenCL and OpenGL. It runs in parallel on both the CPU (camera pose tracking) and the GPU (computing the dense reconstruction). The real-time visualization of the reconstructions is performed by making use of OpenGL/OpenCL interoperability in order to minimize the data transfer between CPU and GPU. It also depends on Eigen2, Qt as well as OpenCV for image I/O.

SurfAR - Real-Time Markerless Augmented Reality (2012)

These are the results of my final project before writing my master's thesis. I aimed at developing an algorithm for real-time, markerless pose tracking of planar, textured objects, such as postcards, DVD cases or book covers, in order to use it for augmented reality applications.

The resulting system (called SurfAR) is capable of solving this task based on SURF features using arbitrary objects that can be dynamically chosen by the user at run time. It also displays a user-defined 3D model on top of the tracked object in an augmented reality manner using the current pose estimate.

The software was written in C++ using OpenCV (especially the GPU SURF algorithm) as well as VTK and Qt for the GUI.

Projection-ONE - Structured Light 3D Scanner (2012)

In the advanced seminar on "3D Image Analysis and Synthesis", we developed a 3D structured light scanner consisting of a single monocular camera and a commodity light projector that creates precise and dense 3D reconstructions within a couple of seconds. The team consisted of Florian Feuerstein, Marc Lieser, Kai Wolf and myself.

In this project I was responsible for writing the image processing algorithms that compute the 3D reconstructions from given images of the pre-calibrated system. The software was written in Python using OpenCV and NumPy.

Bachelor Thesis: Real-Time and Robust Detection of Root Canal Orifices in Videos (2010)

For my bachelor's thesis (original title "Stabile Erkennung von Wurzelkanaleingängen in Videosequenzen") I developed a method to automatically detect root canal orifices of drilled-out human teeth in single images or live videos from an intra-oral camera. The system also measures the relative size of all detected orifices and uses this information to classify the type of tooth that is currently visible. It can thus be used to help create clinical reports.

The software (called DentalCV) was written in C++ and uses OpenCV for basic image processing algorithms. The GUI was realized with Qt.

Parts of my thesis were later published in the paper "An optimized video system for augmented reality in endodontics: a feasibility study".

Carrera CV - Tracking and Autonomous Control of a Slot Car Racing Track (2009)

In the advanced seminar on "3D Reconstruction and Modelling" in the 6th semester, we developed a system for visually tracking and autonomously controlling a slot car on a racing track, using a single monocular camera. The team consisted of Tim Hofmann, Johannes Plag, Matthias Volland and myself.

The software was written in C/C++ and used OpenCV for basic image processing algorithms as well as Qt and OpenGL for the GUI. We furthermore utilized the ARToolkit Plus in order to compute the camera pose relative to the race track. Understanding and integrating the ARToolkit Plus was my main responsibility during this project. I also wrote a short report about it which I presented to the others.

Interactive 3D Computer Graphics

Site3D - 3D Object Manipulation Tool (2011)

Site3D is a tool for manipulating and arranging multiple 3D object meshes in a common scene. The software allows the user to load models, change their color, shading and textures, and apply various rigid and non-rigid transformations to them. It is furthermore possible to select and transform one or more vertices of a mesh and to apply global mesh smoothing to each model.

The program was written using Java with JoGL.

Animating Deformations using Mass-Spring-Systems in Computer Graphics (2012)

The report was written by me as part of a seminar on Computer Graphics and Animation, where I chose to focus on methods for animating deformation.

The report starts by introducing and comparing different methods in this field of application. The main part then describes mass-spring-systems in detail as a sophisticated and real-time capable example method.
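
To give an idea of how a mass-spring system is typically simulated, here is a minimal Python sketch of a hanging 1D chain of point masses connected by damped Hooke springs and integrated with semi-implicit Euler steps. It is not taken from the report; the function name, the chain layout and all parameter values are assumptions chosen for illustration.

```python
def simulate_chain(n=3, mass=1.0, stiffness=50.0, rest_length=1.0,
                   damping=2.0, gravity=9.81, dt=0.01, steps=3000):
    """Simulate n masses hanging below a fixed anchor at position 0.

    Positions are measured downward from the anchor. Returns the final
    position of every mass after `steps` semi-implicit Euler steps.
    """
    x = [rest_length * (i + 1) for i in range(n)]   # start with unstretched springs
    v = [0.0] * n
    for _ in range(steps):
        # Gravity plus simple velocity damping on every mass.
        forces = [mass * gravity - damping * v[i] for i in range(n)]
        # Spring i connects mass i to the mass above it (or to the anchor).
        for i in range(n):
            upper = x[i - 1] if i > 0 else 0.0
            stretch = (x[i] - upper) - rest_length
            f = stiffness * stretch                 # Hooke's law
            forces[i] -= f                          # pulls the lower mass up
            if i > 0:
                forces[i - 1] += f                  # pulls the upper mass down
        # Semi-implicit Euler: update velocities first, then positions.
        for i in range(n):
            v[i] += dt * forces[i] / mass
            x[i] += dt * v[i]
    return x


positions = simulate_chain()
print(["%.4f" % p for p in positions])
```

After the oscillations have been damped out, each spring is stretched by the weight of all masses hanging below it, which gives a simple sanity check for the simulation. Updating the velocities before the positions keeps this explicit scheme stable for reasonably small time steps, which is one reason mass-spring systems are attractive for real-time use.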

Asteroidz - 3D OpenGL/Wiimote Game (2008)

This game was developed in a team with Tim Hofmann and Daniel Bohland in my 4th semester. We used OpenGL and Python.

What makes this game special is that it can be controlled either with mouse and keyboard or with a Nintendo Wiimote. The cube menu can, for example, be rotated by quickly whipping the controller sideways, and the spaceship can be steered by using the controller as a joystick. Unfortunately, the video does not fully represent the gameplay, because the poor screen recording technique caused the game to run slowly.

Web Development

Flash Design Portfolio with CMS Backend (2012)

In this project I realized the front- and backend for the online portfolio website of the designer Xian Jin.

The layout of the Flash frontend was created by the designer and technically realized by me using ActionScript 3. All content is loaded dynamically from a Drupal-based CMS backend. This allows the designer to easily manage and exchange the content (images, texts, etc.) of the site.

Band Website for "Besidos" (2012)

The website of the band Besidos was created in cooperation with Xian Jin, who was responsible for the design of the layout, while I realized the front- and backend technically.

The backend was created using Drupal, allowing the band members to easily create news postings and events and to upload images to a gallery. The frontend was realized using HTML/CSS and jQuery. The entire site can be navigated without a single page reload, which makes it possible to animate smooth transitions between different sub pages using JavaScript. The content of each sub page is dynamically loaded from the CMS using AJAX in conjunction with deep-linking.

Design & Animation

Animated 3D Promotional Clip for the Musical "A Midsummer Night’s Dream" in Hannover (2010)

The intention of this project was to create a pre-visualization of the stage scenery for the Shakespeare musical "A Midsummer Night’s Dream" in the format of an animated 3D promotional clip. The musical was presented in the summer of 2010 in the "Gartentheater" within the Royal Gardens of Herrenhausen (Hannover).

For this, I first had to recreate the original amphitheater and then add the stage decoration. Visible elements of the stage decoration are the ramp in front of the stage, the artificial elevation on stage and the moon above.

Rendered images of the scene were used on the musical website for previewing purposes. The clip itself was shown at the "Expo Plaza Festival" in Hannover on a big screen in front of approximately fifty thousand people along with other event previews.

3D Modelling (Various)

This gallery contains rendered images of small 3D scenes that I created either as practice or as part of another project. All scenes were created with Autodesk Maya.

Drawings/Illustrations (Various)

This collection shows a selection of drawings/illustrations I created in my spare time for friends and myself.
They were mainly created using Adobe Photoshop (with a drawing pad), Adobe Illustrator or pencils.

My Photoshop Script (2009)

I created this script for my practical workshop "Adobe Photoshop for Interior Design Students", which I held at the RheinMain University of Applied Sciences from 2009 to 2011. I decided to use Photoshop CS2 in this script, since this version was installed in the university's PC pool at that time.

If you are interested in the script, please send me an e-mail.