Real Virtual Lab

The Real Virtual Lab hosts a diverse range of state-of-the-art scientific capture systems that enable the recording and measurement of real-world phenomena at large scale and high fidelity. The data acquired in these setups forms the foundation for developing new algorithms that digitize, model, and analyze the physical world. The captured data allows us to observe complex real-world interactions and quantitatively evaluate our algorithms, and it serves as training material for deep, potentially generative, models.

The lab spans 350 m² and currently features the following experimental setups:

Light Stage

The light stage is a spherical dome equipped with 13,000 individually controllable (and optionally polarized) LEDs, as well as 40 ultra-high-resolution cameras that can capture scenes at 6K resolution. The stage allows us to illuminate the scene with arbitrary light patterns in order to simulate the physical light transport of the real world. From data like this, generative AI models can learn to modify the lighting in images and videos, which is important for many downstream applications, for example in VR and AR. Research highlights include the first large-scale full-head "one-light-at-a-time" (OLAT) dataset, which will be made publicly available to the research community. Moreover, we captured full-body performances under uniform and environment lighting conditions, as well as a large-scale object dataset.
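Because light transport is linear, an image of the scene under any target illumination can be synthesized as a weighted sum of OLAT basis images, which is what makes OLAT data so valuable for relighting. The following is a minimal sketch of this principle, assuming hypothetical array shapes and a `relight_from_olat` helper; it illustrates the underlying math, not the lab's actual pipeline.

```python
import numpy as np

def relight_from_olat(olat_images: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Synthesize an image under novel lighting from OLAT basis images.

    Light transport is linear, so an image under any lighting condition
    is a weighted sum of the one-light-at-a-time (OLAT) images.

    olat_images: (num_leds, H, W, 3) linear-radiance images, one per LED.
    weights:     (num_leds, 3) RGB intensity of the target illumination
                 sampled at each LED's direction.
    """
    # Scale each OLAT image by its light's RGB weight and sum over LEDs.
    return np.einsum('nhwc,nc->hwc', olat_images, weights)

# Illustrative toy usage (13,000 LEDs at 6K resolution would not fit in
# memory at once; a real pipeline would stream and accumulate instead).
olat = np.random.rand(64, 128, 128, 3).astype(np.float32)
env_weights = np.random.rand(64, 3).astype(np.float32)
relit = relight_from_olat(olat, env_weights)
print(relit.shape)  # (128, 128, 3)
```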

Multi-view Video Studio

This studio provides 36 m² of scene space, allowing us to capture individuals and groups of people. The system features 120 machine vision cameras that stream 4K video at 50 fps. Due to the large amount of data that this setup records per second, we designed a dedicated streaming pipeline that connects the cameras to 10 storage servers (see the data-rate estimate below). Eight GPU servers can reconstruct dense 3D geometry from the raw multi-view images using our implicit surface reconstruction algorithm. We have released multiple datasets captured with this setup (see Holoported Characters, ASH, MetaCap, Relightable Neural Actor, EgoAvatar), which serve as some of the most challenging publicly available benchmarks for neural human rendering and performance capture. The multi-view video studio is a versatile facility: in addition to the main camera system, it hosts complementary camera systems, for example for real-time marker-based and marker-less motion capture, as well as long and short focal length lenses for close-up and wide-angle recordings.
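A back-of-the-envelope estimate makes clear why a dedicated streaming pipeline is needed. The sketch below assumes that 4K means 3840x2160 pixels and that each camera delivers 8-bit Bayer raw frames (one byte per pixel); both are illustrative assumptions, as the actual sensor format is not specified here.

```python
# Rough aggregate raw data rate of the multi-view video studio.
NUM_CAMERAS = 120
WIDTH, HEIGHT = 3840, 2160   # assumed 4K resolution
FPS = 50
BYTES_PER_PIXEL = 1          # assumed 8-bit Bayer raw

rate = NUM_CAMERAS * WIDTH * HEIGHT * FPS * BYTES_PER_PIXEL
print(f"{rate / 1e9:.1f} GB/s aggregate")        # ~49.8 GB/s
print(f"{rate / 10 / 1e9:.1f} GB/s per server")  # ~5.0 GB/s across 10 servers
```

Under these assumptions, each of the 10 storage servers must sustain roughly 5 GB/s of writes, which motivates a purpose-built streaming design.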

Face Capture Dome

The face capture dome is a multi-camera setup for recording the facial and head motion of subjects. It is equipped with 40 hardware-synchronized video cameras that capture at 4K resolution. The cameras are mounted on a sphere-like structure, and the subject is illuminated from all sides with diffuse LED lighting. We have used data from this setup in our methods for photorealistic free-viewpoint rendering of head avatars, such as UniGAHA, GaussianHeads, and AvatarStudio.

3D Scanner

In our body scanner, the subject is captured by a dense set of 140 RGB cameras. A light projector system enables photogrammetric reconstruction of shape and texture at high detail. In combination with our work on neural implicit geometry recovery, this setup allows us to reconstruct high-fidelity 3D geometry and appearance of static humans as well as large objects. For instance, we recorded static human mesh templates that are automatically skinned and rigged, serving as the basis for many of our human rendering algorithms; see for example TriHuman or DELIFFAS.
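At the heart of photogrammetric reconstruction lies triangulation: recovering a 3D point from its 2D projections in multiple calibrated cameras. The following is a generic sketch of standard linear (DLT) triangulation, assuming known 3x4 camera projection matrices; it illustrates the principle and is not the lab's reconstruction code.

```python
import numpy as np

def triangulate(projections, points_2d):
    """Linear (DLT) triangulation of one 3D point from >= 2 views.

    projections: list of 3x4 camera projection matrices P_i.
    points_2d:   list of corresponding (x, y) pixel observations.
    """
    # Each observation contributes two rows to a homogeneous system
    # A @ X = 0; the least-squares solution is the right singular
    # vector of A with the smallest singular value.
    rows = []
    for P, (x, y) in zip(projections, points_2d):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize

# Illustrative usage: two synthetic cameras observing the point (0, 0, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])              # reference camera
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])  # shifted baseline
X_true = np.array([0.0, 0.0, 5.0, 1.0])
x1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
x2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]
print(triangulate([P1, P2], [x1, x2]))  # ~[0. 0. 5.]
```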