The common goal of all research efforts of the group is to advance knowledge of image perception and to develop imaging algorithms with embedded computational models of the human visual system (HVS). This approach offers significant improvements in both computational performance and perceived image quality. We often refer to perceptual effects rather than physical effects, which puts more emphasis on the experience of observers than on physical measurements. In particular, we aim to exploit perceptual effects as a means of overcoming the physical limitations of display devices and enhancing apparent image quality.
Seamless blending of visual content from both the virtual and the real world by means of augmented reality (AR) is a long-standing goal. To this end, a faithful reproduction of visual cues such as binocular vision, motion parallax, and eye lens accommodation is of particular importance. Displays with such functionality require specialized rendering algorithms that sample and reconstruct important aspects of the complex light field, which is a complete representation of the light distribution at every position and direction in a scene. Modeling visual perception is a principled approach to optimizing the sampling and reconstruction so that discrepancies from the perception of a reference real-world scene are minimized. For example, the sensitivity of the HVS to brightness, contrast, and object depth may change dynamically as a function of gaze position, ambient luminance level, and the accommodative state of the eye lens, which in turn might depend on the semantic scene content, object positions in screen space, their depth levels, and the observer's motion relative to the objects. A model of the HVS that accounts for all these factors enables dynamic reallocation of rendering resources to achieve the best visual quality.
We consider 3D printing to be another aspect of the seamless transition between virtual and physical objects, and we aim to develop a domain-specific perception model for faithful appearance reproduction in fabrication. When objects are 3D printed, their physical appearance is expected to match their virtual models as closely as possible. The perceptual aspects of such appearance matching have not been investigated thoroughly so far, given that each printing technology might impose its own unique constraints, such as strong light diffusion within the material that washes out texture details of the object.
One of the key goals of our research vision is to spread awareness of the importance of HVS modeling for computer graphics and image processing. We achieve this by co-authoring textbooks, such as the one on High Dynamic Range Imaging (almost 2,300 Google Scholar citations).
Image-content Aware Foveated Rendering
Information on eye fixation allows rendering quality to be reduced in peripheral vision, and the freed resources can be used to improve the quality in the foveal region. Existing rendering solutions model visual sensitivity as a function of eccentricity, neglecting the fact that it is also strongly influenced by the displayed content. We propose a new luminance-contrast-aware foveated rendering technique which demonstrates that the computational savings of foveated rendering can be significantly improved if the local luminance contrast of the image is analyzed. To this end, we first study the resolution requirements at different eccentricities as a function of luminance patterns. We then use this information to derive a low-cost predictor of the foveated rendering parameters. Its main feature is the ability to predict the parameters using only a low-resolution version of the current frame, even though the prediction holds for high-resolution rendering. This property is essential for estimating the required quality before the full-resolution image is rendered. We demonstrate that our predictor can efficiently drive the foveated rendering technique and analyze its benefits in a series of user experiments.
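As an illustration of the idea of predicting foveation parameters from a low-resolution frame, the following minimal sketch computes a local Michelson contrast on a coarse luminance grid and combines it with eccentricity to obtain a per-tile resolution scale. It is not the predictor derived from our experiments: the function names, the `falloff` and `min_scale` parameters, and the linear mapping itself are assumptions chosen for readability, as is the premise that low-contrast regions tolerate a stronger resolution reduction.

```python
import math


def local_contrast(lum, row, col, radius=1):
    """Michelson contrast (max - min) / (max + min) in a small window.

    `lum` is a low-resolution luminance grid (list of rows, values >= 0).
    """
    rows = range(max(0, row - radius), min(len(lum), row + radius + 1))
    cols = range(max(0, col - radius), min(len(lum[0]), col + radius + 1))
    window = [lum[r][c] for r in rows for c in cols]
    lo, hi = min(window), max(window)
    return 0.0 if hi + lo == 0 else (hi - lo) / (hi + lo)


def resolution_scale(ecc_deg, contrast, falloff=0.05, min_scale=0.25):
    """Hypothetical placeholder mapping, not a fitted perceptual model.

    The resolution reduction grows with eccentricity and shrinks as the
    local contrast approaches 1.
    """
    reduction = falloff * ecc_deg * (1.0 - contrast)
    return max(min_scale, 1.0 - reduction)


def predict_scale_map(lum, gaze_rc, deg_per_px):
    """Per-tile resolution scale in [min_scale, 1] for a given gaze position."""
    gaze_row, gaze_col = gaze_rc
    scale_map = []
    for r, line in enumerate(lum):
        out = []
        for c in range(len(line)):
            # Eccentricity: angular distance of the tile from the gaze point.
            ecc = math.hypot(r - gaze_row, c - gaze_col) * deg_per_px
            out.append(resolution_scale(ecc, local_contrast(lum, r, c)))
        scale_map.append(out)
    return scale_map
```

In this sketch a uniform region far from the gaze point receives the lowest scale, while a high-contrast region at the same eccentricity keeps full resolution. A renderer would consume such a map before shading the full-resolution frame.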
Towards Improving Eye Accommodation Performance Beyond Human Capabilities in Natural World
Multi-focal plane and multi-layered light-field displays are promising solutions for addressing all visual cues observed in the real world. We take a step further and ask whether the capabilities of current and future display designs, combined with efficient perception-inspired content optimizations, can be used to improve human task performance beyond human capabilities in the natural world. We focus on speeding up the accommodation response. To achieve this goal, we make two key observations. First, it has been demonstrated that the main eye movements, i.e., saccade, vergence, and accommodation, can be significantly faster if they occur simultaneously. This suggests that by providing content which sufficiently stimulates all these movements, the eye's transition to a new target can be facilitated. Second, for each of these eye movements, the response has a higher peak velocity when the motion is larger. This suggests that by initially presenting a stimulus beyond the intended target, and then backing off to the actual target, higher transition speeds may be achieved. We call this mechanism overdriving the visual cues and hypothesize that it can further speed up eye adaptation to new depth targets.
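The overdriving mechanism can be sketched as a simple stimulus schedule: for a short period after a depth change, the displayed cue overshoots the target, and then it returns to the actual target. The function name and the `overshoot` and `hold` parameters below are hypothetical, and their default values are placeholders rather than measured quantities.

```python
def overdriven_depth(t, start, target, overshoot=0.25, hold=0.2):
    """Depth cue (e.g. in diopters) to present t seconds after a depth change.

    overshoot: fraction of the start-to-target distance added beyond the
               target (placeholder value, not an experimental result).
    hold:      duration in seconds of the overshoot phase (placeholder).
    """
    if t < 0.0:
        return start  # the change has not been requested yet
    if t < hold:
        # Present a stimulus beyond the intended target.
        return target + overshoot * (target - start)
    return target  # back off to the actual target


# Usage: a change from 0.5 D to 2.5 D is first presented at 3.0 D.
schedule = [overdriven_depth(t, 0.5, 2.5) for t in (-0.1, 0.1, 0.5)]
```

The same schedule works for a transition towards a smaller depth value, because the overshoot term follows the sign of `target - start`.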
Improving Visual Experience in VR
Virtual Reality (VR) systems increase immersion by reproducing users' movements in the real world. However, several works have shown that this real-to-virtual mapping does not need to be precise in order to convey a realistic experience. Being able to alter this mapping has many potential applications, since achieving an accurate real-to-virtual mapping is not always possible due to limitations in the capture or display hardware, or in the physical space available. We measure detection thresholds for lateral translation gains of virtual camera motion in response to the corresponding head motion under natural viewing, and in the absence of locomotion, so that virtual camera movement can be either compressed or expanded while these manipulations remain undetected. We propose three applications for our method, addressing three key problems in VR: improving 6-DoF viewing for captured 360° footage, overcoming physical constraints, and reducing simulator sickness.
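A lateral translation gain can be illustrated with a few lines of code: the lateral component of the head displacement is scaled before it is applied to the virtual camera, and the gain is kept inside a detection-threshold interval. This is a sketch under stated assumptions: positions are `(x, y, z)` tuples with `x` taken as the lateral axis, and the threshold bounds are supplied by the caller, since no measured values are reproduced here.

```python
def clamp_gain(gain, lower, upper):
    """Keep a requested gain inside a detection-threshold interval.

    `lower` and `upper` are caller-supplied bounds (hypothetical here).
    """
    return max(lower, min(upper, gain))


def apply_lateral_gain(head_prev, head_now, camera_prev, gain):
    """Move the virtual camera by the head displacement, scaling only x.

    gain < 1 compresses lateral camera motion, gain > 1 expands it.
    """
    dx = head_now[0] - head_prev[0]
    dy = head_now[1] - head_prev[1]
    dz = head_now[2] - head_prev[2]
    return (camera_prev[0] + gain * dx,
            camera_prev[1] + dy,
            camera_prev[2] + dz)


# Usage: a 10 cm lateral head movement expanded by a gain of 1.5.
camera = apply_lateral_gain((0.0, 1.6, 0.0), (0.1, 1.6, 0.0),
                            (0.0, 1.6, 0.0), 1.5)
```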
X-Fields: Implicit Neural View-, Light- and Time-image Interpolation
Exploring changes in view, time, and lighting of captured content in Virtual Reality (VR) and on light-field displays typically requires dense sampling of this data, which leads to excessive storage, capture, and processing requirements. Alternatively, sparse and irregular sampling can overcome these limitations, but it requires real-time processing pipelines that faithfully interpolate across view, time, and light. We suggest taking an abstract view of all those dimensions and simply denote any set of images conditioned on parameters as an “X-Field”, where X could stand for any combination of time, view, and light. We demonstrate how the right neural network (NN) becomes a universal, compact, and interpolatable X-Field representation. The NN encodes an implicit map that, for any view, time, or light coordinate and for any pixel, quantifies how the pixel will move if the view, time, or light coordinates change (the Jacobian of pixel position with respect to view, time, illumination, etc.). Our X-Field representation is trained for one scene within minutes, leading to a compact set of trainable parameters and hence real-time navigation in view, time, and illumination.
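To make the role of the Jacobian concrete, the sketch below applies a first-order model: the displacement of a pixel is the Jacobian of its position multiplied by the change in the X-Field coordinates. In the method the Jacobian is produced by the network; here it is a hand-written stand-in, and the function name and the coordinate order (view, time, light) are assumptions for illustration only.

```python
def pixel_flow(jacobian, x_observed, x_target):
    """First-order pixel displacement for a change of X-Field coordinates.

    jacobian:   two rows (d px / d coord and d py / d coord), one entry per
                coordinate; a stand-in for the network output.
    x_observed: coordinates (e.g. view, time, light) of a captured image.
    x_target:   coordinates of the image to be synthesized.
    """
    delta = [t - o for t, o in zip(x_target, x_observed)]
    return tuple(sum(j * d for j, d in zip(row, delta)) for row in jacobian)


# Usage: this pixel moves horizontally with view and light, vertically with time.
J = [[2.0, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
dx, dy = pixel_flow(J, (0.0, 0.0, 0.0), (0.5, 0.0, 1.0))
```

A warped image is then obtained by sampling the observed image at the displaced positions. Blending several observed images warped in this way is a further step that this sketch leaves out.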
Appearance Reproduction in Computational Fabrication
While additive 3D printing such as poly-jetting technology enables accurate color rendition, exact reproduction of a target appearance is hampered by the strong subsurface scattering within common print materials, which causes non-trivial volumetric cross-talk between printer voxels. Consequently, the produced prints appear significantly different from the target models and lack high-frequency texture details. We propose an iterative optimization scheme that, in conjunction with Monte Carlo simulation of light transport within a 3D print, can be used to find a spatial distribution of printer voxels that closely approximates a given target appearance. To reduce the computational cost of the repeated Monte Carlo invocation within the refinement loop, we train a deep neural network to predict the scattering within a highly heterogeneous medium. With this update, our method performs around two orders of magnitude faster than Monte Carlo rendering while yielding optimization results of similar quality. For the first time, this enables full heterogeneous material optimization for 3D-print preparation within time frames on the order of the actual printing time. Our solution enables high-fidelity color texture reproduction on 3D prints by effectively compensating for internal light scattering within arbitrarily shaped objects. We also address the difficult case of cross-talk between adjacent faces in thin materials. Using a wide range of sample objects with complex color textures, we demonstrate texture reproduction whose fidelity is superior to that of state-of-the-art drivers for 3D color prints.
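The structure of such a refinement loop can be sketched in one dimension. The forward model is passed in as a callable, so a Monte Carlo simulation or a learned predictor could take its place; here a small blur kernel serves as a crude stand-in for subsurface scattering. All names are hypothetical, the additive update rule is an assumption rather than the scheme used in our method, and continuous values in [0, 1] simplify what is in practice a discrete material assignment per voxel.

```python
def blur(signal):
    """Stand-in forward model: [0.25, 0.5, 0.25] blur with replicated edges."""
    n = len(signal)
    return [0.25 * signal[max(i - 1, 0)]
            + 0.5 * signal[i]
            + 0.25 * signal[min(i + 1, n - 1)]
            for i in range(n)]


def squared_error(a, b):
    """Sum of squared differences between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def optimize_assignment(target, forward, iterations=50, step=1.0):
    """Refine the assignment so that forward(assignment) approaches target."""
    current = list(target)  # start from the naive assignment
    best, best_err = current, squared_error(forward(current), target)
    for _ in range(iterations):
        predicted = forward(current)
        # Push each value against the predicted appearance error and keep
        # it inside the printable range [0, 1].
        current = [min(1.0, max(0.0, v + step * (t - p)))
                   for v, t, p in zip(current, target, predicted)]
        err = squared_error(forward(current), target)
        if err < best_err:  # remember the best candidate seen so far
            best, best_err = current, err
    return best, best_err


# Usage: sharpen a soft-edged target so that its blurred version matches it better.
target = [0.3, 0.3, 0.7, 0.7, 0.3, 0.3]
naive_err = squared_error(blur(target), target)
assignment, refined_err = optimize_assignment(target, blur)
```

The loop calls the forward model twice per iteration, which is why replacing an expensive simulation with a fast predictor has such a large effect on the total optimization time.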