The common goal of all research efforts of the group is to advance knowledge of image perception and to develop imaging algorithms with embedded computational models of the human visual system (HVS). This approach offers significant improvements in both computational performance and perceived image quality. We often refer to perceptual effects rather than physical effects, which puts more emphasis on the experience of the observers than on physical measurements. In particular, we aim to exploit perceptual effects as a means of overcoming the physical limitations of display devices and enhancing the apparent image quality.
Seamlessly blending visual content from the virtual and real worlds by means of augmented reality (AR) is a long-standing goal. To this end, a faithful reproduction of visual cues such as binocular vision, motion parallax, and eye lens accommodation is of particular importance. We refer to displays with such functionality as hyper-reality displays (H-RD). Developing an H-RD requires a joint design of hardware with non-conventional display optics and software with specialized rendering algorithms that together attempt to sample and reconstruct important aspects of the complex 5D plenoptic function. The plenoptic function, also referred to as a light-field, is a complete representation of the light distribution at every position and in every direction in a scene. Modeling visual perception is a principled approach to optimizing the sampling and reconstruction, so that the discrepancies from the perception of a reference real-world scene are minimized. For example, the sensitivity of the human visual system (HVS) to brightness, contrast, and object depth may change dynamically as a function of gaze position, ambient luminance level, and the accommodative state of the eye lens, which in turn might depend on the semantic scene content, object positions in screen space, their depth levels, and the observer's motion relative to the objects. A model of the HVS that accounts for all these factors enables dynamic reallocation of rendering resources to achieve the best visual quality.
Capitalizing on the experience we gained from building an award-winning, extremely wide field-of-view see-through AR display that supports eye lens accommodation (a joint effort with Nvidia and UNC), we investigate basic HVS mechanisms that accompany switching gaze direction between the virtual and real worlds. Such studies can now be performed for the first time without focus conflicts and other contradictions between visual cues, such as the infamous vergence-accommodation conflict (VAC). The key objective is to develop suitable perceptual models that integrate all important visual cues and respond to possible cue deviations from real-world observation, which might be imposed by the display optics design or poor rendering quality. Since such perceptual models are able to estimate the quality of the viewing experience and quantify any deviations from reality in a perceptually meaningful way, we apply them to develop objective image quality metrics and to drive foveated rendering on H-RDs that support eye accommodation.
We consider 3D printing to be another aspect of the seamless transition between virtual and physical objects, and we aim to develop a domain-specific perception model for faithful appearance reproduction in fabrication. When objects are 3D printed, their physical appearance is expected to match their virtual models as closely as possible. The perceptual aspects of such appearance matching have not been investigated thoroughly so far, in part because each printing technology might impose its own constraints, such as strong light diffusion within the material that washes out the texture details of the object.
One of the key goals in our research vision is raising awareness of the importance of HVS modeling for computer graphics and image processing. We pursue this by co-authoring textbooks, such as the one on High Dynamic Range Imaging (almost 1,900 Google Scholar citations). Another important mission of our group is training PhD students in applied perception, who carry this expertise into new research areas in their future scientific activities.
Information on the eye fixation allows rendering quality to be reduced in peripheral vision, and the saved resources can be used to improve the quality in the foveal region. The adoption of foveated rendering is hampered by system latency, which is especially apparent during fast saccadic movements, when the information about gaze location is significantly delayed and the quality mismatch can be noticed. Instead of rendering according to the current gaze position, we predict where the saccade is likely to end and render an image for the new fixation location as soon as the prediction is available. While the quality mismatch during the saccade remains unnoticed due to saccadic suppression, a correct image for the new fixation is rendered before the fixation is established. We derive a model for predicting saccade landing positions and demonstrate how it can be used in the context of gaze-contingent rendering to reduce the influence of system latency on the perceived quality.
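The idea of rendering for the predicted landing position rather than the current gaze can be illustrated with a minimal sketch. It does not reproduce the model derived in this work. Instead, it substitutes a deliberately simple heuristic: it assumes the saccade velocity profile is roughly symmetric, so about half of the amplitude has been covered when the peak velocity is reached. The function name `predict_landing`, the 30 deg/s onset threshold, and the input layout (timestamps in seconds, gaze in degrees) are all assumptions made for illustration.

```python
import numpy as np

def predict_landing(t, gaze, onset_thresh=30.0):
    """Predict a saccade landing position from partial gaze samples.

    t     : (N,) timestamps in seconds (hypothetical eye-tracker output)
    gaze  : (N, 2) gaze positions in degrees of visual angle
    Returns a (2,) predicted landing position, or None if no saccade
    has started or its velocity peak has not been observed yet.

    Assumption (not the model from the text): the velocity profile is
    symmetric, so landing = onset + 2 * (position_at_peak - onset).
    """
    t = np.asarray(t, dtype=float)
    gaze = np.asarray(gaze, dtype=float)

    # Finite-difference velocity between consecutive samples, in deg/s.
    vel = np.diff(gaze, axis=0) / np.diff(t)[:, None]
    speed = np.linalg.norm(vel, axis=1)

    # Saccade onset: first interval whose speed exceeds the threshold.
    above = np.nonzero(speed > onset_thresh)[0]
    if above.size == 0:
        return None
    onset = above[0]

    # Peak: last interval before the speed starts to decrease.
    peak = None
    for i in range(onset + 1, len(speed)):
        if speed[i] < speed[i - 1]:
            peak = i - 1
            break
    if peak is None:
        return None

    # speed[k] spans samples k and k+1, so use their midpoint position.
    pos_at_peak = 0.5 * (gaze[peak] + gaze[peak + 1])
    return gaze[onset] + 2.0 * (pos_at_peak - gaze[onset])
```

In a gaze-contingent renderer, the returned position would replace the delayed tracker reading as the center of the high-quality region while the saccade is still in flight. A `None` result means the renderer should keep using the current gaze sample.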
Perception-driven Rendering for Multi-layer Accommodative Displays
Multi-focal plane and multi-layered light-field displays are promising solutions for addressing all visual cues observed in the real world. Unfortunately, these devices usually require expensive optimizations to compute a suitable decomposition of the input light-field or focal stack to drive individual display layers. A simple alternative is a linear blending strategy which decomposes a single 2D image using depth information. This method provides real-time performance, but it generates inaccurate results at occlusion boundaries and on glossy surfaces. We propose a perception-based gaze-contingent hybrid decomposition technique which combines the advantages of the above strategies and achieves both real-time performance and high-fidelity results. We perform a complete, perception-informed analysis and develop a computational model that locally determines which of the two strategies should be applied. The prediction is later utilized by our new gaze-contingent synthesis method which performs the image decomposition. The results are analyzed and validated in user experiments on our custom-built multi-focal plane display with an eye tracker.
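The linear blending strategy mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the hybrid technique proposed here: each pixel of a single 2D image is split between the two focal planes that bracket its depth, with weights that vary linearly with depth. The function names `linear_blend_weights` and `decompose`, the use of diopters as the depth unit, and the array layouts are assumptions made for this example.

```python
import numpy as np

def linear_blend_weights(depth, planes):
    """Per-pixel blending weights for each focal plane.

    depth  : (H, W) per-pixel depth, assumed here to be in diopters
    planes : focal plane depths in the same unit, sorted ascending
    Returns an array of shape (P, H, W) whose weights sum to 1 per pixel.
    """
    planes = np.asarray(planes, dtype=float)
    # Depths outside the covered range are clamped to the nearest plane.
    d = np.clip(np.asarray(depth, dtype=float), planes[0], planes[-1])

    # Index of the lower bracketing plane for every pixel.
    idx = np.searchsorted(planes, d, side="right") - 1
    idx = np.clip(idx, 0, len(planes) - 2)

    lo, hi = planes[idx], planes[idx + 1]
    t = (d - lo) / (hi - lo)  # 0 at the lower plane, 1 at the upper plane

    weights = np.zeros((len(planes),) + d.shape)
    np.put_along_axis(weights, idx[None, ...], (1.0 - t)[None, ...], axis=0)
    np.put_along_axis(weights, (idx + 1)[None, ...], t[None, ...], axis=0)
    return weights

def decompose(image, depth, planes):
    """Split an (H, W, C) image into P layers of shape (P, H, W, C)."""
    weights = linear_blend_weights(depth, planes)
    return weights[..., None] * np.asarray(image, dtype=float)[None, ...]
```

Because every pixel carries exactly one depth value, this decomposition cannot represent content from several depths along the same line of sight, which is consistent with the inaccuracies at occlusion boundaries and on glossy surfaces noted above.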
2D Image and Light-field Quality Evaluation
A large number of imaging and computer graphics applications require localized information on the visibility of image distortions. Existing image quality metrics are not suitable for this task, as they provide a single quality value per image. Existing visibility metrics, on the other hand, produce visual difference maps and are specifically designed for detecting just-noticeable distortions, but their predictions are often inaccurate. We argue that the key reason for this problem is the lack of comprehensive image datasets with good coverage of the distortions that occur in different applications. To address the problem, we collect an extensive dataset of reference and distorted image pairs together with user markings indicating whether distortions are visible or not. We propose a statistical model designed for the meaningful interpretation of such data, which is affected by visual search and the imprecision of manual marking. We use our dataset to train existing metrics and demonstrate that their performance improves significantly. We also show that our dataset, together with the proposed statistical model, can be used to train a new CNN-based metric that outperforms the existing solutions. Our Light-field Archive serves a similar purpose: it is a collection of dense reference and distorted light-fields together with the corresponding quality scores, which are scaled in perceptual units. We use this dataset to check the suitability of existing 2D image quality metrics for light-field quality evaluation.
Appearance Reproduction in Computational Fabrication
Additive 3D printing technologies such as poly-jetting, which have gained importance recently, build object volumes from micro-voxels. Color texture reproduction in 3D printing commonly ignores volumetric light transport between surface points on a 3D print. The resulting light diffusion leads to significant blurring of details and color bleeding, and is particularly severe for highly translucent photopolymerization print materials. Because the scattering properties of these materials vary widely, this cross-talk between surface points strongly depends on the internal structure of the micro-voxels surrounding each surface point. In our work, we counteract heterogeneous scattering to create the impression of a crisp albedo texture on top of the 3D print by optimizing for a fully volumetric material distribution that preserves the target appearance. Our method employs an efficient numerical optimizer on top of a general Monte-Carlo simulation of heterogeneous scattering, supported by a practical calibration procedure to obtain scattering parameters from a given set of printer materials.