Vision and Research Strategy
The common denominator of all research conducted by the group is advancing knowledge of image perception and developing imaging algorithms with embedded computational models of the human visual system (HVS). In this way, both computational performance and image quality as perceived by the human observer can be significantly improved. We often refer to perceptual effects rather than physical effects, that is, effects that we can experience but cannot measure physically. In particular, we aim to exploit perceptual effects as a means of overcoming the physical limitations of display devices and enhancing apparent image quality.
An important issue that is often neglected in computer graphics is accounting for display device characteristics in the design of rendering algorithms. This becomes even more important with the recent proliferation of display technologies, which may differ substantially in reproduced contrast, brightness, spatial screen extent, pixel resolution, and dynamic response. In the past, we proposed a number of techniques that enhance these image quality dimensions beyond the physical limitations of display devices in terms of apparent contrast, brightness, and resolved image detail. Over the last four years, we have been interested in depth reproduction and visual comfort improvement for stereoscopic 3D displays (work in collaboration with Tobias Ritschel and Piotr Didyk). We realized that, on the technical level, we can capitalize on our experience in High Dynamic Range Imaging, e.g., by seeking analogies between tone mapping operators, which we have researched extensively, and newly proposed disparity retargeting. In both cases, a form of local dynamic range compression needs to be performed; the key differences lie in the perceptual models that inform such compression and account for the HVS characteristics relevant to the given modality. We consider interactions between binocular depth cues and pictorial cues (i.e., image content) particularly important. By additionally considering motion parallax and variable eye lens accommodation, which cannot be experienced on the screens of traditional 2D and stereoscopic 3D displays, we want to close the loop and consider all visual cues that the human observer experiences under real-world observation conditions. In view of recent developments in light field displays and head-mounted displays (e.g., Oculus Rift), our research is timely.
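The analogy between tone mapping and disparity retargeting can be illustrated with a minimal sketch. It assumes a single global logarithmic curve, which is far simpler than the local, perceptually informed operators referred to above. The function name `compress_disparity`, its parameters, and the choice of pixel units are hypothetical and are not taken from the group's work.

```python
import math


def compress_disparity(d, d_max_in, d_max_out):
    """Map a signed disparity d (assumed in pixels, |d| <= d_max_in) into
    [-d_max_out, d_max_out] with a logarithmic curve.

    Illustrative assumption: a global log curve, by analogy to a global
    logarithmic tone mapping operator. No perceptual model is involved.
    """
    if d_max_in <= 0 or d_max_out <= 0:
        raise ValueError("disparity ranges must be positive")
    sign = math.copysign(1.0, d)
    # log1p keeps the mapping well defined and equal to zero at d == 0.
    return sign * d_max_out * math.log1p(abs(d)) / math.log1p(d_max_in)


if __name__ == "__main__":
    # Compress an input range of +/-100 px into +/-20 px.
    for d in (-100.0, -10.0, 0.0, 10.0, 100.0):
        print(d, compress_disparity(d, 100.0, 20.0))
```

As with a logarithmic tone curve, the mapping is monotonic and allots a larger share of the output range to small input values, so depth order is preserved while the extremes are compressed most strongly.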
As a separate research track, we investigate image quality metrics that are sensitive to spatial image content and observer adaptation state. We envision a convergence of the two tracks, in which image quality assessment is extended first to stereoscopic images and then to light fields.
A significant aspect of our research vision is raising awareness of the importance of HVS modeling for computer graphics and image processing. We pursue this goal by co-authoring textbooks, such as the one on High Dynamic Range Imaging (over 1,100 Google Scholar citations). Because HDRI has since matured as a research field and our view of it has evolved significantly, we prepared an 80-page summary of the field for the Wiley Encyclopedia of Electrical and Electronics Engineering (in collaboration with R. Mantiuk and Hans-Peter Seidel). Another important mission of our group is training PhD students in applied perception, who later carry this expertise into new research areas. Four former PhD students, who left the group and started independent research careers, published work with strong perceptual components at ACM SIGGRAPH and SIGGRAPH Asia 2014.
Research Areas and Achievements
Audience interest in stereoscopic imaging is declining again, as current display technologies introduce uncomfortable differences with respect to real-world observation conditions. We argue that such differences can be reduced by careful processing of the 3D content. For example, sudden temporal depth changes, such as those introduced by scene cuts in films, are unlikely in the real world, yet they require the eye vergence angle to adapt to the new depth in the presence of conflicting accommodation constraints. Through a series of eye tracking experiments, we derived a model of the vergence response to depth changes, which enables a number of strategies for minimizing the audience's adaptation time.
Another important issue is material rendering on 3D displays. For example, human binocular perception of reflective and refractive materials differs substantially from the perception of diffuse surfaces: reflections and refractions have a depth of their own, which differs from the depth of the object surface. In a custom ray tracer, we solved a small optimization problem at every pixel: for each layer of reflection or refraction, a disparity is found such that the resulting image can be easily fused by the HVS, while the depth ordering of the layers and the relative distances between them are maintained. This way we avoid the problem of uncomfortable binocular rivalry, which often arises when physically-based rendering is employed. Another factor that may affect material appearance is the way film grain is handled. The most common solution projects the grain onto the scene geometry, which often yields unnatural results, changes material perception, and emphasizes inaccuracies of the 2D-to-3D conversion process. We developed an advanced method of grain positioning that scatters the grain in the scene space. The result is comfortable to watch, and the appearance of the material is kept independent of the appearance of the medium, i.e., an analog film featuring such grain.
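The constraints on reflection and refraction layers described above (fusible disparities, preserved depth ordering, preserved relative distances) can be illustrated with a deliberately simple stand-in. The sketch below replaces the per-pixel optimization with a uniform scaling, and it assumes that fusibility can be approximated by a single disparity limit. The name `fit_layer_disparities` and the `comfort_limit` parameter are hypothetical.

```python
def fit_layer_disparities(layers, comfort_limit):
    """Scale per-layer disparities so that none exceeds comfort_limit.

    layers: signed disparities for one pixel, one value per reflection or
            refraction layer (assumed units: pixels).
    comfort_limit: largest absolute disparity assumed to be fusible.

    Uniform scaling keeps the order of the layers and the ratios of their
    disparities. It is a stand-in for, not a reproduction of, the
    optimization solved in the ray tracer.
    """
    if comfort_limit <= 0:
        raise ValueError("comfort_limit must be positive")
    peak = max((abs(d) for d in layers), default=0.0)
    if peak <= comfort_limit:
        return list(layers)  # already within the assumed fusible range
    scale = comfort_limit / peak
    return [d * scale for d in layers]


if __name__ == "__main__":
    print(fit_layer_disparities([2.0, 10.0, 40.0], 20.0))
```

Layers that already lie within the limit are returned unchanged, so the sketch only intervenes where fusion would otherwise be at risk.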
Based on a series of psychophysical experiments conducted on an HDR display, we developed a model of depth perception at scotopic luminance levels. By analogy to tone mapping of night scenes, this model can be used to match the depth appearance of a scotopic scene shown on a photopic (standard) display to the depth appearance that would be perceived if the scene were actually scotopic.
We have developed stereo-perception models that allow us to manipulate disparity in order to improve viewing comfort, but we found that perception of object motion can be affected in the process. We proposed an optimization scheme to preserve stereo-motion of content that was subject to arbitrary disparity manipulations, based on a perceptual model of temporal disparity changes.
We are also interested in lossy depth compression, which is an alternative and often desirable way of broadcasting and streaming 3D video content. In particular, we investigated the decorrelation between the depth signal and the luma and chroma signals at depth discontinuities. We also measured the HVS sensitivity to depth changes and proposed a model of spatial masking and self-masking of a depth signal.
Quality metrics are employed to verify algorithms at all stages of the rendering and transmission pipelines. While the vast majority of existing image quality metrics are tuned for compression and transmission artifacts, we have developed specialized metrics for detecting rendering artifacts. We found that machine learning can be used successfully in the context of full-reference rendering quality metrics. We presented an extensive analysis of feature descriptors for objective image quality assessment. To this end, we explored a large space of possible features, including components of existing image quality metrics as well as many traditional computer vision and statistical features.
Previous image quality metrics ignored the fact that the HVS compensates for geometric transformations, e.g., typically we perceive an image and its moderately rotated copy as identical. We proposed a metric that accounts for the effect of these transformations, which results in elevating the HVS thresholds for detecting image differences. Our quality metric can be applied to re-photography and re-rendering and allows for a better judgment of structural image differences, while ignoring minor registration problems.
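The idea of ignoring minor registration problems can be illustrated with a toy comparison. The sketch below is not the proposed metric, which elevates HVS detection thresholds. Instead, it swaps in a much cruder technique: taking the smallest mean absolute difference over a few integer translations. Images are assumed to be equally sized grayscale grids stored as nested lists, and both function names are hypothetical.

```python
def mean_abs_diff(a, b, dx, dy):
    """Mean absolute difference between a and b shifted by (dx, dy),
    computed only over the region where the two images overlap."""
    h, w = len(a), len(a[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                total += abs(a[y][x] - b[yy][xx])
                count += 1
    return total / count if count else float("inf")


def shift_tolerant_diff(a, b, max_shift=1):
    """Smallest mean absolute difference over all integer shifts up to
    max_shift pixels in each direction (illustrative assumption only)."""
    shifts = range(-max_shift, max_shift + 1)
    return min(mean_abs_diff(a, b, dx, dy) for dy in shifts for dx in shifts)


if __name__ == "__main__":
    a = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    b = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    print(mean_abs_diff(a, b, 0, 0))   # plain pixel-wise difference
    print(shift_tolerant_diff(a, b))   # difference after best small shift
```

A plain pixel-wise comparison penalizes the one-pixel offset between `a` and `b`, whereas the shift-tolerant variant treats the two images as matching. Rotations and other transformations mentioned above are outside the scope of this sketch.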