SAP '15: Proceedings of the ACM SIGGRAPH Symposium on Applied Perception


SESSION: Avatars & faces

Perception of strength and power of realistic male characters

We investigated the influence of body shape and pose on the perception of physical strength and social power for male virtual characters. In the first experiment, participants judged the physical strength of varying body shapes, derived from a statistical 3D body model. Based on these ratings, we determined three body shapes (weak, average, and strong) and animated them with a set of power poses for the second experiment. Participants rated how strong or powerful they perceived virtual characters of varying body shapes that were displayed in different poses. Our results show that perception of physical strength was mainly driven by the shape of the body. However, the social attribute of power was influenced by an interaction between pose and shape. Specifically, the effect of pose on power ratings was greater for weak body shapes. These results demonstrate that a character with a weak shape can be perceived as more powerful when in a high-power pose.

Avatar preference selection in game design based on color theory

Selecting color schemes for game objects is an important task, and knowing which colors players prefer is valuable to game designers. Principles of color theory can guide the selection of appropriate colors. This paper presents a perceptual experiment that evaluates some basic principles of color theory applied to game objects, to study whether particular combinations are preferred. An experiment was conducted with 15 participants, each performing a two-alternative forced choice (2AFC) preference task on 236 pairs of images. The pairs were based on color harmonies derived from the colors red, green, and blue. The color harmonies were evaluated against each other and included analogous, complementary, split-complementary, triad, and warm and cool colors. High- and low-saturation conditions were also included. The color harmonies were applied to an existing game character (avatar) and a new object (cube) to study any potential differences in the results. The initial results show that some color harmonies, in particular triad and split-complementary, were generally preferred over others, suggesting that these aspects should be taken into account in game design. Additional results show that color harmonies based on green were not as popular as those based on red and blue.
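
The harmony schemes evaluated correspond to fixed hue offsets on the color wheel. As a minimal sketch (not the authors' stimulus-generation code), the tested harmonies can be derived from a base hue in HSV space as follows; the saturation parameter corresponds to the high/low saturation condition:

```python
import colorsys

# Hue offsets (degrees) for the classic color harmonies; a sketch of the
# relations tested in the paper, not the authors' stimulus code.
HARMONIES = {
    "analogous":           [-30, 0, 30],
    "complementary":       [0, 180],
    "split-complementary": [0, 150, 210],
    "triad":               [0, 120, 240],
}

def harmony_rgb(base_hue_deg, scheme, saturation=1.0, value=1.0):
    """Return the RGB colors of a harmony scheme built on a base hue."""
    return [
        colorsys.hsv_to_rgb(((base_hue_deg + off) % 360) / 360.0,
                            saturation, value)
        for off in HARMONIES[scheme]
    ]

# e.g. a low-saturation triad built on red (hue 0):
print(harmony_rgb(0, "triad", saturation=0.4))
```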

Perception of personality through eye gaze of realistic and cartoon models

In this paper, we conducted a perceptual experiment to determine if specific personality traits can be portrayed through eye and head movement in the absence of other facial animation cues. We created a collection of eye and head motions captured from three female actors portraying different personalities, while listening to instructional videos. In a between-groups experiment, we tested the perception of personality on a realistic model and a cartoon stylisation in order to determine if stylisation can positively influence the perceived personality or if personality is more easily identified on a realistic face. Our results verify that participants were able to differentiate between personality traits portrayed only through eye gaze, blinks and head movement. The results also show that perception of personality was robust across character realism.

SESSION: Materials & color

Analyzing and predicting anisotropic effects of BRDFs

The majority of the materials we encounter in the real world have variable reflectance when rotated about a surface normal. This azimuthally-variable behavior with respect to view and illumination is known as visual anisotropy. Such behavior can be represented by a four-dimensional anisotropic BRDF that characterizes the anisotropic appearance of homogeneous materials. Unfortunately, most past research has been devoted to simpler three-dimensional isotropic BRDFs. In this paper, we analyze and categorize basic types of BRDF anisotropy and use a psychophysical study to assess under which conditions an isotropic appearance can be used without loss of detail in material appearance. To this end, we tested the human impression of material anisotropy on various shapes and under two illuminations. We conclude that subjects' sensitivity to anisotropy declines with increasing complexity of 3D geometry and increasing uniformity of the illumination environment. Finally, we derive and perceptually validate a computationally efficient measure of material visual anisotropy.
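
The azimuthal dependence that defines an anisotropic BRDF can be illustrated with the analytic Ward model, whose specular lobe uses separate roughnesses along the tangent and bitangent. This is an illustrative sketch only, since the paper works with measured four-dimensional BRDFs rather than this model:

```python
import numpy as np

def ward_anisotropic(wi, wo, n, x, y, rho_d, rho_s, ax, ay):
    """Ward anisotropic BRDF: reflectance varies with the azimuth of the
    half vector h relative to the tangent frame (x, y).  Setting ax == ay
    collapses the lobe to an isotropic one."""
    h = wi + wo
    h = h / np.linalg.norm(h)
    cos_i, cos_o, cos_h = wi @ n, wo @ n, h @ n
    expo = -((h @ x / ax) ** 2 + (h @ y / ay) ** 2) / cos_h ** 2
    spec = rho_s * np.exp(expo) / (4 * np.pi * ax * ay * np.sqrt(cos_i * cos_o))
    return rho_d / np.pi + spec

# Rotating the tangent frame about n changes the value only when ax != ay,
# which is exactly the azimuthal variability the paper studies.
n = np.array([0.0, 0.0, 1.0]); x = np.array([1.0, 0.0, 0.0]); y = np.array([0.0, 1.0, 0.0])
wi = np.array([0.0, 0.5, 0.8660254]); wo = np.array([0.5, 0.0, 0.8660254])
print(ward_anisotropic(wi, wo, n, x, y, 0.3, 0.2, 0.1, 0.4))
```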

Multimodal perception of material properties

The human ability to perceive materials and their properties is a very intricate multisensory skill and as such not only an intriguing research subject, but also an immense challenge when creating realistic virtual presentations of materials. In this paper, our goal is to learn about how the visual and auditory channels contribute to our perception of characteristic material parameters. At the center of our work are two psychophysical experiments performed on tablet computers, where the subjects rated a set of perceptual material qualities under different stimuli. The first experiment covers a full collection of materials in different presentations (visual, auditory and audio-visual). As a point of reference, subjects also performed all ratings on physical material samples. A key result of this experiment is that auditory cues strongly benefit the perception of certain qualities that are of a tactile nature (like "hard--soft", "rough--smooth"). The follow-up experiment demonstrates that, to a certain extent, audio cues can also be transferred to other materials, exaggerating or attenuating some of their perceived qualities. From these results, we conclude that a multimodal approach, and in particular the inclusion of sound, can greatly enhance the digital communication of material properties.

Sackcloth or silk?: the impact of appearance vs dynamics on the perception of animated cloth

Physical simulation and rendering of cloth is widely used in 3D graphics applications to create realistic and compelling scenes. However, cloth animation can be slow to compute and difficult to specify. In this paper, we present a set of experiments in which we explore some factors that contribute to the perception of cloth, to determine how efficiency could be improved without sacrificing realism. Using real video footage of several fabrics covering a wide range of visual appearances and dynamic behaviors, and their simulated counterparts, we explore the interplay of visual appearance and dynamics in cloth animation.

SESSION: Distance and size in virtual environments

Egocentric distance perception in the Oculus Rift (DK2)

Perceiving an accurate sense of absolute scale is important for the utility of virtual environments (VEs). Research shows that absolute egocentric distances are underestimated in VEs compared to the same judgments made in the real world, but there are inconsistencies in the amount of underestimation. We examined two possible factors in the variation in the magnitude of distance underestimation. We compared egocentric distance judgments in a high-cost (NVIS SX60) and low-cost (Oculus Rift DK2) HMD using both indoor and outdoor highly-realistic virtual models. Performance more accurately matched the intended distance in the Oculus compared to the NVIS, and regardless of the HMD, distances were underestimated more in the outdoor versus the indoor VE. These results suggest promise in future use of consumer-level wide field-of-view HMDs for space perception research and applications, and the importance of considering the context of the environment as a factor in the perception of absolute scale within VEs.

Evoking and assessing vastness in virtual environments

Many have experienced vastness, the feeling when the visual space seems to extend without limits away from you, making you feel like a small element within the space. For over 200 years, people have been writing about this experience, for example stating that vastness is important to the experience of awe [Mikulak 2015]. Yet vastness has received little attention in empirical research. Specifically, it is unknown which aspects of the visual stimulus contribute to perceived vastness. This may be due to the inherent difficulties in presenting a variety of vast stimuli while varying only specific visual cues. Using virtual reality addresses these difficulties, as this technology provides precise control over the presented visual stimuli. Here we investigate whether the feeling of vastness can be evoked using virtual reality and explore potential objective measures to assess vastness. We used three different measures during this experiment: 1) an avatar height adjustment task, where participants had to adjust an avatar to be equivalent to their own height as viewed from a distance, 2) a distance estimation task, and 3) a subjective vastness rating task. These tasks were performed in four environments: a plain (used in all subsequent environments for the ground and sky surfaces), a forest, a mountain, and the mountain and forest environments combined. Our results indicate that the feeling of vastness can indeed be experienced to various degrees in virtual environments, demonstrating the potential of VR as a tool for exploring the perception of vastness. Yet, taken together, the results suggest that the percept of vastness is a rather complex construct.

The effects of minification and display field of view on distance judgments in real and HMD-based environments

Distance perception is important for many virtual reality applications, and numerous studies have found underestimated egocentric distances in head-mounted display (HMD) based virtual environments. Applying minification to imagery displayed in HMDs is a method that can reduce or eliminate the underestimation [Kuhl et al. 2009; Zhang et al. 2012]. In a previous study, we measured distance judgments via direct blind walking through an Oculus Rift DK1 HMD and found that participants judged distance accurately in a calibrated condition, while minification caused them to overestimate distances [Li et al. 2014]. This article describes two experiments that build on the previous study, examining distance judgments and minification with the Oculus Rift DK2 HMD (Experiment 1) and in the real world with a simulated HMD (Experiment 2). We found statistically significant distance underestimation with the DK2, but the judgments were more accurate than results typically reported in HMD studies. In addition, we discovered that participants made similar distance judgments with the DK2 and the simulated HMD. Finally, we found for the first time that minification had a similar impact on distance judgments in both virtual and real-world environments.
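
One common way to implement minification is to render with a geometric field of view wider than the display's physical field of view. A minimal sketch under that assumption (the specific gains used in the experiments are reported in the paper):

```python
import math

def minified_fov(display_fov_deg, gain):
    """Geometric field of view to render so that imagery appears minified
    by `gain` on a display with the given physical FOV (gain > 1 shrinks
    imagery; gain == 1 is the calibrated condition)."""
    half = math.radians(display_fov_deg) / 2
    return 2 * math.degrees(math.atan(gain * math.tan(half)))

# e.g. rendering for a 100 degree display with 10% minification:
print(minified_fov(100.0, 1.10))
```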

SESSION: Stereo & high frame rate

What makes 2D-to-3D stereo conversion perceptually plausible?

Different from classic reconstruction of physical depth in computer vision, depth for 2D-to-3D stereo conversion is assigned by humans using semi-automatic painting interfaces and, consequently, is often dramatically wrong. Here we seek to better understand why such conversions nonetheless convey a convincing sensation of depth. To this end, four typical disparity distortions resulting from manual 2D-to-3D stereo conversion are analyzed: i) smooth remapping, ii) spatial smoothness, iii) motion-compensated temporal smoothness, and iv) completeness. A perceptual experiment is conducted to quantify the impact of each distortion on the plausibility of the 3D impression relative to a reference without distortion. Close-to-natural videos with known depth were distorted in one of the four above-mentioned aspects, and subjects had to indicate whether the distortion still allows for a plausible 3D effect. The smallest amounts of distortion that result in a significant rejection suggest a conservative upper bound on the quality requirement of 2D-to-3D conversion.
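
As an illustration of the first distortion class, a smooth remapping replaces the true disparity with a smooth, monotone function of it. A hypothetical example of such a curve (the paper tests several distortion magnitudes, not this particular function):

```python
import numpy as np

def smooth_remap(disparity, d_max=1.0):
    """Compress raw disparities into [-d_max, d_max] with a smooth,
    monotone curve: one hypothetical instance of the 'smooth remapping'
    distortion class."""
    scale = max(float(np.max(np.abs(disparity))), 1e-9)
    return d_max * np.tanh(disparity / scale)

print(smooth_remap(np.array([-4.0, -1.0, 0.0, 2.0, 4.0])))
```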

Evaluation of the impact of high frame rates on legibility in S3D film

There is growing interest in capturing and projecting movies at higher frame rates than the traditional 24 frames per second. Yet there has been little scientific assessment of the impact of higher frame rates (HFR) on the perceived quality of cinema content. Here we investigated the effect of frame rate, and associated variables (shutter angle and camera motion) on viewers' ability to discriminate letters in S3D movie clips captured by a professional film crew. The footage was filmed and projected at varying combinations of frame rate, camera speed and shutter angle. Our results showed that, overall, legibility improved with increased frame rate and reduced camera velocity. However, contrary to expectations, there was little effect of shutter angle on legibility. We also show that specific combinations of camera parameters can lead to dramatic reductions in legibility for localized regions in a scene.

4-D spatial perception established through hypercube recognition tasks using interactive visualization system with 3-D screen

We have developed an interactive 4-D visualization system that employs a principal-vanishing-points operation to control the movement of the eye-point and the change in the viewing direction in 4-D space. Different from conventional 4-D visualization and interaction techniques, the system provides intuitive observation of 4-D space and objects by projecting them onto 3-D space in real time from various positions and directions in 4-D space. Our next challenge is to examine whether humans are able to develop a spatial perception of 4-D space and objects through the 4-D experiences provided by the system. In this paper, as a first step toward that aim, we assessed whether participants were able to acquire an intuitive spatial understanding of 4-D objects. In the evaluation experiment, participants first learned the structure of a hypercube. We then evaluated the spatial perception developed during this learning period with tasks of controlling the 4-D eye-point and reconstructing the hypercube from a set of its 3-D projection drawings. The results provided evidence that humans are able to develop 4-D spatial perception by operating the system.
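
The projection step at the core of such a system maps 4-D geometry to 3-D in real time. A minimal sketch for a hypercube viewed from an eye point on the w axis (the paper's system additionally supports arbitrary 4-D eye positions and viewing directions):

```python
import numpy as np
from itertools import product

# The 16 vertices of a unit hypercube centered at the 4-D origin.
vertices = np.array(list(product([-0.5, 0.5], repeat=4)))

def project_4d_to_3d(points, eye_w=3.0):
    """Perspective-project 4-D points onto the 3-D hyperplane w = 0,
    viewed from an eye point on the w axis."""
    scale = eye_w / (eye_w - points[:, 3])   # perspective divide in w
    return points[:, :3] * scale[:, None]

print(project_4d_to_3d(vertices).shape)      # (16, 3)
```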

SESSION: Artefact visibility and guided attention

Using full reference image quality metrics to detect game engine artefacts

Contemporary game engines offer outstanding graphics quality, but they are not free from visual artefacts. A typical example is aliasing, which, despite advanced antialiasing techniques, is still visible to game players. Notable deteriorations are the shadow acne and peter-panning artefacts, related to deficiencies of the shadow mapping technique. Z-fighting, caused by polygons competing for the same depth-buffer values, also significantly affects the quality of the graphics and makes gameplay difficult. These artefacts are laborious to eliminate algorithmically, because they either require computational effort disproportionate to the results obtained, or their visibility depends on ambiguous parameters. In this work we propose a technique in which the visibility of deteriorations is perceptually assessed by human observers. We conduct subjective experiments in which people manually mark the visible local artefacts in screenshots from games. Then, the detection maps averaged over a number of observers are compared with results generated by image quality metrics (IQMs). A simple mathematically-based metric, MSE, and the advanced IQMs S-CIELAB, SSIM, MSSIM, and HDR-VDP-2 are evaluated. We compare the convergence between the detection maps created by humans and those computed by the IQMs. The obtained results show that the SSIM and MSSIM metrics outperform the other techniques. However, the results are not indisputable, because for small and scattered aliasing artefacts the HDR-VDP-2 metric reports results most consistent with the average human observer. Notwithstanding, the results suggest that it is feasible to use the IQM detection maps to calibrate rendering algorithms directly based on an analysis of the quality of the output images.
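
A minimal sketch of the metric side of this comparison, using the scikit-image implementations of two of the evaluated metrics (the paper's exact implementations and parameters may differ); the SSIM map is inverted so that higher values mark more visible distortion, matching the sense of the human detection maps:

```python
import numpy as np
from skimage.metrics import mean_squared_error, structural_similarity

def artefact_map_ssim(reference, screenshot, win_size=7):
    """Per-pixel detection map from SSIM: 1 - local SSIM, so higher
    values indicate more visible distortion."""
    score, ssim_map = structural_similarity(
        reference, screenshot, win_size=win_size, full=True)
    return 1.0 - ssim_map

# Global MSE, the simple mathematical baseline, on a toy example:
ref = np.random.randint(0, 200, (64, 64), dtype=np.uint8)
img = ref.copy()
img[20:30, 20:30] += 40            # inject a small local artefact
print(mean_squared_error(ref, img), artefact_map_ssim(ref, img).max())
```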

Evaluating the local visibility of geometric artifacts

Several perceptually-based quality metrics have been introduced to predict the global impact of geometric artifacts on the visual appearance of a 3D model. They usually produce a single score that reflects the global level of annoyance caused by the distortions. However, besides this global information, in many applications it is also important to obtain information about the local visibility of the artifacts (i.e., a localized distortion measure). In this work we present a psychophysical experiment where observers are asked to mark areas of 3D meshes that contain noticeable distortions. The collected per-vertex distortion maps are first used to illustrate several perceptual mechanisms of the human visual system. They then serve as ground truth to evaluate the performance of well-known geometric attributes and metrics for predicting the visibility of artifacts. Results show that curvature-based attributes demonstrate excellent performance. As expected, the Hausdorff distance is a poor predictor of perceived local distortion, while recent perceptually-based metrics provide the best results.
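
For reference, the Hausdorff baseline evaluated in the paper can be sketched on vertex sets with SciPy; note that proper mesh metrics sample the surfaces densely rather than using vertices alone:

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff(verts_a, verts_b):
    """Symmetric Hausdorff distance between two vertex sets, the classic
    geometric baseline that the paper reports as a poor predictor of
    perceived local distortion."""
    return max(directed_hausdorff(verts_a, verts_b)[0],
               directed_hausdorff(verts_b, verts_a)[0])

a = np.random.rand(100, 3)
b = a + 0.01 * np.random.randn(100, 3)   # a noisy copy of the vertices
print(hausdorff(a, b))
```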

Automatic target prediction and subtle gaze guidance for improved spatial information recall

Humans rely heavily on spatial information to perform everyday tasks. Developing good spatial understanding is highly dependent on how the viewer's attention is deployed to specific locations in a scene. Bailey et al. [2009] showed that it is possible to influence exactly where attention is allocated using a technique called Subtle Gaze Direction (SGD). The SGD approach combines eye tracking with subtle image-space modulations to guide viewer gaze about a scene. The modulations are presented to peripheral regions of the field of view in order to attract the viewer's attention, but are terminated before the viewer can scrutinize them with their high-acuity foveal vision. It was observed that subjects who were guided using SGD performed significantly better in recollecting the count and location of target objects; however, no significant performance improvement was observed in identifying the shape of the target objects [Bailey et al. 2012]. Also, in previous studies involving SGD, the target locations were manually chosen by researchers. This paper addresses these two limitations. We present a novel technique for automatically selecting target regions using visual saliency and key features in the image. The shape-recollection issue is addressed by modulating a rough outline of the target object, obtained using an edge map composed from a pyramid of low-spatial-frequency maps of the original image. Results from a user study show that this approach significantly improved the accuracy of target count recollection, location recollection, and shape recollection without any manual intervention. Furthermore, our technique correctly predicted 81% of the target regions without any prior knowledge of the recollection task being assigned to the viewer. This work has implications for a wide range of applications, including spatial learning in virtual environments, image search, virtual training, and perceptually-based rendering.
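
A sketch of how such a rough outline could be composed from a pyramid of low-spatial-frequency maps, using OpenCV; the pyramid depth and edge-detector thresholds here are assumptions, not the paper's parameters:

```python
import cv2
import numpy as np

def rough_outline(gray, levels=3):
    """Build a rough outline from low-spatial-frequency versions of an
    8-bit grayscale image: repeatedly low-pass and downsample, detect
    edges at each coarse level, upsample, and keep the maximum response."""
    edges = np.zeros(gray.shape, dtype=np.float32)
    level = gray
    for _ in range(levels):
        level = cv2.pyrDown(level)            # low-pass + downsample
        e = cv2.Canny(level, 50, 150)         # edges of the blurred level
        e = cv2.resize(e, (gray.shape[1], gray.shape[0]))
        edges = np.maximum(edges, e.astype(np.float32))
    return edges / 255.0                      # modulation mask in [0, 1]
```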

SESSION: Mixed reality

Myo arm: swinging to explore a VE

In this paper, we use an inexpensive wearable device called the Myo armband (199 USD) to implement a simple arm-swinging algorithm that allows a user to freely explore an HMD-based virtual environment. Using a spatial orientation task, we directly compared our Myo arm-swinging method to joystick locomotion and physical walking. We find that our arm-swinging method outperforms the simple joystick and that spatial orientation is comparable to physically walking on foot. Our arm-swinging method is inexpensive compared to tracking systems that permit exploration on foot, does not suffer from space constraints, and requires less physical energy than walking.
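
The core of an arm-swinging scheme is a mapping from the swing measured by the armband's IMU to a forward velocity. The following is a hypothetical sketch of such a mapping, not the authors' published algorithm:

```python
import numpy as np

def arm_swing_speed(pitch_history, dt, gain=1.2, max_speed=1.4):
    """Map the arm's pitch oscillation (from the armband's IMU) to a
    forward walking speed: faster swings move the user faster.  Heading
    can be taken from the HMD yaw, so the user walks where they look."""
    rate = np.abs(np.diff(pitch_history)) / dt   # angular speed of the swing
    return min(max_speed, gain * float(rate.mean()))

# e.g. 200 Hz pitch samples over the last half second of swinging:
t = np.linspace(0.0, 0.5, 100)
print(arm_swing_speed(np.sin(2 * np.pi * 2 * t), dt=0.005))
```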

The effect of avatar model in stepping off a ledge in an immersive virtual environment

Animated digital self-representations of the user in an immersive virtual environment, self-avatars, have been shown to aid perceptual judgments in the virtual environment and to provide critical information for people deciding what actions they can and cannot take. In this paper we explore whether the form of the self-avatar is important in providing this information. In particular, we vary the self-avatar between no avatar, a simple line-based skeleton avatar, and a full-body, gender-matched avatar, and examine whether its form affects people's judgments of whether they could or could not step off a virtual ledge. Our results replicate prior work showing that having a self-avatar provides critical information for this judgment, but find no effect of the form of the self-avatar on the judgment.

Dyadic interactions with avatars in immersive virtual environments: high fiving

Collaborative immersive virtual environments allow the behavior of one user to be observed by other users. In particular, each user's behavior in such an environment is conveyed through a self-avatar, a digital representation of themselves. In this study we examined dyadic interactions in a collaborative immersive virtual environment when both users were present in the same physical space. This collocation in physical space allows for physical as well as virtual interaction between users. In the context of a common physical gesture, high fiving, we examined whether the form of the self-avatar was important and whether collocation in the physical world provided benefits. We find that the form of the avatar is important but that physical collocation is not. These results reinforce the growing body of evidence that having a full-body avatar in a virtual environment provides benefits, and they are significant because they demonstrate this in the context of a dyadic interaction.

Remembering the physical as virtual: source confusion and physical interaction in augmented reality

This study explored whether people misremember having seen a physical object when they actually had viewed a virtual one in augmented reality (and vice versa). Participants viewed uniquely shaped objects in a virtual form or a physical, 3D-printed form. A camera mounted behind a computer monitor showed either the physical object or an augmented reality version of it on the display. After viewing the full set of objects, participants viewed photographs of each object (taken from the physical version) and judged whether they had originally seen it as a physical or virtual object. On average, participants correctly identified the object format for 60% of the photographs. When participants were allowed to manipulate the physical or virtual object (using a Leap Motion Controller), accuracy increased to 73%. In both cases, participants were biased to remember the objects as having been virtual.

POSTER SESSION: Poster abstracts

CG aided makeup design to understand and manipulate the impression of facial look and attractiveness

Facial color and texture shape the impressions of facial look and attractiveness (e.g., gorgeous, sophisticated, warm-hearted). These impressions can be affected by facial makeup, including foundation, lip makeup, eye makeup, eyebrow makeup, and cheek makeup. Foundation changes facial skin textures and adjusts facial skin tones; lip makeup changes lip colors and textures. However, it is difficult to characterize makeup impressions precisely with questionnaires, because the meaning of the language used in a questionnaire depends on the customer's culture, lifestyle, or country. In addition, a questionnaire cannot measure elements such as color, radiance, and shape, even though these elements influence makeup preference. Therefore, in our previous study, we developed an eyelash makeup design system using computer graphics for quantitative interpretation of makeup impressions. However, it is not well understood which types of color and texture in specific face parts correspond to each impression of facial attractiveness. We aim to understand these correspondences and manipulate facial impressions at will through makeup. In the present study, using Maya, we first create a CG image of an average face shape as the original image. We then manipulate the original image to create nine images with various combinations of makeup, including foundation, lip, eye, eyebrow, and cheek makeup; each of the nine images is intended to convey one specific impression. We evaluate whether the actual visual impressions these images make on people correspond to our intended impressions of attractiveness.

Depth-based subtle gaze guidance in virtual reality environments

Virtual reality headsets and immersive head-mounted displays have become commonplace and have found applications in digital gaming, film, and education. An immersive perception is created by surrounding the user of the VR system with photo-realistic scenes, sound, or other stimuli (e.g., haptic) that provide an engrossing experience to the viewer. The ability to interact with the objects in the virtual environment has added greater interest in its use for learning and education. In this proposed work we plan to explore the ability to subtly guide viewers' attention to important regions in a controlled 3D virtual scene. The subtle gaze guidance approach [Bailey et al. 2009] combines eye tracking and subtle image-space modulations to guide the viewer's attention about a scene. These modulations are terminated before the viewer can fixate on them with their high-acuity foveal vision. This approach is preferred over overt techniques that make permanent changes to the scene being viewed, and it has also been tested in controlled real-world environments [Booth et al. 2013]. The key challenge for such a system is the need for an external projector to present modulations on the scene objects to guide the viewer's attention. A VR system, however, enables the user to view and interact with a 3D scene that is close to reality, thereby allowing researchers to digitally manipulate the 3D scene for active gaze guidance.

Directional thermal perception for wearable device

Recently, there has been increasing research interest in thermal feedback. This includes utilizing thermal systems as simple messaging tools, such as a system that uses temperature to convey the importance of messages [Wilson G. 2012] or to help users navigate the road by presenting speed or distance through thermal stimuli on the user's arm [David and Henry 2013]. There has been no research, however, on array-type thermal feedback. We have therefore focused on communication via array-type thermal patterns. As a first step, this poster presents 1) how well subjects can differentiate between the spots at which thermal stimulation is presented, and 2) whether subjects can recognize directional thermal stimulation. To find the answers, we placed the thermal device where subjects felt most comfortable: the wrist. The device was also placed on the back of the neck to mimic a scarf or a Bluetooth headset.

Do cartoons feel pain?: using the virtual hand illusion to test human response to degrees of realism

Virtual reality has reached the consumer market and exhibited the potential to become a large social and commercial platform for mainstream markets. New, low-cost devices for virtual reality or mixed reality, such as the Oculus Rift, Sony's Project Morpheus, and Microsoft's HoloLens, are already available or have been announced, and might even outperform previous high-cost systems [Young et al. 2014]. Techniques to measure hand motions are in the making, with the goal of controlling virtual hands using one's own hand movements and interacting with objects in virtual reality applications. Prototypes such as the combination of the Leap Motion Controller and the Oculus Rift are already available. Applications for these devices will certainly follow.

Evaluating the Uncanny valley with the implicit association test

Despite the elusive term "Uncanny Valley", research on appealing virtual humans that approach realism continues. The theory suggests that characters lose appeal as they approach photorealism (e.g., [MacDorman et al. 2009]). Realistic virtual characters are judged harshly, since the human visual system has acquired more expertise with the featural restrictions of other humans than with the restrictions of artificial characters [Seyama and Nagayama 2007]. Stylisation (making the character's appearance abstract) is therefore often used to prevent virtual characters from being perceived as unpleasant. We designed an experiment to test whether there is a general affinity towards abstract as opposed to realistic characters.

Improving redirection with dynamic reorientations and gains

In head-mounted display systems, the confined size of the tracked space prevents users from navigating virtual environments larger than the tracked physical space. Previous work suggests this constraint can be broken by asking users to back up or turn 180° whenever they encounter a wall in the real world [Williams et al. 2007]. In this work, we propose that the reorientation rate can be dynamically determined based on the user's instantaneous positional information and the shape of the navigable virtual space around the user. We conducted an experiment to compare our proposed dynamic reorientations with the previous Freeze-Turn reorientation. The results show that, with dynamic reorientations, participants walked a significantly longer distance between reorientations than with Freeze-Turn reorientations.
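
A dynamic reorientation of this kind might, for example, scale the turn by proximity to the physical boundary and steer it toward the direction with the most navigable virtual space. The following is a hypothetical sketch of that idea, not the authors' published algorithm:

```python
def reorientation_angle(dist_to_wall, free_space_by_angle):
    """Choose a reorientation dynamically from the user's position: the
    closer the nearest physical wall, the stronger the turn, and the
    turn aims at the candidate direction with the most navigable
    virtual space.

    dist_to_wall:        metres to the nearest tracked-space boundary
    free_space_by_angle: {candidate angle (rad): free metres that way}
    """
    urgency = max(0.0, min(1.0, 1.0 - dist_to_wall / 2.0))
    best = max(free_space_by_angle, key=free_space_by_angle.get)
    return urgency * best   # radians to rotate the virtual world

print(reorientation_angle(0.8, {-1.57: 2.0, 0.0: 6.5, 1.57: 3.0}))
```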

Integration and evaluation of emotion in an articulatory speech synthesis system

We convey a tremendous amount of information vocally. In addition to the obvious exchange of semantic information, we unconsciously vary a number of acoustic properties of the speech wave to provide information about our emotions, thoughts, and intentions [Cahn 1990]. Advances in the understanding of human physiology, combined with increases in the computational power available in modern computers, have made simulation of the human vocal tract a realistic option for creating artificial speech. Such systems can, in principle, produce any sound that a human can make. Here we present two experiments examining the expression of emotion using prosody (i.e., speech melody) in human recordings and in an articulatory speech synthesis system.

The effect of interpersonal familiarity on cooperation in a virtual environment

Immersive virtual environments (IVEs) allow people to experience situations which, because of danger, expense, time, or distance, would not otherwise be available. Moreover, IVEs have been shown to be useful tools for learning and training. However, there are still many unanswered questions about how humans experience and interact with these environments and how this experience differs from the real world. In the experiment presented in this work, we are specifically interested in how effectively two people collaborate within an environment given that they have never met or even seen each other prior to the experiment. We postulate that participants who never meet their partner in a collaborative environment will perform worse than those who are able to interact with their partner prior to performing a task. If this holds true, it could have important implications for long-distance collaboration.

Walking on foot to explore a virtual environment with uneven terrain

Immersive virtual environments (IVEs) provide an opportunity for humans to learn about and experience a place which, because of time, distance, danger, or expense, would not otherwise be available. Since navigation is the most common way users interact with 3D environments, much research has examined how well people navigate and learn the spatial layouts of IVEs. Ideally, a user's experience of an IVE would mimic a real-world experience; however, this does not happen in practice. Thus, much work examines the differences between real-world and comparable virtual experiences. Additionally, the navigation mechanisms used to explore an IVE do not allow exactly the same interactions as the real world. For example, it is difficult to replicate the physical aspect of climbing stairs or walking over rough terrain in an IVE. More work is needed to assess how well people interact and learn in different types of virtual worlds, whose ground planes will not necessarily be flat. The question thus becomes: how well can a person maintain comparable spatial awareness in an environment with uneven terrain? Prior work suggests that spatial awareness of an IVE is best when users physically explore it on foot. In this work, we therefore examine subjects' spatial orientation as they traverse hilly virtual terrain on foot while physically locomoting on a flat surface.