SAP '23: ACM Symposium on Applied Perception 2023

SESSION: Augmenting Reality

The Impact of Reflection Approximations on Visual Quality in Virtual Reality

Virtual Reality (VR) systems rely on established real-time rendering techniques to uphold a strict performance budget of a few milliseconds per frame. The utilized techniques are based on approximations that trade speed-up for the correctness of the generated depth cues. The visual impact of these trade-offs, however, remains largely unexplored in the context of VR. In this paper we focus specifically on the perception of distorted specular reflections. Our research goal is to quantify the impairment of common reflection approximations on the resulting visual quality in VR. To this end, a subjective quality assessment is conducted, where participants rate the visual difference and visual annoyance of multiple established reflection approximations in a side-by-side comparison to a raytraced reference solution. We introduce the first dataset of its kind, containing 9 specular reflection approximations evaluated for two object types and three material smoothness values. Our results show the quality benefits associated with currently used reflection methods, and we discuss the implications for rendering VR environments.

Perceiving Absolute Distance in Augmented Reality Displays with Realistic and Non-realistic Shadows

Although distance perception to Augmented Reality (AR) objects has been studied for decades, little is known about absolute distance perception with the newest available AR displays. One significant distinction in categories of head-worn AR displays is whether they are optical see-through (OST) or video see-through (VST). These two types of devices have different methods of rendering that could affect the cues available for perceiving distance. Specifically, rendering cast shadows can be challenging, especially in OST displays that rely on additive light for rendering, and there may be alternative shadow shading methods that are equally effective for conveying depth cues. The current study tests absolute egocentric distance judgments to targets 3 to 6 meters away from an observer with two types of shadows, in two types of AR displays, the Hololens 2 (OST) and the Varjo XR-3 (VST). Shadows were realistic cast shadows or non-realistic shadows in the form of a stylized ring placed beneath the object. Participants verbally reported perceived distance to spherical virtual targets presented on or above the ground, viewed through the displays in a real-world classroom. We found overall distance underestimation in both devices, but estimates were more accurate with the Hololens 2 than with the Varjo XR-3. There was little support for a difference in accuracy of estimations between shadow conditions or position on or above the ground (confirmed by a Bayesian analysis), suggesting that non-realistic shadows may be a good option for providing additional shading cues for depth in AR.

SESSION: Perceiving Agents

Effects of Body Type and Voice Pitch on Perceived Audio-Visual Correspondence and Believability of Virtual Characters

We examined the effects of virtual characters’ body type and voice pitch on perceived audio-visual correspondence and believability. For our within-group study (N = 72), we developed nine experimental conditions using a 3 (body type: ectomorph vs. mesomorph vs. endomorph body types) × 3 (voice pitch: low vs. medium vs. high fundamental frequency [F0]) design. We found statistically significant main effects from voice pitch and statistically significant interaction effects between a virtual character’s body type and voice pitch on both the level of perceived audio-visual correspondence and believability of female and male virtual characters. For female virtual characters, we also observed an additional statistically significant main effect from body type and a statistically significant interaction effect between the participant’s biological sex and the virtual character’s voice pitch on both perceived audio-visual correspondence and believability. Moreover, the results show that perceived believability is highly correlated to perceived audio-visual correspondence. Our findings have important practical implications in applications where the virtual character is meant to be an emotional or informational guide that requires some level of perceived believability, as the findings suggest that it is possible to enhance the perceived believability of the virtual characters by generating appropriate voices through pitch manipulation of existing voices.

The Stare-in-the-Crowd Effect When Navigating a Crowd in Virtual Reality

Nonverbal communication is paramount in daily life, as well as in populated virtual reality (VR) environments. In this paper, we focused on gaze behaviour, which is key to initiating and driving social interactions. Previous work on photographs and on virtual agents showed the importance of gaze, even in the presence of multiple stimuli, by demonstrating the stare-in-the-crowd effect: humans detect gazes directed towards them faster, and observe them longer, than averted ones. While previous studies focused on static scenarios, which fail to represent the complexity of real-life social interactions, we propose to explore the stare-in-the-crowd effect in dynamic situations. To this end, we designed a within-subject experiment where 21 users navigated a virtual street through an idle or moving crowd of virtual agents. Agents’ gaze was manipulated to display averted, directed, or shifting gaze. We analysed the user’s gaze (fixations, dwell time) and locomotor behaviours (path decisions, proximity to agents) as well as their social anxiety. Results showed that the stare-in-the-crowd effect is preserved when walking through both types of crowd, and that social anxiety decreases gaze interaction time and affects proximity behaviours in the case of agents with directed gazes. However, virtual agents’ gaze did not elicit significant changes in users’ locomotion. These findings highlight the importance of considering virtual agents’ gaze when creating VR environments, and open future work perspectives to better understand factors that would strengthen or decrease this effect at gaze and locomotor levels.

Investigating the effect of visual realism on empathic responses to emotionally expressive virtual humans

With the remarkable improvement in technical systems for generating realistic virtual humans, there comes a requirement to quantify the effects that different aspects of realism can have on users. The study outlined here sought to advance research on emotion perception and virtual humans by assessing basic empathic responses to high fidelity emotionally expressive characters. We report findings on participants’ experiences of cognitive, affective and compassionate empathy, as well as measurements for the uncanny valley at two levels of visual realism. We find that the levels of emotion expressed by virtual humans within our study influenced ratings of empathy and the uncanny valley, with positive and negative valence having significant effects on empathic responses and perceived appeal. We discuss these findings in relation to studies which have measured empathy responses to animated characters, as well as notable differences in uncanny valley measurements.

SESSION: Moving and Doing

Effect of Hanger Reflex on Detection Thresholds for Hand Redirection during Forearm Rotation

Hand redirection is a technique used in virtual reality (VR) to alter the virtual hand position from the real hand position, enabling extended interaction in VR. This technique capitalizes on the visual system’s dominance over proprioception, whereby discrepancies between the virtual hand mapping and the reality below a specific threshold are indiscernible to users. However, these detection thresholds are often minute. In this study, we explore the impact of haptic stimuli known as the ‘hanger reflex’ on the detection threshold of hand redirection, specifically concerning forearm pronation, which involves rotating the user’s palm downward. To achieve this, we conducted a user study that involved manipulating the degree of forearm rotation and measuring detection thresholds in three conditions: Neutral, where no force was applied; Inward, where haptic feedback was provided to match the direction of hand rotation through the hanger reflex; and Outward, where haptic feedback was provided in the opposite direction of hand rotation. The results demonstrate that, in the Neutral condition, the virtual degree of rotation can vary between 0.896 and 1.136 times the actual amount without being perceptible to users. However, under the Inward condition, this range increases to between 0.888 and 1.204 times the actual amount, representing a significant expansion of the detection threshold range. Consequently, our findings contribute to the field of upper limb rehabilitation.
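
The reported gain ranges can be compared with a little arithmetic. The sketch below uses only the four threshold values quoted in the abstract; the function names and the 90-degree example rotation are illustrative assumptions, not part of the study.

```python
# Detection-threshold gains quoted in the abstract
# (virtual rotation divided by real rotation).
NEUTRAL = (0.896, 1.136)
INWARD = (0.888, 1.204)


def range_width(bounds):
    """Width of the interval of gains that go unnoticed."""
    lower, upper = bounds
    return upper - lower


def undetected_virtual_range(real_degrees, bounds):
    """Virtual rotations (in degrees) that stay below threshold
    for a given real rotation. The real rotation is a hypothetical input."""
    lower, upper = bounds
    return real_degrees * lower, real_degrees * upper


neutral_width = range_width(NEUTRAL)  # about 0.240
inward_width = range_width(INWARD)    # about 0.316
# Relative growth of the usable gain range under the Inward condition.
relative_expansion = inward_width / neutral_width - 1  # about 0.317
```

Under these numbers, a hypothetical 90-degree real pronation could be shown as roughly 80.6 to 102.2 degrees in the Neutral condition and roughly 79.9 to 108.4 degrees in the Inward condition.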

Foveated Walking: Translational Ego-Movement and Foveated Rendering

The demands of creating an immersive Virtual Reality (VR) experience often exceed the raw capabilities of graphics hardware. Perceptually-driven techniques can reduce rendering costs by directing effort away from features that do not significantly impact the overall user experience while maintaining a high level of quality where it matters most. One such approach is foveated rendering, which allows for a reduction in the quality of the image in the peripheral region of the field-of-view where lower visual acuity results in users being less able to resolve fine details. 6 Degrees of Freedom tracking allows for the exploration of VR environments through different modalities, such as user-generated head or body movements. The effect of self-induced motion on rendering optimization has generally been overlooked and is not yet well understood. To explore this, we used Variable Rate Shading (VRS) to create a foveated rendering method triggered by the translational velocity of the users and studied different levels of shading Level-of-Detail (LOD). We asked 10 participants in a within-subjects design to report whether they noticed a degradation in the rendering of a rich environment when performing active ego-movement or when being passively transported through the environment. We ran a psychophysical experiment using an accelerated stochastic approximation staircase method and modified the diameter and the LOD of the peripheral region. Our results show that self-induced walking can be used to significantly improve the savings of foveated rendering by allowing for an increased size of the low-quality area in a foveated algorithm compared to the passive condition. After fitting psychometric functions showcasing the percentage of correct responses related to different shading rates in the two types of movements, we also report the threshold severity (75%) point for when participants are able to detect such degradation. We argue such metrics can inform the future design of movement-dependent foveated techniques that could reduce computational load and increase energy savings.
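
For readers unfamiliar with the staircase mentioned above, the following is a minimal sketch of a Kesten-style accelerated stochastic approximation rule, assuming a 75% target and a single scalar stimulus level. The step constant, trial count, response callback, and the choice to count a reversal on the trial where it occurs are all illustrative assumptions; the abstract does not give the authors' parameters or implementation.

```python
def asa_staircase(respond, start, step, target=0.75, n_trials=60):
    """Accelerated stochastic approximation staircase (sketch).

    respond(level) -> True if the observer detects the stimulus at `level`.
    Higher levels are assumed to be easier to detect.
    Returns the final level and the (level, response) history.
    """
    level = start
    reversals = 0      # number of changes in response category so far
    previous = None
    history = []
    for trial in range(1, n_trials + 1):
        detected = respond(level)
        history.append((level, detected))
        if previous is not None and detected != previous:
            reversals += 1
        previous = detected
        # The first two trials use the plain 1/n step; afterwards the step
        # shrinks only when the response category changes.
        divisor = trial if trial <= 2 else 2 + reversals
        level -= (step / divisor) * ((1.0 if detected else 0.0) - target)
    return level, history
```

A detected trial lowers the level by a quarter of the current step and a missed trial raises it by three quarters, so the sequence settles where detections occur on about 75% of trials.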

SESSION: Perceptual Machine Learning

Internal Distraction Detection Utilizing EEG Data in an Educational VR Environment

Virtual reality (VR) makes learning more interesting for students and could help them remember what they have learned better than traditional methods. However, a student could get distracted in a VR environment because of stress, wandering thoughts, unwanted noise, outside sounds, etc. Distractions could be classified as either external (due to the environment) or internal (due to internal thoughts). To identify external distractions, previous researchers have used eye-gaze data. Eye-gaze data cannot, however, detect internal distractions because a user may be looking at the educational material in VR while also thinking about something else. We explored the usage of electroencephalogram (EEG) data to detect internal distractions. We designed an educational VR environment and trained three machine learning models: Random Forest (RF), Support Vector Machine (SVM), and k-nearest-neighbors (kNN), to detect internal distractions of students. For data labeling, we considered two window lengths (20 and 30 seconds) starting at 5 seconds after the distraction task started. We did cross-subject and cross-session tests, and our results show that kNN provides a better accuracy (64%) compared to RF and SVM. We also found that the shorter window length of 20 seconds provided a slightly better accuracy than the 30-second window. Our results are not far from random guessing. Therefore, our contribution lies more in the fostering of ideas for future work that must employ more advanced and sophisticated techniques.
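
The labeling scheme described above (a 20- or 30-second window starting 5 seconds after the distraction task begins) can be sketched as a mapping from task onset time to sample indices. The function name and the 256 Hz sampling rate are assumptions for illustration; the abstract does not state the EEG sampling rate.

```python
def label_window(task_start_s, window_s, offset_s=5.0, sampling_rate_hz=256):
    """Return (first_sample, end_sample) of the labeled EEG window.

    task_start_s: time at which the distraction task started, in seconds.
    window_s: window length in seconds (20 or 30 in the abstract).
    offset_s: delay after task start before the window begins (5 s in the abstract).
    sampling_rate_hz: assumed sampling rate, not given in the abstract.
    The end index is exclusive.
    """
    start_s = task_start_s + offset_s
    end_s = start_s + window_s
    return round(start_s * sampling_rate_hz), round(end_s * sampling_rate_hz)


# Example: a distraction task that starts 60 s into the recording.
short_window = label_window(60.0, 20)  # covers 65 s to 85 s
long_window = label_window(60.0, 30)   # covers 65 s to 95 s
```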

SESSION: Agency and Embodiment

Effect of Avatar Anthropomorphism on Bodily Awareness and Time Estimation in Virtual Reality

The time elapsed during a virtual reality (VR) experience is estimated to be short. Time estimation, a feeling related to timescales longer than a few seconds, is thought to be related to interoception. The shortening of time estimation may be caused by the VR experience distorting bodily awareness based on interoception; however, the details of this change have not been clarified. The characteristics of avatars are likely to affect bodily awareness. Therefore, based on the hypothesis that bodily awareness during a VR experience is altered by manipulating the avatar’s degree of anthropomorphism, this study investigated the effect of avatar anthropomorphism on bodily awareness and time estimation. In an experiment, participants performed a heartbeat discrimination task and a one-minute time estimation task, in the real world (Real condition), in the virtual world using a human avatar (VR-Human condition), and in the virtual world using a robot avatar (VR-Robot condition). The results showed that the perceived sense of agency was significantly lower in the VR-Human condition than in the VR-Robot condition, and interoceptive accuracy was significantly higher in the VR-Human condition than in the VR-Robot condition. However, there were no differences in time estimation between conditions. These results indicate that it is possible to conduct experiments to manipulate bodily awareness by manipulating the avatar characteristics. These findings can provide insights into understanding the relationship between bodily awareness and time estimation.

The Effect of Sense of Agency on Self-Efficacy Beliefs: A Virtual Reality Paradigm

A sense of control over the environment can stem from mere motor control to overarching belief systems of control. Sense of agency is defined as perceiving oneself as the cause of an action or its effects. It can be conceptualized as the low-level experience of online motor control over one’s actions. Self-efficacy is the high-level belief in one’s ability to achieve intended goals. Both constructs have been frequently studied on their own, but this is the first study that empirically investigates a possible link between the two. To this end, we conducted a virtual reality (VR) experiment in which participants had to trace shapes while experiencing both movement and feedback distortions. The experiment used a 2x2 design with the first factor being the translation of the participant’s movements into VR (accurate vs distorted) and the second factor being feedback upon task completion (real vs hyper-positive). We found that these two factors manipulated the sense of agency and, in turn, influenced self-efficacy, and see this as a first step in the investigation of a possible causal link between the two constructs. Thus, the constructs of agency and self-efficacy appear more closely linked than previous research suggests. Future research targeting the sense of agency as a bottom-up influence on self-efficacy beliefs holds promising implications for both clinical and positive psychological interventions as well as motor rehabilitation.

Effects of Virtual Co-embodiment on Declarative Memory-Based Motor Skill Learning

This study investigated the learning effect and skill retention when virtual co-embodiment, in which movements of people were weighted and averaged into a single avatar, was used to learn motor skills requiring declarative memory. Previous studies have shown that virtual co-embodiment promotes the efficiency of motor skill learning, which relies on procedural memory such as movement procedures. However, declarative memory, such as the connection between specific instructions and actions, plays an important role in learning motor skills. This study compared the learning efficiency and skill retention after one week of repeatedly performing the task of touching a specified combination of virtual spheres with both hands in accordance with the symbols presented, using virtual co-embodiment and in a condition in which the task was performed alone. The results showed that virtual co-embodiment improves learning efficiency for motor skills related to declarative memory as well as procedural memory and promotes long-term retention of skills.
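
The weighted averaging behind virtual co-embodiment can be illustrated as below. This is a minimal sketch that averages 3D positions only; the function name, the tuple representation, and the example weights are assumptions, and the abstract does not say how the study blended rotations or chose its weights.

```python
def co_embodied_position(positions, weights):
    """Blend several users' 3D hand positions into one avatar position.

    positions: list of (x, y, z) tuples, one per user.
    weights: list of non-negative weights, one per user.
    """
    if len(positions) != len(weights):
        raise ValueError("need exactly one weight per position")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted mean computed independently for each axis.
    return tuple(
        sum(w * p[axis] for p, w in zip(positions, weights)) / total
        for axis in range(3)
    )


# Example: a learner and a teacher sharing one avatar hand.
learner = (0.0, 0.0, 0.0)
teacher = (1.0, 2.0, 4.0)
equal_blend = co_embodied_position([learner, teacher], [0.5, 0.5])
teacher_led = co_embodied_position([learner, teacher], [0.25, 0.75])
```

With equal weights the avatar hand sits midway between the two users; shifting weight toward the teacher pulls the shared hand toward the teacher's movement.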