SAP '25: ACM Symposium on Applied Perception 2025

SESSION: Virtual Humans

'Talk to Me': Comparing the Effects of Virtual Character Interaction Fidelity on Perceived Usability, User Experience, and Acceptance in a Mental Health Application

Mental Health (MHealth) applications using interactive Embodied Conversational Agents (ECAs) can simulate face-to-face experiences through natural dialogue that mimics therapeutic human interaction using both verbal and nonverbal behaviors. However, animated, anthropomorphic, High Interaction Fidelity (HIF) conversational virtual agents are rarely used in MHealth applications because their development and execution are resource-intensive and time-consuming. Instead, simpler Low Interaction Fidelity (LIF) designs featuring non-animated virtual personas or characters are more commonplace. The impact of HIF versus LIF virtual characters on perceived user acceptability and experience, particularly among users with depressive symptoms as compared to users without depressive symptoms, remains underexplored. To address this gap, a comparative empirical evaluation was conducted involving participants both with and without depressive symptoms. Results showed a significantly higher preference for the HIF design in terms of agent animation, behavior, and perceived bonding. Participants also reported greater perceived satisfaction, trust, likeability, conversational naturalness, and relational closeness with the HIF agent compared to the LIF design. These findings demonstrate the perceived acceptability and positive user experience of ECAs embedded within Cognitive Behavioral Therapy-based MHealth applications across user groups, highlighting the potential benefits of interactive conversational agents in mental health support.

Emotional Intensity Through the Eyes: Gaze Behavior Toward Expressive Virtual Avatars

Virtual humans are increasingly used in games, films, and interactive media, yet understanding the role of emotional intensity remains critical for expressive character design. This work investigates gaze behavior toward expressive virtual avatars with varying levels of facial expression intensity and lighting colors. Using eye tracking, we analyzed visual attention patterns as participants viewed avatars displaying three emotions (anger, disgust, and joy) at four intensity levels (neutral, low, medium, and high) under three lighting colors (white, red, and blue). Participants rated the perceived intensity of each expression while their gaze data was recorded. Results indicate that participants consistently distinguished between intensity levels, with higher ratings corresponding to more intense expressions. The eye region received the highest number of fixations and visits, though gaze distribution across facial features varied by emotion. Additionally, lighting color significantly affected pupil diameter.

SESSION: Avatars & Entertainment

Embodied Expertise: The Impact of Sommelier Avatars on Wine Flavor Perception

The Proteus effect refers to a phenomenon in which individuals infer expected behaviors and attitudes based on the appearance of their avatars. While previous research has explored the influence of the Proteus effect on various perceptual and cognitive domains, this study is the first to investigate its impact on taste perception. We hypothesized that embodying an avatar representing a wine expert, a sommelier, would enhance participants’ ability to perceive more complex wine flavors. To test this, we examined the changes in wine flavor perception when participants evaluated red wine in a virtual environment while embodying three different avatars (Hands-only, Non-expert, and Expert avatars). The results demonstrated that participants selected significantly more fruit-related terms (e.g., berry, cherry) when using the Expert avatar compared to the Non-expert avatar, as measured by an aroma wheel-based evaluation. Moreover, participants’ impressions of the avatar’s wine expertise positively correlated with their ratings of the wine’s fruitiness, mellowness, smoothness, and overall likability. These findings suggest that embodying avatars perceived as experts influences descriptive vocabulary, possibly by facilitating the recall of prototypical flavor-related terms.

From Virtual to Physical: Investigating the Carryover Effects of Avatar-Mediated Communication in Intergenerational Contexts

Previous research has demonstrated that avatars concealing personal attributes reduce psychological barriers and promote physical proximity during virtual interactions; however, it remains unclear whether these effects persist in later face-to-face encounters. To investigate the persistence of these effects, this study examined how avatar-mediated communication affects impression formation and subsequent face-to-face interaction, using intergenerational communication as a case study. In such contexts, differences in life experience and social discomfort can lead to reduced collaboration efficiency and potential prejudice. We conducted an experiment with two dialogue sessions under two conditions, involving young adults (aged 20 to 39) and older adults (aged 40 to 59), with a minimum age difference of 20 years between each pair. The two conditions were: (1) avatar-based communication followed by face-to-face communication, and (2) face-to-face communication in both sessions. Our findings indicate that while avatar-based communication initially reduced interpersonal distance during the virtual session, this effect did not carry over to subsequent face-to-face interactions. Qualitative analysis revealed that participants formed impressions and estimated their partner’s age primarily based on voice and conversation content rather than avatar appearance. These results suggest that avatars have limited influence in short-term hybrid communication contexts, particularly when users do not strongly identify avatars as their interaction partners. Further research should explore the specific conditions under which avatars can effectively contribute to impression formation, with potential applications for supporting intergenerational communication in both virtual and face-to-face contexts.

Evaluating the Text-to-Video-to-Motion Pipeline for Realistic Avatar Animation: A Comparative User Study

AI-driven avatar motion generation relies on extensive yet stylistically narrow or motion-capture-dependent text-motion datasets, limiting motion diversity and often resulting in implausible movements when user requests fall outside the training distribution. Among emerging solutions, our approach integrates Large Language Models (LLMs) and video generation to improve motion diversity and alignment with prompts. Prior studies evaluated interface usability but lacked rigorous comparisons of motion quality against state-of-the-art baselines and analyses of the factors contributing to users’ perception of motions. In this study, we conducted a within-subjects study (N=20) to quantitatively evaluate our method against a commercial and an open-source baseline regarding realism, text-motion consistency, diversity, and user perception of the animated avatars. We also performed a qualitative comparison by generating motions for prompts collected from the participants. Results show that our method surpasses the open-source baseline in motion diversity and alignment with prompts, despite lagging behind the commercial model. In addition, our method better handles abstract motions or those less likely to be covered by existing motion-capture datasets. We discuss factors influencing perceptual evaluations in generated motions and the strengths and limitations of integrating LLMs and video generation models.

SESSION: XR Learning, Health and Wellbeing

Effects of Virtual Co-embodiment on Collaborative Rhythmic Movements

Rhythm synchronization plays a vital role in social contexts such as music and sports. This study examines how virtual co-embodiment, in which two users share control of a single avatar by averaging their movements, affects individual timing stability and dyadic synchronization during rhythmic tasks. In a between-pairs design, 20 participants (10 dyads) practiced arm-swing movements across three tempos (40, 80, 120 BPM) and three timing conditions (early-timing, on-time, late-timing). After brief practice (two trials per condition), context-dependent effects emerged: virtual co-embodiment improved individual timing precision at low tempo (40 BPM, early-timing), potentially by mitigating audiovisual conflict; and it facilitated immediate improvements in dyadic synchronization, quantified as relative phase stability, during high-tempo (120 BPM, late-timing) trials. Practice-related gains were also observed across conditions, regardless of embodiment. These findings demonstrate the short-term adaptability of joint motor performance under shared control, with implications for collaborative training in music, dance, and rehabilitation.
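
The abstract describes the movement blending only at a high level; a minimal Python sketch of the kind of per-frame averaging it refers to, with an assumed equal 50/50 weighting and made-up joint positions, might look like the following.

import numpy as np

def blend_pose(pose_a, pose_b, weight_a=0.5):
    """Blend two users' tracked joint positions into one shared-avatar pose.

    pose_a, pose_b: arrays of shape (n_joints, 3) holding tracked positions.
    weight_a: contribution of user A; 0.5 corresponds to the equal averaging
    described in the abstract. Shapes and values here are illustrative.
    """
    pose_a = np.asarray(pose_a, dtype=float)
    pose_b = np.asarray(pose_b, dtype=float)
    return weight_a * pose_a + (1.0 - weight_a) * pose_b

# Example: one tracked wrist position per user (meters), averaged each frame.
user_a_wrist = np.array([[0.00, 1.40, 0.30]])
user_b_wrist = np.array([[0.10, 1.44, 0.26]])
shared_wrist = blend_pose(user_a_wrist, user_b_wrist)   # -> [[0.05, 1.42, 0.28]]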

Toward Nurturing Self-Expansion Preference: The Impact of Repetition and Successful Experiences in Virtual Reality Occupational Simulations

Self-expansion preference is characterized by a positive orientation toward discovering new aspects of oneself and has been shown to correlate positively with indicators of well-being. Previous research has primarily treated self-expansion preference as a stable individual trait, with little attention given to methods for actively cultivating it. As a first step toward developing approaches to foster self-expansion orientation for enhancing people’s well-being, this study focused on the potential of new successful experiences to nurture self-expansion orientation. To this end, we developed a virtual reality (VR) occupational experience system that allows users to virtually experience unfamiliar professions, and conducted experiments and user interviews with university students to investigate how different styles of successful experiences influence self-expansion preference. The results showed that participants who repeatedly experienced the fishing profession in VR five times demonstrated a significant increase in self-expansion preference, whereas participants who experienced five different occupations once each did not show similar changes. Post-experiment interview results suggested that such increases are influenced less by objective success or external feedback, and more by the perceived sense of having succeeded, which is facilitated by sufficient engagement enabling individuals to gain confidence in the task.

SESSION: Perception in Interactive Systems

Reading Between the Colors: Enhancing Feature Perception in Colormaps Through Perceptual Transformations

Colormaps are a common and powerful tool in scientific visualization for displaying continuous data in color that highlights patterns and features. While a wide range of colormaps are available, their differing perceptual and functional characteristics make it difficult to know which one to choose. In this paper, we explore transformations of colormap distributions that preserve perceptual linearity while enhancing local contrast to amplify just-noticeable differences in the data, effectively bridging the gap between accuracy and visual salience. Our approach aims to retain the perceptual validity of the data while addressing a key challenge: many standard colormaps either obscure subtle features or introduce perceptual distortions. By rebalancing how color is distributed across the data distribution, our method provides a principled way to improve feature visibility without sacrificing the interpretability or accessibility of the visualization.
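
The abstract does not spell out the transformation itself. As one hedged illustration of "rebalancing how color is distributed across the data distribution," the Python sketch below passes data values through their empirical CDF before indexing a perceptually uniform colormap, so densely populated value ranges receive a larger share of the color range. This is an assumption for illustration, not the authors' method; the colormap name and bin count are arbitrary choices.

import numpy as np
import matplotlib.pyplot as plt

def cdf_rebalanced_colors(values, cmap_name="viridis", n_bins=256):
    """Map data through its empirical CDF before colormap lookup.

    Regions where the data are dense get a larger share of the colormap,
    which raises local contrast for subtle features (an illustrative
    stand-in for the perceptual rebalancing described above).
    """
    values = np.asarray(values, dtype=float)
    hist, edges = np.histogram(values, bins=n_bins)
    cdf = np.cumsum(hist).astype(float)
    cdf /= cdf[-1]                                  # normalize to [0, 1]
    remapped = np.interp(values, edges[1:], cdf)    # value -> CDF position
    return plt.get_cmap(cmap_name)(remapped)        # RGBA color per data point

# Example: skewed data where most values cluster near zero.
data = np.random.gamma(shape=2.0, scale=1.0, size=1000)
colors = cdf_rebalanced_colors(data)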

It Is Only Eco-Logical: Direct Perception for XR Research

Direct perception, as part of the ecological approach to perception, defines a relationship between an organism and its environment that is specified by lawful information. Researchers can apply the ecological approach to eXtended Reality (XR) work to obtain a richer understanding of users’ perception-action coordination in novel virtual settings. To encourage widespread adoption of this theoretical framework, this methodological paper introduces four major concepts from the ecological approach that are highly relevant to XR applications: the study of calibration and attunement, affordances, action-based responses, and intrinsic scaling for measurements. We also provide an overview of the existing literature to illustrate how researchers may use these concepts to inform and test their designs. The goal of this work is to increase awareness of the value of the ecological approach, and to provide a practical, evidence-based reference for researchers interested in applying these techniques in XR research.
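
Of the four concepts, intrinsic scaling is the most directly computational: environmental quantities are expressed in body-scaled rather than extrinsic metric units. The toy Python helper below illustrates the idea with eye height as an assumed body unit; it is not a procedure from the paper.

def intrinsic_scale(distance_m, eye_height_m):
    """Express an environmental distance in body-scaled (intrinsic) units.

    Ecological approaches favor measurements scaled to the perceiver's body
    (here, eye heights) over extrinsic metric units; this trivial helper is
    an illustration of that idea only.
    """
    return distance_m / eye_height_m

# Example: a 6.4 m gap seen by an observer with a 1.6 m eye height spans 4 eye heights.
gap_in_eye_heights = intrinsic_scale(6.4, 1.6)   # -> 4.0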

SESSION: Haptics & Feedback

Pulling Illusions on Seating Surfaces: Force Feedback Induced by Distributed Asymmetric Vibrations on Seats

It has been reported that asymmetric vibrations applied to the hands or fingers induce pulling illusions. However, it remains unclear whether similar effects occur in other body regions. This study investigates pulling illusions induced by distributed asymmetric vibrations applied to seat surfaces, focusing on the buttocks and back, which come into contact with the chair during sitting. To achieve this, we developed a system that integrates multiple vibrators into the seat to provide distributed asymmetric vibrations. In Experiment 1, we examined the effects of different waveform and frequency conditions. The results showed that pulling illusions could be induced in the buttocks and lower back, and that 65 Hz sine-wave signals were the most effective. In Experiment 2, we investigated the effects of transitions, in which the direction of the illusory force changes, and of variations in applied pressure. The results demonstrated that pulling illusions could also be induced in the upper back. Additionally, the illusions were more pronounced when a transition was present and when the applied force was approximately 2 kgf.
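
The drive signals themselves are not reproduced here. As a rough illustration, one common way to obtain an asymmetric periodic vibration is to add a phase-shifted second harmonic to a base sine, which makes one half-cycle stronger than the other. The Python sketch below uses the 65 Hz fundamental reported as most effective, but the harmonic ratio, phase, and sample rate are assumptions, not the study's parameters.

import numpy as np

def asymmetric_vibration(freq_hz=65.0, duration_s=1.0, sample_rate=2000, skew=0.5):
    """Illustrative asymmetric drive signal: fundamental plus second harmonic.

    The phase-shifted harmonic skews the waveform so one half-cycle has a
    stronger peak than the other, the kind of asymmetry associated with
    illusory pulling sensations. All parameters except the 65 Hz fundamental
    are illustrative assumptions.
    """
    t = np.arange(0.0, duration_s, 1.0 / sample_rate)
    signal = (np.sin(2 * np.pi * freq_hz * t)
              + skew * np.sin(4 * np.pi * freq_hz * t + np.pi / 2))
    return t, signal / np.max(np.abs(signal))   # normalize to [-1, 1]

t, drive = asymmetric_vibration()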

How do people perceive changes in physical bounce model for virtual racket interactions?

Virtual Reality (VR) is now widely used in sports, whether to enhance physical fitness or to improve specific subskills such as anticipation. However, many factors in VR can alter the experience and make it difficult to transfer skills trained in VR to real practice. One of these factors is the physical simulation of the virtual environment, which may produce unexpected behaviours. Hence, if users are athletes in ball-based sports, a VR training simulator should compute ball trajectories that look plausible to them. In this paper, our aim is to evaluate how human perception is influenced by variations in a ball physics model. We explore two properties of human perception: the acceptance threshold, beyond which a deviation from the reference ball trajectory is perceived more than 50% of the time, and the Just-Noticeable Difference (JND) as an indicator of perceptual sensitivity. To this end, we conducted psychophysical experiments in which participants were asked either to only observe, or to observe and hit, virtual bouncing balls simulated with varying coefficients of restitution (COR). We report the acceptance threshold and JND in different conditions. We found that participants detected variations in COR more easily when performing the motor task. Additionally, their sensitivity to variations was globally higher when they first performed the perceptual task alone, before the motor task was introduced. These results contribute to the design of credible VR environments involving bouncing objects, such as for virtual sports.
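
For readers unfamiliar with the manipulated parameter, the coefficient of restitution enters a bounce model by reversing and scaling the velocity component normal to the surface at impact. The Python sketch below shows this standard relation with arbitrary example values; it is a generic bounce model, not the simulator used in the study.

import numpy as np

def bounce(velocity, surface_normal, cor=0.85):
    """Reflect a ball's velocity off a surface using a coefficient of restitution.

    The normal component is reversed and scaled by `cor` (1.0 = perfectly
    elastic, 0.0 = no rebound); the tangential component is left unchanged
    here for simplicity (no friction or spin).
    """
    velocity = np.asarray(velocity, dtype=float)
    n = np.asarray(surface_normal, dtype=float)
    n = n / np.linalg.norm(n)
    v_normal = np.dot(velocity, n) * n
    v_tangent = velocity - v_normal
    return v_tangent - cor * v_normal

# Example: a ball hitting a horizontal table at 4 m/s downward, 2 m/s forward.
v_out = bounce([2.0, -4.0, 0.0], surface_normal=[0.0, 1.0, 0.0], cor=0.85)
# -> [2.0, 3.4, 0.0]: the vertical rebound is scaled by the COR.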

Enhancement of Haptic Size Sensitivity by Briefly Presenting Incongruent Visual-haptic Stimuli

The processing of multisensory information, such as vision and haptics, is crucial for perceiving the properties of surrounding objects. According to the optimal integration framework, each sensory modality is weighted based on its reliability to form a coherent representation. Because the environment changes dynamically, these weights should adapt flexibly. This study examined whether haptic sensitivity increases after experiencing incongruent visual-haptic stimuli. Participants touched two protrusions and judged their size; an incongruent visual-haptic stimulus was presented between touching the two protrusions. If the weight of vision decreases after an incongruent visual-haptic stimulus, the relative weight of haptics should increase compared with a congruent stimulus, and such a change in weights can increase haptic size sensitivity. The results demonstrated that haptic size sensitivity increases after experiencing an incongruent visual-haptic stimulus. We also analyzed the relationship between haptic sensitivity and cardiovascular and respiratory responses such as respiration rate, heart rate, tissue oxygenation index (a surrogate index of blood flow), and blood pressure. We found a correlation between changes in haptic sensitivity and forearm blood flow. These results indicate that the weights of sensory modalities change dynamically, and that forearm blood flow may contribute to this modulation.
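
The reliability weighting invoked here is usually written in its textbook maximum-likelihood form, shown below in LaTeX for reference (with V and H denoting vision and haptics); this is the standard formulation, not an equation taken from the paper. A larger visual variance after an incongruent stimulus shifts weight toward haptics, which is the mechanism the stated hypothesis relies on.

\hat{s} = w_V \hat{s}_V + w_H \hat{s}_H,
\qquad
w_V = \frac{1/\sigma_V^2}{1/\sigma_V^2 + 1/\sigma_H^2},
\quad
w_H = \frac{1/\sigma_H^2}{1/\sigma_V^2 + 1/\sigma_H^2},
\qquad
\sigma_{VH}^2 = \frac{\sigma_V^2\,\sigma_H^2}{\sigma_V^2 + \sigma_H^2}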

SESSION: Understanding and Modeling Users

Modeling Multisensory Integration in Hand Redirection: A Bayesian Causal Inference Framework for Understanding Individual Variability

Recent advances in sensory integration theories have led to the development of sophisticated methods for understanding multimodal integration. Among these, the Bayesian causal inference (BCI) model, which does not assume mandatory integration, is considered capable of explaining the effects of individual differences that affect multisensory integration, such as bottom-up (i.e., sensory profiles) or top-down (i.e., knowledge or beliefs) factors. Hand redirection (HR), which creates an unnoticeable discrepancy between the actual hand movement and the visual feedback of the avatar hand in virtual reality, has recently attracted attention as a method for improving operability and haptic presentation. However, it has been reported that there are significant individual differences in the awareness and effectiveness of HR. We hypothesized that by modeling HR using a BCI model rather than the maximum likelihood estimation or Bayesian integration models that have been examined in the past, we could investigate measures to improve the practicality of HR while accounting for individual variability. To test this hypothesis, we collected data on participants’ estimated positions of their real and virtual hands and detection thresholds using a device-driven HR approach that differs from conventional HR paradigms. Results demonstrated that the BCI model fit the overall trend of the position data well. However, the model fit decreased as the number of trials increased at the individual level. Furthermore, our analysis suggested that specific parameters of the BCI model could explain individual differences in HR thresholds. In conclusion, this study demonstrates that BCI models can improve the explainability of HR, contributing to a better understanding of how individual differences manifest in the integration of multisensory input.
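
The BCI model referred to here is the standard formulation in which the observer weighs a common-cause hypothesis against a separate-cause hypothesis and averages the corresponding position estimates. A minimal one-dimensional Gaussian sketch in Python is given below; the noise and prior parameters are made-up illustrative values, not the fits reported in the paper.

import numpy as np
from scipy.stats import norm

def bci_estimate(x_vis, x_prop, sigma_vis=0.5, sigma_prop=1.0,
                 sigma_prior=5.0, p_common=0.5):
    """Standard 1-D Bayesian causal inference (model averaging) for two cues.

    x_vis, x_prop: noisy visual and proprioceptive hand-position samples.
    Returns the estimated hand position and the posterior probability that
    both cues share a common cause. All parameter values are illustrative.
    """
    # Likelihood of both samples under a single common cause (zero-mean prior).
    var_c = (sigma_vis**2 * sigma_prop**2
             + sigma_vis**2 * sigma_prior**2
             + sigma_prop**2 * sigma_prior**2)
    like_common = np.exp(-((x_vis - x_prop)**2 * sigma_prior**2
                           + x_vis**2 * sigma_prop**2
                           + x_prop**2 * sigma_vis**2)
                         / (2 * var_c)) / (2 * np.pi * np.sqrt(var_c))
    # Likelihood under two independent causes.
    like_indep = (norm.pdf(x_vis, 0.0, np.sqrt(sigma_vis**2 + sigma_prior**2))
                  * norm.pdf(x_prop, 0.0, np.sqrt(sigma_prop**2 + sigma_prior**2)))
    post_common = (p_common * like_common
                   / (p_common * like_common + (1 - p_common) * like_indep))
    # Precision-weighted position estimates under each hypothesis.
    w_vis, w_prop, w_prior = 1 / sigma_vis**2, 1 / sigma_prop**2, 1 / sigma_prior**2
    s_common = (w_vis * x_vis + w_prop * x_prop) / (w_vis + w_prop + w_prior)
    s_separate = (w_prop * x_prop) / (w_prop + w_prior)   # proprioception-only estimate
    # Model averaging: final estimate of the felt hand position.
    return post_common * s_common + (1 - post_common) * s_separate, post_common

# Example: the virtual hand is seen 2 units to the right of where the hand is felt.
estimate, p_c = bci_estimate(x_vis=2.0, x_prop=0.0)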

SESSION: Poster Abstracts

Exploring the Impact of Moderate Movement Blending and Partner Impressions in Virtual Environments

The mirroring effect has been demonstrated to facilitate smooth social interactions. Research has shown that digital technology can implement automatic mirroring to support more natural interactions in contexts where spontaneous mirroring is challenging, such as remote human communication or interaction with non-human agents. However, conventional digital mirroring approaches, typically used in human–agent interactions, face significant limitations when applied to full-body interactions between humans in virtual environments: they often appear unnatural and may be perceived as deliberate imitation rather than authentic mirroring. To address this, co-embodiment, a technique that blends a user’s and a partner’s movements into a single avatar, offers a promising alternative by achieving the benefits of mimicry without drawing attention to the mimicking process. We examined how varying the blending ratios (0%, 25%, 50%) influences social impressions in a two-person interaction within a virtual environment. Results showed that moderate blending (25%, 50%) enabled interactions that were as natural as those in the 0% condition. However, we did not observe significant improvements in perceived closeness, trustworthiness, or acceptance of the partner’s advice in the social influence task.

Exploring Haptic Signaling for Emphasizing Partner Movements in the Chicken Game

Strategic decision-making often relies on observing opponents’ actions, but the role of haptic information in competitive interactions remains largely unexplored. We investigated how real-time haptic transmission of opponent choices affects strategic behavior in a competitive territory negotiation game based on the Chicken Game, in which participants controlled cursors to invade the opponent’s territory or remain in their own area. Participants experienced either a match condition (haptic feedback corresponding to cursor movement), a random condition (haptic feedback rotated 90-270 degrees from cursor movement), or a control condition (no haptic feedback). While the haptic conditions did not influence cooperation/defection choices, participants in the invisible match condition changed their decisions significantly more often than those in the invisible control condition. The results suggest that accurate haptic information can alter strategic adaptation when visual information is limited.