The development of the 22-minute Augmented Reality narrative “The Tent” involved significant technical and creative challenges to establish new conventions for Spatial Entertainment. This production case study examines the evolution from a 360° video concept through a traditional film adaptation to the final AR tabletop experience, detailing the wrong turns, technical breakthroughs, and creative innovations that shaped this award-winning work. Through analysis of volumetric capture workflows, photogrammetry integration, and the development of spatial cinematic language, this paper provides actionable insights for creators working in spatial narrative while addressing fundamental questions about interactivity, platform choice, and audience engagement in immersive storytelling.
We bring both theatrical and theoretical perspectives to “virtual puppetry”—exploring how individuals shape their digital representations and, in turn, how these digital representations influence them. We feature BEASTS, an autofictional live digital puppetry performance by Scarlett Kim with the Center for Unclassifiable Technologies and Experiences, developed in partnership with the Royal Shakespeare Company and Stanford University. Kim, joined by researchers from the Stanford Virtual Human Interaction Lab, will show how XR technology and narrative dramaturgy intersect in BEASTS, while uncovering the empirical questions it raises about embodiment. Through a live demo, we showcase how the performer creates digital representations that resemble and later transcend their physical identity. In turn, we engage with scientific evidence demonstrating that digital representations, when extending beyond mere physical mirroring, can actively influence their creator. This interdisciplinary collaboration aims to facilitate exchange between artists and researchers who explore similar topics around embodiment and liveness through different lenses, providing meaningful insights on virtual human experiences and spatial storytelling.
At Studio Syro, we create animated stories entirely within virtual reality. Using Quill, a VR-native painting and animation tool, we draw, animate, and build worlds directly inside the headset (Figure 1). Our first project, Tales from Soda Island, was the first animated series made completely in VR, and it laid the groundwork for an expressive, handcrafted production pipeline that we've used for every project since. We've since expanded this approach across mediums and formats, creating and publishing animated pieces like Nyssa, The Art of Change, and various music videos and stage visuals, as well as interactive experiences like our SXVA prototype, PondQuest, and Dear Metaverse. In this talk, we'll walk through the way we work: how we build and animate in VR, prototype in Unity, and keep our tools lightweight and artist-friendly. From quick sketches to spatial storytelling, we'll share what it takes to build entire worlds with small teams and big ideas.
Hydrosymposium is an immersive installation that fuses technology, nature, and humanity through computer graphics, sound, and lighting. Situated at water level in the Experience Hall, it presents water as both a living medium and a poetic metaphor, bridging the physical and digital realms. The installation invites viewers to reflect on water's evolving role in art, science, and human experience, resonating with SIGGRAPH's innovative spirit.
Maamawi: Together Through the Fire presents a deeply immersive and transformative experience that melds Anishinaabe cultural narratives, particularly the 7 Fires prophecies, with cutting-edge digital storytelling techniques. This innovative performance, envisioned through the guidance of elder Gloria May Eshkibok and creatively interpreted by choreographer Olivia C. Davies and digital artist Athomas Goldberg, with music by Michael Red, offers a profound exploration of Indigenous wisdom, future visions, and the power of collective healing and renewal.

The narrative structure, centered around the teachings and implications of the 8th Fire prophecy, uses the symbolic figures of the Wolf and the Eagle as storytellers. These characters, embodied by dancers with VR headsets, serve as the bridge between the past, present, and potential futures, narrating the unfolding of the prophecies against a backdrop of interactive digital environments. Surrounding Wolf and Eagle, a circle of eight to twelve audience members, also in VR headsets, embodies the role of active participants within this virtual world. This circle forms a community of witnesses and learners, gathered around a central, ethereally suspended fire. This fire, a symbol of transformation, knowledge, and continuity, serves as a focal point for the narrative and the collective experience.

The introduction of a remote audience, connecting via a web-based interface from personal computers or mobile devices, and represented as playable hummingbirds within the virtual environment, adds another layer of connectivity and interaction. This choice not only expands the accessibility of the performance but also weaves in the symbolism of the hummingbird, known for its resilience, joy, and the reminder that small actions can lead to significant impacts. The hummingbirds, visible to both dancers and the in-person audience, further enrich the narrative with their presence, symbolizing the far-reaching influence and interconnectedness of individual efforts and the broader community.

As the performance unfolds, the combined in-person and remote audience is guided through a vision of a future shaped by the outcomes of the 8th Fire prophecy, exploring themes of warning, wisdom, and the possibility of renewal. The refuge in the cave, with Joshua Mangeshig Pawis-Steckley's projections, provides a contemplative space for the stories of the 7 Fires to be absorbed and reflected upon, connecting the audience with the ancestral knowledge and the urgency of heeding these teachings. The climactic transformation brought by the coming of the rain and rising waters symbolizes cleansing, a rebirth of the world into a state of peace and harmony. This renewal invites the audience to partake in a celebration of new beginnings, emphasizing the role of collective action and the shared responsibility in fostering a future that embraces the lessons of the past, the realities of the present, and the possibilities of a harmonious coexistence.

“Maamawi: Together through the Fire” is not only a performance but a communal ritual, a call to action, and a shared vision for the future. It encapsulates the essence of Anishinaabe teachings and the power of storytelling, blending tradition with technology to inspire, educate, and unite.
A demonstration of how we are designing a hyper-reality adaptation of Shakespeare's Macbeth featuring life-sized MetaHuman digital doubles that appear to intelligently interact with live actors in a virtual production volume. Attendees will learn how to stream Live Link body data from a mocap source to an invisible MetaHuman using Blueprint scripting, and then how to spatially design actor-MetaHuman manipulations that appear improvised and reactive through virtual trigger boxes in the virtual production volume.
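The production itself wires this logic in Unreal Engine using Blueprint and Live Link; purely as an illustration of the general trigger-volume pattern (not the actual Blueprint graph), the TypeScript-style sketch below, with all names invented, shows how a tracked actor position entering a trigger box might select which pre-authored MetaHuman reaction plays.

```typescript
// Hypothetical sketch of the trigger-box pattern. The real production uses
// Unreal Blueprint + Live Link, not this code; all types and names are invented.

interface Vec3 { x: number; y: number; z: number; }

interface TriggerBox {
  name: string;          // e.g. "banquet_table_left"
  center: Vec3;
  halfExtents: Vec3;
  reaction: string;      // pre-authored MetaHuman animation to play
}

function contains(box: TriggerBox, p: Vec3): boolean {
  return Math.abs(p.x - box.center.x) <= box.halfExtents.x &&
         Math.abs(p.y - box.center.y) <= box.halfExtents.y &&
         Math.abs(p.z - box.center.z) <= box.halfExtents.z;
}

// Called every frame with the live actor's tracked position (from the mocap stream).
function updateMetahuman(actorPos: Vec3, boxes: TriggerBox[], play: (anim: string) => void): void {
  for (const box of boxes) {
    if (contains(box, actorPos)) {
      // The digital double appears to "react" because the reaction is chosen
      // by where the live actor moves, not by a fixed timeline.
      play(box.reaction);
      return;
    }
  }
}
```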
This SIGGRAPH demo explores “double immersion,” the unique experience of being immersed both physically in water and virtually in a headset. Our hybrid presentation combines a live on-stage talk and a remote real-time VR performance using MeRCURY, our waterproof headset. Light in tone, this session will entertain, inform, and inspire attendees with new possibilities for aquatic spatial storytelling.
Live physical whole-puppet performances can be used to drive digital characters and creatures via puppix, a new capture system. We demonstrate and discuss the benefits of having a live puppet character in the room with actors, directors and other characters, as well as practical processes for capturing non-human physicalities.
We demonstrate the experience live through audience interaction: a puppix capture puppet, performed by puppeteers in the room with the audience and paired with its digital twin, works with live direction and interaction.
Physical puppets allow directors and actors to work with non-human characters with the same flexibility, freedom and immediacy as human actors. Capturing these performances means non-human digital characters can work alongside and be directed like physical actors.
Unlike keyframe animation, capturing an in-the-moment live performance transfers real-world weight, physicality and movement to digital twins. It also sidesteps the limitations of human-based motion capture systems and of machine-learning training data sets, whose movements originate largely from human physicality.
The origins of motion capture as a whole lie in the technology of puppetry and animatronics. Performance armatures and rigs like the Dinosaur Input Device, Sil and Henson's Waldo operate as control systems for digital performances, with the director's focus on digital output screens.
puppix, a whole-puppet capture system, keeps the performance focus on the character in the room, not on the screen.
We show examples of pre-recorded puppix outputs alongside the original performance footage.
We detail practicalities: how to build a puppet pair for motion capture, successful live performance methods and processes, data processing considerations, practical capture considerations, and extra notes for the VFX supervisor.
At present, reference puppets are used on set as placeholders for CG characters, providing lighting, position and interaction reference, whilst actors interact with the reference puppeteers' performances.
puppix allows the full reference puppet performance to be motion captured. Secondary movements and whole-body physicality match the digital character and transfer to it for free. Director and performers stay focused in the room whilst creating digital animated performances. It is a tool like human-based motion capture, but for non-human characters and creatures.
We show the development journey of puppix, with examples of outputs alongside their source performance footage, and a live-performed capture puppet.
Quantum Theater takes quantum phenomena and re-imagines them as playable theater, using generative AI to expand narrative possibilities in real time. Working with archival materials and recent literature, it engages quantum science both as a subject and as a means for creating playable experience, exploring the history and current development of the field. Phenomena like entanglement, superposition, coherence, and collapse are used as models for experiential and narrative effects manifested in theatrical performance. Through XR techniques, multiple realities are layered on stage, branching and collapsing as the action develops over the course of the performance. Functional quantum modules shape this narrative logic in a post-AI exploration of liveness, variability, and improvisation. Quantum Theater explores simultaneous narratives in a space of competing realities, casting the audience as observer-participants who actively cohere a story through their choices.
Remixing the Flying Words Project is a mixed-reality installation that reimagines an American Sign Language poem through immersive technology. Using motion capture, AI-generated imagery, and interactive environments, it enables audiences to experience sign language poetry kinesthetically. Presented in a mixed-reality headset, it transforms linguistic translation into a dynamic, multisensory engagement with spatial storytelling.
Live theater has long relied on the convention of a single actor embodying multiple roles. However, in VR performances where avatars persist within the digital space, managing multiple persistent avatars presents unique challenges. Agile Lens’ annual production of A Christmas Carol VR required a single performer to dynamically switch between avatars while maintaining presence, spatial awareness, and direct engagement with audience members. Initially, the primary challenge was controlling what happens to an avatar when the actor switches roles, and by solving that problem, Agile Lens uncovered something larger: a real-time self-monitoring system for full-body motion capture performance with broad applications beyond theater.
Early solutions to the two (or three) body problem included physical and virtual monitors, but these introduced new obstacles—either obstructing the performer's virtual space or requiring manual physical adjustments that interfered with acting. Floating virtual monitors, though flexible, increased cognitive load and cluttered the scene.
The breakthrough came with a third-person perspective system, externalizing self-monitoring without breaking immersion. This approach didn't just streamline avatar switching—it became a powerful feedback tool for real-time motion capture performance, allowing the actor to see their previous avatar frozen in position while smoothly embodying a new character. This provided clearer spatial awareness, movement precision, and expressive control.
Beyond this production, this system has applications in training simulations (for military, emergency response, or medical scenarios), squad-based VR games with body tracking, third-person VR gaming, real-time VTubing, and virtual production workflows for digital filmmaking. By evolving a performer-centered monitoring system, Seeing Yourself on Stage introduces a scalable tool that benefits multiple XR disciplines, enhancing the capabilities of monitoring and engaging with yourself and others in immersive spaces.
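As a conceptual sketch of the monitoring idea described above, and not Agile Lens's actual implementation, the TypeScript outline below (all names hypothetical) shows the core behaviour: when the performer switches roles, the outgoing avatar holds its last pose while the mocap stream drives the new one.

```typescript
// Conceptual sketch only; names and structure are invented for illustration.
// Idea: the previous avatar stays visible, frozen mid-gesture, while a
// detached third-person view lets the performer monitor their new embodiment.

type Pose = Map<string, { x: number; y: number; z: number; w: number }>; // joint -> rotation

interface Avatar {
  id: string;
  live: boolean;       // driven by mocap this frame?
  frozenPose?: Pose;   // last pose held when the performer left this avatar
}

class AvatarSwitcher {
  private avatars = new Map<string, Avatar>();
  private activeId?: string;

  register(id: string): void {
    this.avatars.set(id, { id, live: false });
  }

  // Freeze the outgoing avatar in place and start driving the new one.
  switchTo(id: string, currentPose: Pose): void {
    if (this.activeId) {
      const prev = this.avatars.get(this.activeId)!;
      prev.live = false;
      prev.frozenPose = new Map(currentPose); // stays visible, frozen in position
    }
    const next = this.avatars.get(id)!;
    next.live = true;
    this.activeId = id;
  }

  // Each frame: mocap drives only the active avatar; others hold their frozen pose.
  resolvePose(id: string, mocapPose: Pose): Pose {
    const avatar = this.avatars.get(id)!;
    return avatar.live ? mocapPose : (avatar.frozenPose ?? mocapPose);
  }
}
```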
Space Echo 2.0 is an experimental media art project that introduces deliberate disruptions into avatar interactions within a multiplayer virtual reality (VR) environment (see Figure 1). These interruptions are designed to prompt reflection on the nature of genuine communication while offering novel, paradoxical conversational experiences. Drawing on a range of influences—from social VR platforms [Maloney et al. 2020] and experimental game design [Soderman 2021] to psychological research [Eisenberger et al. 2003], mythological narratives, and Bertolt Brecht’s theatrical theory [Brecht 1960]—the project uses intentional communicative disruption as a lens through which to reconsider connection.
Set within a dreamlike, symbolic virtual stage shared by two participants, the experience centers on the Reverse Jetpack, a core mechanic that moves an avatar in the opposite direction of their gaze each time they speak. The more participants attempt to communicate verbally, the more physical distance is created between them. This enforced separation paradoxically fosters emotional intimacy, highlighting the tension between the desire for connection and its inevitable distortion.
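As a rough illustration of the Reverse Jetpack mechanic only, and not the project's actual code, the sketch below (thresholds and names invented) pushes the speaking participant's avatar away from their gaze direction for as long as they speak.

```typescript
// Minimal sketch of the Reverse Jetpack idea described above; purely
// illustrative, with hypothetical names and tuning values.

interface Vec3 { x: number; y: number; z: number; }

const SPEECH_THRESHOLD = 0.05; // assumed microphone amplitude gate (0..1)
const PUSH_PER_SECOND = 0.8;   // metres of drift per second of speech

// Each frame: if the participant is speaking, push their avatar away from
// where they are looking, so talking literally creates distance.
function reverseJetpack(
  position: Vec3,
  gazeDirection: Vec3,   // unit vector of head/gaze forward
  micAmplitude: number,  // 0..1 from the voice input
  dt: number             // seconds since last frame
): Vec3 {
  if (micAmplitude < SPEECH_THRESHOLD) return position;
  const push = PUSH_PER_SECOND * dt;
  return {
    x: position.x - gazeDirection.x * push,
    y: position.y,                          // keep altitude unchanged
    z: position.z - gazeDirection.z * push,
  };
}
```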
Scattered throughout the environment are AI-generated images and audio fragments. As users approach these elements, whispered, looped narratives are triggered, offering a sensory encounter with themes of miscommunication, repetition, and distortion. Through its integration of symbolic narrative and experimental mechanics, Space Echo 2.0 invites users to reimagine connection in extended reality and reflect on the essence of social connection in an increasingly mediated world.
Spatial p5 is a browser-based live-coding toolkit for creating mixed-reality experiences that make spatial computing more directly programmable for creators. By combining p5.js [McCarthy et al. 2016], p5.xr, and P5 LIVE, it enables real-time prototyping of immersive multi-user sketches directly on XR headsets using a keyboard and mouse.
The toolkit eliminates several steps traditionally associated with 3D workflows, such as asset export, build compilation, and deployment. This allows creators to code and iterate directly within the target medium.
Grounded in the "low floor, high ceiling" principle [Papert 1980], Spatial p5 supports both beginners exploring spatial interaction for the first time and experienced developers building collaborative XR environments. It thereby makes code-based spatial computing a more intuitive, creative, and social medium on the web. More details are available on the Spatial p5 website [Udvari n.d.].
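As an illustration of what a live-coded session might look like, the sketch below assumes p5.js and p5.xr are loaded as globals (as in P5 LIVE); the exact immersive entry point (createVRCanvas here) is an assumption that depends on the p5.xr version in use.

```typescript
// Sketch of a live-coded Spatial p5 session. p5.js and p5.xr globals are
// assumed to be preloaded; createVRCanvas is assumed as the VR entry point.
declare function createVRCanvas(): void;
declare function background(r: number, g: number, b: number): void;
declare function fill(r: number, g: number, b: number): void;
declare function translate(x: number, y: number, z: number): void;
declare function rotateY(angle: number): void;
declare function box(size: number): void;
declare function push(): void;
declare function pop(): void;
declare function millis(): number;

function setup(): void {
  createVRCanvas(); // enter immersive mode instead of a flat 2D canvas
}

function draw(): void {
  background(10, 10, 30);
  // A ring of slowly spinning boxes surrounding the viewer.
  for (let i = 0; i < 12; i++) {
    push();
    rotateY((i / 12) * Math.PI * 2 + millis() * 0.0002);
    translate(0, 0, -300);
    fill(200, 120 + i * 10, 255);
    box(40);
    pop();
  }
}
```

Editing the draw loop updates the scene immediately in the headset, with no export, build, or deployment step.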
We present Spatial Storybook, a system for automatically converting a monaural audiobook into a spatialized, binaural presentation to provide listeners with a more immersive experience. Our key insight is to employ Large Language Models (LLMs) to reason about the dimensions and materials of the spaces in which scenes take place and plausible placement of character voices in these scenes, in a manner that is consistent temporally and with the content narrative, even if cues regarding room dimensions and relative positions are not explicitly given within the text. We achieve this by probing an LLM to reformat the accompanying book text as a stage play with positional cues, and design a language parsing and binaural scene rendering system to generate spatialized dialogue that is auralized using room acoustics settings appropriate for the scene. We discuss the building blocks of this system and the broader potential afforded by the spatial reasoning capabilities of LLMs in service of new spatial audio experiences.
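As a hypothetical sketch of the first stage of such a pipeline, and not the authors' actual system, the outline below shows how a book passage might be reformatted by an LLM into a stage play with positional and room cues, then parsed into a structure a binaural renderer could consume; the LLM client, prompt wording, and output schema are all assumptions.

```typescript
// Illustrative sketch of the "book text -> stage play with positional cues"
// stage. The LLM client, prompt, and JSON schema are hypothetical.

interface SpatialLine {
  speaker: string;
  text: string;
  position: { azimuthDeg: number; distanceM: number }; // relative to the listener
}

interface SceneDescription {
  roomDimensionsM: { width: number; depth: number; height: number };
  wallMaterial: string; // e.g. "stone", "wood" -> maps to a room-acoustics preset
  lines: SpatialLine[];
}

async function bookPassageToScene(
  passage: string,
  callLLM: (prompt: string) => Promise<string> // hypothetical LLM client
): Promise<SceneDescription> {
  const prompt =
    `Rewrite the passage below as a stage play. Infer a plausible room size and ` +
    `wall material, and give each spoken line a speaker, an azimuth in degrees ` +
    `and a distance in metres relative to the listener. Keep positions consistent ` +
    `across the passage. Respond as JSON matching {roomDimensionsM, wallMaterial, lines}.\n\n` +
    passage;
  const raw = await callLLM(prompt);
  // Downstream (not shown): binaural rendering of each line with HRTFs and
  // room acoustics chosen from the inferred dimensions and material.
  return JSON.parse(raw) as SceneDescription;
}
```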
With our contribution to SIGGRAPH 2025 Spatial Storytelling, we want to share our process of creating stories in XR that highlight the importance of capturing authentic human performance and connect with the audience on a visceral level. We want to inspire the attendees by demonstrating how it's possible to innovate in spatial storytelling using existing tech, adapting it to our stories. As creatives, how can we position ourselves in relation to an evolving medium and rapidly progressing technology?
Symbiosis/Dysbiosis: Mutualism is a multiplayer experience in which players and in-vivo fungal mycelium can synchronize within a virtual old-growth forest. Built within the Resonite VR platform, the project is an investigation into the experiential depths of social worldbuilding technology. Real-time EEG and bio-electrical data streams, via Muse and custom hardware respectively, enable players to see, feel and interact with the microscopic organisms around, between and within them. The living mycelium responds to players by triggering olfactory, haptic, visual, and auditory effects, and also contributes to a live, reactive music performance. Together, these elements provide a groundbreaking context in which to engage audiences and to empirically explore the ways biological, ecological, and technological systems can interact.
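As a purely generic illustration of mapping a live bio-signal to effect parameters, and not the project's actual Resonite data path, a minimal sketch might look like the following (ranges and names invented).

```typescript
// Generic illustration only: normalise and smooth a bio-signal sample, then
// map it to effect intensities. Not the project's implementation.

interface EffectIntensity { visual: number; haptic: number; audio: number; } // 0..1 each

let smoothed = 0;

// Called whenever a new sample arrives from the EEG / bio-electrical stream.
function mapBioSignal(sample: number, minExpected: number, maxExpected: number): EffectIntensity {
  // Normalise the raw sample into 0..1 and low-pass it so effects don't flicker.
  const normalised = Math.min(1, Math.max(0, (sample - minExpected) / (maxExpected - minExpected)));
  smoothed = smoothed * 0.9 + normalised * 0.1;

  return {
    visual: smoothed,             // e.g. mycelium glow brightness
    haptic: smoothed * smoothed,  // require a stronger signal before haptics kick in
    audio: 0.3 + 0.7 * smoothed,  // keep a quiet musical bed at all times
  };
}
```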
In this live contemporary dance and extended reality performance, the bodies of spectators who become participants are treated as thresholds of access to virtual experiences. Immersed in multisensory narratives with both real and virtual performers, participants experience the performers' stories and dance with them.
TimbreSpace is a cross-platform immersive music production ecosystem that bridges the gap between 2D workflows and spatial 3D interaction. It offers both a web interface and an extended reality (XR) application for Apple Vision Pro (AVP) and Meta Quest, allowing users to create, sequence, query, and analyze sounds within an intuitive node-graph and patching environment. Designed for musicians, producers, audio engineers, hobbyists, and scientists, TimbreSpace takes advantage of immersive interaction and machine learning to explore audio and music from a brand-new perspective. Users can navigate curated sample packs, use semantic audio search, and build complex sound machines in a feature-rich spatial user interface. Our system opens up new modes of visceral musical experience, ranging from phantom haptics to generative audiovisual structures.
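As a generic illustration of how semantic audio search can work in principle, and not TimbreSpace's actual implementation, the sketch below ranks clips by cosine similarity between a text-query embedding and precomputed audio embeddings (the embedding source is assumed).

```typescript
// Generic semantic-search illustration: compare a query embedding against
// precomputed clip embeddings. Embedding models and data are assumptions.

interface Clip { name: string; embedding: number[]; }

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank clips by similarity to a query embedding (e.g. for "warm analog bass").
function searchClips(queryEmbedding: number[], clips: Clip[], topK = 5): Clip[] {
  return [...clips]
    .sort((a, b) =>
      cosineSimilarity(queryEmbedding, b.embedding) -
      cosineSimilarity(queryEmbedding, a.embedding))
    .slice(0, topK);
}
```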
First Encounters is an introductory mixed reality (MR) experience designed to welcome all Quest 3 and Quest 3S users to MR. Widely praised by both the press and the public, it has become the go-to experience for introducing mixed reality to newcomers. The app has sparked numerous discussions among players and inspired developers.
In this session, we'll explore key moments of the experience, highlighting the design, technical, and production choices that shaped its development. You'll gain valuable insights into what made First Encounters successful, including:
Scene Understanding: How we utilized scene understanding while accommodating human tendencies
VFX in MR: The importance of VFX in creating presence in MR and overcoming challenges
Player Safety: Physically protecting our players in an MR environment
Expanding Physical Spaces: Creating new spaces that utilize and expand beyond the physical
Dealing with Unknowns: Navigating the unknowns of the player's space
Multisensory Experience: The importance of utilizing all available senses to create an immersive experience
Whether you're a seasoned developer or just starting out in VR, AR, or MR, this session is open to attendees of all experience levels. Join us to learn from the experts and take away valuable insights to apply to your own projects.
By the end of the talk, you'll leave with a deeper understanding of what worked and how to apply these principles to create your own successful MR experiences.