SA '19: SIGGRAPH Asia 2019 Doctoral Consortium

Extended Reality Experiences Prediction using Collaborative Filtering

New immersive technologies are adding value to many areas of our lives. A wide spectrum of hardware and software, including sensory interfaces, applications, and infrastructures, is among the technical drivers of these new opportunities. In particular, Extended Reality applications and simulators are becoming increasingly popular in business, but they are often limited by their predictability and, moreover, lack personalization in the selection of scenarios.

The author proposes the use of recommendation systems in Extended Reality simulators as a contribution toward solving this problem: a system in which Extended Reality experiences (or scenarios) are suggested to the user at the beginning of the experience. An investigation of item-based collaborative filtering was performed, resulting in a system that suggests an experience based on item feature similarity (in terms of the genre of the experiences) and the actions of similar users. The system uses k-nearest neighbors (KNN) to find groups of similar items based on a user's ratings, and proposes ten similar items with a precision of 71%.
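The item-based, nearest-neighbor idea described above can be illustrated with a minimal sketch. It is not the author's system: the ratings, the experience names, and the functions `item_vectors`, `cosine`, `nearest_items`, and `recommend` are all hypothetical, and similarity is computed here from co-ratings only, whereas the abstract also mentions genre-based feature similarity.

```python
from math import sqrt

# Hypothetical toy ratings: user -> {experience: rating on a 1-5 scale}.
RATINGS = {
    "ana":   {"flight_sim": 5, "surgery_sim": 4, "space_walk": 5},
    "ben":   {"flight_sim": 4, "space_walk": 5, "museum_tour": 2},
    "chloe": {"surgery_sim": 5, "museum_tour": 4, "welding_sim": 4},
    "dev":   {"flight_sim": 5, "space_walk": 4, "welding_sim": 1},
}


def item_vectors(ratings):
    """Invert user->item ratings into item->user rating vectors."""
    items = {}
    for user, user_ratings in ratings.items():
        for item, rating in user_ratings.items():
            items.setdefault(item, {})[user] = rating
    return items


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def nearest_items(target, items, k):
    """Return the k items most similar to `target` as (similarity, item) pairs."""
    scored = [(cosine(items[target], vec), other)
              for other, vec in items.items() if other != target]
    scored.sort(reverse=True)
    return scored[:k]


def recommend(user, ratings, k=2, n=10):
    """Rank unseen items by the similarity-weighted ratings of the user's own items."""
    items = item_vectors(ratings)
    seen = ratings[user]
    scores, weights = {}, {}
    for item, rating in seen.items():
        for sim, other in nearest_items(item, items, k):
            if other in seen or sim <= 0.0:
                continue  # skip items already experienced or unrelated
            scores[other] = scores.get(other, 0.0) + sim * rating
            weights[other] = weights.get(other, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:n]]


suggestions = recommend("ana", RATINGS)
```

In this sketch `n=10` mirrors the ten suggestions mentioned in the abstract; with only five toy experiences, far fewer are returned. Precision could then be estimated by holding out some of each user's ratings and checking how many suggestions fall in the held-out set.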

In the context of Extended Reality simulators, this platform leads to a new generation of smart simulators that can potentially be used for entertainment as well as customized professional training. The developed simulator can potentially recommend personalized Extended Reality training experiences to the user, facilitating the learning of new skills. This research also applies recommender systems to the new field of Extended Reality.

Virtual Reality in the Art Museum

With their unique affordances of immersion and interaction, virtual reality (VR) technologies have been increasingly utilised in the art museum, offering new opportunities to update and enrich museum practice while also raising issues and concerns. The complexity of the situation is widely demonstrated in debates in the existing literature, which creates an urgent need for a systematic, theoretical approach that sheds light on the issues involved and informs art museum professionals in their decision-making. Based on a review of debates and investigations in related fields, this paper proposes the notion of presence as an appropriate theoretical lens, one that has critical salience in both VR and art museums. It further argues that current studies on presence have not taken a holistic approach to this intersectional context, and hence that an integrated understanding of presence needs to be developed. To this end, the paper proposes a taxonomy of presence in four dimensions as an initial attempt to bridge current findings, addressing the aspects of human, content, interaction and socialising respectively in the context of art museums using VR.

Material acquisition using deep learning

Texture, highlights, and shading are some of the many visual cues that allow humans to perceive material appearance in pictures. Designing algorithms able to leverage these cues to recover spatially-varying bi-directional reflectance distribution functions (SVBRDFs) from a few images has challenged computer graphics researchers for decades. I explore the use of deep learning to tackle lightweight appearance capture and make sense of these visual cues. Our networks are capable of recovering per-pixel normals, diffuse albedo, specular albedo and specular roughness from as few as one picture of a flat surface lit by a hand-held flash. We propose a method whose prediction improves as the number of input pictures grows, and which reaches high-quality reconstructions with up to 10 images, a sweet spot between existing single-image and complex multi-image approaches. We introduce several innovations in training data acquisition and network design, bringing clear improvement over the state of the art for lightweight material capture.

Displacement of Self-Continuity

This research considers how a fictional allegory can be employed to examine issues of acculturation, displacement and identity transition [Addis and Tippett, 2008]. Using the story of a refugee family, my PhD research by artistic practice explores the implications of reconstructing an identity inside the body of a new culture. The animated short film, Stella, being developed as the final artefact, is designed to serve as a provocative vehicle for considering the social implications of identity loss and transition. Methodologically, the project employs a heuristic inquiry to increase the chances of discovery in a process that is intuitively negotiated [Ings, 2011]. In processing the inquiry, I shape the work and I am shaped by unexpected discoveries. Inside this dynamic, I generate a narrative embodiment of theory. This relationship may result in elevating both the self (the writer/director/animator) and the body of knowledge, through the making process [Moustakas, 1990]. Beyond its contribution to understanding the processes and implications of acculturation, displacement, and identity transition, the project's technological significance lies in its propensity to extend the application and demonstrate the potential of deep learning algorithms, performance capture using motion capture technology, and 3D laser scanning and photogrammetry in digital human development.


Image filtering is a fundamental preprocessing task in computer vision and image processing. While the dominant applications of kernel filtering are enhancement and denoising, it can also be used as a powerful regularizer for image reconstruction. In general, brute-force implementations of kernel filtering are prohibitively expensive and often too slow for real-time applications. In the first half of the thesis, we propose fast algorithms for bilateral filtering (BLF) and nonlocal means (NLM). In particular, we demonstrate that by using the Fourier approximation of the underlying kernel, we can obtain state-of-the-art fast algorithms for BLF of grayscale images. We then extend the idea to fast filtering of color images, which involves the approximation of a three-dimensional kernel. We also propose a fast separable formulation for NLM of grayscale images. In the second half of the dissertation, we turn to some applications of kernel filtering. We introduce a scale-adaptive variant of BLF that is used for suppressing fine textures in images. We develop a fast implementation of a symmetrized variant of NLM that is used for regularization (i.e., as a prior) within the plug-and-play framework for image restoration. The core idea can be extended to other forms of kernel filtering.
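To make the cost of the brute-force approach concrete, here is a minimal sketch of a textbook bilateral filter applied to a 1D signal. It is not one of the fast algorithms proposed in the thesis; the function name `bilateral_filter_1d`, its parameters, and the sample data are assumptions chosen for illustration.

```python
from math import exp


def bilateral_filter_1d(signal, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Brute-force bilateral filter of a 1D signal (list of floats).

    Each output sample is a weighted mean of its neighbours. The weights
    fall off with spatial distance (sigma_s) and with intensity
    difference (sigma_r). Cost is O(len(signal) * radius).
    """
    n = len(signal)
    out = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            spatial = exp(-((i - j) ** 2) / (2.0 * sigma_s ** 2))
            rng = exp(-((signal[i] - signal[j]) ** 2) / (2.0 * sigma_r ** 2))
            w = spatial * rng
            num += w * signal[j]
            den += w
        out.append(num / den)  # den >= 1 because j == i contributes weight 1
    return out


# A noisy step edge: values near 0 followed by values near 1.
noisy_step = [0.02, -0.01, 0.03, 0.00, -0.02, 1.01, 0.98, 1.02, 0.99, 1.00]
smoothed = bilateral_filter_1d(noisy_step)
```

The range weight depends on the value of the centre sample, so the filter is not a plain convolution and the inner loop must be recomputed for every sample. On a 2D image the same loop runs over a square window per pixel, which is the expense that motivates fast approximations such as the ones the abstract describes.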

Numerical Linear Algebra for physically-based Fluid Animations

My dissertation project addresses the integration of numerical linear algebra approaches into the field of computer graphics and visualization, especially physically-based Fluid Animations. I mainly focus on (matrix) decomposition techniques with regard to spectral theory and related concepts. In many applications of Fluid Animations, in particular particle-based ones, these techniques are not consulted due to the irregular nature of those applications. However, the use of spectral-theoretic approaches leads to both theoretical insights and practical enhancements of the algorithms in terms of precision, efficiency, and stability. My research is conducted at the University of Stuttgart and at the Hochschule der Medien, Stuttgart, where I am a member of the joint graduate school Digital Media.

Cybersickness in Virtual Reality: Examining the Influence of the Virtual Environments on Sex Susceptibility

The auspicious future of VR could be thwarted by cybersickness. A factor known to influence susceptibility is sex, with females often experiencing higher incidences. A mitigation strategy is to identify individuals who are more sensitive to cybersickness, so that interventions can be implemented before the onset of subjective symptoms. Such an approach could use predictive models that compare a user’s online kinematic body sway and physiological characteristics to data from individuals who reported cybersickness. If such predictive models can be developed, then one approach is to alter the virtual environment (VE) based on these real-time data.

The benefit of adjusting the VE is that it permits a susceptible individual to use the VR device with a reduction in adverse symptoms. One way to alter the VE is by manipulating optic flow, which can be described as the perceived visual motion of objects that is generated by an observer’s movements. Optic flow can be increased by increasing the level of detail in the VE. That is to say, visual displays that contain a lot of detail often give rise to stronger subjective sensations of movement. Thus, reducing the level of detail in the VE may reduce cybersickness reports.

Intermediated Reality

This thesis explores technical solutions to address the gap between the virtual and physical worlds towards photo-realistic interactive Augmented Reality (AR). As mobile network bandwidth increases, latencies reduce and graphics processing power becomes more efficient, this work tackles the challenge of convincingly re-animating physical objects remotely through digital displays. A framework for distributed Intermediated Reality (IR) communication is introduced, and forms the structure of the constituent methods developed for seamless collaboration through the remote possession of entities that come to life in mobile AR.

To perform such augmentation in an unnoticeable way, a method of deforming surface camera samples for seamless animations of physical objects with background inpainting is first introduced. This technique, in combination with a method to retarget the proximate appearance of real shadows to deformed virtual shadows and a method to perform environment illumination estimation using inconspicuous flat Fresnel lenses, brings real-world props to life in a compelling and practical way. The methods are integrated to run in real time, and analysis and evaluations using metric comparisons to expected ground-truth renderings are provided.

Intermediated Reality can be applied to a variety of industries and scenarios beyond communication. This thesis presents applications in the movie industry and computer games sectors. For example, an approach is demonstrated that reduces the number of physical configurations needed for a stop-motion animation movie by generating the in-between frames digitally in AR. The AR-generated frames preserve their natural appearance and achieve smooth transitions between real-world key-frames and digitally generated in-betweens. Further, the techniques extend across the entire Reality-Virtuality Continuum to target new game experiences called Multi-Reality games. This gaming experience makes an evolutionary step toward the convergence of real and virtual game characters for visceral digital experiences.