I3D '19: Proceedings of the ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games


SESSION: Physics

Aura projection for scalable real-time physics

In this paper we propose a solution for delivering scalable real-time physics simulations. Although high-performance computing simulations of physics-related problems exist, they are not real-time and do not model the intricate rigid-body interactions used for visual effect in video games (favouring accuracy over real-time performance). This paper therefore presents the first approach to real-time delivery of scalable, commercial-grade, video-game-quality physics. This is achieved by taking the physics engine out of the player's machine and deploying it across standard cloud-based infrastructures. The simulation world is divided into sections, each of which is allocated to a server; a server maintains the physics for all simulated objects in its section. Our contribution is the ability to maintain a scalable simulation by allowing object interaction across section boundaries using predictive migration techniques: each object projects an aura that is used to determine object migration across servers, ensuring seamless physics interactions between objects. The validity of our results is demonstrated through experimentation and benchmarking. Our approach allows player interaction at any point in real-time (influencing the simulation) in the same manner as any video game. We believe that this is the first successful demonstration of scalable real-time physics.
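The boundary handling can be pictured with a small sketch. The following Python fragment is a minimal illustration of the aura idea under our own assumptions (a 1D world; names like aura_radius and sections_touched are ours, not the paper's): an object is registered with every server whose section its aura overlaps, so it can migrate before its body crosses the seam.

```python
# Hedged sketch, not the paper's code: a 1D world split into fixed-width
# sections, one physics server per section.
from dataclasses import dataclass

@dataclass
class Body:
    x: float            # world position (1D for simplicity)
    aura_radius: float  # aura projected around the body

def owning_section(x: float, section_width: float) -> int:
    return int(x // section_width)

def sections_touched(body: Body, section_width: float) -> set:
    # The aura, not the body itself, decides replication/migration:
    # every server whose section the aura overlaps must know the body.
    lo = owning_section(body.x - body.aura_radius, section_width)
    hi = owning_section(body.x + body.aura_radius, section_width)
    return set(range(lo, hi + 1))

# A body near the seam is mirrored onto the neighbouring server before
# it crosses, keeping cross-boundary contacts seamless.
print(sections_touched(Body(x=99.5, aura_radius=1.0), 100.0))  # {0, 1}
```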

Efficient block pivoting for multibody simulations with contact

Simulating stiff physical systems is a requirement for numerous computer graphics applications, such as VR training for heavy equipment operation. However, iterative linear solvers often perform poorly in such cases, and direct methods involving a factorization of the system matrix are typically preferred for accurate and stable simulations. This can have a detrimental impact on performance, since factorization of the system matrix is costly for complex simulations. In this paper, we present a method for efficiently solving linear systems of stiff physical systems involving contact, where the dynamics are modeled as a mixed linear complementarity problem (MLCP). Our approach is based on a block Bard-type algorithm that applies low-rank downdates to a Cholesky factorization of the system matrix at each pivoting step. Further performance improvements are realized by exploiting low bandwidth characteristics of the factorization. Our method gives up to 3.5x speed-up versus recomputing the factorization based on the index set. Various challenging scenarios are used to demonstrate the advantages of our approach.
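As a rough sketch of the Bard-type block pivoting the abstract describes, the following Python fragment solves a small LCP by flipping entire blocks of violated indices at once. For clarity it refactors the reduced system at every pivot; the paper's speed-up comes precisely from replacing that refactorization with low-rank Cholesky downdates.

```python
# Minimal Bard-type block pivoting for an LCP (w = Mz + q, w,z >= 0,
# w.z = 0), assuming M is symmetric positive definite.
import numpy as np

def bard_pivot_lcp(M, q, max_iter=50):
    n = len(q)
    free = np.zeros(n, dtype=bool)        # index set: z_i free (> 0)
    for _ in range(max_iter):
        z = np.zeros(n)
        if free.any():
            # Cholesky solve of the reduced system M[F,F] z_F = -q_F;
            # the paper downdates L instead of recomputing it.
            L = np.linalg.cholesky(M[np.ix_(free, free)])
            z[free] = np.linalg.solve(L.T, np.linalg.solve(L, -q[free]))
        w = M @ z + q
        viol_z = free & (z < 0)           # free variables gone negative
        viol_w = ~free & (w < 0)          # tight variables wanting to open
        if not viol_z.any() and not viol_w.any():
            return z
        free[viol_z] = False              # block pivot: flip every
        free[viol_w] = True               # violated index at once
    raise RuntimeError("no convergence")

M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
print(bard_pivot_lcp(M, q))  # [0.5, 0.0]
```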

SESSION: Spatial data structures

Dual-split trees

We introduce the dual-split tree, a new tree-based acceleration structure for ray tracing. Each internal node of a dual-split tree uses two axis-aligned planes to either split the parent node into two child nodes or to mark the empty regions of the node. This allows child bounding boxes to overlap when desired. Thus, our dual-split tree is capable of representing space partitioning identical to any given bounding volume hierarchy. Our dual-split tree provides a significant reduction in the required acceleration structure storage by eliminating the redundant bounding planes that are commonplace in bounding volume hierarchies, providing better performance and storage savings than similar previous methods. As a result, we achieve improved rendering performance with dual-split trees, as compared to bounding volume hierarchies with a comparable level of optimization using identical or similar space partitioning.
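A possible node layout conveying the idea (field names are our assumptions, not the paper's):

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class DualSplitNode:
    axis: int                          # split axis: 0=x, 1=y, 2=z
    plane_lo: float                    # first axis-aligned plane
    plane_hi: float                    # second axis-aligned plane
    carve_empty: bool                  # True: the planes trim away empty
                                       # space; False: they bound the two
                                       # children, which may overlap
    left: Optional["DualSplitNode"] = None
    right: Optional["DualSplitNode"] = None
    prims: Optional[List[int]] = None  # leaf payload

# Storing two planes per node, rather than two full child boxes
# (twelve planes) as a BVH node does, is where the storage savings
# described in the abstract come from.
```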

SESSION: Global illumination

Incrementally baked global illumination

Global illumination is affected by the slightest change in a 3D scene, requiring a complete re-evaluation of the distributed light. In cases where real-time algorithms are not applicable due to high demands on the achievable accuracy, this recomputation from scratch results in artifacts such as flickering or noise, disturbing the visual appearance and negatively affecting interactive lighting-design workflows.

We propose a novel system tackling this problem by providing incremental updates of a baked global illumination solution after scene modifications, and a re-convergence after a few seconds. Using specifically targeted incremental data structures and prioritization strategies in a many-light global illumination algorithm, we compute a differential update from one illumination state to another. We further demonstrate the use of a novel error balancing strategy making it possible to prioritize the illumination updates.
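A hedged sketch of the prioritization machinery, assuming a many-light cache keyed by (light, texel) pairs; all names here are illustrative, not the paper's:

```python
import heapq

class IncrementalBake:
    def __init__(self):
        self.cache = {}   # (vpl_id, texel) -> stored radiance
        self.queue = []   # max-heap on estimated error (negated)

    def invalidate(self, vpl_id, texel, est_error):
        # A scene edit invalidates only the affected contributions.
        heapq.heappush(self.queue, (-est_error, vpl_id, texel))

    def step(self, budget, shade):
        # Re-shade the highest-error entries each frame so the bake
        # re-converges over a few seconds instead of restarting.
        for _ in range(min(budget, len(self.queue))):
            _, vpl_id, texel = heapq.heappop(self.queue)
            self.cache[(vpl_id, texel)] = shade(vpl_id, texel)

bake = IncrementalBake()
bake.invalidate(vpl_id=3, texel=(8, 8), est_error=0.9)
bake.step(budget=64, shade=lambda v, t: 0.5)  # dummy shading function
```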

Fast non-uniform radiance probe placement and tracing

Light field probes extend standard precomputed light probes to reduce light leaks and enable efficient filtered world-space ray tracing queries. When probes are placed uniformly in the scene volume, they permit an efficient querying algorithm. Manually increasing the grid resolution, however, is the only way to eliminate geometric feature undersampling, increasing the memory and computation cost of the approach. We present an automatic non-uniform probe placement method to correctly sample visibility information and eliminate superfluous probes. We organize non-uniform probes in an efficient structure for fast run-time ray tracing. Our probe placement relies on 3D scene skeletons and a gradient descent-based refinement to achieve full geometric coverage and reduce grazing angle sampling biases. Our adaptive probe ray tracer caches visibility information in a sparse voxel octree, augmenting probes with metadata used to apply a hierarchical-Z acceleration when marching rays in distant probes. We benchmark our approach on a variety of scenes and consistently demonstrate better performance, and fewer probes, in equal-quality comparisons to the state-of-the-art.
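One piece of the pipeline, the elimination of superfluous probes, can be sketched abstractly: a probe is redundant when everything it sees is already seen by the remaining probes. Visibility is reduced to an abstract set here; in the paper it comes from the probes' rendered data.

```python
def prune_probes(probes, sees):
    """probes: list of probe ids; sees(p) -> set of surface patch ids
    visible from probe p. Greedily removes fully redundant probes."""
    kept = list(probes)
    for p in sorted(probes, key=lambda q: len(sees(q))):
        others = set().union(*(sees(q) for q in kept if q != p))
        if sees(p) <= others:   # everything p sees is covered elsewhere
            kept.remove(p)
    return kept

vis = {1: {"a", "b"}, 2: {"b"}, 3: {"a"}}
print(prune_probes([1, 2, 3], vis.get))  # probes 2 and 3 are redundant
```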

High-quality object-space dynamic ambient occlusion for characters using bi-level regression

The widely used ambient occlusion (AO) technique provides an approximation of some global illumination effects and is efficient enough for use in real-time applications. Because it relies on computing the visibility from each point on a surface, AO computation is expensive for dynamically deforming objects, such as characters in particular. In this paper, we describe an algorithm for producing high-quality dynamically changing AO for characters. Our fundamental idea is to factorize the AO computation into a coarse-scale component in which visibility is determined by approximating spheres, and a fine-scale component that leverages a skinning-like algorithm for efficiency, with both components trained in a regression against ground-truth AO values. The resulting algorithm accommodates interactions with external objects and generalizes without requiring carefully constructed training data. Extensive comparisons illustrate the capabilities and advantages of our algorithm.
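Our reading of the factorization, as a hedged Python sketch: the sphere falloff and the linear fine-scale model below are illustrative stand-ins for the trained regressors, not the paper's formulas.

```python
import numpy as np

def coarse_ao(vertex, spheres):
    """Coarse occlusion from approximating spheres: a solid-angle-style
    falloff per sphere, multiplied together."""
    ao = 1.0
    for center, radius in spheres:
        d = np.linalg.norm(vertex - center)
        if d > 1e-6:
            ao *= 1.0 - min(1.0, (radius / d) ** 2)
    return ao

def fine_ao(bone_features, weights):
    """Fine residual: a skinning-like linear model per vertex, with
    features derived from the skeleton pose and weights fitted in a
    regression against ground-truth AO."""
    return float(bone_features @ weights)

v = np.array([0.0, 1.0, 0.0])
print(coarse_ao(v, [(np.array([0.0, 0.0, 0.0]), 0.5)]))  # 0.75
```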

SESSION: Projections

RGBD temporal resampling for real-time occlusion removal

Occlusions disrupt the visualization of an object of interest, or target, in a real-world scene. Video inpainting removes occlusions from a video stream by cutting out occluders and filling in a plausible visualization of the object, but the approach is too slow for real-time performance. In this paper, we present a method for real-time occlusion removal in the visualization of a real-world scene captured with an RGBD stream. Our pipeline segments the current RGBD frame to find the target and the occluders, searches for the best-matching disoccluded view of the target in an earlier frame, computes a mapping between the target in the current frame and the target in the best-matching frame, inpaints the missing pixels of the target in the current frame by resampling from the earlier frame, and visualizes the disoccluded target in the current frame. We demonstrate our method in the case of a walking human occluded by stationary or walking humans. Our method does not rely on a known 2D or 3D model of the target or of the occluders, and therefore it generalizes to other shapes. Our method runs at an interactive frame rate of 30 fps.
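The resampling step at the heart of the pipeline might look like the following sketch, where a per-pixel integer offset field stands in for the paper's target-to-target mapping:

```python
import numpy as np

def temporal_resample(cur, prev, occl_mask, flow):
    """Fill occluded pixels of the current frame by sampling an earlier
    disoccluded frame through a per-pixel mapping (a simple integer
    offset field here, purely for illustration)."""
    out = cur.copy()
    ys, xs = np.nonzero(occl_mask)
    src_y = np.clip(ys + flow[ys, xs, 0], 0, prev.shape[0] - 1)
    src_x = np.clip(xs + flow[ys, xs, 1], 0, prev.shape[1] - 1)
    out[ys, xs] = prev[src_y, src_x]
    return out

cur = np.zeros((4, 4, 3), np.uint8)
prev = np.full((4, 4, 3), 255, np.uint8)
mask = np.zeros((4, 4), bool); mask[1:3, 1:3] = True
flow = np.zeros((4, 4, 2), int)
print(temporal_resample(cur, prev, mask, flow)[1, 1])  # [255 255 255]
```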

SESSION: More rendering

Improved geometric specular antialiasing

Shading filtering proposed by Kaplanyan et al. [2016] is a simple solution for specular aliasing. It filters a distribution of microfacet normals in the domain of microfacet slopes, estimating the filtering kernel using derivatives of the halfway vector between the incident and outgoing directions. However, for real-time rendering, this approach can produce noticeable artifacts because of an estimation error in the derivatives. For forward rendering, this estimation error increases significantly at grazing angles and near edges. The present work improves the quality of the original technique while at the same time decreasing the complexity of the code. To reduce the error, we introduce a more efficient kernel bandwidth that takes the angle of the half vector into account. In addition, we optimize the calculation of the isotropic filter kernel used for deferred rendering by applying the proposed kernel bandwidth. As our implementation is simpler than the original method, it is easier to integrate into time-sensitive applications, such as game engines, while at the same time improving the filtering quality.
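For reference, a commonly implemented isotropic form of this filtering widens the GGX roughness by a clamped variance term derived from screen-space normal derivatives. The sketch below follows that pattern; the constants are illustrative defaults rather than necessarily the paper's.

```python
def filtered_alpha2(alpha, dndx, dndy, sigma2=0.25, kappa=0.18):
    """alpha: GGX roughness; dndx, dndy: screen-space derivatives of the
    shading normal (3-vectors). Returns a widened squared roughness that
    covers the normal variation inside one pixel footprint."""
    variance = sigma2 * (sum(c * c for c in dndx) + sum(c * c for c in dndy))
    kernel_alpha2 = min(2.0 * variance, kappa)  # clamp to avoid overblur
    return alpha * alpha + kernel_alpha2

print(filtered_alpha2(0.1, (0.2, 0.0, 0.0), (0.0, 0.2, 0.0)))
```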

RLFC: random access light field compression using key views and bounded integer sequence encoding

We present a new hierarchical compression scheme for encoding light field images (LFI) that is suitable for interactive rendering. Our method (RLFC) exploits redundancies in the light field images by constructing a tree structure. The top level (root) of the tree captures the common high-level details across the LFI, and other levels (children) of the tree capture specific low-level details of the LFI. Our decompression algorithm corresponds to tree traversal operations and gathers the values stored at different levels of the tree. Furthermore, we use bounded integer sequence encoding, which provides random access and fast hardware decoding, for compressing the blocks of children of the tree. We have evaluated our method for 4D two-plane parameterized light fields. The compression rates vary from 0.08–2.5 bits per pixel (bpp), resulting in compression ratios of around 200:1 to 20:1 for a PSNR quality of 40 to 50 dB. The decompression times for decoding the blocks of LFI are 1–3 microseconds per channel on an NVIDIA GTX-960, and we can render new views with a resolution of 512 × 512 at 200 fps. Our overall scheme is simple to implement and involves only bit manipulations and integer arithmetic operations.
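A toy two-level version of the hierarchical gather shows why random access is cheap: a decoded sample is the root's shared value plus an independently addressable residual block. The bounded-integer-sequence bit packing is abstracted away here.

```python
def decode(root, residual_blocks, block_size, i):
    """Random-access decode of sample i: the common detail stored at the
    root plus the low-level residual for i's block. In RLFC the residual
    blocks are bit-packed so each can be unpacked independently on the
    GPU; plain lists stand in for that here."""
    block, offset = divmod(i, block_size)
    return root[i] + residual_blocks[block][offset]

root = [10, 10, 12, 12]
residuals = [[-1, 2], [0, 3]]
print(decode(root, residuals, block_size=2, i=3))  # 12 + 3 = 15
```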

SESSION: Performance and interaction

3D mesh animation compression based on adaptive spatio-temporal segmentation

With the recent advances of data acquisition techniques, the compression of various 3D mesh animation data has become an important topic in the computer graphics community. In this paper, we present a new spatio-temporal segmentation-based approach for the compression of 3D mesh animations. Given an input mesh sequence, we first compute an initial temporal cut to obtain a small subsequence by detecting the temporal boundary of dynamic behavior. Then, we apply a two-stage vertex clustering on the resulting subsequence to classify the vertices into groups with optimal intra-affinities. After that, we design a temporal segmentation step based on the variations of the principal components within each vertex group prior to performing a PCA-based compression. Our approach can adaptively determine the temporal and spatial segmentation boundaries in order to exploit both temporal and spatial redundancies. We have conducted many experiments on different types of 3D mesh animations with various segmentation configurations. Our comparative studies show the competitive performance of our approach for the compression of 3D mesh animations.
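The per-segment PCA stage can be sketched with a plain SVD; the segmentation logic that chooses the frame ranges and vertex clusters is the paper's contribution and is omitted here.

```python
import numpy as np

def pca_compress(frames, k):
    """frames: (T, 3V) matrix of flattened vertex positions for one
    temporal segment / vertex cluster. Keep k principal components."""
    mean = frames.mean(axis=0)
    U, S, Vt = np.linalg.svd(frames - mean, full_matrices=False)
    coeffs = U[:, :k] * S[:k]   # T x k per-frame coefficients
    basis = Vt[:k]              # k x 3V component vectors
    return mean, coeffs, basis

def pca_decompress(mean, coeffs, basis):
    return mean + coeffs @ basis

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 12))
mean, C, B = pca_compress(X, k=3)
print(np.abs(X - pca_decompress(mean, C, B)).max())  # reconstruction error
```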

Real-time hierarchical facial performance capture

This paper presents a novel method to reconstruct high-resolution facial geometry and appearance in real time by capturing an individual-specific face model with fine-scale details, based on monocular RGB video input. Specifically, after reconstructing the coarse facial model from the input video, we refine it using shape-from-shading techniques, where illumination, albedo texture, and displacements are recovered by minimizing the difference between the synthesized face and the input RGB video. In order to recover wrinkle-level details, we build a hierarchical face pyramid through adaptive subdivisions and progressive refinements of the mesh from a coarse level to a fine level. We evaluate our method both quantitatively and qualitatively through extensive experiments on various inputs. We demonstrate that our approach produces results close to off-line methods and better than previous real-time methods.
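The refinement objective from the abstract reduces, in heavily simplified Lambertian form, to a photometric loss like the following; this is purely illustrative, as the paper recovers illumination, albedo, and displacements jointly over a hierarchical mesh pyramid.

```python
import numpy as np

def photometric_loss(albedo, normals, light, observed):
    """albedo: (N,3); normals: (N,3) unit vectors; light: (3,) direction;
    observed: (N,3) pixel colors sampled from the RGB video."""
    shading = np.clip(normals @ light, 0.0, None)[:, None]
    synthesized = albedo * shading
    return float(np.mean((synthesized - observed) ** 2))

n = np.array([[0.0, 0.0, 1.0]])
print(photometric_loss(np.ones((1, 3)), n,
                       np.array([0.0, 0.0, 1.0]), np.ones((1, 3))))  # 0.0
```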

Adaptive Gaussian mixture trajectory model for physical model control using motion capture data

To enable the physically correct simulation of the interaction of a 3D character with its environment, the internal joint forces of a physical model of the character need to be estimated. Recently, derivative-free sampling-based optimization methods, which treat the objective function as a black box, have shown great results for finding control signals for articulated figures in physics simulations. We present a novel sampling-based approach for the reconstruction of control signals for a rigid body model based on motion capture data that combines ideas of previous approaches. The algorithm optimizes control trajectories along a sliding window using the Covariance Matrix Adaptation Evolution Strategy. The sampling distribution is represented as a mixture model with a dynamically selected number of clusters based on the variation detected in the samples. During the optimization we keep track of multiple states, which enables the exploration of multiple paths. We evaluate the algorithm on the task of following motion capture data, using figures that were automatically generated from 3D character models.
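To convey the structure of the sampling loop, here is a deliberately simplified Gaussian sample-and-refit optimizer (cross-entropy style) standing in for full CMA-ES with mixture clustering and multi-state tracking:

```python
import numpy as np

def optimize_window(cost, dim, iters=30, pop=32, elite=8, seed=0):
    """Sample control offsets for one sliding window, keep the elites,
    and refit the sampling distribution around them."""
    rng = np.random.default_rng(seed)
    mean, sigma = np.zeros(dim), 1.0
    for _ in range(iters):
        samples = mean + sigma * rng.normal(size=(pop, dim))
        order = np.argsort([cost(s) for s in samples])
        best = samples[order[:elite]]
        mean = best.mean(axis=0)                # refit the distribution
        sigma = best.std(axis=0).mean() + 1e-6  # shrink toward the elites
    return mean

# e.g. find a 5-D control offset minimizing a quadratic tracking cost
print(optimize_window(lambda u: float(u @ u), dim=5).round(3))
```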

Multi-touch 3D positioning with the pantograph technique

One advantage of touch interaction is the sense of direct manipulation; there is perhaps no more-intuitive interface than just reaching out and touching virtual entities. However, direct manipulation is generally limited to objects located on the 2D display surface. For 3D spaces extending behind or in front of a touchscreen, the direct manipulation metaphor quickly falls apart. In these cases, gestures are needed to convert 2D finger positions into 3D cursor positions. This paper presents the pantograph technique, a simple two-finger interaction method for positioning a 3D cursor within monoscopic and stereoscopic applications. The pantograph's pseudo-mechanical linkage between fingers and cursor provides helpful depth cues and maintains the sense of direct manipulation. Extensions to the technique, which integrate selection and other advanced actions, are explored within the context of real-world visual analysis applications. A series of human factors experiments showed that, while the pantograph technique outperformed other similar multi-touch 3D positioning techniques, multi-touch was still inferior to other traditional, non-touch-based interfaces for sustained 3D positioning tasks.
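One plausible two-finger mapping in the pantograph spirit, written as a guess at the linkage rather than the paper's exact formulation: the midpoint of the two touches steers the cursor in x/y while the finger separation extends the linkage in depth.

```python
def pantograph_cursor(p0, p1, depth_scale=0.01, rest_spread=50.0):
    """p0, p1: (x, y) touch points in pixels -> (x, y, z) cursor.
    depth_scale and rest_spread are illustrative tuning constants."""
    mx, my = (p0[0] + p1[0]) / 2.0, (p0[1] + p1[1]) / 2.0
    spread = ((p0[0] - p1[0]) ** 2 + (p0[1] - p1[1]) ** 2) ** 0.5
    z = (spread - rest_spread) * depth_scale  # pinch pulls, spread pushes
    return (mx, my, z)

print(pantograph_cursor((100, 200), (200, 200)))  # (150.0, 200.0, 0.5)
```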

Adaptive pointcloud segmentation for assisted interactions

In this work, we propose an interaction-driven approach streamlined to support and improve a wide range of real-time 2D interaction metaphors for arbitrarily large point clouds, based on detected primitive shapes. Rather than performing shape detection as a costly pre-processing step on the entire point cloud at once, a user-controlled interaction determines the region that is to be segmented next. By keeping the size of the region and the number of points small, the algorithm produces meaningful results, and therefore feedback on the local geometry, within a fraction of a second. We apply these findings to improved picking and selection metaphors in large point clouds, and propose further novel shape-assisted interactions that utilize this local semantic information to improve the user's workflow.
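The interaction-driven idea can be sketched as primitive detection restricted to the small region under the cursor. A RANSAC plane fit is shown below, though the approach is not limited to planes.

```python
import numpy as np

def local_plane_ransac(points, iters=100, tol=0.01, seed=0):
    """points: (N,3) neighbourhood around the interaction point.
    Returns a boolean inlier mask for the best-supported plane."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), bool)
    for _ in range(iters):
        a, b, c = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(b - a, c - a)
        norm = np.linalg.norm(n)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        n /= norm
        inliers = np.abs((points - a) @ n) < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return best_inliers          # feeds the picking/selection metaphor

pts = np.array([[x, y, 0.0] for x in range(5) for y in range(5)]
               + [[0.0, 0.0, 5.0]])
print(local_plane_ransac(pts).sum())  # expect 25 inliers on z = 0
```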

SESSION: Perception and reaction

Increased affect-arousal in VR can be detected from faster body motion with increased heart rate

We instrumented an immersive VR platform with physiological (heart rate and electrodermal activity) sensors to investigate the use of movement data and physiological data to automatically detect changes in affect (emotional state). Twelve users were asked to complete four blocks of tasks requiring them to hit moving targets while standing and moving about. One of the four blocks (in counterbalanced order) was designed to be stressful (S), while the other blocks were designed to be calm (C). The motions required of the users were the same in both conditions; only the visual and audio feedback differed across the S and C conditions. Users' self-scored arousal in the S condition was significantly higher. We analyzed the recorded motions by segmenting out 2747 "fast motions", i.e., intervals of time where the sum of the speeds of the hands was above a threshold. A simple machine learning algorithm (a decision tree) could learn to classify these fast motions as either calm or stressed, with ≈80% accuracy, using only two features: the maximum speed achieved during the motion, and the heart rate at the moment of maximum speed, where both features were normalized. If only the maximum-speed feature is used (i.e., with no physiological data), ≈70% accuracy is achieved.
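The classifier itself is tiny; the sketch below reproduces its shape on synthetic two-feature data (the numbers are fabricated stand-ins purely to show the model, not the study's measurements).

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 200
# features: [normalized max hand speed, normalized HR at that moment]
calm = rng.normal([0.4, 0.4], 0.15, size=(n, 2))
stress = rng.normal([0.7, 0.7], 0.15, size=(n, 2))
X = np.vstack([calm, stress])
y = np.array([0] * n + [1] * n)

clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(clf.score(X, y))  # separable toy data; the study reported ~80%
```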

SESSION: "Divers"

Dendry: a procedural model for dendritic patterns

We introduce Dendry, a procedural function that generates dendritic patterns and is locally computable. The function is controlled by parameters such as the level of branching, the degree of local smoothing, random seeding and local disturbance parameters, and the range of the branching angles. It is also controlled by a global control function that defines the overall shape and can be used, for example, to initialize local minima. The algorithm returns the distance to a tree structure which is implicitly constructed on the fly, while requiring a small memory footprint. The evaluation can be performed in parallel for multiple points and scales linearly with the number of cores. We demonstrate an application of our model to the generation of terrain heightfields with consistent river networks. A quad-core implementation of our algorithm takes about ten seconds for a 512 × 512 resolution grid on the CPU.
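A heavily simplified, locally computable flavour of the idea: jittered grid points at each level connect to a parent point one level coarser, and the function returns the distance to those segments without constructing anything ahead of time. Dendry's actual construction and control parameters are much richer than this sketch.

```python
import math

def jitter(ix, iy, level, seed=7):
    """Deterministic per-cell jitter from a hash, so evaluation needs
    no stored tree."""
    h = (ix * 73856093 ^ iy * 19349663 ^ level * 83492791 ^ seed) & 0xFFFFFFFF
    fx = ((h * 2654435761) & 0xFFFF) / 65535.0
    fy = ((h * 40503) & 0xFFFF) / 65535.0
    return fx, fy

def node(ix, iy, level):
    cell = 1.0 / (2 ** level)
    fx, fy = jitter(ix, iy, level)
    return ((ix + fx) * cell, (iy + fy) * cell)

def seg_dist(p, a, b):
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy + 1e-12)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def dendry(px, py, levels=3):
    """Distance from (px, py) to the segments linking each level's
    containing-cell node to its parent node one level coarser."""
    d = float("inf")
    for level in range(1, levels + 1):
        res = 2 ** level
        ix, iy = int(px * res), int(py * res)
        child = node(ix, iy, level)
        parent = node(ix // 2, iy // 2, level - 1)
        d = min(d, seg_dist((px, py), child, parent))
    return d

print(round(dendry(0.3, 0.6), 4))
```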