Diffractive optical elements (DOEs) have recently drawn great attention in computational imaging because they can drastically reduce the size and weight of imaging devices compared to their refractive counterparts. However, the inherent strong dispersion is a tremendous obstacle that limits the use of DOEs in full spectrum imaging, causing unacceptable loss of color fidelity in the images. In particular, metamerism introduces a data dependency in the image blur, which has been neglected in computational imaging methods so far. We introduce both a diffractive achromat based on computational optimization and a corresponding algorithm for correction of residual aberrations. Using this approach, we demonstrate high fidelity color diffractive-only imaging over the full visible spectrum. In the optical design, the height profile of a diffractive lens is optimized to balance the focusing contributions of different wavelengths for a specific focal length. The spectral point spread functions (PSFs) become nearly identical to each other, creating approximately spectrally invariant blur kernels. This property guarantees good color preservation in the captured image and facilitates the correction of residual aberrations in our fast two-step deconvolution without additional color priors. We demonstrate our diffractive achromat design on a 0.5 mm ultrathin substrate using photolithography techniques. Experimental results show that our achromatic diffractive lens produces high color fidelity and better image quality in the full visible spectrum.
We present a practical framework for reproducing omnidirectional incident illumination conditions with complex spectra using a light stage with multispectral LED lights. For lighting acquisition, we augment standard RGB panoramic photography with one or more observations of a color chart with numerous reflectance spectra. We then solve for how to drive the multispectral light sources so that they best reproduce the appearance of the color charts in the original lighting. Even when solving for non-negative intensities, we show that accurate lighting reproduction is achievable using just four or six distinct LED spectra for a wide range of incident illumination spectra. A significant benefit of our approach is that it does not require the use of specialized equipment (other than the light stage) such as monochromators, spectroradiometers, or explicit knowledge of the LED power spectra, camera spectral response functions, or color chart reflectance spectra. We describe two simple devices for multispectral lighting capture, one for slow measurements of detailed angular spectral detail, and one for fast measurements with coarse angular detail. We validate the approach by realistically compositing real subjects into acquired lighting environments, showing accurate matches to how the subject would actually look within the environments, even for those including complex multispectral illumination. We also demonstrate dynamic lighting capture and playback using the technique.
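The step of solving for non-negative LED intensities can be sketched as a small non-negative least-squares problem. This is a minimal illustration, not the authors' solver: the matrix layout (one row per observed color chart value, one column per LED spectrum), the function name `solve_nonnegative_led_weights`, and the choice of projected gradient descent are all assumptions made for the example.

```python
def solve_nonnegative_led_weights(A, b, iterations=5000):
    """Minimize 0.5 * ||A x - b||^2 subject to x >= 0 by projected gradient descent.

    A[i][j]: hypothetical chart observation i when only LED j is lit at unit intensity.
    b[i]:    the same chart observation recorded in the target lighting.
    """
    rows, cols = len(A), len(A[0])
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so 1 / that value is a safe (if conservative) step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [sum(A[i][j] * x[j] for j in range(cols)) - b[i]
                    for i in range(rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(rows))
                for j in range(cols)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(cols)]
    return x


if __name__ == "__main__":
    chart_per_led = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    chart_in_target_light = [0.5, 2.0, 2.5]
    print(solve_nonnegative_led_weights(chart_per_led, chart_in_target_light))
```

Clamping negative entries to zero after every step keeps the intensities physically realizable, since an LED cannot emit negative light.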
Depth cameras are a ubiquitous technology used in a wide range of applications, including robotic and machine vision, human-computer interaction, autonomous vehicles as well as augmented and virtual reality. In this paper, we explore the design and applications of phased multi-camera time-of-flight (ToF) systems. We develop a reproducible hardware system that allows for the exposure times and waveforms of up to three cameras to be synchronized. Using this system, we analyze waveform interference between multiple light sources in ToF applications and propose simple solutions to this problem. Building on the concept of orthogonal frequency design, we demonstrate state-of-the-art results for instantaneous radial velocity capture via Doppler time-of-flight imaging and we explore new directions for optically probing global illumination, for example by de-scattering dynamic scenes and by non-line-of-sight motion detection via frequency gating.
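The idea behind orthogonal frequencies can be illustrated with the textbook property that sinusoids completing different whole numbers of cycles within one exposure integrate to zero against each other. The sketch below only demonstrates that property; the function name `correlate` and the sampling parameters are assumptions, and the actual waveform design of the system is not described in this abstract.

```python
import math


def correlate(k1, k2, exposure=1.0, samples=10000):
    """Midpoint-rule integral of sin(2*pi*f1*t) * sin(2*pi*f2*t) over one exposure.

    The frequencies are f1 = k1 / exposure and f2 = k2 / exposure, so integer
    k1 and k2 mean each waveform completes a whole number of cycles.
    """
    dt = exposure / samples
    total = 0.0
    for n in range(samples):
        t = (n + 0.5) * dt
        total += (math.sin(2.0 * math.pi * k1 * t / exposure)
                  * math.sin(2.0 * math.pi * k2 * t / exposure))
    return total * dt


if __name__ == "__main__":
    print(correlate(3, 5))  # different frequencies: close to zero
    print(correlate(4, 4))  # same frequency: close to exposure / 2
```

Under this assumption, a camera demodulating at its own frequency would see the contribution of a source modulated at an orthogonal frequency average out over the exposure.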
Physics-based animation is often used to animate scenes containing destruction of near-rigid, man-made materials. For these applications, the most important visual features are plastic deformation and fracture. Methods based on continuum mechanics model these materials as elastoplastic, and must perform expensive elasticity computations even though elastic deformations are imperceptibly small for rigid materials. We introduce an example-based plasticity model based on linear blend skinning that allows artists to author simulation objects using familiar tools. Dynamics are computed using an unmodified rigid body simulator, making our method computationally efficient and easy to integrate into existing pipelines. We introduce a flexible technique for mapping impulses computed by the rigid body solver to local, example-based deformations. For completeness, our method also supports prescoring-based fracture. We demonstrate the practicality of our method by animating a variety of destructive scenes.
We enrich character animations with secondary soft-tissue Finite Element Method (FEM) dynamics computed under arbitrary rigged or skeletal motion. Our method optionally incorporates pose-space deformation (PSD). It runs at milliseconds per frame for complex characters, and fits directly into standard character animation pipelines. Our simulation method does not require any skin data capture; hence, it can be applied to humans, animals, and arbitrary (real-world or fictional) characters. In standard model reduction of three-dimensional nonlinear solid elastic models, one builds a reduced model around a single pose, typically the rest configuration. We demonstrate how to perform multi-model reduction of Finite Element Method (FEM) nonlinear elasticity, where separate reduced models are precomputed around a representative set of object poses, and then combined at runtime into a single fast dynamic system, using subspace interpolation. While time-varying reduction has been demonstrated before for offline applications, our method is fast and suitable for hard real-time applications in games and virtual reality. Our method supports self-contact, which we achieve by computing linear modes and derivatives under contact constraints.
Dynamic skin deformation is vital for creating life-like characters, and its real-time computation is in great demand in interactive applications. We propose a practical method to synthesize plausible and dynamic skin deformation based on a helper bone rig. This method builds helper bone controllers for the deformations caused not only by skeleton poses but also secondary dynamics effects. We introduce a state-space model for a discrete time linear time-invariant system that efficiently maps the skeleton motion to the dynamic movement of the helper bones. Optimal transfer of nonlinear, complicated deformations, including the effect of soft-tissue dynamics, is obtained by learning the training sequence consisting of skeleton motions and corresponding skin deformations. Our approximation method for a dynamics model is highly accurate and efficient owing to its low-rank property obtained by a sparsity-oriented nuclear norm optimization. The resulting linear model is simple enough to easily implement in the existing workflows and graphics pipelines. We demonstrate the superior performance of our method compared to conventional dynamic skinning in terms of computational efficiency including LOD controls, stability in interactive controls, and flexible expression in deformations.
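A discrete-time linear time-invariant state-space model has the standard form x[k+1] = A x[k] + B u[k], y[k] = C x[k] + D u[k]. The sketch below steps such a system, reading u as skeleton motion features and y as helper bone parameters. The matrices here are hand-picked toy values, whereas in the method they are learned from training sequences; the function names are assumptions for illustration.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vec_add(a, b):
    """Add two vectors element-wise."""
    return [x + y for x, y in zip(a, b)]


def simulate_lti(A, B, C, D, inputs, x0):
    """Return the outputs y[k] = C x[k] + D u[k] while advancing x[k+1] = A x[k] + B u[k]."""
    x = list(x0)
    outputs = []
    for u in inputs:
        outputs.append(vec_add(mat_vec(C, x), mat_vec(D, u)))
        x = vec_add(mat_vec(A, x), mat_vec(B, u))
    return outputs


if __name__ == "__main__":
    # One hidden state that decays by half each frame: an impulse in the input
    # produces an output that keeps fading after the input has stopped.
    print(simulate_lti([[0.5]], [[1.0]], [[1.0]], [[0.0]],
                       [[1.0], [0.0], [0.0]], [0.0]))
```

The hidden state x is what lets the output keep moving after the skeleton input stops, which is the kind of lagging secondary motion a purely pose-driven mapping cannot express.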
Skinning algorithms that work across a broad range of character designs and poses are crucial to creating compelling animations. Currently, linear blend skinning (LBS) and dual quaternion skinning (DQS) are the most widely used, especially for real-time applications. Both techniques are efficient to compute and are effective for many purposes. However, they also have many well-known artifacts, such as collapsing elbows, candy wrapper twists, and bulging around the joints. Due to the popularity of LBS and DQS, it would be of great benefit to reduce these artifacts without changing the animation pipeline or increasing the computational cost significantly. In this paper, we introduce a new direct skinning method that addresses this problem. Our key idea is to pre-compute the optimized center of rotation for each vertex from the rest pose and skinning weights. At runtime, these centers of rotation are used to interpolate the rigid transformation for each vertex. Compared to other direct skinning methods, our method significantly reduces the artifacts of LBS and DQS while maintaining real-time performance and backwards compatibility with the animation pipeline.
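The collapsing artifact of linear blend skinning can be reproduced in a few lines. This 2D sketch illustrates plain LBS only, not the optimized center-of-rotation method; the helper names and the two-bone setup are assumptions chosen for the example.

```python
import math


def rotation_about(center, angle):
    """Return a function that rotates a 2D point by `angle` radians about `center`."""
    cx, cy = center
    c, s = math.cos(angle), math.sin(angle)

    def apply(p):
        x, y = p[0] - cx, p[1] - cy
        return (cx + c * x - s * y, cy + s * x + c * y)

    return apply


def linear_blend_skinning(vertex, transforms, weights):
    """Blend the transformed positions of `vertex` using the skinning weights."""
    moved = [t(vertex) for t in transforms]
    x = sum(w * p[0] for w, p in zip(weights, moved))
    y = sum(w * p[1] for w, p in zip(weights, moved))
    return (x, y)


if __name__ == "__main__":
    joint = (0.0, 0.0)
    rest_vertex = (1.0, 0.0)  # one unit away from the joint
    bones = [rotation_about(joint, 0.0), rotation_about(joint, math.pi)]
    # With equal weights and a 180 degree twist, the vertex lands on the joint.
    print(linear_blend_skinning(rest_vertex, bones, [0.5, 0.5]))
```

Because LBS averages positions linearly, the blended vertex loses its distance to the joint. Interpolating a rigid transformation about a per-vertex center of rotation, as the abstract describes, is aimed at exactly this failure.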
While playing a fundamental role in shape understanding, the medial axis is known to be sensitive to small boundary perturbations. Methods for pruning the medial axis are usually guided by some measure of significance. The majority of significance measures over the medial axes of 3D shapes are locally defined and hence unable to capture the scale of features. We introduce a global significance measure that generalizes to 3D the classical Erosion Thickness (ET) measure over the medial axes of 2D shapes. We give a precise definition of ET in 3D, analyze its properties, and present an efficient approximation algorithm with bounded error on a piecewise linear medial axis. Experiments show that ET outperforms local measures in differentiating small boundary noise from prominent shape features, and it is significantly faster to compute than existing global measures. We demonstrate the utility of ET in extracting clean, shape-revealing and topology-preserving skeletons of 3D shapes.
Many high-level geometry processing tasks rely on low-level constructive solid geometry operations. Though trivial for implicit representations, boolean operations are notoriously difficult to execute robustly for explicit boundary representations. Existing methods for 3D triangle meshes fall short in one way or another. Some methods are fast but fail to produce closed, self-intersection free output. Other methods are robust but place prohibitively strict assumptions on the input, e.g., no hollow cavities, non-manifold edges or self-intersections. We propose a systematic recipe for conducting a family of exact constructive solid geometry operations. The two-stage method makes no general position assumptions and does not resort to numerical perturbation. The method is variadic, operating on any number of input meshes. This generalizes unary mesh-repair operations, classic binary boolean differencing, and n-ary operations such as finding all regions inside at least k out of n inputs. We demonstrate the superior effectiveness and robustness of our method on a dataset of 10,000 "real-world" meshes from a popular online repository. To encourage development, validation, and comparison, we release both our code and dataset to the public.
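The variadic view of boolean operations can be sketched on implicit solids, where the abstract notes such operations are trivial: each operation is a predicate on the list of per-input inside/outside flags. The sphere primitive and all function names below are assumptions for illustration. The method itself works on explicit triangle meshes with exact arithmetic, which this sketch does not attempt.

```python
def sphere(center, radius):
    """Return an inside test for a solid sphere (a stand-in for an input solid)."""
    def contains(point):
        return sum((a - b) ** 2 for a, b in zip(point, center)) <= radius ** 2
    return contains


def union(flags):
    return any(flags)


def intersection(flags):
    return all(flags)


def difference(flags):
    """Inside the first input and outside every other input."""
    return flags[0] and not any(flags[1:])


def at_least(k):
    """Build the n-ary operation 'inside at least k of the inputs'."""
    return lambda flags: sum(flags) >= k


def point_in_result(point, solids, operation):
    """Label the point against every input, then apply the operation to the labels."""
    return operation([solid(point) for solid in solids])


if __name__ == "__main__":
    solids = [sphere((0, 0, 0), 1.0), sphere((1, 0, 0), 1.0), sphere((0, 1, 0), 1.0)]
    print(point_in_result((0.5, 0.5, 0.0), solids, at_least(2)))
    print(point_in_result((-0.5, 0.0, 0.0), solids, difference))
```

Expressing every operation as a function of the same flag vector is what makes unary, binary, and n-ary cases instances of one scheme.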
In this paper, we propose a new adaptive rendering method to improve the performance of Monte Carlo ray tracing, by reducing noise contained in rendered images while preserving high-frequency edges. Our method locally approximates an image with polynomial functions and the optimal order of each polynomial function is estimated so that our reconstruction error can be minimized. To robustly estimate the optimal order, we propose a multi-stage error estimation process that iteratively estimates our reconstruction error. In addition, we present an energy-preserving outlier removal technique to remove spike noise without causing noticeable energy loss in our reconstruction result. Also, we adaptively allocate additional ray samples to high error regions guided by our error estimation. We demonstrate that our approach outperforms state-of-the-art methods by controlling the tradeoff between reconstruction bias and variance through locally defining our polynomial order, without the need for filtering bandwidth optimization, the common approach of other recent methods.
In this paper, we show that applying a linear transformation---represented by a 3 x 3 matrix---to the direction vectors of a spherical distribution yields another spherical distribution, for which we derive a closed-form expression. With this idea, we can use any spherical distribution as a base shape to create a new family of spherical distributions with parametric roughness, elliptic anisotropy and skewness. If the original distribution has an analytic expression, normalization, integration over spherical polygons, and importance sampling, then these properties are inherited by the linearly transformed distributions.
By choosing a clamped cosine for the original distribution we obtain a family of distributions, which we call Linearly Transformed Cosines (LTCs), that provide a good approximation to physically based BRDFs and that can be analytically integrated over arbitrary spherical polygons. We show how to use these properties in a realtime polygonal-light shading application. Our technique is robust, fast, accurate and simple to implement.
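A linearly transformed distribution can be evaluated by mapping the query direction back through the inverse matrix and multiplying by the Jacobian of that mapping, D(w) = D_o(M^-1 w / |M^-1 w|) * |det M^-1| / |M^-1 w|^3. The sketch below applies this change of variables to a clamped cosine and checks numerically that the result still integrates to one. All function names, the example matrix, and the quadrature resolution are assumptions for illustration; this is not the realtime shading code.

```python
import math


def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inverse3(m):
    """Invert a 3x3 matrix via its adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    adjugate = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[value / det for value in row] for row in adjugate]


def clamped_cosine(w):
    """Original distribution: cosine lobe about +z, normalized over the sphere."""
    return max(0.0, w[2]) / math.pi


def make_ltc(m):
    """Return the density of the clamped cosine transformed by the matrix m."""
    m_inv = inverse3(m)
    det_inv = abs(det3(m_inv))

    def density(w):
        v = [sum(r * x for r, x in zip(row, w)) for row in m_inv]
        length = math.sqrt(sum(x * x for x in v))
        w_o = [x / length for x in v]          # direction in the original space
        return clamped_cosine(w_o) * det_inv / length ** 3

    return density


def integrate_over_sphere(density, n_theta=100, n_phi=200):
    """Midpoint quadrature of a density over the unit sphere."""
    d_theta, d_phi = math.pi / n_theta, 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            w = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            total += density(w) * math.sin(theta)
    return total * d_theta * d_phi


if __name__ == "__main__":
    skewed = make_ltc([[0.5, 0.0, 0.2], [0.0, 0.7, 0.0], [0.0, 0.0, 1.0]])
    print(integrate_over_sphere(skewed))  # should stay close to 1
```

Scaling the x and y columns narrows the lobe (roughness and anisotropy), while the off-diagonal entry tilts it (skewness).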
While Russian roulette (RR) and splitting are considered fundamental importance sampling techniques in neutron transport simulations, they have so far received relatively little attention in light transport. In computer graphics, RR and splitting are most often based solely on local reflectance properties. However, this strategy can be far from optimal in common scenes with non-uniform light distribution as it does not accurately predict the actual path contribution. In our approach, as in neutron transport, we estimate the expected contribution of a path as the product of the path weight and a pre-computed estimate of the adjoint transport solution. We use this estimate to generate a so-called weight window, which keeps the path contribution roughly constant through RR and splitting. As a result, paths in unimportant regions tend to be terminated early while in the more important regions they are spawned by splitting. This results in substantial variance reduction in both path tracing and photon tracing-based simulations. Furthermore, unlike the standard computer graphics RR, our approach does not interfere with importance-driven sampling of scattering directions, which results in superior convergence when such a technique is combined with our approach. We provide a justification of this behavior by relating our approach to the zero-variance random walk theory.
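A weight window can be sketched as an interval of acceptable path weights derived from the adjoint estimate: paths below it play Russian roulette, paths above it are split. The window shape used here (centered at `target / adjoint_estimate`, upper bound five times the lower bound) and the function names are assumptions for illustration and are not taken from the abstract.

```python
import math
import random


def weight_window(adjoint_estimate, target, ratio=5.0):
    """Return (lower, upper) weight bounds so that weight * adjoint stays near target."""
    center = target / adjoint_estimate
    lower = 2.0 * center / (1.0 + ratio)
    return lower, ratio * lower


def roulette_or_split(weight, window, u=None):
    """Return the list of path weights that continue after RR or splitting.

    `u` is a uniform random number in [0, 1); it is drawn here when omitted.
    """
    lower, upper = window
    if weight < lower:
        # Russian roulette: survive with probability weight / lower and carry
        # weight `lower`, so the expected weight is unchanged.
        if u is None:
            u = random.random()
        return [lower] if u < weight / lower else []
    if weight > upper:
        # Splitting: share the weight equally among enough copies to fit the window.
        copies = math.ceil(weight / upper)
        return [weight / copies] * copies
    return [weight]


if __name__ == "__main__":
    window = weight_window(adjoint_estimate=2.0, target=1.0)
    print(window)
    print(roulette_or_split(2.0, window))   # heavy path: split into copies
    print(roulette_or_split(0.05, window))  # light path: terminated or boosted
```

Both branches preserve the expected total weight, which keeps the estimator unbiased while pulling individual path contributions toward the target.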
We propose a method to fabricate textured 3D models using thermoforming. Unlike industrial techniques, which target mass production of a specific shape, we propose a combined hardware and software solution to manufacture customized, unique objects. Our method simulates the forming process and converts the texture of a given digital 3D model into a pre-distorted image that we transfer onto a plastic sheet. During thermoforming, the sheet deforms to create a faithful physical replica of the digital model. Our hardware setup uses off-the-shelf components and can be calibrated with an automatic algorithm that extracts the simulation parameters from a single calibration object produced by the same process.
Microstructures at the scale of tens of microns change the physical properties of objects, making them lighter or more flexible. While traditionally difficult to produce, additive manufacturing now lets us physically realize such microstructures at low cost.
In this paper we propose to study procedural, aperiodic microstructures inspired by Voronoi open-cell foams. The absence of regularity affords a simple approach to grade the foam geometry --- and thus its mechanical properties --- within a target object and its surface. Rather than requiring a global optimization process, the microstructures are directly generated to exhibit a specified elastic behavior. The implicit evaluation is akin to procedural textures in computer graphics, and locally adapts to follow the elasticity field. This allows very detailed structures to be generated in large objects without having to explicitly produce a full representation --- mesh or voxels --- of the complete object: the structures are added on the fly, just before each object slice is manufactured.
We study the elastic behavior of the microstructures and provide a complete description of the procedure generating them. We explain how to determine the geometric parameters of the microstructures from a target elasticity, and evaluate the result on printed samples. Finally, we apply our approach to the fabrication of objects with spatially varying elasticity, including the implicit modeling of a frame following the object surface and seamlessly connecting to the microstructures.
This paper presents CofiFab, a coarse-to-fine 3D fabrication solution, combining 3D printing and 2D laser cutting to fabricate large objects at lower cost and higher speed. Our key approach is to first build coarse internal base structures within the given 3D object using laser cutting, and then attach thin 3D-printed parts, as an external shell, onto the base to recover the fine surface details. CofiFab achieves this with three novel algorithmic components. First, we formulate an optimization model to compute fabricatable polyhedrons of maximized volume, as the geometry of the internal base. Second, we devise a new interlocking scheme to tightly connect the laser-cut parts into a strong internal base, by iteratively building a network of nonorthogonal joints and interlocking parts around polyhedral corners. Lastly, we optimize the partitioning of the external object shell into 3D-printable parts, while saving support material and avoiding overhangs. Besides cost saving, these components also consider aesthetics, stability and balancing. Hence, CofiFab can efficiently produce large objects by assembly. To evaluate CofiFab, we fabricate objects of varying shapes and sizes, and show that CofiFab can significantly outperform previous methods.
As humans, we regularly interpret scenes based on how objects are related, rather than based on the objects themselves. For example, we see a person riding an object X or a plank bridging two objects. Current methods provide limited support to search for content based on such relations. We present
We introduce a co-analysis method which learns a functionality model for an object category, e.g., strollers or backpacks. Like previous works on functionality, we analyze object-to-object interactions and intra-object properties and relations. Unlike previous works, our model goes beyond providing a functionality-oriented descriptor for a single object; it prototypes the functionality of a category of 3D objects by co-analyzing typical interactions involving objects from the category. Furthermore, our co-analysis localizes the studied properties to the specific locations, or surface patches, that support specific functionalities, and then integrates the patch-level properties into a category functionality model. Thus our model focuses on the how, via common interactions, and where, via patch localization, of functionality analysis.
Given a collection of 3D objects belonging to the same category, with each object provided within a scene context, our co-analysis yields a set of proto-patches, each of which is a patch prototype supporting a specific type of interaction, e.g., stroller handle held by hand. The learned category functionality model is composed of proto-patches, along with their pairwise relations, which together summarize the functional properties of all the patches that appear in the input object category. With the learned functionality models for various object categories serving as a knowledge base, we are able to form a functional understanding of an individual 3D object, without a scene context. With patch localization in the model, functionality-aware modeling, e.g., functional object enhancement and the creation of functional object hybrids, is made possible.
Patterns play a central role in 2D graphic design. A critical step in the design of patterns is evaluating multiple design alternatives. Exploring these alternatives with existing tools is challenging because most tools force users to work with a single fixed representation of the pattern that encodes a specific set of geometric relationships between pattern elements. However, for most patterns, there are many different interpretations of their regularity that correspond to different design variations. The exponential nature of this variation space makes the problem of finding all variations intractable. We present a method called PATEX to characterize and efficiently identify distinct and valid pattern variations, allowing users to directly navigate the variation space. Technically, we propose a novel linear approximation to handle the complexity of the problem and efficiently enumerate suitable pattern variations under proposed element movements. We also present two pattern editing interfaces that expose the detected pattern variations as suggested edits to the user. We show a diverse collection of pattern edits and variations created with PATEX. The results from our user study indicate that our suggested variations can be useful and inspirational for typical pattern editing tasks.
Industrial knitting machines can produce finely detailed, seamless, 3D surfaces quickly and without human intervention. However, the tools used to program them require detailed manipulation and understanding of low-level knitting operations. We present a compiler that can automatically turn assemblies of high-level shape primitives (tubes, sheets) into low-level machine instructions. These high-level shape primitives allow knit objects to be scheduled, scaled, and otherwise shaped in ways that require thousands of edits to low-level instructions. At the core of our compiler is a heuristic transfer planning algorithm for knit cycles, which we prove is both sound and complete. This algorithm enables the translation of high-level shaping and scheduling operations into needle-level operations. We show a wide range of examples produced with our compiler and demonstrate a basic visual design interface that uses our compiler as a backend.
Designers frequently reuse existing designs as a starting point for creating new garments. In order to apply garment modifications, which the designer envisions in 3D, existing tools require meticulous manual editing of 2D patterns. These 2D edits need to account both for the envisioned geometric changes in the 3D shape and for various physical factors that affect the look of the draped garment. We propose a new framework that allows designers to directly apply the changes they envision in 3D space, and creates the 2D patterns that replicate this envisioned target geometry when lifted into 3D via a physical draping simulation. Our framework removes the need for laborious and knowledge-intensive manual 2D edits and allows users to effortlessly mix existing garment designs as well as adjust for garment length and fit. Following each user-specified editing operation we first compute a target 3D garment shape, one that maximally preserves the input garment's style (its proportions, fit and shape) subject to the modifications specified by the user. We then automatically compute 2D patterns that recreate the target garment shape when draped around the input mannequin within a user-selected simulation environment. To generate these patterns, we propose a fixed-point optimization scheme that compensates for the deformation due to the physical forces affecting the drape and is independent of the underlying simulation tool used. Our experiments show that this method quickly and reliably converges to patterns that, under simulation, form the desired target look, and works well with different black-box physical simulators. We demonstrate a range of edited and resimulated garments, and further validate our approach via expert and amateur critique, and comparisons to alternative solutions.
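The fixed-point idea of compensating for simulated deformation with a black-box simulator can be sketched as follows. The additive update rule, the function name `fit_pattern`, and the toy `toy_drape` simulator are assumptions for illustration, since the abstract does not spell out the actual update. Such an iteration converges only when the simulator behaves close to an identity map.

```python
def fit_pattern(simulate, target, initial, iterations=50, tolerance=1e-9):
    """Adjust pattern measurements until the simulated drape matches the target.

    `simulate` is treated as a black box mapping pattern values to draped values.
    """
    pattern = list(initial)
    for _ in range(iterations):
        draped = simulate(pattern)
        error = [t - d for t, d in zip(target, draped)]
        if max(abs(e) for e in error) < tolerance:
            break
        # Push each pattern value by the amount the drape missed the target.
        pattern = [p + e for p, e in zip(pattern, error)]
    return pattern


def toy_drape(pattern):
    """Hypothetical simulator: cloth stretches by 10% and sags by a fixed offset."""
    return [1.1 * p + 0.02 for p in pattern]


if __name__ == "__main__":
    target_lengths = [1.0, 2.0]
    pattern = fit_pattern(toy_drape, target_lengths, initial=target_lengths)
    print(pattern)
    print(toy_drape(pattern))  # should reproduce the target lengths
```

Since the loop only calls `simulate` and never inspects its internals, the same driver works with any simulator, which reflects the simulator independence the abstract emphasizes.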
Fabrics play a significant role in many applications in design, prototyping, and entertainment. Recent fiber-based models capture the rich visual appearance of fabrics, but are too onerous to design and edit. Yarn-based procedural models are powerful and convenient, but too regular and not realistic enough in appearance. In this paper, we introduce an automatic fitting approach to create high-quality procedural yarn models of fabrics with fiber-level details. We fit CT data to procedural models to automatically recover a full range of parameters, and augment the models with a measurement-based model of flyaway fibers. We validate our fabric models against CT measurements and photographs, and demonstrate the utility of this approach for fabric modeling and editing.
While the concept of visual saliency has been previously explored in the areas of mesh and image processing, saliency detection also applies to other sensory stimuli. In this paper, we explore the problem of tactile mesh saliency, where we define salient points on a virtual mesh as those that a human is more likely to grasp, press, or touch if the mesh were a real-world object. We solve the problem of taking as input a 3D mesh and computing the relative tactile saliency of every mesh vertex. Since it is difficult to manually define a tactile saliency measure, we introduce a crowdsourcing and learning framework. It is typically easy for humans to provide relative rankings of saliency between vertices rather than absolute values. We thereby collect crowdsourced data of such relative rankings and take a learning-to-rank approach. We develop a new formulation to combine deep learning and learning-to-rank methods to compute a tactile saliency measure. We demonstrate our framework with a variety of 3D meshes and various applications including material suggestion for rendering and fabrication.
A typical crowd engine pipeline animates numerous moving characters according to a two-step process: global trajectories are generated by a crowd simulator, whereas full body motions are generated by animation engines. Because interactions are only considered at the first stage, animations sometimes lead to residual collisions and/or characters walking as if they were alone, showing no sign of the influence of others. In this paper, we investigate the value of adding shoulder motions to characters passing at close distances on the perceived visual quality of crowd animations (i.e., perceived residual collisions and animation naturalness). We present two successive perceptual experiments exploring this question where we investigate first, local interactions between two isolated characters, and second, crowd scenarios. The first experiment shows that shoulder motions have a strong positive effect on both perceived residual collisions and animation naturalness. The second experiment demonstrates that the effect of shoulder motions on animation naturalness is preserved in the context of crowd scenarios, even though the complexity of the scene is greatly increased. Our general conclusion is that adding secondary motions in character interactions has a significant impact on the visual quality of crowd animations, with a very light impact on the computational cost of the whole animation pipeline. Our results advance crowd animation techniques by enhancing the simulation of complex interactions between crowd characters with simple secondary motion triggering techniques.
Realistic, metrically accurate, 3D human avatars are useful for games, shopping, virtual reality, and health applications. Such avatars are not in wide use because solutions for creating them from high-end scanners, low-cost range cameras, and tailoring measurements all have limitations. Here we propose a simple solution and show that it is surprisingly accurate. We use crowdsourcing to generate attribute ratings of 3D body shapes corresponding to standard linguistic descriptions of 3D shape. We then learn a linear function relating these ratings to 3D human shape parameters. Given an image of a new body, we again turn to the crowd for ratings of the body shape. The collection of linguistic ratings of a photograph provides remarkably strong constraints on the metric 3D shape. We call the process crowdshaping and show that our Body Talk system produces shapes that are perceptually indistinguishable from bodies created from high-resolution scans and that the metric accuracy is sufficient for many tasks. This makes body "scanning" practical without a scanner, opening up new applications including database search, visualization, and extracting avatars from books.
Everyone, from a shopper buying shoes to a doctor palpating a growth, uses their sense of touch to learn about the world. 3D printing is a powerful technology because it gives us the ability to control the haptic impression an object creates. This is critical for both replicating existing, real-world constructs and designing novel ones. However, each 3D printer has different capabilities and supports different materials, leaving us to ask: How can we best replicate a given haptic result on a particular output device? In this work, we address the problem of mapping a real-world material to its nearest 3D printable counterpart by constructing a perceptual model for the compliance of nonlinearly elastic objects. We begin by building a perceptual space from experimentally obtained user comparisons of twelve 3D-printed metamaterials. By comparing this space to a number of hypothetical computational models, we identify those that can be used to accurately and efficiently evaluate human-perceived differences in nonlinear stiffness. Furthermore, we demonstrate how such models can be applied to complex geometries in an interaction-aware way where the compliance is influenced not only by the material properties from which the object is made but also its geometry. We demonstrate several applications of our method in the context of fabrication and evaluate them in a series of user experiments.
Specular BRDF rendering traditionally approximates surface microstructure using a smooth normal distribution, but this ignores glinty effects, easily observable in the real world. While modeling the actual surface microstructure is possible, the resulting rendering problem is prohibitively expensive. Recently, Yan et al. and Jakob et al. made progress on this problem, but their approaches are still expensive and lack full generality in their material and illumination support. We introduce an efficient and general method that can be easily integrated in a standard rendering system. We treat a specular surface as a four-dimensional position-normal distribution, and fit this distribution using millions of 4D Gaussians, which we call elements. This leads to closed-form solutions to the required BRDF evaluation and sampling queries, enabling the first practical solution to rendering specular microstructure.
We introduce a Spatially-Varying BRDF model tailored to the multi-scale rendering of scratched materials such as metals, plastics or finished woods. Our approach takes advantage of the regular structure of scratch distributions to achieve high performance without compromising visual quality. We provide users with controls over the profile, micro-BRDF, density and orientation of scratches, while updating our material model at interactive rates. The BRDF for a single scratch is simulated using an optimized 2D ray-tracer and compactly stored in a three-component 2D texture. In contrast to existing models, our approach takes into account all interreflections inside a scratch, including Fresnel effects. At render time, the SV-BRDF for the scratch distribution under a pixel or ray footprint is obtained by linear combination of individual scratch BRDFs. We show how to evaluate it using both importance and light sampling, in direct and global illumination settings.
Modeling multiple scattering in microfacet theory is considered an important open problem because a non-negligible portion of the energy leaving rough surfaces is due to paths that bounce multiple times. In this paper we derive the missing multiple-scattering components of the popular family of BSDFs based on the Smith microsurface model. Our derivations are based solely on the original assumptions of the Smith model. We validate our BSDFs using raytracing simulations of explicit random Beckmann surfaces.
Our main insight is that the microfacet theory for surfaces with the Smith model can be derived as a special case of the microflake theory for volumes, with additional constraints to enforce the presence of a sharp interface, i.e. to transform the volume into a surface. We derive new free-path distributions and phase functions such that plane-parallel scattering from a microvolume with these distributions exactly produces the BSDF based on the Smith microsurface model, but with the addition of higher-order scattering.
With this new formulation, we derive multiple-scattering micro-facet BSDFs made of either diffuse, conductive, or dielectric material. Our resulting BSDFs are reciprocal, energy conserving, and support popular anisotropic parametric normal distribution functions such as Beckmann and GGX. While we do not provide closed-form expressions for the BSDFs, they are mathematically well-defined and can be evaluated at arbitrary precision. We show how to practically use them with Monte Carlo physically based rendering algorithms by providing analytic importance sampling and unbiased stochastic evaluation. Our implementation is analytic and does not use per-BSDF precomputed data, which makes our BSDFs usable with textured albedos, roughness, and anisotropy.
While 3D movies are gaining popularity, viewers in a 3D cinema still need to wear cumbersome glasses in order to enjoy them. Automultiscopic displays provide a better alternative to the display of 3D content, as they present multiple angular images of the same scene without the need for special eyewear. However, automultiscopic displays cannot be directly implemented in a wide cinema setting due to variants of two main problems: (i) The range of angles at which the screen is observed in a large cinema is usually very wide, and there is an unavoidable tradeoff between the range of angular images supported by the display and its spatial or angular resolutions. (ii) Parallax is usually observed only when a viewer is positioned at a limited range of distances from the screen. This work proposes a new display concept, which supports automultiscopic content in a wide cinema setting. It builds on the typical structure of cinemas, such as the fixed seat positions and the fact that different rows are located on a slope at different heights. Rather than attempting to display many angular images spanning the full range of viewing angles in a wide cinema, our design only displays the narrow angular range observed within the limited width of a single seat. The same narrow range content is then replicated to all rows and seats in the cinema. To achieve this, it uses an optical construction based on two sets of parallax barriers, or lenslets, placed in front of a standard screen. This paper derives the geometry of such a display, analyzes its limitations, and demonstrates a proof-of-concept prototype.
We propose a see-through additive light field display as a novel type of compressive light field display. We utilize holographic optical elements (HOEs) as transparent additive layers. The HOE layers are almost free from diffraction unlike spatial light modulator layers, which makes this additive light field display more advantageous when modifying the number of layers, thickness, and pixel density compared with conventional compressive displays. Meanwhile, the additive light field display maintains advantages of compressive light field displays. The proposed additive light field display shows bright and full-color volumetric images in high definition. In addition, users can view real-world scenes beyond the displays. Hence, we expect that our method can contribute to the realization of augmented reality. Here, we describe implementation of a prototype additive light field display with two additive layers, evaluate the performance of transparent HOE layers, describe several results of display experiments, discuss the diffraction effect of spatial light modulators, and analyze the ability of the additive light field display to express uncorrelated light fields.
When designing trajectories for quadrotor cameras, it is important that the trajectories respect the dynamics and physical limits of quadrotor hardware. We refer to such trajectories as being feasible. In this paper, we introduce a fast and user-friendly algorithm for generating feasible quadrotor camera trajectories. Our algorithm takes as input an infeasible trajectory designed by a user, and produces as output a feasible trajectory that is as similar as possible to the user's input. By design, our algorithm does not change the spatial layout or visual contents of the input trajectory. Instead, our algorithm guarantees the feasibility of the output trajectory by re-timing the input trajectory, perturbing its timing as little as possible while remaining within velocity and control force limits. Our choice to perturb the timing of a shot, while leaving the spatial layout and visual contents of the shot intact, leads to a well-behaved non-convex optimization problem that can be solved at interactive rates.
We implement our algorithm in an open-source tool for designing quadrotor camera shots, where we achieve interactive performance across a wide range of camera trajectories. We demonstrate that our algorithm is between 25x and 45x faster than a spacetime constraints approach implemented using a commercially available solver. As we scale to more finely discretized trajectories, this performance gap widens, with our algorithm outperforming spacetime constraints by between 90x and 180x. Finally, we fly 5 feasible trajectories generated by our algorithm on a real quadrotor camera, producing video footage that is faithful to Google Earth shot previews, even when the trajectories are at the quadrotor's physical limits.
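To make the idea of re-timing concrete, the following is a minimal sketch and not the optimization described above: it keeps the waypoints fixed and only stretches the time spent on segments whose average speed would exceed a velocity limit. It ignores control force limits entirely, and the names `retime`, `points`, `times`, and `v_max` are hypothetical.

```python
import math

def retime(points, times, v_max):
    """Stretch segment durations so no segment exceeds v_max on average.

    points: list of (x, y) waypoints; their positions are never modified.
    times:  increasing timestamps, one per waypoint.
    v_max:  assumed speed limit (illustrative stand-in for hardware limits).
    """
    new_times = [times[0]]
    for i in range(1, len(points)):
        dist = math.dist(points[i - 1], points[i])
        requested = times[i] - times[i - 1]
        # Keep the requested duration unless it would break the speed limit.
        duration = max(requested, dist / v_max)
        new_times.append(new_times[-1] + duration)
    return new_times

# The first segment (length 5 in 1 s) is too fast for v_max = 2.5.
waypoints = [(0.0, 0.0), (3.0, 4.0), (3.0, 5.0)]
feasible_times = retime(waypoints, [0.0, 1.0, 2.0], v_max=2.5)
```

Only the timestamps change in this sketch, which mirrors the design choice of leaving the spatial layout of the shot intact.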
Rotoscoping (cutting out different characters/objects/layers in raw video footage) is a ubiquitous task in modern post-production and represents a significant investment in person-hours. In this work, we study the particular task of professional rotoscoping for high-end, live action movies and propose a new framework that works with roto-artists to accelerate the workflow and improve their productivity. Working with the existing keyframing paradigm, our first contribution is the development of a shape model that is updated as artists add successive keyframes. This model is used to improve the output of traditional interpolation and tracking techniques, reducing the number of keyframes that need to be specified by the artist. Our second contribution is to use the same shape model to provide a new interactive tool that allows an artist to reduce the time spent editing each keyframe. The more keyframes that are edited, the better the interactive tool becomes, accelerating the process and making the artist more efficient without compromising their control. Finally, we also provide a new, professionally rotoscoped dataset that enables truly representative, real-world evaluation of rotoscoping methods. We used this dataset to perform a number of experiments, including an expert study with professional roto-artists, to show, quantitatively, the advantages of our approach.
This paper presents Rich360, a novel system for creating and viewing a 360° panoramic video obtained from multiple cameras placed on a structured rig. Rich360 provides an as-rich-as-possible 360° viewing experience by effectively resolving two issues that occur in the existing pipeline. First, a deformable spherical projection surface is utilized to minimize the parallax from multiple cameras. The surface is deformed spatio-temporally according to the depth constraints estimated from the overlapping video regions. This enables fast and efficient parallax-free stitching independent of the number of views. Next, a non-uniform spherical ray sampling is performed. The density of the sampling varies depending on the importance of the image region. Finally, for interactive viewing, the non-uniformly sampled video is mapped onto a uniform viewing sphere using a UV map. This approach can preserve the richness of the input videos when the resolution of the final 360° panoramic video is smaller than the overall resolution of the input videos, which is the case for most 360° panoramic videos. We show various results from Rich360 to demonstrate the richness of the output video and the advancement in the stitching results.
Real walking offers higher immersive presence for virtual reality (VR) applications than alternative locomotive means such as walking-in-place and external control gadgets, but needs to take into consideration different room sizes, wall shapes, and surrounding objects in the virtual and real worlds. Despite perceptual study of impossible spaces and redirected walking, there are no general methods to match a given pair of virtual and real scenes.
We propose a system to match a given pair of virtual and physical worlds for immersive VR navigation. We first compute a planar map between the virtual and physical floor plans that minimizes angular and distal distortions while conforming to the virtual environment goals and physical environment constraints. Our key idea is to design maps that are globally surjective to allow proper folding of large virtual scenes into smaller real scenes but locally injective to avoid locomotion ambiguity and intersecting virtual objects. From these maps we derive altered rendering to guide user navigation within the physical environment while retaining visual fidelity to the virtual environment. Here, the key idea is to properly warp the virtual world appearance into real world geometry with sufficient quality and performance. We evaluate our method through a formative user study, and demonstrate applications in gaming, architecture walkthrough, and medical imaging.
We extend parametric texture synthesis to capture rich, spatially varying parametric reflectance models from a single image. Our input is a single head-lit flash image of a mostly flat, mostly stationary (textured) surface, and the output is a tile of SVBRDF parameters that reproduce the appearance of the material. No user intervention is required. Our key insight is to make use of a recent, powerful texture descriptor based on deep convolutional neural network statistics for "softly" comparing the model prediction and the exemplars without requiring an explicit point-to-point correspondence between them. This is in contrast to traditional reflectance capture that requires pointwise constraints between inputs and outputs under varying viewing and lighting conditions. Seen through this lens, our method is an indirect algorithm for fitting photorealistic SVBRDFs. The problem is severely ill-posed and non-convex. To guide the optimizer towards desirable solutions, we introduce a soft Fourier-domain prior for encouraging spatial stationarity of the reflectance parameters and their correlations, and a complementary preconditioning technique that enables efficient exploration of such solutions by L-BFGS, a standard non-linear numerical optimizer.
Reality is the most realistic representation. We introduce a material display called ZoeMatrope that can reproduce a variety of materials with high resolution, dynamic range and light field reproducibility by using compositing and animation principles used in a zoetrope and a thaumatrope. With ZoeMatrope, the quality of the material is equivalent to that of real objects and the range of expressible materials is diversified by overlaying a set of base materials in a linear combination. ZoeMatrope is also able to express spatially-varying materials, and even augmented materials such as materials with an alpha channel. In this paper, we propose a method for selecting the optimal material set and determining the weights of the linear combination to reproduce a wide range of target materials properly. We also demonstrate the effectiveness of this approach with the developed system and show the results for various materials.
The visual quality of a motion picture is significantly influenced by the choice of the presentation frame rate. Increasing the frame rate improves the clarity of the image and helps to alleviate many artifacts, such as blur, strobing, flicker, or judder. These benefits, however, come at the price of losing well-established film aesthetics, often referred to as the "cinematic look". Current technology leaves artists with a sparse set of choices, e.g., 24 Hz or 48 Hz, limiting the freedom in adjusting the frame rate to artistic needs, content, and display technology. In this paper, we solve this problem by proposing a novel filtering technique which enables emulating the whole spectrum of presentation frame rates on a single-frame-rate display. The key component of our technique is a set of simple yet powerful filters calibrated and evaluated in psychophysical experiments. By varying their parameters we can achieve an impression of continuously varying presentation frame rate in both the spatial and temporal dimensions. This allows artists to achieve the best balance between the aesthetics and the objective quality of the motion picture. Furthermore, we show how our technique, informed by cinematic guidelines, can adapt to the content and achieve this balance automatically.
Producing a high quality stereoscopic impression on current displays is a challenging task. The content has to be carefully prepared in order to maintain visual comfort, which typically affects the quality of depth reproduction. In this work, we show that this problem can be significantly alleviated when the eye fixation regions can be roughly estimated. We propose a new method for stereoscopic depth adjustment that utilizes eye tracking or other gaze prediction information. The key idea that distinguishes our approach from the previous work is to apply gradual depth adjustments at the eye fixation stage, so that they remain unnoticeable. To this end, we measure the limits imposed on the speed of disparity changes in various depth adjustment scenarios, and formulate a new model that can guide such seamless stereoscopic content processing. Based on this model, we propose a real-time controller that applies local manipulations to stereoscopic content to find the optimum between depth reproduction and visual comfort. We show that the controller is mostly immune to the limitations of low-cost eye tracking solutions. We also demonstrate benefits of our model in off-line applications, such as stereoscopic movie production, where skillful directors can reliably guide and predict viewers' attention or where attended image regions are identified during eye tracking sessions. We validate both our model and the controller in a series of user experiments. They show significant improvements in depth perception without sacrificing the visual quality when our techniques are applied.
This paper proposes a multi-view display using a digital light processing (DLP) projector and new active shutter glasses. In conventional stereoscopic active shutter systems, active shutter glasses have a 0-1 (open and closed) state, and the right and left frames are temporally divided. However, this causes the display to flicker because the human eye perceives the appearance of black frames when the other shutter is closing. Furthermore, it is difficult to increase the number of views because the number of frames representing images is also divided. We solve these problems by extending the active shutter beyond the 0-1 state to a continuous range of states [0, 1]. This relaxation leads to the formulation of a new DLP imaging model and an optimization problem. The special structure of DLP binary imaging and the continuous transmittance of the new active shutter glasses require the solution of a binary continuous image decomposition problem. Although it contains NP-hard problems, the proposed algorithm can efficiently solve the problem. The implementation of our imaging system requires the development of an active shutter device with continuous transmittance. We implemented the control of the transmittance of the liquid crystal display (LCD) shutter by using pulse-width modulation (PWM). A simulation and the developed multi-view display system were used to show that our model can represent multi-view images more accurately than the conventional time-division 0-1 active shutter system.
Approximately 250 million people suffer from color vision deficiency (CVD). They can hardly share the same visual content with normal-vision audiences. In this paper, we propose the first system that allows CVD and normal-vision audiences to share the same visual content simultaneously. This is possible because an ordinary (non-autostereoscopic) stereoscopic display offers users two visual experiences, with and without stereoscopic glasses. By allocating one experience to CVD audiences and the other to normal-vision audiences, we allow them to share the same content. The core problem is to synthesize an image pair such that, when it is presented binocularly, CVD audiences can distinguish the originally indistinguishable colors, and, when it is presented monocularly, normal-vision audiences cannot distinguish it from the original image. We solve the image-pair recoloring problem by optimizing an objective function that minimizes the color deviation for normal-vision audiences, and maximizes the color distinguishability and binocular fusibility for CVD audiences. Our method is extensively evaluated via multiple quantitative experiments and user studies. Convincing results are obtained in all our test cases.
This article defines a new way to perform intuitive and geometrically faithful regressions on histogram-valued data. It leverages the theory of optimal transport, and in particular the definition of Wasserstein barycenters, to introduce for the first time the notion of barycentric coordinates for histograms. These coordinates take into account the underlying geometry of the ground space on which the histograms are defined, and are thus particularly meaningful for applications in graphics to shapes, color or material modification. Beside this abstract construction, we propose a fast numerical optimization scheme to solve this backward problem (finding the barycentric coordinates of a given histogram) with a low computational overhead with respect to the forward problem (computing the barycenter). This scheme relies on a backward algorithmic differentiation of the Sinkhorn algorithm which is used to optimize the entropic regularization of Wasserstein barycenters. We showcase an illustrative set of applications of these Wasserstein coordinates to various problems in computer graphics: shape approximation, BRDF acquisition and color editing.
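As an illustration of the building block mentioned above, here is a minimal sketch of the basic Sinkhorn iteration for entropy-regularized transport between two histograms. It does not compute a barycenter and does not perform the backward differentiation described in the abstract; the function name `sinkhorn`, the regularization strength `eps`, and the toy histograms are assumptions made for this example.

```python
import math

def sinkhorn(a, b, cost, eps=1.0, iters=500):
    """Return an entropy-regularized transport plan between histograms a and b.

    a, b: positive histograms that each sum to 1.
    cost: ground cost matrix, cost[i][j] between bin i of a and bin j of b.
    eps:  entropic regularization strength (assumed value, tune per problem).
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the ground cost.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Three bins on a line with a squared-distance ground cost.
bins = [0.0, 1.0, 2.0]
cost = [[(x - y) ** 2 for y in bins] for x in bins]
a = [0.6, 0.3, 0.1]
b = [0.1, 0.3, 0.6]
plan = sinkhorn(a, b, cost)
```

Every step of the loop is a smooth arithmetic operation, which is the property that makes it possible in principle to differentiate through a fixed number of iterations.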
Many shape and image processing tools rely on computation of correspondences between geometric domains. Efficient methods that stably extract "soft" matches in the presence of diverse geometric structures have proven to be valuable for shape retrieval and transfer of labels or semantic information. With these applications in mind, we present an algorithm for probabilistic correspondence that optimizes an entropy-regularized Gromov-Wasserstein (GW) objective. Built upon recent developments in numerical optimal transportation, our algorithm is compact, provably convergent, and applicable to any geometric domain expressible as a metric measure matrix. We provide comprehensive experiments illustrating the convergence and applicability of our algorithm to a variety of graphics tasks. Furthermore, we expand entropic GW correspondence to a framework for other matching problems, incorporating partial distance matrices, user guidance, shape exploration, symmetry detection, and joint analysis of more than two domains. These applications expand the scope of entropic GW correspondence to major shape analysis problems and are stable to distortion and noise.
Point cloud registration is a fundamental task in computer graphics, and more specifically, in rigid and non-rigid shape matching. The rigid shape matching problem can be formulated as the problem of simultaneously aligning and labelling two point clouds in 3D so that they are as similar as possible. We name this problem the Procrustes matching (PM) problem. The non-rigid shape matching problem can be formulated as a higher dimensional PM problem using the functional maps method. High dimensional PM problems are difficult non-convex problems which currently can only be solved locally using iterative closest point (ICP) algorithms or similar methods. Good initialization is crucial for obtaining a good solution.
We introduce a novel and efficient convex SDP (semidefinite programming) relaxation for the PM problem. The algorithm is guaranteed to return a correct global solution of the problem when matching two isometric shapes which are either asymmetric or bilaterally symmetric.
We show our algorithm gives state of the art results on popular shape matching datasets. We also show that our algorithm gives state of the art results for anatomical classification of shapes. Finally we demonstrate the power of our method in aligning shape collections.
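For intuition about the alignment half of the problem, the sketch below solves the much easier case where the labelling is already known and the points are in 2D: the least-squares rotation and translation then have a closed form. This is the kind of step a local method such as ICP alternates with re-labelling; it is not the convex SDP relaxation, and the name `align_2d` is hypothetical.

```python
import math

def align_2d(src, dst):
    """Least-squares rigid alignment of 2D points with known correspondence.

    Returns (theta, (tx, ty)) such that rotating src[i] by theta and then
    translating by (tx, ty) best matches dst[i].
    """
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        px, py, qx, qy = px - csx, py - csy, qx - cdx, qy - cdy
        dot += px * qx + py * qy
        cross += px * qy - py * qx
    # The optimal angle maximizes cos(theta) * dot + sin(theta) * cross.
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, (tx, ty)

# Synthetic check: rotate and shift a small asymmetric point set.
source = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
true_theta, true_t = 0.7, (2.0, -1.0)
target = [
    (math.cos(true_theta) * x - math.sin(true_theta) * y + true_t[0],
     math.sin(true_theta) * x + math.cos(true_theta) * y + true_t[1])
    for x, y in source
]
theta, (tx, ty) = align_2d(source, target)
```

When the correspondence is unknown, this closed form is only as good as the current labelling guess, which is why initialization matters so much for local methods.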
This paper presents a method for bijective parametrization of 2D and 3D objects over canonical domains. While a range of solutions for the two-dimensional case are well-known, our method guarantees bijectivity of mappings also for a large, combinatorially-defined class of tetrahedral meshes (shellable meshes). The key concept in our method is the piecewise-linear (PL) foliation, decomposing the mesh into one-dimensional submanifolds and reducing the mapping problem to parametrization of a lower-dimensional manifold (a foliation section). The maps resulting from these foliations are proved to be bijective and continuous, and shown to have provably bijective PL approximations. We describe exact, numerically robust evaluation methods and demonstrate our implementation's capabilities on a large variety of meshes.
The ability to identify objects or region correspondences between consecutive frames of a given hand-drawn animation sequence is an indispensable tool for automating animation modification tasks such as sequence-wide recoloring or shape-editing of a specific animated character. Existing correspondence identification methods heavily rely on appearance features, but these features alone are insufficient to reliably identify region correspondences when there exist occlusions or when two or more objects share similar appearances. To resolve the above problems, manual assistance is often required. In this paper, we propose a new correspondence identification method which considers both appearance features and motions of regions in a global manner. We formulate correspondence likelihoods between temporal region pairs as a network flow graph problem which can be solved by a well-established optimization algorithm. We have evaluated our method with various animation sequences and results show that our method consistently outperforms the state-of-the-art methods without any user guidance.
Most fluid scenarios in graphics have a high Reynolds number, where viscosity is dominated by inertial effects, thus most solvers drop viscosity altogether: numerical damping from coarse grids is generally stronger than physical viscosity while resembling it in character. However, viscosity remains crucial near solid boundaries, in the boundary layer, to a large extent determining the look of the flow as a function of Reynolds number. Typical graphics simulations do not resolve boundary layer dynamics, so their look is determined mostly by numerical errors with the given grid size and time step, rather than physical parameters. We introduce two complementary techniques to capture boundary layer dynamics, bringing more physical control and predictability. We extend the FLIP particle-grid method with viscous particle strength exchange [Rivoalen and Huberson 2001] to better transfer momentum at solid boundaries, dubbed VFLIP. We also introduce Weakly Higher Resolution Regional Projection (WHIRP), a cheap and simple way to increase grid resolution where important by overlaying high resolution grids on the global coarse grid.
We describe a new approach for the purely Eulerian simulation of incompressible fluids. In it, the fluid state is represented by a ℂ²-valued wave function evolving under the Schrödinger equation subject to incompressibility constraints. The underlying dynamical system is Hamiltonian and governed by the kinetic energy of the fluid together with an energy of Landau-Lifshitz type. The latter ensures that dynamics due to thin vortical structures, all important for visual simulation, are faithfully reproduced. This enables robust simulation of intricate phenomena such as vortical wakes and interacting vortex filaments, even on modestly sized grids. Our implementation uses a simple splitting method for time integration, employing the FFT for Schrödinger evolution as well as constraint projection. Using a standard penalty method we also allow arbitrary obstacles. The resulting algorithm is simple, unconditionally stable, and efficient. In particular it does not require any Lagrangian techniques for advection or to counteract the loss of vorticity. We demonstrate its use in a variety of scenarios, compare it with experiments, and evaluate it against benchmark tests. A full implementation is included in the ancillary materials.
We propose a novel surface-only technique for simulating incompressible, inviscid and uniform-density liquids with surface tension in three dimensions. The liquid surface is captured by a triangle mesh on which a Lagrangian velocity field is stored. Because advection of the velocity field may violate the incompressibility condition, we devise an orthogonal projection technique to remove the divergence while requiring the evaluation of only two boundary integrals. The forces of surface tension, gravity, and solid contact are all treated by a boundary element solve, allowing us to perform detailed simulations of a wide range of liquid phenomena, including waterbells, droplet and jet collisions, fluid chains, and crown splashes.
This work extends existing multiphase-fluid SPH frameworks to cover solid phases, including deformable bodies and granular materials. In our extended multiphase SPH framework, the distribution and shapes of all phases, both fluids and solids, are uniformly represented by their volume fraction functions. The dynamics of the multiphase system is governed by conservation of mass and momentum within different phases. The behavior of individual phases and the interactions between them are represented by corresponding constitutive laws, which are functions of the volume fraction fields and the velocity fields. Our generalized multiphase SPH framework does not require separate equations for specific phases or tedious interface tracking. As the distribution, shape and motion of each phase is represented and resolved in the same way, the proposed approach is robust, efficient and easy to implement. Various simulation results are presented to demonstrate the capabilities of our new multiphase SPH framework, including deformable bodies, granular materials, interaction between multiple fluids and deformable solids, flow in porous media, and dissolution of deformable solids.
We propose a unified motion planner that reproduces variations in swimming styles arising from differences in fish skeletal structures or from changes in environmental conditions. The key idea in our method, which is grounded in biology, is to model the common decision-making mechanism that allows fish to instantly decide "where and how to swim." The unified motion planner comprises two stages. In the first stage, the planner decides where to swim: using a probability distribution generated by integrating the perceptual information, it chooses the short-term target position and target speed. In the second stage, it decides how to swim: it selects a style of swimming that matches the information for transitioning from the current speed to the target speed. Using the proposed method, we demonstrate 12 types of CG models with completely different sizes and skeletal structures, such as manta ray, tuna, and boxfish, as well as a scene where a school of a few thousand fish swim realistically. Our method is easy to integrate into existing graphics pipelines. In addition, the movement characteristics can easily be changed by adjusting the parameters. The behavior of an entire school of fish, such as a tornado or circling formation, can also be designated top-down.
Reinforcement learning offers a promising methodology for developing skills for simulated characters, but typically requires working with sparse hand-crafted features. Building on recent progress in deep reinforcement learning (DeepRL), we introduce a mixture of actor-critic experts (MACE) approach that learns terrain-adaptive dynamic locomotion skills using high-dimensional state and terrain descriptions as input, and parameterized leaps or steps as output actions. MACE learns more quickly than a single actor-critic approach and results in actor-critic experts that exhibit specialization. Additional elements of our solution that contribute towards efficient learning include Boltzmann exploration and the use of initial actor biases to encourage specialization. Results are demonstrated for multiple planar characters and terrain classes.
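Boltzmann exploration, named above as one ingredient, can be sketched in a few lines: options with higher estimated value are chosen more often, and a temperature controls how greedy the choice is. The sketch assumes each expert is summarized by a single scalar value estimate; the function names and the example values are hypothetical.

```python
import math
import random

def boltzmann_probs(values, temperature=1.0):
    """Softmax over value estimates; lower temperature is closer to greedy."""
    top = max(values)
    # Subtracting the maximum keeps the exponentials numerically stable.
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

def select_expert(values, temperature=1.0, rng=random):
    """Sample an expert index according to the Boltzmann distribution."""
    probs = boltzmann_probs(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]

# Hypothetical value estimates for three experts in the current state.
estimates = [1.0, 2.0, 0.5]
probs = boltzmann_probs(estimates, temperature=0.5)
choice = select_expert(estimates, temperature=0.5, rng=random.Random(0))
```

A high temperature spreads selection across experts, while a low temperature approaches always picking the highest-valued one.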
High quality locomotion is key to achieving believable character animation, but is often modeled as a generic stepping motion between two locations. In practice, locomotion often has task-specific characteristics and can exhibit a rich vocabulary of step types, including side steps, toe pivots, heel pivots, and intentional foot slides. We develop a model for such types of behaviors, based on task-specific foot-step plans that act as motion templates. The footstep plans are invoked and optimized at interactive rates and then serve as the basis for producing full body motion. We demonstrate the production of high-quality motions for three tasks: whiteboard writing, moving boxes, and sitting behaviors. The model enables retargeting to characters of varying proportions by yielding motion plans that are appropriately tailored to these proportions. We also show how the task effort or duration can be taken into account, yielding coarticulation behaviors.
The Halide image processing language has proven to be an effective system for authoring high-performance image processing code. Halide programmers need only provide a high-level strategy for mapping an image processing pipeline to a parallel machine (a schedule), and the Halide compiler carries out the mechanical task of generating platform-specific code that implements the schedule. Unfortunately, designing high-performance schedules for complex image processing pipelines requires substantial knowledge of modern hardware architecture and code-optimization techniques. In this paper we provide an algorithm for automatically generating high-performance schedules for Halide programs. Our solution extends the function bounds analysis already present in the Halide compiler to automatically perform locality and parallelism-enhancing global program transformations typical of those employed by expert Halide developers. The algorithm does not require costly (and often impractical) auto-tuning, and, in seconds, generates schedules for a broad set of image processing benchmarks that are performance-competitive with, and often better than, schedules manually authored by expert Halide developers on server and mobile CPUs, as well as GPUs.
Computational photography systems are becoming increasingly diverse, while computational resources---for example on mobile platforms---are rapidly increasing. As diverse as these camera systems may be, slightly different variants of the underlying image processing tasks, such as demosaicking, deconvolution, denoising, inpainting, image fusion, and alignment, are shared between all of these systems. Formal optimization methods have recently been demonstrated to achieve state-of-the-art quality for many of these applications. Unfortunately, different combinations of natural image priors and optimization algorithms may be optimal for different problems, and implementing and testing each combination is currently a time-consuming and error-prone process. ProxImaL is a domain-specific language and compiler for image optimization problems that makes it easy to experiment with different problem formulations and algorithm choices. The language uses proximal operators as the fundamental building blocks of a variety of linear and nonlinear image formation models and cost functions, advanced image priors, and noise models. The compiler intelligently chooses the best way to translate a problem formulation and choice of optimization algorithm into an efficient solver implementation. In applications to the image processing pipeline, deconvolution in the presence of Poisson-distributed shot noise, and burst denoising, we show that a few lines of ProxImaL code can generate highly efficient solvers that achieve state-of-the-art results. We also show applications to the nonlinear and nonconvex problem of phase retrieval.
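To make the notion of a proximal operator concrete, the following is a minimal sketch in plain Python. It is not ProxImaL code and does not use the ProxImaL API; the function names `prox_l1` and `ista_denoise` are hypothetical and chosen only for illustration. It shows the proximal operator of the scaled ℓ1 norm (elementwise soft-thresholding) and uses it inside a basic proximal gradient loop for the toy problem of minimizing 0.5·||x − b||² + λ·||x||₁.

```python
def soft_threshold(x, t):
    """Shrink a scalar toward zero by t (the scalar prox of t * |x|)."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def prox_l1(v, t):
    """Proximal operator of t * ||x||_1, applied elementwise to a list."""
    return [soft_threshold(x, t) for x in v]


def ista_denoise(b, lam, step=0.5, iters=100):
    """Proximal gradient iterations for 0.5 * ||x - b||^2 + lam * ||x||_1.

    Each iteration takes a gradient step on the quadratic data term and
    then applies the prox of the (step-scaled) l1 prior.
    """
    x = [0.0] * len(b)
    for _ in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]
        x = prox_l1([xi - step * gi for xi, gi in zip(x, grad)], step * lam)
    return x


if __name__ == "__main__":
    print(prox_l1([3.0, -0.5, -2.0], 1.0))
    print(ista_denoise([3.0, -0.5, -2.0], 1.0))
```

For this particular toy problem the minimizer equals `prox_l1(b, lam)` directly, so the loop is only there to show how a prox step and a gradient step combine; real image formation models replace the identity data term with a linear operator.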
Image processing algorithms implemented using custom hardware or FPGAs can be orders-of-magnitude more energy efficient and performant than software. Unfortunately, converting an algorithm by hand to a hardware description language suitable for compilation on these platforms is frequently too time-consuming to be practical. Recent work on hardware synthesis of high-level image processing languages demonstrated that a single-rate pipeline of stencil kernels can be synthesized into hardware with provably minimal buffering. Unfortunately, few advanced image processing or vision algorithms fit into this highly-restricted programming model.
In this paper, we present Rigel, which takes pipelines specified in our new multi-rate architecture and lowers them to FPGA implementations. Our flexible multi-rate architecture supports pyramid image processing, sparse computations, and space-time implementation tradeoffs. We demonstrate depth from stereo, Lucas-Kanade, the SIFT descriptor, and a Gaussian pyramid running on two FPGA boards. Our system can synthesize hardware for FPGAs with up to 436 Megapixels/second throughput, and up to 297x faster runtime than a tablet-class ARM CPU.
We present a computational method for designing wire sculptures consisting of interlocking wires. Our method allows the computation of aesthetically pleasing structures that are structurally stable, efficiently fabricatable with a 2D wire bending machine, and assemblable without the need of additional connectors. Starting from a set of planar contours provided by the user, our method automatically tests for the feasibility of a design, determines a discrete ordering of wires at intersection points, and optimizes for the rest shape of the individual wires to maximize structural stability under frictional contact. In addition to their application to art, wire sculptures present an extremely efficient and fast alternative for low-fidelity rapid prototyping because manufacturing time and required material scale linearly with the physical size of objects. We demonstrate the effectiveness of our approach on a varied set of examples, all of which we fabricated.
In this paper we present a novel method for non-linear shape optimization of 3D objects given by their surface representation. Our method takes advantage of the fact that various shape properties of interest give rise to underdetermined design spaces, implying the existence of many good solutions. Our algorithm exploits this by performing iterative projections of the problem to local subspaces where it can be solved much more efficiently using standard numerical routines. We demonstrate how this approach can be utilized for various shape optimization tasks using different shape parameterizations. In particular, we show how to efficiently optimize natural frequencies, mass properties, as well as the structural yield strength of a solid body. Our method is flexible, easy to implement, and very fast.
Acoustic filters have a wide range of applications, yet customizing them with desired properties is difficult. Motivated by recent progress in additive manufacturing that allows for fast prototyping of complex shapes, we present a computational approach that automates the design of acoustic filters with complex geometries. In our approach, we construct an acoustic filter comprised of a set of parameterized shape primitives, whose transmission matrices can be precomputed. Using an efficient method of simulating the transmission matrix of an assembly built from these underlying primitives, our method is able to optimize both the arrangement and the parameters of the acoustic shape primitives in order to satisfy target acoustic properties of the filter. We validate our results against industrial laboratory measurements and high-quality off-line simulations. We demonstrate that our method enables a wide range of applications including muffler design, musical wind instrument prototyping, and encoding imperceptible acoustic information into everyday objects.
We present a computational method for interactive 3D design and rationalization of surfaces via auxetic materials, i.e., flat flexible material that can stretch uniformly up to a certain extent. A key motivation for studying such material is that one can approximate doubly-curved surfaces (such as the sphere) using only flat pieces, making it attractive for fabrication. We physically realize surfaces by introducing cuts into approximately inextensible material such as sheet metal, plastic, or leather. The cutting pattern is modeled as a regular triangular linkage that yields hexagonal openings of spatially-varying radius when stretched. In the same way that isometry is fundamental to modeling developable surfaces, we leverage conformal geometry to understand auxetic design. In particular, we compute a global conformal map with bounded scale factor to initialize an otherwise intractable non-linear optimization. We demonstrate that this global approach can handle non-trivial topology and non-local dependencies inherent in auxetic material. Design studies and physical prototypes are used to illustrate a wide range of possible applications.
A reconfigurable is an object or collection of objects whose transformation between various states defines its functionality or aesthetic appeal. For example, consider a mechanical assembly composed of interlocking pieces, a transforming folding bicycle, or a space-saving arrangement of apartment furniture. Unlike traditional computer-aided design of static objects, specialized tools are required to address problems unique to the computational design and revision of objects undergoing rigid transformations. Collisions and interpenetrations as objects transition from one configuration to another prevent the physical realization of a design. We present a software environment intended to support fluid interactive design of reconfigurables, featuring tools that identify, visualize, monitor and resolve infeasible configurations. We demonstrate the versatility of the environment on a number of examples spanning mechanical systems, urban dwelling, and interlocking puzzles, some of which we then realize via additive manufacturing.
Spatial-temporal information about collisions between objects is presented to the designer according to a cascading order of precedence. A designer may quickly determine when, and then where, and then how objects are colliding. This precedence guides the design and implementation of our four-dimensional spacetime bounding volume hierarchy for interactive-rate collision detection. On screen, the designer experiences a suite of interactive visualization and monitoring tools during editing: timeline notifications of new collisions, picture-in-picture windows for tracking collisions and suggestive hints for contact resolution. Contacts too tedious to remove manually can be eliminated automatically via our proposed constrained numerical optimization and swept-volume carving.
Collaborative systems are well-established solutions for sharing work among people. In computer graphics these workflows are still not well established, compared to what is done for text writing or software development. Usually artists work alone and share their final models by sending files. In this paper we present a system for collaborative 3D digital sculpting. In our prototype, multiple artists concurrently sculpt a polygonal mesh on their local machines by changing its vertex properties, such as positions and material BRDFs. Our system shares the artists' edits automatically and seamlessly merges these edits even when they happen on the same region of the surface. We propose a merge algorithm that is fast enough for seamless collaboration, respects users' edits as much as possible, can support any sculpting operation, and works for both geometry and appearance modifications. Since in sculpting artists alternately perform fine adjustments and large scale modifications, our algorithm is based on a multiresolution edit representation that handles concurrent overlapping edits at different scales. We tested our algorithm by modeling meshes collaboratively in different sculpting sessions and found that our algorithm outperforms prior works on collaborative mesh editing in all cases.
We present an approach to example-based stylization of 3D renderings that better preserves the rich expressiveness of hand-created artwork. Unlike previous techniques, which are mainly guided by colors and normals, our approach is based on light propagation in the scene. This novel type of guidance can distinguish among context-dependent illumination effects, for which artists typically use different stylization techniques, and delivers a look closer to realistic artwork. In addition, we demonstrate that the current state of the art in guided texture synthesis produces artifacts that can significantly decrease the fidelity of the synthesized imagery, and propose an improved algorithm that alleviates them. Finally, we demonstrate our method's effectiveness on a variety of scenes and styles, in applications like interactive shading study or autocompletion.
We present an interactive method that manipulates perceived object shape from a single input color image thanks to a warping technique implemented on the GPU. The key idea is to give the illusion of shape sharpening or rounding by exaggerating orientation patterns in the image that are strongly correlated to surface curvature. We build on a growing literature in both human and computer vision showing the importance of orientation patterns in the communication of shape, which we complement with mathematical relationships and a statistical image analysis revealing that structure tensors are indeed strongly correlated to surface shape features. We then rely on these correlations to introduce a flow-guided image warping algorithm, which in effect exaggerates orientation patterns involved in shape perception. We evaluate our technique by 1) comparing it to ground truth shape deformations, and 2) performing two perceptual experiments to assess its effects. Our algorithm produces convincing shape manipulation results on synthetic images and photographs, for various materials and lighting environments.
People may look dramatically different when they change their hair color or hair style, grow older, or adopt the style of a different era, country, or occupation. Some of those may transfigure appearance and inspire creative changes, some not, but how would we know without physically trying? We present a system that enables automatic synthesis of limitless numbers of appearances. A user inputs one or more photos (as many as they like) of his or her face, enters a text query for an appearance of interest (just like they'd search an image search engine), and gets as output the input person in the queried appearance. Rather than fixing the number of queries or a dataset, our system utilizes all the relevant and searchable images on the Internet, estimates a doppelgänger set for the inputs, and utilizes it to generate composites. We present a large number of examples on photos taken with completely unconstrained imaging conditions.
This paper explores methods for synthesizing physics-based bubble sounds directly from two-phase incompressible simulations of bubbly water flows. By tracking fluid-air interface geometry, we identify bubble geometry and topological changes due to splitting, merging and popping. A novel capacitance-based method is proposed that can estimate volume-mode bubble frequency changes due to bubble size, shape, and proximity to solid and air interfaces. Our acoustic transfer model is able to capture cavity resonance effects due to near-field geometry, and we also propose a fast precomputed bubble-plane model for cheap transfer evaluation. In addition, we consider a bubble forcing model that better accounts for bubble entrainment, splitting, and merging events, as well as a Helmholtz resonator model for bubble popping sounds. To overcome frequency bandwidth limitations associated with coarse resolution fluid grids, we simulate micro-bubbles in the audio domain using a power-law model of bubble populations. Finally, we present several detailed examples of audiovisual water simulations and physical experiments to validate our frequency model.
When aiming to seamlessly integrate a fluid simulation into a larger scenario (like an open ocean), careful attention must be paid to boundary conditions. In particular, one must implement special "non-reflecting" boundary conditions, which dissipate out-going waves as they exit the simulation. Unfortunately, the state of the art in non-reflecting boundary conditions (perfectly-matched layers, or PMLs) only permits trivially simple inflow/outflow conditions, so there is no reliable way to integrate a fluid simulation into a more complicated environment like a stormy ocean or a turbulent river.
This paper introduces the first method for combining non-reflecting boundary conditions based on PMLs with inflow/outflow boundary conditions that vary arbitrarily throughout space and time. Our algorithm is a generalization of state-of-the-art mean-flow boundary conditions in the computational fluid dynamics literature, and it allows for seamless integration of a fluid simulation into much more complicated environments. Our method also opens the door for previously-unseen post-process effects like retroactively changing the location of solid obstacles, and locally increasing the visual detail of a pre-existing simulation.
Fluid animation methods based on Eulerian grids have long struggled to resolve flows involving narrow gaps and thin solid features. Past approaches have artificially inflated or voxelized boundaries, although this sacrifices the correct geometry and topology of the fluid domain and prevents flow through narrow regions. We present a boundary-respecting fluid simulator that overcomes these challenges. Our solution is to intersect the solid boundary geometry with the cells of a background regular grid to generate a topologically correct, boundary-conforming cut-cell mesh. We extend both pressure projection and velocity advection to support this enhanced grid structure. For pressure projection, we introduce a general graph-based scheme that properly preserves discrete incompressibility even in thin and topologically complex flow regions, while nevertheless yielding symmetric positive definite linear systems. For advection, we exploit polyhedral interpolation to improve the degree to which the flow conforms to irregular and possibly non-convex cell boundaries, and propose a modified PIC/FLIP advection scheme to eliminate the need to inaccurately reinitialize invalid cells that are swept over by moving boundaries. The method naturally extends the standard Eulerian fluid simulation framework, and while we focus on thin boundaries, our contributions are beneficial for volumetric solids as well. Our results demonstrate successful one-way fluid-solid coupling in the presence of thin objects and narrow flow regions even on very coarse grids.
Filigrees are thin patterns found in jewelry, ornaments and lace fabrics. They are often formed of repeated base elements manually composed into larger, delicate patterns. Digital fabrication simplifies the process of turning a virtual model of a filigree into a physical object. However, designing a virtual model of a filigree remains a time-consuming and challenging task. The difficulty lies in tightly packing together the base elements while covering a target surface. In addition, the filigree has to be well connected and sufficiently robust to be fabricated. We propose a novel approach automating this task. Our technique covers a target surface with a set of input base elements, forming a filigree strong enough to be fabricated. We exploit two properties of filigrees to make this possible. First, as filigrees form delicate traceries they are well captured by their skeleton. This allows a simpler definition of operators such as matching and deformation. Second, instead of seeking a perfect packing of the base elements we relax the problem by allowing appearance-preserving partial overlaps. We optimize a filigree by a stochastic search, further improved by a novel boosting algorithm that records and reuses good configurations discovered during the process.
We illustrate our technique on a number of challenging examples reproducing filigrees on large objects, which we manufacture by 3D printing. Our technique offers several user controls, such as the scale and orientation of the elements.
We present a computational tool for designing ornamental curve networks---structurally-sound physical surfaces with user-controlled aesthetics. In contrast to approaches that leverage texture synthesis for creating decorative surface patterns, our method relies on user-defined spline curves as central design primitives. More specifically, we build on the physically-inspired metaphor of an embedded elastic curve that can move on a smooth surface, deform, and connect with other curves. We formalize this idea as a globally coupled energy-minimization problem, discretized with piece-wise linear curves that are optimized in the parametric space of a smooth surface. Building on this technical core, we propose a set of interactive design and editing tools that we demonstrate on manually-created layouts and semi-automated deformable packings. In order to prevent excessive compliance, we furthermore propose a structural analysis tool that uses eigenanalysis to identify potentially large deformations between geodesically-close curves and guide the user in strengthening the corresponding regions. We used our approach to create a variety of designs in simulation, validated with a set of 3D-printed physical prototypes.
We develop a new kind of "space-filling" curves, connected Fermat spirals, and show their compelling properties as a tool path fill pattern for layered fabrication. Unlike classical space-filling curves such as the Peano or Hilbert curves, which constantly wind and bind to preserve locality, connected Fermat spirals are formed mostly by long, low-curvature paths. This geometric property, along with continuity, influences the quality and efficiency of layered fabrication. Given a connected 2D region, we first decompose it into a set of sub-regions, each of which can be filled with a single continuous Fermat spiral. We show that it is always possible to start and end a Fermat spiral fill at approximately the same location on the outer boundary of the filled region. This special property allows the Fermat spiral fills to be joined systematically along a graph traversal of the decomposed sub-regions. The result is a globally continuous curve. We demonstrate that printing 2D layers following tool paths as connected Fermat spirals leads to efficient and quality fabrication, compared to conventional fill patterns.
Traditional 3D printers fabricate objects by depositing material to build up the model layer by layer. Instead, printing only wireframes can reduce printing time and the cost of material while producing effective depictions of shape. However, wireframe printing requires the printer to undergo arbitrary 3D motions, rather than slice-wise 2D motions, which can lead to collisions with already-printed parts of the model. Previous work has either limited itself to restricted meshes that are collision free by construction, or simply dropped unreachable parts of the model, but in this paper we present a method to print arbitrary meshes on a 5DOF wireframe printer. We formalize the collision avoidance problem using a directed graph, and propose an algorithm that finds a locally minimal set of constraints on the order of edges that guarantees there will be no collisions. Then a second algorithm orders the edges so that the printing progresses smoothly. Though meshes do exist that still cannot be printed, our method prints a wide range of models that previous methods cannot, and it provides a fundamental enabling algorithm for future development of wireframe printing.
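As a rough illustration of how ordering constraints on a directed graph yield a print sequence, the sketch below runs a generic topological sort over "must be printed before" pairs. It is not the constraint-finding or smooth-ordering algorithm described in the abstract; the function name `print_order`, the edge labels, and the constraint pairs are all hypothetical, and the constraints are assumed to be given rather than derived from collision tests.

```python
from graphlib import CycleError, TopologicalSorter


def print_order(edges, must_precede):
    """Return an edge order satisfying every (before, after) constraint.

    edges        -- iterable of hashable edge labels
    must_precede -- iterable of (before, after) pairs
    Returns None when the constraints contain a cycle, i.e. when no
    order can satisfy all of them at once.
    """
    sorter = TopologicalSorter({edge: set() for edge in edges})
    for before, after in must_precede:
        # 'after' depends on 'before' having been printed already.
        sorter.add(after, before)
    try:
        return list(sorter.static_order())
    except CycleError:
        return None


if __name__ == "__main__":
    print(print_order(["e1", "e2", "e3"], [("e1", "e2"), ("e2", "e3")]))
    print(print_order(["e1", "e2"], [("e1", "e2"), ("e2", "e1")]))
```

A topological sort only guarantees that the constraints are respected; choosing among the many valid orders so that the print head moves smoothly is a separate optimization that this sketch does not attempt.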
We present a new continuum-based method for the realistic simulation of large-scale free-flowing granular materials. We derive a compact model for the rheology of the material, which accounts for the exact nonsmooth Drucker-Prager yield criterion combined with a varying volume fraction. Thanks to a semi-implicit time-stepping scheme and a careful spatial discretization of our rheology built upon the Material-Point Method, we are able to preserve at each time step the exact coupling between normal and tangential stresses, in a stable way. This contrasts with previous approaches which either regularize or linearize the yield criterion for implicit integration, leading to unrealistic behaviors or visible grid artifacts. Remarkably, our discrete problem turns out to be very similar to the discrete contact problem classically encountered in multibody dynamics, which allows us to leverage robust and efficient nonsmooth solvers from the literature. We validate our method by successfully capturing typical macroscopic features of some classical experiments, such as the discharge of a silo or the collapse of a granular column. Finally, we show that our method can be easily extended to accommodate more complex scenarios including two-way rigid body coupling as well as anisotropic materials.
We simulate sand dynamics using an elastoplastic, continuum assumption. We demonstrate that the Drucker-Prager plastic flow model combined with a Hencky-strain-based hyperelasticity accurately recreates a wide range of visual sand phenomena with moderate computational expense. We use the Material Point Method (MPM) to discretize the governing equations for its natural treatment of contact, topological change and history dependent constitutive relations. The Drucker-Prager model naturally represents the frictional relation between shear and normal stresses through a yield stress criterion. We develop a stress projection algorithm used for enforcing this condition with a non-associative flow rule that works naturally with both implicit and explicit time integration. We demonstrate the efficacy of our approach on examples undergoing large deformation, collisions and topological changes necessary for producing modern visual effects.
We present a boundary element based method for fast simulation of brittle fracture. By introducing simplifying assumptions that allow us to quickly estimate stress intensities and opening displacements during crack propagation, we build a fracture algorithm where the cost of each time step scales linearly with the length of the crack-front.
The transition from a full boundary element method to our faster variant is possible at the beginning of any time step. This allows us to build a hybrid method, which uses the expensive but more accurate BEM while the number of degrees of freedom is low, and uses the fast method once that number exceeds a given threshold as the crack geometry becomes more complicated.
Furthermore, we integrate this fracture simulation with a standard rigid-body solver. Our rigid-body coupling solves a Neumann boundary value problem by carefully separating translational, rotational and deformational components of the collision forces and then applying a Tikhonov regularizer to the resulting linear system. We show that our method produces physically reasonable results in standard test cases and is capable of dealing with complex scenes faster than previous finite- or boundary element approaches.
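For readers unfamiliar with the term, Tikhonov regularization in its standard form can be written as follows. The symbols A, b, and λ are generic placeholders rather than the specific system assembled by the rigid-body coupling, which the abstract does not spell out.

```latex
\min_{x}\; \lVert A x - b \rVert_2^2 + \lambda \lVert x \rVert_2^2
\quad\Longleftrightarrow\quad
\left(A^{\top} A + \lambda I\right) x = A^{\top} b, \qquad \lambda > 0 .
```

Because the matrix on the right is symmetric positive definite for any λ > 0, the regularized system has a unique solution even when A itself is rank deficient.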
Planar shape interpolation is a classic problem in computer graphics. We present a novel shape interpolation method that blends C∞ planar harmonic mappings represented in closed-form. The intermediate mappings in the blending are guaranteed to be locally injective C∞ harmonic mappings, with conformal and isometric distortion bounded by that of the input mappings. The key to the success of our method is the fact that the blended differentials of our interpolated mapping have a simple closed-form expression, so they can be evaluated with unprecedented efficiency and accuracy. Moreover, in contrast to previous approaches, these differentials are integrable, and result in an actual mapping without further modification. Our algorithm is embarrassingly parallel and is orders of magnitude faster than state-of-the-art methods due to its simplicity, yet it still produces mappings that are superior to those of existing techniques due to its guaranteed bounds on geometric distortion.
Computation of mappings is a central building block in many geometry processing and graphics applications. The pursuit of mappings that are injective and have a controllable amount of conformal and isometric distortion is a long-standing endeavor which has received significant attention from the scientific community in recent years. The difficulty of the problem stems from the fact that the space of bounded distortion mappings is nonconvex. In this paper, we consider the special case of harmonic mappings which have been used extensively in many graphics applications. We show that, somewhat surprisingly, the space of locally injective planar harmonic mappings with bounded conformal and isometric distortion has a convex characterization. We describe several projection operators that, given an arbitrary input mapping, are guaranteed to output a bounded distortion locally injective harmonic mapping that is closest to the input mapping in some special sense. In contrast to alternative approaches, the optimization problems that correspond to our projection operators are shown to be always feasible for any choice of distortion bounds. We use the boundary element method (BEM) to discretize the space of planar harmonic mappings and demonstrate the effectiveness of our approach through the application of planar shape deformation.
UV-maps are required in order to apply a 2D texture over a 3D model. Conventional UV-maps are defined by an assignment of uv positions to mesh vertices. We present an alternative representation, volume-encoded UV-maps, in which each point on the surface is mapped to a uv position which is solely a function of its 3D position. This function is tailored for a target surface: its restriction to the surface is a parametrization exhibiting high quality, e.g. in terms of angle and area preservation; and, near the surface, it is almost constant for small orthogonal displacements. The representation is applicable to a wide range of shapes and UV-maps, and unlocks several key advantages: it removes the need to duplicate vertices in the mesh to encode cuts in the map; it makes the UV-map representation independent from the meshing of the surface; the same texture, and even the same UV-map, can be shared by multiple geometrically similar models (e.g. all levels of a LoD pyramid); UV-maps can be applied to representations other than polygonal meshes, like point clouds or set of registered range-maps. Our schema is cheap on GPU computational and memory resources, requiring only a single, cache-coherent indirection to a small volumetric texture per fragment. We also provide an algorithm to construct a volume-encoded UV-map given a target surface.
Scanned performances are commonly represented in virtual environments as sequences of textured triangle meshes. Detailed shapes deforming over time benefit from meshes with dynamically evolving connectivity. We analyze these unstructured mesh sequences to automatically synthesize motion graphs with new smooth transitions between compatible poses and actions. Such motion graphs enable natural periodic motions, stochastic playback, and user-directed animations. The main challenge of unstructured sequences is that the meshes differ not only in connectivity but also in alignment, shape, and texture. We introduce new geometry processing techniques to address these problems and demonstrate visually seamless transitions on high-quality captures.
Intrinsic video decomposition refers to the fundamentally ambiguous task of separating a video stream into its constituent layers, in particular reflectance and shading layers. Such a decomposition is the basis for a variety of video manipulation applications, such as realistic recoloring or retexturing of objects. We present a novel variational approach to tackle this underconstrained inverse problem at real-time frame rates, which enables on-line processing of live video footage. The problem of finding the intrinsic decomposition is formulated as a mixed variational ℓ2-ℓp-optimization problem based on an objective function that is specifically tailored for fast optimization. To this end, we propose a novel combination of sophisticated local spatial and global spatio-temporal priors resulting in temporally coherent decompositions at real-time frame rates without the need for explicit correspondence search. We tackle the resulting high-dimensional, non-convex optimization problem via a novel data-parallel iteratively reweighted least squares solver that runs on commodity graphics hardware. Real-time performance is obtained by combining a local-global solution strategy with hierarchical coarse-to-fine optimization. Compelling real-time augmented reality applications, such as recoloring, material editing and retexturing, are demonstrated in a live setup. Our qualitative and quantitative evaluation shows that we obtain high-quality real-time decompositions even for challenging sequences. Our method is able to outperform state-of-the-art approaches in terms of runtime and result quality -- even without user guidance such as scribbles.
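The idea behind iteratively reweighted least squares (IRLS) can be sketched on a deliberately tiny problem: estimating one scalar that minimizes a sum of ℓp residuals. This is a generic textbook illustration, not the data-parallel GPU solver or the decomposition energy from the abstract; the function name `irls_lp_location` and its parameters are hypothetical.

```python
def irls_lp_location(samples, p=1.0, iters=50, eps=1e-8):
    """Minimize sum_i |x - a_i|^p over a scalar x using IRLS.

    Each iteration replaces |r|^p by the weighted quadratic w * r^2 with
    w = |r|^(p - 2) evaluated at the current estimate, then solves the
    resulting weighted least-squares problem in closed form.
    """
    x = sum(samples) / len(samples)  # ordinary least-squares start (p = 2)
    for _ in range(iters):
        # eps guards against division by zero when a residual vanishes.
        weights = [max(abs(x - a), eps) ** (p - 2.0) for a in samples]
        x = sum(w * a for w, a in zip(weights, samples)) / sum(weights)
    return x


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 100.0]
    print(irls_lp_location(data, p=2.0))  # plain mean, pulled by the outlier
    print(irls_lp_location(data, p=1.0))  # approaches the median
```

With p = 2 all weights equal one and the result is the mean; lowering p down-weights large residuals, which is why such norms are attractive as sparsity-promoting priors.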
We present a novel technique to automatically colorize grayscale images that combines both global priors and local image features. Based on Convolutional Neural Networks (CNNs), our deep network features a fusion layer that allows us to elegantly merge local information dependent on small image patches with global priors computed using the entire image. The entire framework, including the global and local priors as well as the colorization model, is trained in an end-to-end fashion. Furthermore, our architecture can process images of any resolution, unlike most existing approaches based on CNNs. We leverage an existing large-scale scene classification database to train our model, exploiting the class labels of the dataset to more efficiently and discriminatively learn the global priors. We validate our approach with a user study and compare against the state of the art, where we show significant improvements. Furthermore, we demonstrate our method extensively on many different types of images, including black-and-white photography from over a hundred years ago, and show realistic colorizations.
With recent advances in mobile computing, power consumption has become a significant limiting constraint for many graphics applications. As a result, rendering on a power budget has emerged as a new demand. In this paper, we present a real-time, power-optimal rendering framework to address this problem, by finding the optimal rendering settings that minimize power consumption while maximizing visual quality. We first introduce a novel power-error, multi-objective cost space, and formally formulate power saving as an optimization problem. Then, we develop a two-step algorithm to efficiently explore the vast power-error space and leverage optimal Pareto frontiers at runtime. Finally, we show that our rendering framework can be generalized across different platforms, desktop PC or mobile device, by demonstrating its performance on our own OpenGL rendering framework, as well as the commercially available Unreal Engine.
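To make the notion of a Pareto frontier in a power-error space concrete, here is a minimal sketch that filters candidate rendering settings down to the non-dominated ones and then picks the lowest-error setting within a power budget. The (power, error) pairs and the function names are hypothetical; the abstract does not describe how the framework itself explores the space.

```python
def pareto_frontier(points):
    """Return the non-dominated (power, error) pairs; lower is better for both."""
    frontier = []
    best_error = float("inf")
    # After sorting by power, a point is on the frontier only if it improves
    # on the best error seen among all cheaper (or equally cheap) points.
    for power, error in sorted(points):
        if error < best_error:
            frontier.append((power, error))
            best_error = error
    return frontier


def pick_under_budget(frontier, budget):
    """Lowest-error frontier point whose power does not exceed the budget."""
    feasible = [pt for pt in frontier if pt[0] <= budget]
    return feasible[-1] if feasible else None


# Hypothetical measurements: (power in watts, perceptual error).
settings = [(5.0, 0.10), (3.0, 0.30), (4.0, 0.35), (2.0, 0.50), (6.0, 0.10), (3.5, 0.20)]
frontier = pareto_frontier(settings)
choice = pick_under_budget(frontier, budget=4.0)
```

Because the frontier is sorted by increasing power and decreasing error, the last feasible entry is always the best quality available under the budget, which keeps the runtime lookup cheap.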
We present Spire, a shading language and compiler framework that facilitates rapid exploration of shader optimization choices (such as frequency reduction and algorithmic approximation) afforded by modern real-time graphics engines. Our design combines ideas from rate-based shader programming with new language features that expand the scope of shader execution beyond traditional GPU hardware pipelines, and enable a diverse set of shader optimizations to be described by a single mechanism: overloading shader terms at various spatio-temporal computation rates provided by the pipeline. In contrast to prior work, neither the shading language's design, nor our compiler framework's implementation, is specific to the capabilities of any one rendering pipeline, thus Spire establishes architectural separation between the shading system and the implementation of modern rendering engines (allowing different rendering pipelines to utilize its services). We demonstrate use of Spire to author complex shaders that are portable across different rendering pipelines and to rapidly explore shader optimization decisions that span multiple compute and graphics passes and even offline asset preprocessing. We further demonstrate the utility of Spire by developing a shader level-of-detail library and shader auto-tuning system on top of its abstractions, and demonstrate rapid, automatic re-optimization of shaders for different target hardware platforms.
We present a novel method for real-time rendering of subdivision surfaces whose goal is to make subdivision faces as easy to render as triangles, points, or lines. Our approach uses standard GPU tessellation hardware and processes each face of a base mesh independently, thus allowing an entire model to be rendered in a single pass. The key idea of our method is to subdivide the u, v domain of each face ahead of time, generating a quadtree structure, and then submit one tessellated primitive per input face. By traversing the quadtree for each post-tessellation vertex, we are able to accurately and efficiently evaluate the limit surface. Our method yields a more uniform tessellation of the surface, and faster rendering, as fewer primitives are submitted. We evaluate our method on a variety of assets, and realize performance that can be three times faster than state-of-the-art approaches. In addition, our streaming formulation makes it easier to integrate subdivision surfaces into applications and shader code written for polygonal models. We illustrate integration of our technique into a full-featured video game engine.
We contribute a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time. Our algorithm supports both incremental reconstruction, improving the surface estimation over time, as well as parameterizing the nonrigid scene motion. Our approach is highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes. We demonstrate advantages over related real-time techniques that either deform an online generated template or continually fuse depth data nonrigidly into a single reference model. Finally, we show geometric reconstruction results on par with offline methods which require orders of magnitude more processing time and many more RGBD cameras.
We present a new anatomically-constrained local face model and fitting approach for tracking 3D faces from 2D motion data in very high quality. In contrast to traditional global face models, often built from a large set of blendshapes, we propose a local deformation model composed of many small subspaces spatially distributed over the face. Our local model offers far more flexibility and expressiveness than global blendshape models, even with a much smaller model size. This flexibility would typically come at the cost of reduced robustness, in particular during the under-constrained task of monocular reconstruction. However, a key contribution of this work is that we consider the face anatomy and introduce subspace skin thickness constraints into our model, which constrain the face to only valid expressions and help counteract depth ambiguities in monocular tracking. Given our new model, we present a novel fitting optimization that allows 3D facial performance reconstruction from a single view at extremely high quality, far beyond previous fitting approaches. Our model is flexible, and can also be applied when only sparse motion data is available, for example with marker-based motion capture or even face posing from artistic sketches. Furthermore, by incorporating anatomical constraints we can automatically estimate the rigid motion of the skull, obtaining a rigid stabilization of the performance for free. We demonstrate our model and single-view fitting method on a number of examples, including, for the first time, extreme local skin deformation caused by external forces such as wind, captured from a single high-speed camera.
We introduce AutoHair, the first fully automatic method for 3D hair modeling from a single portrait image, with no user interaction or parameter tuning. Our method efficiently generates complete and high-quality hair geometries, which are comparable to those generated by the state-of-the-art methods, where user interaction is required. The core components of our method are: a novel hierarchical deep neural network for automatic hair segmentation and hair growth direction estimation, trained over an annotated hair image database; and an efficient and automatic data-driven hair matching and modeling algorithm, based on a large set of 3D hair exemplars. We demonstrate the efficacy and robustness of our method on Internet photos, resulting in a database of around 50K 3D hair models and a corresponding hairstyle space that covers a wide variety of real-world hairstyles. We also show novel applications enabled by our method, including 3D hairstyle space navigation and hair-aware image retrieval.
Facial scanning has become ubiquitous in digital media, but so far most efforts have focused on reconstructing the skin. Eye reconstruction, on the other hand, has received little attention, and the current state-of-the-art method is cumbersome for the actor, time-consuming, and requires carefully set up and calibrated hardware. These constraints currently make eye capture impractical for general use. We present the first approach for high-quality lightweight eye capture, which leverages a database of pre-captured eyes to guide the reconstruction of new eyes from much less constrained inputs, such as traditional single-shot face scanners or even a single photo from the internet. This is accomplished with a new parametric model of the eye built from the database, and a novel image-based model fitting algorithm. Our method provides both automatic reconstructions of real eyes, as well as artistic control over the parameters to generate user-specific eyes.
This paper presents the first realtime 3D eye gaze capture method that simultaneously captures the coordinated movement of 3D eye gaze, head poses and facial expression deformation using a single RGB camera. Our key idea is to complement a realtime 3D facial performance capture system with an efficient 3D eye gaze tracker. We start the process by automatically detecting important 2D facial features for each frame. The detected facial features are then used to reconstruct 3D head poses and large-scale facial deformation using multi-linear expression deformation models. Next, we introduce a novel user-independent classification method for extracting iris and pupil pixels in each frame. We formulate the 3D eye gaze tracker in the Maximum A Posteriori (MAP) framework, which sequentially infers the most probable state of 3D eye gaze at each frame. Because the eye gaze tracker can fail when the eyes blink, we further introduce an efficient eye-closure detector to improve the robustness and accuracy of the eye gaze tracker. We have tested our system on both live video streams and Internet videos, demonstrating its accuracy and robustness under a variety of uncontrolled lighting conditions and across significant differences in race, gender, shape, pose and expression between individuals.
We present the Sketchy database, the first large-scale collection of sketch-photo pairs. We ask crowd workers to sketch particular photographic objects sampled from 125 categories and acquire 75,471 sketches of 12,500 objects. The Sketchy database gives us fine-grained associations between particular photos and sketches, and we use this to train cross-domain convolutional networks which embed sketches and photographs in a common feature space. We use our database as a benchmark for fine-grained retrieval and show that our learned representation significantly outperforms both hand-crafted features as well as deep features trained for sketch or photo classification. Beyond image retrieval, we believe the Sketchy database opens up new opportunities for sketch and image understanding and synthesis.
Vector drawing is a popular representation in graphic design because of the precision, compactness and editability offered by parametric curves. However, prior work on line drawing vectorization focused solely on faithfully capturing input bitmaps, and largely overlooked the problem of producing a compact and editable curve network. As a result, existing algorithms tend to produce overly-complex drawings composed of many short curves and control points, especially in the presence of thick or sketchy lines that yield spurious curves at junctions. We propose the first vectorization algorithm that explicitly balances fidelity to the input bitmap with simplicity of the output, as measured by the number of curves and their degree. By casting this trade-off as a global optimization, our algorithm generates few yet accurate curves, and also disambiguates curve topology at junctions by favoring the simplest interpretations overall. We demonstrate the robustness of our algorithm on a variety of drawings, sketchy cartoons and rough design sketches.
In this paper, we present a novel technique to simplify sketch drawings based on learning a series of convolution operators. In contrast to existing approaches that require vector images as input, we allow the more general and challenging input of rough raster sketches such as those obtained from scanning pencil sketches. We convert the rough sketch into a simplified version which is then amenable to vectorization. This is all done in a fully automatic way without user intervention. Our model consists of a fully convolutional neural network which, unlike most existing convolutional neural networks, is able to process images of any dimensions and aspect ratio as input, and outputs a simplified sketch which has the same dimensions as the input image. In order to teach our model to simplify, we present a new dataset of pairs of rough and simplified sketch drawings. By leveraging convolution operators in combination with efficient use of our proposed dataset, we are able to train our sketch simplification model. Our approach naturally overcomes the limitations of existing methods, e.g., vector images as input and long computation time; and we show that meaningful simplifications can be obtained for many different test cases. Finally, we validate our results with a user study in which we greatly outperform similar approaches and establish the state of the art in sketch simplification of raster images.
A calligram is an arrangement of words or letters that creates a visual image, and a compact calligram fits one word into a 2D shape. We introduce a fully automatic method for the generation of legible compact calligrams which provides a balance between conveying the input shape, legibility, and aesthetics. Our method has three key elements: a path generation step which computes a global layout path suitable for embedding the input word; an alignment step to place the letters so as to achieve feature alignment between letter and shape protrusions while maintaining word legibility; and a final deformation step which deforms the letters to fit the shape while balancing fit against letter legibility. As letter legibility is critical to the quality of compact calligrams, we conduct a large-scale crowd-sourced study on the impact of different letter deformations on legibility and use the results to train a letter legibility measure which guides the letter deformation. We show automatically generated calligrams on an extensive set of word-image combinations. The legibility and overall quality of the calligrams are evaluated and compared, via user studies, to those produced by human creators, including a professional artist, and existing works.
State-of-the-art hex meshing algorithms consist of three steps: Frame-field design, parametrization generation, and mesh extraction. However, while the first two steps are usually discussed in detail, the last step is often not well studied. In this paper, we fully concentrate on reliable mesh extraction.
Parametrization methods employ computationally expensive countermeasures to avoid mapping input tetrahedra to degenerate or flipped tetrahedra in the parameter domain because such a parametrization does not define a proper hexahedral mesh. Nevertheless, there is no known technique that can guarantee the complete absence of such artifacts.
We tackle this problem from the other side by developing a mesh extraction algorithm which is extremely robust against typical imperfections in the parametrization. First, a sanitization process cleans up numerical inconsistencies of the parameter values caused by limited precision solvers and floating-point number representation. On the sanitized parametrization, we extract vertices and so-called darts based on intersections of the integer grid with the parametric image of the tetrahedral mesh. The darts are reliably interconnected by tracing within the parametrization and thus define the topology of the hexahedral mesh. In a postprocessing step, we let certain pairs of darts cancel each other, counteracting the effect of flipped regions of the parametrization. With this strategy, our algorithm is able to robustly extract hexahedral meshes from imperfect parametrizations which previously would have been considered defective. The algorithm will be published as an open source library [Lyon et al. 2016].
Polycube-based hexahedralization methods robustly generate all-hex meshes without internal singularities. They avoid the difficulty, faced by frame-field based methods, of controlling the global singularity structure required for a valid hexahedralization. To fully exploit this advantage, we propose to use a frame field without internal singularities to guide the polycube construction. Theoretically, our method extends the vector fields associated with the polycube from exact forms to closed forms, which are curl free everywhere but may not be globally integrable. The closed forms give additional degrees of freedom to deal with the topological structure of high-genus models, and also provide better initial axis alignment for subsequent polycube generation. We demonstrate the advantages of our method on various models, ranging from genus-zero models to high-genus ones, and from single-boundary models to multiple-boundary ones.
Computing discrete geodesic distance over triangle meshes is one of the fundamental problems in computational geometry and computer graphics. In this problem, an effective window pruning strategy can significantly affect the actual running time. Due to its importance, we conduct an in-depth study of window pruning operations in this paper, and produce an exhaustive list of scenarios where one window can make another window partially or completely redundant. To identify a maximal number of redundant windows using such pairwise cross checking, we propose a set of procedures to synchronize local window propagation within the same triangle by simultaneously propagating a collection of windows from one triangle edge to its two opposite edges. On the basis of such synchronized window propagation, we design a new geodesic computation algorithm based on a triangle-oriented region growing scheme. Our geodesic algorithm can remove most of the redundant windows at the earliest possible stage, thus significantly reducing computational cost and memory usage at later stages. In addition, by adopting triangles instead of windows as the primitive in propagation management, our algorithm significantly cuts down the data management overhead. As a result, it runs 4-15 times faster than the MMP and ICH algorithms, 2-4 times faster than the FWP-MMP and FWP-CH algorithms, and also incurs the least memory usage.
We present a novel image-based representation for dynamic 3D avatars, which allows effective handling of various hairstyles and headwear, and can generate expressive facial animations with fine-scale details in real-time. We develop algorithms for creating an image-based avatar from a set of sparsely captured images of a user, using an off-the-shelf web camera at home. An optimization method is proposed to construct a topologically consistent morphable model that approximates the dynamic hair geometry in the captured images. We also design a real-time algorithm for synthesizing novel views of an image-based avatar, so that the avatar follows the facial motions of an arbitrary actor. Compelling results from our pipeline are demonstrated on a variety of cases.
The rich signals we extract from facial expressions impose high expectations for the science and art of facial animation. While the advent of high-resolution performance capture has greatly improved realism, the utility of procedural animation warrants a prominent place in facial animation workflow. We present a system that, given an input audio soundtrack and speech transcript, automatically generates expressive lip-synchronized facial animation that is amenable to further artistic refinement, and that is comparable with both performance capture and professional animator output. Because of the diversity of ways we produce sound, the mapping from phonemes to visual depictions as visemes is many-valued. We draw from psycholinguistics to capture this variation using two visually distinct anatomical actions: Jaw and Lip, where sound is primarily controlled by jaw articulation and lower-face muscles, respectively. We describe the construction of a transferable template JALI 3D facial rig, built upon the popular facial muscle action unit representation FACS. We show that acoustic properties in a speech signal map naturally to the dynamic degree of jaw and lip in visual speech. We provide an array of compelling animation clips, compare against performance capture and existing procedural animation, and report on a brief user study.
This paper introduces a method to modify the apparent relative pose and distance between camera and subject given a single portrait photo. Our approach fits a full perspective camera and a parametric 3D head model to the portrait, and then builds a 2D warp in the image plane to approximate the effect of a desired change in 3D. We show that this model is capable of correcting objectionable artifacts such as the large noses sometimes seen in "selfies," or to deliberately bring a distant camera closer to the subject. This framework can also be used to re-pose the subject, as well as to create stereo pairs from an input portrait. We show convincing results on both an existing dataset as well as a new dataset we captured to validate our method.
Head portraits are popular in traditional painting. Automating portrait painting is challenging as the human visual system is sensitive to the slightest irregularities in human faces. Applying generic painting techniques often deforms facial structures. On the other hand, portrait painting techniques are mainly designed for the graphite style and/or are based on image analogies, which require both an example painting and its original unpainted version. This limits their domain of applicability. We present a new technique for transferring the painting from a head portrait onto another. Unlike previous work, our technique only requires the example painting and is not restricted to a specific style. We impose novel spatial constraints by locally transferring the color distributions of the example painting. This better captures the painting texture and maintains the integrity of facial structures. We generate a solution through Convolutional Neural Networks and we present an extension to video. Here, motion is exploited to reduce temporal inconsistencies and the shower-door effect. Our approach transfers the painting style while maintaining the input photograph identity. In addition, it significantly reduces facial deformations compared to the state of the art.
3D modeling remains a notoriously difficult task for novices despite significant research effort to provide intuitive and automated systems. We tackle this problem by combining the strengths of two popular domains: sketch-based modeling and procedural modeling. On the one hand, sketch-based modeling exploits our ability to draw but requires detailed, unambiguous drawings to achieve complex models. On the other hand, procedural modeling automates the creation of precise and detailed geometry but requires the tedious definition and parameterization of procedural models. Our system uses a collection of simple procedural grammars, called snippets, as building blocks to turn sketches into realistic 3D models. We use a machine learning approach to solve the inverse problem of finding the procedural model that best explains a user sketch. We use non-photorealistic rendering to generate artificial data for training convolutional neural networks capable of quickly recognizing the procedural rule intended by a sketch and estimating its parameters. We integrate our algorithm in a coarse-to-fine urban modeling system that allows users to create rich buildings by successively sketching the building mass, roof, facades, windows, and ornaments. A user study shows that by using our approach non-expert users can generate complex buildings in just a few minutes.
Connectivity and layout of underlying networks largely determine agent behavior and usage in many environments. For example, transportation networks determine the flow of traffic in a neighborhood, whereas building floorplans determine the flow of people in a workspace. Designing such networks from scratch is challenging as even local network changes can have large global effects. We investigate how to computationally create networks starting from only high-level functional specifications. Such specifications can be in the form of network density, travel time versus network length, traffic type, destination location, etc. We propose an integer programming-based approach that guarantees that the resultant networks are valid by fulfilling all the specified hard constraints and that they score favorably in terms of the objective function. We evaluate our algorithm in two different design settings, street layout and floorplans to demonstrate that diverse networks can emerge purely from high-level functional specifications.
We propose a novel approach for designing mid-scale layouts by optimizing with respect to human crowd properties. Given an input layout domain such as the boundary of a shopping mall, our approach synthesizes the paths and sites by optimizing three metrics that measure crowd flow properties: mobility, accessibility, and coziness. While these metrics are straightforward to evaluate by a full agent-based crowd simulation, optimizing a layout usually requires hundreds of evaluations, which would require a long time to compute even using the latest crowd simulation techniques. To overcome this challenge, we propose a novel data-driven approach where nonlinear regressors are trained to capture the relationship between the agent-based metrics, and the geometrical and topological features of a layout. We demonstrate that by using the trained regressors, our approach can synthesize crowd-aware layouts and improve existing layouts with better crowd flow properties.
This paper introduces a new computational method to solve differential equations on subdivision surfaces. Our approach adapts the numerical framework of Discrete Exterior Calculus (DEC) from the polygonal to the subdivision setting by exploiting the refinability of subdivision basis functions. The resulting Subdivision Exterior Calculus (SEC) provides significant improvements in accuracy compared to existing polygonal techniques, while offering exact finite-dimensional analogs of continuum structural identities such as Stokes' theorem and Helmholtz-Hodge decomposition. We demonstrate the versatility and efficiency of SEC on common geometry processing tasks including parameterization, geodesic distance computation, and vector field design.
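The phrase "exact finite-dimensional analogs" can be illustrated with the ordinary polygonal DEC that SEC builds on. The sketch below assembles the discrete exterior derivatives d0 (vertices to edges) and d1 (edges to faces) for a tiny two-triangle mesh and checks that d1 composed with d0 vanishes exactly, the discrete counterpart of "the curl of a gradient is zero". It covers the polygonal setting only; the mesh and helper names are assumptions, and nothing here reproduces the subdivision-specific operators of the paper.

```python
def build_operators(num_vertices, faces):
    """Signed incidence matrices d0 (edges x vertices) and d1 (faces x edges)."""
    # Each edge is stored once, oriented from its lower to its higher vertex index.
    edges = sorted({tuple(sorted((f[i], f[(i + 1) % 3]))) for f in faces for i in range(3)})
    index = {e: k for k, e in enumerate(edges)}
    d0 = [[0] * num_vertices for _ in edges]
    for k, (a, b) in enumerate(edges):
        d0[k][a], d0[k][b] = -1, 1          # (d0 f)(edge) = f(head) - f(tail)
    d1 = [[0] * len(edges) for _ in faces]
    for j, f in enumerate(faces):
        for i in range(3):
            a, b = f[i], f[(i + 1) % 3]     # boundary edge in the face's orientation
            if (a, b) in index:
                d1[j][index[(a, b)]] = 1    # orientations agree
            else:
                d1[j][index[(b, a)]] = -1   # orientations disagree
    return d0, d1


def matmul(a, b):
    """Plain dense matrix product for small integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


faces = [(0, 1, 2), (0, 2, 3)]              # hypothetical two-triangle mesh
d0, d1 = build_operators(4, faces)
dd = matmul(d1, d0)                          # every entry should be exactly zero
```

Because the entries are integers, the identity holds exactly rather than up to rounding, which is the sense in which such discrete operators preserve Stokes-type structure.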
We present the Accelerated Quadratic Proxy (AQP) - a simple first-order algorithm for the optimization of geometric energies defined over triangular and tetrahedral meshes.
The main stumbling block of current optimization techniques used to minimize geometric energies over meshes is slow convergence due to ill-conditioning of the energies at their minima. We observe that this ill-conditioning is in large part due to a Laplacian-like term existing in these energies. Consequently, we suggest to locally use a quadratic polynomial proxy, whose Hessian is taken to be the Laplacian, in order to achieve a preconditioning effect. This already improves stability and convergence, but more importantly allows incorporating acceleration in an almost universal way, that is independent of mesh size and of the specific energy considered.
Experiments with AQP show it is rather insensitive to mesh resolution and requires a nearly constant number of iterations to converge; this is in strong contrast to other popular optimization techniques used today such as Accelerated Gradient Descent and Quasi-Newton methods, e.g., L-BFGS. We have tested AQP for mesh deformation in 2D and 3D as well as for surface parameterization, and found it to provide a considerable speedup over common baseline techniques.
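The proxy idea described above can be sketched on a toy problem. The code below minimizes a hypothetical 1D chain energy, the sum over edges of s^2 + 1/s^2 with fixed endpoints, by solving for each step direction with a fixed Laplacian matrix in place of the true Hessian and then backtracking. It deliberately omits the acceleration step, so it illustrates only Laplacian-preconditioned descent, not AQP itself; the energy, the names, and the parameters are all assumptions.

```python
def solve_laplacian(rhs):
    """Solve T d = rhs for tridiagonal T (2 on the diagonal, -1 beside it)."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = -0.5, rhs[0] / 2.0
    for i in range(1, n):                    # forward elimination (Thomas algorithm)
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):           # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def energy(x):
    """Toy distortion energy; infinite if any edge collapses or inverts."""
    total = 0.0
    for a, b in zip(x, x[1:]):
        s = b - a
        if s <= 0.0:
            return float("inf")
        total += s * s + 1.0 / (s * s)
    return total


def gradient(x):
    """Gradient with respect to the interior nodes (endpoints stay fixed)."""
    fp = [2.0 * (b - a) - 2.0 / (b - a) ** 3 for a, b in zip(x, x[1:])]
    return [fp[i - 1] - fp[i] for i in range(1, len(x) - 1)]


def proxy_descent(x, iters=100, c1=1e-4):
    x = list(x)
    for _ in range(iters):
        g = gradient(x)
        d = solve_laplacian([-gi for gi in g])      # proxy step: Laplacian as Hessian
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope > -1e-18:
            break                                    # gradient is numerically zero
        e0, t = energy(x), 1.0
        while t > 1e-12:                             # backtracking (Armijo) line search
            trial = [x[0]] + [xi + t * di for xi, di in zip(x[1:-1], d)] + [x[-1]]
            if energy(trial) <= e0 + c1 * t * slope:
                x = trial
                break
            t *= 0.5
        else:
            break                                    # no acceptable step was found
    return x


start = [0.0, 0.5, 1.8, 3.1, 3.9, 5.3, 5.9, 7.2, 8.0]   # hypothetical initial chain
result = proxy_descent(start)
```

The Laplacian is factored independently of the current iterate, so each iteration costs one gradient evaluation and one linear solve with a constant matrix; the line search also rejects any step that would invert an edge.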
This paper develops new refinement rules for non-uniform Catmull-Clark surfaces that produce G1 extraordinary points whose blending functions have a single local maximum. The method consists of designing an "eigen polyhedron" in R2 for each extraordinary point, and formulating refinement rules for which refinement of the eigen polyhedron reduces to a scale and translation. These refinement rules, when applied to a non-uniform Catmull-Clark control mesh in R3, yield a G1 extraordinary point.
Showy inflorescences - clusters of flowers - are a common feature of many plants, greatly contributing to their beauty. The large numbers of individual flowers (florets), arranged in space in a systematic manner, make inflorescences a natural target for procedural modeling. We present a suite of biologically motivated algorithms for modeling and animating the development of inflorescences with closely packed florets. These inflorescences share the following characteristics: (i) in their ensemble, the florets form a relatively smooth, often approximately planar surface; (ii) there are numerous collisions between petals of the same or adjacent florets; and (iii) the developmental stage and type of a floret may depend on its position within the inflorescence, with drastic or gradual differences. To model flat-topped branched inflorescences (corymbs and umbels), we propose a florets-first algorithm, in which the branching structure self-organizes to support florets in predetermined positions. This is an alternative to previous branching-first models, in which floret positions were determined by branch arrangement. To obtain realistic visualizations, we complement the algorithms that generate the inflorescence structure with an interactive method for modeling floret corollas (petal sets). The method supports corollas with both separate and fused petals. We illustrate our techniques with models from several plant families.
Human motion is complex and difficult to synthesize realistically. Automatic style transfer to transform the mood or identity of a character's motion is a key technology for increasing the value of already synthesized or captured motion data. Typically, state-of-the-art methods require all independent actions observed in the input to be present in a given style database to perform realistic style transfer. We introduce a spectral style transfer method for human motion between independent actions, thereby greatly reducing the required effort and cost of creating such databases. We leverage a spectral domain representation of the human motion to formulate a spatial-correspondence-free approach. We extract spectral intensity representations of reference and source styles for an arbitrary action, and transfer their difference to a novel motion which may contain previously unseen actions. Building on this core method, we introduce a temporally sliding window filter to perform the same analysis locally in time for heterogeneous motion processing. This immediately allows our approach to serve as a style database enhancement technique that fills in non-existent actions in order to increase the performance of previous style transfer methods. We evaluate our method both via quantitative experiments and through controlled user studies with respect to previous work, where significant improvement is observed with our approach.
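A minimal sketch of the core idea, transferring the difference between two spectral intensity (magnitude) representations onto a new signal while keeping that signal's phase, might look as follows for a single joint-angle channel. The naive DFT, the function names, and the synthetic cosine signals are assumptions for illustration; the abstract does not specify the exact representation or filtering used by the method.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def transfer_style(motion, source_style, reference_style):
    """Add the magnitude difference (reference - source) to the motion's spectrum."""
    out = []
    for m, s, r in zip(dft(motion), dft(source_style), dft(reference_style)):
        magnitude = max(abs(m) + (abs(r) - abs(s)), 0.0)   # clamp to stay non-negative
        out.append(cmath.rect(magnitude, cmath.phase(m)))  # keep the motion's own phase
    return idft(out)


n = 16
base = [math.cos(2 * math.pi * t / n) for t in range(n)]   # hypothetical joint angle
neutral = list(base)                                       # source style exemplar
energetic = [2.0 * v for v in base]                        # reference style exemplar
stylized = transfer_style(base, neutral, energetic)
```

Since only magnitudes are compared, the style exemplars do not need to be time-aligned with the input motion, which is the intuition behind avoiding explicit correspondence; all three sequences are assumed to have the same length here.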
We present a framework to synthesize character movements based on high level parameters, such that the produced movements respect the manifold of human motion, trained on a large motion capture dataset. The learned motion manifold, which is represented by the hidden units of a convolutional autoencoder, represents motion data in sparse components which can be combined to produce a wide range of complex movements. To map from high level parameters to the motion manifold, we stack a deep feedforward neural network on top of the trained autoencoder. This network is trained to produce realistic motion sequences from parameters such as a curve over the terrain that the character should follow, or a target location for punching and kicking. The feedforward control network and the motion manifold are trained independently, allowing the user to easily switch between feedforward networks according to the desired interface, without re-training the motion manifold. Once motion is generated it can be edited by performing optimization in the space of the motion manifold. This allows for imposing kinematic constraints, or transforming the style of the motion, while ensuring the edited motion remains natural. As a result, the system can produce smooth, high quality motion sequences without any manual pre-processing of the training data.
We learn a probabilistic model connecting human poses and arrangements of object geometry from real-world observations of interactions collected with commodity RGB-D sensors. This model is encoded as a set of prototypical interaction graphs (PiGraphs), a human-centric representation capturing physical contact and visual attention linkages between 3D geometry and human body parts. We use this encoding of the joint probability distribution over pose and geometry during everyday interactions to generate interaction snapshots, which are static depictions of human poses and relevant objects during human-object interactions. We demonstrate that our model enables a novel human-centric understanding of 3D content and allows for jointly generating 3D scenes and interaction poses given terse high-level specifications, natural language, or reconstructed real-world scene constraints.
Texture synthesis is a well-established area, with many important applications in computer graphics and vision. However, despite their success, synthesis techniques are not used widely in practice because the creation of good exemplars remains challenging and extremely tedious. In this paper, we introduce an unsupervised method for analyzing texture content across multiple scales that automatically extracts good exemplars from natural images. Unlike existing methods, which require extensive manual tuning, our method is fully automatic. This allows the user to focus on using texture palettes derived from their own images, rather than on manual interactions dictated by the needs of an underlying algorithm.
Most natural textures exhibit patterns at multiple scales that may vary according to the location (non-stationarity). To handle such textures, many synthesis algorithms rely on an analysis of the input and a guidance of the synthesis. Our new analysis is based on a labeling of texture patterns that is both (i) multi-scale and (ii) unsupervised -- that is, patterns are labeled at multiple scales, and the scales and the number of labeled clusters are selected automatically. Our method works in two stages. The first builds a hierarchical extension of superpixels, and the second labels the superpixels based on a random walk in a graph of similarity between superpixels and a nonnegative matrix factorization. Our label-maps provide descriptors for pixels and regions that benefit state-of-the-art texture synthesis algorithms. We show several applications including guidance of non-stationary synthesis, content selection and texture painting. Our method is designed to treat large inputs and can scale to many megapixels. In addition to traditional exemplar inputs, our method can also handle natural images containing different textured regions.
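As a rough illustration of a random walk on a superpixel similarity graph, the sketch below row-normalizes a similarity matrix into transition probabilities and propagates them for a few steps. The matrix values, the step count, and the function names `transition_matrix` and `walk` are assumptions made for illustration; the abstract does not specify how the walk is combined with the nonnegative matrix factorization, so that part is omitted.

```python
def transition_matrix(similarity):
    """Row-normalize a similarity matrix into random-walk probabilities.

    Assumption: every row has a positive sum (e.g. positive self-similarity).
    """
    return [[w / sum(row) for w in row] for row in similarity]

def walk(P, steps):
    """Return the multi-step transition probabilities P**steps."""
    n = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(steps):
        result = [[sum(result[i][k] * P[k][j] for k in range(n))
                   for j in range(n)]
                  for i in range(n)]
    return result

# Hypothetical similarities: superpixels 0 and 1 look alike, 2 is different.
similarity = [
    [1.0, 0.9, 0.01],
    [0.9, 1.0, 0.01],
    [0.01, 0.01, 1.0],
]
reach = walk(transition_matrix(similarity), steps=3)
```

After a few steps, a walker starting in superpixel 0 is far more likely to be in superpixel 1 than in superpixel 2, so rows of `reach` can serve as descriptors that group similar superpixels.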
We present a technique to synthesize time-varying weathered textures. Given a single texture image as input, the degree of weathering at different regions of the input texture is estimated by prevalence analysis of texture patches. This information then allows us to gracefully increase or decrease the popularity of weathered patches, simulating the evolution of texture appearance both backward and forward in time. Our method can be applied to a wide variety of textures because it is physically oblivious: the reaction of the material to weathering effects is learned from the input texture itself. The weathering process evolves new structures as well as color variations, providing rich and natural results. In contrast with existing methods, our method does not require any user interaction or assistance. We demonstrate our technique on various textures, and its application to time-varying weathering of 3D scenes. We also extend our method to handle multi-layered textures, weathering transfer, and interactive weathering painting.
This paper presents Soli, a new, robust, high-resolution, low-power, miniature gesture sensing technology for human-computer interaction based on millimeter-wave radar. We describe a new approach to developing a radar-based sensor optimized for human-computer interaction, building the sensor architecture from the ground up with the inclusion of radar design principles, high temporal resolution gesture tracking, a hardware abstraction layer (HAL), a solid-state radar chip and system architecture, interaction models and gesture vocabularies, and gesture recognition. We demonstrate that Soli can be used for robust gesture recognition and can track gestures with sub-millimeter accuracy, running at over 10,000 frames per second on embedded hardware.
Fully articulated hand tracking promises to enable fundamentally new interactions with virtual and augmented worlds, but the limited accuracy and efficiency of current systems have prevented widespread adoption. Today's dominant paradigm uses machine learning for initialization and recovery followed by iterative model-fitting optimization to achieve a detailed pose fit. We follow this paradigm, but make several changes to the model-fitting, namely using: (1) a more discriminative objective function; (2) a smooth-surface model that provides gradients for non-linear optimization; and (3) joint optimization over both the model pose and the correspondences between observed data points and the model surface. While each of these changes may actually increase the cost per fitting iteration, we find a compensating decrease in the number of iterations. Further, the wide basin of convergence means that fewer starting points are needed for successful model fitting. Our system runs in real time on CPU only, which frees up the commonly over-burdened GPU for experience designers. The hand tracker is efficient enough to run on low-power devices such as tablets. We can track up to several meters from the camera to provide a large working volume for interaction, even using the noisy data from current-generation depth cameras. Quantitative assessments on standard datasets show that the new approach exceeds the state of the art in accuracy. Qualitative results take the form of live recordings of a range of interactive experiences enabled by this new approach.
We propose a novel approach to digital character animation, combining the benefits of tangible input devices and sophisticated rig animation algorithms. A symbiotic software and hardware approach facilitates the animation process for novice and expert users alike. We overcome limitations inherent to all previous tangible devices by allowing users to directly control complex rigs using only a small set (5-10) of physical controls. This avoids oversimplification of the pose space and excessively bulky device configurations. Our algorithm derives a small device configuration from complex character rigs, often containing hundreds of degrees of freedom, and a set of sparse sample poses. Importantly, only the most influential degrees of freedom are controlled directly, yet detailed motion is preserved based on a pose interpolation technique. We designed a modular collection of joints and splitters, which can be assembled to represent a wide variety of skeletons. Each joint piece combines a universal joint and two twisting elements, allowing its configuration to be sensed accurately. The mechanical design provides a smooth inverse kinematics-like user experience and is not prone to gimbal locking. We integrate our method with the professional 3D software Autodesk Maya® and discuss a variety of results created with characters available online. Comparative user experiments show significant improvements over the closest state-of-the-art in terms of accuracy and time in a keyframe posing task.
Animation artists enjoy the benefits of simulation but do not want to be held back by its constraints. Artist-directed dynamics seeks to resolve this need with a unified method that combines simulation with classical keyframing techniques. The combination of these approaches improves upon both extremes: simulation becomes more customizable and keyframing becomes more automatic. Examining our system in the context of the twelve fundamental animation principles reveals that it stands out for its treatment of exaggeration and appeal. Our system accommodates abrupt jumps and large plastic deformations, and makes it easy to reuse carefully crafted animations.
We present SketchiMo, a novel approach for the expressive editing of articulated character motion. SketchiMo solves for the motion given a set of projective constraints that relate the sketch inputs to the unknown 3D poses. We introduce the concept of sketch space, a contextual geometric representation of sketch targets---motion properties that are editable via sketch input---that enhances, right on the viewport, different aspects of the motion. The combination of the proposed sketch targets and space allows for seamless editing of a wide range of properties, from simple joint trajectories to local parent-child spatiotemporal relationships and more abstract properties such as coordinated motions. This is made possible by interpreting the user's input through a new sketch-based optimization engine in a uniform way. In addition, our view-dependent sketch space also serves the purpose of disambiguating the user inputs by visualizing their range of effect and transparently defining the necessary constraints to set the temporal boundaries for the optimization.
Shadow theatre is a genre of performance art in which the actors are only visible as shadows projected on the screen. The goal of this study is to generate animated characters, the shadows of which match a sequence of target silhouettes. This poses several challenges. The motions of multiple characters must be carefully coordinated to form a target silhouette on the screen, and each character's pose should be stable, balanced, and plausible. The resulting character animation should be smooth and coherent spatially and temporally. We formulate the problem as a nonlinear constrained optimization whose objectives are designed to generate plausible human motions. Our optimization algorithm was primarily inspired by the heuristic strategies of professional shadow theatre actors. Their know-how was studied and then incorporated into our optimization formulation. We demonstrate the effectiveness of our approach with a variety of target silhouettes and 3D fabrication of the results.
People often take a series of nearly redundant pictures to capture a moment or scene. However, selecting photos to keep or share from a large collection is a painful chore. To address this problem, we seek a relative quality measure within a series of photos taken of the same scene, which can be used for automatic photo triage. Towards this end, we gather a large dataset composed of photo series distilled from personal photo albums. The dataset contains 15,545 unedited photos organized in 5,953 series. By augmenting this dataset with ground truth human preferences among photos within each series, we establish a benchmark for measuring the effectiveness of algorithmic models of how people select photos. We introduce several new approaches for modeling human preference based on machine learning. We also describe applications for the dataset and predictor, including a smart album viewer, automatic photo enhancement, and providing overviews of video clips.
Skies are common backgrounds in photos but often look uninteresting because of the time at which the photo was taken. Professional photographers correct this by using sophisticated tools with painstaking efforts that are beyond the command of ordinary users. In this work, we propose an automatic background replacement algorithm that can generate realistic, artifact-free images with diverse styles of skies. The key idea of our algorithm is to utilize visual semantics to guide the entire process, including sky segmentation, search and replacement. First, we train a deep convolutional neural network for semantic scene parsing, which is used as a visual prior to segment sky regions in a coarse-to-fine manner. Second, in order to find proper skies for replacement, we propose a data-driven sky search scheme based on the semantic layout of the input image. Finally, to re-compose the stylized sky with the original foreground naturally, an appearance transfer method is developed to match statistics locally and semantically. We show that the proposed algorithm can automatically generate a set of visually pleasing results. In addition, we demonstrate the effectiveness of the proposed algorithm with extensive user studies.
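Matching statistics between regions can be illustrated with a simple mean and standard deviation transfer on one color channel. This is only a sketch of the general idea under the assumption that pixels have already been grouped into corresponding semantic regions; the function name `match_statistics` and the choice of first- and second-order statistics are assumptions, not the method described in the paper.

```python
from statistics import mean, pstdev

def match_statistics(region, reference):
    """Shift and scale `region` so its mean and spread match `reference`.

    Both arguments are flat lists of values from one color channel of one
    semantic region. A constant region is only shifted, not scaled.
    """
    mu_r, sd_r = mean(region), pstdev(region)
    mu_ref, sd_ref = mean(reference), pstdev(reference)
    scale = sd_ref / sd_r if sd_r > 0 else 1.0
    return [(v - mu_r) * scale + mu_ref for v in region]
```

Applying the function once per region and per channel gives a locally adapted result, in contrast to a single global adjustment over the whole image.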