A projector manipulates outgoing light rays, while a camera records incoming ones. Combining these optically inverse devices, especially in a coaxial manner, opens up a new class of computer-vision technology. The "Smart Headlight," currently under development at Carnegie Mellon's Robotics Institute, is one example: a device that can "erase" raindrops or snowflakes from a driver's sight, allowing continuous use of high beams without glare for oncoming drivers, and that can enhance the appearance of important objects, such as pedestrians. In that sense, it constitutes "genuine" augmented reality: it manipulates how reality appears to the viewer, rather than merely overlaying objects on an image of reality. This talk will present the state of the Smart Headlight project and discuss further possible applications of projector-camera systems.
We present an exploration into the future of fabrication, in particular the vision of mobile fabrication, which we define as "personal fabrication on the go". We explore this vision with two surveys, two simple hardware prototypes, matching custom apps that provide users with access to a solution database, custom fabrication processes designed specifically for these devices, and a user study conducted in situ on metro trains. Our findings suggest that mobile fabrication is a compelling next direction for personal fabrication. From our experience with the prototypes, we derive hardware requirements to make mobile fabrication technically feasible as well.
In recent years, extensive research in the HCI literature has explored interactive techniques for digital fabrication. However, little of this work has examined how to involve and guide human workers in fabricating larger-scale structures. We propose a novel model of crowdsourced fabrication, in which a large number of workers and volunteers are guided through the process of building a pre-designed structure. The process is facilitated by an intelligent construction space capable of guiding individual workers and coordinating the overall build process. More specifically, we explore the use of smartwatches, indoor location sensing, and instrumented construction materials to provide real-time guidance to workers, coordinated by a foreman engine that manages the overall build process. We report on a three-day deployment of our system to construct a 12-foot-tall bamboo pavilion with assistance from more than one hundred volunteer workers, and reflect on observations and feedback collected during the exhibit.
Everyday tools and objects often need to be customized for an unplanned use or adapted for a specific user, such as adding a bigger pull to a zipper or a larger grip to a pen. The advent of low-cost 3D printing offers the possibility to rapidly construct a wide range of such adaptations. However, while 3D printers are now affordable enough even for home use, the tools needed to design custom adaptations normally require skills beyond those of users with limited 3D modeling experience.
In this paper, we describe Reprise, a design tool for specifying, generating, customizing, and fitting adaptations onto existing household objects. Reprise allows users to express at a high level what type of action is applied to an object. Based on this high-level specification, Reprise automatically generates adaptations. Users can use simple sliders to customize the adaptations to better suit their particular needs and preferences, such as increasing the tightness for gripping, enhancing torque for rotation, or making a larger base for stability. Finally, Reprise provides a toolkit of fastening methods and support structures for fitting the adaptations onto existing objects.
To validate our approach, we used Reprise to replicate 15 existing adaptation examples, each of which represents a specific category in a design space distilled from an analysis of over 3,000 cases found in the literature and online communities. We believe this work will benefit makers and designers in prototyping life-hacking solutions and assistive technologies.
We explore the design space of energy-neutral situated displays, which give physical presence to digital information. We investigate three central dimensions: energy sources, display technologies, and wireless communications. Based on the power implications of our analysis, we present a thin, wireless, photovoltaic-powered display that is quick and easy to deploy and capable of indefinite operation in indoor lighting conditions. The display uses a low-resolution e-paper architecture, which is 35 times more energy-efficient than smaller-sized high-resolution displays. We present a detailed analysis of power consumption and photovoltaic energy-harvesting performance, and a comparison to other display-driving architectures. Depending on the ambient lighting, the display can trigger an update every 1 to 25 minutes and communicate with a PC or smartphone via Bluetooth Low Energy.
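The energy-neutral operating point described above reduces to simple arithmetic: the update interval is the time the photovoltaic cell needs to harvest one update's worth of energy. A minimal sketch, using illustrative numbers rather than the paper's measurements:

```python
# Back-of-envelope energy-neutral budget for a harvesting-powered display.
# Both figures below are assumptions for illustration, not measured values.
harvest_uW = 50.0        # assumed indoor photovoltaic harvesting power (microwatts)
update_energy_mJ = 40.0  # assumed energy per e-paper update plus BLE sync (millijoules)

# Time needed to harvest the energy for one update:
# 40 mJ = 40,000 uJ; at 50 uW the cell harvests 50 uJ per second.
seconds_per_update = (update_energy_mJ * 1000.0) / harvest_uW
minutes_per_update = seconds_per_update / 60.0

print(f"One update every {minutes_per_update:.1f} minutes")
```

With these placeholder values the display could refresh roughly every 13 minutes; brighter or dimmer ambient light shifts the interval, which is consistent with the 1-to-25-minute range reported above.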
We present FaceTouch, a novel interaction concept for mobile Virtual Reality (VR) head-mounted displays (HMDs) that leverages the backside as a touch-sensitive surface. With FaceTouch, the user can point at and select virtual content inside their field of view by touching the corresponding location on the backside of the HMD, utilizing their sense of proprioception. This allows for rich interaction (e.g. gestures) in mobile and nomadic scenarios without having to carry additional accessories (e.g. a gamepad). We built a prototype of FaceTouch and conducted two user studies. In the first study we measured the precision of FaceTouch in a display-fixed target selection task using three different selection techniques, showing a low error rate of 2% that indicates viability for everyday usage. To assess the impact of different mounting positions on user performance, we conducted a second study. We compared three mounting positions of the touchpad (face, hand, and side), showing that mounting the touchpad on the back of the HMD resulted in a significantly lower error rate, lower selection time, and higher usability. Finally, we present interaction techniques and three example applications that explore the FaceTouch design space.
Patients researching medical diagnoses, scientists exploring new fields of literature, and students learning about new domains are all faced with the challenge of capturing the information they find for later use. However, saving information is challenging on mobile devices, where small screens and font sizes, combined with the inaccuracy of finger-based touch screens, make it time-consuming and stressful to select and save text for future use. Furthermore, beyond the challenge of simply selecting a region of bounded text on a mobile device, in many learning and data-exploration tasks the boundaries of what text may be relevant and useful later are themselves uncertain to the user. In contrast to previous approaches, which focused on speeding up the selection process by making the identification of hard boundaries faster, we introduce the idea of intentionally supporting uncertain input in the context of saving information during complex reading and information exploration. We embody this idea in a system that uses force touch and fuzzy bounding boxes, along with post-hoc expandable context, to support identifying and saving information in an intentionally uncertain way on mobile devices. In a two-part user study, we find that this approach reduced selection time and was preferred by participants over the default system text-selection method.
We present HoloFlex, a 3D flexible smartphone featuring a light-field display consisting of a high-resolution P-OLED display and an array of 16,640 microlenses. HoloFlex allows mobile users to interact with 3D images featuring natural visual cues such as motion parallax and stereoscopy without glasses or head tracking. Its flexibility allows the use of bend input for interacting with 3D objects along the z axis. Images are rendered into 12-pixel-wide circular blocks (pinhole views of the 3D scene), which enable ~80 unique viewports at an effective resolution of 160 × 104. The microlens array distributes each pixel from the display in a direction that preserves the angular information of light rays in the 3D scene. We present a preliminary study evaluating the effect of bend input vs. a vertical touch-screen slider on 3D docking performance. Results indicate that bend input significantly improves movement time in this task. We also present 3D applications that utilize bend input, including a 3D editor, a 3D Angry Birds game, and a 3D teleconferencing system.
Existing smartwatches rely on touchscreens for display and input, which inevitably leads to finger occlusion and confines interactivity to a small area. In this work, we introduce AuraSense, which enables rich, around-device smartwatch interactions by adding electric field sensing to the device. To explore how this sensing approach could enhance smartwatch interactions, we considered different antenna configurations and how they could enable useful interaction modalities. We identified four configurations that can support six well-known modalities of particular interest and utility, including gestures above or in close proximity to watches, and touchscreen-like finger tracking on the skin. We quantify the feasibility of these input modalities, suggesting that AuraSense can be low latency and robust across users and environments.
This paper presents ChainFORM: a linear, modular, actuated hardware system as a novel type of shape-changing interface. With rich sensing and actuation capability, this modular hardware system allows users to construct and customize a wide range of interactive applications. Inspired by modular and serpentine robotics, our prototype comprises identical modules that connect in a chain. Modules are equipped with rich input and output capability: touch detection on multiple surfaces, angular detection, visual output, and motor actuation. Each module includes a servo motor wrapped in a flexible circuit board that carries an embedded microcontroller.
Leveraging this modular functionality, we introduce novel interaction capabilities for shape-changing interfaces, such as rearranging the shape/configuration and attaching to passive objects and bodies. To demonstrate the capability and interaction design space of ChainFORM, we implemented a variety of applications, spanning both computer interfaces and hands-on prototyping tools.
This paper introduces swarm user interfaces, a new class of human-computer interfaces composed of many autonomous robots that handle both display and interaction. We describe the design of Zooids, an open-source open-hardware platform for developing tabletop swarm interfaces. The platform consists of a collection of custom-designed wheeled micro robots, each 2.6 cm in diameter, a radio base station, a high-speed DLP structured-light projector for optical tracking, and a software framework for application development and control. We illustrate the potential of tabletop swarm user interfaces through a set of application scenarios developed with Zooids, and discuss general design considerations unique to swarm user interfaces.
We introduce Rovables, miniature robots that can move freely on unmodified clothing. The robots are held in place by magnetic wheels and can climb vertically. They are untethered and have an onboard battery, microcontroller, and wireless communications. They also contain a low-power localization system that uses wheel encoders and an IMU, allowing Rovables to perform limited autonomous navigation on the body. In our technical evaluations, we found that Rovables can operate continuously for 45 minutes and can carry up to 1.5 N. We propose an interaction space for mobile on-body devices spanning sensing, actuation, and interfaces, and develop application scenarios in that space. Our applications include on-body sensing, modular displays, tactile feedback, and interactive clothing and jewelry.
This paper presents a design, simulation, and fabrication pipeline for making transforming inflatables with various materials. We introduce a bending mechanism that creates multiple, programmable shape-changing behaviors with inextensible materials, including paper, plastics, and fabrics. We developed a software tool that generates these bending mechanisms for a given geometry, simulates its transformation, and exports the compound geometry as digital fabrication files. We show a range of fabrication methods, from manual sealing, to heat pressing with custom stencils, to a custom heat-sealing head that can be mounted on standard 3-axis CNC machines to precisely fabricate the designed transforming material. Finally, we present three applications showing how this technology could be used to design interactive wearables, toys, and furniture.
Aligning and distributing graphical objects is a common but cumbersome task. In a preliminary study (six graphic designers, six non-designers), we identified three key problems with current tools: lack of persistence, unpredictability of results, and inability to 'tweak' the layout. We created StickyLines, a tool that treats guidelines as first-class objects: users can create precise, predictable, and persistent interactive alignment and distribution relationships, and 'tweaked' positions can be maintained for subsequent interactions. We ran a 2×2 within-participants experiment comparing StickyLines with standard commands, at two levels of layout difficulty. For complex layouts, participants were 40% faster and required 49% fewer actions with StickyLines than with traditional alignment and distribution commands. In a third study, six professional designers quickly adopted StickyLines and identified novel uses, including creating complex compound guidelines and using them for both spatial and semantic grouping.
The lack of dedicated multitasking interface features in smartphones has resulted in users attempting a sequential form of multitasking via frequent app switching. In addition to the obvious temporal cost, this requires physical and cognitive effort that multiplies as the back-and-forth switching becomes more frequent. We propose porous interfaces, a paradigm that combines the concept of translucent windows with finger identification to support efficient multitasking on small screens. Porous interfaces enable partially transparent app windows overlaid on top of each other, each of them accessible simultaneously using a different finger as input. We design porous interfaces to include a broad range of multitasking interactions with and between windows, while ensuring fidelity with existing smartphone interactions. We develop an end-to-end smartphone interface that demonstrates porous interfaces. In a qualitative study, participants found porous interfaces intuitive, easy, and useful for frequent multitasking scenarios.
Today's game analytics systems are powered by event logs, which reveal information about what players are doing but offer little insight into the types of gameplay that games foster. Moreover, the concept of gameplay itself is difficult to define and quantify. In this paper, we show that analyzing players' controller inputs using probabilistic topic models allows game developers to describe the types of gameplay -- or action -- in games in a quantitative way. More specifically, developers can discover the types of action that a game fosters and the extent to which each game level fosters each type of action, all in an unsupervised manner. They can use this information to verify that their levels feature the appropriate style of gameplay and to recommend levels with gameplay similar to levels that players like. We begin with latent Dirichlet allocation (LDA), the simplest topic model, then develop the player-gameplay action (PGA) model to make the same types of discoveries about gameplay in a way that is independent of each player's play style. We train a player recognition system on the PGA model's output to verify that its discoveries about gameplay are in fact independent of each player's play style. The system recognizes players with over 90% accuracy in about 20 seconds of playtime.
We present TRing, a finger-worn input device that provides instant and customizable interactions. TRing offers a novel method for making plain objects interactive using an embedded magnet and a finger-worn device. With a particle-filter-based magnetic sensing technique, we compute the fingertip's position relative to the embedded magnet. We also offer a magnet placement algorithm that guides the magnet installation location based on the user's interface customization. By simply inserting or attaching a small magnet, we bring interactivity to both fabricated and existing objects. In our evaluations, TRing shows an average tracking error of 8.6 mm in 3D space and a 2D targeting error of 4.96 mm, which are sufficient for implementing average-sized conventional controls such as buttons and sliders. A user study validates the input performance with TRing on a targeting task (92% accuracy within 45 mm distance) and a cursor control task (91% accuracy for a 10 mm target). Furthermore, we show examples that highlight the interaction capability of our approach.
Traditional wearable tactile displays transfer tactile stimulation through firm contact between the stimulator and the skin. We conjecture that firm contact may not always be possible or acceptable. Therefore, we explored the concept of a non-contact wearable tactile display using airflow, which can transfer information without firm contact. To secure an empirical ground for the design of a wearable airflow display, we conducted a series of psychophysical experiments to estimate the intensity, duration, and distance thresholds of airflow perception at various body locations, and report the resulting empirical data in this paper. We then built a 4-point airflow display, compared its performance with that of a vibrotactile display, and showed that the two tactile displays are comparable in information transfer performance. User feedback was also positive and revealed many unique expressions describing airflow-based tactile experiences. Lastly, we demonstrate the feasibility of an airflow-based wearable tactile display with a prototype using micro-fans.
We present RealPen, an augmented stylus for capacitive tablet screens that recreates the physical sensation of writing on paper with a pencil, ball-point pen, or marker pen. The aim is to create a more engaging experience when writing on touch surfaces, such as the screens of tablet computers. This is achieved by regenerating the friction-induced oscillation and sound of a real writing tool in contact with paper. To generate realistic tactile feedback, our algorithm analyzes the frequency spectrum of the friction oscillation generated when writing with traditional tools, extracts principal frequencies, and uses the actuator's frequency response profile as an adjustment weighting function. We enhance the realism by providing sound feedback aligned with the writing pressure and speed. Furthermore, we investigated the effects of superposition and fluctuation of several frequencies on human tactile perception, evaluated the performance of RealPen, and characterized users' perception and preference for each feedback type.
We explore how to create interactive systems based on electrical muscle stimulation that offer expressive output. We present muscle-plotter, a system that provides users with input and output access to a computer system while on the go. Using pen-on-paper interaction, muscle-plotter allows users to engage in cognitively demanding activities, such as writing math. Users write formulas using a pen, and the system responds by making the user's hand draw charts and widgets. While Anoto technology in the pen tracks the user's input, muscle-plotter uses electrical muscle stimulation (EMS) to steer the user's wrist so as to plot charts, fit lines through data points, find data points of interest, or fill in forms. We demonstrate the system with six simple example applications, including a wind tunnel simulator.
The key idea behind muscle-plotter is to make the user's hand sweep an area on which muscle-plotter renders curves, i.e., series of values, and to persist this EMS output by means of the pen. This allows the system to build up a larger whole. Still, the use of EMS allows muscle-plotter to achieve a compact and mobile form factor. In our user study, muscle-plotter made participants draw random plots with an accuracy of ±4.07 mm and preserved the frequency of functions to be drawn up to 0.3 cycles per cm.
Haptic learning of gesture shortcuts has never been explored. In this paper, we investigate haptic learning of a freehand semaphoric finger-tap gesture shortcut set using haptic rings. We conduct a two-day study with 30 participants in which we couple haptic stimuli with visual and audio stimuli, and compare their learning performance with wholly visual learning. The results indicate that with under 30 minutes of learning, haptic learning of finger-tap semaphoric gestures is comparable to visual learning and maintains its recall on the second day.
We present GyroVR, head-worn flywheels designed to render inertia in Virtual Reality (VR). Motions such as flying, diving, or floating in outer space generate kinesthetic forces on our body which impede movement and are currently not represented in VR. We simulate those kinesthetic forces by attaching flywheels to the user's head, leveraging the gyroscopic resistance that arises when the spinning axis of rotation changes. GyroVR is an ungrounded, wireless, and self-contained device, allowing the user to move freely inside the virtual environment. Its generic shape allows it to be attached to different positions on the user's body. We evaluated the impact of different mounting positions on the head (back and front) in terms of immersion, enjoyment, and simulator sickness. Our results show that attaching GyroVR to the front of the head-mounted display (HMD) resulted in the highest levels of immersion and enjoyment, suggesting it could be built into future VR HMDs to enable kinesthetic forces in VR.
Software APIs often contain too many methods and parameters for developers to memorize or navigate effectively. Instead, developers resort to finding answers through online search engines and systems such as Stack Overflow. However, the process of finding and integrating a working solution is often very time-consuming. Though code search engines have increased in quality, significant language and workflow gaps remain in meeting end-user needs. Novice and intermediate programmers often lack the language to query and the expertise to transfer found code to their task. To address this problem, we present CodeMend, a system to support finding and integrating code. CodeMend leverages a neural embedding model to jointly model natural language and code, as mined from large Web and code datasets. We also demonstrate a novel mixed-initiative interface to support the query and integration steps. Through CodeMend, end-users describe their goal in natural language. The system makes salient the relevant API functions and the lines in the end-user's program that should be changed, and proposes the actual change. We demonstrate the utility and accuracy of CodeMend through lab and simulation studies.
Collectively authored programming resources such as Q&A sites and open-source libraries provide a limited window into how programs are constructed, debugged, and run. To address these limitations, we introduce Meta: a language extension for Python that allows programmers to share functions and track how they are used by a crowd of other programmers. Meta functions are shareable via URL and instrumented to record runtime data. Combining thousands of Meta functions with their collective runtime data, we demonstrate tools including an optimizer that replaces your function with a more efficient version written by someone else, an auto-patcher that saves your program from crashing by finding equivalent functions in the community, and a proactive linter that warns you when a function fails elsewhere in the community. We find that professional programmers are able to use Meta for complex tasks (creating new Meta functions that, for example, cross-validate a logistic regression), and that Meta is able to find 44 optimizations (for a 1.45 times average speedup) and 5 bug fixes across the crowd.
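The core of Meta is instrumenting shareable functions so that runtime data accumulates across a crowd of users. The source does not specify Meta's implementation, but the idea can be sketched as a decorator that records each call into a shared log (the decorator name, log structure, and example function below are hypothetical):

```python
import functools
import time

# Stands in for Meta's shared runtime database; in the real system this
# data would be aggregated across thousands of users' runs.
RUNTIME_LOG = []

def meta(fn):
    """Hypothetical sketch of a Meta-style instrumented function:
    record the arguments and timing of every call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        t0 = time.perf_counter()
        result = fn(*args, **kwargs)
        RUNTIME_LOG.append({
            "name": fn.__name__,
            "args": args,
            "elapsed": time.perf_counter() - t0,
        })
        return result
    return wrapper

@meta
def slug(text):
    # An example function a user might share via URL.
    return text.lower().replace(" ", "-")

slug("Hello World")  # each call leaves a record in RUNTIME_LOG
```

Given such logs from many users, tools like the optimizer or auto-patcher described above can compare functions with equivalent input/output behavior and swap in faster or non-crashing alternatives.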
Touch screens have a delay between user input and the corresponding visual interface feedback, called input 'latency' (or 'lag'). Visual latency is more noticeable during continuous input actions like dragging, so methods that display feedback based on the most likely path for the next few input points have been described in research papers and patents. Designing these 'next-point prediction' methods is challenging, and there have been no standard metrics to compare different approaches. We introduce metrics to quantify the probability of 7 spatial error 'side-effects' caused by next-point prediction methods. The types of side-effects are derived from a thematic analysis of comments gathered in a 12-participant study covering drawing, dragging, and panning tasks using 5 state-of-the-art next-point predictors. Using experiment logs of actual and predicted input points, we develop quantitative metrics that correlate positively with the frequency of perceived side-effects. These metrics enable practitioners to compare next-point predictors using only input logs.
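A metric computed "using only input logs" pairs each predicted point with the actual input point it anticipated and summarizes the deviation. The following is a minimal sketch of one such spatial-error summary, with invented log data; it is not one of the paper's seven metrics, just an illustration of the log-based approach:

```python
import numpy as np

# Hypothetical log: actual finger positions, and the point the predictor
# displayed at each frame in place of the (delayed) real input.
actual = np.array([[0.0, 0.0], [1.0, 0.2], [2.1, 0.5], [3.0, 0.9]])
predicted = np.array([[0.2, 0.0], [1.3, 0.1], [2.0, 0.7], [3.4, 1.0]])

# Per-frame Euclidean deviation of the predicted point from the actual one.
errors = np.linalg.norm(predicted - actual, axis=1)

# Summaries comparable across predictors from logs alone: the mean captures
# systematic offset, while the maximum flags the spikes that users tend to
# perceive as visible jitter or overshoot.
mean_err = errors.mean()
worst_err = errors.max()
```

Side-effect-specific metrics would replace the raw deviation with quantities targeted at a particular artifact (e.g., deviation perpendicular to the stroke for "wrong direction" errors), but the pipeline of logs in, per-frame measures, aggregate score out is the same.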
We explore the contextual details afforded by wearable devices to support multi-user, direct-touch interaction on electronic whiteboards in a way that, unlike previous work, can be fully consistent with natural bimanual-asymmetric interaction as set forth by Guiard.
Our work offers the following key observation. While Guiard's framework has been widely applied in HCI, for bimanual interfaces where each hand interacts via direct touch, subtle limitations of multi-touch technologies, as well as limitations in conception and design, mean that the resulting interfaces often cannot fully adhere to Guiard's principles even when intended to. The interactions are fundamentally ambiguous because the system does not know which hand, left or right, contributes each touch. But by integrating additional context from wearable devices, our system can identify which user is touching, as well as distinguish what hand they use to do so. This enables our prototypes to respect lateral preference, the assignment of natural roles to each hand as advocated by Guiard, in a way that has not been articulated before.
We explore how gaze can support touch interaction on tablets. When holding the device, the free thumb is normally limited in reach, but it can provide an opportunity for indirect touch input. Here we propose gaze-and-touch input, where touches are redirected to the gaze target. This provides whole-screen reachability while using only a single hand for both holding and input. We present a user study comparing this technique to direct touch, showing that users are slightly slower but gain one-handed use with less physical effort. To enable interaction with small targets, we introduce CursorShift, a method that uses gaze to give users temporal control over cursors during direct-touch interactions. Taken together, users can employ three techniques on tablets: direct touch, gaze and touch, and cursor input. In three applications, we explore how these techniques can coexist in the same UI, demonstrate how tablet tasks can be performed with thumb-only input of the holding hand, and describe novel interaction techniques for gaze-based tablet interaction.
Accurately predicting the success rate of finger-touch target acquisition is crucial for designing touchscreen UIs and for modeling complex, higher-level touch interaction behaviors. Despite its importance, there has been little theoretical work on creating such models. Building on the Dual Gaussian Distribution Model, we derived an accuracy model that predicts the success rate of target acquisition based on the target size. We evaluated the model by comparing the predicted success rates with empirical measures for three types of targets: 1-dimensional vertical, 1-dimensional horizontal, and 2-dimensional circular targets. The predictions matched the empirical data very well: the differences between predicted and observed success rates were under 5% for 4.8 mm and 7.2 mm targets, and under 10% for 2.4 mm targets. The evaluation results suggest that our simple model can reliably predict touch accuracy.
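The shape of such a model can be sketched from first principles: if touch points land in a Gaussian distribution around the target center, the predicted success rate is the Gaussian mass falling inside the target. The sketch below assumes a 1-D target and an illustrative variance model in which touch spread grows with target width (the parameter values are invented, not the paper's fitted ones):

```python
from math import erf, sqrt

def touch_success_rate(width_mm, alpha=0.0075, sigma_a=1.1):
    """Predicted success rate for a 1-D target of the given width (mm).

    Assumes touch points are Gaussian around the target center with
    variance sigma^2 = alpha * W^2 + sigma_a^2 (illustrative parameters).
    """
    sigma = sqrt(alpha * width_mm ** 2 + sigma_a ** 2)
    # Probability mass of a zero-mean Gaussian within [-W/2, +W/2].
    return erf((width_mm / 2) / (sqrt(2) * sigma))

for w in (2.4, 4.8, 7.2):
    print(f"{w} mm target -> predicted success rate {touch_success_rate(w):.1%}")
```

Because the absolute spread term dominates for small targets, predicted success rates fall off sharply below a few millimeters, mirroring the larger prediction error reported above for 2.4 mm targets.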
Smartwatches and wearables are unique in that they reside on the body, presenting great potential for always-available input and interaction. Their position on the wrist makes them ideal for capturing bio-acoustic signals. We developed a custom smartwatch kernel that boosts the sampling rate of a smartwatch's existing accelerometer to 4 kHz. Using this new source of high-fidelity data, we uncovered a wide range of applications. For example, we can use bio-acoustic data to classify hand gestures such as flicks, claps, scratches, and taps, which combine with on-device motion tracking to create a wide range of expressive input modalities. Bio-acoustic sensing can also detect the vibrations of grasped mechanical or motor-powered objects, enabling passive object recognition that can augment everyday experiences with context-aware functionality. Finally, we can generate structured vibrations using a transducer, and show that data can be transmitted through the human body. Overall, our contributions unlock user interface techniques that previously relied on special-purpose and/or cumbersome instrumentation, making such interactions considerably more feasible for inclusion in future consumer devices.
Today's commercially available prosthetic limbs lack tactile sensation and feedback. Recent research in this domain focuses on sensor technologies designed to be directly embedded into future prostheses. We present a novel concept and prototype of a prosthetic-sensing wearable that offers a non-invasive, self-applicable, and customizable approach to the sensory augmentation of present-day and future low- to mid-priced lower-limb prosthetics. From consultations with eight lower-limb amputees, we investigated the design space for prosthetic-sensing wearables and developed novel interaction methods for dynamic, user-driven creation and mapping of sensing regions on the foot to wearable haptic feedback actuators. Based on a pilot study with amputees, we assessed the utility of our design in scenarios raised by the amputees, and we summarize our findings to establish future directions for research into using smart textiles for the sensory enhancement of prosthetic limbs.
We present SleepCoacher, an integrated system implementing a framework for effective self-experiments. SleepCoacher automates the cycle of single-case experiments by collecting raw mobile sensor data and generating personalized, data-driven sleep recommendations based on a collection of template recommendations created with input from clinicians. The system guides users through iterative short experiments to test the effect of recommendations on their sleep. We evaluate SleepCoacher in two studies, measuring the effect of recommendations on the frequency of awakenings, self-reported restfulness, and sleep onset latency, and conclude that it is effective: participant sleep improves as adherence to SleepCoacher's recommendations and experiment schedule increases. This approach presents computationally enhanced interventions leveraging the capacity of a closed feedback loop system, offering a method for scaling guided single-case experiments in real time.
To address the increasing functionality (or information) overload of smartphones, prior research has explored a variety of methods to extend the input vocabulary of mobile devices. In particular, body tapping has been previously proposed as a technique that allows the user to quickly access a target functionality by simply tapping at a specific location of the body with a smartphone. Though compelling, prior work often fell short in enabling users' unconstrained tapping locations or behaviors. To address this problem, we developed a novel recognition method that combines both offline learning (performed before the system sees any user-defined gestures) and online learning to reliably recognize arbitrary, user-defined body-tapping gestures, using only a smartphone's built-in sensors. Our experiment indicates that our method significantly outperforms baseline approaches in several usage conditions. In particular, provided with only a single sample per location, our accuracy is 30.8% higher than an SVM baseline and 24.8% higher than a template-matching method. Based on these findings, we discuss how our approach can be generalized to other user-defined gesture problems.
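The template-matching baseline mentioned above, and the general shape of one-shot enrollment plus online refinement, can be sketched as follows. This is an illustrative stand-in (class and method names are ours), not the paper's recognizer.

```python
import numpy as np

class TapRecognizer:
    """One template per body location, refined online as taps are confirmed."""

    def __init__(self):
        self.templates = {}  # location label -> feature vector

    def enroll(self, label, features):
        """Offline step: register a single user-provided sample per location."""
        self.templates[label] = np.asarray(features, dtype=float)

    def classify(self, features):
        """Nearest-template classification by Euclidean distance."""
        x = np.asarray(features, dtype=float)
        return min(self.templates,
                   key=lambda l: np.linalg.norm(self.templates[l] - x))

    def update(self, label, features, alpha=0.2):
        """Online step: move the template toward a confirmed new sample."""
        x = np.asarray(features, dtype=float)
        self.templates[label] = (1 - alpha) * self.templates[label] + alpha * x
```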
Natural language interfaces for visualizations have emerged as a promising new way of interacting with data and performing analytics. However, many of these systems have fundamental limitations: most return minimally interactive visualizations in response to queries, and they often require experts to model a set of predicted user queries before the systems are effective. Eviza provides a natural language interface for an interactive query dialog with an existing visualization, rather than starting from a blank sheet and asking closed-ended questions that return a single text answer or a static visualization. The system employs a probabilistic grammar-based approach with predefined rules that are dynamically updated based on the data from the visualization, as opposed to computationally intensive deep-learning or knowledge-based approaches.
The result of an interaction is a change to the view (e.g., filtering, navigation, selection) that provides graphical answers, with ambiguity widgets to handle ambiguous queries and system defaults. Rich domain awareness of time, space, and quantitative reasoning is also built in, along with links into existing knowledge bases for additional semantics. Eviza also supports pragmatics and explores multi-modal interactions to enhance the expressiveness of how users can ask questions about their data during the flow of visual analysis.
Direct manipulation interfaces provide intuitive and interactive features to a broad range of users, but they often exhibit two limitations: the built-in features cannot possibly cover all use cases, and the internal representation of the content is not readily exposed. We believe that if direct manipulation interfaces were to (a) use general-purpose programs as the representation format, and (b) expose those programs to the user, then experts could customize these systems in powerful new ways and non-experts could enjoy some of the benefits of programmable systems.
In recent work, we presented a prototype SVG editor called Sketch-n-Sketch that offered a step towards this vision. In that system, the user wrote a program in a general-purpose lambda calculus to generate a graphic design and could then directly manipulate the output to indirectly change design parameters (i.e., constant literals) in the program in real time during the manipulation. Unfortunately, the burden of programming the desired relationships rested entirely on the user.
In this paper, we design and implement new features for Sketch-n-Sketch that assist in the programming process itself. Like typical direct manipulation systems, our extended Sketch-n-Sketch now provides GUI-based tools for drawing shapes, relating shapes to each other, and grouping shapes together. Unlike typical systems, however, each tool carries out the user's intention by transforming their general-purpose program. This novel, semi-automated programming workflow allows the user to rapidly create high-level, reusable abstractions in the program while at the same time retaining direct manipulation capabilities. In future work, our approach may be extended with more graphic design features or realized for other application domains.
As small displays on devices like smartwatches become increasingly common, many people have difficulty reading the text on these displays. Vision conditions like presbyopia, which result in blurry near vision, make reading small text particularly hard. We design multiple scripts for displaying English text, legible at small sizes even when blurry, for small screens such as those of smartphones and smartwatches. These "smartfonts" redesign visual character presentations to improve the reading experience. Like cursive, Grade 1 Braille, and ordinary fonts, they preserve orthography and spelling. They have the potential to enable people to read more text comfortably on small screens, e.g., without reading glasses. To simulate presbyopia, we blur images and evaluate their legibility using paid crowdsourcing. We also evaluate the difficulty of learning to read smartfonts and observe a learnability/legibility trade-off. Our most learnable smartfont can be read at roughly half the speed of Latin after two thousand practice sentences. It is also legible, when blurry, at less than half the size of traditional Latin (i.e., "English") text.
An interactive method for segmentation and isosurface extraction of medical volume data is proposed. In conventional methods, users decompose a volume into multiple regions iteratively, segment each region using a threshold, and then manually clean the segmentation result by removing clutter in each region. However, this is tedious and requires many mouse operations from different camera views. We propose an alternative approach whereby the user simply applies painting operations to the volume using tools commonly seen in painting systems, such as flood fill and brushes. This significantly reduces the number of mouse and camera control operations. Our technical contribution is in the introduction of the threshold field, which assigns spatially-varying threshold values to individual voxels. This generalizes discrete decomposition of a volume into regions and segmentation using a constant threshold in each region, thereby offering a much more flexible and efficient workflow. This paper describes the details of the user interaction and its implementation. Furthermore, the results of a user study are discussed. The results indicate that the proposed method can be a few times faster than a conventional method.
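At evaluation time, the threshold field reduces to a per-voxel comparison: each voxel is tested against its own threshold rather than one global value. The minimal sketch below (using NumPy, our choice) shows only this final step; the painting tools that interactively edit the field are the paper's contribution and are not reproduced here.

```python
import numpy as np

def segment(volume, threshold_field):
    """Segment a volume with a spatially-varying threshold.

    volume and threshold_field have the same shape; the returned boolean
    mask marks voxels whose intensity meets their local threshold."""
    volume = np.asarray(volume, dtype=float)
    thresholds = np.asarray(threshold_field, dtype=float)
    return volume >= thresholds  # one independent decision per voxel
```

A constant array for `threshold_field` recovers conventional global thresholding, which is why the authors describe the field as a generalization of region-wise constant thresholds.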
For glass-free mixed reality (MR), mid-air imaging is a promising way of superimposing a virtual image onto a real object. We focus on attaching virtual images to non-static real-life objects. In previous work, moving the real object causes latency in the superimposing system, so the virtual image seems to follow the object with a delay. This is caused by delays in the sensors, displays, and computational devices used for position sensing, and occasionally in actuators for moving the image-generation source. To avoid this problem, this paper proposes to separate the object-anchored imaging effect from position sensing. Our proposal is a retro-reflective system called "SkyAnchor," which consists only of optical devices: two mirrors and an aerial-imaging plate. The system reflects light from a light source anchored under the physical object itself, and forms an image anchored around the object. This optical solution causes no latency in principle and is effective for high-quality mixed reality applications. We consider two types of light sources to be attached to physical objects: reflecting content from a touch table on which the object rests, or attaching the source directly to the object. For position sensing, we utilize a capacitive marker on the bottom of the object, tracked on a touch table. We have implemented a prototype in which mid-air images move with the object and whose content may change based on its position.
We present physical interfaces that change their appearance through controlled transparency. These transparency-controlled physical interfaces are well suited for applications where communication through optical appearance is sufficient, such as ambient display scenarios. They transition between perceived shapes within milliseconds, require no mechanically moving parts and consume little energy. We build 3D physical interfaces with individually controllable parts by laser cutting and folding a single sheet of transparency-controlled material. Electrical connections are engraved in the surface, eliminating the need for wiring individual parts. We consider our work as complementary to current shape-changing interfaces. While our proposed interfaces do not exhibit dynamic tangible qualities, they have unique benefits such as the ability to create apparent holes or nesting of objects. We explore the benefits of transparency-controlled physical interfaces by characterizing their design space and showcase four physical prototypes: two activity indicators, a playful avatar, and a lamp shade with dynamic appearance.
We present JOLED, a mid-air display for interactive physical visualization using Janus objects as physical voxels. The Janus objects have special surfaces that have two or more asymmetric physical properties at different areas. In JOLED, they are levitated in mid-air and controllably rotated to reveal their different physical properties. We made voxels by coating the hemispheres of expanded polystyrene beads with different materials, and applied a thin patch of titanium dioxide to induce electrostatic charge on them. Transparent indium tin oxide electrodes are used around the levitation volume to create a tailored electric field to control the orientation of the voxels. We propose a novel method to control the angular position of individual voxels in a grid using electrostatic rotation and their 3D position using acoustic levitation. We present a display in which voxels can be flipped independently, and two mid-air physical games with a voxel as the playable character that moves in 3D across other physical structures and rotates to reflect its status in the games. We demonstrate a voxel update speed of 37.8 ms/flip, which is video-rate.
Room-temperature liquid metal GaIn25 (eutectic gallium-indium alloy, 75% gallium and 25% indium) has distinctive properties of reversible deformation and controllable locomotion under an external electric-field stimulus. Liquid metal's newly discovered properties imply great possibilities for developing new techniques for interface design. In this paper, we present LIME, LIquid MEtal interfaces for non-rigid interaction. We first discuss the interaction potential of LIME interfaces. Then we introduce the development of LIME cells and the design of several LIME widgets.
A computer display that is sufficiently realistic that the difference between a presented image and a real object cannot be discerned is in high demand in a wide range of fields, such as entertainment, digital signage, and the design industry. To achieve such a level of reality, it is essential to reproduce the three-dimensional (3D) shape and material appearance simultaneously; however, to date, developing a display that can satisfy both conditions has been difficult. To address this problem, we propose a system that places physical elements at desired locations to create a visual image that is perceivable by the naked eye. This configuration can be realized by exploiting characteristics of human visual perception: humans perceive modulated light as perfectly steady if the modulation rate is sufficiently high. Therefore, if high-speed, spatially varying illumination is projected at the desired timing onto actuated physical elements possessing various appearances, the result is a realistic visual image that can be transformed dynamically by simply modifying the lighting pattern. We call the proposed display technology Phyxel. This paper describes the proposed configuration and the performance required for Phyxel. We also demonstrate three applications: dynamic stop motion, a layered 3D display, and shape mixture.
Private Webmail 2.0 (Pwm 2.0) improves upon the current state of the art by increasing the usability and practical security of secure email for ordinary users. More users are able to send and receive encrypted emails without mistakenly revealing sensitive information. In this paper we describe four user interface traits that positively affect the usability and security of Pwm 2.0. In a user study involving 51 participants we validate that these interface modifications result in high usability, few mistakes, and a strong understanding of the protection provided to secure email messages. We also show that the use of manual encryption has no effect on usability or security.
We present CloakingNote, a novel desktop interface for subtle writing. The main idea of CloakingNote is to misdirect observers' attention away from a real text by using a prominent decoy text. To assess the subtlety of CloakingNote, we conducted a subtlety test while varying the contrast ratio between the real text and its background. Our results demonstrated that the real text as well as the interface itself were subtle even when participants were aware that a writer might be engaged in suspicious activities. We also evaluated the feasibility of CloakingNote through a performance test and categorized the users' layout strategies.
Many people can author static web pages with HTML and CSS but find it hard or impossible to program persistent, interactive web applications. We show that for a broad class of CRUD (Create, Read, Update, Delete) applications, this gap can be bridged. Mavo extends the declarative syntax of HTML to describe Web applications that manage, store and transform data. Using Mavo, authors with basic HTML knowledge define complex data schemas implicitly as they design their HTML layout. They need only add a few attributes and expressions to their HTML elements to transform their static design into a persistent, data-driven web application whose data can be edited by direct manipulation of the content in the browser. We evaluated Mavo with 20 users who marked up static designs---some provided by us, some their own creation---to transform them into fully functional web applications. Even users with no programming experience were able to quickly craft Mavo applications.
We present QuickCut, an interactive video editing tool designed to help authors efficiently edit narrated video. QuickCut takes an audio recording of the narration voiceover and a collection of raw video footage as input. Users then review the raw footage and provide spoken annotations describing the relevant actions and objects in the scene. QuickCut time-aligns a transcript of the annotations with the raw footage and a transcript of the narration with the voiceover. These aligned transcripts enable authors to quickly match story events in the narration with semantically relevant video segments and form alignment constraints between them. Given a set of such constraints, QuickCut applies dynamic programming optimization to choose frame-level cut points between the video segments while maintaining alignments with the narration and adhering to low-level film editing guidelines. We demonstrate QuickCut's effectiveness by using it to generate a variety of short (less than 2 minutes) narrated videos. Each result required between 14 and 52 minutes of user time to edit (i.e., between 8 and 31 minutes per minute of output video), which is far less than typical authoring times with existing video editing workflows.
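The cut-point selection step can be pictured with a generic dynamic program: given a per-frame penalty for each cut (standing in for alignment constraints and film-editing costs), choose a strictly increasing sequence of cut frames with minimum total penalty. This is an illustrative stand-in for the idea, not QuickCut's actual objective function.

```python
def choose_cuts(cost):
    """cost[i][t] is the penalty for placing cut i at frame t; cuts must be
    strictly increasing in t. Returns one chosen frame index per cut."""
    K, N = len(cost), len(cost[0])
    INF = float("inf")
    best = [[INF] * N for _ in range(K)]
    back = [[-1] * N for _ in range(K)]
    for t in range(N):
        best[0][t] = cost[0][t]
    for i in range(1, K):
        run_best, run_arg = INF, -1
        for t in range(N):
            # cheapest placement of cut i-1 strictly before frame t
            if t >= 1 and best[i - 1][t - 1] < run_best:
                run_best, run_arg = best[i - 1][t - 1], t - 1
            if run_best < INF:
                best[i][t] = run_best + cost[i][t]
                back[i][t] = run_arg
    # backtrack from the cheapest placement of the final cut
    t = min(range(N), key=lambda f: best[K - 1][f])
    cuts = [t]
    for i in range(K - 1, 0, -1):
        t = back[i][t]
        cuts.append(t)
    return cuts[::-1]
```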
Speech recordings are central to modern media from podcasts to audio books to e-lectures and voice-overs. Authoring these recordings involves an iterative back and forth process between script writing/editing and audio recording/editing. Yet, most existing tools treat the script and the audio separately, making the back and forth workflow very tedious. We present Voice Script, an interface to support a dynamic workflow for script writing and audio recording/editing. Our system integrates the script with the audio such that, as the user writes the script or records speech, edits to the script are translated to the audio and vice versa. Through informal user studies, we demonstrate that our interface greatly facilitates the audio authoring process in various scenarios.
Video production is a collaborative process in which stakeholders regularly review drafts of the edited video to indicate problems and offer suggestions for improvement. Although practitioners prefer in-person feedback, most reviews are conducted asynchronously via email due to scheduling and location constraints. The use of this impoverished medium is challenging for both providers and consumers of feedback. We introduce VidCrit, a system for providing asynchronous feedback on drafts of edited video that incorporates favorable qualities of an in-person review. This system consists of two separate interfaces: (1) A feedback recording interface captures reviewers' spoken comments, mouse interactions, hand gestures and other physical reactions. (2) A feedback viewing interface transcribes and segments the recorded review into topical comments so that the video author can browse the review by either text or timelines. Our system features novel methods to automatically segment a long review session into topical text comments, and to label such comments with additional contextual information. We interviewed practitioners to inform a set of design guidelines for giving and receiving feedback, and based our system's design on these guidelines. Video reviewers using our system preferred our feedback recording interface over email for providing feedback due to the reduction in time and effort. In a fixed amount of time, reviewers provided 10.9 (σ=5.09) more local comments than when using text. All video authors rated our feedback viewing interface preferable to receiving feedback via e-mail.
Recently, researchers started to engineer not only the outer shape of objects, but also their internal microstructure. Such objects, typically based on 3D cell grids, are also known as metamaterials. Metamaterials have been used, for example, to create materials with soft and hard regions.
So far, metamaterials have been understood as materials; we want to think of them as machines. We demonstrate metamaterial objects that perform a mechanical function. Such metamaterial mechanisms consist of a single block of material whose cells play together in a well-defined way in order to achieve macroscopic movement. Our metamaterial door latch, for example, transforms the rotary movement of its handle into a linear motion of the latch. Our metamaterial Jansen walker consists of a single block of cells that can walk. The key element behind our metamaterial mechanisms is a specialized type of cell whose only ability is to shear.
In order to allow users to create metamaterial mechanisms efficiently we implemented a specialized 3D editor. It allows users to place different types of cells, including the shear cell, thereby allowing users to add mechanical functionality to their objects. To help users verify their designs during editing, our editor allows users to apply forces and simulates how the object deforms in response.
Several recent projects have introduced digital machines to the kitchen, yet their impact on culinary culture is limited. We envision a culture of Digital Gastronomy that enhances traditional cooking with new interactive capabilities, rather than replacing the chef with an autonomous machine. Thus, we deploy existing digital fabrication instruments in a traditional kitchen and integrate them into cooking via hybrid recipes. This concept merges manual and digital procedures and imports parametric design tools into cooking, allowing the chef to personalize the tastes, flavors, structures, and aesthetics of dishes. In this paper we present our hybrid kitchen and the new cooking methodology, illustrated by detailed recipes with degrees of freedom that can be set digitally prior to cooking. Lastly, we discuss future work and conclude with thoughts on the future of hybrid gastronomy.
We introduce a new form of low-cost 3D printer to print interactive electromechanical objects with wound-in-place coils. At the heart of this printer is a mechanism for depositing wire within a five-degree-of-freedom (5DOF) fused deposition modeling (FDM) 3D printer. Copper wire can be used with this mechanism to form coils which induce magnetic fields as a current is passed through them. Soft iron wire can additionally be used to form components with high magnetic permeability which are thus able to shape and direct these magnetic fields to where they are needed. When fabricated with structural plastic elements, this allows simple but complete custom electromagnetic devices to be 3D printed. As examples, we demonstrate the fabrication of a solenoid actuator for the arm of a Lucky Cat figurine, a 6-pole stepper motor stator, a reluctance motor rotor, and a ferrofluid display. In addition, we show how printed coils which generate small currents in response to user actions can be used as input sensors in interactive devices.
We demonstrate a new approach for designing functional material definitions for multi-material fabrication using our system called Foundry. Foundry provides an interactive and visual process for hierarchically designing spatially-varying material properties (e.g., appearance, mechanical, optical). The resulting meta-materials exhibit structure at the micro and macro level and can surpass the qualities of traditional composites. The material definitions are created by composing a set of operators into an operator graph. Each operator performs a volume decomposition operation, remaps space, or constructs and assigns a material composition. The operators are implemented using a domain-specific language for multi-material fabrication; users can easily extend the library by writing their own operators. Foundry can be used to build operator graphs that describe complex, parameterized, resolution-independent, and reusable material definitions. We also describe how to stage the evaluation of the final material definition which in conjunction with progressive refinement, allows for interactive material evaluation even for complex designs. We show sophisticated and functional parts designed with our system.
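One minimal reading of the operator-graph idea is that a material definition is a function from a 3D point to a material, and each operator wraps or composes such functions (decomposing the volume, remapping space, or assigning a composition). The operator names below are ours, not Foundry's domain-specific language.

```python
# Sketch of operator composition for a point-wise material definition.
def constant(material):
    """Assign one material everywhere."""
    return lambda p: material

def split_x(threshold, left, right):
    """Volume decomposition: apply `left` below an x threshold, else `right`."""
    return lambda p: left(p) if p[0] < threshold else right(p)

def remap(transform, inner):
    """Space remapping: evaluate `inner` at a transformed point."""
    return lambda p: inner(transform(p))
```

Because every operator returns another point-to-material function, graphs of operators stay resolution-independent: the definition is only sampled at the printer's voxel grid at the very end.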
Emerging ultra-small wearables like smartwatches pose a design challenge for touch-based text entry. This is due to the "fat-finger problem," wherein users struggle to select elements much smaller than their fingers. To address this challenge, we developed DriftBoard, a panning-based text entry technique where the user types by positioning a movable qwerty keyboard on an interactive area with respect to a fixed cursor point. In this paper, we describe the design and implementation of DriftBoard and report results of a user study on a watch-size touchscreen. The study compared DriftBoard to two ultra-small keyboards, ZoomBoard (tapping-based) and Swipeboard (swiping-based). DriftBoard performed comparably (no significant difference) to ZoomBoard in the major metrics of text entry speed and error rate, and outperformed Swipeboard, which suggests that panning-based typing is a promising input method for text entry on ultra-small touchscreens.
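The core of a panning-based keyboard is a simple coordinate inversion: the cursor stays fixed while the layout's offset changes, and the selected key is whatever lies under the cursor in keyboard coordinates. The layout and key dimensions below are illustrative, not DriftBoard's actual geometry.

```python
# Fixed-cursor key lookup for a panning qwerty keyboard (toy dimensions).
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_W, KEY_H = 20, 20  # key size in pixels (assumed)

def key_under_cursor(cursor_x, cursor_y, offset_x, offset_y):
    """Translate the fixed cursor into keyboard coordinates, then index."""
    kx = cursor_x - offset_x
    ky = cursor_y - offset_y
    row, col = int(ky // KEY_H), int(kx // KEY_W)
    if 0 <= row < len(ROWS) and 0 <= col < len(ROWS[row]):
        return ROWS[row][col]
    return None  # cursor is off the keyboard
```

Note that typing is driven entirely by the offset: dragging the keyboard left by one key width moves the selection one key to the right, which is why a tiny screen with large keys remains usable.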
Gesture-typing is an efficient, easy-to-learn, and error-tolerant technique for entering text on software keyboards. Our goal is to "recycle" users' otherwise-unused gesture variation to create rich output under the users' control, without sacrificing accuracy. Experiment 1 reveals a high level of existing gesture variation, even for accurate text, and shows that users can consciously vary their gestures under different conditions. We designed an Expressive Keyboard for a smartphone that maps input gesture features identified in Experiment 1 to a continuous output parameter space, i.e., RGB color. Experiment 2 shows that users can consciously modify their gestures, while retaining accuracy, to generate specific colors as they gesture-type. Users are more successful when they focus on output characteristics (such as red) rather than input characteristics (such as curviness). We designed an app with a dynamic font engine that continuously interpolates between several typefaces, as well as controlling weight and random variation. Experiment 3 shows that, in the context of a more ecologically valid conversation task, users enjoy generating multiple forms of rich output. We conclude with suggestions for how the Expressive Keyboard approach can enhance a wide variety of gesture recognition applications.
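Mapping gesture features to a continuous output space can be sketched as a per-feature normalization followed by channel scaling. The feature names and ranges here are assumptions for illustration; the paper's actual feature set comes from Experiment 1.

```python
# Map (hypothetical) gesture features to an 8-bit RGB color.
def features_to_rgb(speed, curviness, size, ranges):
    """Normalize each feature into [0, 1] over its observed range,
    then scale each to one color channel."""
    def norm(value, lo, hi):
        return min(max((value - lo) / (hi - lo), 0.0), 1.0)

    r = norm(speed, *ranges["speed"])
    g = norm(curviness, *ranges["curviness"])
    b = norm(size, *ranges["size"])
    return tuple(int(round(c * 255)) for c in (r, g, b))
```

Clamping to [0, 1] keeps out-of-range gestures from producing invalid colors, so accuracy-preserving variation anywhere within the observed range maps to a usable output.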
This paper presents EdgeVib, a system of spatiotemporal vibration patterns for delivering alphanumeric characters on wrist-worn vibrotactile displays. We first investigated spatiotemporal pattern delivery through a watch-back tactile display by performing a series of user studies. The results reveal that employing a 2×2 vibrotactile array is more effective than employing a 3×3 one, because the lower-resolution array creates clearer tactile sensations in less time. We then deployed EdgeWrite patterns on a 2×2 vibrotactile array to identify difficulties in delivering alphanumeric characters, and modified the unistroke patterns into multistroke EdgeVib patterns on the basis of the findings. The results of a 24-participant user study reveal that the recognition rates of the modified multistroke patterns were significantly higher than those of the original unistroke ones for both letters (85.9% vs. 70.7%) and digits (88.6% vs. 78.5%), and a further study indicated that the technique can be generalized to deliver two-character compound messages with recognition rates above 83.3%. The guidelines derived from our study can be used for designing watch-back tactile displays for alphanumeric character output.
A system capable of suggesting multi-word phrases while someone is writing could supply ideas about content and phrasing and allow those ideas to be inserted efficiently. Meanwhile, statistical language modeling has provided various approaches to predicting phrases that users type. We introduce a simple extension to the familiar mobile keyboard suggestion interface that presents phrase suggestions that can be accepted by a repeated-tap gesture. In an extended composition task, we found that phrases were interpreted as suggestions that affected the content of what participants wrote more than conventional single-word suggestions, which were interpreted as predictions. We highlight a design challenge: how can a phrase suggestion system make valuable suggestions rather than just accurate predictions?
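A minimal way to turn a single-word prediction into a multi-word phrase suggestion is to follow a language model greedily from the seed word. The bigram model below is a toy stand-in (the counts come from a toy corpus), not the statistical model the study used.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count word-to-next-word transitions over a list of sentences."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            model[prev][nxt] += 1
    return model

def suggest_phrase(model, seed, length=3):
    """Greedily follow the most likely next word to build a phrase."""
    phrase = [seed]
    for _ in range(length - 1):
        nxt = model.get(phrase[-1])
        if not nxt:
            break
        phrase.append(nxt.most_common(1)[0][0])
    return " ".join(phrase)
```

In a repeated-tap interface, each additional tap would accept one more word of the suggested phrase rather than committing to the whole phrase at once.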
Prior work on creativity support tools demonstrates how a computational semantic model of a solution space can enable interventions that substantially improve the number, quality, and diversity of ideas. However, automated semantic modeling often falls short when people contribute short text snippets or sketches. Innovation platforms can employ humans to provide semantic judgments to construct a semantic model, but this relies on external workers completing a large number of tedious microtasks. This requirement threatens both accuracy (external workers may lack the expertise and context to make accurate semantic judgments) and scalability (external workers are costly). In this paper, we introduce IdeaHound, an ideation system that seamlessly integrates the task of defining semantic relationships among ideas into the primary task of idea generation. The system combines implicit human actions with machine learning to create a computational semantic model of the emerging solution space. The integrated nature of these judgments allows IdeaHound to leverage the expertise and efforts of participants who are already motivated to contribute to idea generation, overcoming the issues of scalability inherent to existing approaches. Our results show that participants were equally willing to use (and just as productive using) IdeaHound compared to a conventional platform that did not require organizing ideas. Our integrated crowdsourcing approach also creates a more accurate semantic model than an existing crowdsourced approach (performed by external crowds). We demonstrate how this model enables helpful creative interventions: providing diverse inspirational examples, providing similar ideas for a given idea, and providing a visual overview of the solution space.
Paid crowdsourcing platforms suffer from low-quality work and unfair rejections, but paradoxically, most workers and requesters have high reputation scores. These inflated scores, which make high-quality work and workers difficult to find, stem from social pressure to avoid giving negative feedback. We introduce Boomerang, a reputation system for crowdsourcing platforms that elicits more accurate feedback by rebounding the consequences of feedback directly back onto the person who gave it. With Boomerang, requesters find that their highly-rated workers gain earliest access to their future tasks, and workers find tasks from their highly-rated requesters at the top of their task feed. Field experiments verify that Boomerang causes both workers and requesters to provide feedback that is more closely aligned with their private opinions. Inspired by a game-theoretic notion of incentive-compatibility, Boomerang opens opportunities for interaction design to incentivize honest reporting over strategic dishonesty.
Citizen science and community-sensing applications allow everyday citizens to collect data about the physical world to benefit science and society. Yet despite successes, current approaches are still limited by the number of domain-interested volunteers who are willing and able to contribute useful data. In this paper we introduce habitsourcing, an alternative approach that harnesses the habit-building practices of millions of people to collect environmental data. To support the design and development of habitsourcing apps, we present (1) interaction techniques and design principles for sensing through actuation, a method for acquiring sensing data from cued interactions; and (2) ExperienceKit, an iOS library that makes it easy for developers to build and test habitsourcing applications. In two experiments, we show that our two proof-of-concept apps, ZenWalk and Zombies Interactive, compare favorably to their non-data-collecting counterparts, and that we can effectively extract environmental data using simple detection techniques.
The world is full of physical interfaces that are inaccessible to blind people, from microwaves and information kiosks to thermostats and checkout terminals. Blind people cannot independently use such devices without at least first learning their layout, and usually only after labeling them with sighted assistance. We introduce VizLens - an accessible mobile application and supporting backend that can robustly and interactively help blind people use nearly any interface they encounter. VizLens users capture a photo of an inaccessible interface and send it to multiple crowd workers, who work in parallel to quickly label and describe elements of the interface to make subsequent computer vision easier. The VizLens application helps users recapture the interface in the field of the camera, and uses computer vision to interactively describe the part of the interface beneath their finger (updating 8 times per second). We show that VizLens provides accurate and usable real-time feedback in a study with 10 blind participants, and our crowdsourcing labeling workflow was fast (8 minutes), accurate (99.7%), and cheap ($1.15). We then explore extensions of VizLens that allow it to (i) adapt to state changes in dynamic interfaces, (ii) combine crowd labeling with OCR technology to handle dynamic displays, and (iii) benefit from head-mounted cameras. VizLens robustly solves a long-standing challenge in accessibility by deeply integrating crowdsourcing and computer vision, and foreshadows a future of increasingly powerful interactive applications that would be currently impossible with either alone.
As interactive electronics become increasingly intimate and personal, the design of circuitry is correspondingly developing a more playful and creative aesthetic. Circuit sketching and design is a multidimensional activity that combines the arts, crafts, and engineering, broadening participation in electronic creation to makers of diverse backgrounds. In order to support this design ecology, we present Ellustrate, a digital design tool that enables the functional and aesthetic design of electronic circuits with multiple conductive and dielectric materials. Ellustrate guides users through the fabrication and debugging process, easing the task of practical circuit creation while supporting designers' aesthetic decisions throughout the circuit authoring workflow. In a formal user study, we demonstrate how Ellustrate enables a new electronic design conversation that combines electronics, materials, and visual aesthetic concerns.
The recent proliferation of easy-to-use electronic components and toolkits has introduced a large number of novices to designing and building electronic projects. Nevertheless, debugging circuits remains a difficult and time-consuming task. This paper presents a novel debugging tool for electronic design projects, the Toastboard, that aims to reduce debugging time by improving upon the standard paradigm of point-wise circuit measurements. Ubiquitous instrumentation allows for immediate visualization of an entire breadboard's state, meaning users can diagnose problems based on a wealth of data instead of having to form a single hypothesis and plan before taking a measurement. Basic connectivity information is displayed visually on the circuit itself, and quantitative data is displayed on the accompanying web interface. Software-based testing functions further lower the expertise threshold for efficient debugging by diagnosing classes of circuit errors automatically. In an informal study, participants found the detailed, pervasive, and context-rich data from our tool helpful and potentially time-saving.
For makers and developers, circuit prototyping is an integral part of building electronic projects. Currently, it is common to build circuits based on breadboard schematics that are available on various maker and DIY websites. Some breadboard schematics are used as is without modification, and some are modified and extended to fit specific needs. In such cases, diagrams and schematics merely serve as blueprints and visual instructions, but users still must physically wire the breadboard connections, which can be time-consuming and error-prone. We present CircuitStack, a system that combines the flexibility of breadboarding with the correctness of printed circuits, enabling rapid and extensible circuit construction. This hybrid system enables circuit reconfigurability, component reusability, and high efficiency in the early stages of prototyping.
Recent advances in materials science research allow production of highly stretchable sensors and displays. Such technologies, however, are still not accessible to non-expert makers. We present a novel and inexpensive fabrication method for creating Stretchis, highly stretchable user interfaces that combine sensing capabilities and visual output. We use Polydimethylsiloxane (PDMS) as the base material for a Stretchi and show how to embed stretchable touch and proximity sensors and stretchable electroluminescent displays. Stretchis can be ultra-thin (≈ 200 μm), flexible, and fully customizable, enabling non-expert makers to add interaction to elastic physical objects, shape-changing surfaces, fabrics, and the human body. We demonstrate the usefulness of our approach with three application examples that range from ubiquitous computing to wearables and on-skin interaction.
We present a novel manipulation method that subconsciously changes the walking direction of users via visual processing on a head-mounted display (HMD). Unlike existing navigation systems that require users to recognize information and then follow directions as two separate, conscious processes, the proposed method guides users without them needing to pay attention to the information provided by the navigation system, and also allows their walking to be graphically manipulated by controllers. In the proposed system, users perceive the real world by means of stereo images provided by a stereo camera and the HMD. Specifically, while walking, the navigation system provides users with real-time feedback by processing the images they have just perceived and giving them visual stimuli. This study examined two image-processing methods for manipulating a human's walking direction: a moving stripe pattern and a changing focal region. Experimental results indicate that the changing focal region method most effectively leads walkers, as it changes their walking path by approximately 200 mm/m on average.
We present an investigation of mechanically actuated hand-held controllers that render the shape of virtual objects through physical shape displacement, enabling users to feel 3D surfaces, textures, and forces that match the visual rendering. We demonstrate two such controllers, NormalTouch and TextureTouch, which are tracked in 3D and produce spatially registered haptic feedback to a user's finger. NormalTouch haptically renders object surfaces and provides force feedback using a tiltable and extrudable platform. TextureTouch renders the shape of virtual objects, including detailed surface structure, through a 4×4 matrix of actuated pins. By moving the controllers around while keeping their fingers on the actuated platform, users obtain the impression of a much larger 3D shape by cognitively integrating output sensations over time. Our evaluation compares the effectiveness of our controllers with the two de facto standards in Virtual Reality controllers: device vibration and visual feedback only. We find that haptic feedback significantly increases the accuracy of VR interaction, most effectively by rendering high-fidelity shape output as in the case of our controllers.
We present Amphibian, a simulator to experience scuba diving virtually in a terrestrial setting. While existing diving simulators mostly focus on visual and aural displays, Amphibian simulates a wider variety of sensations experienced underwater. Users rest their torso on a motion platform to feel buoyancy. Their outstretched arms and legs are placed in a suspended harness to simulate drag as they swim. An Oculus Rift head-mounted display (HMD) and a pair of headphones delineate the visual and auditory ocean scene. Additional senses simulated in Amphibian are breath motion, temperature changes, and tactile feedback through various sensors. Twelve experienced divers compared Amphibian to real-life scuba diving. We analyzed the system factors that influenced the users' sense of being there while using our simulator. We present future UI improvements for enhancing immersion in VR diving simulators.
We present an end-to-end system for augmented and virtual reality telepresence, called Holoportation. Our system demonstrates high-quality, real-time 3D reconstructions of an entire space, including people, furniture and objects, using a set of new depth cameras. These 3D models can also be transmitted in real-time to remote users. This allows users wearing virtual or augmented reality displays to see, hear and interact with remote participants in 3D, almost as if they were present in the same physical space. From an audio-visual perspective, communicating and interacting with remote users edges closer to face-to-face communication. This paper describes the Holoportation technical system in full, its key interactive capabilities, the application scenarios it enables, and an initial qualitative study of using this new communication medium.
Dynamic effects such as waves, splashes, fire, smoke, and explosions are an integral part of stylized animations. However, such dynamics are challenging to produce, as manually sketching key-frames requires significant effort and artistic expertise, while physical simulation tools lack sufficient expressiveness and user control. We present an interactive interface for designing these elemental dynamics for animated illustrations. Users draw with coarse-scale energy brushes, which serve as control gestures to drive detailed flow particles that represent local velocity fields. These fields can convey both realistic and artistic effects based on user specification. This painting metaphor for creating elemental dynamics simplifies the process, provides artistic control, and preserves the fluidity of sketching. Our system is fast, stable, and intuitive. An initial user evaluation shows that even novice users with no prior animation experience can create intriguing dynamics using our system.
Design plays an important role in the adoption of apps. App design, however, is a complex process with multiple design activities. To enable data-driven app design applications, we present interaction mining -- capturing both static (UI layouts, visual details) and dynamic (user flows, motion details) components of an app's design. We present ERICA, a system that takes a scalable, human-computer approach to interaction mining existing Android apps without the need to modify them in any way. As users interact with apps through ERICA, it detects UI changes, seamlessly records multiple data streams in the background, and unifies them into a user interaction trace. Using ERICA we collected interaction traces from over a thousand popular Android apps. Leveraging this trace data, we built machine learning classifiers to detect elements and layouts indicative of 23 common user flows. User flows are an important component of UX design and consist of a sequence of UI states that represent semantically meaningful tasks such as searching or composing. With these classifiers, we identified and indexed more than 3000 flow examples, and released the largest online search engine of user flows in Android apps.
The outfits people wear contain latent fashion concepts capturing styles, seasons, events, and environments. Fashion theorists have proposed that these concepts are shaped by design elements such as color, material, and silhouette. A dress may be "bohemian" because of its pattern, material, trim, or some combination of them: it is not always clear how low-level elements translate to high-level styles. In this paper, we use polylingual topic modeling to learn latent fashion concepts jointly in two languages capturing these elements and styles. Using this latent topic formation we can translate between these two languages through topic space, exposing the elements of fashion style. We train the polylingual topic model (PLTM) on a set of more than half a million outfits collected from Polyvore, a popular fashion-based social network. We present novel, data-driven fashion applications that allow users to express their needs in natural language just as they would to a real stylist and produce tailored item recommendations for these style needs.
Virtual Reality (VR) narratives have the unprecedented potential to connect with an audience through presence, placing viewers within the narrative. The onset of consumer VR has resulted in an explosion of interest in immersive storytelling. Planning narratives for VR, however, is a grand challenge due to its unique affordances, its evolving cinematic vocabulary, and most importantly the lack of supporting tools to explore the creative process in VR.
In this paper, we distill key considerations in the planning process for VR stories, collected through a formative study conducted with film industry professionals. Based on these insights, we propose a workflow specific to the needs of professionals creating storyboards for VR film, and present a multi-device (tablet and head-mounted display) storyboard tool supporting this workflow. We discuss our design and report on feedback received from interviews following demonstrations of our tool to VR film professionals.
We present SketchingWithHands, a 3D sketching system that incorporates a hand-tracking sensor. The system enables product designers to easily capture desired hand postures from a first-person point of view at any time and to use the captured hand information to explore handheld product concepts by 3D sketching while keeping the proper scale and usage of the products. Based on the analysis of design practices and drawing skills in the art and design literature, we suggest novel ideas for efficiently acquiring hand postures (palm-pinning widget, front and center mirrors, responsive spangles), for quickly creating and easily adjusting sketch planes (modified tick-triggered, orientable and shiftable sketch planes), for appropriately starting 3D sketching products with hand information (hand skeleton, grip axis), and for practically increasing user throughput (intensifier, rough and precise erasers)---all of which are coherently and consistently integrated in our system. A user test by ten industrial design students and an in-depth discussion show that our system is both useful and usable in designing handheld products.
Illustrations of human movements are used to communicate ideas and convey instructions in many domains, but creating them is time-consuming and requires skill. We introduce DemoDraw, a multi-modal approach to generate these illustrations as the user physically demonstrates the movements. In a Demonstration Interface, DemoDraw segments speech and 3D joint motion into a sequence of motion segments, each characterized by a key pose and salient joint trajectories. Based on this sequence, a series of illustrations is automatically generated using a stylistically rendered 3D avatar annotated with arrows to convey movements. During demonstration, the user can navigate using speech and amend or re-perform motions if needed. Once a suitable sequence of steps has been created, a Refinement Interface enables fine control of visualization parameters. In a three-part evaluation, we validate the effectiveness of the generated illustrations and the usability of DemoDraw. Our results show 4- to 7-step illustrations can be created in 5 or 10 minutes on average.
Gaze is frequently explored in public display research given its importance for monitoring and analysing audience attention. However, current gaze-enabled public display interfaces require either special-purpose eye tracking equipment or explicit personal calibration for each individual user. We present AggreGaze, a novel method for estimating spatio-temporal audience attention on public displays. Our method requires only a single off-the-shelf camera attached to the display, does not require any personal calibration, and provides visual attention estimates across the full display. We achieve this by 1) compensating for errors of state-of-the-art appearance-based gaze estimation methods through on-site training data collection, and by 2) aggregating uncalibrated and thus inaccurate gaze estimates of multiple users into joint attention estimates. We propose different visual stimuli for this compensation: a standard 9-point calibration, moving targets, text and visual stimuli embedded into the display content, as well as normal video content. Based on a two-week deployment in a public space, we demonstrate the effectiveness of our method for estimating attention maps that closely resemble ground-truth audience gaze distributions.
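The aggregation step at the heart of AggreGaze, pooling many uncalibrated and individually inaccurate gaze estimates into a joint attention map, can be illustrated with a small sketch. This is not the paper's implementation; the coarse grid size, the Gaussian splatting, and the `sigma` parameter are illustrative assumptions.

```python
import math

def attention_map(gaze_points, width=16, height=9, sigma=1.5):
    """Aggregate noisy per-user gaze estimates (in grid coordinates)
    into a normalized joint attention map over a coarse display grid.

    Individual estimates may be off by a cell or two, but averaging
    many of them makes the shared focus of attention stand out.
    Grid resolution and the Gaussian splat are illustrative choices.
    """
    grid = [[0.0] * width for _ in range(height)]
    for gx, gy in gaze_points:
        # splat each estimate as a Gaussian blob onto the grid
        for y in range(height):
            for x in range(width):
                d2 = (x - gx) ** 2 + (y - gy) ** 2
                grid[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    # normalize so the map sums to 1 (a probability of attention)
    total = sum(sum(row) for row in grid) or 1.0
    return [[v / total for v in row] for row in grid]
```

Even with each simulated "user" offset from the true target, the peak of the aggregated map lands on the commonly attended cell, which is the intuition behind trading per-user calibration for aggregation across the audience.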
In RadarCat we present a small, versatile radar-based system for material and object classification which enables new forms of everyday proximate interaction with digital devices. We demonstrate that we can train and classify different types of materials and objects which we can then recognize in real time. Based on established research designs, we report on the results of three studies, first with 26 materials (including complex composite objects), next with 16 transparent materials (with different thicknesses and varying dyes), and finally 10 body parts from 6 participants. Both leave-one-out and 10-fold cross-validation demonstrate that our approach of classifying radar signals using a random forest classifier is robust and accurate. We further demonstrate four working examples based on RadarCat, including a physical object dictionary, a painting and photo editing application, body shortcuts, and automatic refill. We conclude with a discussion of our results and limitations, and outline future directions.
Electrical Impedance Tomography (EIT) was recently employed in the HCI domain to detect hand gestures using an instrumented smartwatch. This prior work demonstrated great promise for non-invasive, high-accuracy recognition of gestures for interactive control. We introduce a new system that offers improved sampling speed and resolution. In turn, this enables superior interior reconstruction and gesture recognition. More importantly, we use our new system as a vehicle for experimentation: we compare two EIT sensing methods and three different electrode resolutions. Results from in-depth empirical evaluations and a user study shed light on the future feasibility of EIT for sensing human input.
This paper proposes a novel machine learning architecture, specifically designed for radio-frequency based gesture recognition. We focus on high-frequency (60 GHz), short-range radar-based sensing, in particular Google's Soli sensor. The signal has unique properties, such as resolving motion at a very fine level and allowing for segmentation in range and velocity spaces rather than image space. This enables recognition of new types of inputs but poses significant difficulties for the design of input recognition algorithms. The proposed algorithm is capable of detecting a rich set of dynamic gestures and can resolve small motions of fingers in fine detail. Our technique is based on an end-to-end trained combination of deep convolutional and recurrent neural networks. The algorithm achieves high recognition rates (avg 87%) on a challenging set of 11 dynamic gestures and generalizes well across 10 users. The proposed model runs on commodity hardware at 140 Hz (CPU only).
We propose and study a new input modality, WristWhirl, that uses the wrist as an always-available joystick to perform one-handed continuous input on smartwatches. We explore the influence of the wrist's bio-mechanical properties on performing gestures to interact with a smartwatch, both while standing still and while walking. Through a user study, we examine the impact of performing 8 distinct gestures (4 directional marks and 4 free-form shapes) on the stability of the watch surface. Participants were able to perform directional marks in half a second on average and free-form shapes in approximately 1.5 seconds on average. The free-form shapes could be recognized by a $1 gesture recognizer with an accuracy of 93.8% and by three human inspectors with an accuracy of 85%. From these results, we designed and implemented a proof-of-concept device by augmenting the watchband with an array of proximity sensors, which can be used to draw gestures with high quality. Finally, we demonstrate a number of scenarios that benefit from one-handed continuous input on smartwatches using WristWhirl.
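To make the wrist-as-joystick idea concrete, the 4 directional marks above could be classified from a stream of 2D joystick samples by the angle of net displacement. This toy sketch is not the $1 recognizer the study used (that handles the free-form shapes); the function name, input format, and snapping rule are hypothetical.

```python
import math

# the four cardinal directions used for directional marks (assumed labels)
DIRECTIONS = {0: "right", 90: "up", 180: "left", 270: "down"}

def classify_mark(samples):
    """Classify a directional wrist mark from a list of (x, y)
    joystick samples by snapping the angle of the net displacement
    (first sample to last) to the nearest cardinal direction."""
    dx = samples[-1][0] - samples[0][0]
    dy = samples[-1][1] - samples[0][1]
    angle = math.degrees(math.atan2(dy, dx)) % 360
    # circular distance to each cardinal direction, pick the closest
    best = min(DIRECTIONS,
               key=lambda a: min(abs(angle - a), 360 - abs(angle - a)))
    return DIRECTIONS[best]
```

A net-displacement heuristic is deliberately tolerant of the hand tremor one would expect while walking, since intermediate jitter between the first and last samples is ignored.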
Training gesture recognizers with synthetic data generated from real gestures is a well known and powerful technique that can significantly improve recognition accuracy. In this paper we introduce a novel technique called gesture path stochastic resampling (GPSR) that is computationally efficient, has minimal coding overhead, and yet despite its simplicity is able to achieve higher accuracy than competitive, state-of-the-art approaches. GPSR generates synthetic samples by lengthening and shortening gesture subpaths within a given sample to produce realistic variations of the input via a process of nonuniform resampling. As such, GPSR is an appropriate rapid prototyping technique where ease of use, understandability, and efficiency are key. Further, through an extensive evaluation, we show that accuracy significantly improves when gesture recognizers are trained with GPSR synthetic samples. In some cases, mean recognition errors are reduced by more than 70%, and in most cases, GPSR outperforms two other evaluated state-of-the-art methods.
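The nonuniform-resampling idea behind GPSR can be sketched in a few lines: place a fixed number of points along the gesture path at randomly perturbed arc-length spacings, so some subpaths stretch and others shrink. This is a minimal illustration under stated assumptions, not the paper's exact algorithm; the Gaussian spacing model, point count, and `variance` parameter are my choices.

```python
import math
import random

def path_length(pts):
    """Total arc length of a polyline given as (x, y) tuples."""
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))

def point_at(pts, d):
    """Walk along the polyline to the point at arc length d."""
    for a, b in zip(pts, pts[1:]):
        seg = math.dist(a, b)
        if d <= seg and seg > 0:
            t = d / seg
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        d -= seg
    return pts[-1]

def gpsr_variant(pts, n=32, variance=0.25):
    """Create one synthetic gesture variant by nonuniform resampling:
    random interval lengths lengthen some subpaths and shorten others,
    yielding a realistic variation of the input gesture."""
    total = path_length(pts)
    # randomly perturbed spacings, rescaled to cover the whole path
    gaps = [max(1e-6, random.gauss(1.0, variance)) for _ in range(n - 1)]
    scale = total / sum(gaps)
    out, d = [pts[0]], 0.0
    for g in gaps:
        d += g * scale
        out.append(point_at(pts, d))
    return out
```

Calling `gpsr_variant` repeatedly on one recorded gesture yields a cheap synthetic training set, which is the sense in which the technique suits rapid prototyping: no per-gesture modeling, just resampling.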
"A single design is one molecule that contributes to the atmosphere of the whole environment." This is the basic idea of design in this age. That is to say, designing today seems to embody the subject by its surroundings, rather than by the subject itself.