Interaction in 3D Graphics
Vol.32 No.4 November 1998
Designing Special-Purpose Input Devices
W. Bradford Paley
For an increasing number of applications, we may be reaching the point of diminishing returns with general purpose computer input devices, such as the keyboard and mouse. At Digital Image Design Incorporated (DID) we’ve created special purpose input devices to perform tasks that are commonly addressed with software and general purpose devices alone. These more specific devices have led to significant advantages in accomplishing the tasks, and we’ve developed an approach to designing these devices that I hope will help others in doing similar work.
When someone says “Build us a system to speed up computer character animation,” the typical solution is purely software. This is not necessarily optimal or even cost effective, and the only way to determine that is to consider developing a hardware device during the initial project exploration. Let’s go through the process of designing a special purpose device together. Real-world details are critical to this process, so we’ll do it in the problem domain of animation, where DID and others have previously developed special purpose input devices. I’ll generalize where appropriate.
Special Purpose Tools
Why a special purpose input device, rather than a more general purpose device like a mouse or digitizing tablet? There’s an inverse relationship between task scope and tool quality. The more constrained the scope, the better a tool you can build for supporting it. Look at the toolkit of any serious craftsman, artist or mechanic. You’ll see dozens of tools, each slightly different and addressed to different tasks, or even variations on the same task.
If we don’t limit the task we end up making compromises, since there are more needs to be satisfied with the limited resources at hand: the hardware and the user’s mental and physical efforts. At some point, the compromises detract more than the device is adding.
Rocks are pretty good general purpose tools, and we probably used them for generations before people started modifying them. The mouse, a rock on a string, is a pretty decent input device if the task is defined as “indicate a point on a coarsely-quantized 2D plane (point), and also indicate one of several ways you’re interested in that point (click).” But the mouse was designed only for that 2D plane, not the user’s real task. It seems to work for every task because present-day programmers have worked very hard to reduce every problem to one that can be solved by mouse actions and keypresses.
In certain problem domains, we’ll get better results by engaging the user’s task-specific abilities. That’s where we need to build tools that feel like the problems, not tools that extend our fingers or voices.
The Design Process
There are a few distinct strategies that we’ll use to design special purpose input tools. I’m going to lay them out as steps for this article, but it’s important to note that this is not a simple linear process. As we proceed we may need to reapply previous strategies, or jump ahead to later ones; this will happen naturally as part of the design process.
The steps include defining the tasks, defining the perfect tool, and honing it with your hands and your colleagues; each is covered in a section below.
This article does not address the first and last steps, which connect the device design into the larger system design, but they are critical steps and should not be ignored. In fact, involving a friendly user in each of the other steps can provide valuable insight, if someone is available.
This article then presents two case studies in detail: devices that were designed using this process. It closes with an observation or two about collaborating with people in a related field: industrial design.
Define the Tasks
Task definitions will be useful only to the degree that they are put in human rather than technical terms. I have been told that the Palm Pilot Personal Digital Assistant design specification was expressed in goals like “should fit in a shirt pocket,” “should not make the owner cringe when it falls out,” and “the batteries should last long enough so that when the owner changes them, he doesn’t remember the last time he did.” I think this is brilliant. Contrast it with specifications like “8 cm x 0.8 cm x 12 cm,” “resistant to 3.5 Gs” and “drawing less than n milliwatts per minute.”
Tasks specified like the above seem almost to suggest solutions. They help us to solve the user’s real problems, not the problems as seen through a somewhat arbitrary quantitative screen. They also promote flexibility in the process, allowing substitution of different technologies or approaches, rather than simply finalizing the engineering at too early a stage.
We want to know what tasks are essential to the project, not what tasks people currently perform to accomplish it. For example, keyframing character animation can be broken into concrete and essential tasks like “position the camera,” “position the figure,” “save a keyframe,” “reposition the figure,” “save another keyframe,” “preview and evaluate the motion,” and “edit and refine the motion.” These are going to be more useful for us than tasks like “load a scene file” or “define an inverse-kinematics chain,” which address the current tools rather than the work that needs to be done.
Define The Perfect Tool
Existing devices tend to be thinly wrapped around the technology, or to be incremental variations on existing devices when something related in form or in name already exists. An example is the late-’80s “3D Mouse.” By the early ’90s there were several variations on the idea, all looking generally alike despite the fact that they were used in an entirely different way from their 2D namesake. We want to think without being hampered by ideas like “mouse,” thinking instead in a process- and user-centered way.
At DID we’ve developed some simple strategies to free ourselves, as device designers, and free the problem domain experts or users from the constraints of existing computer technology.
Figure 1: The Monkey input device.
Use the Past, the Future and Magic
As a first attempt at technology-free thinking, we can simply ask how the task was done before — or ‘use the past.’
In our animation example, getting a character to act once meant giving an actor the right motivation and adjectives for the part. Work in simulating human actors is progressing, but interpreting fine nuances of motion is still laborious; and it’s a programming job, not an input device job, so let’s not follow that thought. Traditional cel animation seems to have worked well, but required too much time and produced a different look, so that’s not interesting. Moving an articulated model into a posture has been done by art students for ages; this could have been the inspiration for the Monkey, but it occurred to DID only in retrospect.
Science fiction and magic can be lumped together for brainstorming, mental inhibition-relaxing purposes. We want to have an off-site meeting (where other inhibition-relaxing techniques might be introduced into the discussion on the company’s tab) with a few interested parties: users, domain experts, designers, even the occasional sales or marketing person. Let’s start the talks by saying “We’re going to relax constraints here, and accomplish the task with all of Merlin and Picard’s resources. We understand they’re available for subcontracting.”
At DID we’ve done this successfully for years, relying on the fact that people actually remember in the backs of their minds that we’re going to do it with computers, so the techniques get defined at the right level of detail.
This was the real genesis of DID’s Monkey input device. It’s essentially an electronic version of that wooden artist’s model, instrumented with sensors, that can control an on-screen human model. People at DID looked at animators spending minutes — or tens of minutes — getting decent postures using the keyboard, mouse and inverse kinematics. They’d switch from pick mode to move mode, and occasionally have to rotate the scene back and forth to see a resulting posture, acting as a computer operator much more than artist. We asked ourselves how magic might help, and said “Reach into the monitor, grab the model and move it around.” Since VR was rare, expensive, cumbersome and not really up to the task anyway (this was 1990), reaching into the display space was not an option. So we built a doll.
These brainstorming sessions should be focused on defining a tool or set of tools, and noticing how well they address the task. Don’t worry about implementing them for now.
Use Software, Too
We don’t want to ignore software, just give people a better handle on it. The physical device need not do all of the work. Since we’re controlling the world in the computer, too, we can have the stuff the user is manipulating meet them halfway.
Consider how “gravity fields” around lines and snap-to grids in drawing programs make the mouse more effective. And witness the beautiful cooperation between human and computer in Active Contours, each bringing distinct and complementary abilities to a task neither could accomplish alone. With Active Contours, or “Snakes,” a person can sketch a cross section on an MRI scan, for example, and the computer will nudge the line toward edges until it conforms exactly to the structure in the image.
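To make the software side concrete, here is a minimal sketch of grid snapping and a point-anchor “gravity field.” The function names, grid spacing and capture radius are invented for illustration, not taken from any particular drawing program:

```python
def snap_to_grid(point, spacing=10.0):
    """Quantize a 2D point to the nearest grid intersection."""
    x, y = point
    return (round(x / spacing) * spacing, round(y / spacing) * spacing)

def gravity(point, anchors, radius=5.0):
    """If the point falls within `radius` of an anchor (a line endpoint,
    say), pull it exactly onto that anchor; otherwise leave it alone."""
    x, y = point
    for ax, ay in anchors:
        if (x - ax) ** 2 + (y - ay) ** 2 <= radius ** 2:
            return (ax, ay)
    return point
```

The mouse stays an imprecise pointing device; the software supplies the precision the task actually needs.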
Let’s try to design our tool to help people do the looser, more high-level direction that they’re good at, and build software to recognize and assist where it can. The Monkey, for instance, is for entering key frames, not creating all of the “in-between” frames: that’s left for the animation program. And specific, well-defined mathematical, geometric or procedural tasks (like making sure animated feet stay in one place on the floor) are best delegated to the software.
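That division of labor can be sketched with the simplest possible in-betweener. Real animation systems use splines and constraints rather than linear blends, so the pose representation and blend below are illustrative assumptions only:

```python
def inbetween(key_a, key_b, t):
    """Blend two keyframe poses (dicts of joint name -> angle in degrees)
    at parameter t in [0, 1]: the artist supplies the keys, and the
    software supplies every frame in between."""
    return {joint: (1.0 - t) * key_a[joint] + t * key_b[joint]
            for joint in key_a}
```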
Figure 2: The Cyclops.
Allow Several Tools
We shouldn’t be afraid of splitting a task into several subtasks, each of which requires a different device. People are excellent at switching tools very quickly while maintaining focus on the larger task. Watch a mechanic or craftsperson work, constantly drawing on spatial and kinesthetic senses. Quick glances toolward are followed by reaching — after attention returns to the task. Combination screwdrivers and wrenches are popular father’s day gifts, but a real professional still has a toolbox full of specific wrenches and screwdrivers.
For animation, suppose we want to direct the camera to follow some action. We could remap the input angles coming from the head of the Monkey to control the camera; this would give us a physical affordance to easily tilt and pan. But aside from the tedious computer-operator task of remapping the input, it would feel wrong and look wrong. The Monkey has the technological capabilities, but the way it was designed leads one to use it for posture input. We need a different specific device for the new task.
We were asked to develop another device, the Cyclops, by computer animation director Steve Katz. It’s a simple concept: take a standard fluid-head tripod and instrument it, then mount a flat-panel display on the tripod where the camera would normally go. Now feed the tilt and pan angles into the animation program, while simultaneously displaying a real-time preview of the animation. Voila! You’ve brought camera control back out into the real world.
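A hypothetical sketch of the mapping at the heart of such a device; the coordinate conventions and function name are assumptions for illustration, not DID’s actual Cyclops protocol:

```python
import math

def camera_direction(pan_deg, tilt_deg):
    """Turn tripod pan/tilt angles into a unit view vector for the
    virtual camera (y is up; zero pan and tilt look down -z)."""
    pan, tilt = math.radians(pan_deg), math.radians(tilt_deg)
    return (math.cos(tilt) * math.sin(pan),
            math.sin(tilt),
            -math.cos(tilt) * math.cos(pan))
```

Each frame, the instrumented head’s angles are read, converted this way, and handed to the renderer driving the flat-panel preview.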
This device demonstrates several important principles. First, we’re using a separate device for a separate function, even though we could accomplish the job with an existing tool. This essentially allows the animator’s body to instantly make what would normally be a computer operator-ish mode change done by clicking keys or on-screen buttons.
Second, we’re using the past again, not only as inspiration this time but directly. What we get is a device that’s completely intuitive to use, one that taps into years of experience a cinematographer may have acquired on the somatic level. We’re using the interaction of the cinematographer’s eye and visual sense (where the artist is), the muscle-memory of the arm and hand (where the training is) and the grip and fluid-damped head of the tripod.
Likewise, the User Interface Research Group at Alias|Wavefront has tapped into decades of training and development in editing by simulating a standard jog/shuttle wheel. They gave the animator a puck-like input device to control animation previews the way a video editor controls tape replay. This device allows easy forward and backward motion through the animated sequence with simple and familiar turning motions, and single-frame moves with related motions. We can easily imagine an animation workplace set up with several devices, each giving instantaneous access to a different function, allowing the animator to work with his body.
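A minimal sketch of how jog and shuttle motions might map onto preview frames. This is not Alias|Wavefront’s implementation; the 3x speed cap, frame rate and clamping policy are invented for illustration:

```python
class JogShuttle:
    """Map wheel motion onto animation preview frames: jog steps one
    frame per detent; shuttle scrubs at a rate set by wheel deflection."""

    def __init__(self, n_frames, fps=30.0):
        self.n_frames = n_frames
        self.fps = fps
        self.frame = 0.0

    def _clamp(self, f):
        # Keep the playhead inside the clip.
        return max(0.0, min(self.n_frames - 1, f))

    def jog(self, detents):
        # One detent of the wheel = one frame, forward or backward.
        self.frame = self._clamp(self.frame + detents)
        return int(self.frame)

    def shuttle(self, deflection, dt):
        # Deflection in [-1, 1] scales scrub speed up to 3x real time.
        self.frame = self._clamp(self.frame + deflection * 3.0 * self.fps * dt)
        return int(self.frame)
```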
This brings us a huge advantage for some tasks. By instrumenting or simulating an existing device, we recapture all of the thought and engineering that went into developing it: the accumulated ideas and hard work of smart people working for decades.
Bringing the task of animation back into the physical realm is especially appropriate to the spatial task of creating animation. Aesthetic intuition may flourish in an environment filled with physical movement: the brain activity that supports the movement may activate mental processes which are deeply associated with the animator’s experience of space. Those spatial understanding processes may support the understanding and creation of spatial changes in the animation. The varied movements seem likely to help avoid repetitive strain injury, also.
Having several devices can act as an aid to the user in conceiving how to accomplish a goal. Imagine being handed the perfect general purpose matter shaping tool and a block of marble in an empty white room sometime in the 21st century. Now go create a sculpture. Think of how much less daunting it would be to be shown into a workshop containing pencils, paper, calipers, a pantograph, five mallets, thirty chisels ranging from 3-inch blades to ice-pick-like points, scraping tools, sandpaper and rags.
The proper tools help people coming into a project with only half-formed ideas know how to complete it. A general purpose tool is much less suggestive of a specific step in the process. Think of a multiplicity of tools as providing two things: a concrete record of how masters have approached this type of project in the past, and an aid to your body in remembering where things are and how to apply them.
Use Your Hands, Use Your Colleagues
Now we have to hone the tools — bring them from brainstorm to implementable and useful devices. At DID we’ve had great success in developing ideas in this iterative, everybody-try-it-out stage. The final devices rarely look like the initial brainstormed idea.
With our brainstormed tools in mind, we need to sit down at a desk and imagine doing the task. Don’t stop at one solution; the seventh one we devise will be better. Take an occasional break and watch how people do related tasks — or even completely unrelated tasks.
Let’s ask some colleagues to join in the process. Describe a task to set up a context. Describe a few initial solutions, but be sure to tell them that they are just first stabs at the issue. We’ll have to swallow our egos when they “don’t get it.” It’s never the user’s fault. They know what they’re trying to do, and it’s our job to build tools to help the computer figure it out. Besides, maybe we can turn their misinterpretations into valid approaches. Involve a competitive friend or acquaintance who’ll try to one-up your ideas. In fact, involve that annoying know-it-all around the corner, or even a politely hostile acquaintance. (Be sure to get the unfriendly types to sign a nondisclosure first.)
When the ideas stabilize, get users into this imagined-use testing. Have a list of questions to ask, but also watch carefully. It’s incredibly valuable to note physical cues, like a hesitation when someone’s supposed to grab something, an incorrect grasp when something’s picked up, a frown or frustrated glance at the designers. We may also need to lie a little to avoid having the user pull punches to spare our feelings: say it’s not our invention, or that we’re trying to see how many of the known flaws the user will come across.
We need to actually go through the motions of using the device. Again and again. With different variations of the task in mind. For the whole duration of the task, too: if it takes five minutes, we can’t stop after the first repetition and mentally say “repeat 30 times.” That tedium may be the essence of the task, and the thing we most want to address. Play with your hands in the air until passersby think you’re nuts. Buy a lump of clay or some carveable foam and make lots of models. Give them to people with different sized hands, and different intellectual approaches (e.g. spatial vs. linguistic). And write all of the ideas in as few words and lines as possible on visible and reorganizable big yellow Post-Its all over your walls and desk. Make a mess; that’s the fundamental goal of this step in the design process.
We’re going to be looking through this mess for the solutions that seem obvious in retrospect. But be prepared to have every third person at SIGGRAPH tell you “I thought of that five years ago.”
Developing and building a new input device can cost hundreds of thousands of dollars or more, but prototyping one need not. There are ways to do it very effectively on a budget.
Need a handheld device? Give your clay or foam prototype to a local sculpture student. A couple of hundred dollars might get you a hand-carved, beautifully finished ebony model. I-Cube makes a general-purpose A/D and D/A converter with inputs for 18 different kinds of sensors from pressure to temperature, sound to light. And if you want to go into limited production, there are custom electronics shops like Shooting Star Technologies that have been in this field for many years. Shops like that can help you define electrical and sensor tolerances, as well as logical protocols and software drivers. Bill Buxton’s 3D resource page and the University of Washington’s HIT Lab have excellent lists of places which do this type of work. And you can always walk over to the engineering department of a nearby college — graduate students might be looking for a meaningful class project.
A Case Study In Layering: The Monkey 2
(Editor’s note: The Monkey 2 is pictured on this issue’s front cover.)
Make an instrumented armature that plugs into a computer: this is the kind of obvious solution we’re seeking. Be able to reassemble it into a variety of shapes and topologies: a simple extension. What distinguishes an adequate engineering job from the design of an interface device that really enriches someone’s work and thought process is layers of human-centered improvements. These come only from a willingness not to stop at the initial solution, from careful observation of ourselves while we’re pretending to use the device, and from careful observation of real users after we get a prototype or two out into the field.
For animators, the value of a digital input device like the Monkey doesn’t come from having it precisely match the model on the screen. The usefulness of a puppet-like input has nothing to do with how digitally accurate the mapping of joint angle to number is; a 1 percent deviation in sensing has virtually no effect on the usefulness of the device. What’s important is having an affordance that clearly and subtly stands in for the thing you’re trying to manipulate.
The tweaks DID made came from carefully considering how the device was used by animators as people and artists. For instance, the head, hands, chest and pelvis plates are convenient handles, but they serve a more important function. They give the armature a gestalt sense of presence as a person, not just an erector-set jumble of components. This is central to its usability, not just cosmetic polish. The animator must be able to watch the screen for the real position of the model, and often use the device without looking directly at it. It must be possible for the animator’s hands to instantly find a hand, head, chest or elbow without wasting time, diverting attention or, most importantly, interrupting the flow of aesthetic ideas.
It’s interesting that in this field visual and tactile “surface” changes aren’t what people mean when they call a change cosmetic. Cosmetics are functional insofar as they really affect how a device is perceived and used. The Monkey 2 is human-shaped, but intentionally has no prominent character. The hope is that the artist uses it as an intermediary, or even transfers his feelings for the character to the device. This transference would be harder if the device drew too much attention to itself, or had too much intrinsic character.
To keep the device sublimated as a tool, we needed animators to trust the device, to viscerally know that it was not going to break or loosen in production, and know that they could grab it anywhere to effect a change. We made it generally tough looking by giving it a consistent black, beefy look and slightly matte surface texture. We gave the animator freedom to knock it around by carefully engineering a smooth, solid feel for the joint rotations; and a means of individually controlling the stiffness of the rotations to prevent excessive movement of joints the animator wants to restrict. We tested it by dropping it from eight feet, and let users know it survived. We made the wires visible, but barely so, by using colored conductors twisted in with the originally-specified black ones; and we gave them a perceptual consistency with the sensors by making one of those conductors the same medium blue as the sensors. This also helped to tie the whole unit together as an entity.
We also followed human factors when we addressed maintenance of the device. We let people test it without connecting it to a computer by adding an LED that lights when a joint is being rotated (colored the same medium blue as the sensors, of course). And we had all screws and fittings custom-made: a satin black finish matching the body of the device for those used only to disassemble it, and a prominent, contrasting brass for those used during production for joint-tension adjustment.
These kinds of tweaks take a device beyond barely working and truly make a task easier. They create an overall feeling of simplicity in what’s actually a complex device, and they can make it more satisfying and fun to use. Our work as physical input designers can help make people more comfortable and satisfied in their jobs, a worthy goal that should always be borne in mind while we’re designing. Would we be proud to have Mom and Dad use the device on a daily basis? If not, let’s keep tweaking.
Figure 3: The Cricket.
Another Case Study: The Cricket
DID also developed a 3D mouse in 1993 called the Cricket. Some of the things we considered while designing it may be useful to illustrate general principles.
Previous 3D mouse designers got caught in the linguistic trap of designing a spatial manipulation device with six or more degrees of freedom after the model of a 2D mouse, or in the equally dangerous attempt to make it work as a 2D mouse too: the trap of over-generalizing the task. We set out to invent something to improve upon the mouse, and the data glove, from first principles. We reduced the scope to manipulating objects in a desktop virtual reality, which defined a suitably restricted problem domain.
What needed to be done there? Well, users want to pick or indicate objects in the world, and possibly draw things (essentially indicating a point in space to apply paint). They may want to indicate quantity or size when creating or picking. They want to move things around. They want to get a feel for the space and perhaps properties of things in the space. And they may want to do more rare arbitrary things like choosing colors or navigating 2D menus. This defined the task list.
Next we started tinkering. Raise your hand into the air in front of a computer display. Is it in the pronated posture it assumes on a 2D mouse? Nope. Your elbow’s down, and your hand’s some 20 or 30 degrees off of vertical. So we made a device that fits into that comfortable hand posture.
Pick something on your desk: indicate it to an imaginary other person. Draw a line in the air. You used your index finger. Fine: we put an affordance under the index finger to make these actions as natural in communicating with the computer as they are without it. We seem to have reinvented the pistol grip and trigger. This is important for two reasons: we’re tapping into the wisdom of the past and, more significantly, we reinvented it based on current needs.
This also illustrates some compromises: we decided to have the grip vertical rather than the comfortable neutral angle because people often will hold a device vertically (years of training in our rectilinear world?) and we wanted a single device usable by both left and right handed people. Also there’s a more serious compromise if you watch people picking and drawing in the air: they extend the finger — the opposite of squeezing a trigger. This was mechanically difficult at the time, so we let it go.
Moving things was a little more natural. We let users squeeze the rest of their hand, the lower three fingers, around the base of the grip. Why not put a different button under each finger? The device felt less solid, most people use those fingers together anyway, and we had no need in our task list for that feature. (An important blow against creeping featurism! Keep your eager engineer self behind a user-shaped screen.) Now users can really feel like they’re grabbing something in the real world when they want to in the virtual one. Tests showed that squeezing the whole grip was tiresome, and that squeezing a base-long button felt flimsy or unstable, so we reduced it to a short button.
How do users know when they’ve touched something in the display? Well, it could change color, but that’s a visual cue for what’s a tactile experience in the real world. Why not use that real-world expectation and put a tactile display in the device? More compromises led us not to try to display to the fingertips, where most of those sensations occur in real-world grabbing. We display only to the thenar eminence (the muscular mound at the inside base of the thumb): it’s second in tactile sensitivity, in the hand, only to the fingertips. Another engineering compromise: we don’t display force feedback (actually pushing the fingers), just vibration. But we gained something: the vibration is variable in frequency (nicely covering the most sensitive part of the tactile spectrum, 250-1500 Hz), amplitude and waveform, so we can give different objects different tactile signatures.
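A sketch of what per-object tactile signatures might look like in software; the materials, frequencies and waveforms below are invented for illustration, not the Cricket’s actual values:

```python
import math

# Hypothetical tactile signatures: frequency (Hz), amplitude (0-1), waveform.
SIGNATURES = {
    "metal": (900.0, 0.8, "sine"),
    "wood": (300.0, 0.5, "square"),
}

def sample(signature, t):
    """Evaluate one actuator sample of an object's vibration at time t."""
    freq, amp, shape = signature
    phase = (t * freq) % 1.0  # position within the current cycle
    if shape == "sine":
        return amp * math.sin(2.0 * math.pi * phase)
    # Square wave: positive half-cycle, then negative half-cycle.
    return amp if phase < 0.5 else -amp
```

Streaming a different signature to the actuator for each touched object lets the hand distinguish objects without looking.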
We wanted more than binary control, so the user could control the size or rate of something, or pick a larger or smaller population of objects around a point. Pressure sensors on the buttons provided that. How about more arbitrary control? We used the thumb, which has more articulation than the other fingers, to control a “flat joystick” on top of the device with three full continuous degrees of freedom. This can be used to pick a color or navigate a 2D menu, for example.
Users can’t push the device into a desktop virtual reality, so how do we indicate points in there? This is where getting the software and the objects in the virtual world to work together with the device became extremely useful. We put a cursor into the world at a constant offset from the device, and also tried having a virtual wand starting at the device and projecting into the virtual space. It turns out that each technique is useful in a different set of circumstances, but that’s another paper.
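Both techniques reduce to a little vector arithmetic; the offset and wand length here are arbitrary illustrative values:

```python
def offset_cursor(device_pos, offset=(0.0, 0.0, -0.3)):
    """A cursor floating at a fixed offset in front of the device."""
    return tuple(p + o for p, o in zip(device_pos, offset))

def wand_tip(device_pos, direction, length=0.5):
    """The tip of a virtual wand projecting from the device into the
    scene; `direction` is a unit vector along the device's long axis."""
    return tuple(p + length * d for p, d in zip(device_pos, direction))
```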
Software also enhances the Cricket by helping to interpret the pressure sensors into something useful. Picking an object near the cursor is a quick click on the trigger. Picking several within a sphere can be done by squeezing the trigger slowly, letting the picking sphere expand and contract with the slowly adjusted pressure; the picking action occurring when software senses a quick trigger release. Likewise, gripping an object to move it might translate it, but a firmer grip might also change its orientation.
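One plausible way to code the quick-release detection; the pressure threshold and release rate are assumptions, not the Cricket’s tuned values:

```python
class SpherePicker:
    """Grow a picking sphere with trigger pressure; commit the pick when
    the trigger is released quickly (a fast drop to near-zero pressure)."""

    def __init__(self, max_radius=1.0, release_rate=3.0):
        self.max_radius = max_radius      # world-space radius at full squeeze
        self.release_rate = release_rate  # pressure units per second
        self.prev = 0.0

    def update(self, pressure, dt):
        """Feed one (pressure in [0, 1], timestep in seconds) sample.
        Returns (radius, picked): picked is True on a quick release,
        and the pick uses the radius held just before the release."""
        rate = (self.prev - pressure) / dt
        picked = pressure < 0.05 and rate > self.release_rate
        radius = self.max_radius * (self.prev if picked else pressure)
        self.prev = pressure
        return radius, picked
```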
Many other tweaks, both hardware and software, have accreted into the Cricket and its repertoire, both during its design phase and after we made a limited production run of them. It’s important to be open to exploring these changes, even after the “final design” has been cast. There’s always a reason to think about another tweak, if only to hone our own skills for the creation of the next device.
W. Bradford Paley is Principal, Digital Image Design Incorporated (DID) of New York. Paley is a long-time participant in the fields of interface design, scientific visualization and information presentation. After graduating with Phi Beta Kappa honors from U.C. Berkeley in 1981, he entered the financial data processing field as a consultant with Citibank in New York. He simultaneously pursued his interests in using the computer as a human communications medium by doing computer animation for the advertising industry. Finding the production tools awkward and almost non-existent he began writing his own, soon realizing that building a comfortable tool was more challenging and interesting than doing the animation itself.
You’re Not The Expert, You’re The Expert
You may want to involve an industrial designer in the process since industrial designers have spent a lifetime solving spatially-based functional problems. This is incredibly valuable to the typical person embarking on an input device design project, often starting from a background as domain expert, engineer or programmer. You’re not the expert here — let the industrial designer show you related problems and solutions, and invent new spatial structures and organizations for your thoughts.
But seen from the viewpoint of an interface device designer, the object of industrial design is somewhat antagonistic to ours: we’re trying to be invisible, where they’re trying to capture and please the eye. As a result, though industrial designers may constantly invoke ergonomics, in the end the design may be more for the eyes than for the hand. It doesn’t help that the premier award in the field is bestowed on the basis of photographs and write-ups rather than touching and using the physical object. It’s important to be constantly vigilant, and to gently steer our collaborators toward our real goal of comfort and usability. If the lines and color can win awards after that, so much the better.
As computers are applied to more varied tasks and particularly more spatial tasks, we increasingly seem to be coming up against the limitations of existing general purpose devices. There’s a good reason to believe that tapping into people’s real-world skills and abilities can greatly simplify many tasks.
We’ve reached a turning point in the development of the computer. It’s beginning to come out of its box and address human concerns in a much more human way. Special purpose input devices are an important step toward making tools that involve our bodies as well as our minds. I hope that some of DID’s experiences have made it easier for new devices to be conceived, targeted and ultimately deployed — to allow people to quit being computer operators and get more in tune with what they wanted the tools for in the first place.
Clifford Beshers helped immeasurably in reviewing to make this article more readable, as did Sally Grisdale; though naturally all errors and awkwardness remain my own. Cliff was an inspiration and domain expert in developing the Cricket, also. Bill Buxton convinced me that people might benefit from hearing about our design experiences. JueyChong Ong and Hai Ng gathered and refined references and images for this article, as well as being important participants in the development of the DID devices mentioned.