Vol.32 No.1 February 1998
The Shape of Things to Come
Andries van Dam
I still vividly remember the day in 1964 when, as a graduate student interested in information storage and retrieval, I saw Ivan Sutherland's landmark film on Sketchpad. I was blown away. The next day I switched my thesis topic to an unknown specialty in the nascent field of computer science called computer graphics. I still show the Sketchpad film in my introductory course each year and, like the old-timer who tells his kids about walking 10 miles to school, in the snow, barefoot, I always emphasize to my students, spoiled rotten by multi-hundred MIPS engines with accelerators for 3D raster graphics, what an amazing paradigm shift interaction via pictures was in an era defined by batch processing with punched cards.
Later, at the 1968 FJCC, Doug Engelbart further expanded my horizons when he, in what has to be the mother of all demos, demoed window systems, the mouse, outline processing, sophisticated hypertext features and telecollaboration with video and audio communication.
In the intervening decades, graphics has greatly expanded from those early days of primitive vector graphics to include not just classical geometric modeling, photorealistic rendering (made possible by color raster graphics) and even behavioral modeling, but also elements that only recently became part of mainstream graphics, such as image processing, computer vision and new paradigms for user interaction. A look at these examples will serve to show how the field continues to evolve and take on new forms, and may give a hint as to the shape of things to come.
Image processing became important to graphics as soon as raster graphics became a dominant mode of display. We learned (much too slowly) about various causes of unsightly aliasing and about simple antialiasing techniques. At the same time, page description languages like PostScript were evolving, and image processing techniques formed part of the foundation for new applications that let users manipulate images, such as Adobe's Photoshop and After Effects. Such applications were significant departures from the standard CAD/CAM and scientific visualization applications, the previous breadwinners in graphics.
While the graphics community in the 70s, knowingly or unknowingly, embraced elements of image processing technology, there was a clear distinction between 3D geometric models and their rendering on the one hand and manipulation of 2D images for medical image processing, for example, on the other. Graphics hardware and software had too little in common with image processing hardware and software, and professionals in one area did not mingle with those in the other. Texture mapping began the move to combine the two areas, and the distinction is now being further blurred as rendering pipelines use image-based rendering techniques such as image warping to obviate rerendering each frame from scratch. In addition to high-end research such as Hanrahan and Levoy's light-field rendering and Gortler et al.'s work on the Lumigraph, commercial programs such as QuickTime VR have popularized the idea of generating a sensation of 3D from 2D data.
Computer vision techniques, such as depth extraction from multiple images, formed the basis of McMillan's plenoptic modeling work. Those same techniques are used in the reconstruction of static geometry (and reflectance information) from multiple images. Some of these methods may soon become hardware-supported as their importance to graphics increases. The combination of geometric model-based and image-based rendering techniques appears to be a realistic solution to the lack of sufficient time, human resources and modeling expertise to model arbitrarily complex, non-static scenes (e.g., office environments containing people and a mixture of manmade and natural objects). Graphics researchers are now working on solving the "inverse-rendering" problem that just a few years ago might have been considered the domain of vision researchers; the world-view peculiar to graphics researchers has brought new approaches to this daunting task -- reconstructing the detailed geometry of a scene from the physics of light reflection and a collection of suitable images.
User Interface Issues
The SIGGRAPH research community has at times been criticized for being too preoccupied with photorealistic rendering and not paying enough attention to real-time interaction and user interface issues and techniques. Indeed, because people felt that the annual SIGGRAPH conference gave short shrift to these important topics, two conferences specializing in such issues, UIST (User Interface Software Technology) and 3DIG (3D Interactive Graphics) were organized in the 80s, not to mention SIGCHI. When Xerox PARC researchers invented the precursor to today's standard desktop GUI, often called the WIMP interface (Windows, Icons, Menus, Pointing), we did not anticipate that the old cliché about graphics being the window into the computer would apply even more to interaction with any desktop application, even word processing, than to information visualization. Even we zealots didn't dream, back in the 60s, that GUIs would eventually make computers easy enough to use that even preschoolers (dare I say managers? college professors?...) could exploit them. Without GUIs, PCs and the information technology industry would be nowhere near as pervasive and as economically important as they are today.
I believe that, thanks to Moore's Law, we finally have almost enough computing power for much of what we want to do, and that now the quality of the user interface is far more important to the success of a platform and its applications than its underlying functionality. I regret, therefore, that user-interface issues and indeed the entire human component in the human-computer loop are still relegated to a peripheral role in the standard computer science curriculum. Too much hardware and software is still designed by technologists with an inadequate understanding of wetware. For example, rendering technology is largely uninformed by what visual effects are most important under what conditions for what tasks. And increasingly, computers are not being used in the "single user interacting with a software application" paradigm, but in a paradigm in which the computer provides a means of communication between two or more people. Telecollaboration in more or less immersive environments, e.g., for mechanical or architectural design or for telemedicine, is an example of a non-traditional problem area that needs far more knowledge about human beings and how they function individually and in groups than we teach our students (and indeed than we know ourselves).
The Post-WIMP Interface
Another trend affecting the user interface is the gradual move away from the nearly 30-year-old paradigm of the WIMP desktop interface to what I call the post-WIMP interface. The WIMP is characterized by deterministic, synchronous, single-device-at-a-time interaction, typically via mouse and keyboard, and typically passive, at most reactive application objects. The post-WIMP user interface is characterized by nondeterministic, asynchronous, multiple-devices-in-parallel interaction and often reactive objects that also have autonomous behaviors. Consider, for example, interaction in an immersive VR environment with some complex object or collection of interacting objects whose behavior is being simulated. The computer continuously tracks the user's head and hands, adjusts the point of view dynamically as the user moves her head, and probabilistically interprets gestures that may indicate objects as well as operations on them. Voice recognition may be used in combination to provide another channel while spatial sound and haptic feedback augment the visual display. MIT's landmark "put that there" demo in the late 70s was one of the earliest examples of such multimodal interfaces.
Figure 1: A user designs a parameterized feature-based 3D mechanical model by sketching 2D gestures at an ActiveDesk (a drafting-table-sized projection table). The non-dominant hand controls other modeling operations such as camera controls using the trackball attached to the desk or virtual widgets (e.g., a colorpicker) using a tracker.
Games and battle simulations exemplify the types of applications that demand post-WIMP, multimodal interfaces. But this type of post-WIMP interaction can be just as useful on the desktop as in immersive VR. For example, our "sketching" user interface for conceptual design and mechanical CAD does away with interface widgets altogether, attempting to create a medium as natural and convenient to use as a cocktail napkin (see Figure 1).
Another development I anticipate in user interface technology is a combination of the user's direct manipulation of the environment with the indirect control provided by intelligent agents working on the user's behalf. Despite the commercial failure of the first social interface, Microsoft's Bob, more sophisticated versions will reappear and may actually be useful in, for example, a telecollaboration environment where avatars of remote collaborators, agent avatars and application objects will all be present and interact in the scene.
The broadening of computer graphics with its emphasis on physically-based modeling of geometry, behavior (albeit usually still quite primitive) and light-object interaction, and more recently on user modeling, has created an increasing emphasis on mathematics, physics and human studies such as cognitive science and perceptual psychology (as is evident from the SIGGRAPH proceedings).
Those involved in authoring tools and content creation have also had to learn new design disciplines: graphic design, user interface design and what I believe to be an important nascent field, the design of interactive experiences. These experiences include not just entertainment and edutainment but educational environments in which students interact with controllable microworlds that I call interactive illustrations or exploratories, as discussed below. In short, our field has become highly interdisciplinary and projects typically require interdisciplinary teams rather than an individual investigator with a couple of students or colleagues.
The Exploratory at Brown
In a project at Brown University, we are trying to determine how to create successful explorable environments for teaching. Our idea of an exploratory is a computer-based combination of a science museum-style exploratorium and a laboratory: an approach to teaching and learning that uses two- and three-dimensional explorable worlds in which objects have behaviors and users can interact with models of concepts and phenomena. Exploratories leverage computer graphics and a deeply interactive, constructivist learning approach to provide efficient, powerful educational experiences that would be impractical, if not impossible, with traditional textbooks with static illustrations or even video clips.
As the start of the Exploratory project, an interdisciplinary team of students, with backgrounds not only in computer science but also in art, design, cognitive psychology and education, is constructing a set of Java applets for my introductory computer graphics course. These go well beyond the kinds of algorithm animations we designed in the early '80s with BALSA (the Brown Algorithm Simulator and Animator). Not only will this give us an excellent set of teaching and learning tools, but, more importantly, by testing our applets on students and getting critical feedback, we hope to develop a good sense of what it takes to create a successful exploratory. We are planning to capture our experiences in a set of design patterns for such teaching tools.
Education and Computer Graphics
Clearly I remain excited about the potential of computers in education, and I believe that graphics will play a key role. But I am also a realist, some might even say a pessimist. As I look back over 30 years, I see extraordinary applications of graphics throughout academia and industry but, frankly, not many success stories for graphics in the world of education. Most of the projects I worked on in the late '60s and '70s, such as the electronic book/hypermedia systems HES (A Hypertext Editing System), FRESS (File Retrieval and Editing SyStem) [9, 10] and Intermedia, and most of the other efforts that have made headlines in the last few decades, from the promise of CAI in the 60s and 70s to Alan Kay's Vivarium in the 80s, have not sparked a revolution in teaching and learning.
Part of the problem has been expensive hardware and expensive maintenance. To date, the vision of interactive, explorable, behaviorally correct worlds has been possible only on high-end Unix workstations. With the next generation of personal computers and World Wide Web software, this once esoteric method of teaching can become commonplace. Some of the hype about the Web may finally come true and this may be the decade in which the computer starts to have a major impact on how we design educational materials and even how we think about the educational process. But none of this will happen without major research efforts to create tools, design strategies and inspirational examples.
Now let me get off my soapbox and finish by speculating a bit more about technology in the not-too-distant future. It is clear that sometime soon, we will break out of the constraints of the desktop workstation/personal computer and the desktop metaphor. All CPUs will have built-in graphics/image-based rendering/image processing capabilities so that graphics and multimedia will no longer be add-ons but will be standard. Our appliances and wearable computers will have graphical (and aural and haptic) user interfaces. This major shift in capability mirrors an earlier revolution, the introduction of mostly self-sufficient workstations and PCs with bitmap-based GUIs in the mid-'80s that largely replaced alphanumeric dumb terminals on time-shared mainframes.
At the same time as this new computing hardware becomes standard, developments in TI mirror chip projector technology may finally let us display large-screen images on our office walls without high-priced projectors and special display surfaces. The model espoused by researchers such as VR pioneer Henry Fuchs includes multiple projectors and cameras integrated into our offices, capable of capturing, displaying and controlling synthetic environments with much higher spatial and temporal resolution than we get on contemporary devices. The games industry will lead the way in giving us more interesting interaction devices, e.g. force-feedback tools, and in general we will be able to produce far more realistic virtual environments and use more of our sensory capabilities for perceiving and interacting with them. In particular, unobtrusive VR of a quality much better than available today will be affordable and commonplace, and it will be especially useful for telecollaboration.
When I first saw Sutherland's Sketchpad film, and for many years thereafter, a graphics terminal was considered exotic and there was usually one, at best, per university. Much of the challenge graphics professionals faced had to do with hardware, optimization and hacks to get the most from the available resources. Today, even personal computers have extraordinary graphics capabilities and are as common as desk lamps in university offices. While other exotic graphics hardware such as VR labs and CAVEs have taken the workstation's role as experimental facilities found only at a few universities, the classical graphics machines have entered a new era. In many cases, speed is no longer the issue; instead, elements like better user interfaces, knowledge of pedagogy for interactive education and protocols for collaborative computer-based work are what are needed to increase our productivity.
Just as the capabilities of the early graphics workstations are now standard in personal computers, so I believe that some form of VR will be a common part of our offices and homes. We will experience synthetic environments that provide a feeling of "being there," a capability already predicted in Sutherland's 1965 article, "The Ultimate Display." Not only will graphics continue to become more ubiquitous and integrated into daily life through cheaper, better hardware, but our definition of "graphics" will continue to expand to include more disciplines, leverage more of our senses and play an ever more important role in the basic tasks of thinking and communication.
Andries van Dam, Professor of Computer Science at Brown University, is well known to Computer Graphics readers. He co-founded SIGGRAPH in 1967, co-authored the widely used textbook Fundamentals of Interactive Computer Graphics and its successor Computer Graphics: Principles and Practice and won the ACM SIGGRAPH Steven A. Coons Award in 1991. His research has concerned computer graphics, text processing and hypermedia systems and workstations.
I am grateful to Anne Morgan Spalter for prodding me with questions and helping me articulate these thoughts.