Vol.32 No.2 May 1998

Real Animators Don’t Rotoscope, Or Do They?

Mike Milne

Recently, in an Internet discussion group for 3D animators, there was a lengthy thread concerning the use of motion capture, or performance capture as it’s sometimes called, or even “performance animation” -- which it definitely isn’t. (See the SIGGRAPH 97 web site for transcripts of a panel addressing such issues.)

For those of you lucky enough to have remained totally oblivious of this arcane practice, let me spoil your day. Motion capture means recording the movement of something (alive or not) by one of several methods. In one method, electromagnetic sensors are attached to the something in question, and the data from the sensors is fed (via wires or a radio link) into a handy PC. Another way involves sticking little reflective balls onto the victim, I mean subject, and recording the motion on six or eight infrared cameras placed around it (or him, or her). The resulting video pictures are fed into a clever piece of software that works out what the balls must have been doing. A third method, used for capture of facial expressions and speech, needs only dots, applied with makeup to the unfortunate vic... volunteer’s face.

Now for some reason, people get very steamed up about the subject. I have heard the phrase “Satan’s rotoscope” applied to motion capture, usually accompanied by hissing and frequent signs of the cross. Other people can’t get enough of it, and think that in the glorious future, everything will be motion-captured, even sex. Personally, I can’t get very worked up about it -- motion capture is a tool I need to use occasionally, like a floorboard saw. Sometimes, nothing else will do; most other times, it’s useless. However, I do get worried about the use of the word “rotoscope” as a term of abuse, although I understand how this may have come about. The rotoscope is an honourable instrument -- its uses are many, and they are not all bad. Also, if you examine the subject, it tends to lead to more fundamental issues about the nature of entertainment.

In design studios, you will often hear artists using the term “reference” -- by which they usually mean a photograph of whatever it is they are required to illustrate. The implication of the word is that they will be examining this picture carefully, analysing the light and form and texture in it, and then putting it to one side while they set to work on their masterpiece. The truth is often more prosaic than that, and calling the photo a reference is about as accurate as a burglar referring to his victims as “clients.” What actually happens is that the artist slips the photo under a sheet of layout paper (lightweight paper that is slightly transparent) and traces off the salient details, before giving it some sort of graphic treatment that covers up the evidence, so to speak. In some studios (but never in front of the clients) these reference pictures are called “swipes,” for obvious reasons.

In traditional (hand-drawn) animation, swipes exist as well, although they have a different name. From the studio’s earliest days, in the early part of this century, animators at Walt Disney’s studio filmed actors performing movements, sometimes with props, and then used the resulting footage to help them with their work. The footage was not traced, but was used as a guide for weight, timing and movement.

In their seminal account of the work of the Disney studio, The Illusion of Life, Frank Thomas and Ollie Johnston describe frequent trips to nearby farms to film animals when animating scenes for Bambi. During the making of Snow White, they invited a top burlesque comedian to the studios to record on film an interpretation of one of the dwarfs, Dopey, who had proved problematic to characterize. The result was a huge success, and live-action filming became a regular part of the animation process -- as a rehearsal ground for gags and stage business, and as a way of getting the input of talented performers who could often find solutions when the animators were in trouble.

Meanwhile, there was another way in which live-action photography contributed to animation. It was pioneered in Disney’s 1940 work Fantasia, in which the animated figure of Mickey Mouse was seen climbing onto the conductor’s podium to shake hands with a (real-life) Leopold Stokowski. Soon, all the Hollywood animation studios were at it, and animated characters were popping up all over the place to share the limelight with the screen idols of the day.

Now in order for our little two-dimensional chums to inhabit the same celluloid world as their flesh-and-blood colleagues, the animators had to have a reference (that word again!) of the humans’ movements -- they had to know whereabouts to draw the critters, where the eyelines were (the eyeline is filmspeak for the direction in which you’re looking), where the hands and feet should go and the scale to which they should be drawn.

The solution was to be found in a piece of technology that was used for another aspect of cinematic trickery -- that of hand-painting mattes (silhouettes of parts of the image) to allow different layers to be superimposed on each other. This technique formed part of the craft scathingly known (by directors) as “trick photography,” and was generally relegated to faceless technicians (known as “trick men”) in back rooms -- and never mentioned in the company of creative people, lest they should suffer palpitations. (Today, all that has changed. The art of producing visual effects, or VFX as they are known in the trade, has become respectable, creative and even sexy; practitioners drive expensive sports cars and buy vineyards.)

The technique used for hand-painting mattes at the time was to project the film, one frame at a time, onto a glass screen on which an artist placed a translucent sheet. The image could be easily (though somewhat tediously) traced and filled in with solid black. A separate drawing was made for each frame of film, and the resulting pictures were then rephotographed onto cine film, to produce the matte roll. The machine was called a Rotoscope, and thus the process was called “rotoscoping,” or even “rotoscopy” (which sounds so much more official). It was a principal tool of special effects photography from the earliest days to comparatively recent times. Lucas’ Star Wars trilogy, made before the digital revolution, and many blockbusters since have made extensive use of it. Nowadays, of course, the process has been computerised. The film is digitally scanned onto disk, and the artist sits in front of a terminal with a stylus and a virtual toolbox of clever algorithms and fancy hardware -- but the process is still called “rotoscoping.”

The Disney animators co-opted the rotoscoping technique to help them in the task of combining painted characters into live-action scenes, although this time with the insertion of an extra stage in the process. Filmed images were projected frame-by-frame onto photographic paper, and the resulting large prints (sometimes called “photorotos” or “photostats”) were punched with registration holes and slipped under the animator’s “flimsy” (which is lightweight paper -- now why does that sound familiar?), making it a simple matter for the animator to draw the character in the right place, doing the right thing.

Historically, this form of rotoscoping has popped up in feature films every once in a while, sometimes in a major role. For instance, it was used in the 1964 film Mary Poppins, and most notably in the Richard Williams classic, Who Framed Roger Rabbit, which caused almost as much excitement on its release in 1988 as Fantasia had done five decades earlier. Outside the feature film business, however, rotoscopy is far more prevalent -- especially in commercials. Just think of all those ads in which animated characters run around the tabletop to extol the virtues of a breakfast cereal, or in which a pack of something-or-other sprouts arms and legs and cavorts across the screen. These latter are so common that they have earned the generic name of “dancing products” amongst the animation fraternity -- both conventional and computer-based. However, while the computer animators perform their rotoscopy digitally, using processor-power to display the wire-frame (or even Phong-shaded) CG characters over the photographed live-action, the traditional animators still order up their photorotos and slide them under the flimsy, just as their colleagues did 60 years ago.

Now you can picture the time when the temptation might become too great. An animator has been given an impossible deadline, say, or a job has turned out more difficult than it first appeared. The animator just doesn’t have the skill to pull it off -- the temptation is there to say “the hell with it!” Hire a couple of actors, put them in front of a camera, shoot the action, pull out some photorotos, slide them under the old flimsies and get to work with the pencil, tracing the figures straight from the photographs.

At last! Now we’re motoring! Look how realistic the animation is! And how cheap to do! Unfortunately, though, it doesn’t look very good. Disney was tempted, when making Cinderella during a time of extreme financial difficulty for the studio, to film the whole story with actors (much cheaper, and quicker, than animation) and then to feed the resulting images to the animators to trace. While they didn’t actually use the tracings directly, they were encouraged to follow them fairly closely in order to speed up the animation process. The results convinced Disney that there was no future in this form of animation, and the experiment was not repeated.

Incidentally, in a less-than-reverent biography of Uncle Walt (The Disney Version, 1968) Richard Schickel had a more jaundiced view. He stated categorically that, in Snow White, all of the Prince’s actions, and most of Snow White’s, were traced directly off live-action film. He also implied that all Disney employees were sworn to secrecy about the process, and that the word “rotoscope” was a code name to cover up this dark practice.

The problem with traced animation is that it looks dead -- lifeless and inert, and carrying with it a sort of built-in boredom switch. One glance, and your brain switches off. If you doubt this, check out the “Lucy in the Sky with Diamonds” sequence in the 1968 film Yellow Submarine, in which the worst aspects of the process are clearly visible. There is a dancing couple in the foreground, drawn (or rather traced) in outline; but no matter what the animators have done to jazz up the picture (and believe me, they tried -- even filling every frame with different scribbles, flashes, stars, rainbows -- anything to get some life into the scene), there is still that same yawn-inducing flatness. Even though it is far more faithful to real-life motion, it is still less interesting than the hand-drawn animation in the scenes that surround it.

When the film was released, that scene was hailed (by the animation industry) as a triumph of experimental animation -- but to me (and, I suspect, to the public in general), it was another nail in the coffin of animation as a commercially viable method of producing feature films. The animation industry was enjoying (if that’s the word) its lowest level of popularity since ... well, since the invention of animated films at the beginning of the century. Even the Disney Studio had turned to live-action features and the Disneyland theme park to earn revenue, while animated features had been pushed onto the back burner.

The situation did not improve for some time after that, and it was not helped by the release of another animated full-length feature a decade later -- Ralph Bakshi’s animated version of Tolkien’s The Lord of the Rings. Once again, we can see the corpse-like hand of traced animation in many of the scenes. In some instances, they didn’t even bother to trace the animation. The live-action images themselves were used directly, with an image-processing technique applied (or perhaps it was simply a photocopy) in an effort to disguise their origin -- along with some parts of the image redrawn over the top (“It’s still animation, guys! Honest!”). As before, the industry defended this awful practice as “experimental.” Yup, it was an experiment all right -- like the Ford Edsel.

OK, you’re getting impatient. What has all this discussion of hand-drawn animation techniques in the mid-twentieth century got to do with computer animation at the brink of the new millennium?

Well, it’s got to do with the reason why traced animation just doesn’t cut the mustard, and how that reason affects everyone involved in creating entertaining images.

Some years ago, I was approached by the director of a commercial (for the Norwegian national railway, as it happens) which had been filmed as a live-action commercial, but which had to end up looking like an animated impressionist painting. The director had heard of the new software packages that might do the trick; I had too, so I bought one, and we started work.

At first, it seemed absurdly easy. Still frames, subjected to some of the processes, did indeed look like a Sisley or a Cezanne. The problem came when we applied the effect to the whole sequence. It was as if the hand-painted quality of the image disappeared, and in its place was the effect of watching a scene through rippled glass, or through a rather dirty piece of crumpled polythene.

My hypothesis is that the brain is not deceived by the image processing, because the processing of the image that it performs itself is many times more powerful, and the software is millions of years older. Parts of the brain have, quite literally, evolved to recognise live-action motion. Once a movement has been recognised as real, the brain interprets the treatment (the colours, blotches, lines, scribbles, whatever) as something that the image is seen through, and therefore of no consequence. It starts to examine the action, and if the action is not particularly interesting, then the brain is just not interested.

Exactly the same process occurs with motion capture. Even when the motion is clothed with a computer graphics character, it is recognizably (to the brain) a real-world action, and therefore the computer graphics rendering is just a fancy suit. Usually, the visual richness of the real world is absent from the CG image, however densely it is swamped with textures, and the scene is interpreted as a flat version of reality.

With traced 2D animation, the outline is “mechanically” reproduced even though the machine is a human hand, because the artist’s brain is effectively disengaged during the process of tracing. The line follows the live-action willy-nilly, and the viewer’s brain once more, with the magic of the evolved visual cortex, spots the real-world action and discards the rest.

The brain can be fooled, however, by another brain. If, instead of a mechanical reproduction process, the live-action is passed through the filter of human intelligence before being rendered (onto paper or into digits), the resulting images are harder for the viewer’s brain to spot as real-world action. Consequently they are more intriguing, and the brain stays interested while it examines them. I expect, if we were the sort of people who like to publish papers in scientific journals, we could make a case for a direct relationship between the perceived quality of the piece and the length of time the image resides in the “intelligence filter” before coming back out into the real world. So, if the line is drawn at the same time as it’s seen, as in tracing, the perceived quality is pretty low (Lucy in the Sky). If the image is retained for a few hours, as in Disney’s Cinderella animation-guided-by-live-action, that’s a whole lot better, but no cigar. Leave it stewing in the brain for a few days (the farm animals in Bambi), and now we’re really cooking with gas. And what’s this? Someone who mulls over the same mountain scene for 20 years?

Well, pleased to meet you, Mr. Cezanne!

Mike Milne is Director of Computer Animation at FrameStore, which together with its sister company CFC, forms one of Europe's largest digital effects teams. Mike started out as an artist and beachcomber in the '60s, moved into graphic design in the '70s and finally to computer graphics in 1982. Sometimes he regards his career as one long, downhill slide.

Mike Milne
9 Nole Street
London W1V 4AL
United Kingdom

Tel: +44-171-208-2600
Fax: +44-171-208-2626

The copyright of articles and images printed remains with the author unless otherwise indicated.