siggraph.org

Computer Graphics (CG) Quarterly, Volume 41, Number 3
Stereoscopic 3D Film and Animation - Getting It Right

A new format for stereoscopic 3D movies has recently been introduced to theaters, raising image and color quality to higher levels. Unlike the old red/blue anaglyph format, the new technology uses polarized lenses, a metalized screen and polarized glasses to allow the viewer to see sharp, full-color, separate images with their left and right eyes. In addition, a new type of personal video display has been introduced to the DVD and computer market in the last few years. Video eyewear uses two tiny LCDs to provide a separate image for each eye, and is worn like a pair of glasses.

Author: Kenneth Wittlief


These new technologies are creating a new market for stereoscopic 3D movies and games. For animation the content is very easy to generate. All you need to do is render a second version of a film with a second camera position angled a few degrees to the right of the original, and you will have stereoscopic separation giving depth perception so real it seems like you can reach out and touch the objects on the screen.

Getting the stereoscopic effect is simple. Getting it right requires a little more attention to several important details. To begin it is first necessary to understand how we perceive distance and 3D shapes. The primary mechanism we use is converging our eyes. If an object is very far away the light that hits our eyes is parallel, and both eyes look straight ahead. As that distant object moves closer and closer our eyes converge, our left eye points more and more to the right while our right eye points more and more to the left. At the extreme, when you look at the tip of your nose, your eyes are crossed at their limit.

The secondary mechanism is focus. If you have ever used a manual SLR camera you know the lens (like our eyes) focuses on distant objects by moving the lens back, and focuses on closer objects by moving the lens out. Our brain subconsciously takes in the amount of convergence, and the amount of muscle force needed to attain focus, and judges distance based on these two inputs. It is important to note that in the real world these two mechanisms are always tied together. If you look at an object 20 feet away your eyes converge on a point 20 feet away, and your eyes focus on a point 20 feet away.

Another important fact that most people don't realize: our depth perception is only good out to approximately 200 yards. Beyond that distance everything appears to be flat, and we use sideways motion to judge distances. The average person has an Inter Pupil Distance (IPD) of 2.5 inches. At a distance of 200 yards or more the image seen by the left and right eye is the same.

So let's translate this into a movie theater. For the sake of discussion, let's put our viewer 40 feet away from the screen. We have a stereoscopic 3D projection system, and our viewer is wearing the polarized glasses. We can project two different images on the screen, and he will see one only with his left eye, and the other only with his right eye. To begin, we put up the same image for both eyes, a capital “A” in the center of both images. What does our viewer perceive? Both eyes will converge and focus on the same point on the screen, and his brain will conclude the object is about 40 feet away.

Now we'll pull the left eye “A” to the left 2.5 inches. If our viewer has the average IPD of 2.5 inches it will appear the object has moved out to the limit of his depth perception, about 200 yards away or more. With no other visual cues he won't be able to tell. What about focus? The screen is fixed 40 feet away, so his eyes will stay focused at that point. For most people this will be the first time in their life their eyes have converged and focused at two different distances, and it will feel uncomfortable. For people who wear glasses the sensation is similar to what you experience when you get a new pair of glasses and your prescription has changed. Your eyes must now focus differently than they did before, and you feel eye strain. After a few hours you may develop a headache, until your brain has learned to cope with this new way of seeing. We have only moved the left eye image 2.5 inches, and already we have problems.

Let's put the left eye image back to the same position as the right eye. The image once again seems to be 40 feet away, and the viewer's eyes are not experiencing any strain for the moment. Now move the left image 2.5 inches to the right. What does the viewer experience? He will cross his eyes slightly to keep them converged on what is projected. Since his IPD is 2.5 inches and we have crossed the images by 2.5 inches, his two lines of vision will cross halfway to the screen, and it will appear the object has jumped halfway towards him off the screen, and is now only 20 feet away. His eyes will remain focused on the actual screen, 40 feet away, and he will experience eye strain again.
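The three experiments with the letter “A” can be checked with similar triangles. The sketch below is only an illustration under simplifying assumptions: the eyes are treated as two points, the screen is flat, and the function name and sign convention (a negative shift means the left eye image was moved to the left) are invented for this example.

```python
import math


def perceived_distance_ft(screen_ft, ipd_in, left_shift_in):
    """Distance from the viewer at which the two lines of sight cross.

    left_shift_in: how far the left eye image is moved on the screen,
    in inches. Negative = moved left (object recedes), positive = moved
    right (images cross, object comes toward the viewer).
    """
    gap = ipd_in + left_shift_in
    if gap < 0:
        # The images are farther apart than the eyes: the eyes would
        # have to point outward, which most people cannot do.
        raise ValueError("eyes would have to diverge")
    if gap == 0:
        # Lines of sight are parallel: the object sits at "infinity".
        return math.inf
    return screen_ft * ipd_in / gap


print(perceived_distance_ft(40, 2.5, 0.0))   # 40.0 (on the screen)
print(perceived_distance_ft(40, 2.5, 2.5))   # 20.0 (halfway to the viewer)
print(perceived_distance_ft(40, 2.5, -2.5))  # inf  (beyond the limit of depth perception)
```

In this simple model a 2.5 inch shift to the left gives parallel lines of sight, which is consistent with the object appearing "200 yards away or more".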



So far we have created a virtual 3D space in front of the viewer sitting forty feet from the screen, that extends 20 feet towards him and recedes away to the limit of 200 yards. Past that everything looks flat. We have achieved this by moving the left eye view plus and minus only 2.5 inches, which is plus and minus 0.3 degrees from the viewer's position. Our initial idea about setting up a second camera a 'few degrees off' would end up creating an extreme stereoscopic effect.
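The 0.3 degree figure is easy to verify. The following is a small sketch, with function names made up for illustration. The second function assumes, purely for the sake of comparison, that an offset of a few degrees reaches the viewer as the same angle at his seat; the real mapping from camera angle to on-screen shift depends on the lens and the projection.

```python
import math


def shift_to_angle_deg(shift_in, screen_ft):
    """Angle at the viewer's seat subtended by a shift on the screen."""
    return math.degrees(math.atan2(shift_in, screen_ft * 12))


def angle_to_shift_in(angle_deg, screen_ft):
    """On-screen shift (inches) that subtends the given angle at the seat."""
    return screen_ft * 12 * math.tan(math.radians(angle_deg))


# 2.5 inches on a screen 40 feet away is roughly 0.3 degrees.
print(round(shift_to_angle_deg(2.5, 40), 2))

# Under the stated assumption, 3 degrees would be about 25 inches on the
# screen, ten times the shift that already spans the whole usable space.
print(round(angle_to_shift_in(3.0, 40), 1))
```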



The first rule of getting it right: you have to know how the images will be viewed. You are creating a virtual space in front of the viewer and you must know what that space is, before you can start placing shapes and objects at different distances. The reward for your attention to this detail is that it is possible to create images that will appear to exist in real space, across the front of the theater, extending halfway out the screen towards the viewer, and receding back for 200 yards or more.

We have uncovered two problems already. Focus is an issue with no easy solution. The best we can do with existing systems is to not violate too harshly the convergence/focus lock the viewer's eyes have learned over their lifetime. If you push objects off the screen more than half the distance to the viewer, it will start to feel like you are crossing your eyes. This includes all those tacky 3D effects, like pointing a stick at the viewer's face, having objects fly off the screen right between the viewer's eyes, or having an object move till it appears to be only 3 feet in front of the viewer. You must be careful never to flaunt the 3D effect in the viewer's face.

The second problem is that we have been talking about an average person with an IPD of 2.5 inches. What happens to a child with an IPD of 2 inches when you pull the two images 2.5 inches apart, to make the object appear to be far away? His eyes must diverge! Very few people can make their eyes point outward in opposite directions. The only solution for this problem is to limit the separation on the screen to the minimum IPD that might be present in the audience: around 2 inches. This means that for the person at the other extreme, with an IPD of 3 inches, the space we can work with is much shallower: with only 2 inches of separation on a screen 40 feet away, his lines of sight meet about 120 feet from his seat, roughly 80 feet behind the screen. Our virtual world is getting boxed in!
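The rule of limiting separation to the smallest IPD in the audience can be written down as a simple check. This is a minimal sketch; the function names and the sample list of IPDs are assumptions for illustration only.

```python
def must_diverge(ipd_in, separation_in):
    """True if the on-screen separation is wider than the viewer's eyes.

    separation_in is the uncrossed (background) separation between the
    left and right images on the screen, in inches.
    """
    return separation_in > ipd_in


def max_background_separation_in(audience_ipds_in):
    """Largest separation that no one in the audience has to diverge for."""
    return min(audience_ipds_in)


audience = [2.0, 2.5, 3.0]  # hypothetical child, average adult, wide-set adult

print(must_diverge(2.0, 2.5))                  # True: the child's eyes must diverge
print(max_background_separation_in(audience))  # 2.0
print(any(must_diverge(ipd, 2.0) for ipd in audience))  # False: safe for everyone
```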

How tall are you!? Stereoscopic 3D has been around for a long time. Shortly after photography was invented 3D photo viewers were produced, with photos from all over the world. People wanted to see the Rocky Mountains and the Grand Canyon in all their spectacular grandeur, but the photographers quickly ran into a problem. If you took a pair of cameras and set them side by side, 2.5 inches apart, to match the average person's IPD, and you used a lens with a normal field of view, then what happens when you take a stereoscopic photo of the mountains? It looks flat! It looks the same as a single photo. Why? Because of the limitations we discussed, our depth and 3D perception only extends around 200 yards. Everything past that looks flat.

They came up with an easy solution, and it has been used ever since: move the cameras further apart. Keep moving the cameras further apart until you can see the shapes in the mountains and the canyons. This works, but it creates another problem. When you view these old photos your brain tells you something is not right. I have a stereoscopic photo of Washington, DC, that was taken with two cameras set 100 feet apart. You can see incredible details and the shapes of the buildings, the curves of the hills beyond the city, construction cranes rising above new buildings, smoke rising from smoke stacks, but it feels like you are looking at a model of the city, a very small model, like you are standing over a train board with tiny little buildings. You can't fool your brain. Since the cameras were set 100 feet apart it feels like your eyes are 100 feet apart, which would make you 480 times larger than a normal person: 2,880 feet tall!
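The arithmetic behind the giant is short enough to sketch. The function names are invented for this example, and the 6 foot "normal" height is an assumption implied by the 2,880 foot figure.

```python
def apparent_viewer_scale(camera_separation_in, viewer_ipd_in=2.5):
    """How many times larger than life the viewer feels."""
    return camera_separation_in / viewer_ipd_in


def apparent_viewer_height_ft(camera_separation_in, real_height_ft=6.0,
                              viewer_ipd_in=2.5):
    """Height the viewer seems to have, given the camera separation."""
    return real_height_ft * apparent_viewer_scale(camera_separation_in,
                                                  viewer_ipd_in)


separation_in = 100 * 12  # cameras set 100 feet apart

print(apparent_viewer_scale(separation_in))      # 480.0
print(apparent_viewer_height_ft(separation_in))  # 2880.0
```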

The second rule for getting it right: don't turn the audience into giants! You must use the correct camera separation for the camera field of view. For animation this means that first you must put your viewer into the virtual space itself. If you are animating bugs then how tall is your viewer? Do you want it to feel like the person watching the film is 6 feet tall, looking at bugs on the ground? Or do you want to bring your viewer down to bug size, so his eyes (your cameras) are the same distance apart as the bug's eyes?

Keep this in mind for long shots as well. If you are using a 3X (telephoto) field of view then move the cameras 3 times as far apart. If you use a wide angle field of view you have to bring the cameras closer together accordingly. If you are zooming in and out, your camera separation must adjust to track the field of view changes, otherwise as you zoom the viewer will feel like they are getting larger and smaller (a nice effect if that is what you are trying to accomplish, but very distracting if that was not your intention).
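For an animated zoom, the separation can be keyed to the zoom factor frame by frame. This is a rough sketch of that idea: the function name is hypothetical, the 2.5 inch base separation assumes a human-scale viewer at 1X, and the linear scaling simply follows the 3X rule of thumb above.

```python
def camera_separation_in(zoom_factor, base_separation_in=2.5):
    """Camera separation that tracks the zoom factor.

    zoom_factor: 1.0 = normal field of view, 3.0 = 3X telephoto,
    values below 1.0 = wide angle.
    """
    if zoom_factor <= 0:
        raise ValueError("zoom_factor must be positive")
    return base_separation_in * zoom_factor


print(camera_separation_in(3.0))  # 7.5  (3X telephoto: three times as far apart)
print(camera_separation_in(0.5))  # 1.25 (wide angle: cameras closer together)

# A zoom from 1X to 3X over five frames, with the separation tracking it.
for frame in range(5):
    zoom = 1.0 + 2.0 * frame / 4
    print(frame, zoom, camera_separation_in(zoom))
```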

To answer the old question: how do you capture the Grand Canyon or other scenic vistas in 3D? Keep the camera separation correct; don't pull the cameras apart to exaggerate the stereoscopic effect. Instead pan the cameras sideways (together), or fly into the space (a fly through), or zoom into the space with a telephoto shot (adjusting your camera IPD accordingly) and pan the area, as if you are viewing it with binoculars.

What are you looking at? If you sit in the real world and look around at objects that are close and others that are distant, your eyes track by converging and focusing together on each object in the center of your vision. If I take a stereoscopic photo of the same area I must converge the cameras on only one point. One of the problems with 3D movies is that the effect is so novel that viewers tend to look all around the image on the screen. Instead of looking at the character or object that was intended to be the center of attention, they may be looking at the trees in the distance, or the wallpaper on the wall behind the characters, or the pattern on the floor. If you converge your cameras on an object in the foreground, and the viewer decides to look at objects in the distance, it may be necessary for his eyes to diverge. As we said before, most people cannot diverge their eyes. The result will be either severe eye strain, or they won't be able to lock their eyes on the object they are trying to look at.



The third rule for getting it right: set your cameras to converge on the most distant objects in view, and adjust your separation so that infinity is 2 inches apart at the screen, and let the foreground objects find their own place in that space. Resist the temptation to converge your cameras on the center of attention. If you really want to lock the viewer's attention on one area, then use a depth-of-field effect to blur the rest of the image, so the viewer will not be inclined to look around the area at other things.


I lost it! (Motion tearing). The last problem that is challenging in the creation of stereoscopic 3D is motion tearing. If an object moves across the screen too quickly, or moves in or out of the 3D space too quickly, the viewer's eyes lose track and coordination, and the 3D effect is lost. In the real world we use both eye convergence and focus to track objects and determine their distance. Since focus is locked out of the equation in the theater our brain is trying to figure out what it is seeing with half its input missing. Things we would normally be able to track through a three-dimensional real space can be too fast to follow in the virtual space on the screen. In addition you must know how the images will be projected. The earlier 3D theater systems used two projectors. The left and right eye images were shown simultaneously. The newer systems that are being used intermix the left and right images in time, showing first one, then the other: left, right, left, right, through a single lens with a polarizer that flips for the left and right images.

The fourth rule for getting it right: you must know the timing of the projection system. If the projector is tossing up left and right images sequentially at 24 images per second, that means there will be a 41.7 ms delay between the left image and the right image. If the image rate is 120 Hz then the delay is only 8.3 ms. When objects are stationary or moving slowly on the screen this is not an issue. But when objects move quickly there comes a point when your brain sees the left view in one place, and the right view in another place, but then the left view has moved considerably and your brain cannot pull them together. The result is you see two objects: the stereoscopic 3D effect is lost.
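The delay between the eyes, and how far an object travels across the screen during that delay, can be estimated before any footage is rendered. The sketch below is only a rough guide: the function names and the 120 inches per second object speed are hypothetical, and the point at which a given viewer actually loses the 3D effect still has to be found by previewing.

```python
def eye_delay_ms(images_per_second):
    """Time between a left image and the following right image."""
    return 1000.0 / images_per_second


def travel_between_eyes_in(speed_in_per_s, images_per_second):
    """Distance an object moves on the screen between the left and right image."""
    return speed_in_per_s / images_per_second


for rate in (24, 120):
    delay = eye_delay_ms(rate)                   # about 41.7 ms and 8.3 ms
    travel = travel_between_eyes_in(120, rate)   # hypothetical object speed
    print(f"{rate} images/s: {delay:.1f} ms delay, {travel:.1f} in of travel")
```

At 24 images per second the hypothetical object moves 5 inches between the two views, twice the 2.5 inch shift that spans the whole virtual space in the earlier example, which suggests why fast motion falls apart so easily.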



To prevent motion tearing the best tool is to preview fast-moving scenes with the projection system that will be used in the theater, or one designed to simulate the timing of the theater system. If the image tears, slow down the motion.

There is one other issue that you may have realized on your own while reading this article. We have discussed the creation of a 3D virtual space for a viewer sitting 40 feet from the screen. What happens if someone sits in the back row, 80 feet from the screen? If you optimize the images for the person in the center row, the person in the last row will see an elongated space on the screen. Objects that project out halfway will look 40 feet deep instead of 20 feet. Likewise the person who sits in the front row will perceive a flattened virtual space. I don't have a general rule to address this issue for the theater. Generally speaking animation is more acceptable under this condition. If an animated world is a little distorted in space it doesn't seem to offend us as much as it would in a film with cameras and real actors. Maybe this is part of the reason why most movies released in the last couple of years in 3D format have been animations instead of live action films?
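The stretching and flattening by row can be sketched with the same similar-triangles reasoning used for the viewer at 40 feet. The function name and the 20 foot front row are assumptions for illustration; the model again treats the eyes as two points in front of a flat screen.

```python
def pop_out_depth_ft(seat_ft, crossed_in, ipd_in=2.5):
    """How far in front of the screen a crossed image pair appears.

    crossed_in: how far the left eye image sits to the right of the
    right eye image on the screen, in inches.
    """
    perceived_ft = seat_ft * ipd_in / (ipd_in + crossed_in)
    return seat_ft - perceived_ft


# The same 2.5 inch crossed pair, seen from three different rows.
for seat_ft in (20, 40, 80):
    print(seat_ft, pop_out_depth_ft(seat_ft, 2.5))
# 20 ft seat -> 10.0 ft (flattened)
# 40 ft seat -> 20.0 ft (as designed)
# 80 ft seat -> 40.0 ft (elongated)
```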

This is one area where personal 3D stereoscopic systems, like video eyewear, have an advantage over theater projection systems. The virtual space can be optimized for the field of view and focus distance of the video eyewear, and every viewer will be at the ideal distance from the screen. The drawback is that a movie designed to be projected in a theater may not be ideally formatted for viewing with video eyewear. To optimize stereoscopic 3D for personal viewing all the rules we have discussed have to be reviewed using the characteristics of the video eyewear system (the virtual space it can generate), instead of the large open space of the 3D theater.

About the author: Ken Wittlief is a Senior Engineer at Icuiti Corp, makers of VideoEyewear and other micro display systems. Ken has been working with stereoscopic photography and video systems for over 20 years.

