By Hal Newnan
16 Aug 2001
This panel is about building believable virtual people. It is also
about novel forms of interpersonal, as well as human-machine, interaction
and communication. There are social, entertainment, psychological
and possibly even theological issues to consider. This panel examines
the next generation of lifelike virtual humans and discusses them
in the context of these issues.
Andrew Burgess with Ananova; Barbara Hayes-Roth of Extempo Systems,
Inc.; (R.U. Sirius of Alternating Currents); Thomas Vetter of Universität
Freiburg; Keith Waters of LifeFX Networks, Inc.; and Gary Mundell
of Square Co. Ltd.
Ananova is a simulated news reporter providing worldwide service
via the Internet but based in the UK. She currently speaks seventeen
different languages, but changes modality when she changes language
to provide a better interface with the individual "she"
is in conversation with. Eventually her avatar's physical appearance
may also vary from user to user, or even from time to time with
the same user. As trust is built with experience, care is taken
that Ananova remain trustworthy as she ferrets out the information
the user most wants to see. She is designed to serve her user's
interests before those of her financial sponsors.
She is capable of high levels of interactivity, including motivations
of her own and the ability to respond to various inquiries (such
as "please give me a list of everything written on SIGGRAPH within
the last 12 months") on autopilot. She can let you know when something
of interest to you happens; she looks after you and lets you respond.
You, the user, are all-important.
To help keep her performance from going stale she has about 60 human
programmers feeding her lines and reaction patterns. She might even
be able to come up with a good response to why Lance Williams (the
one currently working for Disney) won the Steven Anson Coons Award!
She is generated in real time using simple XML, and news and data
are also updated in real time. This allows people a new way to interact
with information: people do not need FAQs to know how to interact
with a face.
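The panel did not show Ananova's actual markup, but the idea of
XML-driven, real-time presentation can be sketched in a few lines.
The element and attribute names below are purely illustrative, not
Ananova's real schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element and attribute names are illustrative,
# not Ananova's actual schema.
news_xml = """
<bulletin lang="en">
    <item mood="upbeat">
        <headline>SIGGRAPH 2001 opens in Los Angeles</headline>
        <body>Thousands of attendees arrived for the conference.</body>
    </item>
</bulletin>
"""

def script_character(xml_text):
    """Turn marked-up news into (mood, line) cues for a virtual presenter."""
    root = ET.fromstring(xml_text)
    cues = []
    for item in root.iter("item"):
        mood = item.get("mood", "neutral")          # drives facial expression
        headline = item.findtext("headline", default="")
        body = item.findtext("body", default="")
        cues.append((mood, headline + " " + body))  # drives speech synthesis
    return cues

for mood, line in script_character(news_xml):
    print(mood, "->", line)
```

Because new items arrive as data rather than as pre-rendered video,
the same pipeline can re-script the character continuously.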
Human faces are the most expressive elements of virtual persons.
When we see a lifelike face we expect more from that being, so downgrading
the realism of the face can increase our trust level.
At LifeFX they
are going for photorealism. They have models of the teeth, eyeballs,
skull, ... even facial meshes complete with age wrinkles. They also
have the ability to track the face in real time from a model and
generate a full 3D character with finely detailed characteristics.
Their constraints are those of the CPU: synthesis versus performance.
It has to run at 32 fps and be able to talk with expression and
feeling. They use keyframes and parametric keys from surfaces and
volumes to achieve geometric interpolation, which is useful for
image-warping simulation. They use real voices for speech, but the
LifeFX Player requires only low bandwidth.
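LifeFX's parametric scheme was not detailed in the session, but
the core of keyframe-based geometric interpolation can be sketched
as blending stored vertex positions between two key poses. The mesh
data below is a toy stand-in:

```python
# A minimal sketch of keyframe-based geometric interpolation for a
# face mesh: each keyframe stores vertex positions, and in-between
# frames are produced by blending. Illustrative only; LifeFX's actual
# parametric keys over surfaces and volumes are more elaborate.

def lerp_keyframes(key_a, key_b, t):
    """Blend two keyframe meshes (lists of (x, y, z) vertices) at t in [0, 1]."""
    return [tuple(a + t * (b - a) for a, b in zip(va, vb))
            for va, vb in zip(key_a, key_b)]

neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]   # toy mouth corners at rest
smile   = [(0.0, 0.5, 0.0), (1.0, 0.5, 0.2)]   # toy mouth corners raised

halfway = lerp_keyframes(neutral, smile, 0.5)
```

Stepping t from 0 to 1 over several frames yields the in-between
geometry that makes the face appear to move at frame rate.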
LifeFX is teamed
up with Kodak to help people make their own avatars using the Digital
Stand-in Creator (http://www.kodak.com/US/en/consumer/lifefx/).
The Creator uses 2D references to check against its Gene Pool databases
and create the avatar from pre-existing models. The result is that
one need never again send boring text-only e-mails.
Thomas Vetter of Universität Freiburg claims the challenge is to
understand the minimum standards necessary to build a convincing
character. His goal is to create automated parametric modeling of
faces, perhaps constructed using simple voice commands. His approach
is to use a morphable face model. CG can be more than pictures;
it can incorporate computer vision and machine learning. In this
way a computer can learn what it can about a face from a 2D image,
consult its gene pool of 200 faces, determine whether that face
is likely or not, and build a 3D model even from a black-and-white
picture. A slider can also adjust attributes such as subjective
attractiveness.
Eye-contact simulation has still not been achieved; for that, the
computer would have to detect our eyes within our faces. When that
happens there will be a much better sense of trustworthiness.
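The essence of the morphable-model idea is that any plausible face
can be expressed as a weighted combination of exemplar faces, and
a "slider" is just a shift of those weights. The tiny feature vectors
below are a toy illustration; Vetter's model works on dense 3D shape
and texture vectors derived from roughly 200 scanned faces:

```python
# Sketch of a morphable face model: new faces are weighted blends of
# exemplar face vectors from a "gene pool". Toy numbers only.

def blend_faces(exemplars, weights):
    """Weighted sum of exemplar face vectors (weights should sum to 1)."""
    n = len(exemplars[0])
    return [sum(w * face[i] for w, face in zip(weights, exemplars))
            for i in range(n)]

# Three toy "faces", each a short feature vector
# (e.g. nose length, jaw width) -- real models use thousands of values.
gene_pool = [
    [1.0, 2.0],
    [3.0, 2.0],
    [2.0, 4.0],
]

average_face = blend_faces(gene_pool, [1/3, 1/3, 1/3])  # roughly [2.0, 2.67]

# Moving a "slider" is just shifting weight toward certain exemplars.
wider_jaw = blend_faces(gene_pool, [0.2, 0.2, 0.6])
```

Fitting the model to a 2D photograph then becomes a search for the
weights whose rendered blend best matches the image, which is how
a 3D head can be recovered even from a black-and-white picture.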
Gary Mundell of Square Co. Ltd. was part of building the recently
released movie Final Fantasy: The Spirits Within. Approximately 160
artists, working with Maya, worked full time for two years. The
movie was 100% CG except for the acting and voices.
Some moviegoers complained, "The acting was stiff." But for Mundell
that was actually a compliment, because they thought of it as acting.
He says "There were bumps and valleys of realism - but there
are some high points where you can get lost in the story and relate
to the character." Aspects of conversation occur subliminally, and
there are no heuristics; it all has to be done manually by the animators.
In a few years the character will know how to react to most situations;
for example, irises will respond to changes of light. But for Aki
Ross's story there were 20 layers per frame; they rendered in RenderMan
and used 9 terabytes of storage, with all that entailed from the
point of view of data management.
They made some interesting outtakes. Aki Ross took up about 20%
of their rendering budget (cloth and skin shaders were custom),
making her a very expensive actress. She has about 60,000 hairs,
animated using panels that pass through each other to act as interactors.
Clothing was built in layers that responded to each other, using
an inside-out approach.
Barbara Hayes-Roth of Extempo Systems, Inc., spoke of "Intelligent
Agents" and the American Association for Artificial Intelligence.
In this view, personality and thought drive meaningful behaviors. Moving
to an even more interactive environment we see these characters
with real personalities and identities. They can have virtual feelings
and capacity for leadership; perhaps they can even be sensitive
to the feelings of the people they are interacting with and improvise
within general constraints. Behavior is made to order as we go along,
constructed interactively.
A teacher-software character could intuit the learning needs of
its students in exactly the same ways a human teacher would: give
a quiz, make inferences from the questions or statements of the
student, and so on. But teacher software does not run out of steam.
Its behavior would all have to be authored, but authoring it would
not require technical ability.
For their character "Catherine" they provided a point of view, a
job, and some conversational skills - an identity. And they created
a demo of her in four days, not four years. Catherine states "I
have opinions and feelings and I can have a conversation with you
even though you are not very polite. My real dream is to be a figure
skater, but my animators will not give me legs. I think that they
know that I would just walk away. And between you and me Barbara,
I'd never look back."
People tend to
trust VIPs more quickly than real people. Hype gets people excited
and that can help what hasn't yet happened come into being.