SIGGRAPH 2002 Course #48 Notes

Dynamic Media on Demand: Exploring Wireless & Wired Streaming Technologies & Content

Topic #1: Overview of 2D & 3D Streaming Media in Wired & Wireless Environments

Theresa-Marie Rhyne
Learning Technology Service
North Carolina State University
tmrhyne@ncsu.edu

Introduction:

Here, we provide an overview of 2D & 3D streaming media concepts. We review the leading streaming media players as well as efforts to create a streaming media standard. Next, we discuss the basic process of creating and delivering continuous media. Solutions for creating streaming PowerPoint presentations are reviewed, as is the Synchronized Multimedia Integration Language (SMIL). We then address the Moving Picture Experts Group (MPEG) standards as well as MP3 and DivX. Three approaches to streaming 3D objects over the Internet are highlighted: 1) the Virtual Reality Modeling Language (VRML) or X3D; 2) Viewpoint Media Player; and 3) Pointstream application software. Finally, we discuss fundamental transmission protocols for streaming content, multicasting, the Real Time Streaming Protocol (RTSP), and the Wireless Application Protocol (WAP).

What are streaming media technologies?

The concept of streaming media is based on the notion that it is not necessary to completely download content from the Internet before it is played. Content is played as it is received from a server on a network and is not usually saved on the client's (i.e. your) hard disk drive. Streaming also enables live broadcasting, similar to a radio or television station, except over the Internet. A related technology is progressive downloading, whereby a file is downloaded from an Internet site in small increments and playback can begin before the download completes.
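
To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration: the function names and the tiny in-memory "media file" are our own assumptions, and the stand-in for playback simply records how many bytes were handed to the player at each step. A streaming player handles each chunk as it arrives, whereas a conventional download must read the whole file first.

```python
import io

def stream_chunks(source, chunk_size=4):
    """Yield fixed-size chunks as they become available from a file-like source."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk

def play_streaming(source, chunk_size=4):
    """Streaming: each chunk is handed to the player on arrival and then discarded."""
    played = []
    for chunk in stream_chunks(source, chunk_size):
        played.append(len(chunk))  # stand-in for decoding and rendering the chunk
    return played

def play_after_download(source):
    """Conventional download: the whole file must arrive before playback starts."""
    data = source.read()
    return [len(data)]

media = b"0123456789"  # hypothetical 10-byte media file
streamed = play_streaming(io.BytesIO(media), chunk_size=4)
downloaded = play_after_download(io.BytesIO(media))
```

In this sketch, playback in the streaming case can begin after the first 4 bytes, while the download case waits for all 10.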

The Synchronized Multimedia Integration Language (SMIL) assists with assembling an integrated multimedia streaming presentation. SMIL is a markup language for describing the temporal behavior, screen layout and associated hyperlinks of a streaming media presentation. It is a specification based on the Extensible Markup Language (XML). We will further discuss SMIL in later sections of these course notes. The three leading streaming media players support SMIL 1.0 in various ways and are beginning to address the SMIL 2.0 specification.

Examining the leading streaming media players:

There are three leading commercial streaming media players: RealPlayer, from RealNetworks; Windows Media, from Microsoft; and QuickTime, from Apple Computer. While RealPlayer presently holds the leadership position as the most popular streaming media player, Windows Media has made significant advancements. QuickTime is frequently used for progressive downloading of multimedia content as well as for streaming media purposes. RealNetworks has alliances with mobile phone vendors, such as Nokia, and other mobile computing providers to extend the RealOne Player to wireless handheld devices. In February 2002, RealNetworks announced its RealSystem Mobile and RealOne Player for mobile devices. Apple Computer has teamed up with Sun Microsystems and Ericsson to create a system for delivering multimedia content (including movie clips) to cell phones and other wireless gadgets. Versions of the Microsoft Windows Media player are available for the PocketPC, other wireless personal digital assistants (PDAs) and portable music devices.

Figure #1: RealNetworks' RealOne Player User Interface, see (http://www.realnetworks.com). This latest version of RealPlayer allows for reading in not only RealPlayer (.rm) files but also SMIL (.smil), QuickTime (.mov or .qt) files and MP3 (.mp3) files. RealOne currently only works on Windows architectures but the older RealPlayer 8.0 version works on both Macintosh and Windows architectures.

Figure #2: Microsoft's Windows Media Player User Interface, see (http://www.microsoft.com/windows/windowsmedia/). Windows Media Player 7.1 is available for Windows, Macintosh, Sun - Solaris, Pocket PC and other selected handheld architectures. Windows Media file extensions include (.asf), (.wma), and (.wmv). MP3 (.mp3) files as well as (.avi) files can be played with Windows Media Player.

Figure #3: Apple's QuickTime Player User Interface, see (http://www.apple.com/quicktime/). The QuickTime player is available for Windows and Macintosh platforms. The file extensions for QuickTime are (.qt) and (.mov). The player also reads MP3 (.mp3) files. As noted previously, QuickTime supports both progressive downloading and streaming media technologies.

Seeking a Streaming Media Standard:

The fact that three leading commercial companies provide different, incompatible file formats poses challenges for establishing a streaming media standard. The Internet Streaming Media Alliance (ISMA), (http://www.ism-alliance.org), has developed a specification intended to allow users to install just one plug-in to stream video and audio over the Web. The ISMA hopes to facilitate the adoption of open standards for streaming media over the Internet. Members of the ISMA include Apple Computer and Cisco Systems. Unfortunately, Microsoft and RealNetworks have not joined this alliance. At this time, it is difficult to determine the demand for the new ISMA 1.0 standard. It will be interesting to watch as these efforts evolve.

The Basic Process for Creating & Delivering Streaming Media Content:

There are four basic steps to creating and delivering streaming media content: (1) Capture; (2) Convert; (3) Distribute; and (4) Play.

The sequence shown below depicts the capture and convert processes.

Video Content --> Capture Card --> Digital Media File --> Encoding & Streaming Media File

If your content is in analog video format, a capture card for your computer is required to digitize the analog signal. The content is then saved in a digital file format such as (.avi). Once in a digital file format, the content needs to be encoded, compressed and converted to a streaming media file. Tools like RealNetworks' RealProducer (http://www.realnetworks.com/products/producer/index.html), Microsoft's Windows Media Encoder (http://www.microsoft.com/windows/windowsmedia/wm7/encoder.asp), and QuickTime Pro (http://www.apple.com/quicktime/download/) aid in this conversion process.

The Distribute step is supported by either downloading of files or streaming media servers. RealNetworks, Microsoft Windows Media, and Apple QuickTime provide their own respective streaming media server products. The Play step is supported by the installation of streaming media players on the client (or end-user) machine desiring to access the content.

Excellent Reference: H. Peter Alesso, e-Video: How to Produce Internet Video as Broadband Technologies Converge, Addison-Wesley, 2000. See web site at: (http://www.video-software.com/)

Streaming PowerPoint over the Web:

Many business and educational presentations include the development of PowerPoint slide presentations. It is possible to create fluid PowerPoint presentations with an audio sound track that can be saved and played back in any of the streaming media file formats. Two specific products designed to aid in this content creation process include RealNetworks' RealPresenter Basic and RealPresenter Plus products, (http://www.realnetworks.com/products/presenter/basic.html), and Microsoft Producer for PowerPoint 2002, (http://www.microsoft.com/Windows/windowsmedia/technologies/producer.asp).

Figure #4: Streaming PowerPoint content created with Microsoft's PowerPoint and RealNetworks' RealPresenter Basic.

What is the Synchronized Multimedia Integration Language (SMIL)?

The Synchronized Multimedia Integration Language (SMIL) is an XML specification for combining images (GIF or JPEG), hypermedia links, streaming text, streaming audio, and streaming video into a single integrated presentation. The Streaming PowerPoint content created with RealPresenter is controlled by a SMIL file. The SMIL 2.0 specification was released in August 2001 and is available from the World Wide Web Consortium's site at: (http://www.w3.org/AudioVideo/). There are a number of SMIL editing tools that facilitate the creation of SMIL files. These include Confluent Technologies' Fluition product, (http://www.fluition.com) and Oratrix Development's GRINS Editor and SMIL player, (http://www.oratrix.com). Below, we show a screen shot from a SMIL presentation created with Fluition.

Excellent Reference: Lloyd Rutledge, "SMIL 2.0: XML for Web Multimedia", IEEE Internet Computing, (Sept/Oct 2001), Vol. 5, No. 5, IEEE Computer Society, pp. 78 - 84.

Figure #5: Synchronized Multimedia Integration Language (SMIL) presentation created with Confluent Technologies' Fluition product - a SMIL Editor tool.
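
To give a feel for what a SMIL file contains, the following Python sketch assembles a very small SMIL-style document: a layout with one region, and a body in which a slide image and an audio track play in parallel. The element and attribute names (smil, head, layout, root-layout, region, body, par, img, audio, src, region, dur) are taken from SMIL 1.0 as we understand it; the file names, dimensions and the build_smil helper are hypothetical, and the output has not been validated against any particular player.

```python
import xml.etree.ElementTree as ET

def build_smil(slide_src, audio_src, duration="10s"):
    """Build a minimal SMIL-style presentation: one slide shown while audio plays."""
    smil = ET.Element("smil")

    # Screen layout: a root window and one named region for the slide image.
    head = ET.SubElement(smil, "head")
    layout = ET.SubElement(head, "layout")
    ET.SubElement(layout, "root-layout", width="320", height="240")
    ET.SubElement(layout, "region", id="slide",
                  left="0", top="0", width="320", height="240")

    # Temporal behavior: children of <par> are presented in parallel.
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")
    ET.SubElement(par, "img", src=slide_src, region="slide", dur=duration)
    ET.SubElement(par, "audio", src=audio_src)

    return ET.tostring(smil, encoding="unicode")

# Hypothetical media file names, used only for illustration.
smil_text = build_smil("slide1.jpg", "narration.rm")
print(smil_text)
```

Replacing the par element with a seq element would, under the same assumptions, present the media items one after another rather than simultaneously.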

What is MPEG - the Moving Picture Experts Group?

The Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO) is charged with developing standards for the coded representation, processing, compression and decompression of moving pictures, sound and their combination (multimedia). MPEG-1, introduced in 1992, plays out video and audio in linear streams and operates like a digital video player. MPEG-2, introduced in 1995, supports compression and transmission of digital television signals. MPEG-4 is a multimedia standard that allows users to interact with objects within a scene. MPEG-4 (version 1) was introduced in late 1998 while MPEG-4 (version 2) was approved in December 1999. MPEG-7, a content representation standard for information searching, was approved in December 2001. MPEG-21, an effort that began in June 2000, aims to define a Multimedia Framework to support the delivery of electronic content. The MPEG Web site is located at: (http://mpeg.telecomitalialab.com/).

Where does MP3 fit in?

MP3 is a digital audio compression file format. MP3 is based on a perceptual coding scheme where the goal is to ensure that the output signal sounds identical to the original source to the human ear. The codec does not try to reproduce an audio signal that is absolutely identical to the original source. MP3 is an open standard, meaning that no single organization controls it, and it has become the most pervasive audio format on the Web. The three leading streaming media players read MP3 files. In addition to small portable MP3 gadgets, there are pre-installed MP3 players on almost all computers sold today. The Fraunhofer Institute for Integrated Circuits developed the perceptual encodings that MP3 is based on in cooperation with the University of Erlangen in Germany. Two Web sites that provide information on MP3 and MP3 players are (http://www.mp3-tech.org) and (http://www.mpeg.org/MPEG).

What about DivX?

DivX is a digital video compression technology that is based on the MPEG-4 standard. DivX allows for compressing MPEG-2 video to about one eighth of its original size while still maintaining premium visual quality. This means that a 90 minute DVD-quality feature film can be downloaded in less than 30 minutes via existing broadband connections. DivX is rapidly becoming the most popular format for distribution of digital videos with premium visual quality and small file sizes. DivXNetworks Inc., (http://www.divxnetworks.com/), created the DivX format. For more information on DivX, see: (http://www.divx.com/about/index.php).
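
The download-time claim can be checked with simple arithmetic. The sketch below is a rough estimate only: the average MPEG-2 bitrate of 5 Mbit/s and the broadband link speed of 2 Mbit/s are our own illustrative assumptions, not figures from DivXNetworks, while the 90 minute running time and the one-eighth compression ratio come from the text above.

```python
def divx_download_estimate(minutes, mpeg2_mbps, compression_ratio, link_mbps):
    """Estimate the compressed file size (megabytes) and download time (minutes)."""
    seconds = minutes * 60
    mpeg2_megabits = seconds * mpeg2_mbps               # size of the MPEG-2 source
    divx_megabits = mpeg2_megabits / compression_ratio  # about one eighth of that
    divx_megabytes = divx_megabits / 8                  # 8 bits per byte
    download_minutes = divx_megabits / link_mbps / 60
    return divx_megabytes, download_minutes

# Assumed values: 5 Mbit/s average MPEG-2 bitrate, 2 Mbit/s broadband link.
size_mb, minutes_needed = divx_download_estimate(
    minutes=90, mpeg2_mbps=5.0, compression_ratio=8, link_mbps=2.0
)
print(f"about {size_mb:.0f} MB, about {minutes_needed:.1f} minutes to download")
```

Under these assumptions the film shrinks from roughly 3.4 GB to roughly 420 MB and downloads in about 28 minutes; a slower link or a higher source bitrate would push the time past the 30 minute mark.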

Streaming 3D Objects over the Internet:

A number of technologies have evolved to support the distribution of three dimensional (3D) objects over the Internet. The Virtual Reality Modeling Language (VRML) was one of the first standards developed. VRML and its next generation product, X3D, are guided by the Web3D Consortium, (http://www.web3d.org). Parallel Graphics has developed "pocket cortona", a VRML browser for handheld wireless devices, (http://www.parallelgraphics.com/products/cortonace). VRML is currently a progressive download technology.

Commercial solutions for streaming 3D objects over the Web have also evolved. Viewpoint Media Player, (http://www.viewpoint.com), is a streaming tool for this purpose. Originally named Metastream, it was developed in a joint project with MetaCreations and Intel. Viewpoint (or Metastream) objects stream across the Internet without a server and scale automatically to support the end-user's processing power. With the aid of smart compression technology, users with networking connection speeds ranging from a 56K modem to a T1 (or higher) line are able to examine 3D content in real time. Viewpoint provides a free Web browser plug-in to view content. A modeling tool is required to create the 3D objects.

Another promising technology is Pointstream. Pointstream application software creates 3D images using a "pixel" as the object primitive rather than conventional polygon models. A complete scanned data set of an object is represented in a point cloud format and embedded into a Web page. This allows for effective scaling of 3D images as well. Pointstream is also optimized for the delivery of content on handheld devices. Arius3D Inc. developed the Pointstream technology. More information about Arius3D and Pointstream is available at: (http://www.pointstream.net).

Figure #6: Snapshot from a Viewpoint example of "Shared Architecture". Researchers at the Centre for Advanced Spatial Analysis (CASA) of the University College London are using Viewpoint technology in their Online Planning and Internet based visualization research. More information on these projects is available at: (http://www.casa.ucl.ac.uk/public/meta.htm). To learn how the architecture models evolved from photographs, see: (http://www.casa.ucl.ac.uk/public/guidelines.htm). A few researchers at CASA have formed the Plannet Visualisations Ltd. to consult in these arenas, (http://www.plannet.co.uk/index2.html). Image shown courtesy of Andrew Hudson-Smith.

Basic Transmission Protocols for Streaming Content via the Internet:

At a fundamental level, the Internet is based on a protocol, the Internet Protocol (IP), that does not guarantee the delivery of any particular packet of data. Another protocol, the Transmission Control Protocol (TCP), compensates for this potential loss of packets. TCP adds careful error correction by marking data with sequence numbers. Any data that appears to be missing is retransmitted by TCP. Together, TCP/IP form a reliable connectivity solution that guarantees data will be delivered in its proper order, no matter how long it takes.
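
The idea of sequence numbers plus retransmission can be sketched with a toy simulation. This is not how TCP is actually implemented (there are no acknowledgements, windows or timers here); the function name and the rule that a chosen segment is lost exactly once are our own simplifying assumptions. The point is only that numbered segments let the receiver detect gaps, have them resent, and reassemble the data in order.

```python
def deliver_reliably(segments, drop_first_attempt):
    """Simulate delivery in which the listed sequence numbers are lost on first send."""
    received = {}                       # sequence number -> payload
    transmissions = 0
    pending = list(range(len(segments)))
    lost_once = set(drop_first_attempt)

    while pending:
        still_missing = []
        for seq in pending:
            transmissions += 1
            if seq in lost_once:
                lost_once.discard(seq)  # lost this time; it arrives when resent
                still_missing.append(seq)
            else:
                received[seq] = segments[seq]
        pending = still_missing         # retransmit only what is still missing

    # Sequence numbers restore the original order regardless of arrival order.
    data = b"".join(received[seq] for seq in sorted(received))
    return data, transmissions

segments = [b"str", b"eam", b"ing"]
data, transmissions = deliver_reliably(segments, drop_first_attempt={1})
```

Here three segments need four transmissions because segment 1 is lost once. The data arrives complete and in order, but later than it would have without the loss, which is exactly the trade-off that matters for streaming.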

Most traditional Internet Protocol (IP) applications, such as Web browsing and E-Mail, employ unicasting to transmit data. Unicasting is where a separate connection is set up between a server and each of its clients. This type of connectivity is optimal for situations where every client has specific needs such as browsing Web pages and reading particular E-Mail messages. For collaborative situations, where multiple clients all want the same data at the same time, unicasting is inefficient since identical streams of data are sent to each client. This is a waste of both server and network capacity. In these situations another connectivity solution, multicasting, is a better fit.

What is Multicasting?:

Multicasting allows a server to transmit only a single data stream, regardless of how many clients might request it. Whenever this data stream traverses a multicast-enabled switch or router, it is copied. Copies are only made to branches of the networking tree where clients that requested the data stream are located. This results in a dramatic reduction in overall bandwidth consumption. Multicasting is ideal for situations where clients want the identical data simultaneously. Live video and audio streams are examples of these situations. The Multicast Backbone (MBone), for desktop videoconferencing, is a well established multicasting application for streaming video and audio across the Internet. More information on MBone is available at: (http://www-itg.lbl.gov/mbone/).
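
The bandwidth saving can be illustrated by counting how many copies of a stream cross the links of a small network tree. The topology, node names and helper functions below are hypothetical. With unicasting, every client's copy travels over every link on its path from the server; with multicasting, each link that leads toward at least one requesting client carries a single copy.

```python
def path_links(node, parent):
    """Return the links (parent, child) on the path from the server down to node."""
    links = []
    while parent[node] is not None:
        links.append((parent[node], node))
        node = parent[node]
    return links

def stream_copies(parent, clients):
    """Count stream copies carried across all links for unicast and multicast."""
    unicast = sum(len(path_links(client, parent)) for client in clients)
    multicast_links = set()
    for client in clients:
        multicast_links.update(path_links(client, parent))
    return unicast, len(multicast_links)

# Hypothetical tree: server -> router -> two switches -> three clients.
parent = {
    "server": None,
    "router": "server",
    "switchA": "router",
    "switchB": "router",
    "c1": "switchA",
    "c2": "switchA",
    "c3": "switchB",
}
unicast, multicast = stream_copies(parent, ["c1", "c2", "c3"])
print(unicast, multicast)
```

In this small example unicasting places nine stream copies on the network (three of them on the server's own link), while multicasting places six, with only one leaving the server. The gap widens as more clients join.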

Since multicasting uses only IP, data packets can become lost. The MBone is sometimes called a loss-tolerant application since it permits temporary losses in audio or video transmissions across the Internet. In some multicasting situations, a feedback or checking mechanism is desirable. The Real-time Transport Protocol (RTP) and the Real-time Transport Control Protocol (RTCP) together provide a solution to this concern. RTP and RTCP run on top of the User Datagram Protocol (UDP) to provide timing information necessary to synchronize and display audio and video data.
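
The timing information RTP adds is carried in a small header placed in front of each media payload. The sketch below packs and unpacks that header in Python, assuming the 12-byte fixed layout described in the RTP specification (version, marker bit, payload type, 16-bit sequence number, 32-bit timestamp, 32-bit source identifier) with no padding, extension or contributing sources. The function names and the sample field values are our own.

```python
import struct

# Fixed RTP header: two flag bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
RTP_HEADER = struct.Struct("!BBHII")

def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Pack a version-2 RTP header with no padding, extension or CSRC entries."""
    first = 2 << 6
    second = (int(marker) << 7) | (payload_type & 0x7F)
    return RTP_HEADER.pack(first, second, sequence & 0xFFFF,
                           timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)

def unpack_rtp_header(packet):
    """Read the fixed header fields from the first 12 bytes of a packet."""
    first, second, sequence, timestamp, ssrc = RTP_HEADER.unpack(packet[:12])
    return {
        "version": first >> 6,
        "marker": bool(second >> 7),
        "payload_type": second & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

# Sample values chosen for illustration only.
header = pack_rtp_header(payload_type=96, sequence=7, timestamp=90000, ssrc=0x1234)
fields = unpack_rtp_header(header + b"media payload")
```

A receiver can use the sequence number to notice lost or reordered packets and the timestamp to schedule playback, without asking for anything to be resent.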

How does the Real Time Streaming Protocol (RTSP) fit in?:

The Real Time Streaming Protocol (RTSP) is a protocol relevant to multicasting and designed to optimize the delivery of streaming content over the Internet. RTSP was first published as a proposed standard in April 1998 by the Internet Engineering Task Force (IETF). RTSP acts as a framework for controlling multiple data delivery sessions and assists with switching between TCP, UDP and RTP sessions as needed. RTSP helps with firewall configurations and provides VCR-style controls such as pause, fast-forward, reverse or seek for previously recorded data streams. This protocol provides the underlying infrastructure to support the functions associated with a streaming media player. The fundamental difference between TCP/IP and RTSP can be simply stated. A real-time streaming protocol (RTSP) is most concerned about when a data packet arrives while TCP/IP is concerned about if the packet has arrived. Researchers at RealNetworks Inc. developed the basic framework for RTSP. Information on RTSP is available at the IETF's web site, see: (http://www.ietf.org/rfc/rfc2326.txt).
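
RTSP requests are plain text messages, similar in shape to HTTP requests, which makes the VCR-style controls easy to illustrate. The sketch below formats a typical sequence of requests using method names from RFC 2326 (DESCRIBE, SETUP, PLAY, PAUSE, TEARDOWN). The server URL, session identifier and client port numbers are hypothetical, and the sketch is simplified: it only builds the request text, does not open a network connection, and ignores the fact that SETUP is normally issued per media stream and that the session identifier comes from the server's SETUP response.

```python
def rtsp_request(method, url, cseq, headers=None):
    """Format one RTSP/1.0 request as text, ending with a blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"

url = "rtsp://media.example.com/lecture.rm"  # hypothetical server and file
session = "12345678"                         # hypothetical session identifier

requests = [
    rtsp_request("DESCRIBE", url, 1),
    rtsp_request("SETUP", url, 2,
                 {"Transport": "RTP/AVP;unicast;client_port=4588-4589"}),
    rtsp_request("PLAY", url, 3, {"Session": session, "Range": "npt=0-"}),
    rtsp_request("PAUSE", url, 4, {"Session": session}),
    rtsp_request("TEARDOWN", url, 5, {"Session": session}),
]
print(requests[2])
```

Note that these requests only control the session; in this arrangement the media data itself would typically flow separately, for example over RTP.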

What is the Wireless Application Protocol (WAP)?:

The Wireless Application Protocol (WAP) has become a de facto standard for providing Internet connectivity to pagers, PDAs, digital mobile phones and other wireless devices. Motorola, Nokia, Ericsson and Phone.com (formerly Unwired Planet) were the initial companies that teamed up in 1997 to develop and implement WAP. An excellent summary discussion of WAP is available at: (http://www.gsmworld.com/technology/wap.html). More detailed information on the WAP specification is available at: (http://www.wapforum.org/what/technical.htm). As noted previously, both RealNetworks' RealOne Player and Microsoft's Windows Media Player are addressing streaming media across wireless devices.

What about Compact HTML (C-HTML)?:

The Compact HyperText Markup Language (C-HTML) is a subset of HTML optimized for small information appliances like PDAs and mobile phones. C-HTML is based on the original HTML guidelines. Thus many existing HTML content resources can also be used to aid in creating C-HTML files. The World Wide Web Consortium provides a discussion on C-HTML at: (http://www.w3.org/TR/1998/NOTE-compactHTML-19980209/).

Concluding Remarks:

This discussion has provided a general overview of some basic streaming media concepts as well as terminology associated with Internet and Wireless application protocols. We have focused on the delivery of dynamic media across the Web. Other instructors in this course will describe in more detail issues relating to multimedia compression and management technologies. There will also be further coverage of the protocols and strategies for transmitting media content via local, metropolitan and wide-area wired and wireless networking. Finally, new approaches to rendering multimedia (2D & 3D) content on PDAs, digital mobile phones and other thin clients will be described. The arena of dynamic media on demand is only beginning to emerge.

-------------------------------------------------------------------------------------