Collaborative Computing & Integrated Decision Support Tools for Scientific Visualization

Theresa Marie Rhyne
Lockheed Martin Technical Services
U.S. EPA Scientific Visualization Center
86 T. W. Alexander Drive
Research Triangle Park, North Carolina 27711
trhyne@vislab.epa.gov

Introduction

These notes examine the concepts of collaborative computing and integrated decision support tools. There are five sections: 1) The Three Classes of Visualization Tasks; 2) Customizing Software for Analysis & Decision Making; 3) Multi-Variant Physical & Natural Sciences Visualization; 4) Collaborative Computing and the Three Stages of Metacomputing; and 5) Looking on the Horizon - Integrated Decision Support Tools.

I. The Three Classes of Visualization Tasks

In dealing with scientific data sets, there are three classes of visualization tasks that are independent of data or technique: a) analysis and exploration; b) decision support; and c) presentation.

I. a) Analysis and Exploration

The analysis and exploration tasks focus on examining data sets. These data sets can include remotely sensed data and site observations as well as large scale computational output from supercomputers. For air quality and water quality modeling efforts, typical visualization tasks include comparing emission inventories (the data inputs to a model) with the data output that results from executing the model. In exploring subsurface contamination, data associated with in-situ observations is frequently combined with the generation of a three-dimensional isosurface of the contaminated zone.
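
As an illustration of the isosurface approach, the sketch below extracts a three-dimensional contamination surface from a gridded concentration field. This is a minimal Python example, assuming the scikit-image library is available; the synthetic plume and the threshold value are stand-ins for real interpolated in-situ data and an actual regulatory level.

    # Minimal isosurface sketch (assumes scikit-image is installed).
    import numpy as np
    from skimage import measure

    # Synthetic stand-in for a gridded subsurface concentration field
    x, y, z = np.mgrid[-2:2:40j, -2:2:40j, -2:2:40j]
    concentration = np.exp(-(x**2 + y**2 + 2.0 * z**2))  # plume-like shape

    THRESHOLD = 0.5  # illustrative contamination level, not a standard
    verts, faces, normals, values = measure.marching_cubes(concentration,
                                                           THRESHOLD)
    print(f"Isosurface: {len(verts)} vertices, {len(faces)} triangles")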

Visualization is also used to calibrate the computational algorithms that are components of large computer models. Here, interactive visualization tools are helpful for gaining insight into the impacts of modifying algorithms. Many of these issues were already highlighted in the first and second sections of this course.

Figure #1: Example of an Analysis and Exploration visualization of the sediment concentrations in Lake Erie resulting from a large storm - a collaboration between researchers at the University of California at Santa Barbara, the U.S. EPA Large Lakes Research Station in Grosse Ile, Michigan, and the U.S. EPA Scientific Visualization Center.

I. b) Decision Support

Visualization techniques also assist with the physical and natural sciences decision making process. At the U.S. Environmental Protection Agency (U.S. EPA), visualization is used by the Office of Air Quality Planning and Standards as a visual display tool for developing air quality standards, policies and procedures. The U.S. EPA Great Lakes National Program Office uses visualization as an inquiry and decision support tool for water quality and ecosystem analyses.

These activities require the customization of visualization software to support policy decision making efforts. This includes the creation of color legends and titling tools that are linked into the visual display. These tools are interactive and usable by policy analysts. Customized point and click user interfaces to visualization tools were also developed.

Figure #2: Example of a Decision Support visualization of the sediment concentrations in Lake Erie resulting from a large storm. Here the various components of the computational model output (wind velocity, sediment concentrations, erosion & deposition, and depth) are depicted as individual layers.

I. c) Presentation

There is also a need to develop visualizations and animation sequences that educate the general public and inform high level decision makers about physical and natural sciences concerns. These presentation visualizations often require the use of high-end animation tools. The final product is often a polished production with voice-over narration and background music sound tracks.

Figure #3: Example of a Presentation visualization of the sediment concentrations in Lake Erie resulting from a large storm. Here arrows depict the wind direction. This image appears in the Federal publication High Performance Computing and Communications: Toward a National Information Infrastructure (1994). (Customized tube code for wind vector display written by Mark Bolstad.)

I. d) The Role of Renaissance Teams

The development and usage of tools that support these three classes of visualization usually involve collaborative efforts among scientists, policy analysts, artists, programmers and other expert staff. Such a group is often called a Renaissance Team.

Reference: Cox, Donna, "Renaissance Teams and Scientific Visualization: A Convergence of Art and Science", Collaboration in Computer Graphics Education Course #29, (ACM/Siggraph, July 1988) pp. 81 - 104.

At the U.S. EPA, the Renaissance Team approach is applied to visualization toolkit development for collaborative computing. The composition of the team includes: a) environmental and computational scientists in an EPA research laboratory; b) policy analysts and computational scientists in an EPA program office; c) computational model builders; and d) visualization specialists. The goal is to build tools that scientists and policy analysts can use for the daily examination and visual display of physical and natural sciences data. Current efforts include the transfer of this technology beyond the Federal government to State environmental protection agencies.

The wide usage of visualization tools allows for collaborative teams that support multi-disciplinary research activities in the physical and natural sciences. The next section of these course notes highlights efforts to customize visualization software for exploring multi-variant data sets.

II. Customizing Software for Analysis & Decision Making
(A First Step in Developing Collaborative Computing Tools)

Although standard visualization software is effective in developing initial displays of physical and natural sciences data, some customization is usually required to support both analysis and decision making. Customizing software encompasses the development of user interfaces that support collaborative computing and easy access to integrated decision support tools. Some of these issues are highlighted below.

Reference: Rhyne, Theresa, Mark Bolstad, Penny Rheingans, Lynne Petterson and Walter Shackelford, "Visualizing Environmental Data at the EPA", IEEE Computer Graphics and Applications, Vol. 13, No. 2, (March 1993), pp. 34 - 38.

II. a) Spatial Context

There are several factors that influence the visual representation of physical and natural sciences data. These include: the type of data, relationships among different components of a data set, placement of the data in a spatial and temporal context, and interpretation of the data.

Frequently, earth sciences data is geographically registered. As a result, a map of the geographic domain is a helpful visual aid to provide spatial context for the data. Advanced principles of cartography can also be applied to develop more sophisticated projections for mapping coordinate systems. At the U.S. EPA, we are currently exploring methods to integrate our Geographic Information Systems (GIS) tools with Scientific Visualization software to create a comprehensive software environment for the visual display of geographically registered environmental data sets.
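
As a sketch of placing geographically registered data in spatial context, the following Python fragment projects longitude/latitude positions into a Lambert conformal conic map coordinate system. It assumes the pyproj library; the projection parameters and site coordinates are illustrative, not those of any particular EPA model grid.

    # Projecting geographic coordinates for map display (assumes pyproj).
    import numpy as np
    from pyproj import Proj

    # Lambert conformal conic, a common choice for mid-latitude domains
    lcc = Proj(proj="lcc", lat_1=33, lat_2=45, lat_0=40, lon_0=-97)

    lons = np.linspace(-90, -78, 5)   # illustrative site longitudes
    lats = np.linspace(35, 45, 5)     # illustrative site latitudes
    x, y = lcc(lons, lats)            # projected map coordinates (meters)
    for lon, lat, xi, yi in zip(lons, lats, x, y):
        print(f"({lon:7.2f}, {lat:5.2f}) -> "
              f"({xi/1000:9.1f} km, {yi/1000:9.1f} km)")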

Spatial context is also important for examining other types of physical and natural sciences data sets. In the realm of computational chemistry, merging a molecular visualization with a traditional line drawing diagram of the molecule's structure establishes a baseline for decision making. In examining air flow in and around buildings, developing a three-dimensional display of the building is helpful. The level of detail depicted in the three-dimensional characteristics of the building depends on the granularity of the computational model of air flow. If the computational model is attempting to examine general air flow patterns around a building, a simple cubic representation may be all that is required. However, if the computational model is examining particle tracing associated with air flow inside the building, a very detailed architectural rendering of the interior of the building might be desired. The challenge for the detailed architectural rendering approach might involve merging Computer Aided Design (CAD) systems with Scientific Visualization tools.

II. b) Simple Visual Cues

In air quality and water quality visualizations where concentration levels of pollutants and times of exposure are critical, visual cues that describe these changing activities are important. Color bars and legends are helpful for these purposes. At the U.S. EPA, we have often customized visualization software to support environmental researchers' and policy analysts' needs to depict several emission scenarios for developing air and water quality guidelines. Here, the ability to support multiple color maps and discrete color mapping functions in a single visualization/animation sequence becomes important.

Complex air and water quality computational models often examine multiple pollutants for a given scenario. Thus, the data sets from these kinds of computational runs include multiple chemical species examined across multiple atmospheric or water layers for episodes lasting over 100 or more time steps for a given geographic domain. Visualization tools that support labeling and titling functions are helpful here. Time clocks and counters are also effective visual cues for these animation sequences.

To support the analysis and decision making process, we have often used discrete color maps (originally developed for printouts from computer plotting devices). Cool hues (e.g. dark purple and blue) indicate low concentrations while warmer tones (e.g. orange and red) denote higher values. In some circumstances, yellow is used to indicate a midrange value in the data where air quality or water quality standards are exceeded.
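
The discrete cool-to-warm mapping described above can be sketched in a few lines. The example below uses Python with matplotlib; the concentration breakpoints, the yellow exceedance band, and the data are all illustrative rather than actual EPA standards or model output.

    # Discrete color map with an exceedance band (assumes matplotlib).
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.colors import ListedColormap, BoundaryNorm

    levels = [0, 20, 40, 60, 80, 120]            # illustrative bins (ppb)
    colors = ["#3b0f70", "#2c7fb8", "#7fcdbb",   # cool hues: low values
              "#ffff00",                         # yellow: standard exceeded
              "#f03b20"]                         # red: highest values
    cmap = ListedColormap(colors)
    norm = BoundaryNorm(levels, cmap.N)

    data = np.random.uniform(0, 120, size=(25, 25))  # stand-in model output
    plt.pcolormesh(data, cmap=cmap, norm=norm)
    plt.colorbar(ticks=levels, label="pollutant concentration (ppb)")
    plt.title("Episode hour 42")                     # simple titling cue
    plt.show()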

II. c) User Interface Design - Distributed Networks

In developing visualization tools for scientists and policy analysts, it cannot be expected that all or even most decision makers will learn visual programming. As a result, customized and pre-established visual programming modules and networks need to be created that support the visual display of output from physical and natural sciences models and data sets. As mentioned in the previous section, data output from air and water quality models can consist of multiple chemical species examined across multiple air or water layers for a given episode having 100 or more time steps. Designing effective user interfaces that allow decision makers to visually examine these types of data sets is one of the challenges we are presently addressing at the U.S. EPA. We have used the widget and button tools of visual programming environments (often encompassed in visualization toolkit packages) to build user interfaces which are linked to pre-established visualization networks.

Figure #4: User Interface to the U.S. EPA High Performance Computing & Communications Program's photochemical modeling system. Here two pollutant scenarios and the resulting difference are visualized simultaneously.
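
A minimal sketch of this widget-to-network linkage follows, written in Python with the standard tkinter toolkit standing in for a visualization package's own widget set. The species names, layer numbers, and the run_network stand-in are hypothetical.

    # Point-and-click front end driving a pre-established network (sketch).
    import tkinter as tk

    SPECIES = ["O3", "NO2", "SO2"]   # hypothetical chemical species
    LAYERS = [1, 2, 3, 4, 5]         # hypothetical atmospheric layers

    def run_network(species, layer, step):
        # Stand-in for triggering the pre-built visualization network
        print(f"Rendering {species}, layer {layer}, time step {step}")

    root = tk.Tk()
    root.title("Pollutant episode browser (sketch)")
    species_var = tk.StringVar(value=SPECIES[0])
    layer_var = tk.IntVar(value=LAYERS[0])
    tk.OptionMenu(root, species_var, *SPECIES).pack()
    tk.OptionMenu(root, layer_var, *LAYERS).pack()
    step = tk.Scale(root, from_=0, to=120, orient="horizontal",
                    label="time step")   # episodes of 100+ steps
    step.pack()
    tk.Button(root, text="Visualize",
              command=lambda: run_network(species_var.get(),
                                          layer_var.get(),
                                          step.get())).pack()
    root.mainloop()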

EPA researchers frequently execute their computational models on the Cray supercomputer remotely located at the National Environmental Supercomputing Center (NESC) in Bay City, Michigan. The resulting data frequently requires mass storage so that it can be retrieved at a later point in time, after the computational models have executed. As a result, our customized visualization modules and networks address distributed network processing and remote module execution. Thus, there is a need to build visualization networks that combine modules from a heterogeneous group of compute engines, storage systems, and workstations. The environmental decision maker operates this distributed visualization network from a user interface located on a Unix workstation which supports the graphical display of the data.
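
The fragment below sketches one way such a distributed network can be organized: a data-subsetting module executes on the remote mass storage host, and only the reduced slice crosses the network to the analyst's workstation. It is written in Python; the host name and the extract_slice command are hypothetical, standing in for an actual remote visualization module.

    # Remote module execution sketch (host and command are hypothetical).
    import subprocess
    import numpy as np

    REMOTE_HOST = "archive.example.gov"             # mass storage system
    REMOTE_CMD = ["extract_slice", "--run", "episode42",
                  "--species", "O3", "--layer", "1", "--step", "12"]

    def fetch_remote_slice():
        # Run the subsetting module near the data; stream bytes back.
        result = subprocess.run(["ssh", REMOTE_HOST] + REMOTE_CMD,
                                capture_output=True, check=True)
        return np.frombuffer(result.stdout, dtype=np.float32)

    slice_data = fetch_remote_slice()
    print(f"Received {slice_data.size} grid cells for local rendering")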

II. d) The Need for Collaborative Computing

Once visualization networks are built and user interfaces designed, there is a need to provide physical and natural scientists with the capability to share visual information in real time. Often researchers are located at sites geographically remote from one another. In these cases, the real time sharing of visual information requires the usage of high speed networking. Thus begins the journey of collaborative computing.

III. Multi-Variant Physical & Natural Sciences Visualization

An important initiative of the U.S. High Performance Computing and Communications Program involves Grand Challenge research efforts that attempt to examine the multi-variant concerns of physical and natural sciences problems. For the environmental and earth sciences, this encompasses the merger of air, water and subsurface data sets into single visualization presentations. These efforts involve cooperation among physical and natural scientists located at research sites across the United States and abroad. (See Figure III-2 for an example of a multi-variant visualization.)

Some of the system design issues associated with collaborative examination of multi-variant data types include: data format standards; data management; graphics-client software; and tracking & steering functions for collaborative efforts. Historically, air quality, water quality and subsurface computational models were developed independently of each other. As a result, the data output format structures differ. Determining a common data output format is a part of the collaborative process. This includes determining the appropriate time step value for animation sequences. Some data sets might animate according to hourly time steps while others might change over daily or monthly time periods.
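
As a small example of the time step question, the sketch below averages an hourly series to daily values so it can animate in step with a data set that only changes daily. It uses Python with pandas; the dates and values are synthetic.

    # Reconciling animation time steps across models (assumes pandas).
    import numpy as np
    import pandas as pd

    hours = pd.date_range("1994-07-01", periods=240, freq="h")
    air_hourly = pd.Series(np.random.uniform(20, 120, len(hours)),
                           index=hours)   # stand-in hourly concentrations

    # Common animation clock: one frame per day
    air_daily = air_hourly.resample("D").mean()
    print(air_daily.head())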

There are many situations where simulation codes have already been executed but there remains a need for collaborative analyses of the computational output. This analysis function allows two or more scientists at remote locations to simultaneously view the computational output and pass control of the interactive analysis to each other, enabling question-and-answer exchanges, mutual clarification, or expert-to-novice advice on the interpretation of the data in question. Data storage and retrieval mechanisms become important for these situations.
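
The control-passing idea can be modeled with a simple token: only the collaborator holding it may steer the shared view, while every participant sees the same commands. The toy Python class below is an in-process illustration of the concept, not a network protocol implementation.

    # Toy model of passing interactive control between collaborators.
    class SharedSession:
        def __init__(self, participants):
            self.participants = list(participants)
            self.holder = self.participants[0]    # initial controller

        def steer(self, who, command):
            if who != self.holder:
                raise PermissionError(f"{who} does not hold control")
            print(f"{who} -> all viewers: {command}")

        def pass_control(self, to):
            assert to in self.participants
            self.holder = to

    session = SharedSession(["expert", "novice"])
    session.steer("expert", "rotate isosurface 30 degrees")
    session.pass_control("novice")
    session.steer("novice", "step animation forward")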

The Tecate Visualization System, developed at the San Diego Supercomputer Center, is a software environment that supports exploratory visualization of data collected from networked data systems. A simple World Wide Web interface accesses and stores earth sciences data in a database management system. This visualization management system is intended to extend beyond the typical database management environment by storing information on how to visualize the data with the data itself. For collaborative computing, this approach will allow scientists to return to their data sets with a "record" of previous visualization efforts. This record is helpful when exploring multi-variant data sets which have not previously been combined.

Reference: Kochevar, Peter, "The Tecate Visualization System", (http://www.sdsc.edu/Tecate/tecate.html), 1996.
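
The "record" idea can be approximated even without a full visualization management system: store a small recipe describing how the data was last visualized alongside the data itself. The Python sketch below writes such a record as a JSON sidecar file; the field names are invented for illustration and are not Tecate's actual schema.

    # Storing "how to visualize" with the data (illustrative schema).
    import json

    recipe = {
        "dataset": "lake_erie_sediment_1994.nc",
        "technique": "pseudocolor slice",
        "variable": "sediment_concentration",
        "levels": [0, 20, 40, 60, 80, 120],
        "colormap": "discrete-cool-warm",
        "camera": {"azimuth": 45, "elevation": 30},
    }
    with open("lake_erie_sediment_1994.vis.json", "w") as f:
        json.dump(recipe, f, indent=2)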

IV. Collaborative Computing and the Three Stages of Metacomputing

Collaborative computing involves facilitating information discovery and scientific visualization activities among researchers located at various remote sites. It includes the use of visualization and information retrieval in a high speed networked environment. Computing resources become transparently available to researchers via the networked environment, and the result is a metacomputer: a network of heterogeneous computational resources linked by software in such a way that these linked resources can be used as easily as a personal computer. For any one research project, a scientist might use a desktop workstation, a remote supercomputer, a mainframe supporting a mass storage archive of data, and a specialized high performance graphics workstation. Three stages of metacomputing are outlined in the following discussion.

IV. a) Metacomputing Stage 1 - Building Access Tools

The first stage of effective collaborative computing is primarily a software and hardware integration effort. It involves connecting all of the metacomputing resources with high-speed networks, implementing a distributed file system, coordinating researchers' access across the various computational elements, and creating seamless software access to the computing technology. World Wide Web (Web) browsers are examples of software that support information discovery and its visual display in a metacomputing environment. Tools like Netscape are hypertext windowing systems and are available for the X Window System, Apple Macintosh and Microsoft Windows environments. With appropriate graphics hardware and software, it is possible to access, display and run animation files and explore 3-D worlds.

Figure #5: Example of a Web page for collaborative geographic visualization & integration (http://www.epa.gov/gisvis). This Web site was developed for the U.S. EPA by Lockheed Martin.

Another important component of these collaborations involves usage of the Multicast Backbone (MBone) on the Internet. The MBone provides scientists access to video-conferencing type capabilities from their appropriately configured desktop workstations. These multi-media tools allow scientists located at geographically remote sites to interact in real time and share visual information.

Reference: Macedonia, Michael R. and Donald P. Brutzman, "MBone Provides Audio and Video Across the Internet", Computer, IEEE Computer Society, Vol. 27, No. 4, (April 1994), pp. 30 - 36.

MBone software tools are in the public domain. An original suite of tools was developed for Unix workstations by researchers at Lawrence Berkeley National Laboratory. The Microsoft Bay Area Research Center has subsequently developed freeware MBone tools for Windows 95 & NT platforms. Researchers at Apple Computer developed prototype freeware MBone software for Macintosh systems. We note these respective web sites below:

(http://www.lbl.gov/ctl/vconf-faq.html)
(http://www.research.microsoft.com/research/BARC/mbone/)
(http://qttv.quicktime.apple.com/qttv/qttv.RTP.html)

Figure #6: Example of MBone session showing application tools nv (network video), vat (visual audio tool), wb (whiteboard), and sd (session directory). This session occurred at the Monterey Bay Aquarium Research Institute. Image courtesy of Don Brutzman at the Naval Postgraduate School.

IV. b) Metacomputing Stage 2 - Computing in Concert

The second stage of collaborative computing focuses on spreading components of a single research application across several computers. This permits a center's heterogeneous collection of computers to work in concert on a single problem. Software that supports collaborative visualization by researchers at remote sites is just now emerging. One example of a prototype interactive scientific data analysis and visualization system was built at NASA's Jet Propulsion Laboratory (JPL). The Linked Windows Interactive Data System (LinkWinds) is designed to support the two- and three-dimensional visualization of multi-variable earth and space sciences data sets. LinkWinds supports networked collaborative data analysis. The graphical user interface (GUI) is based on the X Window System, while the computer graphics rendering functions rely on the Silicon Graphics, Inc. (SGI) OpenGL specification. LinkWinds is designed to handle direct access to a variety of data formats. This allows for the merger and visual display of data sets from multiple computational sources and scientific disciplines. The networking functions of LinkWinds do not rely upon the X Window networking facilities. Instead, the implementation (based on MUSE) transmits only individual control values and button or menu selections. This reduces the sizable stream of commands which can result under the X Window networking facilities.

Figure #7: Example session from the Linked Windows Interactive Data System (LinkWinds) developed at the NASA Jet Propulsion Laboratory. Image courtesy of Bud Jacobson. LinkWinds WWW site:(http://linkwinds.jpl.nasa.gov/)

Reference: Jacobson, Allan S., Andrew L. Berkin, and Martin N. Orton, "LinkWinds: Interactive Scientific Data Analysis and Visualization", Communications of the ACM, Vol. 37, No. 4, (April 1994), pp. 43 - 52.
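
The design choice of transmitting control values rather than rendering streams can be sketched as follows: each interaction becomes a few bytes describing a widget and its new value, and every site re-renders locally. The Python fragment below illustrates the idea; the message format and host name are invented, not LinkWinds' or MUSE's actual wire protocol.

    # Sending compact control messages instead of graphics streams.
    import json
    import socket

    def send_control(sock, widget, value):
        # A few bytes per interaction, regardless of image complexity
        msg = json.dumps({"widget": widget, "value": value}).encode()
        sock.sendall(len(msg).to_bytes(4, "big") + msg)

    # Usage sketch (assumes a peer listening at peer.example.gov:9999):
    # sock = socket.create_connection(("peer.example.gov", 9999))
    # send_control(sock, "opacity_slider", 0.35)
    # send_control(sock, "menu:colormap", "discrete-cool-warm")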

IV. c) Metacomputing Stage 3 - Surfing on the Infrastructure

The third stage of this process will be a transparent national network that dramatically increases the computational and information resources available for exploring a physical and natural sciences research application. This stage is closely tied to the activities of the National Information Infrastructure.

The High Performance Computing and Communications Program (HPCC) in the United States is supporting research and development (R&D) in gigabit speed networks. This technology is designed to support researchers' requirements to continuously display on local workstations the output from model simulations running on remote high performance systems. These R&D efforts are examining satellite, broadcast, optical, and affordable local area networking designs. These networking technologies are intended to support the rapid movement of huge files of data, images and videos in a shared, collaborative computing environment that spans thousands of networks and encompasses millions of users.

Reference: High Performance Computing and Communications: Advancing the Frontiers of Information Technology, A report by the Committee on Computing, Information and Communications, United States National Science and Technology Council, 1997. (http://www.hpcc.gov/blue97/index.html)

There are positive and negative technical and social impacts associated with surfing on this telecommunications infrastructure. Positive aspects of these high speed networked collaborations center on real time visualization and information discovery among geographically remote research or Renaissance Teams. There are also negative impacts or roadblocks associated with metacomputing. Network transmission difficulties and differences in desktop workstation architectures can cloud the actual visualization two collaborating researchers are simultaneously viewing and steering. Setting up and learning to use the metacomputing infrastructure can be all consuming and thus distract from the basic education or scientific discovery process. These remain unresolved issues as we move into the realm of multimedia.

V. Looking on the Horizon -- Integrated Decision Support Tools

There are many unresolved computing challenges for interactive visualization and web exploration. Here, we present one such issue for thought and consideration: the use of the Web and intelligent agents for assisting scientific visualization efforts.

V. a) Web & Intelligent Agent Assistance for Visualization

A significant limitation of existing hypermedia Internet tools, like Netscape, is the inability to rapidly find and quickly recall information resources of interest on the Web. Infrequent (and general) users of distributed hypermedia systems can easily become overwhelmed by the large number of links to information resources and disoriented while navigating between the various remote file servers. One potential solution to this dilemma is the incorporation of intelligent or remote agent capabilities into browser programs.

Reference: R. Vetter, C. Spell and C. Ward, "Mosaic and the World-Wide Web", Computer, IEEE Computer Society, Vol. 27, No. 10, (October 1994), pp. 49 - 57.

An agent is an automated program that examines the Internet on its operator's behalf, searching for specified information. Agents called "web crawlers" or "search engines" already exist. Using keyword-based searches, web crawlers automatically search through the Internet and index the information they find. "Metacrawlers" perform multiple searches, in parallel, across the Internet. Trainable Web agents based on neural-network software have also been developed.

References:
"WebCrawler", http://webcrawler.com , 1997.
"MetaCrawler", http://metacrawler.com, 1997.
"Agent Technology Projects in the Stanford Digital Library", http://www-diglib.stanford.edu/diglib/pub/agents.html, 1997.

In the digital libraries and database management domains, research efforts are underway to expand visual information retrieval (VIR) technology. VIR supports searching through image databases using the visual information contained in the image, such as color, texture, composition, and structure, rather than key words. This concept of content extraction provides a user the capability to retrieve visual information by asking a query like "Give me all pictures that look like this". The VIR system satisfies the query by comparing the content of the query picture with that of all target pictures in the database. This is called "Query by Pictorial Example" (QBPE).

References:
"UC Berkeley Digital Library Project", 1997
(http://elib.cs.berkeley.edu)

"IBM's Query by Image Content Project", 1997 (http://wwwqbic.almaden.ibm.com/~qbic/qbic.html)

In the scientific visualization arena, a number of research groups have begun to explore building intelligence into visualization software. This concept allows a researcher or policy analyst to prescribe a particular analysis task, such as comparing ozone concentrations with power plant emissions for a given air pollution computational model scenario. The software system then automatically creates an appropriate visualization. The users of these task-directed or rule-based visualization software systems specify their area of interest, describe the data parameters, and determine an analysis objective. The intelligent software tool then suggests and describes visual representations of the data that might include contour plots, isosurfaces, volume renderings, and animated vector representations.

Reference: M.P. Baker, "The KNOWVIS Project: An Experiment in Automating Visualization", Decision Support 2001 Conference, Toronto, Canada, September 1994.

Rogowitz, Bernice E. and Lloyd A. Treinish, "An Architecture for Perceptual Rule-Based Visualization", Proceedings of the IEEE Visualization '93 Conference, San Jose, California, IEEE Computer Society Press, 1993, pp. 236 - 243.
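
A drastically simplified version of such a rule base is sketched below in Python: a few hand-written rules map a data description and an analysis objective to a suggested representation. The rules are invented for illustration and are far cruder than the KNOWVIS or perceptual rule-based systems cited above.

    # Toy rule base mapping data descriptions to visual techniques.
    def suggest_technique(dimensionality, field_type, objective):
        if field_type == "vector":
            return "animated vector glyphs or streamlines"
        if dimensionality == 2:
            if objective == "compare scenarios":
                return "side-by-side pseudocolor maps with difference plot"
            return "contour plot"
        if dimensionality == 3:
            if objective == "locate threshold":
                return "isosurface at the standard's exceedance level"
            return "volume rendering"
        return "time-series plot"

    print(suggest_technique(3, "scalar", "locate threshold"))
    print(suggest_technique(2, "scalar", "compare scenarios"))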

Here, we propose the notion of integrating these three research efforts. We suggest building an intelligent Web-based application that educates and assists novice and advanced users in the application of scientific visualization techniques. The future development of intelligent agents and VIR databases that are incorporated into Internet browsing tools and scientific visualization software will aid in the building of comprehensive decision support systems. These task-directed decision support systems will allow researchers and policy analysts to specify analysis requirements. The system will then automatically construct appropriate visualizations that are linked to information databases. Among other technologies, Java programming tools could facilitate these efforts.

Reference: Rhyne, Theresa Marie, "Scientific Visualization and Technology Transfer: An EPA Case Study", (under Internet Kiosk: Ron Vetter, editor), Computer, Vol. 28, No. 7, (July 1995), pp. 94 - 96.

VI. Acknowledgments

These notes and a number of the images are the result of many conversations and insights from my colleagues at the U.S. EPA Scientific Visualization Center .... so a warm thank you to Mark Bolstad, Tom Boomgaard, Al Bourgeois, Todd Plessel, Dan Santistevan, Dan Sullivan, Mike Uhl, and Zhang Yanching in the Lockheed Martin Services Group.

I am also appreciative of EPA's Work Assignment Manager for Visualization, Lynne Petterson, and my collaborator on GIS-Visualization integration research, Thomas Fowler (Lockheed Martin - GIS expert). Our work lives would not be as they are if it were not for the many scientists within and outside of EPA who have shared their data and sense of wonder with us. Special thanks to Donna Cox, Peter Kochevar, Don Brutzman, Bud Jacobson, Ron Vetter, and Polly Baker for their concepts and ideas cited in the references of these course notes.

Finally, it is always a joy doing collaborative teaching with Mike Botts, Bill Hibbard and Lloyd Treinish.

---------------------------------------------------------------------------