New Visualization Techniques

Vol.34 No.1 February 2000
ACM SIGGRAPH


Two Stepping Information Technology with Visualization (A Viewpoint from the U.S. EPA Scientific Visualization Center)



Theresa-Marie Rhyne
Lockheed Martin Technical Services U.S. EPA Scientific Visualization Center

Introduction

Interactive visualization techniques facilitate the examination of unknown data sets. At the United States Environmental Protection Agency’s (U.S. EPA) Scientific Visualization Center, we are exploring the combination of 3D visual displays with emerging information technologies. The following discussion highlights prototype efforts in three arenas: (1) Integrating Visualization into Data Mining Techniques; (2) Exploring Policy Analysis Visualization; and (3) Combining Object-Relational Databases, Visualization and Web Technologies. The three prototypes are still in embryonic form. We anticipate that realizing these concepts will take three to five years, and that precise implementation details will change over time.

The U.S. EPA is currently designing and implementing a new high-profile Office of Environmental Information that will advocate the use and management of information as a strategic resource to enhance public health and environmental protection. New information technologies will be applied to facilitate the collection and dissemination of environmental data. We are hopeful that interactive visualization tools will be used successfully across the entire range of users, computing platforms and World Wide Web (Web) technologies that support the information life cycle.

Integrating Visualization into Data Mining Techniques

Data mining is a set of methodologies for the automated exploration of complex relationships in very large datasets. True data mining is discovery driven. This means that no a priori assumptions exist about the data. Data mining utilizes discovery approaches where pattern-matching and other algorithms are executed to determine key relationships in the data [2]. At the U.S. EPA Scientific Visualization Center, we are examining ways to integrate data mining with interactive visualization techniques. Our goal is to provide a visual tool for exploring multidimensional data sets that can reside with object-relational databases like Oracle. In the future, we hope to provide Web-enabled 3D visualization outputs in conjunction with data mining activities.

There are seven components associated with implementing a data mining activity: 1) Data Selection; 2) Data Preparation; 3) Feature Selection; 4) Model Building & Testing; 5) Results Analysis; 6) Stability Testing; and 7) Implementing Results.

  •  Step #1. Data Selection involves determining the datasets or information for performing data mining tasks.
  •  Step #2. Data Preparation focuses on cleaning and transforming data so that it is consistent and accurate.
  •  Step #3. Feature Selection encompasses selecting variables in the data sets that are most correlated to the problem we want data mining to help us examine.
  •  Step #4. Model Building & Testing includes using data mining algorithms to build and perform preliminary testing of our predictive modeling efforts.
  •  Step #5. Results Analysis centers on analyzing the results from our preliminary testing efforts and jumping back to Step #4 to refine our predictive model.
  •  Step #6. Stability Testing can also be called "Reality Checking." Since we have used knowledge discovery techniques, we want to verify that our results from Step #5 correspond to real world observations and previously known behaviors.
  •  Step #7. Implementing Results is the final step of building a strategy for problem resolution based on the results from Steps 1-6.
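
The seven steps above can be sketched as a small program. This is only an illustrative outline, not the workflow used at the Center: the records are synthetic, the field names (speed, temperature, emission) are made up for the example, and a simple correlation ranking plus a least-squares line stand in for whatever data mining algorithms a real project would use.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def run_pipeline(seed=0):
    rng = random.Random(seed)

    # Step 1, Data Selection: synthetic records standing in for a real table.
    rows = []
    for _ in range(200):
        speed = rng.uniform(10, 100)
        rows.append({
            "speed": speed,
            "temperature": rng.uniform(-5, 35),
            "emission": 0.5 * speed + rng.gauss(0, 2),
        })
    rows[3]["speed"] = None  # simulate one missing value

    # Step 2, Data Preparation: drop incomplete records.
    clean = [r for r in rows if all(v is not None for v in r.values())]

    # Step 3, Feature Selection: keep the variable most correlated with the target.
    target = [r["emission"] for r in clean]
    scores = {name: abs(pearson([r[name] for r in clean], target))
              for name in ("speed", "temperature")}
    best = max(scores, key=scores.get)

    # Step 4, Model Building & Testing: fit on one split, hold out the rest.
    split = len(clean) * 3 // 4
    train, test = clean[:split], clean[split:]
    slope, intercept = fit_line([r[best] for r in train],
                                [r["emission"] for r in train])

    # Step 5, Results Analysis: mean absolute error on the held-out records.
    mae = statistics.mean(abs(slope * r[best] + intercept - r["emission"])
                          for r in test)

    # Step 6, Stability Testing: compare with previously known behavior
    # (assumed here: emissions rise with the selected variable).
    plausible = slope > 0

    # Step 7, Implementing Results: return a summary to base decisions on.
    return {"feature": best, "slope": slope, "mae": mae,
            "plausible": plausible, "records_used": len(clean)}


if __name__ == "__main__":
    print(run_pipeline())
```

In practice Step #5 would loop back to Step #4 with a refined model; the sketch runs a single pass to keep the structure visible.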

Figure 1: Use of visualization techniques to examine relationships between data elements in a database associated with a mobile emissions computational model. This visual display was created in a working session with Sue Kimbrough (principal scientist) of the Mobile Emissions Characterization Team at the United States Environmental Protection Agency. The visualization was built with Visible Decisions Inc.’s SeeIT software at the U.S. EPA Scientific Visualization Center.

Figure 2: Example application of visualization techniques to environmental policy data. Objectives are plotted against policy topics. This 3D Web (Virtual Reality Modeling Language - VRML97) visualization was created for Don Barnes and Vickie Richardson of the U.S. EPA’s Science Advisory Board. VR Charts, from AlterVue Systems, Inc., was used to create this visual display.

Figure 3: Projection of a potential 3D interactive visual display with an integrated Web-enabled Spatial Analysis ORDBMS solution. Roads, buildings and terrain spatial data are geographically registered and merged together to yield this 3D user interface. This image was created with Environmental Systems Research Institute (ESRI)’s ArcView 3D Analyst software and converted to a VRML97 file. This work was done as part of a "Human Exposure in Urban Environments" project for the U.S. EPA - Alan Huber, primary investigator. Richard Greene and Bettina Brinkley provided technical input for the execution of this visualization.

The most immediate place to apply visualization techniques in this seven-step process is Step #5 - Results Analysis. SGI Inc. designed its MineSet data mining software to provide visualization output during this phase [4]. More information on MineSet can be found at SGI’s website.

Figure 1 shows an example of how visualization techniques might be useful in Step #4 - Model Building & Testing. The visualization example shown was created with Visible Decisions Inc.’s SeeIT software. For more information on SeeIT, see Visible Decisions’ website. Here, SeeIT is used to examine results from a Mobile Emissions computational model. Applying visualization at Step #4 provides a visual way of examining preliminary testing of predictive modeling efforts. Using the Virtual Reality Modeling Language (VRML97), we succeeded in converting this visualization to a Web-enabled visual display. For the specific visualization example shown, our environmental scientists were aware, a priori, of this data relationship when building their predictive model.

Exploring Policy Analysis Visualization

In a recent project for the U.S. EPA’s Science Advisory Board, we explored the capabilities of developing 3D Web visualizations of policy information using VRML97 technology. First, weighting values were entered into a spreadsheet. Next, using VR Charts, from AlterVue Systems Inc., an interactive 3D display was created from the spreadsheet data. VR Charts stores its 3D displays in VRML97 format. This greatly facilitated Web-enabling tasks. We found VR Charts to be usable by general as well as technical users. More information on VR Charts can be found at AlterVue’s website. Figure 2 shows the results of our efforts. The 3D Bar Chart depicts a conceptualization of planning objectives versus policy topics. Using hypertext technology (HTML), on-line Web pages of environmental policy items are easily linked to this 3D Web visualization. Since the interactive visualization is Web-enabled, it can be shared and accessed by all members of the U.S. EPA’s Science Advisory Board. We are presently exploring how to link similar 3D Web displays to metadata indexing and search engine technologies on the U.S. EPA’s internal and public access websites. It is our hope to use visualization techniques to aid in navigating through the many public policy repositories that encompass the U.S. EPA’s environmental information resources.
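
To make the spreadsheet-to-VRML97 idea concrete, the following sketch turns a small table of weighting values into a 3D bar chart written as VRML97 text. It is not a description of how VR Charts works internally; the function names, the sample weights and the linked page name are all hypothetical. It relies only on standard VRML97 nodes (Transform, Shape, Box, Material and Anchor), with Anchor supplying the hyperlink from a bar to a Web page.

```python
def bar_node(x, z, height, width, color):
    """One VRML97 Transform holding a Box whose base rests on y = 0."""
    r, g, b = color
    # A Box is centered on its local origin, so lift it by half its height.
    return (
        f"Transform {{ translation {x:.2f} {height / 2:.2f} {z:.2f} children [ "
        "Shape { appearance Appearance { material Material { "
        f"diffuseColor {r:.2f} {g:.2f} {b:.2f} }} }} "
        f"geometry Box {{ size {width:.2f} {height:.2f} {width:.2f} }} }} ] }}"
    )


def vrml_bar_chart(weights, links=None, spacing=1.5, width=1.0):
    """Build VRML97 text with one bar per positive cell of a 2D weight table.

    weights: list of rows (e.g. objectives), each a list of values per column
             (e.g. policy topics).
    links:   optional {(row, col): url} mapping; linked bars become Anchors.
    """
    links = links or {}
    nodes = ["#VRML V2.0 utf8"]  # required VRML97 file header
    for row, values in enumerate(weights):
        for col, height in enumerate(values):
            if height <= 0:
                continue  # Box dimensions must be greater than zero
            shade = (row + 1) / len(weights)  # vary color by row
            node = bar_node(col * spacing, row * spacing, height, width,
                            (shade, 0.3, 1.0 - shade))
            url = links.get((row, col))
            if url:
                node = f'Anchor {{ url "{url}" children [ {node} ] }}'
            nodes.append(node)
    return "\n".join(nodes) + "\n"


if __name__ == "__main__":
    # Hypothetical weights: two objectives (rows) by two policy topics (columns).
    scene = vrml_bar_chart([[1.0, 2.0], [0.0, 3.0]],
                           links={(0, 1): "topic.html"})
    print(scene)
```

Saving the returned text to a file with a .wrl extension should give a scene that a VRML97 viewer can load, with the linked bar opening its page when selected.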

Combining Object-Relational Databases, Visualization & Web Technologies

Object-relational database management systems (ORDBMS) are designed with the goal of supporting relational and object databases. These hybrid database environments support relational data stored in two-dimensional table format along with providing data types such as collections, arrays or vectors for storing a data element as a single data object.

The emergence of Web servers as repositories for information has resulted in efforts to integrate ORDBMS environments with Web technologies. The Web can be viewed as a simple distributed object system where Hypertext Markup Language (HTML) or Extensible Markup Language (XML) pages are objects whose identities are defined by their Uniform Resource Locators (URLs). The Document Object Model (DOM), developed by the World Wide Web Consortium (W3C), provides the mechanism for modeling Web documents in an object-oriented way [5]. Efforts are underway to integrate the DOM structure with the Object Management Group’s (OMG’s) Common Object Request Broker Architecture (CORBA) standard. CORBA provides a specification for object-oriented programmers and ORDBMS designers to specify standard objects for interoperability and sharing of data [3]. The integration of DOM and CORBA will allow ORDBMS environments to support Web-enabled computing.
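
As a small illustration of treating a Web document as an object, the sketch below parses an XML page with the DOM implementation in Python's standard library (xml.dom.minidom) and reads its identity, title and outgoing links through DOM calls. The page content, its element names and the example.org URLs are invented for the example; CORBA integration is not shown.

```python
from xml.dom import minidom

# Hypothetical XML page; the "url" attribute plays the role of its identity.
PAGE = """<page url="http://example.org/policy/air.xml">
  <title>Air quality objectives</title>
  <link href="http://example.org/policy/water.xml">Water policy</link>
  <link href="http://example.org/policy/waste.xml">Waste policy</link>
</page>"""


def describe(xml_text):
    """Parse a page into a DOM tree and summarize it as a plain dictionary."""
    doc = minidom.parseString(xml_text)
    root = doc.documentElement  # the <page> element node
    title = root.getElementsByTagName("title")[0].firstChild.data
    links = [el.getAttribute("href")
             for el in root.getElementsByTagName("link")]
    return {"identity": root.getAttribute("url"),
            "title": title,
            "links": links}


if __name__ == "__main__":
    print(describe(PAGE))
```

Each element in the parsed tree is an object with attributes and child nodes, which is the sense in which the DOM models a document in an object-oriented way.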

Within the U.S. EPA community, Oracle is the agency standard relational database management system. Currently, Oracle 7.3.4 is used on the majority of U.S. EPA database servers, although some upgrades to Oracle8 are beginning to appear. Oracle8 and the recently released Oracle8i are ORDBMS environments. The Oracle8i product is built around the Java object-oriented programming language to support Web-enabled database application development. More information on Oracle8i is available at Oracle’s website. As a result of this emerging technology, consideration is being given to incorporating object elements into future EPA ORDBMS development efforts.

A significant portion of environmental sciences information has geographic context. As a result, Geographic Information Systems (GIS) and spatial data repositories are important components in the U.S. EPA’s information technology roadmap. The U.S. EPA agency standard GIS environment is ARC/INFO from the Environmental Systems Research Institute (ESRI). In 1999, ESRI released its next-generation geographic information systems products: ARC/INFO Version 8 and the Spatial Database Engine (SDE) Version 4. These new software versions will use an object-oriented GeoDataObject to facilitate working with actual objects (e.g. roads, buildings and land parcels) rather than just rows of data in a relational database. Currently, SDE is designed to facilitate integration with non-spatial data components in an existing ORDBMS environment such as Oracle8. More information on ESRI’s ARC/INFO, SDE and other software products is available at ESRI’s website.

As a result of the emerging Web-enabled capabilities of Oracle8i and the GeoDataObject capabilities of recent ESRI products, we are starting to examine 3D interactive visual displays that have the potential to be integrated with these technologies. The image shown in Figure 3 depicts the merger of roads, buildings and terrain data into a "virtual city." The visualization was created with ESRI’s ArcView 3D Analyst from relational database files and converted into the VRML97 file format. We are currently exploring the use of Java to help us define a "pollution object" that encompasses both spatial and air quality data to examine environmental impacts at the local community level. Next, the intent is to design 3D visual interfaces for navigating, examining and interacting with multi-dimensional sources of environmental information. Our hope is to provide public access to these Web-enabled visualizations.
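
One way to picture such a "pollution object" is as a single record that carries both a geographic position and a collection of air quality readings. The sketch below is written in Python rather than Java to keep it short, and it is purely illustrative: the class names, fields, site names, coordinates and values are assumptions made for the example and do not describe the Center's design.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Reading:
    pollutant: str
    concentration: float  # units assumed consistent within one pollutant


@dataclass
class PollutionObject:
    """Hypothetical object bundling spatial and air quality data for one site."""
    name: str
    longitude: float
    latitude: float
    elevation_m: float
    readings: list = field(default_factory=list)

    def add(self, pollutant, concentration):
        self.readings.append(Reading(pollutant, concentration))

    def mean_concentration(self, pollutant):
        """Average of all readings for one pollutant, or None if there are none."""
        values = [r.concentration for r in self.readings
                  if r.pollutant == pollutant]
        return mean(values) if values else None


def within_box(objects, west, south, east, north):
    """Spatial query: objects whose position falls inside a bounding box."""
    return [o for o in objects
            if west <= o.longitude <= east and south <= o.latitude <= north]


if __name__ == "__main__":
    # Made-up sites and values, for illustration only.
    corner = PollutionObject("Street corner A", -78.87, 35.90, 110.0)
    corner.add("ozone", 10.0)
    corner.add("ozone", 14.0)
    rooftop = PollutionObject("Rooftop B", -78.60, 35.78, 140.0)
    rooftop.add("ozone", 8.0)

    nearby = within_box([corner, rooftop], -78.9, 35.85, -78.8, 35.95)
    for site in nearby:
        print(site.name, site.mean_concentration("ozone"))
```

Keeping position and measurements in one object means a 3D interface could place the object in the scene from its coordinates and color or size it from its readings.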

Conclusion

In this article, we have described three prototype efforts that attempt to integrate visualization techniques with emerging information technologies. As the U.S. EPA moves forward with its new Information Office, we hope visualization and 3D user interfaces will aid in the interactive exploration of multi-dimensional environmental data. We anticipate moving beyond traditional data models for visualization to include Web-enabled solutions that support public access and query of environmental information.

Acknowledgments

We would like to acknowledge Sue Kimbrough, Alan Huber, Don Barnes and Vickie Richardson at the U.S. EPA for providing challenging visualization projects. We are also grateful to Lynne Petterson, U.S. EPA Visualization Technical Services Manager, and Mark Bolstad, Lockheed Martin Manager for Visualization and Scientific Computing, for their support. Richard Greene of Lockheed Martin Technical Services and Bettina Brinkley (U.S. EPA) provided technical support for spatial data analyses. All visualizations shown in this report were created by Theresa-Marie Rhyne of Lockheed Martin Technical Services at the U.S. EPA Scientific Visualization Center.

References

  1.  Darwen, Hugh and Chris J. Date. Foundation for Object/Relational Databases: The Third Manifesto, Addison-Wesley Publishing Co., New York, June 1998.
  2.  Groth, Robert. Data Mining: A Hands On Approach for Business Professionals, The Data Warehousing Institute Series from Prentice Hall PTR, New Jersey, 1998.
  3.  Object Management Group (OMG). "What is Common Object Request Broker Architecture (CORBA)," website.
  4.  "Silicon Graphics MineSet: Supporting the Discovery Research Process," SGI Inc., Spring 1999, website.
  5.  World Wide Web Consortium (W3C). "Document Object Model (DOM) Activity Statement," website.




Theresa-Marie Rhyne is a Lead Scientific Visualization Researcher for Lockheed Martin Technical Services at the U.S. EPA Scientific Visualization Center. She is currently a Director-at-Large on ACM SIGGRAPH’s Executive Committee. She was Co-Chair of IEEE Visualization 1998 and 1999.

Theresa-Marie Rhyne
Lockheed Martin Technical Services
U.S. EPA Scientific Visualization Center
86 Alexander Drive
Research Triangle Park, NC 27711

The copyright of articles and images printed remains with the author unless otherwise indicated.