Creating a Collaborative Space to Share Data, Visualization, and Knowledge
Collaboration is an effective approach to problem solving. Most large-scale scientific investigations are highly interdisciplinary and collaborative, with project investigators often geographically distributed. For example, ITER is a joint international research and development project that aims to demonstrate the scientific and technical feasibility of fusion power. The partners in the project are distributed across more than ten countries, and it is imperative that they share their creations and findings, since greater advances are made only through collective effort. Similarly, I am participating in the U.S. Department of Energy’s SciDAC (Scientific Discovery through Advanced Computing) program, which sponsors a dozen high-profile science projects. Each of the project teams involves researchers from several different universities and national laboratories. Clearly, there is an explicit need to provide support for collaborative work. How can these researchers effectively share their data, problem-solving strategies, and research findings without constraints of time and place? The answer is web-based collaboratories, which have been created for many major science projects. However, most of these collaboratories are simply data repositories with web interfaces. For instance, users do not get to see the associations between data and users, which are as valuable as the data itself. These associations can be large and very complex, and thus hard to comprehend. Visualization can play a key role both in knowledge discovery and in managing a potentially complex, high-dimensional collaborative workspace. I have been thinking about how to address the need to support collaborative data analysis and visualization, and how to direct the visualization research community towards the development of such needed technologies.
I am going to describe some of the ongoing efforts to facilitate sharing visualization resources that will provide the eventual support for the kind of collaborative workspace I have in mind.
Sharing High-Performance Visualization Facilities
Many researchers do not have access to the resources required to perform effective visualization of the large data sets generated by their simulations. To remedy this problem, the TeraGrid team at the Argonne National Laboratory and the University of Chicago is building a Visualization Gateway, which provides simplified access to such resources for a broad population of users. The TeraGrid is a multiyear effort sponsored by the U.S. National Science Foundation. The goal is to build and deploy a collection of advanced compute clusters and specialized resources connected by high-speed networks and dedicated to open scientific research. The proposed visualization service provides users with a single point of access to all of the advanced visualization resources that are available on the TeraGrid. To make such a remote visualization service usable, however, the team also needs to address account management, data management, and possibly collaborative work.
Sharing Visualizations and Findings
The International Linear Collider (ILC) project involves researchers from SLAC (Stanford Linear Accelerator Center), KEK (High Energy Accelerator Research Organization) in Japan, DESY (Deutsches Elektronen Synchrotron) in Germany, and various U.S. national laboratories. Scientists on this project may run the same simulation code with different parameter settings. To support collaborative analysis of the simulation results, a web-based interface was created to display the results of each simulation through visualizations and animations, together with notes made by those who have examined the results. Figure 1 shows one such interface. The user can switch to a separate visualization to see how many notes have been written for each simulation run. The visualization can also show authorship and the evolution of the annotations created over a given time period. For example, users can discover patterns of annotation authorship by selecting different subsets of authors. Perhaps two people frequently work on the same data points because they have compatible ideas or work habits, or simply because they inspire each other. The visualization makes such patterns easy to see, whereas they are essentially invisible without a visual means.
Figure 1. Left: An interface for visualizing simulation results shared by a group of scientists. Each point in the left half of the window represents a simulation run. A selected point corresponds to the collection of images and animations displayed in the right half of the window. Right: A view of the distribution of notes made by scientists for the results generated by simulations. Layer color and thickness are mapped to author and the size of notes, respectively.
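As a rough illustration of the kind of summary that lies behind such a view, the sketch below counts the notes written for each simulation run and how often two authors annotate the same run. The record layout, the function names, and the sample data are all hypothetical; they are not taken from the ILC interface.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical annotation records: (simulation run id, author, note text).
notes = [
    ("run-01", "alice", "Beam loss looks high near the cavity."),
    ("run-01", "bob", "Agree; try a finer mesh."),
    ("run-02", "alice", "Stable."),
    ("run-02", "bob", "Matches run-01 after the fix."),
    ("run-03", "carol", "Diverges after step 400."),
]


def notes_per_run(records):
    """Count how many notes were written for each simulation run."""
    return Counter(run for run, _author, _text in records)


def coauthor_pairs(records):
    """Count, for each pair of authors, the runs that both annotated."""
    authors_by_run = defaultdict(set)
    for run, author, _text in records:
        authors_by_run[run].add(author)

    pairs = Counter()
    for authors in authors_by_run.values():
        # Sort so that each pair is always stored in the same order.
        for a, b in combinations(sorted(authors), 2):
            pairs[(a, b)] += 1
    return pairs


if __name__ == "__main__":
    print(notes_per_run(notes))
    print(coauthor_pairs(notes))
```

Counts like these could then be mapped to visual attributes such as color or thickness; that mapping is left out here.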
How such a facility benefits the project, and whether it changes how scientists work, remains to be studied. There is also the need to compare different simulation codes for the same modeling problem. Support for collaborative annotations seems equally useful there.
Researchers at IBM have created a web space for making visualization a social activity: inviting contributions of data and visualizations, sharing experience through open discussion, and recording collective insights about data. They call the space Many Eyes. You can view and discuss data sets or visualizations, and create visualizations from existing data sets. Your “view” and comments are saved so others can see what you are seeing. It is fascinating to see how much interest and participation they have drawn. I encourage you to visit the site. A key point made by Many Eyes is that interactive visualization should also support social interaction. However, it is still not clear what the most appropriate collaboration mechanisms are for supporting this interaction. Many Eyes serves as a nice testbed for experimenting with design considerations for asynchronous collaboration in visual analysis environments. I call for setting up similar testbeds for studying large-scale collaborative visualization of scientific data and, in particular, the social aspects of data analysis and visualization.
Scientific visualization has become an active area of research. However, most researchers and students in the field of visualization do not have access to data sets generated by state-of-the-art simulations. In the rare case that they do have access to some of these data sets, they often do not get to interact directly with the scientists who generated them. This interaction is crucial for understanding what scientists really need to get out of their data sets and what visualization functionality is missing from existing visualization software tools. I thus organized a “Meet the Scientists” panel, as an outreach activity of the SciDAC Ultravis Institute, for the Visualization 2007 Conference to provide such an interaction. Scientists in four representative areas were invited to attend the Conference and participate in the panel. Each scientist described his/her application, data sets, and the corresponding visualization and data analysis needs and challenges, and then answered questions. Many questions were asked after each presentation, so the panel clearly served its purpose. In addition, these scientists’ data sets were made openly available after the conference so that visualization researchers and students have the chance to work on the precise problems faced by the scientists.
A web site, named VisFiles, is under construction to hold the valuable information provided by the scientists and by the visualization researchers who have worked on their data sets. Figure 2 shows screen captures of the VisFiles page and the Meet the Scientists page, where application descriptions, data sets, and visualization examples are provided. The information there gives novices a correct understanding of the problems, and in turn helps accelerate the development of the field of scientific visualization. Other scientists will be invited to contribute to this site. We expect many others to contribute their visualizations of the same or similar data sets, as well as their findings using these applications and data sets. Note that VisFiles articles are also made available at this web site and thus become more accessible.
Like Many Eyes, VisFiles lets users join discussion groups while working on a particular application or visualization technique. This open forum will bring greater exchange of visualization experience, data sets, and visualization software. Unlike Many Eyes, and because of the mission of the SciDAC Ultravis Institute, the site at this stage targets a smaller user group with a special interest in the visualization of data generated by selected scientific simulations. In the long run, I believe VisFiles will grow to cover data visualization applications in general. Targeting a smaller community now will allow us to conduct a more focused study and make some immediate impact on important science applications. Our goal is to improve the performance of large-scale projects by appropriately supporting the social aspects of data analysis and visualization tasks.
Figure 2. Left: The VisFiles site. Right: The Meet the Scientists page at the VisFiles site. Presently, there are materials provided by four scientists. Some of the services remain to be implemented.
This work is sponsored in part by the U.S. Department of Energy’s SciDAC program and the National Science Foundation.
ITER, http://www.iter.org.
SciDAC, http://www.scidac.org.
TeraGrid Visualization Gateway, https://viz.teragrid.org:8443/gridsphere/gridsphere.
TeraGrid, http://www.teragrid.org.
J. Binns, J. DiCarlo, J. Insley, T. Leggett, C. Lueninghoener, J.-P. Navarro, and M. Papka. Enabling Community Access to TeraGrid Visualization Resources. Concurrency and Computation: Practice & Experience, Volume 19, Issue 6, April 2007, pp. 783-794.
K. Ko. The International Linear Collider. SciDAC Review 1, 2006, pp. 17-20.
Y. Wang, J. Shearer, and K.-L. Ma. VICA: A Voronoi Interface for Visualizing Collaborative Annotations. In Proceedings of the 4th International Conference on Cooperative Design, Visualization, and Engineering, Springer-Verlag, September 2007, pp. 21-32.
Many Eyes, http://services.alphaworks.ibm.com/manyeyes/home.
J. Heer and M. Agrawala. Design Considerations for Collaborative Visual Analytics. In Proceedings of IEEE Symposium on VAST 2007, pp. 171-178.
Ultravis Institute, http://www.ultravis.org.
VisFiles, http://vis.cs.ucdavis.edu/VisFiles.
Kwan-Liu Ma is a professor of computer science at the University of California at Davis. He directs the VIDI Research Group and the DoE SciDAC Institute for Ultrascale Visualization. His current research and activities can be seen at http://www.cs.ucdavis.edu/~ma and his email address is email@example.com.