Large Networks Present Visualization Challenges
Bill Hibbard Space Science and Engineering Center University of Wisconsin  Madison
August 99 Columns
Visfiles
Figure 1: The Infolab Wall.
Figure 2: A view of Swift3d.

Large networks pose an increasingly important visualization problem. So for the first guest column during my tenure as VisFiles columnist, I asked Stephen North to contribute an article about the great work at AT&T Labs  Research.
— Bill Hibbard
James Abello
Eleftherios Koutsofios
Emden R. Gansner
Stephen C. North
AT&T Labs  Research
Introduction
Visualization is becoming increasingly important for understanding the structure of communication networks and services, but displaying the data associated with very large networks is difficult. There are abundant research problems in visualization metaphors, methods, algorithms and the engineering of scalable interactive systems. This area is being addressed in the AT&T Infolab, a multidisciplinary project investigating visualization and analysis for AT&T’s network and service businesses. The sheer magnitude of the data involved makes the challenge interesting  the voice network carries about 300 million calls per day, plus diagnostic information generated by the network itself; packet data networks such as frame relay, Asynchronous Transfer Mode (ATM) and Internet Protocol (IP) give rise to upward of millions of records daily, describing network configuration, dynamic behavior and events. The longrange goal for the Infolab work is to be able to provide a highlevel, integrated view of all the AT&T networks, handling all the underlying data streams in real time.
There are a variety of standard techniques for managing the display of large data sets. These include sampling the data, eliding or coalescing semantically chosen subsets or relying on distorted views such as fisheye [8] or hyperbolic [7] projections. Although these methods are useful, they clearly do not provide all the answers, especially when analyzing networkbased data. In this article, we describe two uncustomary approaches: a project in largescale graphics and network displays, and an experiment with a novel graphical representation of networks.
Large Displays
One approach to the scale problem is to use a physically large display. Figure 1 shows our display wall, inspired by projects such as the CAVE [3] and Powerwall [9].
Our main display wall (6’ x 15’) is driven by eight LCD projectors. These are connected through a softwarecontrolled video switch, usually to display the output of two graphics pipes of an SGI Onyx. The same Onyx can drive a smaller (7’ x 9’) fourprojector wall elsewhere in the same building, using its third pipe. Other compute and disk servers for network data analysis projects are connected on an 800megabit High Performance Parallel Interface (HIPPI) network, providing 10 terabytes of online storage and another 20 terabytes of tape under hierarchical storage management.
One important application that runs on the display wall is Swift3d, an interactive, largescale network viewer [6]. Swift3d has modules for data collection, storage, analysis and display of at least a full day’s worth of voice calldetail records. Swift3d itself is network and dataindependent and is adapted to applications by scripting and loading geometry files and datatogeometry maps. It provides interactive selection and aggregation to control the level of information presented in its views. Selection can take the form of a stream database query, graph search or geometric panandzoom. The forms of aggregation supported include computing counts, sums or set unions. Figure 2 shows a view of Swift3d.
In this figure, usage of two network services is compared (encoded by color). Activity is aggregated up to the level of local telephone exchanges. When an animated time series is displayed, one can observe how the balance between the two services shifts through the day. The user interface supports interactive queries combining time, geography, customer and type of service to create selective views for data exploration.
Our experiences with this work have shown that a large, dense display qualitatively changes interactive network visualization, beyond the mere presence of many more pixels. The large display favors group collaboration and investigation. It makes it possible to apply an assortment of linked views simultaneously. The underlying scalable analysis and visualization tools enable one to query the entire network activity database graphically.
A next step in this work is to learn how to display integrated views of multiple, layered networks. For example, although we can display a voice or wireless network, or the structure of a virtual private data network configured by large business customers, displaying an informative view of several networks at once so that the relationships between them are obvious is still difficult. Another problem we are investigating is how to interconnect multiple display walls and their applications. The goal is to support collaboration between multiple network operation centers, and other forms of distributed visualization over widearea networks.
The views in Swift3d are, at present, based on the underlying geography. This is a significant limitation for network visualization. Often, many endpoints are located in a few dense metropolitan areas, with large ‘deserts’ in between, so maps do not use the available pixels efficiently. Also, maps work well for displays of endpoints (vertices) but not for edges (vertex pairs), let alone groups of more complex structures such as routes, flows or subgraphs representing virtual networks. With IP networks, geographic coordinates are not always even available. One can move to more abstract topological representations, but the standard visualization techniques rapidly become unusable as the data grows, even though a large display delays this somewhat. This points to the need for additional work in display metaphors and interaction techniques for large graphs, one of which is described next.
Graph Surfaces
Graph surfaces [1] provide a metaphor that unifies visualization and computation on weighted multidigraphs. They are endowed with a collection of “natural’’ operations to provide hierarchical browsing. Mapping a graph to a hierarchy of surfaces gives flexibility in the handling of the I/O and screen bottlenecks. Graph surfaces can be updated incrementally. They are suitable for the maintenance and navigation of external memory graphs [2] whose vertex sets are hierarchically labeled.

Figure 3: A graph surface.

What is a Hierarchical Graph Surface?
The main idea is to view a weighted multidigraph as a discretization of a two dimensional surface in 3space. Under this view and for a fixed ordering of the vertex set, the corresponding rectangular domain is triangulated and each point is lifted to its correct height. This provides a piecewise linear continuous function forming a polyhedral terrain. The terrain is used as an approximation to a surface representing a multidigraph. An example is shown in Figure 3.
In order to handle very large graphs, a hierarchy of surfaces is constructed. Each of them represents a multidigraph obtained from an equivalence relation defined on the edge set of the input graph. Operations are provided that allow the user to navigate the surfaces.
Construction of a Hierarchical Graph Surface
To simplify the exposition, we concentrate on multidigraphs. (The adaptation to weighted multidigraphs is straightforward.) A multidigraph is a triplet G = (V, E, m), where V = V(G) is the set of vertices, E = E(G) is the set of edges, and m is a function from E to the nonnegative integers giving the edge multiplicity. For a rooted tree T, we let height (T) be the maximum distance from a vertex to the root of T and let T(i) be the set of vertices of T at distance i from the root of T.
Given a multidigraph G = (V, E, m) and a rooted tree T such that the set of leaves of T equals V(G), the islice of G is the multidigraph with vertex set T(i) and with a multiedge (p,q) defined to “represent” the collection of edges in G running from the subtree rooted at p to the subtree rooted at q. The multiplicity of the edge (p,q) is the sum of the multiplicities of the edges (x,y) that it represents.
Similar to the approach of Duncan et al [3], we construct the hierarchical view of G given by T, H(G,T) as T plus the collection of islices. The novelty now is that each slice is represented as a surface and T is used as a road map to move from surface to surface. We describe next the main navigation operations.

Figure 4: Graph surface navigation.

Navigating the Surface Hierarchy
Given a multiedge e = (u,v, m(u,v)) in the islice of G, the operation ZoomIn(e) deletes e from H(G,T), returns the set of edges in the (i+1)slice that are represented by e, displays the corresponding surface and registers in a navigation log file information about this operation This operation is exhibited in Figure 4.
The operation ZoomOut takes as input a subgraph of the (i+1)slice that is registered in the navigation log file, deletes the corresponding surface and inserts back the corresponding representative edge e.
Handling the I/O and Screen Bottlenecks
When G is an external memory graph residing on disk there are three cases to consider:
 T fits in main memory
 T does not fit but V(G) does
 V(G) does not fit
In the first case, the edges of G are read in blocks and each one is filtered up, through the levels of T, until it lands in its final slice. This can be achieved with one pass over the data. The second and third case are handled by setting up a multilevel external memory index to represent T. Filtering the edges now requires several passes over the data. The I/O performance depends strictly on the I/O efficiency of the access structure.
The visualization of a data set that is several orders of magnitude more than the available screen resolution calls for some form of hierarchical graph decomposition. Graph surfaces by definition provide a geometric view of a very large graph in a manner that is amenable to hierarchical browsing via the ZoomIn and ZoomOut operations. This makes it feasible to explore deeper hierarchy elements at a finer level of detail. Also, a graph surface provides both a uniform global and local view.
Implementation
The graphical engine generates polyhedral terrains that correspond to individual edges in H(G,T). Initially, a suitable slice in H(G,T) is chosen as the root of the visual hierarchy, depending on the available screen resolution. The polyhedral terrains are generated with a triangulation algorithm and are displayed using several visual cues and dynamically generated labels. The graphical engine is implemented in C++ and uses the OpenGL standard library for the rendering portion. Currently, a mouse/keyboard interface is used. The use of joysticks and gestures to navigate the environment is currently under consideration.
Applications
Currently, graph surfaces are being used experimentally for the analysis of several large multidigraphs arising from the AT&T network. These graphs are collected incrementally. For example, in the calldetail multigraph, vertices correspond to phone numbers and edges represent a call from one number to another. The graph grows by daily increments of about 300 million edges, defined on a vertex set on the order of 300 million vertices. The aim is to process and visualize this type of multidigraph at a rate of at least a million “edges’’ per second.
Internet data is another prime example of a hierarchically labeled multidigraph that fits quite naturally the graph surfaces metaphor. Each islice represents traffic among the aggregate elements that lie at the ith level of the hierarchy. The navigation operations can be enhanced to perform a variety of statistical computations in an incremental manner. These in turn can be used to animate the traffic behavior through time.
When the vertices of the multidigraph have an underlying geographical location, they can be mapped into a linear order using, for example, PeanoHilbert curves, in order to maintain some degree of geographical proximity. In this way, the constructed surface maintains a certain degree of correlation with the underlying geography.
References
 Abello, J. and S. Krishnan. “Graph Surfaces,” Int. Conf. Industrial and Applied Math, Edinburgh, July 1999.
 Abello, J. and J. Vitter. “External Memory Algorithms,” AMSDIMACS series, volume 50, 1999.
 CruzNeira C., D. Sandin and T. DeFanti. “SurroundScreen ProjectionBased Virtual Reality: The Design and Implementation of the CAVE,” SIGGRAPH 1993 Conference Proceedings, pages 35142, 1993.
 Duncan C., M. Goodrich and S. Kobourov. “Balanced Aspect Ratio Trees and Their Use for Drawing Very Large Graphs,” Symposium on Graph Drawing GD’98, ed. S. Whitesides, Lecture Notes in Computer Graphics, 1547, pages 111124, 1998.
 Eades, P. and Q. W. Feng. “Multilevel Visualization of Clustered Graphs,” Symposium on Graph Drawing GD’96, ed. S. C. North, Lecture Notes in Computer Science 1190, pages 101112, 1997.
 Koutsofios E., S. North and D. Keim. “Visualizing Large Telecommunication Data Sets,”IEEE Computer Graphics and Applications, 19(3), pages 1619, 1999.
 Munzer, T. “Drawing Large Graphs with H3Viewer and Site Manager,” Symposium on Graph Drawing GD’98, ed. S. Whitesides, Lecture Notes in Computer Science,1547, pages 384393, 1998.
 Storey, M.A. and H. Muller. “Graph Layout Adjustment Strategies,” Symposium on Graph Drawing GD’95, ed. F.J. Brandenburg, Lecture Notes in Computer Science, 1027, pages 48799. 1996.
 Woodward, Paul et al. University of Minnesota PowerWall, 1998. See the website.
Bill Hibbard's research interests are interaction techniques, data models and distributed architectures for numerical visualization. He leads the SSEC Visualization Project and is primary author of the Vis5D and VisAD systems. He has degrees in mathematics and computer science from the University of Wisconsin  Madison.
