In 2002, right before beginning his studies as a computer science Ph.D. student at Stanford University, Klingner attended his first SIGGRAPH, an experience he describes as "overwhelming and humbling." As a newcomer to academic computer graphics, Klingner wanted a quick way to get an overview of the field, learn how its various subfields had evolved, and discover who its key contributors were. Motivated by the desire to grok graphics, in 2003 Klingner wrote a web spider that crawled the website of the ACM Digital Library, gathering information on every paper in the 30-year history of SIGGRAPH. In addition to downloading the papers themselves, his program extracted metadata such as the title, author list, keywords, and citations. "The fact that papers nowadays are indexed electronically gives you a lot of analytical power," Klingner noted -- writing and debugging his paper-gathering program took only a few days, while such a comprehensive analysis would have been prohibitively time-consuming in the pre-digital-library era.
Once he had gathered all 1,399 of the SIGGRAPH papers, from the first SIGGRAPH in 1974 up through last year's conference, Klingner was able to compile a set of informative statistics. In all, 2,204 distinct authors have contributed to the SIGGRAPH papers track, which, Klingner noted with amusement, is about as many passengers as sailed on the Titanic. He also compiled a list of "SIGGRAPH extremes", including:
- Longest Title: "Rendering Synthetic Objects into Real Scenes: Bridging Traditional and Image-Based Graphics with Global Illumination and High Dynamic Range Photography"
- Shortest Title: "Tint Fill"
- Most Cited Author: Hugues Hoppe (692 citations; note that this counts only citations by other papers that appeared in the ACM Digital Library, and does not take other conferences or journals into account)
- Most Collaborative Author: Pat Hanrahan (61 distinct co-authors)
- Longest SIGGRAPH Publishing Career: Tosiyasu L. Kunii (28 years, 1974 - 2001)
- Most Prolific Author: Pat Hanrahan (38 papers)
- Most Cited Paper: Lorensen and Cline. "Marching Cubes: A High Resolution 3D Surface Construction Algorithm" 1987. (237 citations by other SIGGRAPH papers)
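Extremes like the ones listed above can be derived from per-paper metadata with a few tallies. The following is a minimal sketch, not Klingner's program: the record layout (`title`, `year`, `authors`), the function name `extremes`, and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy records invented for illustration; they are not Klingner's data.
PAPERS = [
    {"title": "Tiny", "year": 1980, "authors": ["Ada"]},
    {"title": "A Much Longer Title About Rendering", "year": 1990,
     "authors": ["Ada", "Ben", "Cy"]},
    {"title": "Medium Title", "year": 1995, "authors": ["Ben", "Dee"]},
    {"title": "Another Paper", "year": 2003, "authors": ["Ada", "Dee", "Eli"]},
]


def extremes(papers):
    """Compute a few 'extremes' from a list of paper records."""
    paper_count = defaultdict(int)   # author -> number of papers
    coauthors = defaultdict(set)     # author -> distinct co-authors
    years = defaultdict(list)        # author -> publication years

    for paper in papers:
        for author in paper["authors"]:
            paper_count[author] += 1
            coauthors[author].update(a for a in paper["authors"] if a != author)
            years[author].append(paper["year"])

    titles = [paper["title"] for paper in papers]
    collab = {a: len(c) for a, c in coauthors.items()}
    # Inclusive span, matching the article's "28 years, 1974 - 2001".
    career = {a: max(ys) - min(ys) + 1 for a, ys in years.items()}

    def top(table):
        """Return the (key, value) pair with the largest value."""
        return max(table.items(), key=lambda item: item[1])

    return {
        "longest_title": max(titles, key=len),
        "shortest_title": min(titles, key=len),
        "most_prolific": top(paper_count),
        "most_collaborative": top(collab),
        "longest_career": top(career),
    }


if __name__ == "__main__":
    for name, value in extremes(PAPERS).items():
        print(f"{name}: {value}")
```

Ties are broken arbitrarily here (the first maximum encountered wins); a real analysis would probably report all tied entries.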
Klingner also used the data visualization tool Tableau to construct graphical depictions of SIGGRAPH trends over time. One of the most interesting trends revealed by these graphs is the change in the number of authors per paper. In the early days of SIGGRAPH, one- and two-author papers were quite common, but over the years the average number of authors has steadily increased. In 2003, there were only two single-author papers, and this year there is only one. Klingner speculates that this could reflect the increasing competitiveness of the SIGGRAPH conference -- because such high quality is required of the papers, a contribution often requires collaboration.
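The authors-per-paper trend described above amounts to grouping papers by year and summarizing the author counts. Klingner used Tableau for this; the sketch below is only a plain-Python stand-in, and the function name `authors_per_year` and the `(year, authors)` input shape are assumptions.

```python
from collections import defaultdict


def authors_per_year(papers):
    """Map each year to (mean authors per paper, single-author paper count).

    `papers` is an iterable of (year, list_of_authors) pairs.
    """
    sizes = defaultdict(list)
    for year, authors in papers:
        sizes[year].append(len(authors))
    return {
        year: (sum(counts) / len(counts), counts.count(1))
        for year, counts in sorted(sizes.items())
    }


if __name__ == "__main__":
    # Toy input invented for illustration.
    toy = [
        (1975, ["A"]),
        (1975, ["B", "C"]),
        (2004, ["D", "E", "F"]),
        (2004, ["G"]),
        (2004, ["H", "I", "J", "K"]),
    ]
    for year, (mean, singles) in authors_per_year(toy).items():
        print(year, round(mean, 2), singles)
```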
In addition to compiling overview statistics and graphs of SIGGRAPH's academic history, Klingner created an interactive visualization tool that lets people explore the relationships between SIGGRAPH authors and papers. In the authors visualization, each paper author is represented as a node in a graph, and edges connect authors who have collaborated on a SIGGRAPH paper (meaning they appeared as co-authors). Klingner notes that he was surprised to learn how highly interconnected the academic graphics community is. The connections revealed by his visualization reflect organizational structures (such as universities and research labs), but also show a lot of cross-organizational collaboration. For instance, his tool reveals a highly connected network of collaborating authors who are all from the University of Washington and another highly connected group from Microsoft Research, but there is also a very high number of connections between those two groups, reflecting the high level of cooperation that results from their geographic proximity. Other linkages between separate institutions are often formed when a student who received a Ph.D. at one institution becomes a professor elsewhere.
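The co-authorship graph described above can be sketched as an undirected adjacency map, with a breadth-first search to find groups of authors linked by chains of collaboration. This is an illustrative sketch under assumed names (`coauthor_graph`, `components`) and toy data; it says nothing about how Klingner's tool is implemented.

```python
from collections import deque
from itertools import combinations


def coauthor_graph(author_lists):
    """Build an undirected graph: author -> set of co-authors."""
    graph = {}
    for authors in author_lists:
        for author in authors:
            # Solo authors still appear, as isolated nodes.
            graph.setdefault(author, set())
        for a, b in combinations(sorted(set(authors)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Return the connected components of the graph as a list of sets."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), set()
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        result.append(group)
    return result


if __name__ == "__main__":
    # Toy author lists invented for illustration.
    toy = [["Ann", "Bob"], ["Bob", "Cat"], ["Dan"], ["Eve", "Fay"]]
    graph = coauthor_graph(toy)
    print(sorted(graph["Bob"]))
    print(sorted(len(group) for group in components(graph)))
```

A single large component containing most authors would be one way to quantify the "highly interconnected" impression the visualization gives.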
Klingner's visualization tool can also be used to explore connections between papers. In this mode, each SIGGRAPH paper is represented as a node in the graph, and edges connect two papers if one references the other. This mode is useful for identifying key papers that are frequently referenced. The clusterings created in this view tend to indicate sub-areas within graphics, such as rendering or computer vision.
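In graph terms, a frequently referenced paper is a node with a high in-degree in the citation graph. A minimal sketch, assuming citations are available as `(citing_id, cited_id)` pairs; the identifiers and the function name `most_cited` are hypothetical.

```python
from collections import Counter


def most_cited(citations, top=3):
    """Rank papers by in-degree in the citation graph.

    `citations` is an iterable of (citing_id, cited_id) pairs. Duplicate
    pairs and self-citations are ignored.
    """
    unique_edges = set(citations)
    counts = Counter(cited for citing, cited in unique_edges if citing != cited)
    return counts.most_common(top)


if __name__ == "__main__":
    # Toy edges invented for illustration.
    toy = [
        ("p3", "p1"), ("p4", "p1"), ("p5", "p1"),
        ("p4", "p2"), ("p5", "p2"), ("p5", "p2"),  # last pair is a duplicate
    ]
    print(most_cited(toy))
```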
When this year's papers chair, Joe Marks, heard about Klingner's project, he invited Klingner to attend the papers committee's meeting this past spring at the University of Washington. Presenting his statistics to a set of SIGGRAPH veterans provided new insights. For instance, Klingner had created a graph showing how "important" certain SIGGRAPH conferences were, based on how often papers from each year were referenced. When the committee saw peaks in this graph, they were able to explain them in terms of important events in graphics history, such as the large number of citations to the series of conferences in which image-based rendering was introduced. The committee also offered Klingner suggestions on additional features that would be interesting to add to his analysis. For instance, by analyzing the full text of each paper, he could do automatic summarization and keyword extraction, which he could use to further analyze the splitting off of graphics sub-fields.
Although Klingner isn't actively exploring the SIGGRAPH history data anymore, it proved a valuable exercise, both in helping him start his Ph.D. thesis work in the area of interactive visualization and in helping him gain a more holistic view of the relationships between SIGGRAPH papers and authors.
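The per-year "importance" measure mentioned above can be approximated by tallying, for each conference year, how many citations its papers received. The sketch below assumes a `pub_year` mapping from paper id to publication year and a list of `(citing_id, cited_id)` pairs; both shapes and the function name `citations_by_year` are assumptions, and the article does not say how Klingner computed his graph.

```python
from collections import Counter


def citations_by_year(pub_year, citations):
    """Count citations received by the papers of each publication year.

    `pub_year` maps paper id -> year; `citations` is an iterable of
    (citing_id, cited_id) pairs. Citations to unknown papers are skipped.
    """
    tally = Counter()
    for _citing, cited in citations:
        if cited in pub_year:
            tally[pub_year[cited]] += 1
    return dict(sorted(tally.items()))


if __name__ == "__main__":
    # Toy data invented for illustration.
    years = {"a": 1987, "b": 1987, "c": 1996, "d": 2001}
    edges = [("c", "a"), ("d", "a"), ("d", "b"), ("d", "c"), ("d", "x")]
    print(citations_by_year(years, edges))
```

Raw counts like these favor older years, whose papers have had longer to accumulate citations, so some normalization would likely be needed before comparing peaks.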
One of the graphs Klingner prepared using Tableau -- this one shows the decline in the number of single-author papers over SIGGRAPH's 30-year history.
--image courtesy of Klingner