Rules and Principles of Scientific Data Visualization

Department of Electrical Engineering and Computer Science
The George Washington University
Washington, D. C. 20052


This report provides a set of rules and principles for scientific data visualization. These rules and principles have been acquired through informal discussions with data visualization experts and surveys of existing literature on graphics, data visualization, visual perception, exploratory data analysis, psychology, and human-computer interaction. Even though far from being complete and extensive, the set provided in this report forms a starting point for designing effective scientific data visualization techniques. Using these rules and principles, we are currently developing a visualization tool assistant (VISTA) which will advise scientists and engineers, who are not visualization experts, in selecting and creating effective data visualizations.

1. Introduction

One of the most important issues in scientific data visualization is mapping attributes of data into graphical primitives which effectively convey the informational content of data. In general, this mapping defines an abstract visualization technique for the given data. However, there are several possible mappings which may lead to different visualization technique designs. Selecting and creating the most effective design among all the alternatives for a given situation usually requires considerable knowledge and creativity on the part of the visualization technique designer. While the knowledge about characteristics of data, such as types, units, scales, and spacing among measurement points, as well as graphical primitives, which eventually compose a design, is important in constructing visualization techniques, the knowledge about comprehensibility of the resulting image is essential for effective presentation of the information inherent in the data.

Usually, the latter type of knowledge takes the form of heuristic rules and principles that are acquired through experience and experimentation. These heuristics also form the basis for the creativity of the visualization technique designer. In this report, a set of such heuristic rules and principles that have been developed by several researchers over the years is presented. The set is a preliminary result of our ongoing research for designing a knowledge-based advisory system for scientific data visualization. Although a considerable subset of the rules and principles presented in this report is concerned with multi-dimensional data visualization, rules and principles concerning two-dimensional graphics constitute the larger portion of the set. This is because multi-dimensional data visualization is still an emerging field in which the techniques have not yet been evaluated extensively. As the field matures, there will undoubtedly be several other rules and principles that can be added to the set presented in this report. In addition to being incomplete, some of the rules and principles in this set are somewhat controversial and conflicting due to the lack of formal evaluation criteria in their development. These rules and principles are intentionally included in this report to reflect different views of experts in the field.

The report is organized into nine sections. Following this introductory section, several definitions that are essential in the rest of the report are given in Sections 2-4. These definitions are related to attributes of data, marks that constitute visualization techniques, and primitive visualization techniques each of which can encode one dependent and up to four independent variables. Sections 5-8 present several rules and principles that pertain to expressiveness and effectiveness of primitive visualization techniques, use of referential components that aid in understanding the meaning of data visualizations, and other issues, such as handling peculiarities in data, scaling, image sequencing, and depth perception, that are not covered by the preceding sections. In Sections 5-8, a reference is given for each rule and principle to indicate the source of acquisition whenever it is possible. The final section presents our concluding remarks and re-emphasizes the need for formal empirical studies to identify additional rules and principles that would further our ability to create more effective visualization techniques.

2. Attributes of Data

Scientific and engineering data can broadly be classified into two groups: qualitative and quantitative. Qualitative data is further subdivided into two groups: nominal and ordinal. Nominal data types are unordered collections of symbolic names without units. For instance, the names of planets, Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune, and Pluto, form a nominal data set. Ordinal data types are rank ordered only, where the actual magnitudes of differences are not reflected in the ordering itself. A typical example for an ordinal data set is the names of calendar months, January through December. Compared to qualitative data, quantitative data is more common in all scientific and engineering disciplines. Quantitative data is typically classified along two dimensions: (1) based on the number of components which make up the quantity, and (2) based on the scales of values. Along the first dimension, quantitative data can be scalar, vector, or tensor. Scalar data types possess a magnitude, but no directional information other than a sign. They are simply defined as single numerals.

Vectors have both direction and magnitude. Quantitatively, their mathematical representation requires a number (equal to the dimensionality of the coordinate system) of scalar components. In general, a vector is a unified entity. This implies that the problem of visualization of vector fields is not equivalent to the problem of displaying independent, multi-variate scalar fields. The number of components which specify a tensor depends on the dimensionality of the coordinate system and the order of the tensor. If d is the dimensionality of the coordinate system, then an nth-order tensor requires d^n scalar values to specify its components. Both scalars and vectors are special instances of tensors such that scalars are zeroth-order and vectors are first-order tensors. In a three-dimensional coordinate system (d = 3), a scalar (n = 0) requires one value; a vector (n = 1) requires three values; and a second-order tensor (n = 2) requires nine values to define their components (Haber, 1988). Along the other dimension, quantitative data can be classified as interval, ratio, and absolute (Kosslyn, Pinker, Simcox & Parkin, 1983). Interval data scales preserve the actual quantitative difference between values (such as Fahrenheit degrees), but do not have a natural zero point. Ratio data scales are like interval scales but they do have a natural zero and can be defined in terms of arbitrary units. For instance, two hundred dollars is twice as much as one hundred dollars. Absolute data scales are also ratio scales which are well-defined in terms of non-arbitrary units, such as inches, feet, and yards.
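The tensor component count stated in this section (d raised to the power n) can be checked with a few lines of Python. This is only a minimal sketch; the function name tensor_component_count is ours and does not appear in the cited sources.

```python
def tensor_component_count(d: int, n: int) -> int:
    """Number of scalar components of an nth-order tensor in d dimensions."""
    if d < 1 or n < 0:
        raise ValueError("d must be >= 1 and n must be >= 0")
    return d ** n


# In three dimensions: scalar (n = 0), vector (n = 1), second-order tensor (n = 2).
counts = [tensor_component_count(3, n) for n in range(3)]
print(counts)  # [1, 3, 9]
```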

Other important attributes of data that affect the selection of visualization primitives include functional dependencies among data variables, spacing between sampling points, cardinality of the data set, upper and lower bounds of values, units of measurement, coordinate system, scale, and continuity of data.

3. Marks

A mark is the most primitive component that can encode some useful information in any data visualization. In general, marks can be classified as simple or compound. Simple mark types include points, lines, areas, and volumes. A point has a single conceptual center that can indicate a meaningful position, a line has a conceptual spine that can indicate a meaningful length or connection, an area has a single conceptual interior that can indicate a meaningful region or cluster of marks, and a volume has a single conceptual interior that can indicate meaningful space in three dimensions. Of the four simple mark types, the first three are identified by Bertin (Bertin, 1983) as being the most primitive components of two dimensional graphics and used by Mackinlay (Mackinlay, 1986) to automate design of graphical presentations. The fourth mark type, volume, is a natural extension of Bertin's classification to three dimensions and appropriate when the third dimension can be perceived effectively. A compound mark is a collection of simple marks which form a single perceptual unit. Contour lines, wire meshes, glyphs, arrows, flow ribbons, and particles are all compound marks. A useful analogy is that simple marks are like letters in the alphabet, whereas compound marks are like words in a dictionary.

Information inherent in data can be encoded in an image by varying the positional, temporal, and retinal properties of its marks (Bertin, 1983). A positional encoding of information is a variation of the positions of the marks in the image. A temporal encoding of information is a variation of the mark properties over time. A retinal encoding of information is any variation of the "retinal" properties of the marks, that is, the properties that the retina of the eye is sensitive to independent of the position of the marks. The retinal properties are size, texture, orientation, shape, transparency, and the three dimensions of color, namely hue, saturation, and brightness. While size, texture, orientation, shape, hue, and saturation were described by Bertin as being the only retinal properties of marks in two-dimensional graphics rendered on paper media, the remaining retinal properties, brightness and transparency, can also be used to encode information on modern graphics hardware.

The marks can further be classified as to whether they represent single or multiple data variables and single or multiple data points. A single variable (SV) mark is associated with one variable, whereas a multiple variable (MV) mark is associated with several variables. A single data (SD) mark conveys a single value for a single data point, whereas a multiple data (MD) mark shows a range of summary information regarding the local distribution of several data points. This classification is particularly useful when visualizing large multi-variate data sets. It defines four groups of marks, SVSD, SVMD, MVSD, and MVMD as shown in Figure 1.

Figure 1. Classification of marks for large multi-variate data sets.
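The four mark groups follow mechanically from two counts: how many variables and how many data points a mark represents. The sketch below illustrates that reading of the classification; the function name classify_mark and its parameters are hypothetical and chosen only for illustration.

```python
def classify_mark(num_variables: int, num_data_points: int) -> str:
    """Return SVSD, SVMD, MVSD, or MVMD for a mark.

    SV/MV: the mark is associated with one / several variables.
    SD/MD: the mark conveys one data point / summarizes several data points.
    """
    if num_variables < 1 or num_data_points < 1:
        raise ValueError("a mark needs at least one variable and one data point")
    variable_part = "SV" if num_variables == 1 else "MV"
    data_part = "SD" if num_data_points == 1 else "MD"
    return variable_part + data_part


print(classify_mark(1, 1))    # SVSD
print(classify_mark(5, 200))  # MVMD
```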

4. Primitive Visualization Techniques

Primitive visualization techniques are those that can encode one dependent and up to four independent variables. Additional variables (dependent or independent) that may exist in a given data set can further be encoded by manipulating retinal properties of marks of primitive visualization techniques or equivalently composing two or more primitive visualization techniques into a single design. In general, the set of primitive visualization techniques are classified into three categories, that is, positional, temporal and retinal, depending on which properties of marks the techniques primarily manipulate. Positional techniques can be one, two, and three dimensional such as single axis, contour plot, and surface diagram, respectively. There is only one temporal technique, namely, animation, and one retinal technique for each retinal property associated with marks. Following the analogy made previously between simple or compound marks and letters or words, the primitive visualization techniques may be viewed as forming simple sentences in a language.

The set of primitive visualization techniques that is defined in this report includes the following:

single axis        isosurface            texture
line plot          volumetric            orientation
scatter plot       arrow plot            shape
bar plot           particle advection    transparency
histogram          flow ribbons          hue
contour plot       deformation           saturation
pseudo-color       animation             brightness
surface diagram                          size

Although some of these techniques may be considered to be compositions of others, for instance, an arrow plot may be viewed as a composition of a scatter plot and a shape, where the scatter points have an arrow shape, it is more appropriate to include them in the set of primitive visualization techniques rather than construct them in terms of other primitives. The set also contains techniques that manipulate more than one of the positional, temporal, and retinal properties of marks. For instance, a pseudo-color image is basically a positional technique, which also uses one or more of the color parameters, hue, saturation, and brightness.

5. Expressiveness of Visualization Techniques

Expressiveness criteria identify graphical languages capable of expressing the desired information (Mackinlay,1986). A set of facts is expressible in a graphical language if the language contains a visualization technique that encodes every fact in the set and does not encode any additional facts. In what follows, rules describing the expressiveness criteria for primitive visualization techniques and mark classes are presented.


Rule: A line plot can only be used to express a continuous function. (Mackinlay,1986)

Rule: A line plot is appropriate if the cardinality of the variable X that is encoded by the horizontal axis is small, and there are several dependent measures that behave in different ways with respect to the variable X. A multiple line plot will translate those differences into salient nonparallel lines [assuming same axes are used]. (Kosslyn, Pinker, Simcox & Parkin,1983)

Rule: A line plot is mandated for nominal data when a) the plot is standardized so the left-to-right order of nominal scale values along the horizontal axis is non-arbitrary in the sense that it is the same for everyone, and b) various patterns of values of the vertical scale with respect to the horizontal axis must be differentiated. In these cases, a line plot allows each pattern to be represented as a line with a different, universally recognizable contour -- again sparing the reader from having to undertake a cognitively costly element-by-element comparison. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If the values of the variable associated with the horizontal axis are spaced irregularly, the transitional case cannot be handled with line plots. The misleading features of connecting line segments of widely varying lengths are subtle and treacherous. (Tukey,1988)


Rule: If the values of the variable associated with the horizontal axis are coarsely spaced, we have little chance to see the features that concern us with a line chart regardless of whether the spacing is regular or irregular. (Tukey,1988)

Rule: If a plot represents a point cloud [where x and y enter on an equal footing and the main message is a pattern of distribution], it is most unlikely that linking up the points into a chain is either sensible or useful. The information sought is likely to be conveyed in terms of the general pattern and extent of the cloud. (Tukey,1988)

Rule: For a scatter plot [where dependence of y upon value of x is the focus of the plot], it is unlikely that linking up into a chain is sensible and useful. What we are likely to want from a scatter plot are indications of tilting, or arching up or sagging down, or of horizontal wedging. (Tukey,1988)

Rule: If linear relationships among variables in a set of multidimensional data are relevant, it is unlikely to find them using [glyphs such as] stars, trees, or other types of symbolic representations. Carefully selected scatter plots are likely to be more informative. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: [Do not use line plots] if the intensity (amplitude) of quick wiggles (short waves) is comparable to that of medium wiggles (medium waves); linking up the points tends to divert our attention to the quick wiggles so as to be often quite misleading. (Tukey, 1988)

Rule: [Do not use line plots] if wild points are at all common. Connecting up points will almost surely lead the eye and brain to focus on these wild points to the exclusion of other features, which is often exactly what we do not want to happen. [Instead, use a scatter plot]. (Tukey,1988)

Rule: When we enhance a scatter plot with smooth curves, we should keep in mind the principle of proper visual impact. If the points are the real message and the curve is only a gentle guide, the curve should probably be drawn lightly so that it does not dominate. If the curve is the message, it should be heavy and the points light. (Chambers, Cleveland, Kleiner & Tukey, 1983)


Rule: Contour plots require some effort on the part of the viewer to establish quantitative relations between different contour levels - it is not always obvious whether a local extremum is a minimum or a maximum. (Haber,1988)

Rule: If a logical progression is used in mapping the scalar value of each contour to the color of the curves, then it becomes easy to determine whether the scalar value is increasing or decreasing between adjacent contours. (Haber, 1988)

Rule: Contours can represent slopes or breaks in distribution of [quantitative components]. (Bertin,1983)

Rule: Contours [are used to determine] the main lines of a distribution. (Bertin,1983)

Rule: Contours cannot carry out quantitative comparisons. For example, it would be difficult to answer the question "what is the temperature difference between readings taken at location x and location y?" (Bertin, 1983)

Rule: Contours cannot represent absolute quantities calculated from the several data points. (Bertin,1983)

Rule: Contours cannot represent a sparse sample. (Bertin, 1983)

Rule: Contours show greater numeric precision than is possible with pseudo-color images. (Treinish,1989)

Rule: Contours can only represent quantitative components and should be spaced uniformly. (Bertin, 1983)

Rule: Mapping the contours into the third geometric dimension makes the quantitative relation between the contours immediately apparent. [The third dimension is used to define the values associated with the contours.] (Haber,1988)

Rule: Contours do not indicate sense of slope, but can be enhanced to indicate direction of increase or decrease [of ordered values] by shading. (Bertin,1983)


Rule: Pseudo coloring contributes artifacts to the data not originally present in it. In particular, a pseudocolor table without smooth transitions between its colors introduces aliasing between areas of sharply distinct colors. (Smith, 1987)

Rule: If a step-function is used for the mapping, then a banded display is generated with the edges between bands corresponding to contours of constant value. (Haber,1988)
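The banding effect of a step-function mapping can be sketched in a few lines. The example assumes equal-width bands between a minimum and a maximum value; that choice, and the names band_index and band_edges, are ours and are not prescribed by Haber.

```python
def band_index(value: float, vmin: float, vmax: float, num_bands: int) -> int:
    """Map a scalar to one of num_bands equal-width bands (a step function)."""
    if num_bands < 1 or vmax <= vmin:
        raise ValueError("need num_bands >= 1 and vmax > vmin")
    # Clamp out-of-range values so they fall into the first or last band.
    clamped = min(max(value, vmin), vmax)
    t = (clamped - vmin) / (vmax - vmin)
    return min(int(t * num_bands), num_bands - 1)


def band_edges(vmin: float, vmax: float, num_bands: int) -> list:
    """Interior band boundaries; each one is a contour of constant value."""
    return [vmin + k * (vmax - vmin) / num_bands for k in range(1, num_bands)]


print(band_edges(0, 10, 5))  # [2.0, 4.0, 6.0, 8.0]
```

Assigning one color per band index then yields the banded display described above, with color changes occurring exactly at the values returned by band_edges.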


Rule: Continuous mapping of the scalar field into the third dimension produces a continuous potential surface representation. This approach requires a sophisticated rendering process, with hidden-surface removal and a lighting model. (Haber,1988)

Rule: The two-dimensional methods, such as, contouring and pseudo-color, can generally be applied to simple three-dimensional surfaces. (Haber,1988)

Rule: Contours and pseudo-color work well when applied to recognizable geometries, but become difficult to interpret if the surface geometry is unfamiliar or very complicated. Visualization must include sufficient perceptual cues to reveal the three-dimensional form of the surface. (Haber,1988)

Rule: [If the surface geometry is unfamiliar or very complicated] superimpose a regular grid on the surface to provide perspective depth cues and use a lighting model (shading and/or shadows) with a pseudo-color representation of the scalar field. [The color represents the scalar value and the shaded/shadowed intensity provides information about the three-dimensional geometry.] (Haber, 1988)


Rule: ...generate contours as isopotential surfaces in three dimensions...outermost contour surface will generally occlude all the others...problem can be mitigated by using transparency to allow different levels to be viewed simultaneously...image becomes difficult to interpret if more than a few contour levels are included in the display. (Haber, 1988)

The rendering operation requires...the surface normal distribution over the surface...the lighting...[delivers] our primary perceptual cues for understanding the geometry of the scalar field contour itself. (Haber,1988)

Rule: Another possibility is to continuously vary the contouring value under user interactive control. This produces an animation effect that allows the user to construct a "mental picture" of the three-dimensional scalar field. (Haber, 1988)

Rule: ...pass a cutting-plane through the body, and...use the simple two dimensional display methods to portray the scalar field on the resulting section. If the cutting plane can be moved through the body interactively, then the user can again construct a mental picture of the distribution over the entire domain. (Haber,1988)


Rule: Arrow plots are the most common method for displaying vector fields in two-dimensions. The vector field is sampled at a specified finite set of locations. At each location, an arrow is constructed...with its length and direction determined by the vector components. This is not truly a continuum representation, because only a finite set of points is sampled. If enough points are included, relative to the geometry of the domain and the gradients of the vector field, then the viewer can mentally interpolate between the sample points to fill in the continuum field. (Haber,1988)

Rule: Arrow plots provide a direct pictorial representation of the vector field itself and tend to be application-independent. (Haber, 1988)

Rule: [If there is a wide range of magnitudes in the vector field], let each arrow represent the local direction of the vector field, but make all arrows the same length. Let such attributes as color, line weight, or the vertical projection of the arrow be used to represent the vector magnitude. (Haber,1988)
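Separating direction from magnitude, as the preceding rule suggests, amounts to normalizing each vector and keeping its length as a separate value to be encoded by color or line weight. The following is a minimal two-dimensional sketch; the function name direction_and_magnitude is hypothetical.

```python
import math


def direction_and_magnitude(vx: float, vy: float):
    """Split a 2D vector into a unit direction and a scalar magnitude.

    The unit direction is drawn as a fixed-length arrow; the magnitude is
    left for a retinal property such as color or line weight.
    """
    magnitude = math.hypot(vx, vy)
    if magnitude == 0.0:
        # A zero vector has no direction; return a zero-length arrow.
        return (0.0, 0.0), 0.0
    return (vx / magnitude, vy / magnitude), magnitude


direction, magnitude = direction_and_magnitude(3.0, 4.0)
print(magnitude)  # 5.0
```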


Rule: The indirect methods show some physical result of a vector field, and are closely tied to the specific physical meaning of the vector field. (Haber,1988)

Rule: Coherence of the undeformed and deformed geometries is key to the success of this method. For example, this method is practically useless for describing the displacement field of a turbulent fluid flow. The displacement field is not drawn explicitly, but it is the difference between two geometries. There must be considerable geometric coherence between the two geometries for this understanding...The ability of the viewer's perceptual system to detect coherent changes in a scene as motion enhances the understanding of the displacement field. Color, transparency and other display techniques can be used to differentiate the two images. (Haber, 1988)


Rule: Particle advection does not necessarily show twisting in the field. If it is desirable to display twisting effects, use flow ribbons. (Shirley & Neeman, 1989)



               Nominal    Ordinal    Quantitative
Size              -          •            •
Saturation        -          •            •
Texture           •          •
Hue               •          *
Orientation       •
Shape             •

Figure 2. The expressiveness of retinal techniques. The - indicates that size and saturation should not be used for nominal measurements because they will probably be perceived as ordered. The * indicates that the full color spectrum is not ordered. However, parts of the color spectrum are ordinally perceived. (Mackinlay,1986; Ware & Beatty,1985)
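Figure 2 can be treated as a lookup table when selecting a retinal technique for a given measurement type. The sketch below encodes one reading of the figure; it assumes that the marks in each row fill the columns from left to right, and it records hue as nominal only because the starred ordinal entry holds for parts of the spectrum rather than all of it. The names EXPRESSIVENESS and expressive_techniques are ours.

```python
# Measurement types each retinal technique can express, as read from Figure 2.
# Assumption: marks in each row fill the columns left to right.
EXPRESSIVENESS = {
    "size": {"ordinal", "quantitative"},
    "saturation": {"ordinal", "quantitative"},
    "texture": {"nominal", "ordinal"},
    "hue": {"nominal"},  # ordinal only over parts of the spectrum (the * entry)
    "orientation": {"nominal"},
    "shape": {"nominal"},
}


def expressive_techniques(measurement: str) -> list:
    """Retinal techniques that can express the given measurement type."""
    return sorted(
        name for name, kinds in EXPRESSIVENESS.items() if measurement in kinds
    )


print(expressive_techniques("nominal"))       # ['hue', 'orientation', 'shape', 'texture']
print(expressive_techniques("quantitative"))  # ['saturation', 'size']
```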

Rule: A variation in position, texture, color, orientation, or shape is associative, that is, it permits the immediate grouping of all the marks belonging to the same [nominal] category. (Bertin, 1983)

Rule: If a single qualitative variable is to be represented by symbols, the different levels of the variable should be portrayed by symbols that are graphically distinct. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: Mapping different scalar fields to the red, green and blue channels of a display system might not be as easy to interpret as using alternative color spaces such as hue, saturation and value. The latter color properties are easier for our perceptual systems to separate. (Haber, 1988)
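As an illustration of the preceding rule, two scalar fields can be encoded as hue and value and converted to display RGB only at the last step. The sketch uses the standard-library colorsys module. The function name fields_to_rgb, the normalization of both fields to [0, 1], and the restriction of hue to the blue-to-red portion of the hue circle are assumptions made for this example.

```python
import colorsys


def fields_to_rgb(hue_field: float, value_field: float,
                  saturation: float = 1.0, hue_span: float = 2.0 / 3.0):
    """Encode two scalars, each normalized to [0, 1], as hue and value.

    hue_field = 0 maps to blue (hue 2/3) and hue_field = 1 maps to red (hue 0),
    so only part of the hue circle is used. value_field controls brightness.
    Returns an (r, g, b) tuple with components in [0, 1].
    """
    hue = (1.0 - hue_field) * hue_span
    return colorsys.hsv_to_rgb(hue, saturation, value_field)


print(fields_to_rgb(1.0, 1.0))  # (1.0, 0.0, 0.0)
```

Because the two fields vary independent perceptual properties, a viewer can attend to hue and brightness separately, which is the point of the rule.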

Rule: Use color for scalar values. ...For scalar [nominal] values not associated with a vector magnitude, representation should be color. (Ellson & Cox, 1988)

Rule: If no one element is more important, avoid using hues of different brightness or saturations for the different [nominal] elements. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: A variation in position, size, value [gray scale saturation], or texture is ordered [ordinal], that is, it imposes an order which is universal and immediately perceptible. (Bertin, 1983)

Rule: If color is used, do not use values from the entire color scale to represent quantitative values (color does not fall perceptually along a single continuum). (Kosslyn, Pinker, Simcox & Parkin,1983)

Rule: A shape variation cannot represent an ordered component. (Bertin, 1983)

Rule: A variation in position or size is quantitative, that is, the visual distance between two categories of this variable can be immediately expressed by an [approximate] numerical ratio. (Bertin, 1983)

Rule: A value [gray scale saturation] variation is capable of representing an ordered component. (Bertin, 1983)

Rule: Transparency allows areas of high interest to be spotted quickly by looking through areas of low interest.


Rule: If distributional information is needed for one or more variables then use collective summary characters (MD-marks). (Tukey,1988)

Rule: When the data can assume negative values, profiles have one advantage over stars in that the base line of the profile can be made to represent zero and the profile allowed to dip below the base line. [Stars cannot represent negative values] (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: These methods [glyphs such as stars, trees, or other types of symbolic representations] are better suited for informal [qualitative] clustering and spotting peculiar points. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: When designing or choosing compound character scales, one must consider whether the scales are separable (that is, whether one can easily shift attention from one coded aspect to another), and whether the coded aspects are individually value-mergeable into impressions of regional trends. Furthermore, most of these scales will produce displays that are very sensitive to the order of the most to the least important variable. Finally, it will clearly be easier to use such displays to study the relationship of each least important variable to the most important variables encoded on the two axes than to study relationships among the least important variables. (Tukey, 1988)

6. Effectiveness of Visualization Techniques

Effectiveness criteria identify which visualization technique satisfying the expressiveness criteria is the most effective in a given situation at exploiting the capabilities of the output medium and the human visual system (Mackinlay, 1986). In this section, several rules pertaining to the effectiveness of various components of visualization techniques are presented. The rationale behind giving effectiveness rules for the components of visualization techniques, rather than the rules for the techniques themselves, is that composite visualization techniques can be built upon primitive ones by simply manipulating existing marks or adding others. Since these changes are made to encode additional information, it is more appropriate to use components which result in the most effective encoding.

Rule: The representation of numbers, as physically measured on the surface of the graphics itself, should be directly proportional to the numerical quantities represented. (Tufte, 1983)

Rule: We can perceive lengths and directions [slopes] of short line segments somewhat more quantitatively than darkness on a gray scale. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: A variation in position, size, value [gray scale variation], texture, or color is selective, that is, it enables us to isolate all marks belonging to the same category. (Bertin, 1983)

Rule: Quantitative components (absolute quantities, measurements, ratios, etc.) do not call for selective perception. (Bertin, 1983)

Rule: Position is highly selective. (Bertin, 1983)

Rule: Among all of these graphical aspects, it is location along an axis that has the most immediate visual impact. (Chambers,Cleveland,Kleiner & Tukey,1983)

Rule: Size tasks have severe stepping limitations...[and] size perception is not particularly selective. (Mackinlay,1986)

Rule: The difficulty of discriminating length is determined, in part, by the orientation of the lines...differential sensitivity is better for horizontal [or vertical lines] than oblique lines. [Demonstrates that length is perceived more accurately than orientation or slope] (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: The eye is able to perceive large objects with greater impact than small ones. (Chambers, Cleveland, Kleiner & Tukey,1983)

Rule: Differential sensitivity for area is 6.0 percent. This value implies that for differences in area to be detectable, the areas must differ by 6.0 percent or more. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If three-dimensional objects are projected onto two dimensions using a perspective view, remember that volumes and areas are not accurately read; avoid perspective views if you want to convey precise values. Also avoid sharply oblique projections, which distort quantities, or extra lines that turn 2D surfaces into 3D solid volumes if the extra lines have the potential to distract or group with other marks. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: Most people compare small perspective views of three dimensional objects on the basis of the area enclosed and not by the actual volume implied. Therefore, it should not be attempted to employ perspective projections with the expectation that readers will perceive differences in volume veridically. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: Color hue is quite effective for nominal information, because the perception of color is highly selective and many nominal values can be accurately distinguished. (Mackinlay, 1986)

Rule: Brightness is the most important color parameter for distinguishing objects; humans can detect brightness differences much more easily than hue or saturation differences. (Meier,1988)

Rule: Saturation may be affected by the size of a colored figure, with greater exponents for smaller areas. The same color placed in a smaller area appears "denser" and hence, more saturated. Thus, slight differences in colorimetric purity may be required to make two figures of the same hue but different sizes appear equal in saturation. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: The eye is able to perceive dark objects with greater impact than light ones. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: Because they do have a natural visual hierarchy, varying shades of gray show varying quantities better than color. (Tufte, 1983)

Rule: If shading is used, make sure differences in shading line up with the values being represented. The lightest ("unfilled") regions represent "less", and darkest ("most filled") regions represent "more". (Kosslyn, Pinker, Simcox & Parkin,1983)
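The shading convention in the preceding rule (lighter means less, darker means more) corresponds to an inverted linear gray ramp. The sketch below is one simple way to realize it; the linear ramp and the name gray_level are assumptions, not part of the cited rule.

```python
def gray_level(value: float, vmin: float, vmax: float) -> float:
    """Return a gray level in [0, 1], where 1.0 is white and 0.0 is black.

    Smaller values map to lighter grays and larger values to darker grays.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    clamped = min(max(value, vmin), vmax)
    fraction = (clamped - vmin) / (vmax - vmin)
    return 1.0 - fraction


print(gray_level(0, 0, 10))   # 1.0 (lightest: "less")
print(gray_level(10, 0, 10))  # 0.0 (darkest: "more")
```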

Rule: Changes in color hue are more appropriate for nominal encoding.

Rule: Texture perception is selective and has a moderate stepping limitation...the major difficulty with texture is that it can have vibratory effects...(Bertin,1983)

Rule: The eye is able to perceive simple patterns more quickly than complex ones (Chambers, Cleveland, Kleiner & Tukey,1983)

Rule: Shape can only be used for nominal distinctions, shape is not perceived selectively. (Mackinlay,1986)

Rule: In order to be distinguishable, shapes must be quite distinct from one another. Veniar (Veniar, 1948) established a differential sensitivity of 1.37 percent for shape distortion when either the horizontal or vertical sides of the square were distorted.

Rule: Straight line configurations play a central role in many graphical methods...because of our ability to perceive lines more readily than other kinds of patterns. When we perceive curvature - especially if it is gentle or moderate - we probably see it more as a departure from linearity than as any particular kind of curve. [We perceive straight lines more effectively than curved lines.] (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: Numerosity refers to a subjective impression of the number of objects that a person can see in the visual field without counting the objects. It has been found that the differential sensitivity index for dot numerosity is 0.204 (Taves, 1940). For instance, if the population density of one region is represented by 100 dots, the density of another region, greater than the first, must be represented by [at least] 120 dots to be perceived as greater. (Kosslyn, Pinker, Simcox & Parkin, 1983)
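As a minimal sketch of the arithmetic in this rule, the threshold can be obtained by scaling the reference count by one plus the sensitivity index. The function name `numerosity_threshold` is hypothetical; only the index 0.204 comes from the text.

```python
# Differential sensitivity index for dot numerosity quoted in the rule (Taves, 1940).
NUMEROSITY_INDEX = 0.204

def numerosity_threshold(reference_count, index=NUMEROSITY_INDEX):
    """Return the dot count a second region must reach to be perceived as
    denser than a region drawn with `reference_count` dots."""
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return reference_count * (1.0 + index)

print(round(numerosity_threshold(100), 1))  # 120.4
```

For 100 dots the sketch gives about 120.4, in line with the figure of roughly 120 dots quoted in the rule.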

Number of data points    Group classification

4-8                      A
8-25                     B
25-80                    C
80-250                   D
250-800                  E
800-2500                 F
2500-8000                G


Figure 4. Numerosity table (Tukey,1988)
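The numerosity table can be read as a simple lookup. The sketch below is one possible encoding, not part of the cited work: the names `NUMEROSITY_CLASSES` and `numerosity_class` are hypothetical, and because adjacent ranges in the table share an endpoint, each upper bound is treated as exclusive here purely as an assumption.

```python
# Class boundaries from the numerosity table (Tukey, 1988).
# Assumption: upper bounds are exclusive, so that a count such as 8
# falls into exactly one class (B) rather than two.
NUMEROSITY_CLASSES = [
    (4, 8, "A"), (8, 25, "B"), (25, 80, "C"), (80, 250, "D"),
    (250, 800, "E"), (800, 2500, "F"), (2500, 8000, "G"),
]

def numerosity_class(n_points):
    """Return the class letter for n_points, or None if the count lies
    outside the range covered by the table."""
    for low, high, label in NUMEROSITY_CLASSES:
        if low <= n_points < high:
            return label
    return None

print(numerosity_class(100))  # D
```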

Rule: With fairly few points (up to numerosity B, maybe C), symbols [marks] may show individual values of back [least important] variables, although smoothing by change of value may often be required to enable value merging. As numerosity increases to C, maybe D, suppression will often be useful. Large data sets will probably require considerable agglomeration and display of only summarized values for the agglomerated points. As the number of points increases even further, the usefulness of collective representation (MD-marks) also increases. (Tukey, 1988)

Rule: When assigning data variables to the rays of a star symbol (a glyph), variables expected to be somewhat correlated should be assigned to adjacent rays of the star. (Chambers, Cleveland, Kleiner & Tukey, 1983)
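One way to act on this rule is to order the variables so that each ray is followed by the remaining variable most correlated with it. The greedy heuristic below is only an illustrative sketch and is not taken from the cited source; the function names are hypothetical, and the use of the absolute Pearson correlation as the measure of "somewhat correlated" is an assumption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.
    Assumes neither sequence is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def order_rays(columns):
    """Greedy ordering of star rays: start from the first variable, then
    repeatedly append the unplaced variable whose absolute correlation
    with the last placed variable is largest."""
    names = list(columns)
    order = [names[0]]
    remaining = names[1:]
    while remaining:
        last = columns[order[-1]]
        best = max(remaining, key=lambda name: abs(pearson(last, columns[name])))
        order.append(best)
        remaining.remove(best)
    return order

data = {
    "a": [1, 2, 3, 4, 5],
    "noise": [2, -1, 3, 0, 1],
    "a2": [2, 4, 6, 8, 11],
}
print(order_rays(data))  # ['a', 'a2', 'noise']
```

A greedy pass does not guarantee the best overall arrangement, but it keeps strongly related variables on neighboring rays in simple cases.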

Rule: Stars form a more dramatic and memorable set of shapes than profiles. [Stars are more selective and associative than profiles] (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: A curve [of the arrow] suggests the idea of continuity; the elements are perceived as extended and linked. (Bertin,1983)

Rule: When the movements are supposed to be radiating or converging, the axis of the arrow, which the eye unconsciously continues, must pass through the central point. (Bertin, 1983)

Rule: The path of ellipse surrounding a sphere or a cylinder can produce perspective effects which increase the impression of real volume. (Bertin,1983)

Rule: Use shape [size] for vector quantities. Converging lines...denote directional information by leading the eye to the point of intersection. The vector magnitude, a scalar value, is represented by the relative length of the pointer. (Ellson & Cox, 1988)

Rule: For simple and slowly varying 2D vector fields, small arrows are useful. Arrows are impractical for 3D vector fields. (Farrell, 1987)

Rule: When using arrows to represent vector fields, do not use arrows to depict values under a certain threshold.
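The thresholding step can be sketched as a filter applied before drawing. The rule gives no threshold value, so `threshold` is left to the caller; the function name and the `(x, y, u, v)` sample layout are assumptions made for illustration.

```python
import math

def arrows_to_draw(samples, threshold):
    """Keep only the (x, y, u, v) samples whose vector magnitude is at
    least `threshold`; weaker vectors get no arrow."""
    return [
        (x, y, u, v)
        for x, y, u, v in samples
        if math.hypot(u, v) >= threshold
    ]

field = [(0, 0, 3.0, 4.0), (1, 0, 0.1, 0.0), (0, 1, 0.0, 0.0)]
print(arrows_to_draw(field, 0.5))  # [(0, 0, 3.0, 4.0)]
```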

Rule: When very few particles (fewer than a hundred) are used in particle advection, scientists can visually track each particle and get a good feeling for the microscopic behavior of the fluid. As the number of particles grows to a hundred or so, this is no longer possible, so the method is almost useless. It is only when a large number of very small, almost transparent particles are used that the method again becomes useful. In this regime the representation ceases to be discrete and begins to resemble a cloud of fluid. (Upson et al., 1989)

Rule: High-frequency color maps create a perceptual interference pattern which contours the pixel gradation calculations. (Ellson & Cox, 1988)

Rule: Choose color maps [tables] carefully so the gradations in color correlate well with changes in the data. (Ellson & Cox, 1988)

Rule: The color scales must be mutually exclusive in perceptual appearance. (Ellson & Cox, 1988)

Rule: Choose a color and luminance that contrast most with the colors in the background. (Kosslyn, Pinker, Simcox & Parkin,1983)

Rule: Pick light colors for marks on dark backgrounds and vice versa. This is necessary because color contrast is not sufficient to ensure legibility; a light contrast is far more important. (Tinker & Paterson,1931)
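A rough sketch of this rule is to estimate the lightness of the background and choose whichever candidate mark color differs from it most. The rule names no formula; the Rec. 601 luma weights used below, the function names, and the black and white default candidates are all assumptions.

```python
def approximate_lightness(rgb):
    """Approximate lightness of an (r, g, b) color with components in 0..1,
    using the Rec. 601 luma weights (an assumption, not from the rule)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def mark_color_for(background, light=(1.0, 1.0, 1.0), dark=(0.0, 0.0, 0.0)):
    """Return the candidate mark color whose lightness differs most from
    the lightness of the background."""
    bg = approximate_lightness(background)
    return max((light, dark), key=lambda c: abs(approximate_lightness(c) - bg))

print(mark_color_for((0.0, 0.0, 0.2)))  # (1.0, 1.0, 1.0)
print(mark_color_for((1.0, 1.0, 0.8)))  # (0.0, 0.0, 0.0)
```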

Rule: Use desaturated color for large areas and saturated color for small areas. (Samson & Poiker,1985)

Rule: Yellow tends to attract the eye more than red, and perceptually takes on a higher value despite red's position at the top of the color map. (Ellson & Cox,1988)

Rule: Using color on partially transparent iso-surfaces increases their visibility. But then additional color coding cannot be used on the surface.

Rule: Up to three different colors can be used effectively on partially transparent iso-surfaces.

Rule: Use saturation rather than hue in displays like maps where some variable must be plotted as a function of location. (Wainer & Francolini,1980)

Rule: Even a basically good set of symbols will become indistinct if either (i) they are plotted too small, (ii) the points are too numerous or overlap excessively, or (iii) too many categories are represented. (Chambers, Cleveland, Kleiner & Tukey,1983)

Rule: Using more than three distinct color ranges that represent different components of a data set causes confusion in interpreting the corresponding image. (Farrell, 1987)

7. Referential Components

The term referential component refers to any element in a visualization whose purpose is to facilitate proper interpretation of the graphics. Referential components are essential parts of data visualizations. The frame of the graphics, legends, labels, background, and light sources are all referential components in a visualization. The frame can further be divided into two components: the outer frame and the inner frame (Kosslyn, Pinker, Simcox & Parkin, 1983). The outer frame is the set of lines that serve to define the general entities that are addressed in the display, whereas the inner frame may be a grid which does not represent a variable but is used as a perceptual cue. The following rules provide a few guidelines that are useful in defining various referential components.


Rule: The [outer] frame of a graphic delimits the signifying [visible] space, but it does not necessarily delimit the phenomena. (Bertin, 1983)

Rule: Keep marks within the frame. If you must have them extend beyond (perhaps to emphasize a point), remember that actual quantitative information will be difficult to extract. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If the x-axis is more than twice as long as the y-axis, include a second y-axis on the right of the outer frame. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: A reasonable number of reference values [on a coordinate axis] might be between four and twelve. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: The marks that define the outer frame must be grouped together by the Gestalt principles so that the frame is clearly defined. Every necessary part must be present or obviously implied. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If tick marks are used between scale values, there should be no more than five before a heavier tick mark or a new scale value. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: The marks must be congruent with the idea being conveyed. Thus, an ordinal or nominal scale must be clearly demarcated. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: Ideally, all parts of the frame should play a role in communicating quantitative information. If for some reason (e.g., you use a particular depiction) they do not, make the superfluous parts lighter than the rest of the frame or clearly set aside. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: The axes should be uniform and continuous; if they are not, be sure you are distorting things to make a particular point, in a way that the reader can detect and understand. (Kosslyn, Pinker, Simcox & Parkin, 1983)


Rule: The inner frame should not group with the marks or the labels. This can be ensured by always drawing the inner frame with thinner, lighter lines than those used to draw the other graphic constituents. (Kosslyn, Pinker, Simcox & Parkin,1983)

Rule: Make the grain of the inner frame appropriate for the level of precision necessary. A coarse grid will not be of much help if detailed measurements are needed, and a fine grid will only get in the way if only general measurements are needed. (Kosslyn, Pinker, Simcox & Parkin,1983)

Rule: Every fifth line of the inner frame (if a grid is used) should be slightly heavier, which will help the reader to track along any single line. (Kosslyn, Pinker, Simcox & Parkin, 1983)
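The grid rule can be sketched as a list of line widths, one per grid line. The function name and the two width values are hypothetical, and counting from the first grid line is an assumption; the light width should also stay below the width used for the data marks, in keeping with the rule on thinner inner-frame lines.

```python
def grid_line_widths(n_lines, heavy_every=5, light=0.5, heavy=1.0):
    """Return one line width per grid line, with every `heavy_every`-th
    line (counting from 1) drawn slightly heavier than the others."""
    return [
        heavy if (i + 1) % heavy_every == 0 else light
        for i in range(n_lines)
    ]

print(grid_line_widths(10))
# [0.5, 0.5, 0.5, 0.5, 1.0, 0.5, 0.5, 0.5, 0.5, 1.0]
```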

Rule: The ends of the lines of the inner frame should intersect the outer frame. This will ensure that the inner frame hooks up clearly to the outer frame, so that it maps specific labelled points on one part of the outer frame to specific points (preferably points that are perceptually isolable) on the marks. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If absolute values are important, the top of a bar can be perceptually grouped with a horizontal grid line, and the grid line can be grouped with an absolute value label on the ordinate. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If the rendering method is for continuous data and the data are not continuous, then a uniform grid must be created at some desired resolution. This requirement also arises from gridded data that has been transformed into a non-uniform grid or from data accessed from multiple sources with dissimilar grid resolutions. (Treinish,1989)


Rule: Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphics itself. Label (highlight) important events in the data. (Tufte, 1983)

Rule: For graphics in exploratory data analysis, words should tell the viewer how to read the [graphics] design (if it is a technically complex arrangement) and not what to read in terms of content. (Tufte, 1983)

Rule: Put on a title. The title should state clearly what is being graphed or represented. The title should be recognizable as such because it is clearly set off from the rest of the graph; it should not be close enough to any line to be grouped perceptually with it. A larger font size will also prevent the title from being perceptually grouped with other labels or parts. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: Label each axis. The labels should be placed closer to the axis they label than to anything else, ensuring that they will be grouped perceptually with the right axis. Labels parallel to their axes are a good choice in complex displays because the Gestalt law of common fate will group label and axis together. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: Put scale values on both scales; make sure they are closer to the correct tick mark than to anything else. (Kosslyn, Pinker, Simcox & Parkin, 1983)


Rule: Attaching the light sources to the observer by an offset vector, rather than fixing their position in the model space, assures that the observer is never positioned in the dark. (Smith, 1989)

Rule: The use of a slightly saturated blue background that is lighter at the top than at the bottom tends to emphasize the three dimensionality of the model without detracting from it. (Smith, 1989)

Rule: An animation in which the eye point moves while the data are being animated is difficult to interpret. (Marshall & Carswell, 1988)

8. Miscellaneous


Rule: On the elementary level [of reading] optimum angular legibility is located near 70 degrees. On the overall level [of reading] the image tends toward the form of a square where optimum angular legibility is provided by the diagonal. Since these two conditions can be contradictory, angular legibility results from a compromise between the conditions of legibility for the two extreme levels of reading. [A graph should not have angles that are too sharp or too flat, and should be neither too wide nor too high.] (Bertin, 1983)

Rule: For overly pointed curves, the scale of the quantitative component should be reduced; optimum angular perceptibility occurs at around 70 degrees. If the curve is not reducible due to the existence of both large and small variations, filled columns resulting in bar charts can be used. (Bertin, 1983)

Rule: For overly flat curves, the scale of the quantitative component should be increased. (Bertin, 1983)

Rule: For small variations in relation to the total, the total loses its importance, and the zero point can be eliminated, provided the viewer is made aware of this elimination. [zero-based axis vs truncated axis] (Bertin, 1983)

Rule: In general, when one selects a large scale unit, one is implying that the amount of increase is small; conversely, one amplifies an effect by selecting a larger vertical axis, spreading out the scale units. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: One of the most powerful ways of slanting a graph is by altering the aspect of the axes, or the ratio of their scales. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If a frame is made to project an angle in [three-dimensional] space, the foreshortening that results can emphasize or de-emphasize a trend. (Kosslyn, Pinker, Simcox & Parkin, 1983)

Rule: If data squeezes toward zero, use a log scale. (Tukey,1988)

Rule: If points at the upper right are much more widely scattered than those at the lower left or quite a few points are crowded together in a small space then use logarithms. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: Re-express (i.e. transform) variables that can never be zero or negative by taking logarithms. (Chambers, Cleveland, Kleiner & Tukey,1983)
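The logarithmic re-expression described in the three rules above can be sketched as follows. The function name is hypothetical and base 10 is an assumption, since the rules do not name a base; the guard reflects the condition that the variable can never be zero or negative.

```python
import math

def log_reexpress(values):
    """Return base-10 logarithms of `values`. Every value must be strictly
    positive, matching the condition stated in the rule."""
    if any(v <= 0 for v in values):
        raise ValueError("logarithms require strictly positive values")
    return [math.log10(v) for v in values]

# Values crowded near zero with a few large ones spread out evenly
# once re-expressed.
print(log_reexpress([1, 10, 100, 1000]))
```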


Rule: Empty spaces [must] signify absence of phenomena and not missing data. (Bertin, 1983)

Rule: [For missing data values], it may be better either (i) to impute a value, (ii) to indicate that the value is missing or (iii) not to draw the symbol for a particular observation. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: The drawing must indicate the unknowns of the information in an unambiguous way. [Unknown or missing data must not be represented as empty spaces] (Bertin, 1983)
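To keep unknown values from being shown as empty space, a plotting routine can separate the observations that can be drawn from the positions that need an explicit "missing" indicator. The sketch below assumes that missing values are encoded as `None` and that data arrive as `(x, y)` pairs; both conventions and the function name are hypothetical.

```python
def split_missing(series):
    """Split (x, y) pairs into points that can be drawn and the x positions
    whose y value is unknown and should be flagged explicitly."""
    known = [(x, y) for x, y in series if y is not None]
    missing_at = [x for x, y in series if y is None]
    return known, missing_at

known, missing_at = split_missing([(1, 2.5), (2, None), (3, 0.0), (4, None)])
print(known)       # [(1, 2.5), (3, 0.0)]
print(missing_at)  # [2, 4]
```

Note that a measured value of zero is kept as a known point; only `None` is treated as missing.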

Rule: One-sided symbolic scales are not suitable for positive and negative data. One-sided symbolic scales assume that the scale is positive. (Chambers, Cleveland, Kleiner & Tukey,1983)

Rule: Large positive and small negative residuals are equally important and deserve equal visual impact, but they must be visually distinct. We would conjecture that using up arrows for positive residuals and down arrows for negative residuals achieves this equality...Another example which distinguishes positive and negative values are solid and dashed circles. Different line styles can also be used for this purpose. (Chambers, Cleveland, Kleiner & Tukey, 1983)

Rule: In one-dimensional scatter plot, stacking or vertical jittering can be used to alleviate both exact overlap and crowding. (Chambers, Cleveland, Kleiner & Tukey,1983)
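Stacking for exact overlap can be sketched by giving each repeated value the next free level above the axis. The function name is hypothetical, and this version handles only exactly equal values; relieving crowding among nearby but unequal values would additionally require binning, which is not shown.

```python
from collections import defaultdict

def stack_levels(values):
    """Return (value, level) pairs, where level counts how many equal
    values appeared earlier and serves as the vertical stacking position."""
    seen = defaultdict(int)
    stacked = []
    for v in values:
        stacked.append((v, seen[v]))
        seen[v] += 1
    return stacked

print(stack_levels([3, 1, 3, 3, 1]))
# [(3, 0), (1, 0), (3, 1), (3, 2), (1, 1)]
```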

Rule: In two-dimensional scatter plots, jitter the points on a scatter plot by adding random noise to one or both of the variables to prevent mark overlapping. (Cleveland and McGill, 1984)
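Jittering can be sketched as adding a small random offset to each coordinate before plotting. The rule says only "random noise"; the uniform distribution, the `amount` parameter, and the function name below are assumptions.

```python
import random

def jitter(values, amount, seed=None):
    """Return a copy of `values` with uniform noise drawn from
    [-amount, +amount] added to each element."""
    rng = random.Random(seed)
    return [v + rng.uniform(-amount, amount) for v in values]

xs = [1, 1, 1, 2, 2]
jittered_xs = jitter(xs, 0.1, seed=0)
print(jittered_xs)
```

The amount of noise should stay small relative to the spacing of the data so that the displayed positions remain a fair picture of the recorded values.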


Rule: Using more than four display partitions [windows] does not significantly reduce the time needed for data interpretation. For example, it is more effective to have a series of four screens each with four image windows than to have one screen image with 16 image windows. (Farrell, 1987)
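The example in this rule, four screens of four windows in place of one screen of sixteen, amounts to splitting the image list into groups. A minimal sketch, with a hypothetical function name and a default of four windows per screen taken from the rule:

```python
def paginate_windows(images, per_screen=4):
    """Split `images` into consecutive screens holding at most
    `per_screen` image windows each."""
    if per_screen < 1:
        raise ValueError("per_screen must be at least 1")
    return [images[i:i + per_screen] for i in range(0, len(images), per_screen)]

screens = paginate_windows(list(range(16)))
print(len(screens))  # 4
print(screens[0])    # [0, 1, 2, 3]
```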

Rule: Repeated application of the Shrink Principle leads to a powerful and effective graphic design, the small multiples [which displays an image sequence on a single screen at once]. Small multiples resemble the frames of a movie; a series of graphics showing the same combination of variables, indexed by changes in another variable. (Tufte, 1983)

Rule: For animation, 10 to 20 images [key frames] are often adequate when distributed over a range of images. (Farrell, 1987)
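Selecting key frames distributed over a range of images can be sketched as picking evenly spaced indices. Even spacing, the default of 15 key frames (chosen from within the 10 to 20 range given in the rule), and the function name are assumptions.

```python
def key_frame_indices(n_images, n_key_frames=15):
    """Return evenly spaced indices into a sequence of `n_images` images,
    always including the first and last image."""
    if n_key_frames < 2:
        raise ValueError("n_key_frames must be at least 2")
    if n_images <= n_key_frames:
        return list(range(n_images))
    step = (n_images - 1) / (n_key_frames - 1)
    return [round(i * step) for i in range(n_key_frames)]

print(key_frame_indices(100, 10))
# [0, 11, 22, 33, 44, 55, 66, 77, 88, 99]
```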


Rule: For static images, these [depth cues] include geometrical perspective, shadows and shading, texture perspective, distance blurring and occlusion. (Schiavone, Papathomas & Julesz,1988)

Rule: For dynamically changing scenes, the kinetic depth effect (KDE, the ability to extract the third dimension from an animated series of projections from rotating an object) [is] closely related to motion parallax. (Schiavone, Papathomas & Julesz,1988)

Rule: A vertical rocking of about two degrees shows a dramatic difference in depth perception.

9. Concluding Remarks

In this report, a set of rules and principles for designing data visualization techniques, which effectively convey the informational content of data, has been presented. The set is not complete, nor does it address all the issues relevant to visualization technique design. In particular, it does not include rules and principles for animation, volume visualization, and stereoscopic images. The rules and principles for effective use of color are not extensive and need to be expanded. It is hoped that the set of rules and principles presented in this report will provide some guidelines for visualization technique designers in selecting various visualization primitives to effectively display the given data. These rules and principles can also be used to automate the generation of visualization techniques and to provide guidance to scientists and engineers whose work involves data visualization. The ongoing development of VISTA, a visualization tool assistant which automatically generates visualizations from data descriptions and further assists scientists in creating visualizations of their data, is based on the set of rules and principles presented in this report. Undoubtedly, the set provided here will grow as new visualization techniques are developed and our understanding of the existing ones matures. To gain a better understanding of the effectiveness and expressiveness of various visualization primitives, it is essential that empirical studies of visualization techniques be undertaken. Otherwise, the techniques that are generated may be viewed as meaningless "pretty pictures".


We would like to thank members of the Graphics and User Interface Research Group, in particular, Professors James Foley and John Sibert, at The George Washington University who made valuable suggestions and provided an intellectually stimulating environment for this work. Financial support was provided by USRA Center of Excellence in Space Data and Information Sciences (CESDIS) under NASA contract S/C 550-64.


Bertin, J., Semiology of Graphics, The University of Wisconsin Press, 1983, Translated by W. J. Berg.

Chambers, J. M., W. S. Cleveland, B. Kleiner and P. A. Tukey, Graphical Methods for Data Analysis, Duxbury Press, Boston, Massachusetts 02116, 1983.

Cleveland, W. S., The Elements of Graphing Data, Wadsworth Advanced Books and Software, Monterey, California, 93940, 1980.

Cleveland, W. S. and R. McGill, "Graphical Perception: Theory, Experimentation and Application to the Development of Graphical Methods," Journal of the American Statistical Association, 79(387), September 1984, pp. 531-554.

Cowan, W. B. and C. Ware, Colour Perception Tutorial Notes, SIGGRAPH'85, 1985.

Ellson, R. and D. J. Cox, "Visualization of Injection Molding," Simulation, 51(5), November 1988, pp. 184-188.

Farrell, E. J., "Visual Interpretation of Complex Data," IBM Systems Journal, 26(2), 1987, pp. 174-200.

Haber, R. B., "Visualization in Engineering Mechanics: Techniques, Systems and Issues," Visualization Techniques in the Physical Sciences, SIGGRAPH'88, pp. 89-111.

Kosslyn, S., S. Pinker, W. Simcox and L. Parkin, Understanding Charts and Graphs: A Project in Applied Cognitive Science, January 1983.

Mackinlay, J., Automatic Design of Graphical Presentations, Ph.D. Dissertation, Computer Science Dept., Stanford University, Stanford, California, 1986. Also Tech. Rep. Stan-CS-86-1038.

Marshall, R. and P. Carswell, "Alternative Views of a Hurricane," Proceedings of the Conference on Three-Dimensional Visualization of Scientific Data, January 1988.

Meier, B. J., "ACE: A Color Expert System for User Interface Design," Brown University, Computer Science Dept., Master's Project, CS-87-M16. (Also in ACM SIGGRAPH Symposium on User Interface Software and Technology, Banff, Canada, 1988)

Samson, L. and T. Poiker, "Graphic Design with Color Using a Knowledge Base," Simon Fraser University, 1985.

Schiavone, J. A., T. V. Papathomas and B. Julesz, "Visualization of Meteorological Data: A Review of Computer Graphics Applications," SIGGRAPH '88 Proceedings, pp. 285-292.

Shirley, P. and H. Neeman, "Volume Visualization at the Center for Supercomputing Research and Development," Chapel Hill Volume Visualization Workshop, May 1989, pp. 17-20.

Smith, A. R., "Volume Graphics and Volume Visualization: A Tutorial," Tech Memo 176, Pixar Inc, San Rafael, California, 1987.

Smith, M. E., "The Evolution of a General Data Visualization Program," ACM SIGGRAPH'89 Tutorial Notes: State of the Art in Data Visualization, Boston, MA, 1989.

Taves, E. H., "Two Mechanisms for the Perception of Visual Numerousness," Archives of Psychology, 1940, 37(205).

Tinker, M. A. and D. G. Paterson, "Studies of Typographical Factors Influencing Speed of Reading: VII. Variation in Colour of Print and Background," Journal of Applied Psychology, 1931, 15, pp. 471-479.

Tufte, E. R., The Visual Display of Quantitative Information, Graphics Press, Box 340, Cheshire, Connecticut 06410, 1983.

Tukey, J. W., The Collected Works of John W. Tukey: Volume V Graphics 1965-1985, edited by W. S. Cleveland, Wadsworth and Brooks/Cole Advanced Books and Software, Pacific Grove, California, 1988.

Upson, C., T. Faulhaber, D. Kamins, D. Laidlaw, D. Schlegel, J. Vroom, R. Gurwitz and A. van Dam, "The Application Visualization System: A Computational Environment for Scientific Visualization," IEEE Computer Graphics and Applications, pp. 30-42.

Veniar, F. A., "Difference Thresholds for Shape Distortion of Geometrical Squares," The Journal of Psychology, 1948, 26, pp. 461-476.

Wainer, H. and C. Francolini, "An Empirical Inquiry into Human Understanding of Two Variable Maps," The American Statistician, 34, pp. 81-93.

Ware, C. and J. C. Beatty, "Using Colour as a Tool in Discrete Data Analysis," Tech. Rep. CS-85-21, University of Waterloo Computer Science Dept., August 1985.


Last modified on February 11, 1999, G. Scott Owen,