1975-present: High-D data visualization

It is harder to provide a succinct overview of the most recent developments in data visualization because they are so varied, have occurred at an accelerated pace, and span a wider range of disciplines. It is also more difficult to single out the most significant developments, and because we have focused on the earlier history, some areas and events remain unrepresented here.

With this disclaimer, a few major themes stand out:

  • the development of a variety of highly interactive computer systems and, more importantly,
  • new paradigms of direct manipulation for visual data analysis (linking, brushing, selection, focusing, etc.)
  • new methods for visualizing high-dimensional data (grand tour, scatterplot matrix, parallel coordinates plot, etc.)
  • the invention of new graphical techniques for discrete and categorical data (fourfold display, sieve diagram, mosaic plot, etc.), and analogous extensions of older ones (diagnostic plots for generalized linear models, mosaic matrices, etc.) and
  • the application of visualization methods to an ever-expanding array of substantive problems and data structures.

These developments in visualization methods and techniques arguably depended on advances in theoretical and technological infrastructure. Some of these are: (a) large-scale software engineering; (b) extensions of classical linear statistical modeling to wider domains; (c) vastly increased computer processing speed and capacity, allowing computationally intensive methods and access to massive data problems.

In turn, the combination of these themes and advances now provides some solutions for earlier problems.

Added: 2008-07-17

Weekly chartbook (eventually computer-generated) to brief U.S. President, Vice President on economic and social matters


References:
none
1975
Circular display
Added: 2008-07-17

Fourfold display

Stephen Fienberg portrait

"Four-Fold Circular Display" to represent a 2 x 2 table


References:
Fienberg:1975 Friendly:1994b
Added: 2008-07-17

USA 1970 Draft Lottery Data, with median and quartile traces

Cleveland portrait

Enhancement of scatterplot with plots of three moving statistics (midmean and lower and upper semimidmean)


References:
ClevelandKleiner:1975
1975
Chernoff faces
Added: 2008-07-17

Experiment showing that random permutations of the features used in Chernoff's faces affect the error rate of classification by about 25 percent


References:
ChernoffRizvi:1975
1975
Graphics versus tables
Added: 2008-07-17

Ehrenberg portrait

Experimental tests of statistical graphics versus tables, with findings favoring the latter


References:
Ehrenberg:1975 Ehrenberg:1977
1975
Scatterplot matrix
Added: 2007-02-01

Enhanced scatterplot matrix

Scatterplot matrix, the idea of plotting all pairwise scatterplots for n variables in a tabular display


References:
Hartigan:1975
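
The tabular arrangement described in this entry can be sketched in a few lines. This is only an illustration of the layout idea, not Hartigan's own procedure: the function name, the variable names, and the convention that the column variable goes on the x-axis are all assumptions made for the example.

```python
def scatterplot_matrix_layout(variables):
    """Return an n x n grid; cell (row, col) holds the (x, y) variable pair
    that would be plotted there (assumed convention: column -> x, row -> y)."""
    grid = []
    for row_var in variables:
        grid.append([(col_var, row_var) for col_var in variables])
    return grid


# Hypothetical variable names, for illustration only.
layout = scatterplot_matrix_layout(["mpg", "weight", "price"])
for row in layout:
    print(row)
```

Each off-diagonal cell corresponds to one pairwise scatterplot; the diagonal cells pair a variable with itself and are commonly used for labels instead.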
Added: 2008-07-17

Monthly chartbook (eventually computer-generated) to brief U.S. President, Vice President on economic and social matters (StatUS)

 


References:
USCensus:1976
1977
Cartesian rectangle
Added: 2008-07-17

Wainer portrait

"Cartesian rectangle" to represent a 2 x 2 table, experimentally tested against other forms


References:
WainerReiser:1976
1977
Committee on graphics
Added: 2008-07-17

Ad Hoc Committee on Statistical Graphics, leading to the ASA Section on Statistical Graphics, later to the Journal of Computational and Graphical Statistics


References:
none
1978
Linked brushing
Added: 2008-07-17

Original invention of linked brushing (highlighting of observations selected in one display in another display of the same data), although in a manner different from how we see it in today's systems


References:
Newton:1978
Added: 2001-07-02

Boxplot of the NJ Pick-it Lottery

Richard Becker portrait

John Chambers portrait

S, a language and environment for statistical computation and graphics. S (later sold as a commercial package, S-Plus; more recently, an open-source implementation, R, has become widely available) would become a lingua franca for statistical computation and graphics


References:
BeckerChambers:1984 Becker:1994 BeckerChambers:1978
1979
Geographic correlation diagram
Added: 2008-04-04

Geographic correlation diagram, showing the bivariate relation between two spatially referenced variables using vectors to represent geographic covariation


References:
Monmonier:1979
1980
Bifocal display
Added: 2010-08-09

An initial, modern suggestion of a method for viewing a large database by the use of selective focus around a central region, using distortion to provide a context.


References:
ApperleySpence:1980 ApperleyTzvarasSpence:1982
1981
Mosaic display
Added: 2008-07-17

Mosaic display à la Hartigan and Kleiner

Hartigan & Kleiner 5-way mosaic of TV ratings

Mosaic display to represent frequencies in a multiway contingency table


References:
HartiganKleiner:1981 Friendly:2002:mosahist HartiganKleiner:1984
1981
Fisheye view
Added: 2008-07-17

Fisheye view of central Washington, D.C.

Fisheye view: an idea to provide focus and greater detail in areas of interest of a large amount of information, while retaining the surrounding context in much less detail


References:
Furnas:1981
1981
Draftsman display
Added: 2001-10-02

The "draftsman display" for three variables (leading soon to the "scatterplot matrix") and initial ideas for conditional plots and sectioning (leading later to "coplots" and "trellis displays")


References:
TukeyTukey:1981
1982
Brushing
Added: 2008-07-17

Another early version of brushing, invented independently of Newton, together with a system for 3-D rotations of data


References:
McDonald:1982
1982
Visibility base map
Added: 2007-02-01

US Visibility Map

Monmonier portrait

Visibility Base Map, a map of the United States where areas are adjusted to provide a readily readable platform for area symbols for smaller states, such as Delaware and Rhode Island, with compensating reductions in the size of larger states


References:
MonmonierSchnell:1983
1982
USA Today weather map
Added: 2005-01-07

George Rorick portrait

The USA Today color weather map begins an era of color information graphics in newspapers. Shortly afterward, colorful visual graphics become widespread.


Rorick used a combination of color, maps, tables, symbols and annotation to transform often dull and incomprehensible information into something more interesting and accessible
References:
none
1983
Sieve diagram
Added: 2008-07-17

Sieve diagram image

Riedwyl portrait

Sieve diagram, for representing frequencies in a two-way contingency table


References:
RiedwylSchupbach:1983
1983
Graphical esthetics
Added: 2008-07-17

Tufte portrait

Esthetics and information integrity for graphics defined and illustrated (some concepts: "data-ink ratio", "lie factor")


References:
Tufte:1983 Tufte:1990 Tufte:1997
1985
Grand tour
Added: 2008-07-17

Grand tour, for viewing high-dimensional data sets via a structured progression of 2D projections


References:
Asimov:1985
1985
Parallel coordinates plot
Added: 2008-07-17

Representation of a six-dimensional point in parallel coordinates

Al Inselberg portrait

Parallel coordinates plots for high-dimensional data
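
As a rough illustration of how a single multidimensional point becomes a polyline across parallel axes, the sketch below rescales each coordinate to the range 0 to 1 along its own axis. The function name, the min-max rescaling, and the sample values are assumptions for the example rather than anything taken from Inselberg's work.

```python
def to_polyline(point, mins, maxs):
    """Map a d-dimensional point to d vertices (axis_index, height).

    Each axis is an evenly spaced vertical line; the height is the value
    rescaled to [0, 1] using that axis's own minimum and maximum.
    """
    vertices = []
    for axis, (value, lo, hi) in enumerate(zip(point, mins, maxs)):
        # A constant variable has no spread, so place it mid-axis.
        height = 0.5 if hi == lo else (value - lo) / (hi - lo)
        vertices.append((axis, height))
    return vertices


# Hypothetical six-dimensional point and axis ranges.
point = (2.0, 10.0, 0.0, 5.0, 1.0, 30.0)
mins = (0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
maxs = (4.0, 10.0, 1.0, 10.0, 4.0, 40.0)
polyline = to_polyline(point, mins, maxs)
print(polyline)
```

Joining consecutive vertices with line segments gives the polyline for that point; a data set is drawn as one polyline per observation.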

1987
Interactive linked graphics
Added: 2008-07-17

Figure 14 from "Brushing scatterplots" showing interactive labeling of brushed points

Interactive statistical graphics, systematized: allowing brushing, linking, other forms of interaction


References:
BeckerCleveland:1987
Added: 2008-07-17

Buja portrait

First inclusion of grand tours in an interactive system that also has linked brushing, linked identification, visual inference from graphics, interactive scaling of plots, etc.


References:
Buja-etal:1988
1988
Interactive time-series
Added: 2007-02-01

DiamondFast image, overlaid time series, aligned and rescaled interactively

Unwin portrait

Interactive graphics for multiple time series with direct manipulation (zoom, rescale, overlaying, etc.)


References:
UnwinWills:1988
Added: 2007-02-01

REGARD image: largest annual oil flows into EU, 1977-1990

Unwin portrait

Statistical graphics interactively linked to map displays


References:
Monmonier:1989 Wills-etal:1989
1989
Nested dimensions
Added: 2007-02-01

TempleMVV image: 4 response variables vs. age, sex, education

TempleMVV image: 4-way association

Use of "nested dimensions" (related to trellis and mosaic displays) for the visualization of multidimensional data. Continuous variables are binned, and variables are allocated to the horizontal and vertical dimensions in a nested fashion


References:
Mihalisin-etal:1989 Mihalisin-etal:1992
1990
Lisp-Stat
Added: 2008-07-17

Luke Tierney portrait

Lisp-Stat, an object-oriented environment for statistical computing and dynamic graphics


References:
Tierney:1990
1990
Multivariate grand tours
Added: 2008-07-17

Grand tours combined with multivariate analysis


References:
HurleyBuja:1990
1990
Textured dot strips
Added: 2008-07-17

Textured dot strips to display empirical distributions


References:
TukeyTukey:1990
1990
Lexis pencil
Added: 2008-07-17
1990
Parallel coordinates plot theory
Added: 2007-02-01

Wegman portrait

Wegman 1990 - Fig3.
Parallel coordinate plot of six-dimensional data illustrating correlations of 1, .8, .2, 0, -.2, -.8, and -1.

Wegman 1990 - Fig8
The second permutation of the five-dimensional presentation of the automobile data. There are two classes of linear relationships between gear ratio and miles per gallon.

Statistical theory and methods for parallel coordinates plots


References:
Wegman:1990
1991
Enhanced mosaic display
Added: 2008-07-17

Mosaic display developed as a visual analysis tool for log-linear models (beginning general methods for visualizing categorical data)


References:
FriendlyFox:1991 Friendly:1994a
1991
Treemaps
Added: 2008-07-17

TreeViz image of files on the HCIL server

Ben Shneiderman portrait

Treemaps, for space-constrained visualization of hierarchies, using nested rectangles (size proportional to some numerical measure of the node)


References:
JohnsonShneiderman:1991 Shneiderman:1991
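
The nested-rectangle idea can be illustrated with a minimal "slice-and-dice" layout, in which the split direction alternates with depth and each child receives a share of its parent's rectangle proportional to its size. This is a sketch under stated assumptions, not the TreeViz implementation: the tuple format `(name, size, children)`, the function name, and the sample tree are all hypothetical.

```python
def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Lay out a tree as nested rectangles.

    node is (name, size, children). Returns a list of (name, x, y, w, h).
    Children split the parent's width at even depths and its height at
    odd depths, each in proportion to its size.
    """
    if out is None:
        out = []
    name, _size, children = node
    out.append((name, x, y, w, h))
    total = sum(child[1] for child in children)
    offset = 0.0  # fraction of the parent already allocated
    for child in children:
        share = child[1] / total if total else 0.0
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical hierarchy: sizes could be, for example, file sizes.
tree = ("root", 10, [("a", 6, [("a1", 2, []), ("a2", 4, [])]), ("b", 4, [])])
rects = slice_and_dice(tree, 0.0, 0.0, 100.0, 50.0)
for rect in rects:
    print(rect)
```

Within each parent, rectangle area is proportional to node size, so the layout always fills exactly the space it is given.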
Added: 2007-02-01

Swayne portrait

Young portrait

A spate of development and public distribution of highly interactive systems for data analysis and visualization, e.g., XGobi, ViSta


References:
Swayne-etal:1991 Young:1994 Buja-etal:1996 Swayne-etal:1998
1992
Categorical data graphics
Added: 2007-02-01

Beginnings of the general extension of graphical methods to categorical (frequency) data


References:
Friendly:2000:VCD Friendly:1992
1994
Table lens
Added: 2008-07-17

Table lens screen shot

1996
Cartographic data visualiser
Added: 2008-07-17

Jason Dykes portrait

Cartographic Data Visualiser: a map visualization toolkit with graphical tools for viewing data, including a wide range of mapping options for exploratory spatial data analysis


References:
Dykes:1996
1999
Grammar of Graphics
Added: 2007-02-01

Minard's March on Moscow graphic

Wilkinson portrait

Grammar of Graphics: A comprehensive systematization of grammatical rules for data and graphs and graph algebras within an object-oriented, computational framework


References:
Wilkinson:1999 Wilkinson:2005
2002
Tag cloud, Word cloud
Added: 2011-04-20

Wordle of Dewey's Reflex Arc article.
A Wordle of John Dewey's "The Reflex Arc Concept in Psychology" (1896), produced by Dr. Christopher Green.

Wordle of Skinner's Theories of Learning article.
A Wordle of B. F. Skinner's "Are Theories of Learning Necessary?" (1950), produced by Dr. Christopher Green.

Wordle of the Milestones Project.
A Wordle produced circa 2008 of the Milestones Database, created by Dr. Michael Friendly.

Tag clouds (also known as "word clouds") are visually stimulating summaries of large bodies of text. Their purpose is to take a selection of text and visually display the frequency of the most commonly used words within that document. They are useful in qualitative analyses for highlighting the major themes found in particular works of interest.

Flanagan's Search Referral Zeitgeist: the page that many claim started the trend, with measurement of web referrals [Note: only available through the Archive.org database]

Many sources cite Jim Flanagan as the originator of this idea with his Search Referral Zeitgeist Perl script, although the basic idea (of using word size to designate importance) had previously been used by Douglas Coupland in his 1995 novel "Microserfs". Tag clouds are prominently featured on image/photo websites (such as Flickr) and blogs, where visitors can use them to navigate the website by theme or keyword. For an example of a more academic use of this technology, see the attached images, which were generated by Dr. Christopher Green of York University using the Wordle.com website. These images (made using John Dewey's 1896 article "The Reflex Arc Concept in Psychology" and B. F. Skinner's 1950 article "Are Theories of Learning Necessary?") can be studied to visually compare and summarize the similarities and differences in word usage between these two major psychological texts.
References:
none
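
The core computation behind a tag cloud (count word frequencies, then map counts to display sizes) can be sketched as follows. This is not Flanagan's Perl script or Wordle's algorithm; the function name, the linear scaling, the point-size range, and the sample text are assumptions for illustration, and real tools typically also remove common stop words.

```python
import re
from collections import Counter


def word_sizes(text, top_n=5, min_pt=10, max_pt=40):
    """Return {word: font size} for the top_n most frequent words,
    scaling sizes linearly between min_pt and max_pt by frequency."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    hi = counts[0][1]   # most_common is sorted, so first is largest
    lo = counts[-1][1]  # and last is smallest
    sizes = {}
    for word, n in counts:
        scale = 1.0 if hi == lo else (n - lo) / (hi - lo)
        sizes[word] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


# Hypothetical sample text.
sample = "stimulus response stimulus reflex stimulus response arc"
sizes = word_sizes(sample, top_n=3)
print(sizes)
```

Placing the words on the page without overlap is a separate layout problem that this sketch does not address.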
2004
Sparkline
Added: 2007-02-01

Sparkline of the US deficit, 1983-2003

Sparkline graphic for 4 stocks

Sparklines: "data-intense, design-simple, word-sized graphics," designed to show graphic information inline with text and tables


References:
Tufte:2006
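
One common way to approximate a word-sized graphic in plain text is to map each value to one of eight Unicode block characters. The sketch below is only a text-mode stand-in for the idea, not Tufte's own rendering; the function name and the sample values are hypothetical.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line string of block characters,
    with the minimum mapped to the shortest block and the maximum to the tallest."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return BLOCKS[0] * len(values)  # flat series: all shortest blocks
    span = hi - lo
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


# Hypothetical series, for illustration only.
print("trend:", sparkline([3, 5, 4, 8, 9, 6, 2, 7]))
```

Because the output is an ordinary string, it can sit inside a sentence or a table cell alongside the numbers it summarizes.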
2005
Gapminder
Added: 2010-01-26

Bubble chart for HIV
Number of people living with HIV.

Bubble chart for the World
What a world map should convey.

Hans Rosling

The moving bubble chart.


"The main innovation from Gapminder is so far 'the moving bubble chart' in the form of the Trendalyzer software that was acquired by Google in 2007. Google has made a 2008 version freely available as Google Motion Chart. Gapminder is a non-profit foundation founded in 2005 with a goal of '…increase use and understanding of statistics and other information about social, economic and environmental development at local, national and global levels.'" (Rosling and Johansson, 2009).
References:
Rosling:2009
2006
Computational graphics language: ggplot2
Added: 0000-00-00

ggplot2 plot of the diamonds dataset

An influential, open source implementation of the Grammar of Graphics from Wilkinson (1999) in R, together with other computational tools to make it easier to produce beautiful statistical diagrams


Around this time, software for producing statistical graphs progressed from limited point-and-click interfaces and low-level graphics languages to higher-level computational languages for specifying a graph. This item recognizes the work of Hadley Wickham in the ggplot2, plyr and other R packages, but there are a number of other important contributors to this topic.
References:
Wickham:2010
2009
Chord diagram
Added: 2012-05-04

Circos-D3
Circos images for genome research

Circos sample panel
Circos images for genome research

A circular diagram designed to facilitate the analysis of relationships among categorical and other variables using chords of a circle with various visual attributes. The main application is to genomic structure, where the chords can encode various properties of genomic sequences.


References:
Krzywinski:2009