1975-present: High-D data visualization
It is harder to provide a succinct overview of the most recent developments in data visualization because they are so varied, have occurred at an accelerated pace, and span a wider range of disciplines. It is also more difficult to single out the most significant developments, and because we have focused on the earlier history, some areas and events are not yet represented here.
With this disclaimer, a few major themes stand out:
- the development of a variety of highly interactive computer systems and, more importantly,
- new paradigms of direct manipulation for visual data analysis (linking, brushing, selection, focusing, etc.)
- new methods for visualizing high-dimensional data (grand tour, scatterplot matrix, parallel coordinates plot, etc.)
- the invention of new graphical techniques for discrete and categorical data (fourfold display, sieve diagram, mosaic plot, etc.), and analogous extensions of older ones (diagnostic plots for generalized linear models, mosaic matrices, etc.) and
- the application of visualization methods to an ever-expanding array of substantive problems and data structures.
These developments in visualization methods and techniques arguably depended on advances in theoretical and technological infrastructure. Some of these are: (a) large-scale software engineering; (b) extensions of classical linear statistical modeling to wider domains; (c) vastly increased computer processing speed and capacity, allowing computationally intensive methods and access to massive data problems.
In turn, the combination of these themes and advances now provides some solutions for earlier problems.
Weekly chartbook (eventually computer-generated) to brief the U.S. President and Vice President on economic and social matters
"Cartesian rectangle" to represent a 2 x 2 table, experimentally tested against other forms
Ad Hoc Committee on Statistical Graphics, leading to the ASA Section on Statistical Graphics, later to the Journal of Computational and Graphical Statistics
Original invention of linked brushing (highlighting of observations selected in one display in another display of the same data), although in a manner different from how we see it in today's systems
S, a language and environment for statistical computation and graphics. S (later sold as a commercial package, S-Plus; more recently, an open-source implementation, R, has become widely available) would become a lingua franca for statistical computation and graphics
References: BeckerChambers:1978 BeckerChambers:1984 Becker:1994
Fisheye view: an idea to provide focus and greater detail in areas of interest of a large amount of information, while retaining the surrounding context in much less detail
The "draftsman display" for three variables (leading soon to the "scatterplot matrix") and initial ideas for conditional plots and sectioning (leading later to "coplots" and "trellis displays")
Another early version of brushing, invented independently of Newton, together with a system for 3-D rotations of data
Visibility Base Map, a map of the United States where areas are adjusted to provide a readily readable platform for area symbols for smaller states, such as Delaware and Rhode Island, with compensating reductions in the size of larger states
The USA Today color weather map begins an era of color information graphics in newspapers. Shortly afterward, colorful visual graphics become widespread.
Rorick used a combination of color, maps, tables, symbols and annotation to transform often dull and incomprehensible information into something more interesting and accessible
Esthetics and information integrity for graphics defined and illustrated (some concepts: "data-ink ratio", "lie factor")
References: Tufte:1983 Tufte:1990 Tufte:1997
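The two concepts named above can be made concrete with a small calculation. The following is a minimal sketch, assuming the usual textbook definitions: the lie factor is the relative size of an effect as drawn divided by its relative size in the data, and the data-ink ratio is the share of a graphic's ink that presents data. The function names and the sample numbers are illustrative assumptions, not taken from this timeline.

```python
def relative_change(first: float, second: float) -> float:
    """Relative size of an effect: |second - first| / |first|."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return abs(second - first) / abs(first)


def lie_factor(data_first: float, data_second: float,
               shown_first: float, shown_second: float) -> float:
    """Effect size as drawn divided by effect size in the data.

    A value near 1 means the graphic represents the change faithfully.
    """
    data_effect = relative_change(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data show no change, so the ratio is undefined")
    return relative_change(shown_first, shown_second) / data_effect


def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Share of the ink in a graphic that is used to present data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    return data_ink / total_ink


# Illustrative numbers (assumed): the data rise from 18 to 27.5 units,
# while the drawn lengths grow from 0.6 to 5.3 inches.
print(round(lie_factor(18.0, 27.5, 0.6, 5.3), 1))
```

How "ink" is measured (pixels, line length, area) is left to the caller here; the sketch only fixes the arithmetic.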
Grand tour, for viewing high-dimensional data sets via a structured progression of 2D projections
Parallel coordinates plots for high-dimensional data
References: Inselberg:1989 InselbergDimsdale:1990 Inselberg:2009 Inselberg:1985
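The idea behind a parallel coordinates plot can be sketched without any plotting library: each variable gets its own vertical axis, and each observation becomes a polyline that crosses axis j at the (rescaled) value of variable j. The code below is a minimal illustration under stated assumptions: the function name, the min-max rescaling to [0, 1], and the sample data are all hypothetical choices, not part of the cited work.

```python
from typing import List, Sequence, Tuple


def parallel_coordinates(
    rows: Sequence[Sequence[float]],
) -> List[List[Tuple[int, float]]]:
    """Return one polyline per observation.

    Each vertex is (axis_index, height), where height is the variable's
    value rescaled to [0, 1] using that variable's own min and max.
    """
    if not rows:
        return []
    n_axes = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_axes)]
    highs = [max(r[j] for r in rows) for j in range(n_axes)]

    polylines = []
    for r in rows:
        line = []
        for j, value in enumerate(r):
            span = highs[j] - lows[j]
            # A constant variable has no spread; place it mid-axis.
            height = 0.5 if span == 0 else (value - lows[j]) / span
            line.append((j, height))
        polylines.append(line)
    return polylines


# Three observations on three variables (made-up values).
observations = [(1.0, 10.0, 100.0), (2.0, 30.0, 50.0), (3.0, 20.0, 0.0)]
for polyline in parallel_coordinates(observations):
    print(polyline)
```

Drawing the result amounts to connecting each polyline's vertices with line segments; axis ordering and scaling strongly affect what patterns are visible, and other choices than min-max scaling are possible.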
Interactive statistical graphics, systematized: allowing brushing, linking, other forms of interaction
Grand tours combined with multivariate analysis
Textured dot strips to display empirical distributions
Lexis pencil: display of multivariate data in the context of life-history
Statistical theory and methods for parallel coordinates plots
Mosaic display developed as a visual analysis tool for log-linear models (beginning general methods for visualizing categorical data)
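The geometry of a basic mosaic display for a two-way table can be sketched as follows. This is a simplified illustration, assuming the common construction in which the unit square is first split into vertical strips proportional to the row totals, and each strip is then split into tiles proportional to the cell counts within that row, so that every tile's area equals its cell's share of the total. The function name and the sample table are hypothetical, and the sketch omits the gaps between tiles and the residual-based shading used when the display serves as a tool for log-linear models.

```python
from typing import List, Sequence, Tuple

Tile = Tuple[int, int, float, float, float, float]  # (row, col, x, y, width, height)


def mosaic_tiles(table: Sequence[Sequence[float]]) -> List[Tile]:
    """Compute tile rectangles in the unit square for a two-way table."""
    total = sum(sum(row) for row in table)
    if total <= 0:
        raise ValueError("table must contain a positive total count")

    tiles: List[Tile] = []
    x = 0.0
    for i, row in enumerate(table):
        row_total = sum(row)
        width = row_total / total  # strip width: marginal proportion
        y = 0.0
        for j, count in enumerate(row):
            # Tile height: conditional proportion within this row.
            height = count / row_total if row_total else 0.0
            tiles.append((i, j, x, y, width, height))
            y += height
        x += width
    return tiles


# A made-up 2 x 2 table of counts.
for tile in mosaic_tiles([[10, 30], [20, 40]]):
    print(tile)
```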
Table lens: Focus and context technique for viewing large tables; user can expand rows or columns to see the details, while keeping surrounding context
Tag clouds (also known as "word clouds") are visually stimulating summaries of large bodies of text. Their purpose is to take a selection of text and visually display the frequency of the most commonly used words within that document. These are useful for qualitative analyses by highlighting major themes found in particular works of interest.
Many sources cite Jim Flanagan as the founder of this idea with his Search Referral Zeitgeist Perl script, although the basic idea (of using word size to designate importance) had previously been used by Douglas Coupland in his 1995 novel "Microserfs". Tag clouds are prominently featured on image/photo websites (such as Flickr) and blogs, where visitors can use them to navigate the website by theme or keyword. For an example of a more academic use of this technology, see the attached photos, which were generated by Dr. Christopher Green of York University using the Wordle.com website. These images (made using John Dewey's 1896 article "The Reflex Arc Concept in Psychology" and B. F. Skinner's 1950 article "Are Theories of Learning Necessary?") can be studied to visually compare and summarize the similarities and differences in word usage between these two major psychological texts.
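The core computation behind a tag cloud, as described above, is counting word frequencies and mapping each count to a display size. The sketch below illustrates that step only. The function name, the small stopword list, the point-size range, and the linear scaling from count to size are all assumptions made for illustration; it does not reproduce the method of any particular tool.

```python
import re
from collections import Counter
from typing import Dict, FrozenSet

# A deliberately tiny, assumed stopword list; real tools use longer ones.
DEFAULT_STOPWORDS = frozenset({"the", "a", "an", "of", "and", "to", "in", "is"})


def tag_cloud_sizes(
    text: str,
    top_n: int = 5,
    min_pt: float = 10.0,
    max_pt: float = 40.0,
    stopwords: FrozenSet[str] = DEFAULT_STOPWORDS,
) -> Dict[str, float]:
    """Map the top_n most frequent words to font sizes in points.

    Sizes are interpolated linearly between min_pt (least frequent of the
    selected words) and max_pt (most frequent).
    """
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]
    top = Counter(words).most_common(top_n)
    if not top:
        return {}
    highest, lowest = top[0][1], top[-1][1]

    sizes = {}
    for word, count in top:
        # If every selected word is equally frequent, use the largest size.
        scale = 1.0 if highest == lowest else (count - lowest) / (highest - lowest)
        sizes[word] = min_pt + scale * (max_pt - min_pt)
    return sizes


sample = "data data data graph graph the map"
print(tag_cloud_sizes(sample, top_n=3))
```

Layout (where each word is placed, its color and orientation) is a separate problem that this sketch does not address.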
The moving bubble chart.
"The main innovation from Gapminder is so far 'the moving bubble chart' in the form of the Trendalyzer software that was acquired by Google in 2007. Google has made a 2008 version freely available as Google Motion Chart. Gapminder is a non-profit foundation founded in 2005 with a goal of '…increase use and understanding of statistics and other information about social, economic and environmental development at local, national and global levels.'" (Rosling and Johansson, 2009).
An influential, open source implementation of the Grammar of Graphics from Wilkinson (1999) in R, together with other computational tools to make it easier to produce beautiful statistical diagrams
Around this time, software for producing statistical graphs progressed from limited point-and-click interfaces and low-level graphics languages to higher-level computational languages for specifying a graph. This item recognizes the work of Hadley Wickham in the ggplot2, plyr and other R packages, but there are a number of other important contributors to this topic.
A circular diagram designed to facilitate the analysis of relationships among categorical and other variables using chords of a circle with various visual attributes. The main application is to genomic structure, where the chords can encode various properties of genomic sequences.