Treemaps
for space-constrained visualization of hierarchies...
by
Ben
Shneiderman, December 26, 1998, updated April 17, 2000
A University of Maryland student's
study of data visualization.
During 1990, in response to the common
problem of a filled hard disk, I became obsessed with the idea of
producing a compact visualization of directory tree structures. Since the
80 Megabyte hard disk in the HCIL was shared by 14 users it was difficult
to determine how and where space was used. Finding large files that could
be deleted, or even determining which users consumed the largest shares of
disk space were difficult tasks.
Tree structured node-link diagrams grew
too large to be useful, so I explored ways to show a tree in a
space-constrained layout. I rejected strategies that left blank spaces or
those that dealt with only fixed levels or fixed branching factors.
Showing file size by area coding seemed appealing, but various
rectangular, triangular, and circular strategies all had problems. Then
while puzzling about this in the faculty lounge, I had the Aha! experience
of splitting the screen into rectangles in alternating horizontal and
vertical directions as you traverse down the levels. This recursive
algorithm seemed attractive, but it took me a few days to convince myself
that it would always work and to write a six-line algorithm. This
algorithm and the initial designs led to the first Technical Report (HCIL
TR 91-03) in March 1991 which was published in the ACM Transactions on
Graphics in January 1992. Choosing the right name took probably as long,
but the term 'treemap' described the notion of turning a tree into a
planar space-filling map.
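The alternating split described above can be sketched in a few lines of Python. This is a minimal illustration, not the original six-line algorithm from the Technical Report: the `Node` class, the `slice_and_dice` function name, and the 100 by 100 drawing area are all assumptions made for the example. Each node receives a rectangle whose area is proportional to its total size, and the split direction flips at every level of the tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


@dataclass
class Node:
    """Hypothetical tree node: leaves carry a size, directories carry children."""
    name: str
    size: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def total(self) -> float:
        # A leaf's weight is its own size; a directory's is the sum of its children.
        if not self.children:
            return self.size
        return sum(child.total() for child in self.children)


def slice_and_dice(node: Node, x: float, y: float, w: float, h: float,
                   vertical: bool = True,
                   out: Optional[List[Tuple[str, Rect]]] = None) -> List[Tuple[str, Rect]]:
    """Assign a rectangle to every node, alternating the split direction per level."""
    if out is None:
        out = []
    out.append((node.name, (x, y, w, h)))
    total = node.total()
    if not node.children or total <= 0:
        return out
    offset = 0.0  # fraction of the parent rectangle already handed out
    for child in node.children:
        frac = child.total() / total
        if vertical:
            # Children sit side by side, dividing the parent's width.
            slice_and_dice(child, x + offset * w, y, frac * w, h, False, out)
        else:
            # Children are stacked, dividing the parent's height.
            slice_and_dice(child, x, y + offset * h, w, frac * h, True, out)
        offset += frac
    return out


# Made-up example: one file and one directory holding two smaller files.
demo = Node("root", children=[
    Node("a", size=2),
    Node("b", children=[Node("c", size=1), Node("d", size=1)]),
])
rects = slice_and_dice(demo, 0.0, 0.0, 100.0, 100.0)
for name, rect in rects:
    print(name, rect)
```

Because every child takes exactly its share of the parent's rectangle, no blank space is left and the tree may have any depth or branching factor, which are the constraints mentioned above.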
My
initial design simply nested the rectangles, but a more comprehensible
design used a border to show the nesting. Finding an effective
visualization strategy took only a few months but producing a working
piece of software took over a year. Brian Johnson implemented the
algorithms and refined the presentation strategies while preserving rapid
performance even with 5,000 node hierarchies. The TreeViz application ran
on color Macintosh models and led to the widely cited, jointly authored
paper (HCIL TR 91-06) at the October 1991 IEEE Conference
on Visualization. This paper was reprinted in
Readings
in Information Visualization.
PhD
student Brian Johnson's implementation added many other interesting
features such as zooming, sound (as a redundant or independent code, for
example, larger files had a lower pitched sound), hue/saturation control,
many border variations, and labeling control. We struggled to deal with
the problem of many small files in some directories, but wound up showing
only a blackened area that invited closer examination by zooming. We knew
that encoding a linear variable such as file size as an area was breaking
a graphic design guideline, but the benefits of seeing a large range of
file sizes seemed like a compensation. We also knew that visually
comparing long narrow rectangles to squarish ones was problematic, but
cursoring over the boxes produced the exact file size on the bottom of the
display.
My excitement about treemaps was great
and like many innovators I thought millions of users would be using this
tool within a few years. Our minds were not focused on getting a patent,
since I thought this was more of a concept than a product. Brian's
implementation of TreeViz was registered with the University's Office of
Technology Liaison which sought to distribute TreeViz.
We found that new users took 10-15
minutes to get acquainted with the treemap display, so we began to explore
improvements and training methods. We were impressed that we could examine
thousands of nodes at 5-7 levels at once on the screen, but novices did better
seeing 20-50 nodes at 1-3 levels. We had to bring our training times up to
about 15 minutes in order to demonstrate the strong benefits of treemaps. The
1992
HCIL Video Reports and the
1993
HCIL Video Reports showed TreeViz in action. TreeViz is available for
free
downloading.
Masters
student David Turo also built a treemap system on the Sun
workstation and to make it more comprehensible we chose a
fixed-level hierarchy. We used an appealing and familiar sports
application: 453 basketball players, organized into the 27 teams in four
leagues of the National Basketball Association. We had 48 statistics about
each player for the 1991-92 season, so users could choose color and area
coding from points scored, fouls, free throws, etc. The
1993
HCIL Video Reports showed Turo's system with the basketball data. His
Masters thesis (Unpublished!) describes his implementations and an
empirical study. Johnson and Turo cooperated on a paper describing
improvements they made to the visual presentation (HCIL
TR 92-06).
By now we were pushing ahead on several
application domains. A German visitor, Alexander Jungmeister, worked with
Dave Turo's implementation and built a stock portfolio visualization that
showed clients, portfolios, industry groups, stocks and trades (HCIL
TR 92-14). Size might indicate worth of the holdings and color might
indicate the degree of increase/decrease in value. I still believe that a
worthwhile application would be a stock market monitor that would show the
current daily trade activity. It could present the 30 Dow Jones
Industrials, the Standard and Poor's 500, or all 2700 companies on the New
York Stock Exchange. They would be grouped by industry (airlines,
chemicals, drugs,...), area coded by volume of trading, and color coded by
increase/decrease.
A Japanese visitor, Asahi Toshiyuki,
built his own innovative treemap interface to implement the Analytical
Hierarchy Process in decision making (HCIL
TR 94-08). Users could express their opinions of the relative merits
of a decision choice (such as which site to choose for a factory) by
pumping up areas for their preferred choices, and pumping up the areas for
importance of costs, availability of labor, tax breaks, etc. (Figure 5).
The video demonstrates these processes (HCIL
Video Reports 1994, and
HCIL
TR 95-04) and an empirical study showed users could succeed with this
tool.
Another
success story for treemaps was their inclusion in a satellite management
system for Hughes Network Systems (HCIL
Video Reports 1994 and
HCIL
Report TR 94-07). The three-level hierarchy showed each node of
a network as a fixed size and color was used to indicate available
capacity. The engineering-oriented community of ground station operators
grasped this simplified version quickly.
In my travels to lecture about our work,
treemaps became a major topic. However, I ran into resistance when showing
still images of our hard disk directories with thousands of nodes. Once at
the University of Washington after my talk produced a mixed reaction about
treemaps, I asked my audience to follow me down to one of their labs. I
installed TreeViz and examined their hard disk directories. I immediately
spotted a problem, and with a few clues they could see for themselves that
there were three copies of the same C compiler installed on this machine.
The x-ray vision metaphor had proven to be effective on this occasion.
Similarly, at Apple Computers, my audience much preferred the dynamic
queries demos of the HomeFinder, but I gave copies of TreeViz to several
interested attendees. The next day one of them reported finding many
megabytes of useless information on their network servers.
While TreeViz was appreciated for the
Apple Macintosh, we were getting requests for a Windows version. Graduate
student Marko Teittinen took up the challenge and used Galaxy from Visix
to produce a Windows 3.1 implementation called WinSurfer.
Marko's carefully scripted video (HCIL
Video Reports 1995) showed the features which were meant to match the
Windows Explorer. WinSurfer allowed users to view, delete, copy, move,
rename and run files. It worked nicely, but novices were often struggling
to understand the layout that might show 5000 or more files at 7 or more
levels. Simplifying the initial screen presentation to only 2 levels and
allowing simpler user control never got implemented. In fact, we never
produced a Windows 95 version and Galaxy is no longer a viable commercial
product.
However, we did make a temporary patch so
as to enter WinSurfer in the Browse-Off
competition at ACM SIGCHI's 1997 Conference. This event drew six
software tools for exploring hierarchies. There was no clear-cut winner
and WinSurfer was a leader in some tasks.
Micro Logic Corp, a New Jersey company,
sells a commercial product called DiskMapper
for Windows machines, based on the treemap idea. They have received great
press attention and awards for their product. The University of Maryland
receives a modest royalty on DiskMapper by way of a license agreement with
the Office of Technology Liaison.
A recent implementation by Univ. of
Maryland grad students Jerome Brown and Shaun Gittens was done in Delphi.
It is called TreeMap97 (http://www.otal.umd.edu/Olive/Class/Trees/).
This general purpose version of slice-and-dice treemaps was revised by
Chris North.
Other implementations of treemaps have
emerged elsewhere. John Stasko at Georgia Tech produced a nice X-Windows
version, and his colleagues Sougata Mukherjea, James D. Foley, and Scott
Hudson produced an ACM CHI95 paper that used treemaps: Visualizing Complex
Hypermedia Networks through Multiple Hierarchical Views.
Pedro Szekely of the University of Southern
California's Information Sciences Institute cleverly built a quick and
dirty version using his user interface building tools. Treemaps began
popping up in surprising applications such as the visualization of a
tennis match (ref from Info Viz 1997?) and information search results in
the Forager for the Information Super Highway (FISH) system (Mitre Corp
videotape).
The Storyspace hypertext authoring system
from Eastgate offers a treemap viewer:
http://www.eastgate.com
A well-written tutorial on treemaps by
Chris Jones appears at
http://orcs.bus.okstate.edu/jones98/treemaps.htm
A well done implementation of treemaps
shows 535 popularly held stocks, organized by industry groups, size-coded
by market capitalization, and color-coded to show rise or fall:
http://www.smartmoney.com/marketmap/
This clever variation (created by Martin
Wattenberg), which I call cluster treemaps, ensures low aspect ratio
rectangles (most rectangles are square-ish and there are very few thin
rectangles), but gives up the lexicographic ordering in slice-and-dice
treemaps. Wattenberg describes his method in an ACM CHI99 short paper, in
the Conference Companion. Marketmap was written in Java and the software
is available for licensing. The Smartmoney website also offers a software
package called MapStation
http://www.smartmoney.com/shopping/mapstation/.
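To make the aspect ratio point concrete, here is a small Python sketch. It does not implement Wattenberg's method; the function names and the 16-item example are assumptions chosen only to show why one level of slice-and-dice strips yields thin rectangles, while any arrangement closer to a grid yields square-ish ones.

```python
def aspect_ratio(w: float, h: float) -> float:
    """Long side divided by short side, so 1.0 means a perfect square."""
    return max(w, h) / min(w, h)


def strips(sizes, w, h):
    """One slice-and-dice level: side-by-side strips with widths proportional to sizes."""
    total = sum(sizes)
    return [(w * s / total, h) for s in sizes]


def grid(n_cols, n_rows, w, h):
    """Illustrative alternative for equal-sized items only: a plain grid of cells."""
    return [(w / n_cols, h / n_rows)] * (n_cols * n_rows)


# Sixteen equal items in a 100 by 100 area (made-up numbers).
sizes = [1] * 16
worst_strip = max(aspect_ratio(w, h) for w, h in strips(sizes, 100.0, 100.0))
worst_cell = max(aspect_ratio(w, h) for w, h in grid(4, 4, 100.0, 100.0))
print("worst strip aspect ratio:", worst_strip)
print("worst grid cell aspect ratio:", worst_cell)
```

The strips keep the items in their original left-to-right order, which is the ordering the cluster layout gives up in exchange for rectangles that are easier to compare by eye.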
Statisticians point out that the mosaic
display, shown by Bertin and others, is similar to the treemap concept.
For fixed level hierarchies there is a great similarity, but the gist of
the treemap idea was intimately tied to the computerized implementation
and user control panel for setting attributes.
I think treemaps are a convenient
representation that has unmatched utility for certain tasks. The capacity
to see tens of thousands of nodes in a fixed space and find large areas or
duplicate directories is very powerful. I still use TreeViz for cleaning
up my Macintosh. It does take some learning for novices to grasp the tree
structure layout in treemaps, but the benefits are great. I'm delighted to
see the great success of Martin Wattenberg's
http://www.smartmoney.com/marketmap
and the growing interest in other treemap variations during 1999.
A visually intriguing variation by Jarke
J. van Wijk, "cushion treemaps", shows depth of nesting by
shadows on cushion-like 3-D mounds. It was presented at the IEEE Symposium
on Information Visualization, October, 1999:
http://www.win.tue.nl/~vanwijk/ctm.pdf.
Van Wijk also created a new layout strategy that he calls "squarified
treemaps" that avoids high aspect ratio rectangles by using an
alternative to Wattenberg's algorithm. This is presented at the
Eurographics/IEEE TVCG 2000 Symposium:
http://www.win.tue.nl/~vanwijk/stm.pdf.