I don’t use a lot of area graphs. In fact, they weren’t even included in the visuals I use most that were covered in storytelling with data: a data visualization guide for business professionals (tangentially related: all the data and visuals from the book are now available to download). That means I didn’t use any area graphs for a whole year!
I’ve mainly avoided area graphs for a few reasons. They take up a lot of ink, simply due to the typical amount of filled-in color and limited white space. This can make them feel a bit heavy, and may mean there isn’t space left for annotations or labels the way that you often have with a line graph. They also share some issues with stacked bars: if anything interesting is happening further up the stack, it can be hard to see because it’s stacked on other things that are also changing in size from point to point (the connected lines and area, however, seem to help make up for some of this, compared to the disjointed view you get with bars given the white space in between them). Finally, I often find it unclear whether the series in the stack are literally stacked on top of each other (each graphed from the bottom of the given segment to the top), or whether they overlap (each graphed from the x-axis upwards).
All of that said, I’ve found myself using a couple of area graphs lately, and so thought I’d share one example with you. This is from a recent client makeover, where I iterated through a number of different views, both to better understand the main point I wanted to make and to identify an effective visual that would help me make it. The team in this case was plotting the capital budget outlook over time across five categories of spend. The original graph looked similar to the following:
There are a number of things that are not ideal about this initial view. One of the first things I notice is the y-axis and the (000’s) label at the top. A thousand thousands is a million, so I’ll likely drop some zeros and change the scale to millions of dollars. When I look at the x-axis, it starts to become clear that some of this data has already happened, while other dates are in the future (reflecting some sort of forecast or plan). I’ll want to make this visually clear. I can do this in a couple of ways: I can add words, and/or I can make the actual data points for the future dates somehow visually distinct (for example, lighter color, or lower intensity). The following graph incorporates these changes:
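In case it helps to see the logic of those two amendments outside of a graphing tool, here is a minimal sketch in Python. The figures, the cutoff year, and the function names are all made up for illustration; they are not the client’s data.

```python
# Hypothetical budget figures in thousands of dollars, as on a "(000's)" axis.
# These numbers are invented for illustration only.
budget_thousands = {2017: 250_000, 2018: 310_000, 2019: 180_000, 2020: 165_000}

# Assumption: everything up to and including this year has already happened.
LAST_ACTUAL_YEAR = 2018


def to_millions(value_in_thousands):
    """A thousand thousands is a million, so divide by 1,000."""
    return value_in_thousands / 1_000


def label_points(series, last_actual_year):
    """Rescale each point to $M and tag it as actual or forecast."""
    return [
        {
            "year": year,
            "millions": to_millions(value),
            "kind": "actual" if year <= last_actual_year else "forecast",
        }
        for year, value in sorted(series.items())
    ]


points = label_points(budget_thousands, LAST_ACTUAL_YEAR)
for point in points:
    print(point)
```

The "kind" tag is what you would then map to a visual distinction, such as a lighter color for the forecast years.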
Even after I’ve made these amendments, however, a larger issue remains: I have no idea what I’m supposed to do with this data!
I’ve encountered a number of situations similar to this lately, where an analyst or an analytical team views their role as first and foremost to inform. The sentiment seems to be something like, “I’ll put the data out there, and my audience will know what to do with it.” This is a far too passive view of the analyst role, in my opinion. If you are analyzing data, you likely know it better than anyone else. This means you are in a unique position to drive value based on that data. Don’t simply show data: turn it into information that your audience can act upon!
Let’s spend some time looking at the segments being graphed and considering what they represent. We see the budget for three major projects decreasing markedly from 2018 to 2019, and then decreasing slowly over time:
There is also a decreasing trend for other existing projects, which look to be at zero by 2021:
The budget for new projects increases a bit over time:
That leaves two final pieces: existing allowance and proposed increase. Hm. There might be something interesting here. But before we jump to that, let’s get a little clearer on what we’re looking at. While the three data series up to this point have been stacked one on top of the other, that changes when we get to the existing allowance. The existing allowance is actually the first three data series combined, plus a small amount in excess of that—we only see this latter bit graphed separately:
The final series, proposed allowance, is all of this together, plus the incremental part that is shown explicitly with the top blue series currently:
So we aren’t even graphing this data quite right currently.
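To make the relationship between the series concrete, here is a rough sketch of how the two allowance totals can be rebuilt from the pieces that were graphed. All of the numbers and variable names are hypothetical (in $M); only the structure follows the description above: the existing allowance is the three project series plus a small excess, and the proposed allowance is the existing allowance plus the incremental piece.

```python
# Hypothetical values in $M, invented for illustration only.
years = [2019, 2020, 2021]
major_projects = [90, 80, 70]
other_existing = [30, 15, 0]
new_projects = [20, 25, 30]
allowance_excess = [10, 10, 10]    # the sliver of existing allowance graphed on its own
proposed_increment = [40, 60, 80]  # the top series in the original stack


def elementwise_sum(*series):
    """Add several equal-length series together, year by year."""
    return [sum(values) for values in zip(*series)]


projects_total = elementwise_sum(major_projects, other_existing, new_projects)
existing_allowance = elementwise_sum(projects_total, allowance_excess)
proposed_allowance = elementwise_sum(existing_allowance, proposed_increment)

for year, existing, proposed in zip(years, existing_allowance, proposed_allowance):
    print(year, existing, proposed)
```

Graphing `existing_allowance` and `proposed_allowance` as totals, rather than as thin increments on top of a stack, is what makes the two directly comparable.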
Let’s unstack the bars and focus on the total amounts for these final two data series (I’ll also use this as an opportunity to adopt a color scheme that is more fitting with the branding of the organization, the logo of which is bright blue and dark blue-grey):
One of the most interesting parts of the data here is how the proposed allowance compares to the existing allowance, so let’s actually get rid of everything else:
It will be easier to focus on the difference between the bars in the preceding view if we turn them into lines:
The really interesting thing here is the gap between the lines, so let’s focus on that:
At this point, I decided to play with the design a bit and put words on the graph to make what we are looking at clear:
Next, if we want to pull the breakdown of the original first three data series back in, I can do that with a stacked area graph:
In this final view, focus is meant to be on the gap between the existing and proposed allowance, which we can see amounts to $195M over the period from 2019 to 2021. I can still see the three major projects and other existing projects decreasing over time, as well as the slight increase in new projects, so that detail from the original view is preserved but kept from being distracting by graphing it all in the same (relatively muted) color and labeling directly. I focus on the thing that is different: the gap in bright blue. There are words, both in the title and in the right-hand margin, that help me understand what I’m looking at and what action needs to be taken. Looks like we need some additional budget!
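The headline number in a view like this is simply the difference between the two lines, summed over the period. A minimal sketch, using made-up values in $M rather than the client’s figures (so the total here will not match the $195M above):

```python
# Hypothetical allowance totals in $M, invented for illustration only.
years = [2019, 2020, 2021]
existing_allowance = [150, 130, 110]
proposed_allowance = [190, 190, 190]

# Gap between the two lines in each year.
gap_by_year = {
    year: proposed - existing
    for year, existing, proposed in zip(years, existing_allowance, proposed_allowance)
}

# Cumulative gap over the whole period: the number to call out in words.
total_gap = sum(gap_by_year.values())

print(gap_by_year)
print(f"Additional budget needed {years[0]}-{years[-1]}: ${total_gap}M")
```

Computing the gap explicitly like this also gives you the exact figure to put in the title or annotation, rather than leaving your audience to estimate it from the graph.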
In case it’s useful, you can download the Excel file with these graphs.