Last month, we focused on a variation of a bar graph when I challenged you to make a waterfall chart. (45 people shared their creations: be sure to check out the recap post!) This month, we’ll be practicing an alternative form of a line graph: the slopegraph.
what is a slopegraph?
For me, “slopegraph” is really just a fancy word to describe a line graph that only has two points on the x-axis. Sometimes we don’t think of using lines when we only have two points; however, depending on our data (and what we want to enable our audience to see), they can sometimes work quite well.
Let’s look at a progression to better understand what a slopegraph helps us see. Check out the following basic dual-series bar chart depicting sales at two points in time by region.
When we look at a dual series bar chart like the one above, typically we are comparing the heights of the paired bars to each other. So let’s actually draw some lines on the graph to reflect this.
Now that I’ve drawn the lines, I no longer need the bars. Let’s eliminate those.
I also don’t need the spread in horizontal space: I can collapse these lines. (I’ll also remove all of the other details; I’ll add these back momentarily.)
I guess I don’t need to compress the lines that much: let’s expand a bit to make better use of the space.
Next, I’ll add titles and labels back in so we know what we’re looking at. Voilà! We have a slopegraph.
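For anyone who would rather build this in code than in a charting tool, here is a minimal sketch in Python. It assumes matplotlib is installed, the sales figures are invented for illustration (the post does not give the underlying numbers), and the helper names `slope_segments` and `draw_slopegraph` are hypothetical. It is one possible way to draw a slopegraph, not how the graphs in this post were made.

```python
def slope_segments(start, end):
    """Pair each category's start and end value into one line segment.

    Returns (category, start_value, end_value) tuples for every category
    present in both dicts, in the order they appear in `start`.
    """
    return [(name, start[name], end[name]) for name in start if name in end]


def draw_slopegraph(start, end, left_label, right_label):
    """Draw one line per category between two x positions (0 and 1)."""
    # Imported here so the data helper above works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 6))
    for name, y0, y1 in slope_segments(start, end):
        ax.plot([0, 1], [y0, y1], marker="o", color="gray")
        # Label both ends directly so no legend is needed.
        ax.text(-0.05, y0, f"{name}  {y0}", ha="right", va="center")
        ax.text(1.05, y1, f"{y1}", ha="left", va="center")
    ax.set_xticks([0, 1])
    ax.set_xticklabels([left_label, right_label])
    ax.set_xlim(-0.5, 1.5)
    # Hide the y-axis and frame; the direct labels carry the values.
    ax.yaxis.set_visible(False)
    for spine in ax.spines.values():
        spine.set_visible(False)
    return fig


# Hypothetical sales figures, invented for illustration only.
sales_start = {"Region A": 120, "Region B": 80, "Region C": 60, "Region D": 40}
sales_end = {"Region A": 100, "Region B": 95, "Region C": 75, "Region D": 42}

segments = slope_segments(sales_start, sales_end)
```

Calling `draw_slopegraph(sales_start, sales_end, "Year 1", "Year 2").savefig("slopegraph.png")` would write the chart to a file. Titles, color, and finer label placement are left out to keep the sketch short.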
Let’s pause to take note of a couple of things. First, consider what these lines depict: they are visualizing the change. When the line slopes upwards, it means that region had an increase in sales. Where the line slopes downwards, the region's sales decreased. Lines with steeper slopes represent regions where sales changed more, in absolute terms, than regions with flatter slopes. I find that a solution often raised when we want to focus on change in an example like the one above is to move from the dual-series bar chart to a single-series bar chart that graphs the difference (the change). The downside of this is that you lose the context of the basis: the relative beginning (or ending) heights of the bars. With the slopegraph, however, the focus is on the change (the line) but you still preserve the context of relative sizes of the categories (via vertical position on respective axes). So my audience can intuitively see relative magnitudes of change, as well as what basis that change is happening from.
As another sometimes-benefit, when I shift from bars to lines, I now have the ability to rework my y-axis and can scale it to start at something other than zero. (This is not ok with bars, where we compare endpoints relative to each other and the baseline and thus need the full bar for accurate comparison, but it is ok for lines, where we focus on relative positions in space and the relative slopes of the lines that connect those positions, which remain constant as we zoom.) This means if numbers are close and small differences are meaningful, I have the ability to zoom in to get more spread. I didn’t need to do that in this particular example, but this can sometimes be useful.
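To make the zooming point concrete, here is a small Python sketch. It assumes a simplified model in which the plot area is one unit tall and the two x positions are one unit apart; the function name `screen_slope` and all values are hypothetical. Narrowing the y-axis range makes every line steeper on screen, but by the same factor, so the slopes relative to each other do not change.

```python
import math


def screen_slope(y0, y1, y_min, y_max):
    """Slope of a line as drawn when the y-axis spans y_min..y_max.

    Assumes the plot is 1 unit tall and the two x positions are 1 unit
    apart, so the drawn slope is the value change divided by the axis range.
    """
    return (y1 - y0) / (y_max - y_min)


# Hypothetical start and end values for two lines.
lines = {"Region B": (80, 95), "Region D": (40, 42)}

# Slopes with a zero-based axis versus a zoomed-in axis.
full = {name: screen_slope(y0, y1, 0, 120) for name, (y0, y1) in lines.items()}
zoomed = {name: screen_slope(y0, y1, 30, 100) for name, (y0, y1) in lines.items()}

# Each line gets steeper when zoomed, but the ratio between them holds.
ratio_full = full["Region B"] / full["Region D"]
ratio_zoomed = zoomed["Region B"] / zoomed["Region D"]
same_ratio = math.isclose(ratio_full, ratio_zoomed)
```

The same check would fail for bars, since a bar's drawn height depends on where the baseline sits, not only on the axis range.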
Also, when there is a general trend (many things increasing, many things staying flat, or many things decreasing), exceptions can be picked out very quickly with the slopegraph view. For example, in the slopegraph above, notice how easy it is to see that Region A is the only region that decreased in sales in the time period shown. I can take this a step further, highlighting this takeaway via words and sparing color:
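In code, one way to approach this kind of highlighting is to find the categories that move against the majority and give only those an accent color. The sketch below is an illustration under assumptions: the data is invented, the function name `highlight_colors` is hypothetical, and the color names are matplotlib-style strings chosen arbitrarily.

```python
def highlight_colors(start, end, accent="tab:orange", muted="lightgray"):
    """Give the accent color to categories that move against the majority.

    Categories that are flat, or that follow the general trend, stay muted.
    If increases and decreases are tied, nothing is highlighted.
    """
    changes = {name: end[name] - start[name] for name in start}
    ups = sum(1 for change in changes.values() if change > 0)
    downs = sum(1 for change in changes.values() if change < 0)
    if ups == downs:
        return {name: muted for name in changes}
    majority_sign = 1 if ups > downs else -1
    return {
        name: accent if change * majority_sign < 0 else muted
        for name, change in changes.items()
    }


# Hypothetical sales figures, invented for illustration only.
sales_start = {"Region A": 120, "Region B": 80, "Region C": 60, "Region D": 40}
sales_end = {"Region A": 100, "Region B": 95, "Region C": 75, "Region D": 42}

colors = highlight_colors(sales_start, sales_end)
```

With this made-up data, only Region A receives the accent color. The resulting dict could be passed as the `color` for each line when plotting, with a sentence of text stating the takeaway alongside.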
slopegraphs for group comparisons
The preceding slopegraph shows change over time. Slopegraphs can also work well to show group comparisons. I’ve had people get angry with me before for using lines for non-continuous data. This is one of those popular data visualization myths (that lines are only for continuous data—I’ve been guilty of oversimplifying with these words in the past, which aren’t quite right). The rule—and though there aren’t many hard and fast rules in data visualization, I feel comfortable characterizing this as one—is when you are graphing data in a line graph, the lines that connect the points need to make sense. While this is commonly the case for continuous data, it can happen for non-continuous, or categorical, data as well. Take the case of the slopegraph for group comparisons as an illustration. In the following example, I’ve graphed employee survey data for the Total Organization on the left and the Sales Team on the right.
Let’s consider what the lines in the slopegraph above represent. In cases where the line slopes downwards, Sales underperformed (scored lower than) the overall organization. Where the lines slope upwards, Sales outperformed (scored higher than) the overall organization. The steeper the slope, in either direction, the larger the gap between the Sales score and the overall organization score for that category. These lines make sense. They are visually encoding the difference between each category across the two groups.
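The quantity each line encodes here can be written out as a simple difference per category. The Python sketch below uses invented survey categories and scores (the post does not list the actual numbers), and `group_gaps` is a hypothetical helper name.

```python
def group_gaps(left, right):
    """Return (category, right minus left) pairs, most negative gap first."""
    gaps = {name: right[name] - left[name] for name in left if name in right}
    return sorted(gaps.items(), key=lambda item: item[1])


# Hypothetical percent-favorable scores, invented for illustration only.
total_org = {"Peers": 85, "Culture": 80, "Management": 75, "Career development": 70}
sales_team = {"Peers": 88, "Culture": 72, "Management": 60, "Career development": 45}

gaps = group_gaps(total_org, sales_team)

# Negative gap: Sales scored lower than the overall organization.
underperforming = [name for name, gap in gaps if gap < 0]
```

A negative gap corresponds to a downward-sloping line and a positive gap to an upward-sloping one; the size of the gap corresponds to how steep the line is.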
By the way, if I spend some more time playing with the design of the above slopegraph, I might do something like the following:
In the above view, I moved the x-axis labels (Total Org and Sales Team) to the top so that my audience sees them before getting to the data and knows right away what they're looking at. I also added vertical lines, as I like how this helps highlight that the overall spread in feedback (from min to max score across categories) is larger for the Sales Team compared to Total Org. I also de-emphasized the data points, labels and lines coming from Total Org so those are there for reference, but attention is focused more on the right-hand side via black markers and labels. In the examples I've highlighted here, I used circle data markers for all. In cases where you have many lines, you may simply want to show the lines (without markers), which can look less cluttered. That said, I like using end markers when I'm including the data points and labels, as I think this helps visually anchor and tie the two together.
Perhaps we take it a step further, highlighting a specific takeaway:
And now, recognizing that I’ve focused only on the negative in the two examples we’ve looked at, here’s a view with a happier point made:
Because the lines are what take up most of the ink in a slopegraph, the focus is primarily on the change or difference between the categories—consider using a slopegraph when that is something important you’d like your audience to focus on or easily see.
Slopegraphs are often lauded for their clean design. I frequently do an exercise in my workshops where people in small groups look at the same data graphed multiple ways, with a slopegraph being one of the options. I continue to be surprised by how many favor this view, in spite of the issues that the slopegraph has for the specific example I use. Speaking of issues…let’s discuss some of the shortcomings of the slopegraph.
Whether a slopegraph will work for any given data is highly dependent on the data itself. If you have tightly bunched data, or many crisscrossing lines, it can be difficult to see what’s going on, or to label the data in a way that is legible. Also, you don’t have control over the relative ordering of categories (rather, they are where they are because of the data they are tied to). This is no issue if your categories don’t have intrinsic ordering, but can become problematic if they do (for example, survey categories ranging from strongly agree to strongly disagree, which you’d want to order meaningfully). Slopegraphs can also throw people because they look different than a typical line graph, which can sometimes be off-putting. While slopegraphs are possible in most tools (since they are simply line graphs), they can take time and patience to format in the desired fashion.
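Two of these issues, crisscrossing lines and tightly bunched values, can be checked roughly before drawing anything. The Python sketch below is a heuristic under assumptions: the helper names and the data are hypothetical, and what counts as too many crossings or too small a gap depends on chart size and font, so no threshold is built in.

```python
from itertools import combinations


def count_crossings(start, end):
    """Count pairs of lines that swap vertical order between the two sides."""
    return sum(
        1
        for a, b in combinations(list(start), 2)
        if (start[a] - start[b]) * (end[a] - end[b]) < 0
    )


def smallest_label_gap(values):
    """Smallest distance between neighboring values on one side.

    Requires at least two values. A small gap suggests overlapping labels.
    """
    ordered = sorted(values.values())
    return min(upper - lower for lower, upper in zip(ordered, ordered[1:]))


# Hypothetical figures, invented for illustration only.
start = {"Region A": 120, "Region B": 80, "Region C": 60, "Region D": 40}
end = {"Region A": 100, "Region B": 105, "Region C": 75, "Region D": 73}

crossings = count_crossings(start, end)
tightest_end_gap = smallest_label_gap(end)
```

Here one pair of lines crosses (Region A and Region B), and the two closest labels on the right side are only 2 units apart, which hints that Region C and Region D labels may collide.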
That said, there are certainly good uses for slopegraphs. Here’s where you come in.
My challenge to you: find some data of interest that will lend itself well to a slopegraph, then create one. DEADLINE: Friday, June 8th by midnight PST. Full submission details follow (be sure to email it to us, taking note of specifics below, for inclusion in the recap post!). You're also welcome to share at any point on social media using #SWDchallenge.
- Make it. Identify your data and create your visual with the tool of your choice. If you need help finding data, check out this list of publicly available data sources. You're also welcome to use a real work example if you'd like, just please don't share anything confidential.
- Share it. Email your entry to SWDchallenge@storytellingwithdata.com by the deadline. Attach your image as a .PNG. Put any commentary you’d like included in my follow-up post in the body of the email (e.g. what tool you used, any notes on your methods or thought process you’d like to share); if there’s a social media profile or blog/site you’d like mentioned, please embed the links directly in your commentary (e.g. Blog | Twitter). If you’re going to write more than a paragraph or so, I encourage you to post it externally and provide a link or summary for inclusion. Feel free to also share on social media at any point using #SWDchallenge.
- The fine print. We reserve the right to post and potentially reuse examples shared.
In case it's of interest, we’ve written about slopegraphs a few times in the past. Check out the following if you’d like to read more: I like [candy] bars better than donuts, more on slopegraphs, slopegraph template, visualizing change via slopegraph. Also if you have tips or tricks or other points related to slopegraphs that others can learn from, please leave a comment to share!
I look forward to seeing what you come up with! Stay tuned for the recap post in the second half of June. Also check out our new #SWDchallenge page on the blog for past challenge details and recaps.