power pairing: color + words

What is one thing you’ll do differently after learning the storytelling with data lessons?

At the end of our workshops, participants are often prompted to reflect on this question. The resulting discussion usually evolves into things that can be easily integrated into the day-to-day work already being done. One piece of advice we frequently give may surprise you—there are two easy actions that don’t require complicated technical skills! First, adopt the habit of stating your takeaway in words. Second, develop the practice of using color sparingly. Today’s post is a quick illustrative example that puts these tips to use. 

At a recent client workshop, we discussed a visual similar to the one below. It is a snapshot of an organization’s current accounts payable (AP) by vendor at a point in time. At a basic level, the graph is fine. It’s cleanly designed with a left-aligned chart title, data labels incorporated into the bars, and no clutter of gridlines or chart border. The bar chart is easy for me to read—I can quickly see that AP is highest for Microsoft and how incrementally larger it is compared to the other vendors because of the consistent baseline (the y-axis). 

 
Picture1.png
 

What I can’t easily see is what I should take away from this chart. At client workshops, we often don’t have this important context—because of this, we often show multiple approaches for highlighting different potential takeaways. Below you’ll see several strategies for employing color and words in this visual. In each of these, notice how the words set up your expectations for what’s emphasized in the graph and color used sparingly indicates where to look in the visual. 

If the audience is interested in the highest spend, I could emphasize the largest vendor:

 
 

Perhaps the audience will be more curious where AP is concentrated. I could instead focus attention on the top vendors:

 
 

What if the conversation is about expectations—is this spend surprising or unsurprising? I might add additional context with super-categories—useful if the audience is unfamiliar with these vendors’ services—grouping and employing similarity of color and position to visually tie the text to the data it describes. 

 
 

Practice pairing color and words in your visuals to be more effective when communicating for explanatory purposes with data. Bonus: you don’t need fancy tools to do either of these things!

Download the file for a peek at how I created these visuals in Excel. 


Elizabeth Ricks is a Data Storyteller on the SWD team. She has a passion for helping her audience understand the ’so-what?’ as concisely as possible. Connect with Elizabeth on LinkedIn or Twitter.

tactical tip: embedding a vertical reference line in Excel

Today's post is a step-by-step Excel “how-to” inspired by a reader question we received following a recent post on using dotted lines in data visualizations.

Dave asked:
“Do you know of a trick for drawing vertical lines to delineate years (or actuals/historical vs forecast/future segments of the chart)? I currently have to draw them with the line drawing tool, which gets messy when moving the chart on a PPT slide. If there were a way to embed it in the data or somehow format the chart, that'd be awesome.”

The following chart illustrates what Dave describes. The data is units of output over time where the first nine months of the series are actual data and the remaining four months of the year are a forecast. The dotted line serves as a visual cue to differentiate actual from forecast. Created in Excel, the line was physically drawn on the graph with the Shape Illustrator. While this approach might suffice as a quick method for achieving the desired effect; it isn’t ideal for recurring use of the graph, particularly if the line’s position on the x-axis might change in future iterations.

 
Picture1.png
 

After some research and playing around in Excel, I’ve devised one method for achieving this effect, which I’ll outline in this post (I’m sure there are others!). Don't be dismayed by the number of steps: it's a one-time setup after which can be easily refreshed in future iterations by changing where you want the reference line. I’m using Excel 2016 and you can download the accompanying file.

In my spreadsheet, the data for the Output over time chart looks like this:

 
Picture2.png
 

1. Go to a blank cell range and enter these values as shown in my screenshot below. I’m choosing to add these new values directly underneath my data range in cells F19:G21. This will eventually become the coordinates for a secondary scatterplot that we’ll add in a later step.

 
Picture3.png
 

2. Choose where you want the vertical reference line to cross the x-axis and enter those values below “X”. In this example, I want the line located on the September data point, the ninth point in my data series. In cells G20:G21, I entered “9” in each, as shown below. (Note: for a more automated approach in a larger dataset, a MATCH formula could also calculate where September falls in the range: =MATCH("Sep",$F$7:$F$18,0).

 
Picture4.png
 

3. Add a new data series by right-clicking the graph and choosing Select Data:

 
 

4.  In Select Data Source dialog, click the Add button.

 
 

5. In the Edit Series dialog, enter a name for your data series (I chose “reference”) and select the X values you entered from Step 2. I selected the 9’s in G20:G21. Click OK to exit the dialog boxes.

 
 

The resulting visual looks like this:

 
 

6. Right click the new line and choose Change Series Chart Type.

 
 

7. In the Change Chart Type dialog box, select Combo section under All Charts tab. Then select Scatter with Straight Lines and check the option for Secondary Axis. Click OK to exit.

 
Picture10.png
 

The resulting visual looks like this:

 
 

8. Go to the chart, right click the red reference line and choose Select Data again. In the Select Data Source dialog, highlight reference and click Edit.

 
 

9. In Edit Series dialog, update the X values to be the original values you selected in Step 5. Set the Y values to be 0,1. Click OK to exit.

 
Picture13.png
 

The resulting visual looks like this:

 
 

The remaining steps are visual cleanup: first, I forced the red line to align with the top of the primary y-axis and second, I hid the secondary axis line and text labels.   

10. Right-click on the secondary y-axis and select Format Axis:

 
Picture15.png
 

11. In the Axis options section, type 1 into the textbox beside the Maximum option.

 
 

12. In the Text Options section, under Text Fill, choose No fill. This will remove the text labels on the secondary y-axis.

 
 

13. In the Axis Options section, under Line, choose No line. This will remove the secondary y-axis line.

 
 

Voila! The resulting visual has an embedded vertical line, which is plotted on a hidden secondary y-axis.

 
Picture20.png
 

Recall that this goal of this specific scenario was a dotted line which visually differentiated the actual and forecast sections. My last step was to change the formatting of the line to appear as a thin, grey dashed line.  

(Note: To achieve your preferred formatting, right-click the line and select Format Data Series in the context menu where you’ll find formatting selections in the resulting dialog pane.)

 
 

This method does come with some trade-offs to consider.

One downside is that you lose some control over the exact placement of the line where it crosses the x-axis. Below you’ll see a comparison between the manual vs embedded approach. With the manual approach, the line can be drawn exactly on the tick mark between the Aug & Sep data points, providing a clean alignment with the x-axis. With the embedded approach, the line is centered above the Sep label, resulting in a slightly less seamless effect.

 
 

On the cosmetic side, another downside is losing the flexibility to manipulate the length of the line for labeling purposes. I’ll illustrate this with a horizontal bar chart (which I also created using this method). With the manual approach, I can physically draw the line to extend above the x-axis line, aligning it closely to the “Target” text label. With the embedded approach, the line stays below the x-axis line, creating a gap between the line and the label that describes it.

 
 

You can download the Excel file to see the behind-the-scenes of these graphs. Are there other methods you’re aware of for achieving this effect? Or other considerations with embedding the reference line directly? Leave a comment with your thoughts!


Elizabeth Ricks is a Data Visualization Designer on the Storytelling with Data team. She has a passion for helping her audience understand the ’so-what?’ as concisely as possible. Connect with Elizabeth on LinkedIn or Twitter.

when to use a dotted line

When visualizing and communicating with data, one design element we can play with is line style. Most tools default to and we most often use and see solid lines. But a dotted line is another possibility. What considerations should we think about with a dotted line? When should we use one? In this post, I’ll outline my thoughts and illustrate the scenarios in which I find myself using dotted lines through examples plus will share some commentary on how to do this in your tools.

considerations with dotted lines

Dotted lines are super attention grabbing. They also convey a sense of uncertainty that can be useful. The challenge is that dotted lines introduce some visual noise. From a clutter standpoint, we’ve taken what could have been visualized as a single visual element (a line) and chopped it into a ton of pieces (many little lines, dashes, or dots). Because of this, I recommend against using the dotted line as a way to attract attention (rather use less noisy means of contrast, such as position, size, or color for this). Preserve the use of dotted lines for when there is a target or goal we are trying to hit or remain a certain side of or when there is uncertainty to depict (a forecast or prediction). In these cases, the visual differentiation and sense of uncertainty that the dotted line helps depict makes up for the additional visual noise it introduces. Let’s look at an example of each of these use cases and some dotted lines in action.

dotted line for a goal or target

I was recently working with a graph similar to the following that depicted time to fill a given type of role at a company:

Dotted Lines 1.png
 

There are three lines in the graph above: (1) the Goal—which you likely looked at first both due to position and because it’s bold black, which stands out more than the other colored lines, (2) average days to fill roles for Internal candidates (orange), and (3) average days to fill roles with External hires over time (teal). The Internal and External lines represent data that we’ve collected and summarized, whereas the Goal is something we have set and in this case we’d like to stay below. To set the Goal apart—and here I'm interested in making it less attention grabbing but still want it there for reference—we could use a dotted line. Dotted lines come in different styles, both in terms of the thickness of the line and how large or small the individual pieces are. We also have some other design elements at our disposal when it comes to the formatting of the line and the text that goes with it. Here’s an example of how I iterated to land on a combination I liked:

Dotted Lines 2.png

I prefer the final view, where the dotted line is thin and grey, effectively pushing it to the background (in spite of the noise that this line style introduces). I should probably mention that there are other styles of dotted lines as well (for example, some that combine dashes and dots)—I recommend avoiding these because they look quite messy and instead choose a style where the segments (whether dashes or dots) are consistently spaced. Here is what my final iteration looks like in the full graph:

Dotted Lines 3.png
 

I like this. The GOAL is clearly stated and still the first thing I see, but due to the formatting it feels more like reference or context, while the thicker solid lines are the clear focus of the graph. Also—and perhaps it's just me, but—there's something about this view that makes it feel easier to compare each of the individual lines (Internal and External) to the GOAL than in the original view when all of the lines were of similar thickness.

dotted line to depict uncertainty

I often see graphs where some data is actual and some is forecast and there isn’t anything done to differentiate the two, like the following example.

 

Given that we are standing in 2018, some of the data in the graph above clearly hasn’t happened yet and so must be forecast. But how much? Was this graph recently made using actual data through 2017 and forecasting thereafter? Perhaps, but we’d have to make that assumption, because nothing in the graph tells us. Or maybe there was a footnote hiding down at the bottom of the original that articulated this (there wasn't, but in case there were); I shouldn't have to read the fine print in order to know how to read the graph. Don’t make your audience question, make assumptions, or hunt for detail like this—make it clear.

One option is to use words to differentiate between actual and forecast. In the following, I added supercategories along the bottom to indicate which dates are associated with actual data and which are forecast (check out the recent post illustrating the step-by-step on how to achieve in Excel if that’s of interest). Since we're on the topic of dotted lines, I could also add a dotted line to further visually differentiate actual vs. forecast data:

Dotted Lines 5.png
 

Even better, though, if I preserve my use of the dotted line to depict the uncertainty directly for the portion of the line that represents forecast data. If I do this, I don’t need to add the additional line at all. Here’s what it could look like:

Dotted Lines 6.png

In this case, both the line style and the words on the x-axis make it quickly clear which data points represent actual data and which are forecast. I'm a big advocate of thick, bold, solid lines and data markers for actual data and thin, dotted lines (and sometimes non-filled in data markers, though I felt that looked too messy here) for forecasts and predicted data because of the way it helps us intuitively understand what we are looking at when executed well.

In the above, I’m assuming the forecast data points are important enough to label directly (in other words, that the specific numeric values are important; if that weren’t the case, you might approach this differently, perhaps only labeling the 2017 and 2022 points, or not labeling any of them and rather letting the y-axis for general magnitude be good enough). I can imagining different people making different choices here depending on both what you want the audience to focus on as well as personal aesthetic preferences. The meta-point: be thoughtful when it comes to design details in general, and your use of dotted lines in particular.

the tactical: how do I do this in my tool?

Changing the formatting of a single line from solid to dotted—like in the first example above—is possible and pretty straightforward in most tools. This is typically achieved through a menu or code to change the line style. If you aren’t sure how to do this, some smart Google searching with the name of your tool and something like “change line in graph to dotted” should point you towards a solution.

Changing just part of a line from solid to dotted is slightly more complicated, but there are a couple of solutions for that. There is the brute-force method of physically formatting each individual data point (for example if you are working in Excel, you would click once to highlight the series, then click again to highlight an individual data point, and then can format that data point or associated line individually as you would like it). As you can imagine, this can be time-consuming. Another way is to make what will appear to be a single line actually two different data series, allowing you to format them separately. In the Sales example above, I’d have a column of dates that goes from 2010 to 2022 to set my x-axis. Then my first series for the ACTUAL data would have values from 2010 to 2017. I’d have a second series for my FORECAST line that has values from 2017 to 2022 (note the overlap with 2017 having values for both ACTUAL and FORECAST to avoid a gap in the line). I imagine you could use a similar approach in other tools.

You can download the Excel file with the above examples (including both the brute force and more elegant solutions described above for the second example).

These are the two use cases in which I find myself using dotted lines and promoting their use. Are there other cases where you’d recommend using a dotted line? Or additional considerations we should have in mind when choosing line style? Leave a comment with your thoughts!

the cat in the hat knows a lot about data visualization

Check out this video of my recent Stanford MBA guest-lecture, where I discuss the importance of being sparing and intentional in your use of color and putting your thoughts into words when communicating visually with data.

Read More

tell your audience what you want them to know

It sounds simple. It sounds obvious. But how often do people give a presentation or send out a report or email without ever making it clear what they want you to know? 

Have you ever looked at a graph and thought, I'm not sure what I'm meant to get out of this? Or sat through a presentation or meeting only to realize once it's over that you're not really sure what you just witnessed?

I listened to a presentation last week where the person speaking put up a busy-looking, data-heavy slide. It wasn't a good slide, but the speaker was clearly comfortable on stage and knowledgable about his topic, so I was motivated to understand what he was trying to communicate. Then he said a few magic words: "what this is meant to show is..." followed by a clearly articulated statement. It is amazing how those simple words can make the intimidating accessible.

The effect these words had on me was to generate more patience on my part (and even a little curiosity) to understand what the slide was showing. The speaker knew what he wanted the audience to get out of it and walked us through the visual in a way that made sense. It still wasn't a fantastic visual - there are changes that could have made it more effective - but his words overcame this shortcoming.

Tell your audience what you want them to know.

It's simple advice, but the impact can be profound!