“how do I incorporate visual design into our monthly deck?”

After reading storytelling with data or participating in a workshop, people often ask how they can incorporate the lessons into a recurring (e.g., monthly or quarterly) report. These reports often materialize as a PowerPoint deck that started out sparse but over time has taken on a life of its own and now resembles the “slideument”: part presentation, part document, but not exactly either at its best.

Consider the slide below, which is based on an actual slide from a recent client workshop. (I’ve anonymized the client’s data to preserve confidentiality.) Today’s post demonstrates how to apply data storytelling lessons to a visual from a monthly deck, illustrating the thought process to improve it.

Picture1.png

This slide shows a monthly trend of customer service complaints: in total (top chart) and broken down by category (bottom chart). The commentary section tells us (the audience) what the important points of reference are: what happened this month compared to last month (complaints are up 14%), where it changed (Employees) and their proposed next steps. However, notice how much work it takes to read through all this text and then find evidence of this in the graphs.

Imagine if you were given this slide to determine an action plan. If you were in a live meeting, would you be able to read all of this text and listen to the presenter at the same time? If you weren’t in the meeting and were reading through the deck, how much time would you realistically spend trying to digest the information presented? We can improve on this visual in both scenarios with a few design changes.

In both cases, I used the commentary as a guidepost for the important takeaways and re-designed the visuals accordingly.

First, let’s take a closer look at the top chart. The commentary tells us that complaints were up 14% vs the prior month.

Picture2.png
 

Where did your eyes go first in this graph? Mine went to the red Average line, which I visually estimated to be about 410 per month.  In looking for evidence of the 14% increase in December, I had to do a lot of mental math (add the Solicited + Unsolicited for November and compare it to Solicited + Unsolicited for December) which took more time than someone would likely spend doing this.

If that 14% increase is what the audience should know, check out the difference between the original visual and this:

Picture3.png
 

When applying the “where are your eyes drawn?” test, my eye went straight to the data markers & labels at the end of the total line, where I could see both the absolute numbers and annotations telling me it’s a 14% increase. Since we’re visualizing time, I changed the graph type from a bar chart to a line chart, unstacked the data series, and added a series for the total. This was intentional based on the commentary, which only referenced the total trend. I chose to de-emphasize the subcomponent pieces (Unsolicited and Solicited) by using grey.
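For anyone building this chart from the raw numbers, the total series and the month-over-month change can be calculated up front rather than left as mental math for the audience. Here is a minimal sketch in Python. The monthly counts are made-up placeholders (the client's data is confidential), chosen only so the change works out to 14%, and the function names are my own.

```python
def total_series(solicited, unsolicited):
    """Add the two complaint series month by month to get the total line."""
    return [s + u for s, u in zip(solicited, unsolicited)]


def pct_change(previous, current):
    """Percent change from one period to the next."""
    return (current - previous) / previous * 100


# Hypothetical counts for November and December (not the client's data).
solicited = [200, 230]
unsolicited = [150, 169]

total = total_series(solicited, unsolicited)
change = pct_change(total[-2], total[-1])

# Text for the annotation next to the final data marker.
label = f"{change:+.0f}% vs. prior month"
print(label)
```

Computing the label once and placing it next to the final data point means the audience reads the takeaway instead of deriving it.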

Side note: what about the Average line? If the monthly deviation from average was really important, one option would be to keep it in the graph for reference with the tradeoff that adding a fourth data series could create clutter. Another option is an entirely different choice of visual, depicting the monthly change (from average), with a visual cue to indicate that December’s data is acceptable. Both are choices the information designer would make knowing the audience and what context is relevant. In this case, I didn’t feel that this additional point added anything to the overarching story, so I chose to eliminate it altogether.

Let’s take another look at the second visual now. The commentary tells us that complaints were up in a specific category: Employees. Not only did they increase, but they increased from 87 to 117. Apply the “where are your eyes drawn?” test again with the original visual.

Picture4.png
 

If I took an informal poll of readers here, some might have gone to the black line, others might have noticed the blue line first and others (like me) went to the red line. Regardless of which line you focused on first, I’d likely bet that you didn’t focus first on the November to December increase in the Operations line (red).  In fact, it’s difficult to discern the absolute numbers (87 and 117) here because of the general clutter: overlapping data series, gridlines, color, heavy chart border and legend at the bottom requiring some visual work to figure out which line goes with which complaint category.

When setting out to improve a visual, there’s not necessarily a right or wrong answer in choosing a visual type: it often takes looking at the same data several different ways to find which view is going to create that magical “lightbulb” moment. Let's look at a few different variations of this visual.  

First let’s keep the existing line chart, remove some of the clutter and focus attention on the November to December change in Employees.

Picture5.png
 

This view gives the audience the full context of the 12 month trend, while focusing attention strategically on a specific point. However, if the emphasis is really about the November to December change, we could also visualize only those two data points. Let’s look at a few different ways of displaying this.

First, this horizontal bar chart compares this month (December) to last month (November). Horizontal bar charts are useful when your category names are long and therefore can be displayed horizontally from left to right on the y-axis without having to rotate or shorten them.

Picture6.png
 

Another option is a vertical bar chart, if you’re more inclined to preserve the left-to-right construct of displaying time.

 

As a third option, we could use a slopegraph. Slopegraphs can work well in making change visually apparent across categories. Check out how clear it is that some of these categories changed more drastically than others. In fact, looking at the data this way, we see that there was also a marked increase in service-related complaints, something that didn't stand out as much in the other views of the data. You can read more about slopegraphs, including design considerations, in this previous post.

Picture8.png
 

Any of these three visuals could work for depicting this data. I chose the slopegraph for the final version to keep the emphasis on the change in the two data points.
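A slopegraph needs only two values per category, so the underlying calculation is simply the change between them. The sketch below ranks categories by the size of that change, which is one way to decide which line deserves emphasis. Only the Employees figures (87 and 117) come from the commentary. The other rows and the function name are hypothetical.

```python
def slope_changes(start, end):
    """Return (category, start, end, change) tuples, largest absolute change first."""
    rows = [(name, start[name], end[name], end[name] - start[name]) for name in start]
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


# Employees (87 -> 117) is from the commentary; the other rows are hypothetical.
november = {"Employees": 87, "Service": 60, "Billing": 45}
december = {"Employees": 117, "Service": 78, "Billing": 44}

for name, nov, dec, change in slope_changes(november, december):
    print(f"{name:<10} {nov:>4} -> {dec:<4} ({change:+d})")
```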

Here's what it could look like if all of this needed to be on a single slide:

Picture10.png

In the remade version, I’ve moved the text to be closer to the data it describes and used color strategically to create a visual link between the text and where to look in the graphs for evidence. I’ve also made the call to action more visible—remember when communicating with data for explanatory purposes, we should always want our audience to do something with the data we’re showing them!

Check out the difference between the original and the remade version:

Picture9.png
Picture10.png

This single view works well as a remake of the original, but not as well in a live presentation. There’s still too much text to read and process, while listening to a presenter at the same time. For a live setting we can still use the same visuals, but build piece by piece (using animation), which forces the audience to listen to the presenter describing the data. For example, consider the Complaints over time visual again:

Picture11.png
 

Now imagine if each of these images were its own slide. Sparse slides lead to better presentations because a person is there to narrate what’s happening.

Picture12.png
 

One final note on the choice of red as the emphasis color. Some readers may be surprised to see something different from our usual blue & orange as emphasis colors (and readers who are Michigan fans are probably having heart palpitations!). In this case, red was the client’s brand color so we chose to stay consistent with the rest of their visuals. If that weren’t the case, we might avoid red because it could carry a negative connotation, even though this is a somewhat positive story (complaints declining over time).

In conclusion, we can indeed incorporate visual cues such as strategic use of color and words into a monthly recurring presentation so that our audience clearly knows 1) what’s important and 2) what action to take. You can download the Excel file with accompanying visuals here.


Elizabeth Ricks is a Data Visualization Designer on the Storytelling with Data team. She has a passion for helping her audience understand the "so-what?" Connect with Elizabeth on LinkedIn or Twitter.

area graph to highlight a line

I don't use a lot of area graphs. But I found myself pausing on one that was submitted as part of the recent annotated line graph #SWDchallenge. It was created by Mike M. and the interesting thing to me was that the focus of this particular area graph wasn't on the area so much, but rather on the line that separated the areas.

This apparently stuck with me, because I found myself recommending a similar approach in a recent client makeover. 

The original graph looked something like the following (data has been modified to protect confidentiality):

Area to highlight line_1.png
 

This is collections data from a bank. In case you aren't familiar with how collections work, typically an automated dialer makes calls to overdue accounts. The grey bars above represent total dials made. When someone answers the phone on the other end, the dialer connects them to a collections agent, who talks to the person who hasn't paid their bill and tries to get them to make a payment. The accounts where a person is reached (a collections agent talks to someone) are considered to be "worked," which is what the teal bars above represent. The penetration ratio, depicted by the black line, is...hmm. What is a penetration ratio exactly? This one threw me. I'm familiar with penetration rate, which would be the proportion of accounts that were worked out of the total dialed. So in other words, if penetration rate is 33%, we worked a third of the accounts. The ratio seems less straightforward. I think to describe it, it would be something like "if the penetration ratio is 3, it means we dialed 3x more accounts than we talked to." This seems unnecessarily complicated. Let's see if we can make some changes to how we show this data to make it more straightforward. Oh, and let's use that cool idea that I picked up from Mike M, too.
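To make the rate-versus-ratio distinction concrete, here is a small sketch. It assumes my reading of the ratio above (dials divided by accounts worked), which may not be exactly how the client defined it. The counts and function names are illustrative only.

```python
def penetration_rate(worked, dialed):
    """Share of dialed accounts where an agent actually spoke to someone."""
    return worked / dialed


def penetration_ratio(worked, dialed):
    """Dials made per account worked (assumed definition; the inverse of the rate)."""
    return dialed / worked


# Illustrative counts only: 100 accounts worked out of 300 dialed.
rate = penetration_rate(100, 300)
ratio = penetration_ratio(100, 300)
print(f"rate = {rate:.0%}, ratio = {ratio:.0f}")
```

Under that assumption the two measures carry the same information, but the rate reads directly as "we worked a third of the accounts," which is why I prefer it.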

First, I'm going to remove the secondary y-axis on the right side of the graph and the data (Penetration Ratio) that goes with it. That gets us a simple two-series bar chart:

Area to highlight line_2.png
 

In the above, we see accounts worked (teal) and total dials made (grey). Dials made is the sum of accounts that were worked and those that weren't reached. So I'm going to change this data slightly—from dials made in grey to those not reached—and stack the bars on top of each other.

Area to highlight line_3.png
 

We can get the same information out of the view above as the previous one: we can see total dials made (overall height of bars) and within that, the portion that were worked and the portion that were not reached. Notice that because the worked series is on the bottom of the stack, we can easily see how it has varied over time. Total dials made have decreased over time, and so has the number of accounts we've worked. But are we working a lower proportion of total dials now than we have historically? It's hard to tell here. Let's shift to 100% view to answer that question:

Area to highlight line_4.png
 

With the 100% stacked bar, we lose the context that overall call volume (total dials made) has decreased over time. But that's ok, because we know it now, so we can state it in words: "Call volume decreased 47% over the course of the year." With the 100% view, we can see that the proportion of accounts that we are working has decreased recently. So in spite of reduced call volume, we are reaching a lower proportion of accounts. Interesting. Perhaps we can make that a little easier to see?
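The data transformations in the last few steps (deriving "not reached" from dials made, then converting to a 100% view) can be sketched as follows. The counts are hypothetical, not the bank's data. I picked a first and last month so that the volume change matches the 47% decrease quoted above, and the function name is my own.

```python
def to_percent_stack(worked, dialed):
    """Convert monthly counts into the two shares shown in a 100% stacked chart."""
    shares = []
    for w, d in zip(worked, dialed):
        not_reached = d - w  # dials that did not reach anyone
        shares.append((w * 100 / d, not_reached * 100 / d))
    return shares


# Hypothetical counts for the first and last month of the year (not the bank's data).
dialed = [1000, 530]
worked = [400, 159]

shares = to_percent_stack(worked, dialed)
volume_change = (dialed[-1] - dialed[0]) / dialed[0] * 100

print(shares)
print(f"Call volume changed {volume_change:.0f}% over the period")
```

Because the 100% view discards the totals, it is worth computing the volume change separately so it can be stated in words alongside the graph.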

Let's remove the space between the bars and turn this into an area graph:

Area to highlight line_5.png
 

Bingo! With this view, we can see the proportion of accounts that were worked out of the total dialed. The white line separating the teal from the grey now represents the penetration rate. We can make this clear by adding some text and calling out the most recent data point:

Area to highlight line_6.png
 

I might add a headline that says something like, "Despite decreasing call volume, penetration rate hit a 12-month low in December." And like that, we've used an area graph to highlight a line.

What do you think? Do you like this approach? What might you do differently? Where else could an approach like this work? Leave a comment with your thoughts!

You can download the Excel file with the above visuals.

our tools don't know the story

A question that frequently arises in our workshops is “What tools do I need to tell stories like you do?” Many are surprised to hear the answer: we’re tool-agnostic. Rather, the concepts we teach are universal. No matter if you’re using Excel, Tableau, PowerBI, R, SAS, or something else, the tools themselves don't know your data, your organization, or your audience like you do. That’s where an analyst adds value by bringing the data & its underlying story contextually to life.

Today’s post was inspired by a real-world makeover of data originally created in a tool highly regarded for data visualization. The client was visualizing advertising data across multiple countries. Their initial visualization looked similar to the one below. (Note: I’ve anonymized the data to preserve confidentiality).  

scatterplot1.png
 

This chart shows two dimensions of advertising effectiveness: reach (how many users saw an ad) and engagement (how many users clicked on the ad) across several countries (United States, Germany, Great Britain, China, and Brazil). The higher these numbers, the better.

Upon further exploration, we see that the magnitudes of reach and engagement are very different across countries. In China, 52% of users were reached compared to 68% in Brazil. With engagement, the magnitude of the difference is even more pronounced: China’s engagement is 6%, half of Brazil’s 12%.  

Imagine yourself as a decision maker tasked with determining an action plan from these results. If the analyst presented you with the visual above, what conclusions might you draw?  An informal poll of readers might return multiple answers, which demonstrates the danger of letting our tools "tell the story" for us.

Don’t assume two different people looking at the same graph will come to the same conclusion. Add value by highlighting key takeaways for your audience.

An important distinction made in the book, storytelling with data, is the difference between exploratory and explanatory analysis. Exploratory analysis is what we do to find interesting things in our data. For example, the analyst might have asked many questions during the exploratory phase, including (but not limited to):

 

1. How have these metrics changed over time?
2. Are there geographical differences when drilling down by country?
3. What is the revenue impact of this data?
4. Are there noticeable patterns in users’ behavior that can be used for predicting next quarter’s results?

After exploratory analysis, we move to explanatory analysis. Explanatory analysis is where we take the interesting thing we found via exploratory analysis and communicate it to our intended audience. Doing so often requires creating a different visual or using a different tool than we used in the exploratory phase.

Let’s assume that what’s relevant in this data is the varying levels of reach & engagement and therefore, each country needs its own strategy for next year. If that’s the interesting conclusion, how might the analyst communicate this? One option is to use the initial design and visual cues like color and annotations to focus attention appropriately:

scatterplot2.png
 

In this version, I’ve preserved the horizontal bars, sorted by reach in descending order, and decluttered by removing the border and grid lines. While this is a step in the right direction, it still takes a lot of work to read all this text and mentally process the different takeaways:

 

1. Low engagement/high reach
2. High engagement/high reach
3. High engagement/low reach
4. Low engagement/low reach

Perhaps a different visual would make this more visually apparent. Since these takeaways fall into four quadrants, a scatterplot is another alternative:

scatterplot3.png
 

We now have a visual with a well-labeled construct on how to interpret the data. The categories on the axes (Many/Few, Low/High) help the audience understand the range of values and where each country falls on that range. For further reading on the importance of categorization, check out this post.
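The categorization behind the four quadrants can be sketched in a few lines. The China and Brazil values are the ones quoted earlier in this post. The cutoffs that split "low" from "high" are assumptions for illustration (not the client's thresholds), as is the function name.

```python
def quadrant(reach, engagement, reach_cut, engagement_cut):
    """Label a point by which side of each cutoff it falls on."""
    reach_label = "high reach" if reach >= reach_cut else "low reach"
    engagement_label = (
        "high engagement" if engagement >= engagement_cut else "low engagement"
    )
    return f"{engagement_label}/{reach_label}"


# (reach %, engagement %) as quoted earlier in the post.
countries = {"China": (52, 6), "Brazil": (68, 12)}

# Assumed cutoffs for illustration only.
REACH_CUT, ENGAGEMENT_CUT = 60, 9

for name, (reach, engagement) in countries.items():
    print(name, "->", quadrant(reach, engagement, REACH_CUT, ENGAGEMENT_CUT))
```

Where the cutoffs sit is a judgment call that depends on the business context, so it is worth agreeing on them with the audience rather than letting the tool pick a default.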

Finally, I’d add back the color & annotations, while being thoughtful about how the audience will intake the information. For example, in a setting where only the Brazil team is present, I might focus attention only on their data:

scatterplot4.png
 

Or the European countries, where the results are mixed:

scatterplot5.png
 

Both views would be important considerations in knowing the audience for our explanatory analysis.  

Scatterplots are often used with scientific data, but in this case work well for visualizing categories of differing takeaways. This works because of the additions of text and categorization, which helps the audience process the information. Remember, never make your audience do more work than necessary to understand a graph!

If all the takeaways need to be on one view, I can still leverage the scatterplot while being strategic about the use of color to focus attention appropriately.  

scatterplot6.png
 

In conclusion, there’s a huge difference between simply showing data from the exploratory phase vs. using data to tell a story in explanatory analysis. Check out the difference between the remade view above vs. where we started:

scatterplot7.png
 

We have a wide array of great tools at our disposal for visualizing data, but our tools will never know our data’s story like we do. We can add value to our roles and our organizations by bringing the story to life.

If interested, you can download the Excel file with the above graphs.

Want more on story? Check out Episode 2 of the SWD podcast, where Cole discusses her thoughts on, "What is story?" She makes a distinction between story with a lower case 's' (the takeaway, or the so what—the way "story" was used in this post) and Story with a capital 'S,' which has a shape (plot, twists, ending—a narrative arc). Also stay tuned for the next post here, where Cole will recap and share the 75+ annotated line graphs received in response to the latest #SWDchallenge.

Update from Cole: We have a couple of additional views to share based on reader comments. First, the following view is similar to the final visual above, only with text moved out of the graph itself to the side. 

Scatterplot - words on side.png
 

This next graph was created by Daniel Zvinca, which follows what he calls his "obsessive concern for a flexible design."

Scatterplot alternative_DanZvinca.png
 

He notes that he preserved color for potential additional enhancement and outlines the following benefits of this view:

  1. More metrics can be added or just one can be used (works fine for 1, 2, ...5 metrics).
  2. More countries can be added. When number is higher, gridlines every 5 countries or so would help localize the associated values.
  3. Any metric is clearly encoded/decoded and can be used for sorting.
  4. Comments do not require special care, they never overlap (unless they are too long).
  5. They can be defined for several performance levels (e.g. Likert scale intervals). For purpose of this design, bad=dark background, good=light background.

Nice idea, Dan, and thanks for sharing! Thanks also to everyone who has commented and contributed to the discussion, both here and on other posts.


Elizabeth Hardman Ricks is a Data Visualization Designer on the storytelling with data team. She has a passion for helping her audience understand the ’so-what?’ Connect with Elizabeth on LinkedIn or Twitter.  

how we position and what we compare

When visualizing data, one piece of advice I often give is to consider what you want your audience to be able to compare, and align those things to a common baseline and put them as close together as possible. This makes the comparison easy. If we step back and consider this more generally, the way we organize our data has implications on what our audience can more (or less) easily do with the data and what they are able to easily (or not so easily) compare.

I was working with a client recently when this came into play. The task was to visualize funnel data for a number of cohorts. For each cohort, there were a number of funnel stages, or “gates,” where accounts could fall out: targeted, engaged, pitched, and adopted. Each of these stages represents some portion of those accounts that made it through the previous stage. In this case, the client wanted to compare all of this across a handful of cohorts and regions. Here is an anonymized version of the original graph:

 
Cohort Analysis 1.png
 

There are some things I like about this visual. Everything is titled and labeled. So, while it takes a bit of time to orient and figure out what I’m looking at, the words are all there so that I can eventually figure this out, helping to make the data accessible. But when I step back and think about what I can easily do with the current arrangement of the data, there are a number of limitations. Let’s consider the relative levels of work it takes to make various comparisons within this set of graphs.

The easiest comparison for me to make is looking at a given region within a given cohort and focusing on the relative stages of the funnel. For example, if we start at the top left, I can easily compare for the Q1 Cohort in North America the purple vs. blue vs. orange vs. green bar. This is because they are both (1) aligned to a common baseline and (2) close in proximity (directly next to each other).

The next most straightforward comparison I can make is for a given stage in the funnel, I can compare across the various regions for a given cohort. So again, starting at the top left, I can compare within the Q1 Cohort the first purple bar (Targeted in North America) scanning right to the next purple bar (Targeted in EMEA), and so on. They are still aligned to a common baseline, but in this case they aren’t right next to each other (I’m inclined to take my index finger and trace along to help with this comparison). This is a little harder than the first comparison described above, but still possible.

The next comparison I can make—and this one is quite a bit more difficult—is a step in the funnel for a given region across cohorts. Again, starting at the top left, I can take that initial purple bar (Targeted in North America) and now scan downwards to compare to that same point for the Q2 cohort and the Q3 cohort. This is harder, because these bars are not aligned to a common baseline and they are also not next to each other. I can see that the bottom leftmost purple bar is bigger than the ones above it. But if I need to have a sense of how much bigger, that’s hard for me to wrap my head around. The numbers are there via the y-axis to make it possible, but it means I'm having to remember numbers and perhaps do a bit of math as I scan across the bars, which is simply more work.

And if we step back and think about it… comparisons across cohorts… this is actually potentially one of the most important comparisons that we’d like to be able to make! Visualizing and arranging our data differently could make this easier.

Perhaps it’s just me (and this really could be the case), but when I think of cohort analysis, it actually reminds me of my days in banking (a former life) and decay curves, and when I think of “curves,” it makes me think of lines, which makes me want to draw some lines over these bars… Actually, let’s try that. Here’s what it looks like if I draw lines over the bars in the first graph (Q1 cohort):

 
Cohort Analysis 2_short.png
 

While I’m at it, I might as well draw lines across the other graphs, too:

 
Cohort Analysis 3.png
 

And now that we have the lines, we don’t need the bars…

 
Cohort Analysis 4.png
 

The bars would have likely been too much to put into a single graph. But now that I’ve replaced what was previously four bars with a single line—thus remaking my original 16 bars in each graph into 4 lines, or if we multiply that across the three graphs, I’ve turned 48 bars into 12 lines—those, I can potentially all put into a single graph. It would look like this:

 
Cohort Analysis 5.png
 

While it’s nice to have everything in a single graph, those lines on their own don’t make much sense. Next, I’ll add the requisite details: axis labels and titles so we know what we’re looking at.

 
Cohort Analysis 6.png
 

Note that I didn’t have space to write out “Targeted,” “Engaged,” “Pitched,” and “Adopted” for every single data point. Instead, I chose to use just the first letter of each of these along the x-axis, and then I have a legend of sorts below the region that lists out what each of these letters means. This may not be a perfect solution, but every decision when we visualize data involves tradeoffs, and I’ve decided I’m ok with the tradeoffs here.

You’ll perhaps notice here that I haven’t labeled the various cohorts yet. With this view, I could focus on one at a time (calling out either via text or my spoken narrative if talking through this live to make it clear what we are focusing on). For example, maybe first I want to set the stage and focus on the Q1 cohort and how it looked across the various funnel stages and regions:

 
Cohort Analysis 7.png
 

I could then do the same for the Q2 cohort (lower across everywhere: Is this expected? What drove this? My voiceover could lend commentary to raise or answer these questions):

 
Cohort Analysis 8.png
 

Then finally, I could do the same for the Q3 cohort (ah, now our metrics have recovered from their lows in the Q2 cohort and are now even higher than Q1, did we do something specific to achieve this? Looks like we targeted a higher proportion of the overall cohort, and it’s interesting to see how that impacted the downstream funnel stages):

 
Cohort Analysis 9.png
 

Note with this view, I could also focus on a given region at a time. For example, it might be interesting to note that these metrics are lower across all cohorts in North America compared to the other regions:

 
Cohort Analysis 10.png
 

Or the spread in APAC across cohorts might be noteworthy, as it’s the largest variance across cohorts compared to the other regions:

 
Cohort Analysis 11.png
 

This piece-by-piece emphasis could work well in a live presentation. But in the case where this is for a report or presentation that will be sent out where we’d likely have a single version of the graph (vs. the multiple iterations that can work well in a live setting so you can focus your audience on what you’re talking about as you discuss the various details), I’d venture to guess that the most recent cohort (Q3) is perhaps the most relevant, so let’s bring our focus back to that:

 
Cohort Analysis 12.png
 

Within the Q3 cohort, we may consider emphasizing one or a couple of data points. Data markers and labels are one way to draw attention and signal importance. If I put them everywhere, we’ll quickly end up with a cluttered mess. But if I’m strategic about which I show, I can help guide my audience towards specific comparisons within the data. For example, if the ultimate success metric is what proportion of accounts have adopted whatever it is we’re tracking (I’ve anonymized that detail away here), I might emphasize just those data points for the most recent cohort:

 
Cohort Analysis 13.png
 

Given the spatial separation between regions, I don’t necessarily have to introduce color here. But if I want to include some text to lend additional context about what’s going on in each region and what’s driving it, I could introduce color into the graph and then use that same color schematic for my annotations, tying those together visually:

 
Cohort Analysis 14.png
 

Let’s take a quick look at the before-and-after:

Cohort Analysis 15.png

Any time you create a visual, take a step back and think about what you want to allow your audience to do with the data. What should they be able to most easily compare? The design choices you make—how you visualize and arrange the data—can make those comparisons easy or difficult. Aim to make it easy.

The Excel file with the above visuals can be downloaded here. I should perhaps mention a hack I used to achieve this overall layout: each cohort is a single line graph in Excel, where I’ve formatted it so there is no connecting line between the Adopted point for one region and the Targeted point in the following region. (It may be brute force, but it works!)
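For those working in a scripting tool rather than Excel, a comparable layout can usually be achieved with a different technique: instead of formatting away the connecting segment, insert an empty point between regions so the line breaks there. The sketch below only builds the data series. The values and the function name are hypothetical.

```python
def series_with_gaps(values_by_region, gap=None):
    """Flatten per-region funnel values into one series, with a gap between regions."""
    flat = []
    for i, values in enumerate(values_by_region):
        if i > 0:
            flat.append(gap)  # empty point so no line connects two regions
        flat.extend(values)
    return flat


# Hypothetical funnel percentages (Targeted, Engaged, Pitched, Adopted) per region.
q3_cohort = [
    [80, 55, 30, 12],  # North America
    [85, 60, 35, 15],  # EMEA
]

print(series_with_gaps(q3_cohort))
```

For plotting libraries that break lines at missing values (matplotlib does this for NaN), passing `gap=float("nan")` should give the same visual separation between regions.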

a tale about opportunity

One statement that I make often and emphasize repeatedly in my work is that when it comes to explanatory analysis, we should never simply show data; rather, we should make data a pivotal point in an overarching narrative or story. Today, I’ll take you through an example that illustrates this transition from showing data to using data to answer a question in a way that leads to new insight.

Let’s assume you work for the pharmaceutical company, Gleam. At Gleam, you focus on Product X (common abbreviation: PX), a medication for Aglebazoba (this is a real example, but I’ve anonymized it and had some fun with the names to preserve confidentiality—these names sound like a foreign language because that’s how pharmaceutical naming sounds to me!). You’ve been tasked with providing an update on Product X’s penetration in the marketplace.

After considering this for a bit and discussing with some colleagues, you decide there are two important things to consider. First, the disease doesn’t affect everyone equally. Rather, diagnoses tend to be classified by severity into Mild, Moderate, and Severe. So you decide that categorizing the data in this way will make sense. Second, when thinking about how to measure penetration, you decide that the population of those diagnosed with the disease is the most straightforward way to quantify the potential market currently. Given these considerations and the data you have on hand, you create the following visual.

Opportunity1.png
 

This graph looks pretty good. The design is clean, everything is titled and labeled. Severity increases as we move up the graph, which makes sense. N counts were included to tell me how many people each bar represents. Color has been used sparingly to focus the audience's attention, with words at the top to tell them why they should focus there. Let's consider the takeaway highlighted here: a greater proportion of Moderate patients are taking PX compared to the total diagnosed with Moderate severity Aglebazoba. That's interesting. But does it answer the question we set out to?

In the above, we're graphing the % of total across two categories: (1) total patients diagnosed and (2) total patients taking Product X. But what if rather than severity as a % of total, we make severity the primary category and within that look at those taking the drug out of total diagnosed? I'll do this in the following step, and will also switch from graphing percents to graphing the absolute numbers (we'll incorporate the percents back in momentarily). 

Opportunity2.png
 

In the above view, the overall length of the bars represents the total number of patients diagnosed with Aglebazoba. The blue portion represents those taking Product X. If percents are important, we could add labels on the blue bars. I'll do that in the next view. Note now that this isn't % of total taking Product X, but rather the % taking Product X out of the total diagnosed with the given level of severity.

Opportunity3.png
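
If you'd like to check the difference between the two aggregations outside of Excel, here is a minimal Python sketch. The patient counts are hypothetical: I made them up so that the within-severity rates come out to the 35%, 61%, and 23% shown in the graph, and they are not the numbers behind the visuals. The function names are mine as well.

```python
# Hypothetical patient counts (NOT the data behind the graphs above),
# chosen only so the within-severity rates land on 35% / 61% / 23%.
diagnosed = {"Severe": 200, "Moderate": 300, "Mild": 500}
taking_px = {"Severe": 70, "Moderate": 183, "Mild": 115}


def share_of_total(counts):
    """First view: each severity level as a % of its own group's total."""
    total = sum(counts.values())
    return {level: round(100 * n / total) for level, n in counts.items()}


def penetration(taking, diagnosed):
    """Reframed view: % taking the drug out of those diagnosed at each severity."""
    return {level: round(100 * taking[level] / diagnosed[level]) for level in diagnosed}


def opportunity(taking, diagnosed):
    """Diagnosed patients at each severity who are not currently taking the drug."""
    return {level: diagnosed[level] - taking[level] for level in diagnosed}


print(share_of_total(diagnosed))          # severity mix among all diagnosed
print(share_of_total(taking_px))          # severity mix among those taking PX
print(penetration(taking_px, diagnosed))  # % taking PX within each severity
print(opportunity(taking_px, diagnosed))  # headcount not yet taking PX
```

The first two calls reproduce the kind of comparison in the original graph (severity mix of each group), while the last two answer the penetration question directly. The data are the same; only the denominator changes.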
 

So 35% of those diagnosed as Severe are taking PX, 61% of those with Moderate severity are taking PX, and 23% of those with Mild severity take the drug. Note that we can see the same thing here that was highlighted in the original graph: a higher proportion of those with Moderate severity are taking Product X compared to the other severity levels. But with this view, I can also see something new: opportunity. The blue portions of the bar represent those currently taking PX. Which means the grey portions of the bar represent those who aren't currently taking Product X... but potentially could be. Let's show this as empty space to be filled in:

Opportunity4.png
 

Now I can see the opportunity. But let's emphasize that even more, via darker, thicker lines:

Opportunity5.png
 

When I look at the above, the labels in the blue portion of the bars seem to be competing for attention with the opportunity in white. That's an easy fix: let's label the white portion instead.

Opportunity6.png
 

I recognize I may be bothering some people when I graph absolute numbers and label with percents. If you fall within that camp, we could address that by taking the percents out of the graph...

Opportunity7.png
 

...but then tie the percents back in when we put all of the words around the visual to help make sure it makes sense to my audience and that they focus on the takeaway that I want them to. I see this as a tale about opportunity. Let's use words to make that point clear to my audience:

Opportunity8.png
 

After you've created a graph in response to a question, consider that question again. Too often, I find that we stick with the first way we aggregate the data and first view of it that we land on. It's easy to provide data that is relevant to a question without actually answering the question. If we step back and think about what sort of tale we can use the data to tell—is it a success, a failure, a call to action, or, as we've seen here, a tale about opportunity—it may reveal new ways to aggregate or visualize the data that will help you help your audience understand something new.

If interested, you can download the Excel file with the above visuals.

10/31 update: A couple of people have commented that the tendency is to want to tie the blue percents in the text to the blue portions of the bars in the final iteration above, which is confusing. This is a great point (that's the Gestalt Principle of similarity, by the way, that makes us want to connect similar elements, like things that are colored the same). I've made an update to outline the opportunity in black and use black for those percents instead. This visually distinguishes the blue (people taking PX) from the black (opportunity: those who aren't but could be taking PX) and ties the black portion to the percents in the text through similar use of color. See below for the updated version. I think this resolves that prior confusion. Let me know what you think!

Related thought: this is a great example of why it can be useful to seek input from others on our visual designs. When we get familiar with our data, we know intuitively how we want others to look at it, but this isn't necessarily how they will. Soliciting a fresh perspective is a great way to see our data through someone else's eyes and learn from this how to potentially further improve or refine our approach. Thanks for the feedback!

Opportunity9.png
 

novel vs. the boring old bar chart

Often, to kick off a workshop we’ll do a quick round of introductions, where I ask participants to tell me something they are hoping to learn over the course of the day. It is not uncommon for someone to respond with something like, “I want to learn some new exciting ways to show data so I’m not just using boring old bar charts all the time.” I jot down a note, silently challenging myself to convince them otherwise over the course of the following hours.

This happened just last week, where a participant voiced a wish to learn novel ways to show data. I can understand this desire. But it’s not the graph that makes the data interesting. Rather, it's the story you build around it—the way you make it something your audience cares about, something that resonates with them—that’s what makes data interesting.

We circled back to this novel-ways-of-showing-data idea later in the workshop when looking at some of the team’s specific examples. I want to share with you the makeovers and discussion; it was another important reminder to me that simple often beats sexy. A “boring old bar chart” can get the job done—and even end up being people's preference.

The team was looking at some market research data, wanting to compare their company to their main competitor across a few dimensions. They originally visualized this with a connected scatterplot. It didn’t work well. People found they were having conversations about how to correctly read the graph for too long before they were getting to the point where they could really look at the data and see what they might say with it. I won’t go through the work of fully recreating it here, but to give you a sense, I’ll do a quick sketch (all data has been changed and scenario generalized to preserve confidentiality):

Connected scatter 1.png
 

There were two main points to make with this data:

1.    The company was doing better than the competitor across all areas except ATTRIBUTE 1.

2.    The company was beating their target across all areas except ATTRIBUTE 5.

This seems pretty straightforward, right? You can get to these takeaways through the above visual, but there are improvements we can make to it and other potential views that could also work. I originally thought we should look at two alternatives: (1) a dot plot and (2) a slopegraph. It was actually Elizabeth on my team who added a third option—the “boring old bar chart”—into the mix (I'm glad she did!). Let’s take a look at each of these.

First, the dot plot. This was mainly an attempt to improve their original visual with a similar view and some slight modifications. I don't use these super often, but have found there are some good use cases. I thought this would be an appropriate scenario for it. The base visual looked like this:

Sexy vs boring 1.png
 

Rather than connecting the dots downward, as in the original connected scatterplot, which makes the primary comparison how OUR COMPANY is doing across the various attributes, the horizontal lines here draw the eyes from left to right. This makes the primary comparison OUR COMPANY vs. the COMPETITOR, which seemed to be the main point here. 

Next, let's apply the brand colors:

Sexy vs boring 2.png
 

Now that I've incorporated color, I can vary intensity to emphasize certain points. For example, we could first draw attention to ATTRIBUTE 1, where OUR COMPANY scores lower than the COMPETITOR:

Sexy vs boring 3.png
 

I could also add in a marker and text designating the target and use the same strategy to draw attention to ATTRIBUTE 5, where we score below TARGET:

Sexy vs boring 4.png
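
For anyone building these views in code rather than Excel, the emphasis strategy boils down to a simple rule: one color for the data points you want noticed, a muted grey for everything else. Here is a rough Python sketch of that rule. The scores, the single target value, the hex colors, and the function name are all assumptions I made up for illustration (the scores are picked only to match the two takeaways listed earlier), so treat it as a starting point rather than the way these graphs were built.

```python
# Hypothetical scores (not the workshop data), picked so that OUR COMPANY
# trails the COMPETITOR only on ATTRIBUTE 1 and misses TARGET only on ATTRIBUTE 5.
scores = {
    "ATTRIBUTE 1": {"ours": 72, "competitor": 79},
    "ATTRIBUTE 2": {"ours": 80, "competitor": 70},
    "ATTRIBUTE 3": {"ours": 85, "competitor": 74},
    "ATTRIBUTE 4": {"ours": 78, "competitor": 65},
    "ATTRIBUTE 5": {"ours": 60, "competitor": 52},
}
TARGET = 65           # assumed: one target shared by all attributes
EMPHASIS = "#1f77b4"  # assumed brand color for the points to highlight
MUTED = "#bfbfbf"     # light grey pushes everything else to the background


def emphasis_colors(scores, is_focus):
    """Return one color per attribute: EMPHASIS where is_focus(values) is true, MUTED otherwise."""
    return {
        name: EMPHASIS if is_focus(values) else MUTED
        for name, values in scores.items()
    }


# Step 1 of the story: where do we score lower than the competitor?
below_competitor = emphasis_colors(scores, lambda v: v["ours"] < v["competitor"])

# Step 2 of the story: where do we fall short of the target?
below_target = emphasis_colors(scores, lambda v: v["ours"] < TARGET)

print(below_competitor)
print(below_target)
```

The resulting color mappings can be handed to whatever charting tool you use. The same rule applies whether the marks are dots, lines, or bars.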
 

I actually thought the dot plot worked well. But I wanted to show some alternatives. Slopegraphs can sometimes be a good way to visualize group comparisons, like we have here. By putting the COMPETITOR at the left and OUR COMPANY at the right, the relative slopes of the lines give a sense of how we're doing across the various attributes compared to the competition. Where the line slopes upwards, OUR COMPANY is outperforming the COMPETITOR and vice versa.

Sexy vs boring 5.png
 

I could emphasize just the ATTRIBUTE 1 line to draw attention to the one area where we score lower than the COMPETITOR. Note that with the slopegraph, since I already have the clear spatial separation between COMPETITOR at the left and OUR COMPANY at the right, I don't need to introduce color as a means of telling the groups apart (vs. in the dot plot, where OUR COMPANY was left of COMPETITOR for ATTRIBUTE 1, but right for all the others, so we need some other way to distinguish one from the other). Here, I can instead use color for emphasis, or simply keep everything grey and use intensity to draw attention to where I want my audience to look.

Sexy vs boring 6.png
 

Similar to the dot plot, I can also add a symbol and text showing where the TARGET is and drawing attention to ATTRIBUTE 5, where we fall below it.

Sexy vs boring 7.png
 

Dot plots and slopegraphs aren't anything crazy. But they aren't as popular or well-known as traditional lines, bars, and pies, which means they sometimes carry that novel appeal that those wanting something fresh desire.

Next, let's look at the data in a bar chart:

Sexy vs boring 8.png
 

As with the slopegraph, position distinguishes for us OUR COMPANY vs. the COMPETITOR (the former is always first, the latter second) so we don't necessarily have to introduce color here. Instead, we could use intensity (pushing some elements back by lowering intensity and drawing others forward via higher intensity) to focus our audience's attention. We might focus first on ATTRIBUTE 1:

Sexy vs boring 9.png
 

As in the other views, we can incorporate the TARGET into the visual and draw attention to ATTRIBUTE 5, where we missed it. Let's try a different view of the TARGET this time, using light background shading to illustrate the region where we are above target and emphasizing ATTRIBUTE 5, where we fall short:

Sexy vs boring 10.png
 

There's no "right" answer in terms of how one should display this data. Any of these visuals could work. The different views let us more or less easily see different things. Let's look at the base versions (without any emphasis) side by side:

Sexy vs boring all.png

I showed a similar side-by-side after we discussed each of the options during the workshop and asked people to vote which they liked best. Remember, this was the audience who said they were seeking novel approaches. Which did they choose? I'm sure you've guessed it—the "boring old bar chart."

To take the next step and put words around it so my audience knows why they are looking at this data and why they should care, we could do something like the following:

Sexy vs boring 11.png
 

Meta-lesson: novelty may not be the best goal. Bars don't have to be boring when you've used them to help make the data accessible and made it clear to your audience why they should care.

What do you think? Which view of the data do you like best? Why? Leave a comment with your thoughts!

If interested, you can also download the Excel file with the above graphs.