Monday, March 26, 2012

visualizing survey data

Having recently wrapped up our annual employee survey, we are in heavy survey-analyzing mode in my day job at the moment. This may sound strange, but I enjoy this immensely. I love working with masses of numbers and comments that can seem overwhelming and cumbersome at the outset and teasing a story out of them, with clear insights and areas to act upon and impact.

Since survey data is abundant in many organizations and good ways to show it are not always clear, I thought it might be useful to share a couple of genericized examples of how I've been visualizing some of this data.

The following examples are based on survey data collected on a Likert scale, with responses grouped into three categories (in decreasing order of agreement): favorable, neutral, and unfavorable. Note that these are two different examples - they are not based on the same source data.

Example 1: summarizing responses
The following visual shows the breakdown of survey responses by % favorable, % neutral, and % unfavorable. In this example, related survey items are grouped into categories. The theme score represents the average across the various items in a given category. The story I ended up wanting to tell here was focused on % unfavorable, so I've organized the items within each category in order of descending % unfavorable.
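For anyone who wants to reproduce the underlying numbers outside of a spreadsheet, here is a minimal Python sketch of the calculation described above: average the items in each category to get the theme score, then order the items by descending % unfavorable. The category names, item names, and percentages are made up for illustration (they are not the survey's data), and `summarize` is a hypothetical helper name.

```python
from statistics import mean

# Hypothetical item-level results (percent of respondents); not the post's data.
items = [
    {"category": "Category A", "item": "Item 1", "fav": 60, "neu": 25, "unf": 15},
    {"category": "Category A", "item": "Item 2", "fav": 45, "neu": 25, "unf": 30},
    {"category": "Category A", "item": "Item 3", "fav": 75, "neu": 15, "unf": 10},
    {"category": "Category B", "item": "Item 4", "fav": 50, "neu": 30, "unf": 20},
    {"category": "Category B", "item": "Item 5", "fav": 70, "neu": 25, "unf": 5},
]

def summarize(items):
    """Group items by category, average them into a theme score, and
    order each category's items by descending % unfavorable."""
    by_category = {}
    for row in items:
        by_category.setdefault(row["category"], []).append(row)

    summary = {}
    for category, rows in by_category.items():
        # Theme score: simple average of the item scores in the category.
        theme = {k: mean(r[k] for r in rows) for k in ("fav", "neu", "unf")}
        # Items with the highest % unfavorable come first.
        ordered = sorted(rows, key=lambda r: r["unf"], reverse=True)
        summary[category] = {"theme": theme, "items": ordered}
    return summary

summary = summarize(items)
for category, info in summary.items():
    print(category, {k: round(v, 1) for k, v in info["theme"].items()})
    for row in info["items"]:
        print("   ", row["item"], row["fav"], row["neu"], row["unf"])
```

Sorting on a different key (for example `fav`) is a one-line change if the story you want to tell is about the favorable end of the scale instead.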



Example 2: comparing to peer groups
The following visual shows the theme % favorable for the group of interest (blue markers) against the range of % favorable scores across the same categories for a peer group (grey bars). As in the last example, the theme scores are averages across a group of related items, but this same approach could be used to show specific survey items as well.
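The numbers behind this kind of chart are simple to derive. The sketch below is one way to do it in Python, assuming the peer range is the minimum and maximum theme % favorable across the peer groups, excluding the group of interest. The group names, scores, and the `peer_comparison` helper are hypothetical, not taken from the actual survey.

```python
# Hypothetical theme % favorable scores; group names and values are made up.
scores = {
    "Category A": {"Our group": 72, "Peer 1": 65, "Peer 2": 80, "Peer 3": 58},
    "Category B": {"Our group": 49, "Peer 1": 55, "Peer 2": 61, "Peer 3": 70},
}

def peer_comparison(scores, focus):
    """For each category, return the focus group's score plus the
    min and max across all other (peer) groups."""
    result = {}
    for category, by_group in scores.items():
        # Assumption: the focus group is left out of its own peer range.
        peers = [v for g, v in by_group.items() if g != focus]
        low, high = min(peers), max(peers)
        result[category] = {
            "focus": by_group[focus],   # the marker
            "peer_min": low,            # bottom of the bar
            "peer_max": high,           # top of the bar
            "within_range": low <= by_group[focus] <= high,
        }
    return result

comparison = peer_comparison(scores, "Our group")
for category, row in comparison.items():
    print(category, row)
```

The `within_range` flag is just a convenience for spotting categories where the group of interest falls outside the peer range, which are often the ones worth calling out in the story.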


The visuals alone don't get you the whole way there. The context that you can bring as you analyze the data and pull in related information is where the story gets created. Data in a vacuum is difficult to interpret: it's the context that will help bring it to life and help your audience make sense of it. Some things to consider along these lines:
  • Do subgroups within the data you're summarizing feel the same? Are there any interesting outliers worth mentioning?
  • Are there any useful comparisons to other groups that could aid in the interpretation of the results? 
  • Is there qualitative data (for example, open text comments) that can be pulled in to help bring the data to life?
  • Have any specific actions been taken that are impacting the results? If so, describe them and the impact they have on what the data shows.
For example, here's a (highly genericized) version of the final "story" I formed around the peer group comparison visual above:


I tried to connect the story to the data visualization via the colored number markers, but am on the fence on whether I like this approach. There's no question that this leads to a pretty packed visual. Maybe too packed?

In case you're interested in taking a closer look, here is the Excel file (examples 1 and 2) and here is the PowerPoint file (story). Let me know what you think!

10 comments:

  1. Glad you did a post on survey data! I work with survey data almost exclusively and can say it's both exciting and frustrating from beginning to end.

    I agree with your overall approach, and like the visuals. I do struggle a bit with your second graph though. I like the addition of a reference point. But I'm having a hard time easily pulling the story out of the data.

    Also, would love to get your thoughts on building a story around different points in time. If I'm a manager of sub-group A, it's important to know how I rank against my peers, but equally important to know how my dept did compared to last year. Perhaps category 16 was even worse last year, and while my team is still toward the bottom relative to my peers, we've actually improved significantly. This would turn from a story of "this needs immediate attention" to "look how far we've come in a year!"

    The reason I bring it up is that I have been struggling with this idea of infusing current performance with relevant benchmarks while considering previous performance to tell the full story. Would love to get your ideas and insight.

    1. Hi Matt! Yes, great point on the over-time comparison. Survey data is tricky for that reason - there's the internal comparison (how is my group doing across the different items), the comparison to others (how is my group doing compared to peer groups or the overall organization) and the comparison over time (how is my group doing now vs. some time in the past). It's certainly challenging to bring it all together into a coherent visual and story. One way I've seen it addressed is to have columns to the right of the graph in example 1 that show the difference between the given group and peer groups or the given group at an earlier point in time. There are certainly other ways to address this as well, but this is one idea. In general, I'd aim to devote the bulk of the visual to whatever the most important comparison is that you want your audience to focus on. Thanks for raising!

  2. What about comparing vs. last year?

    1. Hi Khallel - thanks for your question! See my response to Matt, which has some ideas for over-time comparisons.

  3. I guess I am a little bit confused about the breakdown of category theme and items. Shouldn't the category theme be an aggregate of the scores for all the corresponding questions?

    1. Hi Tony - yes, that's exactly what it is. Sorry if I wasn't clear. The theme score is the average of the individual item scores. For example, Category 3 Theme score = 62 = average(48,75,63).

  4. It is always difficult to imagine a message when the categories are undefined. But I don't mind, as I really like these 2 types of visual. I can imagine using them for other types of enterprise data as well. Thank you

    1. Hi Lubos - Yes, I agree the story would be clearer with real categories instead of the "sanitized" version (and it is!) - but I had to do it that way to preserve the confidentiality of our internal survey. I'm glad you find the examples useful!

  5. Cole

    Thanks for this. Some comments: On example 1 you've effectively inverted the scale, putting the negative on the right. I can understand that this enables you to highlight the positive but it was initially confusing.

    Another way is to treat the responses as either positive or negative and show them around a central, neutral axis. I wrote on this a year ago (http://www.organizationview.com/net-stacked-distribution-a-better-way-to-visualize-likert-data). They're often called bilateral bar charts. Adding sorting, for example by the number of positive responses, can add clarity.

    For your second example an alternative to % positive is to use a net score - i.e. % positive minus % negative. The advantage of doing this is that it takes the negative scores into account. An example is useful:

    Say you have two Categories; the first is polarising with 50% in favour and 50% unfavourable. A second has 40% favourable, 40% neutral and 20% unfavourable.

    Which is 'better'?

    With your example the first Category would be rated higher, at 50%. In the 'net' example the second would be, with a rating of 20% (versus 0% for the first).

    The answer to my question depends on what you're looking for. I suggest that in most instances it is where there are strong feelings either way where attention is needed.

    Andrew

  6. Great post Cole - the theme summary is particularly nice! One thing I found slightly unintuitive is recognizing that the items fall under the themes in the first chart - the highlighting and larger font help, but at first glance it looks a bit like those are just notable items. I've found that increasing the row height helps identify these as categories more quickly, though this unfortunately makes it impossible to then use a single chart in Excel.
