Earlier this week, Google launched an exciting new tool: the Books nGram Viewer, which visualizes how the occurrences of phrases in books have waxed and waned over the years. The viewer sits on top of a dataset of 500 billion words from 5.2 million books (a subset of the 15 million books Google has digitized worldwide) in Chinese, English, French, German, Russian, and Spanish. The dataset covers phrases of up to five words, with a count of how many times each phrase appears each year.
Play around with it, and you'll see firsthand how a clean and simple visual lets you make sense of a massive amount of data in seconds and start using that data to tell stories.
Say, for instance, that I want to understand the varying popularity of my personal favorite amusement park ride (the Ferris Wheel) in English literature over the years. For a point of comparison, I'll also plot my least favorite amusement park ride (the rollercoaster). Here is the visual:
Ferris Wheel vs. Rollercoaster popularity over time
We see both rides begin to appear in print in the 1930s. The Ferris Wheel has had several relative rises and falls in popularity since then, with (sadly) a continued decline since the mid-1990s. The popularity of rollercoasters, on the other hand, was initially slow to build, but overtook the Ferris Wheel around 1985 and has skyrocketed in comparison since that time. Based on this visual, my affection for Ferris Wheels puts me in a dwindling minority, while rollercoasters are rapidly gaining in popularity.
As a reminder of the importance of context, let's add another series. If you enjoy Ferris Wheels like I do, you may know that the first one was built for the World's Columbian Exposition in Chicago in 1893 and that it was intended to rival the Eiffel Tower, which had been built for the Paris Exposition four years earlier. Let's check out what happens if we plot mentions of the Eiffel Tower in English literature on our chart:
Eiffel Tower mentions dwarf Ferris Wheel and Rollercoaster throughout history
As I called out in the chart title, mentions of the Eiffel Tower dwarf our initial two series. Also note that, whereas we see mentions of the Eiffel Tower pop up immediately following its unveiling, the Ferris Wheel took a little longer to make its way from the World Expo to the written word.
Just starting to play around with this sparks more interesting questions: what led to the bumps in the Ferris Wheel's popularity? Which genre of novels mentions the Eiffel Tower most - romance? history? Google also lets us dig to our hearts' content by making the full datasets freely downloadable.
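If you do download the data, a few lines of Python are enough to start counting phrases by year. The sketch below is only an illustration. It assumes each line of a downloaded file is tab-separated, with the phrase, the year, and the match count as its first three columns; check the layout of the files you actually download. The function name `yearly_counts` and the sample rows are made up for this example and are not real Google data.

```python
import io
from collections import defaultdict


def yearly_counts(lines, phrases):
    """Sum match counts per year for each phrase, ignoring letter case.

    Assumes tab-separated rows whose first three columns are:
    phrase, year, match count (an assumption about the file layout).
    """
    wanted = {p.lower() for p in phrases}
    totals = {p: defaultdict(int) for p in wanted}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        phrase, year, count = fields[0].lower(), int(fields[1]), int(fields[2])
        if phrase in wanted:
            totals[phrase][year] += count
    # Convert to plain dicts so the result prints cleanly
    return {p: dict(by_year) for p, by_year in totals.items()}


# Invented sample rows for illustration only (not real counts)
sample = io.StringIO(
    "Ferris wheel\t1950\t12\t10\t8\n"
    "ferris wheel\t1950\t3\t3\t3\n"
    "Ferris wheel\t1990\t20\t18\t15\n"
    "roller coaster\t1990\t45\t40\t30\n"
    "Eiffel Tower\t1990\t300\t250\t200\n"
)

totals = yearly_counts(sample, ["Ferris wheel", "roller coaster"])
for phrase, by_year in sorted(totals.items()):
    print(phrase, by_year)
```

Note that this gives raw counts only. The charts above show each phrase relative to everything else published in a given year, so to compare across decades you would also need to divide by a per-year total, which this sketch leaves out.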
What stories might you use nGram to tell?