Better Timeseries

Better Figures was very happy to be invited to the Scott Polar Research Institute, to give a talk on data visualisation to the UK Polar network of early career scientists.

The talk leant heavily on the advice from The Visual Display of Quantitative Information, by Edward Tufte.

One of the original parts of the talk, however, was a sequence of diagrams suggesting ways to make a timeseries easier to understand. The principles we use are set out briefly in the talk:

  1. Remove non-data ink (or pixels), where possible.
  2. Don’t make the reader work too hard.
  3. Design is choice: what is it you are trying to say?

The dataset is “ldeaths”: monthly deaths due to lung disease in the UK in the late 1970s. It ships with R, and is available by just typing “ldeaths” into the console. To see the other available datasets, just type data() into the console.
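As a quick sketch of what that looks like in an R session (the particular inspection calls below are our own choice of illustration, not something shown in the talk):

```r
# ldeaths, mdeaths and fdeaths ship with base R in the "datasets" package,
# so nothing needs to be installed or loaded first.
print(class(ldeaths))       # a "ts" (time series) object
print(start(ldeaths))       # first observation: year, month
print(end(ldeaths))         # last observation: year, month
print(frequency(ldeaths))   # observations per year (12 means monthly)

# data() lists the datasets in the currently attached packages.
# Assigning the result lets us inspect it as a table instead of a pager.
available <- data()
print(head(available$results[, c("Item", "Title")]))
```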

We start with the default output from R:

ldeaths1

Monthly deaths due to lung disease in the UK. The dotted line shows number of female deaths, the dashed line shows male deaths, with the total shown by the solid line.

We’ve added a caption to help identify the lines. The dataset “ldeaths” only gives the total, so we’ve added “mdeaths” (males) and “fdeaths” (females) to the plot. There is clearly a strong seasonal signal, but can we make the plot any clearer?
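A plot along these lines can be produced with something like the following. This is a minimal sketch rather than the exact code behind the figure: the line types are taken from the caption, and the y-axis label is our own assumption.

```r
# Line types follow the caption: solid = total, dashed = males,
# dotted = females.
line_types <- c(Total = "solid", Males = "dashed", Females = "dotted")

# ts.plot() draws several time series on one set of shared axes.
ts.plot(ldeaths, mdeaths, fdeaths,
        lty  = unname(line_types),
        ylab = "Deaths per month")   # assumed label, not from the post
```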

First, we’ll fix the aspect ratio. Tufte suggests that the average slope of a timeseries should be around 45 degrees, but there are clearly choices to be made here. For example, many timeseries can be fitted on a page for comparison if the aspect ratio (width to height) is higher.

ldeaths2
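One way to put a number on the 45 degree advice is to ask what height-to-width ratio would make the median absolute slope of the line equal to one. The sketch below does that; the helper name bank_to_45, the use of the median, and the 9 by 3 inch output size are all our own assumptions, not something taken from Tufte or from the talk.

```r
# Height/width ratio of the plotting region at which the median absolute
# slope of the series is 45 degrees. (Hypothetical helper.)
bank_to_45 <- function(x) {
  tm <- as.numeric(time(x))
  y  <- as.numeric(x)
  slopes <- abs(diff(y) / diff(tm))   # slopes in data units
  slopes <- slopes[slopes > 0]        # ignore perfectly flat segments
  (diff(range(y)) / diff(range(tm))) / median(slopes)
}

banked <- bank_to_45(ldeaths)
print(banked)   # a small value suggests a wide, short plot

# The strictly banked ratio can be too flat to leave room for margins,
# so here we simply choose a wide device by hand.
fig_file <- tempfile(fileext = ".pdf")
pdf(fig_file, width = 9, height = 3)
par(mar = c(3, 4, 1, 1))
ts.plot(ldeaths, mdeaths, fdeaths,
        lty = c("solid", "dashed", "dotted"), ylab = "Deaths per month")
legend("topright", legend = c("Total", "Males", "Females"),
       lty = c("solid", "dashed", "dotted"), bty = "n")
invisible(dev.off())
```

The computed ratio is best treated as a starting point: as noted above, there are choices to be made, and a value chosen for comparison across many plots may be just as defensible.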

We’ve also added a legend in the top right, although it is tricky to quickly identify which label belongs to which line. It might be clearer to label the lines directly:

ldeaths3

We’ve also rotated the y-axis labels, removed the distracting box, and tried to reduce the impact of the (unimportant) axis lines.
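Those changes can be sketched in base graphics roughly as follows. The margins, grey shades and axis label are our own assumptions, chosen only to illustrate the idea.

```r
# Bind the three series into one multivariate series with readable names.
series <- cbind(Total = ldeaths, Males = mdeaths, Females = fdeaths)

# las = 1 turns the y-axis tick labels upright; the wide right margin and
# xpd = NA leave room to write labels beyond the ends of the lines.
op <- par(las = 1, mar = c(3, 5, 1, 6), xpd = NA)

# axes = FALSE drops both the default axes and the surrounding box.
plot(series, plot.type = "single", axes = FALSE,
     xlab = "", ylab = "Deaths per month")

# Redraw the axes in grey so that they recede behind the data.
axis(1, col = "grey60", col.axis = "grey40")
axis(2, col = "grey60", col.axis = "grey40")

# Label each line directly at its final observation instead of a legend.
last_time  <- max(time(series))
last_value <- as.numeric(series[nrow(series), ])
text(last_time, last_value, labels = colnames(series), pos = 4)

par(op)   # restore the previous graphics settings
```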

Other Design Choices

The choice of palette can also be used to highlight some aspect of the data, perhaps by using a restricted palette of colours to draw attention to something:

ldeaths4
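A restricted palette of this kind might be set up as below. Which series to emphasise, and the colours "firebrick" and "grey70", are assumptions made for the sake of the example.

```r
series <- cbind(Total = ldeaths, Males = mdeaths, Females = fdeaths)

# Assumption: we emphasise the female series. One strong colour for the
# line of interest, a muted grey for everything else.
highlight <- "Females"
line_cols <- ifelse(colnames(series) == highlight, "firebrick", "grey70")

op <- par(las = 1, mar = c(3, 5, 1, 6), xpd = NA)
plot(series, plot.type = "single", axes = FALSE,
     col = line_cols, lwd = 2,
     xlab = "", ylab = "Deaths per month")
axis(1, col = "grey60", col.axis = "grey40")
axis(2, col = "grey60", col.axis = "grey40")

# Direct labels take the same colours as their lines.
text(max(time(series)), as.numeric(series[nrow(series), ]),
     labels = colnames(series), col = line_cols, pos = 4)
par(op)
```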

Finally, we notice that the timeseries can be simplified. As the “Total” is just the sum of the “Males” and “Females”, we can get rid of one of the lines altogether. The plot might be better as a stacked timeseries:

ldeaths5
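A stacked version can be built from two filled polygons, as in the sketch below. The stacking order and the second colour are our assumptions; pink for males follows the choice described in this post, and yaxs = "i" switches off the small extension R adds to the axis range by default.

```r
# Check that the total really is the sum of the two groups.
stopifnot(isTRUE(all.equal(as.numeric(ldeaths),
                           as.numeric(mdeaths + fdeaths))))

tm     <- as.numeric(time(ldeaths))
bottom <- as.numeric(mdeaths)             # assumption: males at the bottom
top    <- as.numeric(mdeaths + fdeaths)   # top edge of the stack = total
n      <- length(tm)

op <- par(las = 1, mar = c(3, 5, 1, 6), xpd = NA)

# Empty plot with the y-axis anchored exactly at zero.
plot(range(tm), c(0, max(top)), type = "n", axes = FALSE, yaxs = "i",
     xlab = "", ylab = "Deaths per month")

# Each band is a closed polygon: lower edge forwards, upper edge backwards.
polygon(c(tm, rev(tm)), c(rep(0, n), rev(bottom)), col = "pink", border = NA)
polygon(c(tm, rev(tm)), c(bottom, rev(top)), col = "lightblue", border = NA)

axis(1, col = "grey60", col.axis = "grey40")
axis(2, col = "grey60", col.axis = "grey40")

# Direct labels, centred vertically on the right-hand end of each band.
text(tm[n], bottom[n] / 2, "Males", pos = 4)
text(tm[n], (bottom[n] + top[n]) / 2, "Females", pos = 4)
par(op)
```

Only the bottom band sits on a flat baseline, so it is the only one whose values can be read directly off the axis; the upper band has to be judged by its thickness.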

This visualisation obviously owes a great deal to the marvellous baby name wizard. And yes, we have chosen those colours on purpose, as pink is a nice strong shade for boys. It also serves as a reminder that the palette you choose will have cultural baggage, so best to take that into account from the start.

How would you change the figures? Suggestions for improving the plots are, as always, very welcome.

UPDATE 20/09/2013

After comments from Lucia, this post was edited slightly (added subheading “Other design choices”) to make it clear that the last two plots are not necessarily an improvement, but might highlight different aspects of the data.

7 comments

  1. To improve the graph: Avoid stacking. Stacking makes it difficult for people to quantify. More here.

    http://rankexploits.com/musings/2013/what-i-hate-about-this-figure/

    (BetterFigures edit Hmm, having problems with the link. Apologies.)

  2. Make the y axis start from zero?

    1. They all start from (contain) zero, but the default R setting is to add a little bit to the range. I’ve overridden that on the last plot.

  3. That’s a lousy default! Glad I don’t use R.

  4. Graph 3 is probably the best. Stacking two graphs is only useful when what you care about is the total and one of the two categories (which you show at the bottom). In this case male and female are equally “important” so the stack makes no sense.

  5. FWIW I agree with Lucia that stacked graphs are generally crap, for the reasons she expresses.

  6. Diogo de Gusmão-Sørensen

    Doug, I love it, well done. It is without a shadow of a doubt a BETTER timeseries.
