Visualising Ensembles

In climate science, an ensemble is a collection of climate model runs. The idea is that each run differs in some crucial factor; the ensemble gives you a handle on how that factor influences the way the model behaves.

The factor that differs might be one of a number of things:

  1. the way the model is forced (for example with different human emissions of greenhouse gases)
  2. the way the processes work within the climate model (parameterisation constants, for example)
  3. the initial conditions – the state of the world at the beginning of the simulation
  4. the model itself (many groups build climate models, they all behave slightly differently)

Ensembles are often used to try and estimate the uncertainty in future climate behaviour. For example, we can create a number of different setups of the model which are all consistent with our uncertainty about the way the Earth system works. We then run them, and look at the way they evolve through time.
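
As a rough illustration of that idea, here is a minimal sketch that summarises the spread of a tiny ensemble at each time step. Everything in it is invented for the example: the numbers are made up, and `ensemble` and `ensemble_spread` are hypothetical names, not part of any real model or library.

```python
from statistics import mean

# Hypothetical ensemble: each member is one model run, stored as a list of
# values of some quantity at successive time steps (made-up numbers).
ensemble = [
    [10.0, 11.0, 12.5],
    [9.0, 10.5, 13.0],
    [11.0, 11.5, 12.0],
]


def ensemble_spread(members):
    """Return a (min, mean, max) tuple across members for each time step."""
    summary = []
    for values_at_t in zip(*members):  # one tuple of member values per time step
        summary.append((min(values_at_t), mean(values_at_t), max(values_at_t)))
    return summary


spread = ensemble_spread(ensemble)
for step, (lowest, average, highest) in enumerate(spread):
    print(f"step {step}: min={lowest:.1f} mean={average:.1f} max={highest:.1f}")
```

The range between the minimum and maximum is only one crude measure of spread; it says nothing about how likely each member is.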

We suddenly have a large collection of plausible ways in which the Earth system might evolve: this creates real challenges for visualisation. We have to represent the behaviour of many versions of a 3-dimensional world, through time, and often on a 2-dimensional surface such as a computer monitor.

Here is one of my efforts at visualising an ensemble.


The graph shows a collection of simulations of river discharge through the 21st century. Each row represents a river, organised by latitude from North to South. The black dot at the top of each row represents the discharge of that river in the year 2000; the dot at the bottom of the row is the discharge in 2100.

The red dot is the observation of the river discharge in the year 2000, for reference. The lines show the evolution of the river discharge through the 21st century, for each of 17 members of a perturbed physics ensemble. This is a collection of slight variants of a single climate model, run under a single greenhouse gas forcing scenario. Don’t read too much into the data itself – it is an unpublished data set that I’ve used for demonstration purposes only.


In this visualisation, I’ve been forced to make choices, in order to highlight certain things.

First, I’ve taken the unusual step of having time run downwards, instead of the more common left to right. This was to preserve the North-South ordering of the rivers, in order to help the viewer place the rivers on the globe. It also makes it much easier to read the names of the rivers.
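
The layout described above can be sketched as a small coordinate calculation. This is only an assumed reconstruction of the idea, not the code behind the figure: the river names and latitudes are invented, and `row_order`, `vertical_position` and the `band` height are hypothetical.

```python
# Hypothetical rivers with made-up latitudes (degrees North), for illustration.
latitudes = {"Mid River": 10.0, "South River": -25.0, "North River": 60.0}


def row_order(lats):
    """Return river names sorted from North to South (highest latitude first)."""
    return sorted(lats, key=lats.get, reverse=True)


def vertical_position(row, year, start=2000, end=2100, band=0.8):
    """Map (row, year) to a y coordinate that increases down the page.

    Each river occupies the band [row, row + band]: the start year sits at
    the top of the band and the end year at the bottom, so time runs downwards.
    """
    fraction = (year - start) / (end - start)
    return row + band * fraction


order = row_order(latitudes)
for row, name in enumerate(order):
    top = vertical_position(row, 2000)
    bottom = vertical_position(row, 2100)
    print(f"{name}: 2000 at y={top:.1f}, 2100 at y={bottom:.1f}")
```

Discharge would then go on the horizontal axis. Because y grows downwards here, a plotting library would typically need its y axis inverted to put the northernmost river at the top.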

Second, I’ve chosen to show the absolute magnitude of the river discharge, on a linear scale. I often get frustrated seeing only anomalies (differences from the mean) plotted in simulations of the future. Plotting anomalies is usually appropriate, and nicely shows patterns of change in the future. Unfortunately, it can give a false impression of the accuracy of climate models – people can be surprised at the size of the systematic differences between the models and observations (termed biases). Usually, these biases don’t affect the future behaviour of the system, but I think they should be shown as a matter of course.

Further, if I only plotted an anomaly, you wouldn’t get an idea of the magnitude of the river discharges, and the size of the projected changes, relative to that.
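
To make the distinction concrete, here is a minimal sketch with made-up numbers. The discharge values, the member names, and the `anomaly` and `bias` helpers are all hypothetical. For brevity the baseline is simply each run's own year-2000 value, which is an assumption; a mean over a reference period would be the more usual choice.

```python
# Made-up discharge values for one river (arbitrary units), for illustration only.
observed_2000 = 100.0
model_runs = {
    "member_a": {2000: 150.0, 2100: 165.0},
    "member_b": {2000: 60.0, 2100: 75.0},
}


def anomaly(run, baseline_year=2000):
    """Change in each year relative to the run's own baseline-year value."""
    return {year: value - run[baseline_year] for year, value in run.items()}


def bias(run, observation, year=2000):
    """Systematic difference between the model and the observation."""
    return run[year] - observation


for name, run in model_runs.items():
    change = anomaly(run)[2100]
    offset = bias(run, observed_2000)
    print(f"{name}: anomaly in 2100 = {change:+.1f}, bias in 2000 = {offset:+.1f}")
```

Both invented members show the same anomaly, so an anomaly plot would make them look identical, even though one starts well above the observation and the other well below it.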

I imagine that this figure would be useful in a scientific paper, or presentation, in order to set the projected changes in river discharge in context. Is it successful? After being so close to the data, I find it difficult to judge if the figure is easy to get.

Suggestions welcome – as ever, please read the comments policy.



  1. Thanks to Paul Matthews (@etzpcm) for sending me this comment earlier.

    “A lot of them are bunched over to the left so you can’t see much. Try it with a log scale?
    Also I’d suggest that it’s more usual to have time running up rather than down – certainly is in my field.

    The figure suggests that the increase is linear from 2000 to 2100. Is that a reasonable approx of what the models do?

    It’s not easy to ‘get’ instantly, but as long as there’s a clear caption to explain it I think it’s fine.

    I agree it’s good to avoid ‘anomalies’ where possible. The term suggests something ‘anomalous’ is going on and there is always the problem of the arbitrary baseline.”

  2. I like this diagram. First, as a thing to look at: it reminds me of a book on fortune telling where the future could be seen in the way that a handful of cast matchsticks fell. I also like the way that you can easily see the overall tendency of the models as well as the biases and behaviour of each one. It would be nice to have some idea of the variability in discharge, but I appreciate that’s probably not the purpose of the plot.

    The Amazon and Congo make it quite hard to see what’s going on with the other rivers because they all get squished over on the left. I’d try (try, mind – I have no idea if this would work better or not) normalising each river by dividing the model output by the observed value and show the absolute discharge as a wiggly line running top to bottom.

    I did wonder why the legend was split in two with black and red dots explained at the top and the time axis explained at the bottom. I missed the time axis the first time I looked at the diagram. It would be neater to have the whole thing explained in one place at the top.
