Fun with Cholera!
This post looks at how data can be symbolised in different ways to make for more effective communication using data from John Snow’s mapping of the cholera outbreak in Soho.
Well there’s nothing fun about cholera, really. But an outbreak in London in 1854 taught us some important lessons about spatial thinking, and about map making too. Here’s another image of John Snow’s original 1854 map of cholera deaths in Soho as a reminder:
Snow’s map was an effective example of geographic communication, and the main purpose of this post is to play around with some different ways of portraying those data to get the message across. Let’s start with a digital version of the raw data:
(You can find a copy of these data on J:\Current_Projects\SnowGIS. These data were created by Robin Wilson)
On this map we have some points that represent the locations of deaths, overlaid on a portion of a UK Ordnance Survey map to give us some spatial context. The table shows that each point has a Count attribute that holds the number of deaths occurring at that location (minimum: 1, maximum: 15). Showing just the points may suggest a bit of a pattern; it would be tempting from this to look somewhere roughly in the middle as some sort of focus (which is roughly where the pump was). But with each point the same colour and size, the map implies less clustering than Snow’s map suggests. Perhaps we could communicate that better by showing the points in a different way. Next we’ll use points but we’ll also use colour to highlight the number of deaths:
Here I used the graduated colours option under the Quantities option in the Symbology tab:
This is a slight improvement – but only slight, methinks. The colours blend in a bit too much against the OS map. The map is basically using shade to highlight the number of deaths: the darker the colour, the higher the number. This is a common strategy. We naturally tend to associate a darker colour with more “stuff” and our eyes are usually attracted to the darker shades, so we can use this to draw our map readers’ attention to the points we wish to highlight (though it might look a bit odd to use darker colours for the points with lower numbers of deaths, even if those were the ones we were trying to highlight). I’ll try one more configuration of shade with a different colour ramp, green blending into red:
Another slight improvement – the colours “guide” the reader’s eye in towards the centre, though some of the higher death points are getting a bit hidden. This green-to-red colour ramp is a commonly used one that takes advantage of the social conditioning that we associate with those colours: green = good or go, yellow/orange = slow or caution and red = stop or bad. No death is a good thing so maybe the green’s not a great choice. Plus we may well be alienating our colour-blind readers. Perhaps a bit of trial and error with colours would get us a better result. (Mind you, we could also play around with the ranges of our categories, i.e. make the 9 – 15 class larger so that more points are coloured red.) One more quick tweak:
A very subtle change – I’ve just made the OS map 30% transparent (using the Effects toolbar) so that it fades into the background a bit, helping the points to stand out a bit more.
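For anyone who wants to see what graduated colours boil down to outside the Symbology tab, here is a minimal Python sketch. It is not how ArcMap does it internally. The class breaks and the hex colours are made up for illustration (the post only mentions a 9 – 15 class), and the function names are hypothetical. The ramp is a single-hue light-to-dark one, which sidesteps the red/green problem mentioned above.

```python
from bisect import bisect_left

# Hypothetical upper bounds for four classes: 1, 2-3, 4-8, 9-15.
# Only the 9-15 class comes from the post; the rest are assumptions.
UPPER_BOUNDS = [1, 3, 8, 15]

# Hypothetical single-hue ramp, light to dark (darker = more deaths).
COLOURS = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def classify(count, upper_bounds=UPPER_BOUNDS):
    """Return the index of the class whose range contains `count`."""
    if count < 1 or count > upper_bounds[-1]:
        raise ValueError(f"count {count} is outside the classified range")
    # bisect_left finds the first upper bound that is >= count.
    return bisect_left(upper_bounds, count)


def colour_for(count, upper_bounds=UPPER_BOUNDS, colours=COLOURS):
    """Return the colour assigned to a point with this death count."""
    return colours[classify(count, upper_bounds)]
```

Widening the top class, as suggested above, is just a change of break values: with `upper_bounds=[1, 3, 5, 15]`, a point with 7 deaths moves into the darkest class.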
So far we’ve looked at changing colours but we can also play with size. Next I’ll use graduated symbols so that the size of the point scales with the number of deaths – the larger the circle, the more deaths:
Perhaps better – what do you think? Note that I’ve also tried to emphasise the negative nature of these data by using red. So, returning to Snow, he used stacked symbols to make his point – we can try a couple of similar techniques. From Properties > Symbology I’ve opted to use the Bar/Column chart with the Count attribute:
With this result:
Yikes! I think we’ve taken a few steps backward with this one! Each bar is associated with one of the points but so that things don’t overlap, the bars are shifted outward with callout lines connecting each to its location. In some contexts, this option would work fine, but here it just dilutes the story – the information has gotten diffused. Let’s try that again but this time we’ll untick the “Prevent chart overlap” box:
Well, we’re sort of returning to Snow’s original version here, though to my mind it lacks the impact of Snow’s map (some detail below):
We’d have to work hard to recreate this digitally, i.e. with the stacks oriented perpendicular to the streets. That’s not a trivial task without some scripting so we won’t go down that road here. We could add a bit of 3D depth to our bars here:
Neat, but does it really make it any more effective? I don’t think so. Let’s follow this 3D idea and try to visualise this in ArcScene:
Each point has been extruded based on the Count attribute (truth be told, it’s Count * 10 to make them large enough to see at this scale). Not bad, though this one is a bit more effective in ArcScene where you can actually move it around and zoom in and out. Nonetheless, it does help to highlight things a bit more.
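The graduated symbols and the extrusion both come down to turning Count into a dimension. Here is a rough Python sketch of that idea. The extrusion factor of 10 is the one used above. The square-root scaling for circles is an assumption on my part: it makes the circle's area, not its radius, proportional to the count, and it is not necessarily how ArcMap sizes its graduated symbols. The function names and the `min_radius` parameter are hypothetical.

```python
import math


def symbol_radius(count, min_radius=2.0):
    """Radius for a circle whose AREA is proportional to `count`.

    A point with count 1 gets `min_radius`; a point with count 4 gets
    twice that radius, and therefore four times the area.
    """
    return min_radius * math.sqrt(count)


def extrusion_height(count, factor=10):
    """Height of the extruded point: Count * 10, as in the ArcScene view."""
    return count * factor
```

Scaling the radius directly with the count would make a 15-death point look 225 times bigger in area than a 1-death point, which is why area scaling is a common choice for this kind of symbol.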
Let’s look at one final option – heat maps. These are layers that allow us to map the density of point-based spatial phenomena. Another time we’ll cover “hot spot” analysis which applies a much more rigorous, statistical approach to clustering of points. I’ll switch gears here and use ArcGIS Online to do this next bit, mainly because it’s a lot easier to do there than from ArcMap.
Now I don’t know about you but, of all we’ve done in this post, I think this is the only thing that even comes close to bettering Snow’s map. This is the result of some statistical modelling of the point densities and helps to identify where the significant clusters are. There’s no doubt with this where the “epicentre” of the outbreak was, nor that it was related to the access people had to the pump via the street (an artifact of the original address-based data, really). This type of analysis is used for all sorts of things, from epidemiology (thanks for that, John Snow) to criminology to animal movements – perhaps we’ll cover those in more detail in a later post. Interesting to note that with all the digital tools at our disposal in ArcMap, it’s arguable if we’ve topped a hand-drawn map from 1854.
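To give a feel for what a heat map is doing under the hood, here is a minimal Python sketch of a weighted density surface. It is not the algorithm ArcGIS Online uses. It assumes a simple, unnormalised Gaussian kernel, with each point weighted by its Count, and the coordinates in the usage example are invented grid positions rather than the real Soho data. The function names and the `bandwidth` parameter are hypothetical.

```python
import math


def density_surface(points, width, height, bandwidth):
    """Build a grid of density values from weighted points.

    `points` is a list of (x, y, count) tuples in grid coordinates.
    Each point spreads its count over nearby cells with a Gaussian
    falloff; `bandwidth` controls how far the influence reaches.
    """
    grid = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total = 0.0
            for x, y, count in points:
                dist_sq = (col - x) ** 2 + (row - y) ** 2
                total += count * math.exp(-dist_sq / (2 * bandwidth ** 2))
            grid[row][col] = total
    return grid


def hottest_cell(grid):
    """Return (row, col) of the cell with the highest density."""
    best = max(
        (value, row, col)
        for row, line in enumerate(grid)
        for col, value in enumerate(line)
    )
    return best[1], best[2]


# Toy data, NOT the real Soho points: (x, y, deaths).
toy_points = [(2, 2, 15), (3, 2, 4), (7, 7, 1)]
surface = density_surface(toy_points, width=10, height=10, bandwidth=1.5)
print(hottest_cell(surface))
```

A larger bandwidth gives a smoother, more generalised surface; a smaller one keeps the hot spots tight around individual points. Picking it is a judgement call, much like picking class breaks.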
So we’ve had a brief tour through some of the tools of the trade of map making, from colours and shades, to size, to charts, 3D and finally hot spots. Map making is ultimately a form of communication, albeit visual, and just as we might put a lot of thought into the words we choose in an essay, it pays to think carefully about what you’re wanting to say, who you’re saying it to, and how best to say it. They say a picture is worth a thousand words – well perhaps a map is sometimes worth more than that.
C
All images by the author unless otherwise stated.
Crile,
Two things struck me as I read your great post:
1 – it is almost always better to form a hypothesis to test, or to plan what you’re trying to communicate, before you fire up your computer or start mapping things. As you show, it means you get to a great result faster with fewer revisions.
2 – be aware of your audience. You mention red and green in the context of colour-blind viewers. The audience’s culture and level of education should also be considered so that the end product is not misunderstood. In China red signifies good luck; in western cultures it signifies danger. For many Muslims silver signifies death, as black does in western cultures and white does for Hindus.
Showing 11 different ways to map the same data makes it easy for us to see why one is SO much better than the others.
Thanks