Mapping the Census
In this post I’ll take about how we can make the most of the recent (and previous) census data and how to add spreadsheet data to a map.
So did you fill in your census form on Tuesday night? In the course of filling in the forms, you no doubt entered a lot of personal information about you and whoever you share a dwelling with. Those data provide the government with a hugh amount of demographic data on where people are, how much money they make, what level of education they have, whether they smoke or not, what religion they follow, and the list goes on. In between when you submit your forms and when the data are released, a subtle process of aggregation occurs. Your personal information gets aggregated into various levels so that the results can be presented in spreadsheets and maps. When the results are compiled they are made available in Excel spreadsheets from the Statistics New Zealand website. Your personal information can’t be found but you can see where you fit in with people who live around you. Here’s an example of the spreadsheet from the most recent census (2006):
This particular page is from the families spreadsheet (one of seven) and shows the breakdown of families on a region by region basis. Note the tabs at the bottom – each sheet is linked to area units against which the data are presented, from the regional scale (RC) to territorial authorities (TA) to wards (roughly suburbs) to area units (not quite sure what those are to be honest) and down to the finest resolution – meshblocks (MB). At the top level there are 15 regions; at the meshblock level there are 41,381! The intriguing thing about all this is that we should be able to map any of the census data to one of those spatial levels. But how do we do so? Spreadsheets aren’t maps. Happily, StatsNZ has made the spatial data behind those levels available for free so at the very least we can map the areas. But we need some way to link together the spreadsheets and the data layers. Here’s an image of the regions layer:
(This layer includes areas out to 23 km offshore.) To make this work, there needs to be a common attribute between the spreadsheets and the spatial layers. If you look at the spreadsheet image above, you’ll see the “Regional Council Code (2006 Areas)” at the left hand side. If we look in the attribute table for the regions, you’ll see there’s a similar number:
You’ll note that we actually have two attributes we can work with – “REGC_NO” and “REG_NAME”. So in theory we can use these common values to link together the data in the spreadsheet with the map layer. Before we can do so, though, we’ve got to make some changes to the spreadsheet before we add it to the map. To add data like spreadsheets to our map, the first row should consist only of headers (field names) and entries should have no spaces, so I’ve had to make some changes to that sheet. To keep it simple I’m just going to show the first few fields; total number of families and then the breakdown by family type:
So the first row contains the field names and there are no spaces in those names. I can now add the spreadsheet to my map just like any other layer.
Next step is to “Join” the spreadsheet table to the regions layer. To do so, I’ll right-click on the regions layer, and go to Joins and Relates > Join. I could use either the region number or the region name to link – doesn’t really matter in this case, but I’ll use the number here:
When I click OK, all the attributes in my spreadsheet table get added to my map layer and then I can use these new attributes to map the data:
We’ll go one step further with this mapping – the data also include the breakdown of families by couples with and without children, and one parent with children. I’ll show these as pie charts for each region but will zoom in on Canterbury to have a closer look (Look for the Charts option on the Symbology tab):
Note that I can set it so that the diameter scales with the total population, so Canterbury’s population being larger than the West Coast’s is reflected by its size and the rough breakdown between the different classes is shown by the pie chart.
So now, any data included in the census can be mapped reasonably easily at several different scales. Earlier I mentioned the meshblocks being the finest scale. Here’s an image of the Christchurch meshblocks mapped to show populations.
I’ve covered a lot in this post – but one important thing I’d like to highlight is how it’s possible to add spreadsheet data to your map. The most important element of doing this is having a common attribute that allows you to link together the features on the map with the records in your spreadsheet. Very handy, that.
If you’ve got explicit location data in your spreadsheet (e.g. northings and eastings or Lat/Long from GPS) then it’s a different process to map the points – something I can cover in another post.
Also note that the 2006 census data are freely available from the StatsNZ website. We have all the spreadsheets and spatial layers on the data server in J:DataCensusData2006. (Click here if you need to know how to connect to the J: drive.)
Crile