The Evolution of a Map
In this post, the process of developing a map with a colleague is documented.
For many people, GIS is all about the map. And fair enough, too. Why use GIS if you’re not interested in a map? The final destinations for maps are manifold but their main intent is to communicate the results of some analysis visually. While we tend to focus on the map output, the process of developing a map can be quite interesting in its own right. I’m frequently in the position of working with someone to develop a map for a specific purpose and a back-and-forth, iterative process is, more often than not, the way it evolves. So in this post I’ll step you through how a colleague and I recently worked through a map from beginning to end – the subject being a map of domestic tourism guest nights for 2016.
The colleague in question is Jude Wilson from SSPRT. She rang the GIS hotline wanting to create a map for a report that showed the variation in guest nights for domestic tourists during 2016. My first comment: show me the data! Jude sent me a spreadsheet of what they were wanting to map. Data had been collected from tourism operators that broke down their guest nights by domestic and international tourists. One interesting challenge for this map was that the data were collected within RTO areas, which are areas in which regional tourism operators work. They are mostly composed of either regions or districts (TLAs or territorial local authorities). We don’t have an existing layer specifically for RTOs so my first task was to create a new layer that showed these areas.
Luckily, Bernie from Statistics NZ had done this within the spreadsheet. There was a column on one sheet that had the RTOs broken down and linked to (mostly) existing TLAs. Here’s a screenshot of part of that sheet:
We’ve already got a layer of the TLAs, shown below:
To convert this over to an RTO layer, I added a new text field to the attribute table called “RTO” (from the table menu > Add Field of type text with a size of 50 characters). Next, I used a Select by Attribute query to select all the TLAs that make up each RTO (below is an example for the Northland RTO):
With those records selected, I then used the Field Calculator to add the RTO name to the right field for the selected records (being a text field I had to enclose the name in quotation marks):
I did this for each RTO. Once done, I needed to create a new layer that used the RTO names to group the districts together. For this I used the Dissolve tool. In essence, what this operation does is take all the records that have the same value in the RTO field and merges them together into a new polygon for each RTO. So now I’ve got the spatial layer I need for the map. In the process, the data went from 158 TLA polygons down to 32 RTO polygons.
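The attribute side of that Dissolve step can be sketched in a few lines of plain Python. This is only an illustration of the grouping logic, not the ArcMap tool itself: the record list and the `dissolve_by_field` helper are hypothetical, and the geometry merging that Dissolve also performs is left out.

```python
from collections import defaultdict

# Hypothetical TLA records as (TLA name, RTO name) pairs; illustrative only.
tla_records = [
    ("Far North District", "Northland"),
    ("Whangarei District", "Northland"),
    ("Kaipara District", "Northland"),
    ("Rotorua District", "Rotorua"),
]


def dissolve_by_field(records):
    """Group TLA names by RTO value: one output group per distinct RTO."""
    groups = defaultdict(list)
    for tla_name, rto_name in records:
        groups[rto_name].append(tla_name)
    return dict(groups)


rto_groups = dissolve_by_field(tla_records)
for rto_name, members in sorted(rto_groups.items()):
    print(rto_name, "<-", ", ".join(members))
```

Four input records collapse to two groups here, in the same way the 158 TLA polygons collapsed to 32 RTO polygons.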
Next I’ve got to link the polygons to the values to be mapped from the spreadsheet. There are two options for this: first, since I’ve only got 32 records, I could go through and add the data for each record one by one. This is feasible but leaves the process open to human error, something I’m certainly capable of (as we’ll see). Another option is to use a table join to do this automatically since I’ve already got the data in the spreadsheet. I opted for this approach so as to reduce the possibility of me adding the wrong information in.
Back in the Excel spreadsheet, let’s look at those data – here’s a screenshot:
Column H has the data I’d like to map. We can see the RTO area name in column A – I’m going to take advantage of this to link this sheet to my RTO layer. What’s in column H is the % of domestic nights for the year ending March, 2016. Columns E and F have the raw data and column G gives us the totals. Before joining I need to do a few key steps:
- Make sure the first line has the names of each attribute with no spaces or special characters;
- ArcMap won’t honour spreadsheet formulas so copy all the data in column H and paste as values;
- Remove empty rows
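For anyone who would rather script these preparation steps than do them by hand in Excel, here is a minimal Python sketch. The column names, the numbers, and the helper functions are all made up for illustration. It also assumes the percentage column is simply domestic nights divided by total nights, stored as a fraction.

```python
import re


def clean_header(name):
    """Replace runs of spaces/special characters with a single underscore."""
    return re.sub(r"[^0-9A-Za-z]+", "_", name.strip()).strip("_")


def drop_empty_rows(rows):
    """Keep only rows that have at least one non-empty cell."""
    return [row for row in rows if any(cell not in (None, "") for cell in row)]


def domestic_share(domestic, international):
    """Store the result as a plain number, not as a formula."""
    return float(domestic) / (domestic + international)


# Hypothetical header and rows; the figures are invented.
header = ["RTO area", "Domestic nights", "International nights", "% domestic"]
rows = [
    ["Northland", 600, 400],
    ["", None, ""],          # an empty row to be removed
    ["Rotorua", 300, 700],
]

clean = [clean_header(h) for h in header]
table = [row + [domestic_share(row[1], row[2])] for row in drop_empty_rows(rows)]

print(clean)
for row in table:
    print(row)
```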
After making those changes in a new sheet (so I still have the original data), it looks like this:
I only need columns A and F so if I really wanted to I could delete the other columns. Now we’re ready to do our join. Just to make things easier, I’ll add this sheet to my map (yes, you can add a single sheet from a spreadsheet to a map and work with it as a standalone table). The only trick is that I can only add a specific sheet rather than the whole spreadsheet so I have to be sure I know the name of the sheet in question – mine is called MapData here:
I’ve got a lot of <Null> values which I suspect come down to the format they are in – they could be formulas or references to data in other cells – this is why I copied and repasted them in as values. I’m not worried as, at least for this map, I don’t really need those data.
Okay – my spatial data are ready and my spreadsheet data are ready – time to make the join (we covered some of this in a previous post).
As long as I’ve got a common attribute, or key, that exists in both my spatial layer and in my spreadsheet, I can use that to link the two together. In this case it’s the RTO name. From ArcMap, I right-clicked on the RTO layer name and went to Joins and Relates > Join and a new window popped up to help me make the connection:
The tool recognised that the MapData$ table was the only standalone table available so all I had to do was specify which attributes are common in both layers. Before I click OK, it’s always a good idea to click the “Validate Join” button to get a sense of how well it’s going to work. Here’s the result I got:
In the result above you can see it wasn’t entirely successful – only 30 of 32 records could be matched. What’s the problem?
It’s most likely to be some sort of typo between the two datasets. On closer inspection I found two problems. For one, I typed “Destination Cluthai” instead of “Clutha” in the RTO layer (D’oh! User error…) and for the other it’s the difference between “Christchurch and Canterbury Marketing” and “Christchurch & Canterbury Marketing”. Both are easily fixed and on the next validation I’m 32 for 32.
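The kind of check that Validate Join performs can be approximated outside ArcMap by comparing the key values on each side. The sketch below is a rough stand-in, not the tool's actual logic. The `validate_join` helper is hypothetical, and which dataset held which spelling is my assumption for the sake of the example.

```python
def validate_join(layer_keys, table_keys):
    """Report how many keys match and which ones appear on one side only."""
    layer_set, table_set = set(layer_keys), set(table_keys)
    return {
        "matched": len(layer_set & table_set),
        "layer_only": sorted(layer_set - table_set),
        "table_only": sorted(table_set - layer_set),
    }


# Illustrative keys only; the exact spellings on each side are assumed.
layer_names = [
    "Northland",
    "Destination Cluthai",
    "Christchurch and Canterbury Marketing",
]
sheet_names = [
    "Northland",
    "Destination Clutha",
    "Christchurch & Canterbury Marketing",
]

report = validate_join(layer_names, sheet_names)
print(report["matched"], "of", len(layer_names), "records matched")
print("Only in layer:", report["layer_only"])
print("Only in sheet:", report["table_only"])
```

Listing the unmatched names from both sides next to each other makes typos like these much quicker to spot than scanning the two tables by eye.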
So, close this window and click OK and the join goes through – here’s the new RTO layer table:
The white fields are from the RTO spatial layer. I’ve highlighted the joined attributes in blue – these are the ones from the spreadsheet. The upshot is that now I’ve got the spatial data and the data to be mapped in one layer. The next step is to bring Jude to the conversation and start getting the map fleshed out. Before we spoke, I sent her a rough first draft so we would have something to talk over. As you’ve seen above, the data to be mapped are percentages and each RTO looks to have a unique value. 32 different RTOs mean potentially 32 different unique values could be mapped. Here’s what that could look like if we gave each RTO a unique colour:
This tells us something but it’s a bit more information than most of us can sensibly process, so my rough draft was to break it up into a few classes. Jude had already told me that she wanted five classes: 0 – 50%, 50 – 60%, 60 – 70%, 70 – 80%, 80 – 90%. This made my life significantly easier. I chose a simple red to green colour ramp and set up the classes – here’s the first draft I sent her:
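To make the classification concrete, here is a small Python sketch that assigns a percentage to one of Jude's five classes. It treats each break as an inclusive upper bound, which is my assumption about how the classes were set up. The `classify` function and the sample values are hypothetical.

```python
from bisect import bisect_left

# Upper bound of each class, in percent, with matching legend labels.
UPPER_BOUNDS = [50, 60, 70, 80, 90]
LABELS = ["0-50%", "50-60%", "60-70%", "70-80%", "80-90%"]


def classify(percent):
    """Return the legend label for a percentage (upper bounds inclusive)."""
    if not 0 <= percent <= UPPER_BOUNDS[-1]:
        raise ValueError("value outside the mapped range: %r" % percent)
    return LABELS[bisect_left(UPPER_BOUNDS, percent)]


# Made-up sample values, not figures from the spreadsheet.
for value in (42.0, 50.0, 50.463, 78.9, 90.0):
    print(value, "->", classify(value))
```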
A bit easier to get a sense of how things vary around the country now. At this point we were talking on the phone and looking at the same map. Her first comments were: could I add the regions in? Could I also remove the box around the legend? And reword the title? Easy as:
We played around a bit with the line thicknesses of the regions so that they were discernible but not overwhelming. Getting closer – a few tweaks later and we got to here, version 5:
Not major differences in this one but if you look closely you’ll see some differences in the Ruapehu boundary and a few others. Jude had noticed that some areas appeared to be over-represented so she made a few decisions about what should go where. I made those changes and then we got to the final version, shown below:
In this final version, region names were added, a Lincoln logo and some bibliographic data about sources were inserted and we finalised the way the legend looks. And here we had a slight disagreement. Note the class boundaries on the legend: 0 – 50%, 51 – 60%, etc. On the original it was 0 – 50%, 50 – 60% etc. This second scheme, the one with shared boundaries, was my choice and reflected my bias. From my point of view I know that the data are continuous, ranging from 0 to 1. So in my scheme, all potential data values have a place to live, i.e. a number like 0.50463 (or 50.463% as shown in the legend) is catered for. If classes go from 0 – 50 and then 51 – 60, that value is out in the cold. In the end we went with what Jude wanted because it was her map and she felt that that’s what her readers would be more comfortable with. This is a useful point to emphasise – one of the cartographic principles that I tend to focus on with students is that anything on the map should be there for a good reason and not be a distraction. I’d hate the readers of this map to be side-tracked by their annoyance at the class boundaries, so 51 – 60 it was.
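The class-boundary point is easy to check with a few lines of Python. The sketch below reads each legend label literally, with both ends inclusive, and looks for a class that contains a given value. The `find_class` helper is hypothetical, and the classes after 51 – 60 are my guess at what the "etc." stands for.

```python
# Shared-boundary scheme: each class starts where the previous one ends.
SHARED = [(0, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
# Whole-number scheme; the classes after (51, 60) are assumed.
WHOLE_NUMBER = [(0, 50), (51, 60), (61, 70), (71, 80), (81, 90)]


def find_class(value, classes):
    """Return the first (low, high) class containing value, or None."""
    for low, high in classes:
        if low <= value <= high:
            return (low, high)
    return None


value = 50.463
print("shared boundaries:", find_class(value, SHARED))
print("whole-number labels:", find_class(value, WHOLE_NUMBER))
```

Read literally, the whole-number labels leave 50.463 without a class. In practice this affects only how the legend reads, since the software still classifies on the underlying breaks.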
All up, this whole process took around three hours. Much of that was preparing the data for the mapping but the most important aspect was the back and forth between me and Jude to get the map right. I brought the ability to make the map and Jude had all the knowledge about what she wanted to show and how best to show it, given the audience. This map has ended up as a figure in a report that (hopefully) communicates the story in the most effective, visual way.
In my experience, this is quite a common sort of mapping exercise, helped greatly by Jude having a clear picture of what she needed to communicate. Getting the final map should be an iterative process that gradually homes in on a finished product. This is something of a case of “the customer is always right”, within the constraints of what we can map. Despite all the effort that may go into an analysis project, it’s often only the map that people see or care about. So getting it right is well worth the effort. Its main job is to tell a story and how it looks is everything. The thing that most people quickly see with GIS is how easy it is to make a map – but making a good map is not. Hopefully this has given you a bit of insight into how this process works.
C
Crile,
thanks for giving us an insight into the process that happens when a map is created, and the importance of focusing on the audience and purpose.
I found it really interesting inferring international tourist nights from the map. In general, international tourists travel around New Zealand visiting the following locations: Auckland, Rotorua, Canterbury, Mount Cook, Queenstown and West Coast. That matches how I’ve interpreted the map you’ve created of the data.
What surprised me was the relatively high international tourist nights in Marlborough.
As always thanks for helping me understand GIS