In this post we look at addressing a spatial problem, coming up with an approach and implementing the solution.  (Spoiler alert – it worked, but not as it was supposed to.) 

We often get emails here at GIS Central asking for a bit of assistance with things spatial.  One never knows what will come through the Inbox and I’m almost always up for a challenge.  Why, just the other day we got this:

“Hi Crile,

I am looking for assistance with ArcGIS and was wondering if you may know a thing or two which could help me.”

(Ed. They always start off so nice…)

“I have 10 individuals with 50 days’ worth of hourly locations (some data points will be missing.)

I want to split the data for each individual into 2 day sliding windows, e.g day 1&2, 2&3, 3&4 … etc.

For each two day window, I want to calculate the distance between each location and the first location recorded in that window.

Do you know of possible steps in ArcGIS I might undertake to achieve this?”

We always aim to please and on the face of it, this certainly sounded feasible (these individuals are deer, by the way).  With the idea of “beginning with the end in mind”, I began formulating an approach.  At this point I think the main output will be a table (CSV or Excel) with a column of distances away from one point, broken down into these sliding time windows.  We’re talking about distances between deer locations so the Near tool comes to mind immediately – that’s exactly what it’s designed to do.  I’m also anticipating that I may have to think about how to structure the data to do this most efficiently.  With 10 animals and 50 days’ worth of data I’m already beginning to wonder if this is a manual job or if I’ll need/want to automate it somehow (either with ModelBuilder or Python).

With a very rough plan in mind, the next thing might be to have a look at the data – so I asked for something to work with and was sent a CSV file of the data:

A few things to notice about these data:

  • 1,035 points in all covering 50 days’ worth of data collection – time readings are not regular
  • The Latitude and Longitude columns are critical – they will allow us to map the points
  • BUT – being latitude and longitude, distances measured from these data will be in degrees of arc – very small numbers at this scale and not easy to work with as “on the ground” distances.  I’ll need to project these data to NZTM to get some decent distances in metres
  • Dates and Times are present BUT they are in UTC (the successor to Greenwich Mean Time) – does that matter?  I’ll need to ask.  Depending on the time of year, NZ is either 12 or 13 hours ahead of UTC
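To put some rough numbers on the degrees problem, here is a minimal sketch in plain Python (no ArcGIS needed).  It uses the haversine formula on a spherical Earth, which is only an approximation, and the coordinates are illustrative values near Glenorchy that I have made up for the example, not taken from the deer data.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical point near Glenorchy (illustrative only)
lat, lon = -44.85, 168.39

one_deg_north = haversine_m(lat, lon, lat + 1, lon)
one_deg_east = haversine_m(lat, lon, lat, lon + 1)
print(round(one_deg_north), round(one_deg_east))
```

At this latitude a degree of longitude covers noticeably less ground than a degree of latitude, which is one more reason to project the points before measuring anything.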

So let’s get these points on the map using Display X Y Data (data shown have already been projected to NZTM):

As a first note, this deer (henceforth known as DeerA to protect the innocent) was released at Glenorchy at the top of Lake Wakatipu and very quickly hightailed it (as deer do) up into the Beans Burn river valley, covering around 35 km in five hours!

Next we’ve got to come up with a workflow that works – manually first to make sure it works and then we can think about automating it.  I know that Near is the likely tool, so let’s look at how to set that one up:

This tool needs some Input Features (what we’re finding the distance from) and Near Features (what we’re finding the distance to – though it is the same in both directions).  I can limit the search radius, request the Location (x and y coordinates of the near feature) and the angle to it, and use either the planar (flat) or geodesic (along the earth’s curved surface at great distances) method.  A really important thing to know about this tool is that it doesn’t create a new output layer.  Instead, it adds two new fields to the Input Features layer – one for the ID of the nearest feature (NEAR_ID) and another for the straight-line distance to that feature (NEAR_DIST).  These are the default names – I can give them new names in the Field Name window if I like.

We can control what features get used for Near by selecting features – this is useful as I need to find the distance from a single point for a collection of points in one of the sliding windows.  In other words, I might select one point as the start of a 2-day window and then all the points that occur within that 2-day window.  Another thing – do the Input and Near Feature layers need to be different, or can I use the same layer for both?  Happily, the same layer can be used for both.  I’ll play a little trick on Pro by having two copies of the same layer on my map:

I’ve labelled one copy as “DeerA_TM_Input Feature” and the other as “DeerA_TM_Near Feature”.  Then I manually selected the first record in the Near Feature layer and all the points within the 2-day window starting then in the Input Features layer.  I was anticipating this would add the NEAR_ID and NEAR_DIST fields to both layers, so I ran it and checked to see if it worked.  It did!

The -1 value occurs because it’s the same feature.  Using the Measure tool, I confirmed that the correct distances are being calculated so I’ve got a viable workflow.
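The same sanity check can be done by hand.  With the planar method on projected points, the distance is just Pythagoras on the easting and northing differences.  A minimal sketch, with made-up coordinates in metres standing in for two projected deer fixes:

```python
import math


def planar_distance(x1, y1, x2, y2):
    """Straight-line (planar) distance between two projected points, in map units."""
    return math.hypot(x2 - x1, y2 - y1)


# Hypothetical easting/northing pairs in metres (not from the deer data)
start = (1234000.0, 5030000.0)
later = (1234300.0, 5030400.0)

dist = planar_distance(*start, *later)
print(dist)  # 500.0 (a 300 m by 400 m offset)
```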

So.  Next decision is do I do this manually or should I automate it?  I’ve potentially got 10 animals each with around 50 days’ worth of data.  With 2-day sliding windows let’s call that 25 windows per animal, so 250 iterations all up.  If this was just one animal, I might be tempted to do it manually, but with 10, I think I’m going to have to automate it.  Should I use ModelBuilder or Python?  I’ll check out ModelBuilder first – I know it’s got iterators built in so it might work.  Before I do that, though, I want to think about whether I can structure my data to make life easier.

In the bit above, I manually selected the records.  This was easy enough to do; find the first record and then go forward two days.  If my records were at regularly sampled times, I might be able to use Select by Attribute to make this work, but given the irregular sampling, I think that’s going to be a nightmare.  At this point I’m willing to do the work of manually grouping the 2-day windows by adding a new attribute.  This new attribute is called “Window”, starts at 1 and changes to 2 two days after the start time.  Below you can see where the records transition from window 1 to window 2:
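For anyone who would rather not do that grouping by hand, here is a rough sketch of the same rule in plain Python: the window number starts at 1 and goes up by one every two days after the first fix.  The function name and the timestamps are my own inventions for illustration, and it assumes the times have already been read into datetime objects.

```python
from datetime import datetime, timedelta


def assign_windows(timestamps, window_days=2):
    """Return a 1-based window number for each timestamp.

    Windows are back-to-back blocks of `window_days`, measured from the earliest fix.
    """
    start = min(timestamps)
    span = timedelta(days=window_days)
    return [(t - start) // span + 1 for t in timestamps]


# Hypothetical, irregularly spaced fixes
fixes = [
    datetime(2020, 1, 1, 6, 0),
    datetime(2020, 1, 1, 9, 30),
    datetime(2020, 1, 2, 23, 15),
    datetime(2020, 1, 3, 6, 0),   # exactly two days after the first fix
    datetime(2020, 1, 4, 12, 0),
    datetime(2020, 1, 5, 7, 0),
]

windows = assign_windows(fixes)
print(windows)  # [1, 1, 1, 2, 2, 3]
```

Because the rule works on elapsed time rather than on a fixed sampling interval, the irregular readings don't cause any trouble.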

To further make life easier when I come to automate this, I decided to extract out the first day points into a new layer.  This layer is called DeerAFirst (on the right below):

To do this manually I’ve got to do two Select by Attributes – one for the first day of a window and the second for all the points within that window.  Let’s see if I can do that in ModelBuilder.

I started by dragging my two input layers on to the canvas and the Near tool.  From the Iterators menu I picked “Iterate by Feature Class” which should allow me to work through each record one at a time:

So far, so good – until I tried to add a second iterator…they’re all greyed out when I try to choose another one:

This is a MAJOR limitation of ModelBuilder: I can only use one iterator per model.  It’s going to have to be Python.

I won’t go into all the gory details (and there are plenty) but the script ended up being pretty simple really.  The biggest problem I ran into was that whenever Near runs, it works on the selected records as expected, but it also resets all the non-selected records to -1, meaning each time it runs I lose all the previously calculated distances.  My solution?  Create a new layer with all the same attributes but no features (records) and at the end of each iteration, export the selected records to it.  For this I used the Append tool.  Here’s the script for those of you having trouble sleeping (assuming you’re still awake):

import arcpy

# Some script removed to protect the guilty
# variables for the input, near and output feature classes
inputFC = "DeerA"
nearFC = "DeerAFirst"
outputFC = "DeerAOut"

# counter to move through each window
count = 1

# while loop to iterate through the Window attribute and run Near.
# There are 25 windows so this forces it to stop after the 25th
while count < 26:

    # where clause selecting the records in the current window
    whereClause = """Window = {}""".format(count)
    print(whereClause)

    arcpy.SelectLayerByAttribute_management(inputFC, "NEW_SELECTION", whereClause)

    arcpy.Near_analysis(inputFC, nearFC)

    # Near overwrites all unselected records so use Append to add the selected records to outputFC
    arcpy.Append_management(inputFC, outputFC)

    # add one to the count for the next iteration
    count = count + 1

    print(count)

# end of while loop - it repeats on a new set of selected records until count reaches 26

print("Done")
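If you want to cross-check the numbers without ArcGIS, the per-window logic can be sketched in plain Python.  This is not the script above, just a stand-in under a few assumptions: the records are already in time order, the coordinates are projected (metres), and each record carries a window number.  The function name and sample values are hypothetical.  One difference to note: this sketch reports 0.0 for the first point of each window, rather than the -1 that showed up in my manual test.

```python
import math
from collections import defaultdict


def distances_from_window_start(records):
    """records: (window, x, y) tuples in time order.

    Returns (window, x, y, distance) tuples, where distance is measured
    from the first point recorded in that window.
    """
    by_window = defaultdict(list)
    for window, x, y in records:
        by_window[window].append((x, y))

    results = []
    for window in sorted(by_window):
        points = by_window[window]
        x0, y0 = points[0]  # first location recorded in this window
        for x, y in points:
            results.append((window, x, y, math.hypot(x - x0, y - y0)))
    return results


# Hypothetical projected coordinates in metres
sample = [(1, 0.0, 0.0), (1, 3.0, 4.0), (2, 10.0, 10.0), (2, 10.0, 22.0)]
out = distances_from_window_start(sample)
print([row[3] for row in out])  # [0.0, 5.0, 0.0, 12.0]
```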

This worked pretty nicely, I must say, and was pretty basic.  In the output you can see the added columns that show the distances from OBJECT_ID = 1 (from an exported CSV file – distances are in metres):

I’d love to say that this was the end of it, but sadly, just after I sent this off, I revisited the original email and realised to my horror (I was going to put in a link to Colonel Kurtz’s horror speech at the end of Apocalypse Now, but that was a bit too much), that I hadn’t actually done what I was supposed to do…  I hope there were at least a few of you out there who had a nagging suspicion about this from early on, but I wasn’t among them.  Such hubris…I thought I knew what they wanted.  At issue here is the windows themselves.  What I did above was windows that were set up as:

  • window 1 = day 1 and day 2
  • window 2 = day 3 and day 4
  • window 3 = day 5 and day 6
  • etc

What was wanted was:

  • window 1 = day 1 and day 2
  • window 2 = day 2 and day 3
  • window 3 = day 3 and day 4
  • etc

Doh!  My mistake.  Big sigh.
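To make the difference concrete, here is a small sketch in plain Python that lists the day numbers in each window under both schemes.  The function names are mine, and it assumes whole, consecutively numbered days, which is a simplification of the real timestamps.

```python
def block_windows(n_days, width=2):
    """Back-to-back windows: (1, 2), (3, 4), (5, 6), ... (what I built)."""
    return [tuple(range(s, s + width)) for s in range(1, n_days - width + 2, width)]


def sliding_windows(n_days, width=2):
    """Windows that advance one day at a time: (1, 2), (2, 3), (3, 4), ... (what was asked for)."""
    return [tuple(range(s, s + width)) for s in range(1, n_days - width + 2)]


print(block_windows(6))    # [(1, 2), (3, 4), (5, 6)]
print(sliding_windows(6))  # [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
print(len(block_windows(50)), len(sliding_windows(50)))  # 25 49
```

With sliding windows most days fall into two windows, so a single Window attribute per record can no longer describe the grouping.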

Well I can see what the next post will be about…better get back to work…

This was fun while it lasted.  And my only solace is that Python did exactly what I told it to do, frustratingly so.  When things go wrong with scripts, I find that it’s usually not Python that’s made the mistake, but the Python user.  Tell that to your nearest and deerest ones next time you’re coding.

C