We follow up a previous post and attempt to clean up a spatial mess of our own creation.

Previously, we looked at a script that calculates distances between deer locations.  Things were going along swimmingly until I realised that I hadn’t quite done what was asked.  That knock on my door was Deer A, just wanting to remind me:

I’m currently in therapy.

As a quick reminder, we’ve got a dataset of GPS locations for an individual deer over a 50-day period.

Using 2-day sliding windows, what’s needed is to calculate distances from the location at the start of each sliding time window to all the other locations within that window.  My error first time around was that I set up these sliding windows as:

  • window 1 = day 1 and 2
  • window 2 = day 3 and 4
  • etc

whereas what was needed was:

  • window 1 = day 1 and 2
  • window 2 = day 2 and 3
  • etc

(There are good reasons for this…but I won’t elaborate here.)
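To make the difference between the two setups concrete, here is a minimal sketch in plain Python (no arcpy involved).  The function names are made up for illustration and the six-day range is just an example, not the real data.

```python
def consecutive_windows(n_days):
    """Non-overlapping pairs of days: (1, 2), (3, 4), ... (the first, mistaken setup)."""
    return [(day, day + 1) for day in range(1, n_days, 2)]


def sliding_windows(n_days):
    """Overlapping pairs of days: (1, 2), (2, 3), ... (what was actually asked for)."""
    return [(day, day + 1) for day in range(1, n_days)]


print(consecutive_windows(6))  # [(1, 2), (3, 4), (5, 6)]
print(sliding_windows(6))      # [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]
```

Notice that in the second list every day except the first and last turns up in two windows, which is where the rest of the trouble comes from.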

At first, crestfallen, I thought this was going to be a(nother) nightmare but, happily, it turned out to not be a huge deal.

You’ll recall that one of the issues with the workflow was that when working with a layer with selected features, the Near tool sets all non-selected feature distances to -1.  In the first go, I handled this by writing the results of each iteration to a new feature class, building it up as the while loop runs.

That issue hasn’t gone away but it’s slightly more complicated in that I’m now going to have overlapping distances.  What I mean is that each point (except for the first day) will be in two different sliding windows, so I need to somehow craft this so I don’t end up overwriting the distances from the previous window.  I’m envisioning that in the end, my output table will need to have two NEAR_DIST fields, one for each sliding window.   Make sense?

I’m pretty happy with the guts of my first script – but there are some issues, particularly:

  • My first version used the Window attribute to make selections on my two input layers.  That’s not going to work this time – I’ll need to recast that attribute so that it groups each day together – a window should now be the starting day and the one that follows (day and day + 1).  This also means I’m going to need two separate whereClauses – one for the first day layer and another to select the windows (day and day + 1) from all the records.  Let’s call this the selection problem.
  • With the overlapping days I need some way to keep a copy of the calculated distances before they get written over in the next iteration.  Is that an output layer for each window?  Or maybe some smaller number, at least two?  Let’s call this the output problem.

The Selection Problem

First bit here is to group my data not by window but by day.  I was just about to bite the bullet and start to do this manually, but Sara from ERST607 had a few spare moments (ha!) and whipped up a Python script to do this (yes, we’re getting her some help…).  Thanks Sara!  This was a nifty script that used Python’s datetime library to do some grouping and added in a day counter attribute called “Day_id”.  This did a great job of setting up a layer which could be used for this script.  Below are my two input layers:

You can see the Day_id attribute for each first day (left) and each window (right).  I’ll use that to my advantage in making rolling selections.
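Sara’s script isn’t reproduced here, so what follows is only a rough sketch of how a day counter like Day_id could be built with the datetime library.  The function name, the timestamp format and the sample timestamps are all assumptions made up for illustration.

```python
from datetime import datetime


def add_day_ids(timestamps):
    """Return a 1-based day counter for each timestamp, counted from the earliest date."""
    # Assumed timestamp format; the real data may well be stored differently.
    dates = [datetime.strptime(t, "%Y-%m-%d %H:%M").date() for t in timestamps]
    first_date = min(dates)
    return [(d - first_date).days + 1 for d in dates]


# Made-up GPS fix times, not real deer data.
fixes = ["2020-03-01 06:00", "2020-03-01 18:00", "2020-03-02 06:00", "2020-03-04 06:00"]
day_ids = add_day_ids(fixes)
print(day_ids)  # [1, 1, 2, 4]
```

In this version a day with no fixes leaves a gap in the numbering (there is no day 3 above).  Whether that is the right behaviour depends on how the windows should treat missing days.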

Ivan from 310 also did a bit of work with pivot tables that helped the cause.  He’s currently in therapy, too.

In the original version of this script I set up a whereClause:

(In this syntax, “count” gets inserted into the curly brackets, helpful as it changes every iteration.)

This was then used twice to select the right records in each layer for Near:

That’s not going to work here as I need to select the Day_id as well as Day_id + 1 in the inputFC.  I can use my count variable to make this work in a new version with two whereClauses:

See how “count” and “count + 1” pick up both days?  It’s also necessary to use “Or” for this to work properly.  (An “And” would not select anything.  Why?)  These then go to work in two Select by Attribute lines:
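The original lines aren’t reproduced in this text, so here is a minimal stand-in in plain Python rather than the actual arcpy calls.  It assumes the clauses are built with format() and curly brackets as described above; the function names and the little list of records are made up for illustration.

```python
def first_day_clause(count):
    """Clause for the first-day layer: just the starting day."""
    return "Day_id = {0}".format(count)


def window_clause(count):
    """Clause for the full layer: the starting day Or the day after it."""
    return "Day_id = {0} Or Day_id = {1}".format(count, count + 1)


# Made-up records standing in for the GPS fixes: (OBJECTID, Day_id).
records = [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 4)]

count = 2
# Plain-Python equivalents of selecting with "Or" and with "And".
with_or = [oid for oid, day in records if day == count or day == count + 1]
with_and = [oid for oid, day in records if day == count and day == count + 1]

print(first_day_clause(count))  # Day_id = 2
print(window_clause(count))     # Day_id = 2 Or Day_id = 3
print(with_or)                  # [3, 4, 5]
print(with_and)                 # []
```

The two list comprehensions only mimic what the selections would return; in the real script the clause strings would be handed to the selection tool instead.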

Selection problem solved – we’re done here.

The Output Problem

I fretted over this one a bit.  In the end, I kept it simple by recognising that my original script worked fine so long as the windows were consecutive (1&2, 3&4, 5&6, etc).  Then it occurred to me that I could take advantage of there being windows that start on odd days and windows that start on even days.  Within each of those two groups the windows don’t overlap, so I only need one set of input and output layers.  A relatively easy way to do this is to use the modulo operator, a somewhat obscure cousin of +, -, * and /.  The modulo operator returns the remainder of a division operation.  For example, if I divide 13 by 5, I get 2 (5 x 2 = 10) with a remainder of 3, which is what you get when you use modulo.  In code, that might look something like:

13 modulo 5 = 3 or, in Python, 13 % 5 returns 3.

I’ll use this to my advantage as an even number mod 2 will always return 0.  I’ll use this with an if statement to branch which layer Append sends the output to, which I can later merge.
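If you want to try modulo out for yourself, this small sketch runs in any Python session.  The variable names are just for illustration.

```python
# divmod returns the whole-number result and the remainder together.
quotient, remainder = divmod(13, 5)
print(quotient, remainder)  # 2 3

# The % operator returns only the remainder.
print(13 % 5)  # 3

# Any even number mod 2 is 0 and any odd number mod 2 is 1.
parity = [(day, day % 2) for day in range(1, 7)]
print(parity)  # [(1, 1), (2, 0), (3, 1), (4, 0), (5, 1), (6, 0)]
```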

To implement all this I first needed to set up two output layers – each with the same set of attributes as my inputs but with no actual features in them.  In Pro I simply created two copies of a layer with the right attributes (DeerAOutOdd and DeerAOutEven) in the project geodatabase and deleted all their records.

In the code I then set up variables to refer to these:

Inside the while loop, after Near runs, I use modulo on the count variable in an if statement to determine if it’s odd or even.  If it’s even, send the selected records to DeerAOutEven.  If not, send them to DeerAOutOdd:

“if count % 2 == 0:” does the deciding – it has to be “==” as this is how we do comparisons in Python.  A lone “=” is used to set the value of a variable, which doesn’t really make any sense in this context.  If I were to translate this line, it might read: “If you divide the value of count by two and the remainder is 0”.  If that’s true, the code executes the next line.  If not, it runs the else line.
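Here is the branching idea on its own as a plain-Python sketch, with two lists standing in for DeerAOutOdd and DeerAOutEven.  The loop limit of 5 and the dictionary keys are made up for illustration; in the real script the Near and Append tools do the actual work.

```python
out_odd = []   # stand-in for DeerAOutOdd
out_even = []  # stand-in for DeerAOutEven

n_windows = 5  # illustrative only
count = 1
while count <= n_windows:
    # Stand-in for the records selected and measured in this window.
    window_result = {"window_start": count, "days": (count, count + 1)}
    if count % 2 == 0:
        out_even.append(window_result)  # even starting day
    else:
        out_odd.append(window_result)   # odd starting day
    count += 1

print([w["days"] for w in out_odd])   # [(1, 2), (3, 4), (5, 6)]
print([w["days"] for w in out_even])  # [(2, 3), (4, 5)]
```

Within each list no day appears twice, so nothing gets written over.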

Output sussed.  We’re ready to run.

I’ll include the whole script at the bottom of this post for the gluttons amongst you (please seek help).  Happily, this all worked pretty well.  There was a final table join to do (using the OBJECT_ID attribute) to bring the two sets of data together into one layer which was then exported as an Excel sheet and sent off for review:

You can see the Day_id values here (column H).  The odd windows are highlighted in blue – as expected, the first day only has one set of distances.  The even windows are highlighted in greenish at right.  I think we can now deliver what was asked of us.  With a bit more work, I could put this script to work on the other nine sets of data when they come in and everyone’s (hopefully) happy.  I’m still not sure this is the best format for my end user, but in any event I’ve got the results and anything else is about repackaging them.
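The join described above was done with a table join in Pro, but the logic can be sketched in plain Python.  Everything here is an assumption for illustration: the function name, the output field names and the distances are made up, and the dictionaries are keyed by object ID.

```python
def join_by_id(odd_rows, even_rows):
    """Combine two {object id: distance} tables into one row per object id."""
    joined = {}
    for oid in sorted(set(odd_rows) | set(even_rows)):
        joined[oid] = {
            "NEAR_DIST_ODD": odd_rows.get(oid),    # None if not in an odd window
            "NEAR_DIST_EVEN": even_rows.get(oid),  # None if not in an even window
        }
    return joined


# Made-up distances; object 1 is on the first day so it only has an odd-window value.
odd_rows = {1: 0.0, 2: 35.2, 3: 80.4}
even_rows = {2: 0.0, 3: 51.7}

table = join_by_id(odd_rows, even_rows)
for oid, row in table.items():
    print(oid, row["NEAR_DIST_ODD"], row["NEAR_DIST_EVEN"])
```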

It’s been worthwhile to spend the time writing this script given 1) the number of times I’ll need to rerun it on different data and 2) the sheer number of times it has to loop for each animal – 52 for DeerA so possibly around 520 times to do this repetitive task.  That alone was worth the price of admission.  Plus I got two blog posts out of it!  (Ed. if only you were paid by the post, you’d be rich by now.)  

As with any of my scripts, I’m sure someone who really knew what they were doing could put together something much more elegant and efficient.  I’ll get there one day.


Full script here with one comment.  I hard coded the number of times the while loop would run (52).  The 52 comes from the number of windows, which is essentially the number of days.  As noted in the comments, I could have used something like GetCount to find the total number of days automatically each time it starts – something I’ll probably do when this gets scaled up to do the other nine animals.  One less thing for me to stuff up.
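As a rough illustration of not hard coding the loop limit, this plain-Python sketch works it out from the data instead.  It assumes the Day_id values are available as a simple list (the values below are made up) and it doesn’t use GetCount or any other arcpy call.

```python
# Made-up Day_id values, one per GPS fix.
day_ids = [1, 1, 2, 2, 2, 3, 4, 4]

# One window per starting day, so the largest Day_id sets the loop limit.
n_windows = max(day_ids)

count = 1
window_starts = []
while count <= n_windows:
    window_starts.append(count)  # the real script would select and run Near here
    count += 1

print(n_windows)      # 4
print(window_starts)  # [1, 2, 3, 4]
```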

(If you’ve read this far you should seek immediate help.)