Keeping Your Feet Dry with Mosaic Datasets
Mosaic datasets can be used to store and manage a collection of raster grids. We demonstrate this by downloading some high resolution DEMs and then stitching them together in a mosaic dataset. (Updated May 2019 to point out how to fix any problems from the start.)
Even as we speak, students in ERST202 are engaged in a fairly big project looking at the impacts of potential sea level rise on the New Zealand coast. We’re modelling the effects of a 2 m rise in sea level and, as you might imagine, a key first step is trying to determine the areas that would be directly affected. For this we need some good elevation data, handily provided by some digital elevation models (DEMs). While we’ve got a national scale DEM with a grid cell size (resolution) of 25 m by 25 m, this is pretty coarse for our purposes.
Top of the list is a 1 m DEM for Christchurch and Selwyn derived from LiDAR data – sweeeeeeet! All up, this DEM is 2.7 GB in size:
After signing in, they zoomed into their area of interest, which was Sumner/Godley Head:
The LINZ Data Service allows one to crop a smaller area out by clicking on the crop button and drawing a box around the area to be cropped (I’m clipping out a smaller area than they needed just for demonstration purposes):
This gets the file size down to 5.1 MB, much more manageable. When the download button is clicked, you’ve got a choice of format and coordinate systems:
We try and keep all of our data in New Zealand Transverse Mercator (NZTM) so that’s a good choice for the projection. For rasters we could download them as TIFFs or ASCII files – I’ll go with TIFFs for this but maybe we’ll talk about ASCII files another day. The data get packaged up in a zip file and once that’s been downloaded and unzipped, I can add the DEMs to a map:
Well. That’s interesting. It’s certainly not what we’re used to when looking at DEMs – it’s clearly tiled. When downloading, the data service automatically tiles the DEM so in this case we’ve ended up with nine smaller grids. It would certainly be nice to somehow group these grids together rather than have to treat each grid separately. And this is where mosaic datasets come in.
These are data structures that allow you to store and manage collections of raster data layers. Once a dataset is set up, it’s as if the grids have been stitched together and treated as one whole. Mosaic datasets must reside in a geodatabase so we have to start there. To get this process going, I created a geodatabase:
First I’ve got to create the mosaic dataset. This is done by right-clicking on the geodatabase and going to New > Mosaic Dataset:
A dialogue box takes us through creating the dataset by giving it a name and setting its coordinate system:
After clicking OK we’ve got a new, empty mosaic dataset. Next we need to add some rasters by right-clicking on the dataset name and going to Add Rasters…
Since mosaic datasets are geared towards rasters, the options under “Raster Type” feature lots of satellite image formats. For our purposes, “Raster Dataset” will work fine. For the “Input Data” box, we can point the tool at a folder that holds all the rasters rather than having to add each one separately. Clicking OK adds the rasters to the dataset – here’s what it looks like when I add the mosaic dataset to my map:
Wait…hang on…where are my data? Something’s weird: the maximum/minimum values don’t make sense, ranging from 3.40282e+038 to -3.40282e+038. That’s just bizarre. But all is not lost. The dataset is actually fine and can be used for analysis straight away. It’s just not displaying correctly (and I admit to being stumped by this in lab only yesterday). To fix this, we need to build “overviews”, which help the grid (now called an image in the mosaic dataset) redraw faster. We can do this by right-clicking on the dataset in Catalog and going to Optimize > Build Overviews:
No need to change any settings in the tool.
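A quick aside on those odd minimum and maximum values: 3.40282e+038 matches the largest number a 32-bit floating-point value can hold. A plausible reading, though only a guess, is that the layer is showing the limits of the pixel type because no real statistics are available yet. The sketch below uses plain Python to recover that limit; the helper name is made up for illustration.

```python
import math
import struct

# 0x7F7FFFFF is the bit pattern of the largest finite IEEE 754
# single-precision (32-bit) floating-point value.
FLOAT32_MAX = struct.unpack(">f", bytes.fromhex("7f7fffff"))[0]


def looks_like_float32_limits(minimum, maximum, rel_tol=1e-5):
    """Hypothetical helper: True if a layer's min/max are just the float32 limits."""
    return (math.isclose(maximum, FLOAT32_MAX, rel_tol=rel_tol)
            and math.isclose(minimum, -FLOAT32_MAX, rel_tol=rel_tol))


print(f"{FLOAT32_MAX:.5e}")                                # 3.40282e+38
print(looks_like_float32_limits(-3.40282e+38, 3.40282e+38))  # True
```

If a layer reports exactly this range, it is probably worth checking whether statistics have been calculated before trusting the display.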
Click OK and off we go – here’s the result:
Now that’s more like it! Notice how the mosaic dataset also shows the green footprints of each input tile and a purple boundary line (these can be turned on and off). The mosaic dataset can also be used as input to geoprocessing tools, just like a single raster. So this grid is now good to go – here’s a slope layer derived from the dataset:
(Finally, some colour…!) The grid can now also be used to identify all those areas affected by a 2 m rise in sea level (amongst other things). Doing this with nine tiles isn’t such a big deal, but stitching them together has already simplified our analysis. Imagine if we were using the whole of the original grid, which could easily contain over 1,000 individual tiles. And that’s where the real power of this data structure comes in – managing large amounts of raster data (DEMs, satellite images, aerial photos) as if they were one, fast-acting grid. It will also relieve that occasional heartburn (not really). And hopefully this particular one will help us keep our feet dry in the coming years.
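To give a feel for what “identifying the affected areas” can mean, here is a minimal sketch of the simplest possible approach, sometimes called a “bathtub” model: flag every cell at or below the new sea level. It is only an illustration in plain Python with made-up elevations, not the method used in the course, and it ignores things like whether a low-lying cell is actually connected to the sea.

```python
def flooded_mask(dem_rows, sea_level_rise_m=2.0):
    """Return a grid of booleans: True where elevation is at or below the new sea level."""
    return [[z <= sea_level_rise_m for z in row] for row in dem_rows]


def flooded_area_m2(mask, cell_size_m=1.0):
    """Total flagged area, assuming square cells of the given size."""
    flagged_cells = sum(cell for row in mask for cell in row)
    return flagged_cells * cell_size_m ** 2


# Made-up elevations in metres, standing in for a tiny corner of a 1 m DEM.
dem = [
    [0.4, 1.2, 2.6],
    [1.9, 2.0, 3.1],
    [5.0, 7.5, 9.3],
]

mask = flooded_mask(dem)
print(mask[0])                # [True, True, False]
print(flooded_area_m2(mask))  # 4.0
```

With a mosaic dataset the same test runs across all the tiles at once rather than tile by tile, which is the whole point of stitching them together.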
Getting it right at the start
The steps above show how to fix the problem after you’ve created the mosaic dataset. It’s better to get it right from the start, and here’s how. When you’re at the point of adding the rasters using the Add Rasters to Mosaic Dataset tool, scroll down to the bottom of the window and expand the Raster Processing section. Tick the Calculate Statistics and Build Raster Pyramids boxes. Under Mosaic Post-processing, tick Build Thumbnails and Update Overviews as shown below. This should allow your new dataset to display nicely.
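For anyone who would rather script this than click through dialogue boxes, here is a minimal sketch of the same settings using arcpy. It assumes an ArcGIS Pro Python environment (arcpy is not available elsewhere), and the folder, geodatabase and dataset names are hypothetical. The keyword names and values follow the Add Rasters To Mosaic Dataset tool reference as I understand it, so check them against the documentation for your version before relying on them.

```python
from pathlib import Path

NZTM_EPSG = 2193  # EPSG code for NZGD2000 / New Zealand Transverse Mercator 2000


def add_rasters_options(tile_folder):
    """Keyword arguments mirroring the boxes ticked in the Add Rasters dialogue."""
    return {
        "raster_type": "Raster Dataset",
        "input_path": str(tile_folder),                  # folder holding the TIFF tiles
        "calculate_statistics": "CALCULATE_STATISTICS",  # Raster Processing
        "build_pyramids": "BUILD_PYRAMIDS",              # Raster Processing
        "build_thumbnails": "BUILD_THUMBNAILS",          # Mosaic Post-processing
        "update_overviews": "UPDATE_OVERVIEWS",          # Mosaic Post-processing
    }


def build_dem_mosaic(gdb_folder, tile_folder,
                     gdb_name="dem_tiles.gdb", mosaic_name="dem_mosaic"):
    """Create a geodatabase and mosaic dataset, then add the tiles (names are hypothetical)."""
    import arcpy  # imported here because it only exists inside an ArcGIS install

    gdb_path = Path(gdb_folder) / gdb_name
    if not arcpy.Exists(str(gdb_path)):
        arcpy.management.CreateFileGDB(str(gdb_folder), gdb_name)

    # Empty mosaic dataset in NZTM, matching the downloaded tiles.
    arcpy.management.CreateMosaicDataset(
        str(gdb_path), mosaic_name, arcpy.SpatialReference(NZTM_EPSG))

    mosaic_path = str(gdb_path / mosaic_name)
    arcpy.management.AddRastersToMosaicDataset(
        mosaic_path, **add_rasters_options(tile_folder))
    return mosaic_path
```

Calling something like `build_dem_mosaic(r"C:\data\sea_level", r"C:\data\sea_level\tiles")` (made-up paths) should then leave you with a mosaic dataset that has statistics, pyramids, thumbnails and overviews in place from the outset.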