GDAL to The Rescue

Recently I had an interesting project come up.

A customer wanted to show data from his rain gauges, both current event values and past event values. No problem. Hit the database, drop points and values on a mashup, Bob’s your uncle.

But he also wanted an interpolated raster created for the data (heat map, kriging, etc.). On the fly. Ouch. The customer noted they hadn’t seen this on other web sites of this kind.

For a good reason, I thought. Every time you generated a raster you’d be beating your server with a rubber mallet. You can set up the current stuff on a timer, but queries of past data would have to be generated upon request.

Pulling my best Scotty* impression, I went about trying to solve this problem. I looked at ESRI software, but quickly ruled out their software suite as being too slow for on-the-fly raster interpolation. I wanted to get under 5 seconds per throw (hoping for as little web traffic as possible), and I couldn’t cobble together an ESRI solution that could get close.

Enter GDAL (pronounced Goodle). From the web site:

GDAL is a translator library for raster geospatial data formats that is released under an X/MIT style Open Source license by the Open Source Geospatial Foundation. As a library, it presents a single abstract data model to the calling application for all supported formats. It also comes with a variety of useful commandline utilities for data translation and processing.

The commandline utilities bit was what I was interested in. In particular, gdal_grid, which creates a raster grid from scattered XY data using inverse distance, average, or nearest neighbor (moving average) algorithms to interpolate the data.

I made a scratch XYZ file like so:


And a vrt file to tell the command how to interpret the data:

<OGRVRTLayer name=”finsdata”>
<GeometryField encoding=”PointFromColumns” x=”POINT_X” y=”POINT_Y” z=”Value”/>

And then wrote out a simple(ish) command:

gdal_grid -quiet -a invdist:power=2.0:smoothing=1.0:radius1=90000:radius2=90000:max_points=0 -a_srs EPSG:2264 -txe 1383251 1538013 -tye 459814 648739 -outsize 500 700 -a_srs EPSG:2264 -of GTiff -ot Float64 -l finsdata finsdata.vrt finsdata2.tif

Which basically means do an inverse distance interpolation with specific parameters on my NC NAD 83 xy data and toss out a geotiff. The result with some styling applied looks like so:

Which, of course, isn’t how you’d style it, and the test data is totally random, but back off man - I’m demoing here.

~80 data points, a tad under 3 seconds. On my laptop. Mission accomplished. Now I can send this bit of code to one of our much-smarter-than-me developers to do the real work.

I would tell you if you haven’t used GDAL before to go check it out, but you have used it before. ESRI uses GDAL. Pretty much every open source GIS project uses GDAL. You can’t swing a dead chicken around GIS software without hitting GDAL at some point. But what you might not have done before is use it from the command line to automate raster processing. If you have any such needs and blinking command prompts don’t frighten you, check it out. It works with a long list of formats and can do anything from this type of raster interpolation to creating tiles for Google Maps to reprojecting rasters and a million things inbetween.

*“Very funny, Scotty. Now beam down my pants.”