Python Script to Monitor Shapefiles

In an ideal world, the primary, most up-to-date spatial layers are accessed and maintained on an enterprise spatial database. Here on the little blue planet, a lot of folks still like to edit and access data as shapefiles.

I can’t say I blame them. Despite doing spatial DBA work on Postgres/PostGIS and SDE/SQL Server for years, I have a soft spot for shapefiles. For small to mid-size layers they draw faster, they’re easy to consume, easy to exchange, and they’re the only open format ESRI has. Just the inexplicable delay between double-clicking an SDE connection and seeing a list of layers makes me want to kick something. But having shapefiles that are out of sync with your enterprise database server is bad for everybody.

Here’s a little Python script I threw together to keep track of shapefile changes, which we use to let us know when SDE layers could be out of sync. It reads a comma-delimited text file with one line per layer, formatted <sde layer>, <path to shapefile with no extension>, and finds the more recent of the .shp and .dbf modification times. It can then email a report or write it to a file, highlighting shapefiles changed within a set number of days of the present date and shapefiles that have vanished (as they sometimes do).

import os
import datetime
import time
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# get file date
def modification_date(filename):
    return os.path.getmtime(filename)

# see if file exists
def file_exists(filename):
	return os.path.exists(filename)

# convert system time to date
def convert_time(t):
	datestamp =  datetime.datetime.fromtimestamp(t)
	date_str = datestamp.strftime("%Y-%m-%d %H:%M:%S")
	return date_str

def process(list_of_lines):
	data_points = []
	for line in list_of_lines:
		# skip blank lines (e.g. a trailing newline at the end of the input file)
		if not line.strip():
			continue
		# split file line into sde name and shapefile path, trimming whitespace
		parts = line.split(",")
		sde_name = parts[0].strip()
		shape_path = parts[1].strip()
		# check to see if file exists
		if file_exists(shape_path + ".shp"):
			# get the latest time stamp for the shape file
			timestamp = modification_date(shape_path + ".shp")
			# the .dbf can be missing; only compare when it is there
			if file_exists(shape_path + ".dbf"):
				timestamp = max(timestamp, modification_date(shape_path + ".dbf"))
			timestamp_rpt = convert_time(timestamp)
		else:
			timestamp_rpt = "File be gone!"
			timestamp = 0
		data_point = [timestamp, sde_name, shape_path, timestamp_rpt]
		data_points.append(data_point)
	return data_points

# create web page
def create_output(data, timep) :
	table = ""
	for rec in data :
		if rec[0]  >  (time.time() - timep) :
			table += "<tr class='highlight'><td>" + rec[3] + "</td><td>" + rec[1] + "</td><td>" + rec[2] + "</td></tr>"
		elif  rec[0] == 0 :
			table += "<tr class='highlight2'><td>" + rec[3] + "</td><td>" + rec[1] + "</td><td>" + rec[2] + "</td></tr>"
		else :
			table += "<tr><td>" + rec[3] + "</td><td>" + rec[1] + "</td><td>" + rec[2] + "</td></tr>"
	output = "<html><head><style>table { border: 2px solid black}  td {border: 1px solid gray } .highlight { background-color: yellow } .highlight2 { background-color: red }</style></head><body><table>"
	output += "<tr><th>LAST EDITED</th><th>SDE LAYER</th><th>SHAPE FILE</th></tr>"
	output += table
	output += "</table></body></html>"
	return output

# create file
def write_file(f, data) :
	# the with block closes the file even if the write fails
	with open(f, "w") as out:
		out.write(data)
	return 0

# Email
def mail(serverURL=None, sender='', to='', subject='', text=''):
    msg = MIMEMultipart('alternative')
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = to
    ptext = "Python Shapefile Checker Report"
    html = text
    part1 = MIMEText(ptext, 'plain')
    part2 = MIMEText(html, 'html')
    msg.attach(part1)
    msg.attach(part2)
    mailServer = smtplib.SMTP(serverURL)
    mailServer.sendmail(sender, to, msg.as_string())
    mailServer.quit()

#########################################################
# Customize script here
#########################################################
input_file = "shape_check.txt"     # path to the file containing input parameters,  format:  sde name, path to shape file (no file extension)
output_file = "shape_check.htm"    # where to put output file when finished (optional)
time_period = 30 * 86400  # number of days x 86400 to highlight (yellow if modified within last x days)
email_address = "you@yourcompany.com"    #  where to send email report (optional) (NO HOTMAIL)

# open file
f = open(input_file, "r")
data = f.readlines()
f.close()

# process data
datareturn = process(data)

# sort return
datareturn_srt = sorted(datareturn, reverse=True)

# create web page
output = create_output(datareturn_srt, time_period)

# move output to file and launch web browser (optional)
write_file(output_file, output)
os.system(output_file)

# email result (optional)
mail("email_server_name", email_address, email_address, "Shape File Monitor",  output)
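For anyone adapting the script, here is a minimal sketch of just the input handling: reading the "<sde layer>, <path>" lines and picking the newer of the .shp and .dbf timestamps. The helper names parse_layer_list and latest_mtime, the layer names, and the fixed timestamps are all made up for illustration and are not part of the script above. It assumes a missing .shp means the shapefile is gone and that a missing .dbf should simply be ignored.

```python
import os
import tempfile


def parse_layer_list(lines):
    """Yield (sde_layer, shapefile_base) pairs, skipping blank or malformed lines."""
    for line in lines:
        line = line.strip()
        if not line or "," not in line:
            continue
        # Split on the first comma only, so commas inside the path survive.
        layer, base = line.split(",", 1)
        yield layer.strip(), base.strip()


def latest_mtime(base):
    """Return the newest mtime of base.shp / base.dbf, or None if the .shp is gone."""
    if not os.path.exists(base + ".shp"):
        return None
    times = [os.path.getmtime(base + ext)
             for ext in (".shp", ".dbf")
             if os.path.exists(base + ext)]
    return max(times)


# Usage with throwaway files (hypothetical layer names and timestamps).
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "parcels")
    for ext, mtime in ((".shp", 1000000000), (".dbf", 1000000100)):
        with open(base + ext, "w"):
            pass
        # Force a known modification time so the result is predictable.
        os.utime(base + ext, (mtime, mtime))
    sample = [
        "GIS.PARCELS, " + base + "\n",
        "\n",
        "GIS.ROADS, " + os.path.join(tmp, "roads") + "\n",
    ]
    report = [(layer, latest_mtime(path)) for layer, path in parse_layer_list(sample)]

print(report)
```

The parcels layer reports the .dbf time because it is the newer of the two, and the roads layer comes back as None because no roads.shp exists, which is the "File be gone!" case in the report.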

6 Responses to Python Script to Monitor Shapefiles

  1. maning says:

    excellent post! One quick question.
    Is there a way to compare changes within two shapefiles? Some sort of “geo-diff”? Something that shows changes in geometry and attributes.

  2. Great! Thanks for sharing. Here is my own foray into using Python with ESRI. Fun stuff.

    http://wpdupre.blogspot.com/2010/06/esri-and-python.html

  3. Fuzzy says:

    If you have ArcGIS Desktop, you could do Data Management Tools – Data Comparison – Feature Compare in toolbox.

    From the command line, there’s an old library you can try compiling here (I think they have a compiled version for windows bundled with ms4w):
    http://www.obviously.com/gis/shpdiff/

    The easiest thing to do is drop the data into PostgreSQL/PostGIS or SpatiaLite. Then you can just do a SELECT * FROM foo EXCEPT SELECT * FROM moo; (I think that should work with geometry columns – haven’t tried it).

  4. chaipat says:

    Great example Python script. Thanks for your post.

  5. Sean Gillies says:

    Python — is there anything it can’t do?

    Your formatted time is so close to ISO/RFC 3339 you could use datestamp.isoformat(). For a diff: JSON-ify your features to dicts (GDAL, and I hear the new ArcPy can do this) and do something like http://code.activestate.com/recipes/576644-diff-two-dictionaries/.
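
As a rough illustration of the dictionary-diff idea from the comments, here is a minimal sketch that compares two sets of features once they have already been read into plain dicts. Reading the shapefiles (with GDAL or anything else) is left out on purpose. The function name diff_records, the feature ids, and the sample attributes are hypothetical, and geometry is assumed to be carried as WKT text so it compares like any other field.

```python
def diff_records(old, new):
    """Compare two {feature_id: attributes} dicts.

    Returns (added, removed, changed): ids only in new, ids only in old, and
    a dict of id -> {field: (old_value, new_value)} for fields that differ.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = {}
    for fid in set(old) & set(new):
        fields = set(old[fid]) | set(new[fid])
        delta = {f: (old[fid].get(f), new[fid].get(f))
                 for f in fields
                 if old[fid].get(f) != new[fid].get(f)}
        if delta:
            changed[fid] = delta
    return added, removed, changed


# Made-up features standing in for two versions of the same layer.
before = {
    1: {"name": "Main St", "geom": "LINESTRING (0 0, 1 1)"},
    2: {"name": "Oak Ave", "geom": "LINESTRING (2 2, 3 3)"},
}
after = {
    1: {"name": "Main Street", "geom": "LINESTRING (0 0, 1 1)"},
    3: {"name": "Elm Rd", "geom": "LINESTRING (4 4, 5 5)"},
}

added, removed, changed = diff_records(before, after)
print(added, removed, changed)
# [3] [2] {1: {'name': ('Main St', 'Main Street')}}
```

Comparing WKT as text is strict: a change in vertex order or coordinate precision shows up as a difference even when the shape is effectively the same, so a real geo-diff would probably want a geometry-aware comparison.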
