Free Software Hulk-Smashes Traffic Spike

When I did my periodic glance at Google Analytics for GeoPortal, I was not expecting to see a traffic spike.

Traffic to that site is generally a predictable sine wave, swelling slightly during the week and receding slightly on the weekends, with an average of ~10-12k visitors (absolute unique IPs) per month. Having been around the block a few times, I went to the Charlotte Observer web site to see what got linked to GeoPortal and why. In this case it was recycling pickup day changes.

Had I not checked the web stats, I never would have known anything was going on. It isn’t just that nothing went wrong during the traffic surge. It’s that the whole event was marked with a deep and profound nothingness. Nothing even slowed down. The whole software stack reeked of thank you sir, may I have another?

I raise a glass to the free software that turned this traffic spike into a non-event: PostgreSQL, PostGIS, and GeoServer. Best in class software, period.

Posted in Opinion & Rant | 8 Comments

Active Directory Authentication in MediaWiki

This is a bit random, but since it took me an embarrassing number of attempts to get this to work I thought I’d post a quick how-to on getting MediaWiki to authenticate to Microsoft Active Directory.

First you’ll need to grab the LDAP Authentication extension for MediaWiki. Place the unzipped LdapAuthentication folder in your MediaWiki installation’s extensions directory.

Next, you’ll need to enable LDAP for PHP in your php.ini file. If you’re running PHP in *SAPI mode make sure to bounce the web server when you’re done, or if you’re running FastCGI just kill all the php instances.

;extension=php_interbase.dll
extension=php_ldap.dll
;extension=php_mbstring.dll
;extension=php_ming.dll

You’ll need to know your AD domain name and at least one AD server name. You can use nslookup to find that stuff out. Now head to your LocalSettings.php configuration file in your main MediaWiki directory and add a section like so:

/* Grab the extension and create a new object. */
require_once( "$IP/extensions/LdapAuthentication/LdapAuthentication.php" );
$wgAuth = new LdapAuthenticationPlugin();

/* Pick a name for your domain
 (it can be anything, and you can have more than one). */
$wgLDAPDomainNames = array(  'mydomain' );

/* Give it a list of AD servers. Note it won't do tree parsing, so you need
 the actual server name(s). */
$wgLDAPServerNames = array(  'mydomain' => 'server1 server2' );

/* Give it a search string. You'll need your actual domain name here.
 Leave USER-NAME alone - it's a place holder. */
$wgLDAPSearchStrings = array( 'mydomain' => 'domain\\USER-NAME' );

/* Encryption type. 'clear' worked for me, but if it doesn't, try 'ssl'. */
$wgLDAPEncryptionType = array( 'mydomain' => 'clear' );

/* The first setting here allows you to also use MediaWiki logins. The dev docs
 say this could cause problems, but I haven't run into any. Set it to false if you
 don't already have MediaWiki logins to support.
 The second setting is really only necessary if you set the first one to true. It
 won't allow local users to log in as domain users (domain passwords are not stored
 by MediaWiki). */
$wgLDAPUseLocal = true;
$wgMinimalPasswordLength = 1;

Finally, if you only want logged in users to be able to edit, drop this in there too:

/* Allow only logged in users to edit. */
$wgGroupPermissions['*']['edit'] = false;
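
If the search string setting looks like magic, here is a rough sketch of what I understand the USER-NAME placeholder to be for: whatever the user types at the login prompt gets swapped in before the bind to AD. This is Python purely for illustration. build_bind_name is a hypothetical helper, not part of the extension, and MYDOMAIN and jsmith are made-up values.

```python
# Hypothetical illustration of the USER-NAME placeholder substitution.
# This is NOT the extension's code; it only shows the idea behind the setting.

def build_bind_name(search_string, username):
    """Replace the USER-NAME placeholder with the name typed at login."""
    return search_string.replace("USER-NAME", username)


if __name__ == "__main__":
    # "MYDOMAIN" and "jsmith" are assumed example values.
    print(build_bind_name("MYDOMAIN\\USER-NAME", "jsmith"))
```

If the bind name that comes out the other end isn't something AD would accept for your account, the search string is the first place to look.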

Hopefully that will save somebody some swearing. YMMV.

Posted in Code | 1 Comment

News Roundup – Bilski, Google Phases Out Windows (Maybe), GeoData.gov

First up in this month’s news roundup is the Supreme Court’s decision on Bilski. The SC took its traditional narrow ruling approach (i.e. “punt”), invalidating Bilski’s patent but not invalidating software patents in general. Process patents are still allowed, and the machine/transformation standard can’t be the only standard applied. The Court did uphold the machine/transformation standard as a useful test, though, and the argument used to toss Bilski could be used to toss many software patents. To make a long story short: process patents will be harder to get but still obtainable, and the validity of software patents will remain an open question until a software-specific patent case makes it to the Supreme Court.

As first reported by the Financial Times, Google is planning to phase out the use of Microsoft Windows. Google has traditionally been a “run whatever you want” shop, but after the Aurora exploit employees are being pushed toward OSX and Linux. One employee was quoted as saying “Getting a new Windows machine now requires CIO approval.” While the security angle seems to be getting the most press, in many ways it’s a smart business move. With Android and Chrome OS, Google is moving more and more into the operating system market. You always want companies to eat their own dog food. Maybe it’ll get the Google Earth Plugin running on Linux a bit faster. I should note that although this has been reported everywhere, to my knowledge Google has still not confirmed the story.

Aaaaaarrgh!

Sean Gorman at Off the Map hit the nail on the head when summarizing the community’s sentiment on the ESRI GeoData.gov contract. If you haven’t been following this, in a nutshell ESRI has a no-bid contract with the Feds to do all of the geo work for data.gov and…wait for it…

…share the data out in proprietary formats and via proprietary APIs. Basically, ArcGIS.com becomes our de facto national Spatial Data Infrastructure.

Aaaaarrrgh!

If I still had my cheap home-built Strat I’d Pete Townshend it on my amp. Well, not my Vox amp. I’m starting to buy into the idea that local open data portals are better at many things than centralized SDI services.

ArcGIS 10 is officially out. Those with maintenance contracts will be getting an email with instructions on how to download it (yes, download it – I can has torrent please?). I can’t say anything else about it until I can lay mitts on it. Hopefully it won’t reject my work machine outright for being 0.2 GHz below the new minimum requirements.

Boston GIS has a great post comparing the spatial features of PostgreSQL/PostGIS, SQL Server 2008, and Oracle 11g. It’s a very detailed analysis of the capabilities of each platform. For my (no) money, it’s PostgreSQL/PostGIS all the way.

And now for a few quick hitters:

  • Google Earth 5.2 has been released, and includes a lot of great new features, like improved GPS tracks, elevation profiles, and a better embedded browser.
  • Opensource.com had a couple of good GIS posts this month. Citizens are involved in a grassroots mapping project on the oil spill. Using weather balloons and kites with cameras attached, they’ve captured and stitched together some of the best and most current aerial photography available.
  • Another opensource.com post talks about open data initiatives at MassDOT, and how the community has used their data to make a lot of useful applications, including real time bus tracking.
  • OpenGeo released Prj2EPSG, an application to convert well-known text (WKT) projection information to EPSG codes. I can finally take the brain cells holding 4326, 2264, 900913, and 102113 and do something useful with them.
  • On the browser front, Mozilla has released the first beta of Firefox 4, and with IE9 preview 3 Microsoft has made a first run at HTML5 canvas support. IE9 preview 3 also upped Acid3 compliance to a more acceptable level. As much as I begrudge saying nice things about IE, IE9 is headed in the right direction.
Posted in News | 1 Comment

Python Script to Monitor Shapefiles

In an ideal world, the primary, most up to date spatial layers are accessed and maintained on an enterprise spatial database. Here on the little blue planet, a lot of folks still like to edit and access data as shapefiles.

I can’t say I blame them. Despite doing spatial DBA work on Postgres/PostGIS and SDE/SQL Server for years, I have a soft spot for shapefiles. For small to mid-size layers they draw faster, they’re easy to consume, easy to exchange, and they’re the only open format ESRI has. Just the inexplicable delay between double-clicking an SDE connection and seeing a list of layers makes me want to kick something. But having shapefiles that are out of sync with your enterprise database server is bad for everybody.

Here’s a little Python script I threw together to keep track of shapefile changes, which we use to let us know when SDE layers could be out of sync. It reads a comma delimited text file which is formatted <sde layer>, <path to shape file with no extension>, finds the most recent date of .shp/.dbf change, and can email a report or write it to a file, highlighting changes within a certain time period from the present date and shapefiles that have vanished (as they sometimes do).

import os
import datetime
import time
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# get file date
def modification_date(filename):
    return os.path.getmtime(filename)

# see if file exists
def file_exists(filename):
    return os.path.exists(filename)

# convert system time to date
def convert_time(t):
    datestamp = datetime.datetime.fromtimestamp(t)
    date_str = datestamp.strftime("%Y-%m-%d %H:%M:%S")
    return date_str

def process(list_of_lines):
    data_points = []
    for line in list_of_lines:
        # split file line into sde name and shapefile path,
        # stripping whitespace and skipping blank or malformed lines
        parts = [p.strip() for p in line.split(",", 1)]
        if len(parts) < 2 or not parts[1]:
            continue
        sde_name, shape_path = parts
        # check to see if file exists
        if file_exists(shape_path + ".shp"):
            # get the latest time stamp for the shape file (.dbf may be missing)
            timestamp = modification_date(shape_path + ".shp")
            if file_exists(shape_path + ".dbf"):
                timestamp = max(timestamp, modification_date(shape_path + ".dbf"))
            timestamp_rpt = convert_time(timestamp)
        else:
            timestamp_rpt = "File be gone!"
            timestamp = 0
        data_point = [timestamp, sde_name, shape_path, timestamp_rpt]
        data_points.append(data_point)
    return data_points

# create web page
def create_output(data, timep) :
	table = ""
	for rec in data :
		if rec[0]  >  (time.time() - timep) :
			table += "<tr class='highlight'><td>" + rec[3] + "</td><td>" + rec[1] + "</td><td>" + rec[2] + "</td></tr>"
		elif  rec[0] == 0 :
			table += "<tr class='highlight2'><td>" + rec[3] + "</td><td>" + rec[1] + "</td><td>" + rec[2] + "</td></tr>"
		else :
			table += "<tr><td>" + rec[3] + "</td><td>" + rec[1] + "</td><td>" + rec[2] + "</td></tr>"
	output = "<html><head><style>table { border: 2px solid black}  td {border: 1px solid gray } .highlight { background-color: yellow } .highlight2 { background-color: red }</style></head><body><table>"
	output += "<tr><th>LAST EDITED</th><th>SDE LAYER</th><th>SHAPE FILE</th></tr>"
	output += table
	output += "</table></body></html>"
	return output

# create file
def write_file(f, data):
    # the with block closes the file even if the write fails
    with open(f, "w") as out:
        out.write(data)
    return 0

# Email
def mail(serverURL=None, sender='', to='', subject='', text=''):
    msg = MIMEMultipart('alternative')
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = to
    ptext = "Python Shapefile Checker Report"
    html = text
    part1 = MIMEText(ptext, 'plain')
    part2 = MIMEText(html, 'html')
    msg.attach(part1)
    msg.attach(part2)
    mailServer = smtplib.SMTP(serverURL)
    mailServer.sendmail(sender, to, msg.as_string())
    mailServer.quit()

#########################################################
# Customize script here
#########################################################
input_file = "shape_check.txt"     # path to the file containing input parameters,  format:  sde name, path to shape file (no file extension)
output_file = "shape_check.htm"    # where to put output file when finished (optional)
time_period = 30 * 86400  # number of days x 86400 to highlight (yellow if modified within last x days)
email_address = "you@yourcompany.com"    #  where to send email report (optional) (NO HOTMAIL)

# open file
f = open(input_file, "r")
data = f.readlines()
f.close()

# process data
datareturn = process(data)

# sort return
datareturn_srt = sorted(datareturn, reverse=True)

# create web page
output = create_output(datareturn_srt, time_period)

# move output to file and launch web browser (optional)
write_file(output_file, output)
os.system(output_file)  # relies on Windows file associations to open the page

# email result (optional)
mail("email_server_name", email_address, email_address, "Shape File Monitor",  output)
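
If you want something a little more paranoid, here is a minimal sketch of two of the steps above: parsing an input line and finding the newest change date. It is a variation, not a drop-in replacement. parse_line and latest_mtime are names I made up for illustration, and checking .shx and .prj alongside .shp and .dbf is my assumption about what you might care about.

```python
import os

# Sidecar files to check. The script above only looks at .shp and .dbf;
# adding .shx and .prj is an assumption made for this sketch.
EXTENSIONS = (".shp", ".shx", ".dbf", ".prj")


def parse_line(line):
    """Split '<sde layer>, <shapefile path>' into a (layer, path) tuple.

    Returns None for blank or malformed lines instead of raising.
    """
    parts = [p.strip() for p in line.split(",", 1)]
    if len(parts) != 2 or not parts[0] or not parts[1]:
        return None
    return parts[0], parts[1]


def latest_mtime(base_path, extensions=EXTENSIONS):
    """Return the newest modification time among the files that exist.

    Returns None when the .shp itself is missing, which is the
    "File be gone!" case in the report.
    """
    if not os.path.exists(base_path + ".shp"):
        return None
    times = [
        os.path.getmtime(base_path + ext)
        for ext in extensions
        if os.path.exists(base_path + ext)
    ]
    return max(times)
```

Splitting on the first comma only means a shapefile path that happens to contain a comma still comes through in one piece.
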
Posted in Code | 6 Comments

Apple: All Your Location Are Belong to US

I ran across this creepy bit of news on Slashdot this morning:

In an updated version of its privacy policy, the company added a paragraph noting that once users agree, Apple and unspecified ‘partners and licensees’ may collect and store user location data. When users attempt to download apps or media from the iTunes store, they are prompted to agree to the new terms and conditions. Until they agree, they cannot download anything through the store.

Steve says you will share the location of your precious with him, and he will share your location with anybody else he wants to, period. It isn’t really a choice, because without app store access, the iPhone is a really expensive paperweight. The data is supposed to be anonymous, but I’m not sure how my real-time position is anonymous in any meaningful way (plus they have to be logging some kind of unique key to track movement from one point to another).

I can hear an argument coming that this is nothing but some LBS boilerplate jargon and that everybody does it. MrHanky on Slashdot had a good quote from Cory Doctorow on the subject (and I intuitively trust anybody with a handle like MrHanky).

This is different from Android, in that Google does not gather your information unless you opt in, and if you do opt in, you can opt out later. By contrast, Apple gathers your information without asking you to opt in, and does not present you with the option of opting out.

What’s more, Apple is presenting these new terms retrospectively. People who bought iPads and iPods on the understanding that they could be used without having their location information gathered and shared now find that they *must* allow this information to be gathered and shared (I suppose you could try not updating iTunes, but then you would also have to not upgrade your OS — OS upgrades come with iTunes upgrades — and be prepared to be locked out of the app store, and since Apple’s use of DRM prevents third parties from putting apps on your devices, you’re fundamentally abandoning any hope of loading any code, even third-party code, onto your iPad and iPod).

I’m always torn on Apple products. On the one hand, they make some of the sexiest hardware out there. On the other hand, the price you pay for that hardware is often quite a bit more than you see on your credit card bill.

Posted in Opinion & Rant | Leave a comment