Finding the Nearest House Number in Postgres

I have been advocating we stop maintaining geocoding attributes on our streets file for years. We have an address point file with the points on the freaking buildings. Bibiana McHugh of Portland TriMet gave a great keynote at FOSS4G-NA in Minnesota and made the same point.

One contrary viewpoint I’ve run across has come from Police, as their officers often don’t communicate a house number, they communicate a block number. As in “I’m on the 500 block of Independence and there are zombies here OVER!”. I always thought this was a red herring (in 2013 we have the technology to know where officers and cars are, like, all the time), but after Bibiana’s talk I thought trying to figure out the closest point address to a block number would be an interesting exercise. Turns out it is neither very interesting nor much of an exercise.

Find Nearest House Number

SELECT field1, field2 FROM 
    ( 
        (
            SELECT field1, field2 
            FROM master_address_table 
            WHERE street = 'RUTH' AND street_number >= 5000 
            ORDER BY street_number ASC
            LIMIT 1
        ) 
        UNION ALL 
        (
            SELECT field1, field2 
            FROM master_address_table 
            WHERE street = 'RUTH' AND street_number <= 5000 
            ORDER BY street_number DESC
            LIMIT 1
        ) 
    ) AS addresses
ORDER BY abs(5000 - street_number) 
LIMIT 1;

Here we have a hypothetical call for the 5000 block of Ruth. What we’re doing in the inner queries is first getting one record with a house number >= 5000 on Ruth, ordered by house number with a limit of 1. That gets us either 5000 if it exists or the closest above 5000 if it doesn’t. Next we do house number <= 5000 to get 5000 or the closest number below it. Last we order these two unioned records by their absolute distance from 5000 and use LIMIT 1 to get the closest. Viola, the closest point location to your block number.

Put an index on the house number and this query running against ~500,000 address points takes ~30ms. This could also be useful for autocomplete geocoding, so it comes back with the exact match if the user gets the house number right or the closest one if they don’t, or you can fiddle with the limits to get a broader range of candidates.

*Note: Edited to add ASC and DESC to the order on the inner queries. ASC is implied, but DESC needs to be there. Thanks to Mike Q for the patch!