## Simple PostGIS nearest neighbour function

**Here’s a less generic and slightly different nearest-neighbour function based on Regina’s generic nearest-neighbour function at Boston GIS.**

It follows the same basic idea of using series of enlarging search radii to restrict distance calculations to a manageable subset of things-that-might-be-near. The difference is that it uses a geometric progression of sizes (`x, x * y, x * y^2, x * y^3, ...`

) instead of an arithmetic one (`x, x + y, x + 2y, x + 3y, ...`

).

For some distributions of things-that-might-be-near, and tuned with the right parameters (`x, y`

), this turns out substantially faster (I’ve used it to locate the nearest UK postcode to each mappiness response).

Note that it’s a lot simpler and less generic than the Boston GIS alternative: it returns only one nearest neighbour, without the associated distance, and is also less flexible about what it accepts as input.

create or replace function nn(nearTo geometry , initialDistance real , distanceMultiplier real , maxPower integer , nearThings text , nearThingsIdField text , nearThingsGeometryField text) returns integer as $$ declare sql text; result integer; begin sql := ' select ' || quote_ident(nearThingsIdField) || ' from ' || quote_ident(nearThings) || ' where st_dwithin($1, ' || quote_ident(nearThingsGeometryField) || ', $2 * ($3 ^ $4))' || ' order by st_distance($1, ' || quote_ident(nearThingsGeometryField) || ')' || ' limit 1'; for i in 0..maxPower loop execute sql into result using nearTo -- $1 , initialDistance -- $2 , distanceMultiplier -- $3 , i; -- $4 if result is not null then return result; end if; end loop; return null; end $$ language 'plpgsql' stable; |

**Update, August 2011** — Looking at this again, I realise there was an error in the function originally posted above. Because it did only a simple bounding-box check at each step, it was possible for it to return a geometry that was not in fact the nearest, but could in the worst case be up to 41% (sqrt(2) times) further away than the nearest. The code has been updated to fix this by using `st_dwithin`

. I’ve also tidied it up a bit.

Pingback: Infrastructure Coverage based on Open Data | Free and Open Source GIS Ramblings