Geo Parse Digital NZ API Records by Placename Extraction

From Open NZ Wiki


Hi everyone,

Inspired by the weekend bar camp, I have finally got round to releasing the code for the geoparser I wrote a couple of months back. For more information about the process involved, see http://groups.google.com/group/digitalnz/browse_thread/thread/b5b0c96ce08ca441?pli=1 (no point in repeating it here).

The parsing is far from perfect, but it is a start. It would be useful for me to see examples where it has failed (e.g. matching against words that ought to be stopped, or failing to pick out a street name) so I can tweak the algorithm. I think the way to go is a hybrid approach, something that was discussed over the weekend: items are geoparsed automatically and visibly marked as such to the user, and can subsequently be ratified by a human, which is again marked and shown to the user. A useful metric for spotting erroneous geoparsing is the area of the match. If it is massive, the chances are that a word that should have been stopped, such as 'Photographer' or 'Premises', has matched a street somewhere random in the world. An area of 0 may indicate that place names have been missed, either due to misspelling or an error in the algorithm.
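
To make the area check concrete, here is a rough sketch in Python. It is not taken from the geoparser itself: the `Match` type, the `classify` function and the `max_area` threshold of 25 square degrees are all made up for illustration, and the "area" is just a crude latitude/longitude bounding box that ignores map projection and the antimeridian.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Match:
    """A place name matched in a record (hypothetical structure)."""
    name: str
    lat: float
    lon: float


def bounding_box_area(matches: List[Match]) -> float:
    """Area, in square degrees, of the lat/lon box enclosing all matches."""
    if not matches:
        return 0.0
    lats = [m.lat for m in matches]
    lons = [m.lon for m in matches]
    return (max(lats) - min(lats)) * (max(lons) - min(lons))


def classify(matches: List[Match], max_area: float = 25.0) -> str:
    """Flag records whose match area looks suspicious.

    max_area is an assumed threshold and would need tuning on real data.
    """
    area = bounding_box_area(matches)
    if area == 0.0:
        # Nothing matched, or only a single point: place names may have been missed.
        return "zero_area"
    if area > max_area:
        # Probably a stray word matched a street somewhere random in the world.
        return "too_large"
    return "ok"


wellington = Match("Wellington", -41.29, 174.78)
auckland = Match("Auckland", -36.85, 174.76)
stray = Match("Premises", 51.50, -0.12)  # pretend match far from New Zealand

print(classify([wellington, auckland]))         # both in NZ, small box
print(classify([wellington, auckland, stray]))  # box spans half the globe
print(classify([wellington]))                   # single point, no area
```

Records flagged this way would be good candidates to put in front of a human first under the hybrid approach.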

Oh, the code :)

http://github.com/gordonbanderson/digitalnzgeoparser/tree/master

Instructions for installing on a clean Ubuntu Jaunty are in the README file. Let me know if there are any errors in those. There are still a bunch of things I wish to add, but they might have to wait until I have moved to Thailand.

Cheers

Gordon


Source: http://groups.google.com/group/nzopengovtbarcamp/t/331bcb2c97c0cbc3
