Friday, December 30, 2005

Bright Young Sparks

Dan Egnor, anyone? You probably don't know him, and neither did I, so here's why he's a smart boffin:

Geographic Search
Google contest submission

This is the submission that won the 2002 Google Programming Contest. It includes a geocoder (which uses TIGER/Line data to turn street addresses into latitude/longitude coordinates), a simple indexer that looks for addresses and keywords in documents, and a query engine to search for documents matching certain keywords that also contain addresses within a certain distance of a target location.

If any of these components might be useful to you, feel free to download a tarball, read the README file or browse the CVS repository.

Update (5 Nov 2005): The builder now supports Second Edition TIGER data, and a bug affecting address geocoding accuracy has been fixed. Thanks to Bill Thoen for the patches!

My code is available to the public under the terms of the GNU General Public License. Portions of my submission are derived from Google's contest materials, which are covered by their own license. See the LICENSE file for details.

You may find it helpful to download a pre-built index of the 2000 TIGER/Line data; you can feed this to "geo-client" to get a functional geocoder. Be careful, this is a 300MB download.

Please do not ask me to support this code. I will accept bug fixes but that's about it.

-- Dan Egnor
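
For the curious, the query step described above (documents that match some keywords and also mention an address within a given distance of a target) can be sketched in a few lines of Python. This is only an illustration and not Dan Egnor's code: the names `haversine_km` and `search`, the dictionary layout of each document, and the use of the haversine great-circle formula are all my own assumptions, and the sketch presumes the addresses have already been geocoded to latitude/longitude pairs.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def search(docs, keywords, target, radius_km):
    """Return ids of docs containing every keyword and at least one
    geocoded address within radius_km of target (a (lat, lon) pair).

    Each doc is a hypothetical dict: {"id", "text", "coords"}, where
    "coords" is a list of (lat, lon) pairs already produced by a geocoder.
    """
    wanted = {k.lower() for k in keywords}
    hits = []
    for doc in docs:
        words = set(doc["text"].lower().split())
        if not wanted <= words:
            continue  # a keyword is missing
        if any(haversine_km(lat, lon, *target) <= radius_km
               for lat, lon in doc["coords"]):
            hits.append(doc["id"])
    return hits


docs = [
    {"id": "sf", "text": "Great pizza downtown", "coords": [(37.7749, -122.4194)]},
    {"id": "oak", "text": "Pizza by the lake", "coords": [(37.8044, -122.2712)]},
    {"id": "nyc", "text": "Pizza in Manhattan", "coords": [(40.7128, -74.0060)]},
]
nearby = search(docs, ["pizza"], target=(37.7749, -122.4194), radius_km=20)
```

A real engine would use a spatial index rather than scanning every document, but the filter itself is the same idea: keyword match first, distance check second.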


Google Local, anyone?