[nycphp-jobs] Wanted: Geocoding Ninja
Matthew Knight
matt at reconstrukt.com
Mon Aug 25 13:22:26 EDT 2008
Hi,
ETL gig involving bulk-geocoding a table of Location names. You'll need to be familiar with the Geonames service (and perhaps Yahoo/Google geocoding APIs) and have solid SQL skills.
You'll be given 3 MySQL tables:
1.) A "Locations" table containing 7400 international locations. Each location row has a "parent id" defined, creating a hierarchical relationship between the various locations. So, more precise names can be created by recursively concatenating the location names together. For example:
-- World (id = 0, parent = null)
-- United States (id = 100, parent = 0)
-- New York (id = 200, parent = 100)
-- Montauk (id = 300, parent = 200)
2.) A table of US zip codes (will help to disambiguate some US locations).
3.) A table of country codes (will help to break up the lookups into country chunks)
Your delivarable:
1.) A plan/approach to geocoding. This will outline the steps you'll take before doing any work.
For example, perhaps you'd want to join countries -> locations, pull geonames feeds for those countries, break down the lookups by country. Also, what's your plan for disambuating the lookup results? If you want to write a loop that hits geonames 7500 times with a raw location name, thank you in advance for not replying. :)
2.) An updated Locations table containing 5 new fields:
a.) lat coord
b.) long coord
c.) foreign key to the US zip code table (if applicable)
d.) foreign key to the Country table
e.) geonames ID
Cheers,
Matt
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.nyphp.org/pipermail/jobs/attachments/20080825/40c3272e/attachment.html>
More information about the jobs
mailing list