I found that a lot of services and libraries do a decent job of geocoding datasets, with one deal-breaking caveat: they drop the associated data. This means a tedious merging job between the original and the geocoded dataset. This scenario has essentially defined my core requirement set:

The former, along with limited options for hosting the application, led me to seek a JavaScript-based solution, which means that one can copy / clone the app, boot up any basic HTTP server on a local machine and get the job done. (The server is required to make AJAX requests to the geocoding service.)

The works

js.geocoder takes a CSV file with at least an ID field (to allow easy re-merging) and an Addr (for Address) field, and gives you another CSV file with

Dependencies

Roadmap

Fine print

As geocoding is reasonably resource intensive (en masse) and relies on massive geo-databases made available by generous third parties, those third parties also impose some usage terms (fairly reasonable ones, I should say). To ensure that these terms are understood, I've added a little "read the terms before starting" enforcement mechanism to the UI. I'm sorry. I know it's annoying, but I also know that most people just wouldn't care... and would get themselves or, in worse cases, their organisation or some good-soul open host banned from these services.

As there are strict rules on how frequently you may hit these geocoding services, I currently impose a 1.2 second interval between requests. This means that geocoding a few thousand entries may take a while. The good news is: you can just leave your browser open and running. It won't time out, thanks to papaparse's awesome streamed parsing mechanism.
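
To give a feel for what that interval means in practice, here is a minimal sketch of a throttled loop. It is not js.geocoder's actual code: `geocodeOne` is a hypothetical callback standing in for whatever request the app makes, and the row shape (objects with an `Addr` property) is only assumed from the input format described above.

```javascript
// Minimum gap between two requests, as stated in the text.
const INTERVAL_MS = 1200;

// Lower bound on the run time of a job: the first request goes out
// immediately, every later one waits a full interval first.
// Network latency comes on top of this.
function estimateDurationMs(rowCount, intervalMs = INTERVAL_MS) {
  return Math.max(0, rowCount - 1) * intervalMs;
}

// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Geocode rows one at a time, pausing before every request except the first.
// `geocodeOne` is a hypothetical callback: (address) => Promise of an object
// holding the geocoding result. Each output row keeps all of its original
// columns, so nothing has to be merged back afterwards.
async function geocodeSequentially(rows, geocodeOne, intervalMs = INTERVAL_MS) {
  const out = [];
  for (let i = 0; i < rows.length; i++) {
    if (i > 0) await sleep(intervalMs);
    const result = await geocodeOne(rows[i].Addr);
    out.push({ ...rows[i], ...result });
  }
  return out;
}
```

By this estimate, 3,000 rows need a little under an hour of waiting alone (`estimateDurationMs(3000)` is 3,598,800 ms), which is why it helps that the browser can simply be left running.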

Get Yours

The first release, v0.4, is now up on Bitbucket; go and download / clone / fork yours.