RepFinder: USPS and Congressional Data Mashup
Last fall I had a business idea for the nonprofit and cause-advocacy world: let people look up their representatives at every level of government by entering their address. I did some sleuthing and found that the service was possible, and that it already existed in several forms, most prominently the CapWiz product from Capitol Advantage. I still thought there was an opportunity, as CapWiz is quite expensive and solving the underlying technology problems did not seem hard.
Basically, there are two steps, and both are needed because congressional districts are gerrymandered all over the place and do not map cleanly onto counties or 5-digit zipcodes. First, to save people the hassle of knowing or looking up their 9-digit zipcode, you need a database from the USPS that matches street addresses, cities, states, and 5-digit zipcodes with the 9-digit ones. Then you use the 9-digit zipcode and its associated congressional district to look up current representatives in a congressional database.
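To make the two steps concrete, here is a rough sketch in Python using an in-memory SQLite database. The table names, column names, and sample rows are all made up for illustration; they are not the schema in my repository, nor the layout of the USPS or CongressMerge data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE addresses (street TEXT, city TEXT, state TEXT,
                                zip5 TEXT, zip9 TEXT);
        CREATE TABLE districts (zip9 TEXT PRIMARY KEY, state TEXT,
                                district INTEGER);
        CREATE TABLE representatives (state TEXT, district INTEGER, name TEXT);
        """
    )
    # Placeholder data only; none of these values are real.
    conn.execute(
        "INSERT INTO addresses VALUES (?, ?, ?, ?, ?)",
        ("100 MAIN ST", "SPRINGFIELD", "XX", "00001", "000011234"),
    )
    conn.execute("INSERT INTO districts VALUES (?, ?, ?)", ("000011234", "XX", 1))
    conn.execute(
        "INSERT INTO representatives VALUES (?, ?, ?)", ("XX", 1, "Rep. Example")
    )
    return conn


def normalize(text):
    """Uppercase and collapse whitespace so user input matches stored rows."""
    return " ".join(text.upper().split())


def lookup_zip9(conn, street, city, state, zip5):
    """Step 1: street address plus 5-digit zipcode -> 9-digit zipcode."""
    row = conn.execute(
        "SELECT zip9 FROM addresses "
        "WHERE street = ? AND city = ? AND state = ? AND zip5 = ?",
        (normalize(street), normalize(city), normalize(state), zip5.strip()),
    ).fetchone()
    return row[0] if row else None


def lookup_representatives(conn, zip9):
    """Step 2: 9-digit zipcode -> district -> current representatives."""
    rows = conn.execute(
        "SELECT r.name FROM districts d "
        "JOIN representatives r "
        "  ON r.state = d.state AND r.district = d.district "
        "WHERE d.zip9 = ? ORDER BY r.name",
        (zip9,),
    ).fetchall()
    return [name for (name,) in rows]


def find_representatives(conn, street, city, state, zip5):
    """Chain both steps; return an empty list if the address is unknown."""
    zip9 = lookup_zip9(conn, street, city, state, zip5)
    if zip9 is None:
        return []
    return lookup_representatives(conn, zip9)
```

Calling find_representatives(build_demo_db(), "100 main st", "Springfield", "xx", "00001") walks through both lookups against the placeholder rows. Real address matching is much messier than an exact string comparison (abbreviations, unit numbers, address ranges), which is where most of the actual work lies.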
I purchased the USPS database from RIBBS, in the Memphis office of the USPS (they provide some of the best phone customer service I have ever experienced). Next, I purchased the Postal version of CongressMerge’s database. One-year licenses for each ran me a total of $1,350.00, and I hoped spending the money would induce me to finish the product quickly and try to sell it.
Well, I never finished it. Some big projects got in the way, and as the months passed, I heard about other open, data-based efforts sprouting up everywhere. Since I knew the profit margins would be fairly slim until I signed up a lot of people, and also because I knew it wouldn’t be that hard for someone to recreate my work, I decided to abandon the project. So, I created a public SVN repository of the code.
http://svn.preludeinteractive.com/repfinder
It’s open source in the purest sense – no license, no nothing. Use it, modify it, sell it, whatever, but remember that I’m not responsible for it. There are some 3rd party libraries in there, so please review and abide by those licenses where appropriate.
I’d say about 50% of the work is done: the database structure, queries, classes, and other helper methods are there. I had reached the point of testing the accuracy of the address lookups against the USPS’s required tests for becoming a stamp-of-approval data vendor. There are some handy methods for unpacking the massive zip files from the USPS and parsing them into a database. Depending on how many indexes you need, the size of the database will probably be on the order of 10 to 50 GB, so there are some hosting difficulties as well.
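For a sense of what the unpack-and-parse step involves, here is a minimal sketch in Python. It is not the code from the repository. It assumes the archive contains plain-text files with one fixed-width record per line, and the LAYOUT offsets and the records table are hypothetical; the real record layout comes with the USPS product documentation.

```python
import io
import sqlite3
import zipfile

# Hypothetical fixed-width layout: (column name, start offset, end offset).
# The real USPS layout is different; these offsets are for illustration only.
LAYOUT = [("zip5", 0, 5), ("plus4", 5, 9), ("street", 9, 39)]


def parse_line(line, layout=LAYOUT):
    """Slice one fixed-width line into a tuple of stripped field values."""
    return tuple(line[start:end].strip() for _, start, end in layout)


def load_archive(conn, archive, layout=LAYOUT, batch_size=10000):
    """Stream every file in a zip archive into a 'records' table.

    `archive` may be a path or a file-like object. Lines are read one at a
    time and inserted in batches, so the whole file never sits in memory.
    Returns the number of rows inserted.
    """
    columns = ", ".join(name for name, _, _ in layout)
    placeholders = ", ".join("?" for _ in layout)
    conn.execute(f"CREATE TABLE IF NOT EXISTS records ({columns})")
    insert = f"INSERT INTO records VALUES ({placeholders})"

    total = 0
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # skip directory entries inside the archive
            with zf.open(info) as raw:
                text = io.TextIOWrapper(raw, encoding="ascii", errors="replace")
                batch = []
                for line in text:
                    line = line.rstrip("\r\n")
                    if not line:
                        continue  # ignore blank lines
                    batch.append(parse_line(line, layout))
                    if len(batch) >= batch_size:
                        conn.executemany(insert, batch)
                        total += len(batch)
                        batch = []
                if batch:
                    conn.executemany(insert, batch)
                    total += len(batch)
    conn.commit()
    return total
```

Streaming and batching matter here because of the data size mentioned above; creating indexes only after the bulk load finishes is usually faster than maintaining them during inserts.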
I also registered the trademark “Repfinder” with the USPTO. I plan on hanging on to that, you know, just in case. If you’re interested in it, let me know. Also, if you want to use my data until the end of September ’09, let me know. We’ll need to check that it’s OK under the license agreements, of course.
Otherwise, here is the set of links I gathered in my searching.
http://www.govtrack.us/source.xpd
http://public.resource.org/
http://www.congress.org/congressorg/dbq/officials/?lvl=L
http://capitoladvantage.com/
http://capitoladvantage.com/products/capwiz/
http://www.democracydata.com/techproducts.aspx
http://projects.washingtonpost.com/congress/
http://thomas.gov
http://projects.washingtonpost.com/congress/110/senate/vote-missers/
http://www.backspace.com/action/advocacy_tools.php
http://www.care.org/getinvolved/advocacy/tools.asp
http://wiki.advocacydev.org/cgi-bin/wiki.pl?OpenDatabases
http://www.downhillbattle.org/
http://www.cs.york.ac.uk/fp/polyparse/
http://www.opengovdata.org
http://www.govtrack.us

Hey – ran across your post. Thanks for the detail and info…
I am interested in doing something like this (mainly on the address lookup piece) and wanted to understand the data you got from USPS before buying their stuff. I was playing around with the USPS APIs, but it looks like having a local db is better and it looks like you are / were well on your way to putting this together. Any chance we can connect?