Scraping the Oregon Business Registry

When you register a business in the State of Oregon, it goes into the Oregon Business Registry, a public database that anyone can search for free. However, if you want to stay current on which businesses have just expired or been registered, you’ll have to fork over $50 per month for an Excel spreadsheet delivered via e-mail or CD-ROM.

I don’t know how they get away with charging that much considering:

  • It’s public record.
  • It’s really small. The monthly list is likely fewer than 1,000 rows, so even uncompressed it’s trivial to send.
  • It’s not tangible (since they are e-mailing it).
  • It should be available on their website. The registry allows complex searches, but it cannot display a simple list.

So I decided to do something about it. Since this is public data, my original plan was simply to buy the CD-ROM and make it freely available on the internet. But that copy would still only be updated monthly, and I wanted something closer to real time.

So, I did what most would – I just scraped it. The scrape pulled in over 2 million business entries going back to 1995. All of the data has been imported into a simple database that I can now make available to the public. I’ve started building a front end to visualize the data, so stay tuned for that website within the next few months. It will be specific to Oregon and free of charge.
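
For the curious, here is a rough sketch of what the "simple database" step could look like. It is not my actual scraper. The BusinessEntry fields, the table name, and the helper functions are all made up for illustration, and the real registry's columns may well differ. The idea is just that each entry is keyed by its registry number, so re-running the import replaces old rows instead of duplicating them, which is what lets the data stay fresher than a monthly spreadsheet.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BusinessEntry:
    # Hypothetical fields; the real registry's columns may differ.
    registry_number: str
    name: str
    status: str
    registered_on: str  # ISO date string, e.g. "1995-03-14"


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS businesses (
            registry_number TEXT PRIMARY KEY,
            name            TEXT NOT NULL,
            status          TEXT NOT NULL,
            registered_on   TEXT NOT NULL
        )
        """
    )
    return conn


def save_entries(conn: sqlite3.Connection, entries: Iterable[BusinessEntry]) -> None:
    """Insert entries, replacing any row with the same registry number."""
    rows = [(e.registry_number, e.name, e.status, e.registered_on) for e in entries]
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO businesses "
            "(registry_number, name, status, registered_on) VALUES (?, ?, ?, ?)",
            rows,
        )


if __name__ == "__main__":
    db = open_db()
    # Made-up sample rows standing in for scraped results.
    save_entries(db, [BusinessEntry("100", "Example Bakery LLC", "ACTIVE", "1995-03-14")])
    save_entries(db, [BusinessEntry("100", "Example Bakery LLC", "EXPIRED", "1995-03-14")])
    print(db.execute("SELECT registry_number, status FROM businesses").fetchall())
```

Running the import twice with the same registry number leaves one row holding the latest status, so a scheduled job can simply re-scrape and re-save.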

Interested in getting involved? Catch me on Twitter via @KristopherIves or leave a comment below.
