The Missed Revenue Opportunities For The State Of California Because They Do Not Have A Business Registry API

I was talking with OpenCorporates CEO Chris Taggart (@countculture) while in Washington DC a couple of weeks ago, which reminded me of a previous conversation we had about the current state of business registry submission and search for the State of California. He had asked me a couple of months ago if I knew anyone working for the State of California on opening up data, or specifically on the business registry. I didn’t. Chris wants to talk with anyone in the know about evolving and modernizing the state’s approach to managing its corporate entity information. Clearly he has a desire to get better access to the data, but as our conversation in DC reminded me, his concerns go beyond access to corporate data. He is seriously concerned about being able to keep up to speed with what is happening at the business level, not just in California, but in other states as well.

There are many ways in which government can’t keep up with the pace of the private sector, and corporate filings are definitely one of the more critical areas that are falling behind. Like many areas of government, I’m guessing this is by design. Not because of some grand government or corporate conspiracy, but just because of the ongoing suffocation of the resources government has to get things done. I’m guessing the folks running the State of California business search are doing the best they can with the resources they have. I’m also guessing that the private sector has no real motivation to lobby and compel the State of California to do any better when it comes to filing, managing, and searching business data. The result is a business registry system that barely meets the needs of everyone involved, and leaves a lot of opportunities on the table when it comes to understanding and responding to change in the way business gets done in the state.

When it comes to business registry in the State of California, there are two primary interfaces to get at the information you need:

  • Business Search - Free online access to corporate, limited liability company and limited partnership information. Available information includes the complete entity name, entity number, formation, registration or conversion date, status, jurisdiction, entity address, and the name and address of the agent for service of process.
  • Publicly Traded Disclosure Search - Free online access to abstracts of reported information for all publicly traded corporations that have filed a Corporate Disclosure Statement with the California Secretary of State.

There is no business registry API for the State of California (that I can find), and I have asked for more information. The search result pages are pretty easy to automate and scrape using scripts, but there are measures placed on the detail pages for business entities to counteract scraping of the valuable data present. It isn’t impossible to get around, but clearly efforts have been made to make it more difficult for the public to get at data within the system, if nothing else just to minimize the server traffic that scraping introduces. You can find scraped versions of business registry data for the State of California, and you can find businesses who claim they have it and will sell it to you, but there are no APIs or public data sets available as part of the State of California’s open data efforts. This leaves a pretty massive gap in access for trusted 3rd parties, researchers, and the general public when it comes to making sense of the business landscape in the State of California.

To keep up with the pace of business there should be a public API for the State of California business registry. It should employ a simple RESTful design for conducting searches, providing basic request and response access to data. You should be given a healthy number of API requests before you are required to sign up and justify your usage for significantly higher access. If you are accessing on a regular basis for commercial usage, you should have to pay for your access, and establish a partnership arrangement. All of this can be managed using existing API management solutions. There should also be webhooks available for receiving pushes when something changes, as well as sustained connections using Server-Sent Events (SSE) or WebSockets, allowing consumers to tune into events as they occur across the platform.
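
To make that a little more concrete, here is a rough Python sketch of what I have in mind. None of this exists today: the `BusinessEntity` fields are borrowed from what the current Business Search says it returns, and the `/entities` path, the tier names, the quota numbers, and the `entity.updated` event name are all assumptions I made up for illustration, not anything the State of California has published.

```python
import json
from dataclasses import dataclass, asdict


# Hypothetical record shape, loosely based on the fields the
# existing Business Search says it makes available.
@dataclass
class BusinessEntity:
    entity_number: str
    entity_name: str
    status: str
    jurisdiction: str
    formation_date: str  # ISO 8601 date string


# Hypothetical daily request quotas per access tier. The numbers are
# illustrative only; None means "negotiated as part of a partnership".
TIER_QUOTAS = {"anonymous": 100, "registered": 10_000, "commercial": None}


def search_entities(registry, name_query):
    """Build the response body for a hypothetical GET /entities?name=... request."""
    needle = name_query.lower()
    matches = [asdict(e) for e in registry if needle in e.entity_name.lower()]
    return {"query": name_query, "count": len(matches), "results": matches}


def is_allowed(tier, requests_today):
    """Return True if a caller on this tier may make another request today."""
    quota = TIER_QUOTAS[tier]
    return quota is None or requests_today < quota


def format_sse(event_type, entity):
    """Serialize a change event in the Server-Sent Events wire format."""
    return f"event: {event_type}\ndata: {json.dumps(asdict(entity))}\n\n"


if __name__ == "__main__":
    # Made-up sample data, not real registry records.
    registry = [
        BusinessEntity("C0000001", "Example Widgets Inc", "ACTIVE", "CALIFORNIA", "2015-01-02"),
        BusinessEntity("C0000002", "Sample Holdings LLC", "SUSPENDED", "DELAWARE", "2012-06-30"),
    ]
    print(json.dumps(search_entities(registry, "widgets"), indent=2))
    print(is_allowed("anonymous", 100))
    print(format_sse("entity.updated", registry[0]), end="")
```

The same JSON payload used for the SSE `data:` line could be what gets POSTed to a webhook subscriber, so search, push, and streaming would all share one schema. In practice the quota check would live in whatever API management layer sits in front of the registry rather than in application code like this.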

I have a lot of experience working with federal government agencies on data efforts, so I know how tough these projects are to move forward. However, business registries for all states are one area where we need to find the will to invest, and work to develop public / private partnerships to tackle. There is so much opportunity for the State of California to better understand business within the state, and identify the missed revenue opportunities that exist. Opening up the data for commercial access would generate new revenues, and having easy, self-service access for researchers and journalists would add to the brain power available to the state when it comes to making sense of what is going on. I wanted to just write an initial post to get my thoughts out there and see if I know anyone who is working in the State of California. If you are, drop me a line, I’d love to brainstorm on the possibilities. I also want to start poking around a little more, playing with the available business search tools, mapping out the underlying schema, and maybe scraping some more data to see if I can use it in some sort of proof of concept. I’d like to demonstrate some of my ideas for moving the conversation around the State of California business registry forward, and see what fires I can light when it comes to cooking something up.