Why The Open Data Movement Has Not Delivered As Expected

I was having a discussion with my friends working on API policy in Europe about API discovery, and the topic of failed open data portals came up. It is a recurring undercurrent I have to navigate in the world of APIs. Open data is a subset of the API movement, and something I have first-hand experience with: I have built many open data portals, contributed to city, county, state, and federal open data efforts, and most notably rode the open data wave into the White House, working on open data efforts for the Obama administration.

Today, there are plenty of open data portals. The growth in the number of portals hasn’t decreased, but I’d say the popularity, utility, and publicity around open data efforts have not lived up to the hype. Why is this? I think there are many dimensions to this discussion, and few clear answers when it comes to peeling back the layers of this onion, something that always makes me tear up.

  • Nothing There To Begin With - Open data was never a thing, and never will be a thing. It was fabricated as part of an early wave of the web, and really never got traction because most people do not care about data, let alone it being open and freely available.
  • It Was Just Meant To Be A Land Grab - The whole open data thing wasn’t about open data for all, it was meant to be open for business for a few, and they have managed to extract the value they needed, enrich their own datasets, and have moved on to greener pastures (AI / ML).
  • No Investment In Data Providers - One of the inherent flaws of the libertarian-led vision of web technology is that government is bad, so don’t support it with taxes. Of course, when agencies open up data sets, that is good for us, but supporting them in covering compute, storage, bandwidth, and data refinement or gathering is bad, resulting in many portals going away or stagnating.
  • It Was All Just Hype From Tech Sector - The hype about open data outweighed the benefits and realities on the ground, and ultimately hurt the movement with unrealistic expectations, setting efforts back many years. Those efforts are only now beginning to recover, now that the vulture capitalists are on to other things.
  • Open Data Is Not Sexy - Open data is not easy to discover, define, refine, manage, and maintain as something valuable. Most governments, institutions, and other organizations do not have the resources to do it properly, and only the most attractive of uses have the resources to pay people to do the work properly, incentivizing commercial offerings over the open and underfunded offerings.
  • Open Data Is Alive and Well - Open data is doing just fine, and is actually doing better now that the spotlight is off of it. There will be many efforts that go unnoticed, unfunded, and fall into disrepair, but there will also be many fruitful open data offerings out there that will benefit communities, and the public at large, along with many commercial offerings.
  • Open Data Will Never Be VC Big - Maybe open data lost the spotlight because it just doesn’t have the VC-level revenue that investors and entrepreneurs are looking for. If it enriches their core data sets, and can be used to train their machine learning models, it has value as a raw material, but as something worth shining a light on, open data just doesn’t rise to the scope needed to be a “product” all by itself.

My prognosis on why open data has never quite “made it” is probably a combination of all of these things. There is a lot of value present in open data as a raw material, but a fundamental reason data is “open” is so that entrepreneurs can acquire it for free. They aren’t interested in supporting city, county, state, and federal data stewards, and helping them be successful. They just want it mandated that it is publicly available for harvesting as a raw material, for use in the technology supply chain. Open data was primarily about getting waves of open data enthusiasts to do the heavy lifting when it came to identifying where the most valuable raw data sources exist.

I feel pretty strongly that we were all used to initiate a movement where government and institutions opened up their digital resources, right as this latest wave of the information economy was peaking, triggering institutions, organizations, and government agencies to bear fruit that could be picked by technology companies and used to enrich their proprietary datasets and machine learning models. Open doesn’t mean democracy, it mostly means open for business. The genius of the Internet evolution is that it gets us all working in the service of opening things up for the “community”. Democratizing everything. Then once everything is on the table, companies grab what they want, and show very little interest in giving anything back to the movement. I know I have fallen for several waves of this over the last decade.

I think open data has value. I think community-driven, standardized sets of data should continue to be invested in. I think we should get better at discovery mechanisms involving how we find data, and how we enable our data to be found. However, I think we should also recognize that there are plenty of capitalists who will see what we produce as a valuable raw resource, and something they want to get their hands on. More importantly, we should recognize that these capitalists are not in the business of ensuring this supply of raw resource continues to exist in the future. As we’ve seen with the environment, these companies do not care about the impact their data mining has on the organizations, institutions, government agencies, and communities that produced the data, or that will be impacted when efforts go unfunded and unsupported. Protecting our valuable community resources from these realities will not be easy as the endless march of technology continues.