Posted on 07-19-2013
I've had so many discussions on this topic, I felt like I had written a post on it, but when the topic came up again today, I realized I hadn't. This is a regular conversation I get into with open data folks, regarding whether you should deploy open data as download vs an API.
I'll spare you having to read my entire rant, the answer is if you can, both. You should always provide a full data download, and additionally API access when the resources are available. There is no versus argument, there are scenarios where both are valuable and offer benefits over the other.
As a developer, I just want quick, complete access to data. I'm always in a hurry to just get to work, and I have the skills to parse, store, transform this data for whatever reason I will need. So let's just get to work. Give me a download and get the hell out of my way. Pretty straight forward logic. This is all true, but it is a very short-sighted, narrowly focused perspective and shouldn't impede larger, long term goals around data acquisition, management and access.
Taking your data, keying it up, and asking users to request a key and access via an API, might be a wall for the eager alpha dev, but there are many that could benefit from well thought out API end points, search & filtering tools and the ability to offload all operations to someone else. Not all developers will be equipped to deal with the overhead of a full download.
Looking at it from a different vantage point, as a steward of data, an API can provide you with some very granular analytics on how people are using your data or even, not properly using your data. This valuable insight can provide an extreme amount of value in the future of how you gather your data, organize and make accessible. This is something you will never get from a single bulk download.
Many data folks argue that you can just rely on forums, IRC, Twitter to get feedback on this, no API needed. Again this is the single perspective from an alpha developer who will actively participate in these channels and offer insight. In reality, this is a small percentage of any ecosystem, and in my experience users don't usually offer feedback.
Secondarily if you are dealing with large datasets and large user bases, dealing with human feedback channels are expensive, something that can also done with API analytics in a cost effective, scalable way. And don't think I'm arguing for not using these channels, I'm saying don't overburden them unnecessarily. They are still very important.
The best example of this, that I use in the wild, is with U.S. Census Data. A couple years ago I had conversations with database folk at the Census, and asked them their thoughts on how innovators like Google are using census data for flu tracking, and news outlets are merging with their data to track on politics, immigration and other relevant issues. The response was, "we are familiar with some of this, but honestly haven't heard of many of these usage scenarios".
I asked why they don't launch an API alongside of downloads, require developers to get a key, then utilize API analytics to provide insight into other use cases around census data. You still don't have visibility into all use cases, but you'd get another group of interesting integrations.
The response was quickly, NO! The Census has to balance the trust of being gatherers and keepers of this data, we can't lock it up or apply any interpretation on top of the data. We need to remain a neutral party. (They now have an API, but that is another story)
Ok. I get that. But don't you want to make the census a more efficient process? Possibly completely digital? Or be able to gather new data points that could help people better understand health, educational, race or other issues that are reflected in our society? An API isn't always about control (it can be), an API can be about understanding and empower sensible stewardship of data, providing the insight you might need to make the process better.
Every person I've had this discussion with is coming from an alpha developers perspective, who quickly wants access to data and has the skills and resources to get it done. They are not thinking about the rest of developers, or the stewards of this data and the lifelong process that goes into making data acquisition, transformation and publishing better for everyone involved.
So when you are thinking about data download vs. API, remember, if you have the resources, you should do both. A download is a great place to start, but the overall health of your data will be better off if you can introduce the more granular level access and the measurement that an API can deliver.
comments powered by Disqus
Winning in the API Economy
|Download as PDF|
Latest Blog Posts
- You Can Have An API Just By Choosing Products And Services That Have APIs
- Using Excel As An API Datasource And An API Client For The Masses
- Relationship Between APIs And Containers
- Real-time and Visualizations Will Be Key in Financial API Deployments
- Notification Focused Startups Within Leading API Ecosystems
- APIs That Do One Thing And Do It Well Like ZipLocate
- Which API Do I Need?
- The Expanding API Conference Landscape
- Ocotoparts Open Source Google Spreadsheet
- Andrew Nacin Of WordPress @APIStrat Chicago
- Push Button API Deployment With The Heroku Button
- WordPress Style API Modules For Government
- The Heroku HTTP API Design Guide
- What I Have Been Calling API Trends, Are Slowly Being Baked Into API Operations
- FDA Finding Their API Mojo With A New Drug Label API
- Adding PokitDok To Healthcare Research And The API Stack (Well They Did)
- Why I Am Continuing To Integrate Zapier In My Business Workflow
- Who Is Going To Build The Uber API Platform For The Sharing Economy?
- The API Focused Dev Shop
- Route SMS Messages To Google Spreadsheets Via Twilio API With TwilioSheet
- Publishing Your APIs To Product Hunt
- Providing Users With Reciprocity Tools So Important Intuit Purchases itDuzzit
- Bing Developer Assistant for Visual Studio Delivers Relevant API Code
- Average Number of APIs Used In A Modern App
- An APIs.json Collection Of API Resources Across Your Public, Partner Or Internal Resources