{"API Evangelist"}

Politics Of The API Economy

I did a napkin doodle the other day at my favorite local hangout during lunch and beers. I was thinking about the finer points of the API economy that I dwell on, and feel are important not just for API success, but also for developer, end-user, and industry-level health. In this doodle, I'm trying to think about where I need to focus my energy when working to keep the API pipes as transparent and open as possible.

For me, this isn't just an API story, this is a story about the evolution of the web, and ultimately a story of the journey humans are on when it comes to defining our virtual selves, which participate in the online domains that capture our attention on a daily basis. I wanted to better understand how companies are moving digital assets online, generating new resources, which are oftentimes user-generated resources, empowered by developers. This is a complex world unfolding, and this was my attempt to better understand it.

I worked to distill this down, and represent the key actors, and struggles, we face as the API economy becomes a reality. Companies are moving more digital resources online, while also developing platforms for users and developers to generate, and work with, valuable resources like images, text, videos, and more. These resources (or life bits to users) live within a growing number of virtual domains.

In the early Internet days, these resources were often static content, but around 2005, during the web 2.0 phase, things became much more dynamic. This evolution continued with mobile, and is only picking up momentum with the Internet of Things. These resources x domains x users x apps x developers = the units that make up the API economy.

The relationships between all the actors involved in the exchange of these units are often managed by OAuth, but also by other, more simplistic mechanisms like keys, passwords, etc. Identity is king, of serious value to platforms, but simply your digital self to the average online user. This relationship is heavily impacted by government and industry influences, ranging from regulation to patent and copyright challenges, as well as the higher level NSA and cybersecurity conversations that plague the online world.

If we are lucky, the resources we depend on, and the life bits we generate daily, are securely stored and transmitted with end-users' knowledge and consent, handled with a high level of ethics by developers, and with transparency at the corporate and government levels. Unfortunately this isn't something we enjoy across the board, and it will be the biggest challenge of the API economy.

Access to resources, security, transparency, licensing, ownership, and the other items I included in my doodle will come into play heavily across this new economy. This isn't just about companies deploying APIs. This will be a real-time balance between platforms, developers, users, and the government, one that will have to be constantly managed by everyone involved.



Augmenting Data Sources and APIs with POST, PUT, and DELETE Using Restlet APISpark

Most of the APIs you find out there are read only, meaning they act like a website and just allow you to retrieve data, content, and other media types, but usually do not let you add or change any data. I wrote a while back about augmenting a read only API with an external POST, PUT, and DELETE as I came out of the government, but wanted to update the topic, based upon some open data work I'm doing with Restlet APISpark.

I am preparing for my Restlet Summer of APIs webinar tomorrow morning, so I am publishing some open data files (CSV, Excel, etc.) to the platform, generating data stores, and ultimately APIs. A simple example of this in action would be with farmers market data from the US Department of Agriculture. There is a farmers market API, but it leaves a lot to be desired, something others, like Code for America, have tried to solve, so I figured I would not just launch an API from the open dataset, I would also make it read and write.

Deploying an API with Restlet's APISpark is pretty straightforward, something I will provide a more detailed walk-through of in the future, but to summarize, you just download the Excel file from data.gov, and upload it to a Google Spreadsheet. Once you have cleaned up the spreadsheet, fields, and other elements, you can connect to your Google Drive using APISpark, and generate an entity data store--you now have a separate copy of the Google Spreadsheet, stored on APISpark.

Next, you can publish a web API using the service, something you can use to launch multiple versions, require authentication, track usage, and much more. The great part is it doesn't just generate the GET for your API, you also get the POST, PUT, and DELETE. You now have a full read and write API for the farmers market dataset. You could now accept submissions from farmers market owners, merchants, and attendees to expand on the dataset, outside the original USDA source.
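
To give a rough sense of what the write side looks like in practice, here is a minimal sketch of submitting a new market record with Python, assuming a hypothetical APISpark endpoint URL, credentials, and field names (none of these are the actual values from my deployment):

    import requests

    # Hypothetical APISpark endpoint and credentials -- substitute your own
    API_URL = "https://example.apispark.net/v1/farmersmarkets/"
    AUTH = ("api-username", "api-secret")

    # An example record a market owner might submit (illustrative field names)
    new_market = {
        "name": "Downtown Farmers Market",
        "city": "Portland",
        "state": "OR",
        "season": "May - October"
    }

    # POST creates a new entry in the underlying entity store
    response = requests.post(API_URL, json=new_market, auth=AUTH)
    print(response.status_code, response.json())

A PUT against an individual record URL would update it, and a DELETE would remove it, rounding out the full read and write surface area.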

This is just one example of how you could augment existing open data sources, or even an existing API, and improve on them using Restlet APISpark. The best part is you can do this without any coding. While it does take some work to massage the data and API design into shape, I did a handful of APIs without ever actually writing any code. If you want to see this in action, catch my workshop tomorrow morning at 10 AM, and maybe you can find a dataset you would like to see improved with a simple API, and participate in the Restlet Summer of APIs.



Taking A Look At What's Next For The Environmental Protection Agency (EPA) Envirofacts Data Service API


I was asked by folks at the Environmental Protection Agency (EPA) to provide some feedback on the Envirofacts Data Service API, as they prepare to work on the next iteration. Taking a quick glance at the landing page for their service, I saw a simple URL layout showing how to make API calls, and estimated that it would take me an hour or two (at the most) to profile the API.

As I dug into the process of profiling the Envirofacts Data Service API one evening in May, I realized I was wrong about the scope of the API, and became unsure how long it would actually take me. Then this work got lost in the shuffle of my summer, and is something I only recently picked up. I'm not happy if I can't provide an agency with some direction on where to go next, and after about 12 hours of work, I think I have some valuable feedback that they can run with.

The Envirofacts Data Service API program consists of a single landing page, with an overview of how to use the API, and a myriad of pages below that explain the underlying data model put to use. The API is what I consider a very resource-driven API design, meaning it reflects the database it came from, with not much emphasis on how the API-driven resources will be used.

While the API does use the URL, it uses few of the other HTTP components that make an API RESTful. I can see how the design would make sense to a database engineer, but it will be a little confusing for API developers.

Looking beyond this portal, I have since found other possible APIs, but honestly they are often even more incoherent than the Envirofacts Data Service API. I'm not trying to review the entire EPA API effort, and will be specifically focusing on the resources available in the Envirofacts Data Service API for this round.

Environmental Protection Agency
  EPA Air Facility System (AFS) API  
  EPA Biennial Report API  
  EPA Comprehensive Environmental Response, Compensation, and Liability Information System API    
  EPA Facility Registry System API  
  EPA Greenhouse Gas API  
  EPA Integrated Grants Management System API  
  EPA Locational information API  
  EPA Permit Compliance System API  
  EPA Radiation Ambient Monitoring API  
  EPA Radiation Information Database API  
  EPA Resource Conservation and Recovery Act Information API  
  EPA Safe Drinking Water Information System API  
  EPA Toxics Release Inventory API  

After I discovered the 411 tables across these 13 groups, and learned the common URL pattern for querying, I decided to define each table as its own endpoint--rather than relying on each table to be included via a {table} path parameter, I opted to hard code each one. Even though most of the table names are incoherent, some still articulate a little bit more about what the resource might do, and once you make a request, you get an even better idea. All of this can go a long way towards helping people understand what is going on.

It wouldn't take much to apply a coherent summary to each endpoint that describes what is stored in the table for use. Once I had a list of all tables, I went ahead and made a call to each of the 411 endpoints in the 13 areas, and generated a Swagger API definition for each. Using Charles Proxy I was able to generate the underlying data model for each, which is necessary for generating SDKs, and can be used as a central truth throughout other aspects of API integration. The current API design also allows you to pass in a field, and apply an operator against it when searching--I opted to leave this out of this iteration, until I had a clear definition of endpoints, and the underlying data model defined for each. The API is perfectly usable without this.
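
To make the common URL pattern a little more concrete, here is a rough sketch of calling one of the table endpoints with Python--the base URL, table name, and row syntax reflect my read of the documentation, so treat them as illustrative assumptions rather than the definitive values:

    import requests

    # Assumed Envirofacts-style base URL -- check the EPA documentation for the current one
    BASE_URL = "http://iaspub.epa.gov/enviro/efservice"

    table = "TRI_FACILITY"   # one of the 411 tables
    fmt = "JSON"             # other formats like XML, CSV, and EXCEL are also offered

    # The basic pattern is /table/rows/start:end/format; the field, operator, and
    # value filtering mentioned above would be inserted as additional path segments
    url = f"{BASE_URL}/{table}/rows/0:9/{fmt}"

    response = requests.get(url)
    print(response.status_code)
    print(response.text[:500])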

Keeping Things Simple
My recommendation for any future API release out of the EPA team would be focused on just simplifying things. When you land on the home page, you get the idea there is an API present, but you do not grasp the depth of the resource. A simple list of the various API groups is important--a list that I hydrated from the acronyms, to better demonstrate what lies beneath. Calling things by their actual names just makes things more intuitive. You need to reach outside of your government silos. I had to work really hard to make sense of the data model at play; I was sure there would be a meta API or download allowing me to quickly understand things, but I couldn't find one. By creating Swagger definitions for all API endpoints, complete with associated definitions for the data model, I can now easily build querying, filtering, and other mechanisms into my clients.

Speaking In Plain English
While FRS_PROGRAM_FACILITY may have made sense to the database administrator when naming the original table, it does not adequately describe the resource it is serving up. A big part of the next version of these APIs needs to focus on renaming the cryptic table names to more meaningful endpoints, and on more descriptive fields for each of the underlying data definitions. After crafting the Swagger definitions for these APIs I am blown away by the amount of information in here, obfuscated by the cryptic database naming conventions.
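
As a sketch of how this renaming could start, even a simple mapping from the cryptic table names to friendlier paths and summaries would go a long way, and could be folded into each Swagger definition--the friendly names below are my own illustrative suggestions, not official EPA naming:

    # Illustrative mapping from database table names to human-friendly endpoints --
    # the paths and summaries are suggestions, not official EPA names
    endpoint_names = {
        "FRS_PROGRAM_FACILITY": {
            "path": "/facility-programs",
            "summary": "Programs associated with a registered facility"
        },
        "TRI_FACILITY": {
            "path": "/toxics-release-facilities",
            "summary": "Facilities reporting to the Toxics Release Inventory"
        },
    }

    for table, friendly in endpoint_names.items():
        print(f"{table} -> {friendly['path']} ({friendly['summary']})")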

Wrap In A Clean Portal
The current landing page for the Envirofacts API is fairly cluttered, and ultimately doesn't say much--it made me work too hard to get what I needed. My goal was to distill down the 13 APIs I found buried in the Envirofacts API page, and expose exactly what you need to understand and get to work using any of the 13 APIs and the over 400 endpoints--nothing more. I started with a simple Github Pages hosted template, with a single APIs.json home page, and interactive documentation for each of the APIs (which you can fork).

Environmental Protection Agency (apis.json)
The United States Environmental Protection Agency (EPA or sometimes USEPA) is an agency of the U.S. federal government which was created for the purpose of protecting human health and the environment by writing and enforcing regulations based on laws passed by Congress. The EPA was proposed by President Richard Nixon and began operation on December 2, 1970, after Nixon signed an executive order. The order establishing the EPA was ratified by committee hearings in the House and Senate. The agency is led by its Administrator, who is appointed by the president and approved by Congress. The current administrator is Gina McCarthy. The EPA is not a Cabinet department, but the administrator is normally given cabinet rank.
APIs
EPA Air Facility System (AFS) API
EPA Biennial Report API
EPA Environmental Response, Compensation, and Liability Information API
EPA Facility Registry System API
EPA Greenhouse Gas API
EPA Integrated Grants Management System API
EPA Locational information API
EPA Permit Compliance System API
EPA Radiation Ambient Monitoring API
EPA Radiation Information Database API
EPA Resource Conservation and Recovery Act Information API
EPA Safe Drinking Water Information System API
EPA Toxics Release Inventory API


With my new portal, you get the overview of the EPA API, with links to each API, but I also use the Swagger definitions to generate interactive documentation, rather than sending you to the EPA data model page. This is just the start. I can also use Swagger to generate sandbox environments, clone APIs, and maybe allow for updates and changes. I could also use the Swagger to generate client libraries for EPA APIs using APIMATIC. I'll add all of this to the roadmap--I think I have done enough work for now, and am ready to hand things back to the EPA.

I'd like to see the EPA consider some of the common building blocks I recommend as part of my default developer portal. You don't have to do everything, but the more you do to engage the public around your API, the better the chances they will actually use it. Additionally, if you go through all the APIs, and translate everything from databaseze to English, the potential that someone will build on them will increase exponentially.

Continue On The API Journey At EPA
Beyond the portal, and better describing the APIs, my advice is to just continue on the API journey at EPA--this is where the learnings come from. On the current Envirofacts API page, there is another API in addition to the Envirofacts Data Service API, the UV Index API. I can tell the thinking that went into this represents the beginning steps of more experience-based API design, focusing on how the API will be used. There are still a lot of the same design mistakes in crafting the URLs for this API, but I can tell the desire is there to continue improving on the original design.

When you look at the Envirofacts Multi-system Search, and the widgets that are present, you also see some serious thought put into usability--this needs to be applied to the API design. The API is for other developers, but you can assist them in better understanding the potential through better API design.

I haven't changed anything with the current EPA Envirofacts Data Service API, I just worked to understand how it works, profiled each service I found as a Swagger definition, and then brought them all together as a single APIs.json driven collection. This process helped me understand the 13 APIs and 400+ endpoints, while also distilling this work into a machine readable index that I use to drive the Github Pages developer portal I launched. APIs.json drives the home page, and each API's Swagger definition drives its associated interactive documentation.
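
For anyone curious what gluing this together looks like, here is a minimal sketch of a single entry in that kind of APIs.json index, built as a Python dictionary and written to disk--the URLs are placeholders, and the exact fields may differ slightly from what the APIs.json specification and my actual collection use:

    import json

    # A minimal, illustrative APIs.json index for the EPA collection --
    # URLs and property values are placeholders, not my actual project files
    apis_index = {
        "name": "Environmental Protection Agency",
        "description": "Profiles of the Envirofacts Data Service APIs.",
        "url": "https://example.github.io/epa/apis.json",
        "apis": [
            {
                "name": "EPA Toxics Release Inventory API",
                "description": "Facilities and releases reported to the TRI program.",
                "baseURL": "http://iaspub.epa.gov/enviro/efservice",
                "properties": [
                    {"type": "Swagger", "url": "https://example.github.io/epa/tri-swagger.json"}
                ]
            }
        ]
    }

    with open("apis.json", "w") as f:
        json.dump(apis_index, f, indent=2)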

When it comes to the next iteration of the EPA Envirofacts Data Service API, I'd focus on a simple, concise portal for supporting developers, complete with the common building blocks found in other leading API platforms. I would also focus on taking the API definitions I've created, and getting to work humanizing the design of these 13 APIs, and 400+ endpoints. Make the endpoints intuitive, and standardize your approach to querying and pagination based upon other leading approaches established by API architects. Then do the dirty work of humanizing the underlying definitions, field names, and descriptions. Think deeply about both the request and response structure, and make it speak to developers--your simple, intuitive portal, with the right building blocks, will provide a potential feedback loop for this cycle (if you do it right).

If you do this, then get to work generating some SDKs, setting up some monitors with API Science or Runscope, providing Postman Collections for your API consumers, and getting busy evangelizing that these APIs even exist--the APIs will get used. There is a lot of value present here; it just needs to be brought out, polished, and presented in a way that showcases the hard work going on at the EPA.



The Window To The API Economy For Everybody Else

API Evangelist has always been about helping onboard the masses with concepts involving APIs, something that takes a lot of work, because more often than not, an API is a very abstract concept, far removed from the everyday lives of normal people. When services like Zapier launched, I became very optimistic about how APIs can be put to work by the average individual, and today my optimism went up another notch, with the redesign of Blockspring.

The Blockspring home page image says it all in my opinion--valuable API resources, neatly available in Google Sheets, in Excel on the desktop, and in the cloud with Office 365.

This is it! This is the window to the API economy that will be required to take things to the next level. Most of the world's business is conducted through this interface; without drag-and-drop access to valuable API resources, the API economy is a non-starter.

I've been tracking on Blockspring for a while, and have been working with them to define machine readable API definitions in Swagger that they can use in their spreadsheet connector(s). However, their site redesign really struck a chord with me when it comes to the API Evangelist mission, something I wanted to take a moment to share.



Setting rel=api Into Motion With Latest APIs.json Release

Bruno Pedro (@bpedro), who has been building APIs.json into his API Changelog service, made a pull request to the specification recently, pushing forward the link relation conversation for APIs.json. As listed in the specification, we have long intended to make APIs.json an official media type:

3.5.  Media Type

It is intended that if there is sufficient traction, the media type "application/apis+json" will be submitted to IANA as per RFC: http://tools.ietf.org/html/rfc4288

However, when it came to expressing your APIs.json as a link relation, we didn't really have a plan in our roadmap, resulting in a very generic allocation of a link relation for APIs.json.

3.8. Link Relation

 In order for an API to reference its own description, it is recommended that it include header or in-line references to the APIs.json file using the existing described by link relation:

[Note, this is a generic link relation but seems to fit the bill]

What Bruno is suggesting is that we get a little more precise when it comes to our link relation, something the rest of the governing group for APIs.json agrees with. Here is the latest update to the link relation:

3.8. Link Relation 

In order for a Web site to reference its API description, it is recommended that it includes a header or in-line reference to the APIs.json resource using the api link relation, e.g.:

It is intended that if there is sufficient traction, the link relation "api" will be submitted to IANA as per RFC: http://tools.ietf.org/html/rfc5988

As with the media type, we intend to submit the link relation to IANA, per its RFC. Bruno's pull request sets in motion the formal link relation, but it also escalates the media type submission.

With the interest and usage we have seen in the first year of the specification, we are confident the API discovery format will get traction. We are already seeing exploration around the link relation, achieving an RSS-like experience in the browser when visiting websites that have an active API program.

I have added a link relation to my API Evangelist, Kin Lane, and Master API projects, all pointing to my central APIs.json, which provides an index of not just each of the APIs I use to operate the API Evangelist network, but also the supporting building blocks of API operations, like pricing, support, etc.
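
As a sketch of the RSS-like discovery this enables, a client could fetch a site's home page and look for the api link relation to locate its APIs.json index--the URL below is a placeholder, and this is just one way a browser extension or crawler might approach it:

    from html.parser import HTMLParser
    from urllib.request import urlopen

    class ApiLinkFinder(HTMLParser):
        # Collects href values from <link rel="api" href="..."> tags
        def __init__(self):
            super().__init__()
            self.api_links = []

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            if tag == "link" and attrs.get("rel") == "api" and "href" in attrs:
                self.api_links.append(attrs["href"])

    # Placeholder URL -- point this at any site advertising an APIs.json index
    html = urlopen("http://example.com/").read().decode("utf-8", errors="ignore")

    finder = ApiLinkFinder()
    finder.feed(html)
    print(finder.api_links)   # e.g. ['http://example.com/apis.json']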

Thanks, Bruno, for the pull request, which pushed out a minor release of the APIs.json spec, moving us to 0.15. This release primes the pump for a queue of APIs.json requests that are in the works, like more API property types, country of origin, and much more.