The API Evangelist Blog

This blog represents the thoughts I have while I'm researching the world of APIs. I share what I'm working on each week, and publish daily insights on a wide range of topics from design to deprecation, spanning the technology, business, and politics of APIs. All of this runs on Github, so if you see a mistake, you can either fix it by submitting a pull request, or let me know by submitting a Github issue for the repository.


Charles Proxy Generated HAR To OpenAPI Using API Transformer

I was responding to Jean-Philippe M.'s (@jpmonette) tweet regarding whether or not I had moved forward with my auto-generation of OpenAPIs from traffic captured by Charles Proxy. It is one of many features of my internal systems I have not gotten around to finishing, but thankfully he actually answered his own question, and found a better solution than even I had–using my friends over at API Transformer.

I had been exploring ways to speed up the process of generating OpenAPI specs for the APIs that I’m reviewing, something that becomes very tedious when working with large APIs, as well as when profiling the sheer number of APIs I am looking to profile as part of my work. I haven’t been profiling many APIs lately, but the approach Jean-Philippe M. came up with is pretty damn easy, leaving me feeling pretty silly that I hadn’t connected the dots myself.

Here is what you do. Fire up Charles Proxy:

Then open up Postman, and make any API calls. Of course you could also proxy mobile application or website API calls through your Charles Proxy, but Postman is a great way to make calls to a majority of the APIs I depend on.

After you’ve made the calls to all the APIs you are looking to generate an OpenAPI for, save your Charles Proxy session as a .har file, which is the last option on the dropdown menu available while saving. Then you head over to API Transformer and upload your .har file, and select OpenAPI (Swagger) 2.0 as the output–push convert.

API Transformer will then push a fresh OpenAPI to your desktop, or allow you to publish via a portal, and generate an SDK using APIMATIC. Automated (mostly) generation of OpenAPI definitions from API traffic you generate through your browser, Postman, Restlet Client, mobile application, or other tooling.
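
If you are curious what is actually happening during that conversion, here is a rough sketch in Python of the core idea, just walking the entries of a .har file and scaffolding a minimal OpenAPI (Swagger) 2.0 definition. This is only an illustration of the concept (the file name is a placeholder), not how API Transformer actually does it, and a real conversion would also capture headers, query parameters, and response schemas.

```python
import json
from urllib.parse import urlparse

def har_to_openapi(har_path, out_path="openapi.json"):
    """Scaffold a minimal OpenAPI (Swagger) 2.0 definition from a .har export."""
    with open(har_path) as f:
        har = json.load(f)

    spec = {"swagger": "2.0",
            "info": {"title": "Generated from HAR", "version": "1.0.0"},
            "paths": {}}

    for entry in har.get("log", {}).get("entries", []):
        request = entry["request"]
        response = entry["response"]
        url = urlparse(request["url"])
        spec.setdefault("host", url.netloc)
        method = request["method"].lower()
        path = spec["paths"].setdefault(url.path, {})
        # Record each method seen, with the status code that actually came back.
        path.setdefault(method, {"responses": {}})["responses"][str(response["status"])] = {
            "description": response.get("statusText") or "response captured from traffic"}

    with open(out_path, "w") as f:
        json.dump(spec, f, indent=2)

har_to_openapi("charles-session.har")  # placeholder file name
```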

I have abandoned my internal systems, except for my stack of APIs, and am depending mostly on 3rd party services like Charles Proxy, Postman, and API Transformer. So I won’t be moving forward with the custom solution I had developed. However, there still might be benefit in automatically saving .har files to my Dropbox sync folder, then using the Dropbox API and the API Transformer API to automate the conversion of .har files to OpenAPIs, writing them back to the appropriate Dropbox folder.


100K View Of Bot Space From The API Evangelist Perspective

I had a friend ask me for my thoughts on bots. It is a space I tend to rant about frequently, but isn’t an area where I’m moving forward with any meaningful research, yet it does seem to keep coming up and refuses to ever go away. I think bots are a great example of yet another thing that us technologists get all worked up about and think is the future, when in reality there will only be a handful of viable use cases, and bots will cause more harm than they will ever do good, or fully enjoy satisfactory mainstream adoption.

First, bots aren’t new. Second, bots are just automation. Sure, there will be some useful automation implementations, but more often than not, bots will wreak havoc and cause unnecessary noise. Conveniently though, no matter what happens, there will be money to be made deploying and defending against each wave of bot investment. Making bots is pretty representative of how technology is approached in today’s online environment. Lots of tech. Lots of investment. Regular waves. Not a lot of good sense.

Top Bot Platforms
Ok, where can you deploy and find bots today? These are the dominant platforms where I am seeing bots emerge:

  • Twitter - Building bots on the public social media platform using their API.
  • Facebook - Building Facebook messenger bots to unleash on the Facebook Graph.
  • Slack - Building more business and productivity focused bots on Slack.

There are other platforms like Telegram, and folks developing interesting Github bots, but these three platforms dominate the conversation when it comes to bots in 2017. Each platform brings its own tone when it comes to what bots are capable of doing, and who is developing the bots. Another important thing to note across these platforms is that Slack is really the only one working to own the bot conversation on their platform, while Facebook and Twitter allow the developer community to own the conversation about exactly what bots are.

Conversational Interfaces
When it comes to bots, and automation, I’m always left thinking more broadly about other conversational interfaces like Siri, or more specifically Amazon Alexa. The Amazon Alexa platform operates on a similar level to Slack when it comes to providing developers with a framework, and tooling to define and deliver conversational interfaces. Voice just happens to be the interface for Amazon, where the chat and messaging window is the interface for Slack, as well as Twitter and Facebook. Alexa is a bot, consuming API resources alongside the other popular definitions of what a bot is on messaging and social channels–expanding the surface area for how bots are deployed and engaged with in 2017.

Bots And APIs
To me, bots are just another client application for APIs. In the early days APIs were about syndicating content on the web, then they were used to deliver resources to mobile applications, and now they are delivering content, data, and increasingly algorithms to devices, conversational interfaces, signage, automobiles, home appliances, and on and on. When a user asks a bot a question, the bot is making one or many API calls to get the sports statistic, news or weather report, or maybe complete the purchase of a product. There will be many useful scenarios in which APIs will be able to deliver critical resources to conversational interfaces, but like many other client implementations, there will be many, many bad examples along the way.

Algorithmic Shift
In 2017, the API space is shifting gears from primarily data and content based APIs, to a more algorithmic focus. Artificial intelligence, machine learning, deep learning, cognitive, and other algorithmically fueled interfaces are emerging, wrapped in APIs, intent on delivering “smart” resources to the web, mobile, and conversational interfaces. We will continue to see an overwhelming amount of discussion at the intersection of bots, APIs, and AI in coming years, with very little in the way of actual results delivered–regardless, there will be lots of money to be made by a few, along the way. Algorithms will play a central role in ensuring the “intelligence” behind bots stays a black box, and sufficiently passes as at least magic, if not entirely passed off as comparable to human intelligence.

Where Will The Bot Money Be?
When it comes to making money with bots, there will only be a couple of value creation centers. First, the platforms where bots operate will do well (most of them)–I am not sure they all will generate revenue directly from bots, but they will ensure bots are driving value that is in alignment with platform revenue goals. Next, defensive bot solutions will generate sufficient amounts of revenue identifying and protecting businesses, institutions, and government agencies from the bot threat. Beyond that, venture capital folks will also do well investing in both the bot disruption, and bot defensive layers of the conversation–although VCs who aren’t directly involved with bot investment will continue to be duped by fake users, customers, and other bot-generated valuations, leaving bot blemishes on their portfolios.

Who Will Lose With Bots?
Ultimately it is the rest of us who will come out on the losing side of these “conversations”. Our already very noisy worlds will get even noisier, with more bot chatter in the channels we currently depend on daily. The number of humans we engage with on a daily basis will decrease, and the number of frustrating “conversations” we find ourselves stuck in will increase. Everything fake will continue to inflate, and find new ways to morph, duping many of us in new and exciting ways. Markets will be noisy, emotional, and always artificially inflated. Elections will continue to be just an outright bot assault on voters, leaving us exhausted, numb, and pretty moldable by those who have the biggest bot arsenals.

Some Final Thoughts On Bots
I am continuing to see interesting bots emerge on Twitter, Facebook, Slack, and other channels I depend on like Github. I have no doubts that bots and conversational solutions will continue to grow, evolve, and result in a viable ecosystem of users, service providers, and investors. However, I predict it will be very difficult for bots to ever reach an acceptable mainstream status. As we’ve seen in every important conversation we are having online today, some of the most badly behaved amongst us always seem to dominate any online conversation. Why is this? Bots. We will see this play out in almost every business sector.


Managing Platform Terms of Service In A Site Policy Repository

Github is releasing an update to their platform Terms of Service and Corporate Terms of Service. Guess what platform they are using to manage the evolution, and release of their terms of service? Github of course! They are soliciting feedback, along with clarifications and improvements to their terms of service, with an emphasis on helping make things more readable! #nice

Github has provided a deadline for everyone to submit comments by the end of the month, then they’ll spend about a week going through the comments before making any changes. It provides a pretty useful way for any platform to manage their terms of service in a way that gives the community a voice, and provides some observability for everyone else who might not feel confident enough to chime in. This can go a long way towards building trust with the community, even if they don’t directly participate in the process.

Managing terms of service using Github makes sense for all providers, not just Github. It provides an open, transparent, and participatory way to move forward one of the most important documents governing API consumption. It is logical that the drafting, publishing, and evolution of platform terms be done out in the open, where the community can watch and participate. Pushing forward the design of the legal document in sync with the design, deployment, management, SDKs, and other aspects of API operations. Bringing the legal side of things out of the shadows, and making it part of the conversation within the community.

Eventually, I’d like to see the terms of service, privacy policies, service level agreements, and other legal documents that govern API operations managed and available on Github like this. It gives the wider API community the chance to play a more significant role in hammering out the legal side of API operations, ensuring these documents are easier to follow and understand, and maybe even standardized across APIs. Who knows, maybe some day terms of service, privacy policies, and service level agreements will all be available in plain language, as well as machine readable YAML, shifting how the API contract will scale.


The Plivo Support Portal And Knowledge Base

I’m always watching out for how existing API providers are shifting up their support strategies in their communities as part of my work. This means staying in tune with their communications, which includes processing their email newsletters and developer updates. Staying aware of what is actually working, and what is not working, based upon active API service providers who are finding ways to make it all work.

Plivo opted to phase out direct emails at the end of the month, pushing developers to use the Plivo support portal and its ticketing system. The support portal provides a knowledge base, offering a layer of self-service support before any developer actually uses the support ticketing system to:

  • Create, manage, respond to and check the status of your support ticket(s)
  • Select improved ticket categories for more efficient ticket routing and faster resolution
  • Receive resolution suggestions from our knowledge base before you submit a ticket to help decrease resolution time

Email-only support isn’t always the most optimal way of handling support, and using a ticketing system definitely provides a nice trail to follow for both sides of the conversation. The central ticketing system also provides a nice source of content to feed into the self-service support knowledge base, keeping self-service support in sync with direct support activity.

I’m going to continue to track which API providers offer a ticketing solution, as well as a knowledge base. I’m feeling like these are the default support building blocks I’m going to recommend to new API providers, the ones that EVERY API platform should be starting with, covering the self-service and direct support requirements of a platform. I’m also going to start pushing one to three support solutions like ZenDesk, giving API providers some options when it comes to quickly delivering adequate support for their platforms.


More Investment In API Security

I’m getting some investment from ElasticBeam to turn up the volume on my API security research, so I will be telling more stories on the subject, and publishing an industry guide, as well as a white paper in coming weeks. I want API security to become a first class area of my API research, alongside definitions, design, deployment, management, monitoring, testing, and performance.

Much of my API security research is built on top of OWASP’s hard work, but honestly I haven’t gotten very far along in it. I’ve managed to curate a handful of companies who I’ve come across in my research, but haven’t had time to dive in deeper, or fully process all the news I’ve curated there. It takes time to stay in tune with what companies are up to, and I’m thankful for ElasticBeam’s investment to help me pay the bills while I’m heads down doing this work.

I am hoping that my API security research will also help encourage you to invest more into API security. As I do with my other partners, I will find ways of weaving ElasticBeam into the conversation, but my stories, guides, and white papers will be about the wider space–which ElasticBeam fits in. I’m hoping they’ll complement Runscope as my partner when it comes to monitoring, testing, and performance (see how I did that, I worked Runscope in too), adding the security dimension to these critical layers of operating a reliable API.

One thing that attracted me to conversations with ElasticBeam was that they were developing a solution that could augment existing API management solutions like 3Scale and Amazon Web Services. I’ll have a talk with the team about integrating with Tyk, DreamFactory, and Restlet–my other partners. Damn I’m good. I got them all in here! Seriously though, I’m thankful for these partners investing in what I do, and helping me tell more stories on the blog, and produce more guides and papers.

I feel like 3Scale has long represented what I’ve been doing for over seven years–a focus on API management. Restlet, DreamFactory, and Tyk represent the maturing and evolution of this layer. Runscope really reflects the awareness that has been generated at the API management layer, evolving to serve not just API providers, but also API consumers. I feel like ElasticBeam reflects the next critical piece of the puzzle, moving the API security conversation beyond the authentication and rate limiting of API management, or limiting the known threats, and making it about identifying the unknown threats our API infrastructure faces today.


The Most Important Aspect Of The API Discussion Is Learning To Think Outside Our Boxes

There are many good things that come out of doing APIs properly. Unfortunately there are also many bad things that can come out of doing APIs badly, or with misaligned expectations. It is easy to focus on the direct benefits of doing APIs, like making data resources available to partners, or maybe developing a mobile application. I prefer looking for the more indirect benefits, which are more human than they are ever technical.

As I work with different groups on a variety of API definitions and strategies, one very significant part of the process I see is people being forced to think outside their box. APIs are all about engaging around data, content, and algorithms on the web, with 3rd parties that operate outside your box. You are forced to look up, and outward a bit. Not everyone I engage with is fully equipped to do this, for a variety of reasons, but overall the API process does make folks just a little more critical than they are with even their websites.

The web has come with a number of affordances. Those same affordances aren’t always present in API discussions, forcing folks to have more conversations around why we are doing APIs (the answer shouldn’t always be yes), and discussing the finer details of not just storing your data and managing your schema, but doing it in a way that will play nicely with other external systems. You may be doing things one way internally, and it might even be working for you, but it is something that can only get better with each outside partner or consumer you are exposed to along your journey. Even with all of the internal politics I encounter in my API conversations, the API process always leaves me enjoying almost any outcome.


Does Your Platform Have An Integrations Page?

I’m continuing to come across more dedicated integration pages for the API platforms I’m test driving, and keeping an eye on. This time it is from the spreadsheet and database hybrid Airtable, which allows you to easily deploy an API complete with a portal, and has a pretty robust integrations page for their platform. Airtable’s dedicated integrations page is made easier since they use Zapier, which helps them aggregate over 750+ APIs for possible integration.

Airtable is pretty slick all by itself, but once you start wiring it up to some of the other API driven platforms we depend on, it becomes a pretty powerful tool for data aggregation, and then publishing as an API. I don’t understand why a Zapier-driven API integrations page isn’t default for every API platform out there. API consumption today isn’t just about deploying web or mobile applications, it is about moving data and content around the web–making sure it is where we need it, when we need it.

I’m playing with different variations of the API integrations page lately. I’m exploring how I can encourage some higher education and government open data folks I know to be Zapier advocates within their organizations, and publish a static integrations page, showing the integration solutions available around the platforms they depend on. Dedicated integration pages help API developers understand the potential of any API, and they help non-developers understand the potential too, but in a way they can easily put into action to solve problems in their world. I’m going to keep beating the API integration page drum, and now that Zapier has their partner API you will also hear me talking about Zapier a lot more.


Containerized Microservices Monitoring Driving API Infrastructure Visualizations

While I track what is going on with visualizations generated from data, I haven’t seen much that is new and interesting when it comes to API driven visualizations, or specifically visualizations of API infrastructure. This week I came across an interesting example in a post from Netsil about mapping microservices so that you can monitor them. It is a pretty basic visualization of each database, API, and DNS element in your stack, but it does provide a solid example of visualizing not just the deployment of database and API resources, but also DNS, and other protocols in your stack.

The Netsil microservices visualization is focused on monitoring, but I can see this type of visualization also being applied to design, deployment, management, logging, testing, and any other stop along the API lifecycle. I can see API lifecycle visualization tooling like this becoming more commonplace, and playing more of a role in making API infrastructure more observable. Visualizations are an important part of the storytelling around API operations that moves things beyond just IT and dev team monitoring, making it more observable by all stakeholders.

I’m glad to see service providers moving the needle when it comes to helping visualize API infrastructure. I’d like to see more embeddable solutions deployed to Github emerge as part of API life cycle monitoring. I’d like to see what full life cycle solutions are possible when it comes to my partners, like deployment visualizations from Tyk and Dreamfactory APIs, management visualizations with 3Scale APIs, and monitoring and testing visualizations using Runscope. I’ll play around with pulling data from these providers, and publishing it to Github as YAML, which I can then easily make available as JSON or CSV for use in some basic visualizations.
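
As a rough sketch of what I mean, assuming the data is a flat list of records in a YAML file (the file name is just a placeholder), converting it to JSON and CSV with Python and the PyYAML library looks something like this:

```python
import csv
import json
import yaml  # PyYAML

def publish_formats(yaml_path, json_path="data.json", csv_path="data.csv"):
    """Read a flat list of records from YAML and write JSON and CSV copies for visualizations."""
    with open(yaml_path) as f:
        records = yaml.safe_load(f) or []

    with open(json_path, "w") as f:
        json.dump(records, f, indent=2)

    # Use the union of keys across records as the CSV header.
    fieldnames = sorted({key for record in records for key in record})
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)

publish_formats("_data/api-monitoring.yaml")  # placeholder path in a Github repository
```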

If you think about it, there really should be a wealth of open source dashboard visualizations that could be embedded on any public or private Github repository, for every API service provider out there. API providers should be able to easily map out their API infrastructure, using any of the API service providers they are already using to operate their APIs. Think of some of the embeddable API status pages we see out there already, and what Netsil is offering for mapping out infrastructure, but something for every stop along the API life cycle, helping deliver visualizations of API infrastructure no matter which stop you find yourself at.


One API Development Partner Every API Provider Should Have

Yet another reason to be making sure Zapier is part of your API operations–issue management. Zapier is now providing an important window into how people are integrating with your API(s)–now any public API connected to Zapier can see filtered, categorized feedback from their users with Zapier Issues, and use that information to improve upon their APIs and integrations. This is the biggest movement I’ve seen in my API issues research since I first started it in April of 2016.

Zapier Issues doesn’t just provide you with a look at the issues that arise within API integrations (the bad news), it also provides you with a feedback loop where you can engage with Zapier users who have integrated with your API, and hear feature requests (the good news), and other road map influencing suggestions. Zapier sees, “thousands of app combinations and complex workflows from more than 1.5 million people—and we want to give you more insight into how your best customers use your app on Zapier.”

It is another pretty big reason that ALL API providers should be baking Zapier into their platforms. Not only will you be opening up API consumption to the average business user, you can now get feedback from them, and leverage the wisdom Zapier has acquired integrating with over 750 APIs. As an API provider you should be jumping at this opportunity to get this type of feedback on your API resources. It helps you make sure your APIs are more usable, stable, and reliable, and that they provide the solutions actual business users need to solve the problems they encounter in their daily lives.


Specialized Collections Of Machine Learning APIs Could Be Interesting

I was learning more about CODEX, from Algorithmia, their enterprise platform for deploying machine learning API collections on premise or in the cloud. Algorithmia is taking the platform that their algorithmic marketplace is deployed on and making it so you can deploy it anywhere. I feel like this is where algorithmic-centered API deployment is heading, potentially creating some very interesting, and hopefully specialized collections of machine learning APIs.

I talked about how the economics of what Algorithmia is doing interests me. I see the potential when it comes to supporting machine learning APIs that service an image or video processing pipeline–something I’ve enjoyed thinking about with my drone prototype. The drone is just one example of how specialized collections of machine learning APIs could become pretty valuable when they are deployed exactly where they are needed, either on-premise or in any of the top cloud platforms.

Machine learning marketplaces operated by the cloud giants will ultimately do fine because of their scale, but I think where the best action will be is in delivering curated, specialized machine learning models, tailored to exactly what people need, right where they need them–no searching necessary. I think recent moves by Google to put TensorFlow on mobile phones, and Apple making similar moves, show signs of a future where our machine learning APIs are portable, operating on-premise, on-device, and on-network.

I see Algorithmia having two significant advantages right now: 1) they can deploy their marketplace anywhere, and 2) they have the economics, as well as the scaling of it, figured out. This allows specialized collections of machine learning APIs to have the metering and revenue generation engines built into them. Imagine a future where you can deploy a machine learning and algorithmic API stack within any company or institution, on the factory floor in an industrial setting, or out in the field in an agricultural or mining situation–processing environmental data, images, or video.

Exploring the possibilities with real world use cases of machine learning is something I enjoy doing. I’m thinking I will expand on my drone prototype and brainstorm other interesting use cases beyond just my drone video. Thinking about how I can develop prototype machine learning API collections that could be used for a variety of my content, data, image, or video side-projects. I think when it comes to machine learning I’m more interested in specialty collections over the general machine learning hype I’m seeing peddled in the mainstream right now.


Diagramming The Components Of API Observability

I created a diagram of the politics of APIs some time ago that has really held true for me, and is something I’ve continued to reference as part of my storytelling. I wanted to do a similar thing to help me evolve my notion of API observability. Like the politics of APIs, observability overlaps many areas of my API life cycle research. Also like the politics of APIs, observability involves many technical, business, and legal aspects of operating a platform online today.

Here is my first draft of a Venn diagram beginning to articulate what I see as the components of API observability:

The majority of the API observability conversation in the API space currently centers around logging, monitoring, and performance–driven by internal motivations, but done in a way that is very public. I’m looking to push forward the notion of API observability to transcend the technical, and address the other operational, industry, and even regulatory concerns that will help bring observability to everyone’s attention.

I do not think we should always be doing API, AI, ML and the other tech buzzwords out there if we do not have to–saying no to technology can be done. In the other cases where the answer is yes, we should be doing API, AI, and ML in an observable way. This is my core philosophy. The data, content, algorithms, and networks we are exposing using APIs, and using across web, mobile, device, and network applications, should be observable by internal groups, as well as partners, and public stakeholders as it makes sense. There will be industry, community, and regulatory benefits for sectors that see observability as a positive thing, go beyond just the technical side of observability, and work to be more observable in all the areas I’ve highlighted above.


HTTP Status Codes Are An Essential Part Of API Design And Deployment

It takes a lot of work to provide a reliable API that people can depend on. Something your consumers can trust, and that will provide them with consistent, stable, meaningful, and expected behavior. There are a lot of affordances built into the web, allowing us humans to get around, and make sense of the ocean of information on the web today. These affordances aren’t always present with APIs, and we need to communicate with our consumers through the design of our API at every turn.

One area I see IT and developer groups often overlook when it comes to API design and deployment is HTTP Status Codes. That standardized list of meaningful responses that come back with every web and API request:

  • 1xx Informational - An informational response indicates that the request was received and understood. It is issued on a provisional basis while request processing continues.
  • 2xx Success - This class of status codes indicates the action requested by the client was received, understood, accepted, and processed successfully.
  • 3xx Redirection - This class of status code indicates the client must take additional action to complete the request. Many of these status codes are used in URL redirection.
  • 4xx Client errors - This class of status code is intended for situations in which the client seems to have errored.
  • 5xx Server error - The server failed to fulfill an apparently valid request.

Without HTTP Status Codes, applications won’t ever really know if their API request was successful or not, and even if an application can tell there was a failure, it will never understand why. HTTP Status Codes are fundamental to the web working with browsers, and APIs working with applications. HTTP Status Codes should never be left on the API development workbench, and API providers should always go beyond just 200 and 500 for every API implementation. Without them, NO API platform will ever scale, and support any number of external integrations and applications.
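
To make that concrete, here is a minimal sketch of an API that goes beyond just 200 and 500, using Flask and a hypothetical /products resource, returning status codes that actually tell the client what happened:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
products = {"1": {"id": "1", "name": "Widget"}}

@app.route("/products/<product_id>", methods=["GET"])
def get_product(product_id):
    product = products.get(product_id)
    if product is None:
        # 404 tells the client the resource does not exist, not that the server broke.
        return jsonify({"error": "product not found"}), 404
    return jsonify(product), 200  # 2xx - the request succeeded

@app.route("/products", methods=["POST"])
def create_product():
    body = request.get_json(silent=True)
    if not body or "name" not in body:
        # 400 - the client sent something the API cannot work with.
        return jsonify({"error": "a name is required"}), 400
    product_id = str(len(products) + 1)
    products[product_id] = {"id": product_id, "name": body["name"]}
    # 201 - something new was created as a result of the request.
    return jsonify(products[product_id]), 201

if __name__ == "__main__":
    app.run()
```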

The most important example of the importance of HTTP Status Codes I have in my API developer toolbox is when I was working to assist federal government agencies in becoming compliant with the White House’s order for all federal agencies to publish a machine readable index of their public data inventory on their agency website. As agencies got to work publishing JSON and XML (an API) of their data inventory, I got to work building an application that would monitor their progress, indexing the available inventory, and providing a dashboard that the GSA and OMB could use to follow their progress (or lack thereof).

I would monitor the dashboard in real time, but weekly I would also go through many of the top level cabinet agencies, and some of the more prominent sub agencies, and see if there was a page available in my browser. There were numerous agencies who I found had published their machine readable public data inventory, but had returned a variety of HTTP status codes other than 200, resulting in my monitoring application considering the agency not compliant. I wrote several stories about HTTP Status Codes, which the GSA and White House groups circulated with agencies, but ultimately I’d say this stumbling block was one of the main reasons that caused this federated public data API project to stumble early on, and never gain proper momentum–a HUGE loss to an open and more observable federal government. ;-(

HTTP Status Codes aren’t just a nice-to-have thing when it comes to APIs, they are essential. Without HTTP Status Codes each application will deliver unreliable results, and aggregate or federated solutions that are looking to consume many APIs will become much more difficult and costly to develop. Make sure you prioritize HTTP Status Codes as part of your API design and deployment process. At the very least make sure all five classes of HTTP Status Codes are present in your release. You can always get more precise and meaningful with specific HTTP status codes later on, but ALL APIs should be employing all five classes of HTTP Status Codes by default, to prevent friction and instability in every application that builds on top of your APIs.


Writing API Stories That Speak To But Also Influences Their View Of Technology

I know that some of my friends who follow API Evangelist shake their heads when I talk about API business models, partner programs, and many of the business sides of API operations. Much of my work will have an almost delusional attraction towards the concept of an API. Heavily doused in a belief in technology as a solution. This isn’t accidental. This is API Evangelist. A persona I have developed to help me make a living, and help influence where we go (or don’t go) with technology.

I am delusional enough to think I can influence change in how the world uses technology. I’m a borderline megalomaniac, but there really is not sufficient ego to get me quite all the way there. While still very, very, very minor, I feel I have influenced where technology has flowed over my seven years as the API Evangelist. Even if it is just slowing the speed (seconds) at which the machines turn on us, and kill us all. If nothing else, I know there are a few folks out there who I have touched, and shaped how they see, use, and allow technology in their lives (cause they told me so).

Through my storytelling on API Evangelist, I am always looking for the next convert–even if it takes years and hundreds of stories. A significant portion of this outreach involves telling stories that reach my intended audience–usually startups, business, institutional, and government agency workers and influencers. To reach them I need to tell stories that speak to them, and feed their current goals around finding success in their startup, or their role within businesses, institutions, and government agencies. With this in mind, I am always trying to bend my stories in their direction, talking about topics that they’ll care about, and tune into.

Once I have their attention, I will work on them in other ways. I’ll help them think about their business model, but also help them understand transparency and communication when it comes to executing this model. I will help them understand the best practices for managing an API using open source solutions like Tyk or Dreamfactory, and the leading approaches to using Runscope for monitoring and testing, while also encouraging them to be more observable with these practices. Making sure companies tell stories about what they are doing, and how they are doing it all–the good and bad.

I’m always working to build bridges to folks who might not see this whole API thing like I do. I’d say that many of these bridges will never get fully walked across by my target audience, but when someone does, and my stories influence the way they see or use technology even a little bit–mission accomplished. I’m constantly testing new ways of reaching out, speaking in the language of my target audience (without selling out), using trendy terms like microservices, devops, and serverless, but this isn’t just about following the latest fad. It is meant to capture your attention, build some trust, and then when it matters I can share some information about what really matters in all of this–in hopes of influencing how you see technology, and how it can be used a little more sensibly, securely, or maybe not even at all. ;-)


Bot Observability For Every Platform

I lightly keep an eye on the world of bots, as APIs are used to create them. In my work I see a lot of noise about bots, usually in two main camps: 1) pro-bot - bots are the future, and 2) anti-bot - they are one of the biggest threats we face on the web. This is a magical marketing formula, which allows you to sell products to both sides of the equation, making money off of bot creation, as well as bot identification and defense–it is beautiful (if you live by disruption).

From my vantage point, I’m wondering why platforms do not provide more bot observability as a part of platform operations. There shouldn’t have to be services that tell us which accounts are bots; the platform should tell us by default which users are real and which are automated (you know you know). Platforms should embrace automation, providing services and tooling to assist in their operation, which includes actual definitions of what is acceptable, and what is unacceptable bot behavior. Then actually policing this behavior, and being observable in your actions around bot management and enforcement.

It feels like this is just another layer of technology that is being bastardized by the money that flows around technology so easily. Investment in lots of silly, useless bots. Investment in bot armies that inflate customer numbers, advertising, and other ways of generating attention (to get investment), and generate revenue. It feels like Slack is the only leading bot platform that has fully embraced the bot conversation. Facebook and Twitter lightly reference the possibilities, and have made slight motions when it comes to managing the realities of bots, but when you Google “Twitter Bots” or “Facebook Bots”, neither of them dominate the conversation around what is happening–which is very telling about how they view the world of bots.

Slack has a formal bots directory, and has defined the notion of a bot user, separating them from users–setting an example for bot developers to disclose who is a bot, and who is not. They talk about bot ethics, and rules for building bots, and do a lot of storytelling about their vision for bots. Providing a pretty strong start towards getting a handle on the explosion of bots on their platform–taking the bull by the horns, owning the conversation, and setting the tone.

I’d say that Slack has a clearer business model for bots–not that people are actually going to pay for your bot (they aren’t), but a model is present. You can smell some revenue strategies on Facebook, but it just feels like all roads lead to Facebook, and advertising partners there. I’d say Twitter has no notion of a bot business model for developers. This doesn’t mean that Facebook and Twitter bots don’t generate revenue for folks targeting Facebook and Twitter, or play a role in influencing how money flows when it comes to eyeballs and clicks. Indirectly, Twitter and Facebook bots are making folks lots of money, it is just that the platforms have chosen not to be observable when it comes to their bot practices and ecosystems.

Platform observability makes sense not just for the platform and bot developers; as Slack demonstrates, it makes sense for end-users. Incentivizing bots that generate value, instead of mayhem. I’m guessing advertising-driven Facebook and Twitter have embraced the value of mayhem–with advertising being the framework for generating their revenue. Slack has more of a product, with customers they want to make happy. With Facebook and Twitter the end-users are the product, so the bot game plays to a different tune.


Making All Sub-Resources Available Within The Core Set Of Human Service APIs

I had recently taken the Human Services Data Specification (HSDS) and exposed it as a set of API paths that provide access to about 95% of the schema, which we are calling the Human Services Data API (HSDA). When you make a call to the /organizations/ path, you receive an array of organizations that each match the HSDA organization schema. The same applies when you make a call to /locations, /contacts, and /services, opening up access to the entire schema–minus three objects I pushed off until future releases.

After the core set of API paths /organization, /service, /location, and /contact, there is a set of sub-resources available across those as it makes sense–including /phone, /programs, /physical_address, /postal_address, /regular_schedule, /holiday_schedule, /funding, /eligibility, /service_area, /required_document, /payment_accepted, /language, /accessiblity_for_disabilities, and /service_at_location_id. I took the HSDA schema, and published API paths for each sub-resource so that each exactly returned and accepted HSDA compliant schema–making all aspects of the schema accessible via an API, with POST and PUT requests accepting compliant schema, and GET returning compliant schema.
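
To make the current access pattern concrete, here is a rough sketch of what a client has to do to assemble a complete picture of a single organization, one call to the core resource, and then an additional call for each sub-resource it cares about. The base URL, the organization id, and the exact sub-resource path pattern shown here are placeholders, not the official HSDA definition:

```python
import requests

BASE = "https://api.example.com"  # placeholder base URL

def get_full_organization(organization_id):
    """Assemble a complete organization record from the core path plus each sub-resource."""
    organization = requests.get(f"{BASE}/organizations/{organization_id}/").json()

    # Each sub-resource currently requires its own round trip.
    for sub_resource in ["services", "locations", "contacts", "phones", "programs", "fundings"]:
        response = requests.get(f"{BASE}/organizations/{organization_id}/{sub_resource}/")
        organization[sub_resource] = response.json() if response.status_code == 200 else []

    return organization

print(get_full_organization("example-organization-id"))
```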

One of the “stoppers” we received from several folks in the HSDS community during the feedback cycle going from version 1.0 of the API to version 1.1 was that the design was overly complex, and that it wouldn’t serve any of the human services use cases on the table currently unless you could get at all the sub-resources directly with each core API path, eliminating the need to make additional call(s) to each sub-resource. That way you could get at everything about an /organization, /service, /location, or /contact with a single API URL.

Currently the core four API paths accept and return the following schema:

Organization

Field Name Type (Format) Description Required? Unique?
id string (uuid) Each organization must have a unique identifier. True True
name string The official or public name of the organization. True False
alternate_name string Alternative or commonly used name for the organization. False False
description string A brief summary about the organization. It can contain markup such as HTML or Markdown. True False
email string (email) The contact e-mail address for the organization. False False
url string (url) The URL (website address) of the organization. False False
tax_status string Government assigned tax designation for tax-exempt organizations. False False
tax_id string A government issued identifier used for the purpose of tax administration. False False
year_incorporated date (%Y) The year in which the organization was legally formed. False False
legal_status string The legal status defines the conditions that an organization is operating under; e.g. non-profit, private corporation or a government organization. False False

Service

Field Name Type (Format) Description Required? Unique?
id string Each service must have a unique identifier. True True
organization_id string The identifier of the organization that provides this service. True False
program_id string The identifier of the program this service is delivered under. False False
name string The official or public name of the service. True False
alternate_name string Alternative or commonly used name for a service. False False
description string A description of the service. False False
url string (url) URL of the service False False
email string (email) Email address for the service False False
status string The current status of the service. True False
interpretation_services string A description of any interpretation services available for accessing this service. False False
application_process string The steps needed to access the service. False False
wait_time string Time a client may expect to wait before receiving a service. False False
fees string Details of any charges for service users to access this service. False False
accreditations string Details of any accreditations. Accreditation is the formal evaluation of an organization or program against best practice standards set by an accrediting organization. False False
licenses string An organization may have a license issued by a government entity to operate legally. A list of any such licenses can be provided here. False False
taxonomy_ids string (Deprecated) A comma separated list of identifiers from the taxonomy table. This field is deprecated in favour of using the service_taxonomy table. False False

Location

Field Name Type (Format) Description Required? Unique?
id string Each location must have a unique identifier True False
organization_id string Each location entry should be linked to a single organization. This is the organization that is responsible for maintaining information about this location. The identifier of the organization should be given here. Details of the services the organisation delivers at this location should be provided in the services_at_location table. False False
name string The name of the location False False
alternate_name string An alternative name for the location False False
description string A description of this location. False False
transportation string A description of the access to public or private transportation to and from the location. False False
latitude number Y coordinate of location expressed in decimal degrees in WGS84 datum. False False
longitude number X coordinate of location expressed in decimal degrees in WGS84 datum. False False

Contact

Field Name Type (Format) Description Required? Unique?
id string Each contact must have a unique identifier True False
organization_id string The identifier of the organization for which this is a contact False False
service_id string The identifier of the service for which this is a contact False False
service_at_location_id string The identifier of the ‘service at location’ table entry, when this contact is specific to a service in a particular location. False False
name string The name of the person False False
title string The job title of the person False False
department string The department that the person is part of False False
email string (email) The email address of the person False False

To ensure that all sub-resource area available as part of each of the requests and responses for all core API paths, we are going to have to evolve the HSDS schema to be:

Organization

Field Name Type (Format) Description Required? Unique?
id string (uuid) Each organization must have a unique identifier. True True
name string The official or public name of the organization. True False
alternate_name string Alternative or commonly used name for the organization. False False
description string A brief summary about the organization. It can contain markup such as HTML or Markdown. True False
email string (email) The contact e-mail address for the organization. False False
url string (url) The URL (website address) of the organization. False False
tax_status string Government assigned tax designation for tax-exempt organizations. False False
tax_id string A government issued identifier used for the purpose of tax administration. False False
year_incorporated date (%Y) The year in which the organization was legally formed. False False
legal_status string The legal status defines the conditions that an organization is operating under; e.g. non-profit, private corporation or a government organization. False False
services array Returns a collection of services for each organization False False
locations array Returns a collection of locations for each organization False False
contacts array Returns a collection of contacts for each organization False False
phones array Returns a collection of phones for each organization False False
programs array Returns a collection of programs for each organization False False
fundings array Returns a collection of fundings for each organization False False

Service

Field Name Type (Format) Description Required? Unique?
id string Each service must have a unique identifier. True True
organization_id string The identifier of the organization that provides this service. True False
program_id string The identifier of the program this service is delivered under. False False
name string The official or public name of the service. True False
alternate_name string Alternative or commonly used name for a service. False False
description string A description of the service. False False
url string (url) URL of the service False False
email string (email) Email address for the service False False
status string The current status of the service. True False
interpretation_services string A description of any interpretation services available for accessing this service. False False
application_process string The steps needed to access the service. False False
wait_time string Time a client may expect to wait before receiving a service. False False
fees string Details of any charges for service users to access this service. False False
accreditations string Details of any accreditations. Accreditation is the formal evaluation of an organization or program against best practice standards set by an accrediting organization. False False
licenses string An organization may have a license issued by a government entity to operate legally. A list of any such licenses can be provided here. False False
taxonomy_ids string (Deprecated) A comma separated list of identifiers from the taxonomy table. This field is deprecated in favour of using the service_taxonomy table. False False
contacts array Returns a collection of contacts for each service. False False
phones array Returns a collection of phones for each service. False False
regular_schedules array Returns a collection of regular schedules for each service. False False
holiday_schedules array Returns a collection of holiday schedules for each service. False False
fundings array Returns a collection of fundings for each service. False False
eligibilities array Returns a collection of eligibilities for each service. False False
service_areas array Returns a collection of service areas for each service. False False
required_documents array Returns a collection of required documents for each service. False False
payments_accepted array Returns a collection of payments accepted for each service. False False
languages array Returns a collection of languages for each service. False False

Location

Field Name Type (Format) Description Required? Unique?
id string Each location must have a unique identifier True False
organization_id string Each location entry should be linked to a single organization. This is the organization that is responsible for maintaining information about this location. The identifier of the organization should be given here. Details of the services the organisation delivers at this location should be provided in the services_at_location table. False False
name string The name of the location False False
alternate_name string An alternative name for the location False False
description string A description of this location. False False
transportation string A description of the access to public or private transportation to and from the location. False False
latitude number Y coordinate of location expressed in decimal degrees in WGS84 datum. False False
longitude number X coordinate of location expressed in decimal degrees in WGS84 datum. False False
phones array Returns a collection of phones for each location. False False
physical_addresses array Returns a collection of physical addresses for each location. False False
postal_addresses array Returns a collection of postal addresses for each location. False False
regular_schedules array Returns a collection of regular schedules for each location. False False
holiday_schedules array Returns a collection of holiday schedules for each location. False False
languages array Returns a collection of languages for each location. False False
accessiblity_for_disabilities array Returns a collection of accessiblity_for_disabilities for each location. False False

Contact

Field Name Type (Format) Description Required? Unique?
id string Each contact must have a unique identifier True False
organization_id string The identifier of the organization for which this is a contact False False
service_id string The identifier of the service for which this is a contact False False
service_at_location_id string The identifier of the ‘service at location’ table entry, when this contact is specific to a service in a particular location. False False
name string The name of the person False False
title string The job title of the person False False
department string The department that the person is part of False False
email string (email) The email address of the person False False
phones array Returns a collection of phones for each contact False False

Once we add all relevant sub-resources as arrays to the HSDS schema, we can allow API consumers to POST, PUT, or GET using as little or as much of the schema as they need, using the path, headers, or parameters. This allows for reading and writing HSDS at a granular level, or everything at once using a single path.
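
Here is a rough sketch of what that could look like for an API consumer. The base URL and organization id are placeholders, and the include parameter is just one hypothetical way of filtering the schema, not something that is part of the current HSDA definition:

```python
import requests

BASE = "https://api.example.com"  # placeholder base URL

# Everything at once: the core organization record with all sub-resource arrays embedded.
organization = requests.get(f"{BASE}/organizations/example-organization-id/").json()
print(organization["name"], len(organization.get("services", [])), "services")

# As little as needed: a hypothetical include parameter narrows the response to just
# the sub-resources the client cares about.
partial = requests.get(
    f"{BASE}/organizations/example-organization-id/",
    params={"include": "locations,contacts"},
).json()
print(partial.get("locations", []))
```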

Next we need to consider this updated schema as part of the 1.2 release of HSDS. If we update HSDA to allow for filtering the schema across /organization, /service, /location, and /contact, and return each sub-resource as part of the API request or response, it will be out of sync with HSDS. Ideally, both HSDS, and HSDA move forward in sync with each version. I’m curious why this expanded schema became such an issue once we got to the API phase–it seems like it should have been part of v1.1 of HSDS, making the schema drive the API instead of the other way around.

I’m guessing that these concerns about schema don’t come into focus until we start talking about access to the schema, and data. Making the separation and relationships between HSDS and HSDA all the more important, providing a framework to move the schema forward in a way that is rooted in how it will actually be accessed. Which is why we do APIs…


Learning More About Amazon Alexa's Approach to APIs And Skills Development

I have had Amazon Alexa in my crosshairs for some time now. I regularly digest stories about what Amazon is up to with Alexa, but haven’t had the time to think deeply about voice enablement, and their approach to developing what they call “skills”. I’m not 100% convinced voice enablement is the future of human compute interfaces, but I do see the role they can play in some situations, for some people. Plus, all the actions involved with Alexa and its ecosystem are driven using APIs, which will almost always make me perk up, and pay a little more attention–I have a serious problem.

The Amazon Alexa platform centers around two specific areas of development:

  • Alexa Voice Service - The actual voice enablement, and baking Alexa voice into applications, devices, your home, car, and other physical objects in our world.
  • Alexa Skills Kit - The things that you can say to your Alexa that will trigger specific actions, which make calls to APIs, and return something useful (or not).

It’s all about baking the Alexa Voice Service into as many devices as you possibly can, and developing the catalog of skills that the voice enabled application can put to use. Ok. Well, my next question(s) are 1) what is a skill, and 2) what can you actually do with skills? Amazon provides some resources to help with the basics:

Ok, that helps me grasp their definition of a skill a little bit, and how it delivers their view of a voice enabled user interaction. Next, what can a skill really do? Or, what types of “skills” does Amazon want you to build? They start with the lofty perspective of “anything”, or “custom skills”, to hook us technologists who like to think at this level and fill in the gaps with our magical technological skills–a fundamental building block of API culture.

  • Look up information from a web service
  • Integrate with a web service to order something (order a car from Uber, order a pizza from Domino’s Pizza)
  • Interactive games

This is API Evangelism 101. You start with everything and anything is possible, and work your way down from there. After lighting the imagination with custom skills, the focus is on a couple of specific types of actions, serving very specific purposes:

Now the concept of a skill comes into focus a little more for me. Amazon really wants to encourage developers to develop features that deliver in a home environment, entertaining us. Alexa performance and entertainment skills. So far I have just pulled references from Amazon Alexa’s documentation; if you want to see what has been developed by the community, you can head over to the Alexa skills catalog. I also recommend checking out a pretty robust 3rd party skills list, which gives a view of skills from the outside-in.
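
To ground the “look up information from a web service” type of custom skill mentioned above, here is a rough sketch of what a skill handler can look like as an AWS Lambda function. The intent name, slot name, and web service URL are all made up for the illustration, and a real skill would also handle launch requests, sessions, and error cases:

```python
import json
import urllib.parse
import urllib.request

def lambda_handler(event, context):
    """Handle a custom Alexa skill request by calling a web service and speaking the result."""
    request = event["request"]

    if request["type"] == "IntentRequest" and request["intent"]["name"] == "GetWeatherIntent":
        # The intent and slot names here are made up for the illustration.
        city = request["intent"]["slots"]["City"]["value"]
        url = "https://api.example.com/weather?city=" + urllib.parse.quote(city)
        with urllib.request.urlopen(url) as response:  # placeholder web service
            weather = json.loads(response.read())
        speech = "It is currently {} in {}.".format(weather["summary"], city)
    else:
        speech = "Sorry, I do not know how to help with that yet."

    # Minimal Alexa Skills Kit response: speak the text and end the session.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": True,
        },
    }
```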

Alexa Voice Service and Skills Kit are the two core services, but when you browse the documentation, you see there is a 3rd area given just as much prominence–the Alexa Fund, which “provides up to $100 million in investments to fuel voice technology innovation”. This is an important aspect of the skills development conversation, an opportunity to get funding to support the creation of skills. While Slack doesn’t use the word skills, they also have a fund for investing in conversational interfaces (messaging, chat, and bot). One thing to note here: Amazon has an additional rewards program where developers can earn rewards when developing game skills specifically for the Alexa platform–providing another glimpse into their strategy.

I am writing this post to support our Contrafabulists podcast, but I’m also doing it to feed my wider voice research as the API Evangelist. So I have to highlight some of the common building blocks of the Alexa approach to API management, helping me better understand Amazon’s approach to this set of API resources.

  • Glossary - https://developer.amazon.com/public/solutions/alexa/alexa-skills-kit/docs/alexa-skills-kit-glossary
  • Forum - http://forums.developer.amazon.com/forums/category.jspa?categoryID=60
  • FAQs - https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/docs/alexa-voice-service-developer-preview
  • Blog - https://developer.amazon.com/blogs/alexa/tag/AVS
  • Terms of Service - https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/support/terms-and-agreements

Beyond the common building blocks for operating their developer portal and supporting their APIs, they have some interesting design elements available for developers, helping direct Alexa developers to develop skills and voice-enabled applications that fit in with their objectives:

  • Designing for AVS - https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/designing-for-the-alexa-voice-service
  • Functional Design Guide - https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/alexa-voice-service-functional-design-guide
  • UX Design Guidelines - https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/alexa-voice-service-ux-design-guidelines
  • Marketing Brand Guidelines - https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/marketing-brand-guidelines

These design elements tell an interesting story regarding how Amazon is aligning the concept of skills development with their voice enabled API strategy. They also provide an interesting approach to design guides that other API providers might want to consider. After the design guides, Amazon provides some interesting code and hardware to help developers, providing starter kits for getting going with skills development, as well as actual physical voice integration:

  • Projects and Sample Code - https://github.com/alexa/alexa-avs-sample-app
  • API and Reference - https://developer.amazon.com/public/solutions/alexa/alexa-voice-service/content/avs-api-overview
  • Development Kits for AVS - https://developer.amazon.com/dev-kits
  • Development Kits (Hardware) - https://developer.amazon.com/alexa-voice-service/dev-kits

There are three different stories going on here for me which I think are relevant to how we think about, and approach our usage of technology. I’m really interested in Amazon Alexa because of:

  • Voice Enablement - How do you enable voice applications using APIs? What role will voice play in the wider conversational API landscape?
  • Skills Concept - I am fascinated by the concept of a skill and how it applies not just to voice enablement and conversational interfaces, but also to representing a unit of compute or a transaction that occurs in API land (serverless, microservices, containers, etc.)
  • API Management - How do you manage a set of APIs that serve conversational interfaces and voice-enablement? What is different than regular API management, and what can others learn from their approach?

As I said in the opening, I’m not 100% convinced that voice interfaces will be the future for everyone. I depend on the intimacy that exists between my fingers and the keyboard to make the magic happen each day–which might be a little noisy, but it doesn’t involve me rambling on, talking to a device. Even though I’m not into voice interfaces, and not much of a talker, I am interested in the motivations behind, and the approach that Amazon is taking with, their conversational interfaces. It is something I’m comparing with my research into Twitter, Slack, as well as Facebook. These companies are investing a lot into their ecosystems–you can see the signs of it over at Amazon with the 137 job openings for their Alexa team.

Audrey and I are going to talk about Amazon Alexa on our Contrafabulists podcast this week, so you can tune in to get more thoughts of mine regarding Alexa, and voice enablement APIs. I’ll probably continue the exploration of my thoughts about this approach to interfaces, and particularly the development of “skills”, which I think has an interesting overlap with APIs, and the modularization we are seeing as a result of compute (i.e. containers, microservices, DevOps), as well as the impact APIs are having on labor (i.e. Uber, Mechanical Turk, TaskRabbit). I feel like there is a lot more going on here than just developing fun skills for having conversations with Alexa in your home.


Quantifying The Difference Between Human Services Data Specification (HSDS) And Its API

To help quantify the move from version 1.0 to 1.1 of the Human Services Data API (HSDA) definition I took the existing Ohana API and created an OpenAPI definition to describe what was present in version 1.0 of the HSDA. Then I took version 1.1 of the Human Services Data Specification (HSDS) and made sure as much of HSDS as possible was returned as part of API responses, as well as allowing adding, updating, and deleting across the schema.

During the vendor API review portion of our process I took the documentation for four of the vendors’ APIs and created an OpenAPI definition for each of them. I then laid all the vendor OpenAPIs alongside the current draft I had of the HSDA definition, and considered each path, the parameters, body, and responses for inclusion as part of the HSDA definition. This allowed me to take into account the existing vendor API implementations that are already serving human services implementations.

OpenAPI plays a central role in defining what is, and what might be, while opening up a forum for having a conversation about the specific details of the HSDS/A definition. I’m using OpenAPI to establish a definition of what both HSDS and HSDA are. It will be the contract that gets hammered out as part of the Open Referral governance process, so you will see me use it regularly to articulate specific aspects of what is going on. With this in mind, I’d like to use a distilled OpenAPI, articulating just a single API path for GET /organizations.

I won’t go into too much detail on the OpenAPI–I recommend learning more about the specification on the GitHub repository, and at the OpenAPI Initiative (OAI). What I’d like to do with this story is help quantify the separation and connection between the Human Services Data Specification (HSDS) and the Human Services Data API (HSDA), using this single OpenAPI, describing a single HSDA path–organizations.

When you take the schemes: located at line 7, and combine it with host: at line 5, basePath: at line 6, and the path for /organizations/ at line 12, you get http://api.open.referral.adopta.agency/organizations/, which when you load it in a browser will give you a JSON listing of many organizations. Lines 13-29 describe how to make an API request, and lines 30-36 describe what you can expect as a response.

Lines 1-38 are HSDA, and lines 42-71 are HSDS. Line 36 is the link between HSDA and HSDS, providing a reference that binds the API request with the API response. HSDS is the valid schema being returned–HSDA is not the schema, it is the surface of the API that lets you send, and in this case receive, valid HSDS. This is a line we honestly haven’t had the level of detail, or even the acronyms, to articulate at this level before now. So don’t worry if what I said doesn’t quite make sense–it will come. ;-)
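
To make that a bit more concrete, here is a heavily abbreviated sketch of what that distilled definition looks like–trimmed down for this post, so the line numbers I reference above point at the complete version, and the handful of HSDS fields shown on the organization definition are just illustrative:

    swagger: '2.0'
    info:
      title: Human Services Data API (HSDA)
      version: '1.1'
    host: api.open.referral.adopta.agency
    basePath: /
    schemes:
      - http
    paths:
      /organizations/:
        get:
          summary: Get a list of organizations
          parameters:
            - name: query
              in: query
              description: A text search against organizations
              required: false
              type: string
          responses:
            '200':
              description: A listing of organizations
              schema:
                type: array
                items:
                  $ref: '#/definitions/organization'
    definitions:
      organization:
        properties:
          id:
            type: string
          name:
            type: string
          description:
            type: string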

The first reason I’m writing this story is to help myself better articulate the difference between HSDS, and HSDA, and the relationship between them. The second portion is to help other folks participating in the HSDS/A governance conversation see the separate layers between the schema and API, but also understand how they work together. I’d say a third portion is about helping folks understand the value of using OpenAPI to facilitate these types of conversations.

It might take a couple of times reading this post and/or having a conversation directly with me about OpenAPI, but there is nothing in this OpenAPI definition that any engaged user–developer or non-developer–can’t learn to work with. I’ll keep producing simple lessons like this to better articulate aspects of the HSDS/A contract using OpenAPI. I’d love to hear any feedback on how I can better articulate the great work we are doing around the Human Services Data Specification and its API.


Moving The Human Services API Specification From Version 1.1 to 1.2

I am preparing for the recurring governance meeting for the Open Referral Human Services Data API standard–which I’m the technical lead for. I need to load up every detail of my Human Services Data API work into my brain, and writing stories is how I do this. I need to understand where the definition is with v1.1, and encourage discussion around a variety of topics when it comes to version 1.2.

Constraints From Version 1.0 To Version 1.1

I wasn’t able to move as fast as I’d like from 1.0 to 1.1, resulting in me leaving out a number of features. The primary motivation was to make sure as much of version 1.1 of the Human Services Data Specification (HSDS) was covered as possible–something I ended up doing horizontally with new API paths, rather than loading up the core paths of /organizations, /locations, and /services. There were too many discussions on the table regarding the scope and filtering of data and schema for these core paths, something which led to a discussion about /search–resulting in me pushing off API design decisions about how to expand vertically at the core API path level to future versions.

There were just too many decisions to make at the API request and response level for me to settle them all on my own–warranting more discussion. Additionally, there were other API design discussions regarding operational, validation, and other utility APIs to consider for inclusion in future versions, expanding the scope and filtering discussions to the API path, and now the API project, level. In preparation for our regular governance meeting I wanted to run through all of the open API design issues, as well as additional projects the community needs to be thinking about.

API Design

As part of my Human Services Data API (HSDA) work we have opened up a pretty wide API design conversation regarding where the API definition could (should) be going. I’ve tried to capture the conversations going on across Slack and the Google Group using GitHub issues on the HSDA GitHub repository. I will be focusing in on 16 of these issues for the current community discussions.

Versioning

We are moving forward the version of the API specification from 1.0 to 1.1. This version describes the API definition, to help quantify the compliance of any single API implementation. This is not guidance regarding how API providers should version their API–each implementation can articulate their compliance using an OpenAPI definition, or just in operation by being compliant. I purposely dodged providing versioning guidance for specific API implementations–until I could open up discussion around this subject.

If you need a primer on API versioning I recommend Troy Hunt’s piece which helps highlight:

  • URL: We put the API version into the URL: https://example.com/api/v1.1/organizations/
  • Custom request header: Using a header such as “api-version: 1.1”
  • Accept header: Using the accept header to specify the version “Accept: application/vnd.hsda.v1.1+json” - which relates to content negotiation discussions.
  • No versioning: We do not offer any versioning guidance and let each API implementation decide for themselves, with no version being a perfectly acceptable answer.

API versioning discussions are always hot topics, and there is no perfect answer. If we are to offer API versioning guidance for HSDA compliant API providers I recommend putting it in the URL, not because it is the right answer universally, but because it is the right answer for this community. It is easy to implement, and easy to understand. Although I’m not 100% convinced we should be offering guidance at all.

I would like to open it up to the community, and get more feedback from vendors and implementors. I’m curious what folks prefer when they are building applications. This decision was too wrapped up with potential content negotiation, hypermedia, and schema scope discussions to make without more community input.

Paths

The API definition provides some basic guidance for HSDA implementations when it comes to naming API paths, providing a core set of resources, as well as sub-resources. There are a number of other API designs waiting in the wings to be hammered out, making more discussion around this relevant. How do we name additional API paths? Do we keep evolving a single stack of resources (expanding horizontally), or do we start grouping them and evolve using more sub-resources (expanding vertically)?

Right now, we are just sticking with a core set of paths for /contacts, /locations, /organizations, and /services, with /search somewhat of an outlier, or I guess wrapper. We have moved forward with sub-resource guidance, but we should establish standard API design guidance when it comes to crafting new paths, as well as sub-paths, including the actions discussion below. This will be an ongoing discussion when it comes to API design across future versions, making it an evergreen thread that will just keep growing as the HSDA definition matures.

Verbs

HTTP verb usage was another aspect of the evolution of the HSDA specification from v1.0 to v1.1–the new specification puts its verbs to work, making sure POST, PUT, and DELETE are used across all core resources, as well as sub-resources, making the entire schema open for reading and writing at all levels. This further expanded the surface of the API definition, making it manageable at all levels.

Beyond this expansion we need to open up the discussion regarding OPTIONS and PATCH. Is there a need to provide partial updates using PATCH, and to provide guidance on using OPTIONS to communicate the requirements associated with a resource, and the capabilities of the server behind the API? We should also be having honest conversations about which verbs are available for sub-resources, especially when it comes to taking specific actions using HSDA paths. There is a lot more to discuss when it comes to HTTP verb usage across the HSDA specification.
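
As a rough sketch of what that verb coverage looks like for a single core resource (the summaries and response descriptions here just illustrate the pattern, and are not lifted from the specification):

    paths:
      /organizations/:
        get:
          summary: Get a list of organizations
          responses:
            '200':
              description: A listing of organizations
        post:
          summary: Add a new organization
          responses:
            '200':
              description: The organization that was added
      /organizations/{organization_id}/:
        parameters:
          - name: organization_id
            in: path
            required: true
            type: string
        get:
          summary: Get an individual organization
          responses:
            '200':
              description: The requested organization
        put:
          summary: Update an existing organization
          responses:
            '200':
              description: The organization that was updated
        delete:
          summary: Delete an organization
          responses:
            '200':
              description: Confirmation the organization was deleted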

Actions

I want to prepare for the future when we have more actions to be taken, and talk about how we approach API design in the service of taking action against resources. Right now HTTP verbs are taking care of the CRUD features for all resources and sub-resources. While I don’t have any current actions in the queue to discuss, we may want to consider this as part of the schema scope and filtering discussion–allowing API consumers to request partial and complete representations of API resources using action paths. For example: /organizations/simple, or /organizations/complete.

As the HSDA specification matures this question will come up more and more, as vendors and implementations require more specialized actions to be taken against resources. Ideally, we keep things very resource oriented, but from experience I know this isn’t always the case. Sometimes it is more intuitive for API developers to take action with simple, descriptive API paths than to add more complexity with parameters, headers, and other aspects of the API’s design. I will leave this conversation open to help guide future versions, as well as the schema scope and filtering discussions.

Parameters

Currently the number of parameters in use for any single endpoint is pretty minimal. The core resources allow for querying and sorting, but as of version 1.1, parameters are still pretty well-defined and minimal. The only path that has an extensive set of parameters is /search, which possesses category, email, keyword, language, lat_lng, location, org_name, page, per_page, radius, service_area, and status. I’d like to continue the discussion about which parameters should be added to other paths, as well as used to help filter the schema, and other aspects of the API design conversation.

I’d like to open up the parameter discussion across all HSDA paths, but I’d also like to establish a way to regularly quantify how many paths are available, as well as how loaded they are with default values, and enumerators. I’d like to feed this into overall API design guidance, helping keep API paths reflecting a microservices approach to delivering APIs–ensuring HSDA services do one thing, and do it well, with the right amount of control over the surface area of the request and response of each API path.

Headers

Augmenting the parameter discussion I want to make sure headers are an equal part of the discussion. They have the potential to play a role across several of these API design questions from versioning to schema filtering. They also will continue to emerge in authentication, management, security, and even sorting and content negotiation discussions.

It is common for there to be a lack of literacy in developer circles when it comes to HTTP headers. A significant portion of the discussion around header usage should always be whether or not we want to invest in HTTP literacy amongst implementors and their developer communities, over leveraging other non-header approaches to API design. HTTP headers are an important building block of the web that developers should understand, but educating developers around their use can be time intensive and costly when it comes to guidance.

Body

There is an open discussion around how the body will be used across HSDA compliant implementations. Currently the body is the default for POST and PUT, aka add and update. This body usage has been extended across all core resources, as well as sub-resources, requiring the complete resource, or sub-resource, representation to be part of each POST or PUT request.

There is no plan for any other APIs that will deviate from this approach, but we should keep this thread open to make sure we think about when the usage of the body is appropriate and when it might not be. We need to make sure that developers are able to effectively use the body, alongside headers, as well as parameters to get the desired results they are looking for.

Data Scope / Filtering

Currently the only filtering beyond pagination that is available is the query parameter on the /contacts, /organizations, /locations, and /services resources. After that, /search is where the heaviest data scope and filtering gets defined. We need to discuss the future of this. Should the core resources have similar capabilities to /search, or should /search be a first class citizen with the majority of the filtering capabilities?

There needs to be more discussion around how data will be available by default, and how it will be filtered as part of each API request. Will /search be carrying most of the load, or will each core resource be given some control when it comes to filtering data? Whatever the approach, it needs to be standardized across all existing paths, as well as applied to new API designs, keeping data filtering consistent across all HSDA designs. As this comes into focus I will be making sure there is a guide that covers the data filtering practices in play.

Schema Scope / Filtering

This is one of the top issues being discussed as part of the migration from v1.1 to v1.2–not just how to filter the data that is returned as part of API responses, but how you filter what schema gets returned as part of the response. In the move from v1.0 to v1.1 I didn’t want to shift the response structure, so that I could reduce breaking changes for existing Ohana implementations, and open up a conversation with the community regarding the best approach for allowing schema filtering.

My current recommendation when it comes to filtering how much or how little of the schema to return with each request is to allow for schema templates to be defined and named, then enable API consumers to specify which template they’d like returned. This should be specified either through a Prefer header, as part of the path structure as an action, or possibly through a parameter–all would accept the name of the schema template desired (i.e. simple, complete, etc.).

This approach to schema templating could be applied to GET requests, and could also be applied to POST or PUT requests. I personally recommend using a Prefer header, but I also emphasize ease of use, and ease of defining the usage as part of documentation and the OpenAPI definition–so it might make sense to allow for schema selection as part of the path name, as an action. I’ll leave it to the community to ultimately decide. As with the rest of this API design and project list, I’m just looking to provide guidance and direction, built on the feedback of the community.
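
A minimal sketch of what the Prefer header flavor of this could look like on a single path–the exact parameter definition is just one possible way to express the idea, nothing that has been decided:

    paths:
      /organizations/:
        get:
          summary: Get a list of organizations
          parameters:
            - name: Prefer
              in: header
              description: The name of the schema template to return
              required: false
              type: string
              enum:
                - simple
                - complete
          responses:
            '200':
              description: Organizations returned using the requested schema template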

Path Scope / Filtering

Next up in the scope and filtering discussion is how we define, group, and present all available API paths included in the HSDA specification. With the current specification I see three distinct groups of API paths emerging: 1) core resources (/contacts, /organizations, /locations, /services), 2) sub resources (/physical-address, /postal-address, /phones, and more), and 3) the more utility aspects of metadata, taxonomy, and eventually webhooks.

When a new user lands on the API documentation, they should see the core resources, and not be burdened with the cognitive load associated with sub resources or the more utility aspects of HSDA consumption. However, once they are ready, more advanced API paths are available. The grouping and filtering of the API paths can be defined as part of the OpenAPI definitions for the API(s), as well as the APIs.json index for the site. This path grouping will allow API consumers to limit scope and filter which API paths are available in the documentation, and possibly in SDKs, testing, and other aspects of integration.

There are additional API projects on the table that might warrant the addition of new API groups, beyond core resources, sub resources, and utility paths. The approval, feedback, and messaging discussions might require their own group, allowing them to be separated in documentation, code, testing, and other areas–reducing the load for new users, while expanding the opportunities for more advanced consumers. Eventually there might be a one to one connection between API path groups, and the API projects in the queue, allowing for different groups of APIs to be moved forward at different rates, and involve different groups of API consumers and vendors in the process.

Project Scope / Filtering

Adding the fourth dimension to this scope / filtering discussion, I’m proposing we discuss how projects are defined and isolated, which can allow them to move forward at different rates, and be reflected in documentation, code, and other resources–allowing for filtering by consumers. This will drive the path filtering described above, but apply beyond just the API, influencing documentation, SDKs, testing, monitoring, validation, and other aspects of API operations.

With this tier I am looking to decouple API projects from one another, and from the core specification. I want the core HSDS/A specification to stay focused on doing one thing well, but I’d like to establish a clear way to move forward complementary groups of API definitions, and supporting tooling, independently of the core specification. As we prepare to begin the journey from version 1.1 to 1.2, there are a number of significant projects on the table, and we need a way to isolate and decouple each additional API project in the same way we do with individual API resources–keeping them clearly defined, focused on a specific problem set, and part of a buffet of resources where the community can choose how they’d like to participate.

Pagination

This is the discussion around how results will be paginated, allowing for efficient or complete result sets to be requested, and for navigating through large volumes of human services data. We need to be discussing how we will evolve the current approach of using page= and per_page= to articulate pagination. This approach is a common, well understood way to allow developers to paginate, but we need to keep the discussion open as we answer some of the other API design questions on the table.
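
For reference, the current pagination contract expressed as OpenAPI parameters looks something like this (the descriptions are mine, just to illustrate):

    parameters:
      - name: page
        in: query
        description: Which page of results to return
        required: false
        type: integer
      - name: per_page
        in: query
        description: How many results to return per page
        required: false
        type: integer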

The pagination topic overlaps with the hypermedia and response structure discussion. Eventually we may offer pagination as part of a response envelope, or relational links provided as part of the response when using JSON API, HAL, or other media type. Right now we will leave pagination as it is, but we should be thinking about how it will evolve alongside all other API design conversations in this list.

Sorting

According to the current Ohana API implementation, which is the HSDA v1.0 definition, the guidance for sorting availability is as follows:

Except for location-based and keyword-based searches, results are sorted by location id in ascending order. Location-based searches (those that use the lat_lng or location parameter) are sorted by distance, with the ones closest to the search query appearing first. keyword searches are sorted by relevance since they perform a full-text search in various fields across various tables.

This guidance follows the API definition from version 1.0 to 1.2, but for future versions we should consider providing further guidance regarding the sorting of results. I’d like to get more feedback from the community on how they are providing data sorting capabilities for API consumers, or even as part of web and mobile applications.

Response Structure

Right now the API responses for HSDA are pretty flat, like the schema. As part of the move from version 1.1 to 1.2 we need to be expanding on them, allowing for sub-resources to be included. This conversation will be heavily influenced by the schema filtering conversation above, as well as potentially the hypermedia and content negotiation discussions below. If we are going to expand on the schema being returned with API responses we should be discussing all possible changes to the schema at once.

This conversation is meant to bring together the API schema filtering, hypermedia, and content negotiation conversations into a single discussion regarding the overall structure of a response, by default, as well as through filtering at the path, parameter, or header levels. I’d like to see HSDA responses expand to accommodate sub resources, but also the relationships between resources, as well as assisting with pagination, sorting, and other aspects of data, schema, and path filtering. I am looking to make sure the expansion of the response structure is more inclusive than just talk of sub resource access.

Hypermedia

I really want to see a hypermedia fork in the HSDA definition, allowing more advanced users to negotiate a hypermedia version of the specification, instead of the simpler, or even more advanced, default versions of the API. I recommend the adoption of HAL, Siren, or JSON API as an alternate edition of an HSDA implementation. This expansion of the design of the HSDA specification would not impact the current version, but would allow for another dimension of API consumption and integration.

The relationships between human services data, and the semantic nature of the data, really beg for a hypermedia solution. It would allow for more meaningful API responses, the defining of relationships between resources, and emphasis of the taxonomy. I will be encouraging a separate, but complementary, version of HSDA that uses one of the leading hypermedia media types. I’d like to ensure there is community awareness of the potential of this approach, and support for investing in this as part of the HSDA design strategy.

Status Codes

One of the areas of design around version 1.1 of the HSDA specification that was put off until future versions is guidance when it comes to API response status and error codes. Right now the OpenAPI definition for version 1.1 of the HSDA specification only suggests a 200 successful response, returning a reference to the appropriate HSDS schema. A project needs to be started that would provide further guidance for 300, 400, and 500 series status codes, as well as error responses.

Each HSDA path should provide guidance on all relevant HTTP status codes, but should also provide guidance regarding the error object schema returned as part of every possible API response–helping standardize how errors are communicated, and providing further guidance on how to help API consumers navigate toward a solution. Currently there is no guidance when it comes to HTTP responses and errors, something that should be considered in version 1.2 or 1.3, depending on available resources.
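
One possible shape for that guidance, sketched as an OpenAPI fragment–the error schema and the specific codes shown here are just a starting point for discussion, nothing that has been decided:

    # within each operation
    responses:
      '200':
        description: Successful response, returning valid HSDS
      '400':
        description: Bad request, such as a missing or invalid parameter
        schema:
          $ref: '#/definitions/error'
      '404':
        description: The requested resource could not be found
        schema:
          $ref: '#/definitions/error'
      '500':
        description: Something went wrong on the server
        schema:
          $ref: '#/definitions/error'
    # at the root of the definition
    definitions:
      error:
        properties:
          code:
            type: integer
          message:
            type: string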

Content Negotiation

Augmenting other conversations around schema filtering, API response structure, and hypermedia, I want to make sure content negotiation stays part of the conversation. This aspect of API design will significantly impact API integration, and the evolution of the API specification. I want to make sure vendors, and other key actors are aware of it as an option, and can participate in the conversation regarding the different content types.

This conversation should begin with making CSV and HTML representations of the data available as part of the API response structure alongside the current JSON representations. API consumers should have the option to get raw HTML, CSV, and JSON through content negotiation–with JSON remaining as the default. Then the conversation should evolve to consider an HSDA-specific content type designation, as well as the implementation of a leading hypermedia media type like JSON API, HAL, or Siren.
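
In OpenAPI 2.0 terms, that first step could be as simple as expanding the produces list–sketched here with JSON listed first as the default:

    # can be declared at the root of the definition, or overridden per operation
    produces:
      - application/json
      - text/csv
      - text/html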

Content negotiation plays an important role in versioning the HSDA specification, as well as providing different dimensions for dealing with more complex integrations, and other aspects of operations like pagination, sorting, access to sub resources, other actions, and even data, schema, and path filtering. Like headers, the mainstream developer community tends to not all be aware of content negotiation, but the benefits of adopting it far outweigh the overhead involved with bringing developers up to speed.

That concludes the list of API design conversations that are occurring as part of the move from version 1.0 to 1.1, and will set the stage for the move towards 1.2, and beyond. It is a lot to consider, but it is a manageable amount for the community to think about as part of the version 1.1 feedback cycle. Allowing us to make a community informed decision regarding what should be focused on with each release–delivering what matters to the community.

API Projects

As the version 1.0 to 1.1 migration occurred several projects were identified, or suggested for consideration. I want to make sure all these projects are on the table as part of the evolution of HSDA, beyond just the current API design discussion occurring. These are the projects we added to the specification that are moving forward but will have varying degrees of impact on the core API definition.

Taxonomy

There are two objects included in version 1.1 of the Human Services Data Specification (HSDS) that deal with taxonomy: the service_taxonomy object, and the core taxonomy object. I purposely left these aspects of the schema out of version 1.1 of HSDA. I wanted to see more discussion regarding taxonomy before we include it in the specification. This is one of the first areas that influenced the above discussions regarding path scope and filtering, as well as project scope and filtering.

I’d like to see taxonomy exist as a separate set of paths, as a separate project, outside of the core specification. In addition to further discussion about what HSDA taxonomy is, I’d like to see more consideration regarding what exactly are acceptable levels of HSDA compliant taxonomy. Ideally, the definition allows for multiple taxonomies, and possibly even a direct relationship between the available content types and a taxonomy, allowing for a more meaningful API response.

I will leave a Github issue open to discuss taxonomy, and either move it forward as an entirely separate schema, or include it in version 1.2 or 1.3 of the core HSDA definition. One aspect of this delay is to ensure that my awareness of available taxonomies is up to snuff to help provide guidance. I’m just not aware of everything out there, and lack an intimacy with the leading taxonomies in use–I need to hear more from vendors and implementors on this subject before I feel confident in making any decision.

Metadata

The metadata and meta_table_description objects in v1.1 of HSDS were two elements I also left out of version 1.1 of HSDA. I felt like there should be more discussion around API management, logging, and other aspects of API operations that feed into this area before we settled on an API design to satisfy the HSDA metadata conversation. I’d like to hear more from human services implementors regarding what metadata they desire before we connect the existing schema to the API.

The metadata conversation overlaps with the approval and feedback project. There are aspects of logging and metadata collection and storage that will contribute to the transactional nature of any approval and feedback solution. There is also conversation going on regarding privacy concerns around API access to HSDS data, and the logging and auditing that occurs at the metadata level. This thread covers these conversations, and is looking to establish a separate group of API paths, and a separate project, to drive documentation and other aspects of API operations.

Approval & Feedback

One of the projects that came up recently was about working to define the layer that allows developers to add, update, and delete data via the API. Eventually, through the HSDA specification, we want to encourage 3rd party developers and external stakeholders to help curate and maintain critical human services data within a community, through trusted partners.

HSDA allows for the reading and writing of organizations, locations, and services for any given area. I am looking to provide guidance on how API implementors can allow for POST, PUT, PATCH, and DELETE on their API, but require approval before any changing transaction is actually executed–requiring an internal system administrator to ultimately give the thumbs up or thumbs down regarding whether or not the change will actually occur.

It is a process which immediately begs for the ability to have multiple administrators, or even to possibly involve external actors. How can we allow organizations to have a vote in approving changes to their data? How can multiple data stewards be notified of a change, and given the ability to approve or disapprove, logging every step along the way? Allowing any change to be approved, reviewed, audited, and even rolled back makes public data management a community affair, with observability and transparency built in by default.

I am doing research into different approaches to tackling this, ranging from community approaches like Wikipedia, to publish and subscribe, and other event or webhook models. I am looking for technological solutions for opening up approval within the API request and response structure, with an accompanying API and webhook surface area for managing all aspects of the approval of any API changes. If you know of any interesting solutions to this problem I’d love to hear more, so that I can include them in my research, future storytelling, and ultimately the specification for the Open Referral Human Services Data Specification and API.

Universal Unique IDs

How will we allow for a universal unique ID system for all organizations, locations, and services, providing some provenance on the origin of each record? There is a solid conversation started about how to approach a universal ID system that lives alongside, or directly as part of, the core HSDA specification–depending on how we decide to approach project scope. Ideally, a universal ID system isn’t part of being compliant, but could add a healthy layer of certification for some leading providers.

More research needs to be done regarding how universal IDs are handled in other industries. An exhaustive search needs to be conducted regarding any existing standards and guidance that can help direct this discussion. This approach to handling identifiers will have a significant impact on individual API implementations, as well as the overall HSDA definition. More importantly, it will set the stage for future HSDA aggregation and federation, allowing HSDA implementations to work together more seamlessly, and better serve end-users.

Messaging

I separated this project out of the approval and feedback project. I am suggesting that we isolate the messaging guidance for APIs, setting a standard for how you communicate within a single implementation as well as across implementations. There are a number of messaging API standards and best practices available out there, as well as existing messaging APIs that are already in use by human services practitioners, including social channels like Facebook and Twitter, but also private channels like Slack.

HSDA compliant messaging channels should live as a separate project, and set of API path specifications. It should augment the core HSDA definition, overlaying with existing contact information, but it should also be dovetailed with new projects like the approval and feedback system. More research needs to be conducted on existing messaging API standards, and the leading channels that existing human services implementations and their software vendors are already using.

Webhooks

I want to begin a separate project for handling an important aspect of any API operation–not just being there to receive requests, but also being able to push information externally, and respond to scheduled, or event driven, aspects of API operations. Webhooks will play a role in the approval and feedback system, as well as the metadata and messaging projects–eventually touching all aspects of the core HSDA resources, and the separate projects.

Alongside the approval and feedback, universal ID, and messaging projects, webhooks will set the stage for the future of HSDA, where individual city and regional implementations can work together, share information, federate, and share responsibility for updates and changes. Webhooks will be how each separate implementation works in concert, making the delivery of human services more real time, and orchestrated across providers, achieving the API vision of Open Referral founder Greg Bloom.

What Is Next?

We have a lot on the table to discuss currently. We need to settle some pretty important API design discussions that will continue to have an impact on API operations for a long time. I want to help push forward the conversation around these API design discussions, and get these API projects moving forward in tandem. I need more input from the vendors, and the community around some of the pressing discussions, and then I’m confident we can settle in on what the final version 1.1 of the API specification should be, and what work we want to tackle as part of 1.2 and beyond. I’m feeling like with a little discussion we can find a path forward to reach 1.2 in the fall of 2017.


Challenges When Aggregating Data Published Across Many Years

My partner in crime is working on a large data aggregation project regarding ed-tech funding. She is publishing data to Google Sheets, and I’m helping her develop Jekyll templates she can fork and expand using Github when it comes to publishing and telling stories around this data across her network of sites. Like API Evangelist, Hack Education runs as a network of Github repositories, with a common template across them–we call the overlap between API Evangelist and Hack Education, Contrafabulists.

One of the smaller projects she is working on as part of her ed-tech funding research involves pulling the grants made by the Gates Foundation since the 1990s. It is similar to my story a couple of weeks ago about my friend David Kernohan, who wanted to pull data from multiple sources and aggregate it into a single, workable project. Audrey is looking to pull data from a single source, but because the data spans almost 20 years, it ends up being a lot like aggregating data from across multiple sources.

A couple of the challenges she is facing while trying to gather the data and aggregate it into a common dataset are:

  • PDF - The enemy of any open data advocate is the PDF, and a portion of her research data is only available in PDF format, which translates into a good deal of manual work.
  • Search - Other portions of the data are available via the web, but obfuscated behind search forms requiring many different searches to occur, with paginated results to navigate.
  • Scraping - The lack of APIs, CSV, XML, and other machine readable results raises the bar when it comes to aggregating and normalizing data across many years, making scraping a consideration, but because of PDFs, and obfuscated HTML pages behind a search, even scraping will have a significant cost.
  • Format - Even once you’ve aggregated data from across the many sources, there is a challenge with it being in different formats. Some years are broken down by topic, while others are geographically based. All of this requires a significant amount of overhead to normalize and bring into focus.
  • Manual - Ultimately Audrey has a lot of work ahead of her, manually pulling PDFs and performing searches, then copying and pasting data locally. Then she’ll have to roll up her sleeves to normalize all the data she has aggregated into a single, coherent vision of where the foundation has put its money.

Data research takes time, and is tedious, mind numbing work. I encounter many projects like hers where I have to make a decision between scraping or manually aggregating and normalizing data–each project will have its own pros and cons. I wish I could help, but it sounds like it will end up being a significant amount of manual labor to establish a coherent set of data in Google Sheets. Once she is done, though, she has all the tools in place to publish as YAML to Github, and get to work telling stories around the data across her work using Jekyll and Liquid. I’m also helping her make sure she has a JSON representation of each of her data projects, allowing others to build on top of her hard work.

I wish all companies, organizations, institutions, and agencies would think about how they publish their data publicly. It’s easy to think that data stewards will have ill intentions when it comes to publishing data in a variety of formats like they do, but more likely it is just a change of stewardship when it comes to managing and publishing the data. Different folks will have different visions of what sharing data on the web needs to look like, and have different tools available to them, and without a clear strategy you’ll end up with a mosaic of published data over the years. Which is why I’m telling her story. I am hoping to possibly influence one or two data stewards, or would-be data stewards when it comes to the importance of pausing for a moment and thinking through your strategy for standardizing how you store and publish your data online.


20K, 40K, 60K, and 80K Foot Levels Of Industry API Design Guidance

I am moving my Human Services Data API (HSDA) work forward and one of the top items on the list to consider as part of the move from version 1.1 to 1.2 is all around the scope of the API design portion of the standard. We are at a phase where the API design still very much reflects the Human Services Data Specification (HSDS)–basically a very CRUD (Create, Read, Update and Delete) API. With version 1.2 I need to begin considering the needs of API consumers a little more, looking to vendors and real world practitioners to help understand what the next version(s) of the API definition will/should contain.

The most prominent discussion in the move from version 1.1 to 1.2 centers around scope of API design at four distinct levels of this work, where we are looking to move forward a variety of API design concerns for a large group of API consumers:

  • Data Scope / Filtering - Discussions around how to filter data, allowing API consumers to search across the contents of any HSDA implementation, getting exactly the data they need, no more, no less.
  • Schema Scope / Filtering - Considering the design of simple, standard, or full schema responses that can be specified at the prefer header, parameter, or path levels.
  • Path Scope / Filtering - How are API paths going to be grouped and organized, allowing a large surface area to be shared via documentation (APIs.json) in a way that lets new API consumers start simple, advanced users get what they need, and serves as many needs in between as we can.
  • Project Scope / Filtering - Adding the fourth dimension to this scope / filtering discussion, I’m proposing we discuss how projects are defined and isolated, which can allow them to move forward at different rates, and be reflected in documentation, code, and other resources–allowing for filtering by consumers, as well as prioritization by vendors involved in the API design discussion.

In short, I have a large number of desires put on the table by vendors and practitioners. There is a mix of desire to load up as much functionality and API design guidance as we can at the single path level–meaning /organizations, /locations, and /services will allow you to get as much, or as little, as you desire. In this same conversation I have to defend the interests of newcomers, allowing them to easily learn about, and get what they need from, these three distinct API paths without loading them up with too much functionality, while also defending the long tail of needs for mobile, voice, and other leading edge application developers.

I’m looking at this discussion in these four dimensions, but I am also trying to apply a horizontal or vertical approach in all four dimensions. Meaning, do I access or filter the amount of data I receive vertically, with parameters or headers at the single API path level, or do I access and filter the amount of data I want horizontally, across many separate API paths? The same logic applies at the API schema level–do we access the schema vertically at the same path by adding many subpaths, or do we approach it horizontally with new paths? This continues at the API path and project levels–how do we discuss, develop, evolve, and allow API consumers to filter and only get at the API paths and projects they need–limiting the scope for new users, but meeting the demand of vendors, implementors, analysts, and other power consumers.

Aight, ok. Phew. I just needed to get that out. Not 100% sure it makes sense, but it is a framework I’m running with for a group conversation I’m having tomorrow–so we’ll see how it goes. One significant difference in this API design process from others is that it is not about any single API implementation. It is focused on moving forward a single API definition, or in the case of this discussion, many little API definition discussions, with lots of overlap, and under a single umbrella–to support thousands of API implementations. I’ll be publishing another piece shortly which zooms out to the 100K level of my HSDA work, where I’d consider this to be a little 20K (data), 40K (schema), 60K (path), and 80K (project) levels. I just needed to get a handle on this piece–if you actually read this far in the post, you are pretty geeky, and probably need a hobby (HSDA is mine).


Providing Solid Examples That API Consumers Can Learn From Like Slack App Blueprints

People often learn through example. Before I’d ever consider myself a software engineer, I’d consider myself a reverse software engineer. 93% of what I know has been extracted from the work of others. Even with 7% being of my own creation, it is always heavily influenced by the work of others. People emulate what they know, what they see, and use. This is why as an API provider you should be showcasing best practices, positive examples, and healthy blueprints of what API consumers could (should) be doing.

You can see this in action with Slack’s best practice blueprints page, where they provide six blueprints of applications that API consumers should be learning from. Slack doesn’t just provide a title, description, and image for example applications–it is truly a blueprint, providing diagrams, links to documentation, code samples, and other essential knowledge you will need to successfully develop an application on Slack, with six solid examples that anyone can reverse engineer to understand how Slack application development could (should) work.

Slack app blueprints are just one component of a pretty sophisticated getting started section offered as part of the Slack API ecosystem. I am adding application blueprints as a building block to my getting started API research, and adding it as a dimension to my API documentation & SDK research–the overlap in these areas seems like it should be strong to me. Coming across Slack app blueprints, and writing this story, has reminded me that I also need to write another piece on the Slack ecosystem, generate an outline of all the building blocks they are using in their API ecosystem, and create an updated blueprint for successful API operations that other API providers can emulate.


A Zapier Advocate And Dedicated API Resources Page For Your Company

I am spending time going through some of the most relevant APIs I know of online today, working to create some 101 training materials for average folks to take advantage of. I’m looking through these APIs: Twitter, Google Sheets, Github, Flickr, Instagram, Facebook, YouTube, Slack, Dropbox, Paypal, Weather Underground, Spotify, Google Maps, Reddit, Pinterest, NY Times, Twilio, Stripe, SendGrid, Algolia, Keen, Census, Yelp, Walgreens. I feel they are some of the most useful solutions for the average business person who is API curious.

With these new lessons I’m trying to continue my work evangelizing APIs amongst the normals, helping them understand what APIs are, and what is possible when you put them to work. Once I introduce folks to each API I’m left with the challenge of how to actually onboard them with each API when they aren’t actually programmers. The number one way I’m helping alleviate this problem is by including Zapier examples with each of my API lessons, helping folks understand that they can quickly get up and running with each API using the Zapier integration platform as a service (iPaaS). I will be including one or more Zapier examples along with each of my API 101 lessons, helping normal folk put what they’ve learned about APIs to use–hopefully making each lesson a little more sticky.

One of the primary targets for my lessons is the average worker at small, medium, and enterprise businesses, trying to help them understand that APIs aren’t just for developers, and that they can be putting APIs to use in their world. I tried to pick a handful of APIs that are relevant and useful in their daily lives, helping them become aware of useful Zapier recipes they can adopt in their daily work. I’m looking to encourage users to become more API-literate, and begin connecting and orchestrating using APIs in their daily work. I’m hoping that they will become confident enough leveraging APIs using Zapier that they will eventually become advocates within their companies and organizations.

In my opinion, each company could really use a Zapier advocate. To help incentivize this behavior I’m going to show folks how they can become an advocate for APIs and Zapier at their company, and provide them with some templates for how they can publish API training material on a page dedicated to Zapier within the company firewall, or on some sort of company portal that the rest of the company has access to. Similar to how I’ve been advocating for API providers to publish an integration page in their developer portals, I’m looking to also encourage business users to publish a similar page of useful Zaps involving APIs that are relevant to their company–allowing other folks at a company to learn, explore, and implement useful recipes that can help them be more successful in their work.

A significant portion of my work as API Evangelist is dedicated to pushing forward the conversation around APIs, telling stories about the leading and bleeding edge of APIs, but I’m trying not to forget my roots, and my original mission to help non-developers understand the API potential. I feel that a wealth of API 101 materials, combined with examples of Zapier advocacy and storytelling, and pages dedicated to sharing Zapier recipes (Zaps), will go a long way toward encouraging adoption amongst business users. My first API 101 lessons are rolling off the assembly line, and the next step is to create an example page where these lessons can be published, including other resources and recipes for using Zapier to exercise each lesson learned. If you would like to learn how to become a Zapier advocate at your company please drop me a line–I’m looking for a few beta users to help me push forward this work in a meaningful way.


Each Airtable Datastore Comes With Complete API and Developer Portal

I see a lot of tools come across my desk each week, and I have to be honest, I don’t always fully get what they are and what they do. There are many reasons why I overlook interesting applications, but the most common reason is that I’m too busy and do not have the time to fully play with a solution. One application I’ve been keeping an eye on as part of my work is Airtable, which, I have to be honest, I didn’t get what they were doing–or really, I just didn’t notice because I was too busy.

Airtable is part spreadsheet, part database, operating as a simple, easy to use web application which, with the push of a button, you can publish an API from. You don’t just get an API by default with each Airtable, you get a pretty robust developer portal for your API, complete with good looking API documentation–allowing you to go from an Airtable (spreadsheet / database) to API and documentation with no coding necessary. Trust me. Try it out–anyone can create an Airtable and publish an API that any developer can visit and quickly understand what is going on.

As a developer, API deployment still feels like it can be a lot of work. Then, once I take off my programmer’s hat, and put on my business user hat, I see that there are some very easy to use solutions like Airtable available to me. Knowing how to code is almost slowing me down when it comes to API deployment. Sure, the APIs that Airtable publishes aren’t the perfectly designed, artisanally crafted APIs I make with my bare hands, but they work just as well as mine. Most importantly, they get business done. No coding necessary. Something that anyone can do without the burden of programming.

Airtable provides me another solution I can recommend that my readers and clients consider using when managing their data, which will also allow them to easily deploy an API for developers to build applications against. I also notice that Airtable has a whole API integration part of their platform, which allows you to integrate your Airtables with other APIs–something I will have to write about separately in a future post. I just wanted to make sure to take the time to properly add Airtable to my research, and write a story about them so that they are in my brain, available for recall when people ask me for easy to use solutions that will help them deploy an API.


When You Publish A Google Sheet To The Web It Also Becomes An API

When you take any Google Sheet and choose to publish it to the web, you immediately get an API. Well, you get the HTML representation of the spreadsheet (shared with the web), and if you know the right way to ask, you also can get the JSON representation of the spreadsheet–which gives you an interface you can program against in any application.

The articles I curate, and the companies, institutions, organizations, government agencies, and everything else I track, live in Google Sheets that are published to the web in this way. When you are viewing any Google Sheet in your browser you are viewing it using a URL like:

https://docs.google.com/spreadsheets/d/[sheet_id]/edit

Of course, [sheet_id] is replaced with the actual id for your sheet, but the URL demonstrates what you will see. Once you publish your Google Sheet to the web you are given a slight variation on that URL:

https://docs.google.com/spreadsheets/d/[sheet_id]/pubhtml

This is the URL you will share with the public, allowing them to view the data you have in your spreadsheet in their browsers. In order to get at a JSON representation of the data you just need to learn the right way to craft the URL using the same sheet id:

https://spreadsheets.google.com/feeds/list/[sheet_id]/default/public/values?alt=json

Ok, one thing I have to come clean on is that the JSON available for each Google Sheet is not the most intuitive JSON you will come across, but once you learn what is going on you can easily consume the data within a spreadsheet using any programming language. Personally, I use a JavaScript library called tabletop.js that quickly helps you make sense of a spreadsheet and get to work using the data in any (JavaScript) application.
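To show what I mean about the JSON not being intuitive, here is a rough sketch of pulling and flattening that feed with plain JavaScript, assuming the sheet has been published to the web and the feed format stays the way it is today–this is roughly what tabletop.js is doing for you under the hood:

    // Pull the JSON representation of a published Google Sheet and flatten each row.
    // Replace [sheet_id] with the id of a sheet that has been published to the web.
    const sheetId = '[sheet_id]';
    const feedUrl = 'https://spreadsheets.google.com/feeds/list/' + sheetId + '/default/public/values?alt=json';

    fetch(feedUrl)
      .then(response => response.json())
      .then(data => {
        // Each row lives in feed.entry, with column names prefixed with gsx$ and cell values under $t.
        const rows = data.feed.entry.map(entry => {
          const row = {};
          Object.keys(entry)
            .filter(key => key.indexOf('gsx$') === 0)
            .forEach(key => { row[key.replace('gsx$', '')] = entry[key].$t; });
          return row;
        });
        console.log(rows);
      });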

The fastest, lowest cost way to deploy an API is to put some data in a Google Sheet, and hit publish to the web. Ok, it’s not a full blown API, it’s just JSON available at a public URL, but it does provide an interface you can program against when developing an application. I take all the data I have in spreadsheets and publish it to Github as YAML, and then make static APIs available using that YAML in XML, CSV, JSON, Atom, or any other format that I need. This takes the load off of Google, creating a cached version at any point in time that runs on Github, in a versioned repository that anyone can fork, or integrate into any workflow.


Either You Own The Conversation Around Your APIs Or Someone Else Will

I was looking at how many of the top mobile applications in the iTunes Store actually had a public API presence, and was finding it very telling what came up in the Google search results for each company when I searched [company name] + API. It says a lot about how a company sees the world when they don’t have a public API presence, but do have a very public mobile application that uses APIs.

An example of this is Tinder, where the top listings when you Google “Tinder API” are all rogue API repositories on Github. Tinder doesn’t own the conversation when it comes to their own APIs. While the Tinder APIs are public, and well documented, Tinder prefers acting like they are private–they aren’t. Tinder uses SSL pinning, but there is even a good amount of information out there about how to get around that, making the mapping out and documenting of Tinder APIs a pretty doable thing.

Honestly, I don’t care about Tinder’s APIs. They are just an easy example to point a finger at and use as a poster child. I don’t even expect them to have fully public APIs that any developer could use without permission. Sure, lock that shit down, but provide a sandbox, and make sure every application gets approval before they can get access to live data. Make sure that you own the API conversation by having a developer portal, and provide information regarding what it takes to get access, and maybe some day actually become an approved partner.

I’m not saying that every company should have freely available public APIs. I’m saying every company should own the public conversation around their APIs, no matter what their strategy is for developing applications around the platform’s APIs. Have a presence. Own the conversation. Have a door for application developers to walk through, even if there is a waiting room. Not all applications will be competing with your own web, mobile, device, or network applications. Some will be about enabling data portability for your users, or maybe providing useful access to aggregate data for use in visualizations–you never know what folks will be bringing to the table, so why keep the door closed?

I understand. You may not be all team API like I am, but you are using APIs to drive your mobile experience. I just don’t get why you wouldn’t want to own the conversation around these APIs. You are leaving so much on the table. If your mobile app is finding success, people will want access to the goodness going on behind it–a rogue API is what kickstarted the Instagram API in the early days. It is pretty easy to reverse engineer any mobile application, and map out the surface area of the API behind it, as well as the authentication in play. Either you own the conversation around your API, or someone else will step up and do it for you in today’s online world.


Locking Down Drones And IoT Devices By Manufacturers

I have been following stories about, as well as personally experiencing, DJI restricting where their drones can fly, going beyond just warning you about restricted areas and actually locking down or restricting your drone’s capabilities. So it was interesting to also read a post in Motherboard about the company locking down drones to prevent you from hacking, modifying, and tweaking your DJI drones as you wish. Drones for me are a poster child for the entire Internet of Things (IoT), and I think DJI’s approach is a sign of what is to come for all Internet connected devices.

In coming years, there will be a lot that the IoT community can learn from the drone space. From the technical to the regulatory, drones will be pushing forward conversations about our networks, cameras, security, privacy, surveillance, and corporate and government control over us, and our devices. Drones stimulate some interesting emotions within people associated with the industry, but more importantly within people who know nothing about drones, and who will be weighing in on regulation at the municipal level, all the way up to the federal and international levels.

I thought it was interesting when DJI began enforcing the recommendations I get in the dashboard for my drones, and requiring that I update my drones, RC controller, and mobile applications to reduce their liability regarding what I am actually doing with my devices. However, locking down drones so people can’t modify, augment, or fix their own drones is a whole other layer to this discussion. It isn’t just about stopping ISIS from strapping bombs to their drones, it is also about maintaining sovereignty over their creations, and limiting what we can do as owners when it comes to fixing our devices. We already see the right to fix conversation bubble up in the John Deere ecosystem, and it is something we will continue to see showing up in IoT ecosystems across many different business sectors.

The bold entry into our homes and lives that IoT device manufacturers are making amazes me, but what amazes me even more is how consumers allow this to happen with little resistance. Another outcome from the drone sector I believe we’ll see more of is drone operators standing up to defend their right to fix, as well as pushing back on data, content, and algorithmic ownership over what is produced using their devices. Sadly, consumers do not understand the value of their data, but hobbyists and commercial operators of drones, and hopefully other devices, do see the value of it, and will begin to shift the balance when it comes to who is profiting off the data our devices are generating.

There will be many technical, business, and political lessons to be learned from the drone space in coming years. I’m strangely thankful that my Drone Recovery project happened, because before that summer I really was not interested in drones, but now I’m not just interested, I own three drones, and have an active interest in understanding what manufacturers like DJI are doing. I feel that what DJI is doing with their platform will set a precedent (good and bad) for other IoT operators to follow–something I’ll be keeping a close eye on.


Github Serverless

I run the entire front-end of my online presence using Github. All my API Evangelist research lives as open repositories on Github, with the website running Jekyll, hosted on Github Pages. My front-end is all HTML, JavaScript, and CSS that leverages YAML data, displayed using Liquid. It provides me a nice way to offload the public side of my operations to Github.

I am increasingly doing this with all of my data, publishing it as YAML, and rendering a dynamic (static) API representation in JSON–all done with the same approach I’m using to publish my website(s). You can get at all of the data I use across my API research in a single API Evangelist developer portal, which aggregates all of the JSON APIs I’ve published across my network of almost 100 Github repositories, and supporting sites.

Another thing I’m experimenting with is publishing simple JavaScript functions to individual pages within Github repositories. These scripts do a range of things, from pulling items I’ve curated from the Feedly API, to pulling fresh data from the Google Sheets I am using as data stores, to a variety of other jobs across my network of research sites, data projects, and API tooling. Some of these scripts I run manually, while others I run on a variety of schedules using EasyCron.

The approach definitely has some significant limitations, but I find that I’m able to get quite a bit done with JavaScript by pulling data from external APIs and other feeds, using each Github repo as storage, and the Github API as the read/write layer for this storage. I do not store any API keys, tokens, or other secrets in the Github repositories; I’m passing them all in via the URL, which isn’t the most secure, and could in theory be intercepted in transit even though I’m using SSL–something I’d like to improve upon by passing a single token to unlock a private store. I have access to external systems via APIs, storage and compute via Github, and I can control everything through a variety of functional JavaScript files I maintain using Github, and keep indexed using APIs.json.
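For anyone curious what the read/write layer looks like, here is a minimal sketch of writing a JSON file into a repository using the Github contents API. The owner, repo, path, and token are all placeholders, with the token passed in at runtime rather than stored anywhere in the repository:

    // Write a JSON file into a Github repository using the contents API.
    // The owner, repo, path, and token are placeholders; the token is passed in at
    // runtime and never stored in the repository itself.
    function writeJsonToRepo(owner, repo, path, data, token, message) {
      const url = 'https://api.github.com/repos/' + owner + '/' + repo + '/contents/' + path;
      // Look up the current file first, since the API requires its sha when updating.
      return fetch(url, { headers: { Authorization: 'token ' + token } })
        .then(response => (response.ok ? response.json() : {}))
        .then(existing => fetch(url, {
          method: 'PUT',
          headers: { Authorization: 'token ' + token },
          body: JSON.stringify({
            message: message,
            content: btoa(JSON.stringify(data, null, 2)),
            sha: existing.sha // undefined on the first write, which the API accepts
          })
        }));
    }

    // Example: writeJsonToRepo('your-username', 'your-repo', 'data/curated.json', { items: [] }, token, 'Updating curated items');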

It is becoming a kind of poor man’s serverless. I’m going to keep polishing my approach, getting better with my responses, and with how I read and write data to the storage folder in each of my repositories, which can then be read statically using the JSON APIs I’ve published from this data using Liquid. It is a pretty scrappy approach to serverless, but done in a way that takes the servers out of the equation for me, offloading the front-end and back-end work for my network of sites to Github. I am not sure where I’m going with this. Sometimes I get better results from a more straightforward API implementation on my Amazon infrastructure, but I am finding some interesting use cases, and seeing another side-effect I am enjoying–it makes my serverless infrastructure forkable and usable by others.


Having The Right Communications Pipeline For Your API Platform

My friend Matthew Reinbold, formerly of Vox Pop, and now the Lead for the Capital One API Center of Excellence, as well as the maintainer of web API events, has shifted his blogging platform to Jekyll, hosted on Github. Ok, yawn, why is this news? Someone is shifting the underlying platform for their blog. Well, first, Matt is one of the leading API practitioners in the space, who is also a storyteller. Second, his approach highlights a set of tools that other API providers should be considering for their API communications pipeline.

Matt is using a pretty potent formula for his communications platform in my opinion, with a handful of essential ingredients:

  • Github - Using a Github repository as the open source folder for your website.
  • Github Pages - Using Github Pages to publish the front-end for your website.
  • Jekyll - The content management system that sits in the folder for your website.
  • CloudFlare - The DNS and SSL front-end for your website, complete with analytics.
  • Hover - The registrar for the domain, with DNS management offloaded to CloudFlare.

Matt is taking advantage of the benefits of static website development, some of which, as Matt describes, are:

  • SPEED - There’s no processing server side; posts have already been reduced to the essential atomic units of the web: HTML, Javascript, and CSS. There’s something poetic to me about that.
  • Security - While not so much an issue with my own coded CMS, I lived in constant fear of missing a zero-day Wordpress exploit patch and finding myself, along with clients, compromised. Reducing the number of moving parts significantly decreases the places where something might go wrong.
  • Hosting - Rather than having to find, research, and deploy to increasingly rare ColdFusion hosts (or port to another language), I can post my content to anywhere that supports HTML/JS/CSS hosting. This becomes very compelling given that Github Pages, one option, is free.

This is the cheapest and quickest way for your API to get a blog stood up, and start publishing stories about the value your API is bringing to the table. This approach isn’t just limited to your developer portal or engineering team blog–it could be for partners, or any API related project that you are running. I publish a static Jekyll blog for each area of my research, and I try to always have one for each of my data or API tooling projects, telling the story of each project independently from the API Evangelist blog, and providing a static log of everything that has happened.

Github isn’t just a pipeline for code, it can be a pipeline for your communications and storytelling. It can also be a pipeline for your documentation, how-to guides, and other resources. I’m happy to see Matt putting it to use. Another thing I like about his post, other than him mentioning me ;-), is that he also mentioned the benefits of this approach over using Medium, which is something I’ve been advising API providers against for some time. In my opinion you are better off publishing your blog like Matt has, and then syndicating to Medium if you want. There is a lot more detail available in Matt’s story behind his new blog strategy, and I recommend heading over and learning from what he’s done.


Being First With Any Technology Trend Is Hard

I first wrote about Iron.io back in 2012. They are an API-first company, and they were the first serverless platform. I’ve known the team since they first reached out back in 2011, and I consider them one of my poster children for why there is more to all of this than just the technology. Iron.io gets the technology side of API deployment, and they saw the need for enabling developers to go serverless, running small scalable scripts in the cloud, and offloading the backend worries to someone who knows what they are doing.

Iron.io is what I’d consider to be a pretty balanced startup, slowly growing, and taking the sensible amounts of funding they needed to grow their business. The primary area where I would say Iron.io has fallen short is storytelling about what they are up to, and generally playing the role of the shiny startup everyone should pay attention to. They are great storytellers, but unfortunately the frequency and amplification of their stories has fallen short, allowing other strong players to fill the void–opening the door for Amazon to take the lion’s share of the conversation when it comes to serverless. It demonstrates that you can rock the technology side of things, but if you don’t also rock the storytelling and more theatrical side of things, there is a good chance you will come in second.

Storytelling is key to all of this. I always love the folks who push back on me saying that nobody cares about these stories, that the markets only care about strong, successful companies–when in reality, IT IS ALL ABOUT STORYTELLING! Amazon’s platform machine is good at storytelling. Not just their serverless group, but the entire platform. They blog, tweet, publish press releases, whisper in reporters’ ears, buy entire newspapers, publish science fiction patents, and conduct road shows and flagship conferences. Each AWS platform team can tap into this, participate, and benefit from the momentum, helping them dominate the conversation around their particular technical niche.

Being first with any technology trend will always be hard, but it will be even harder if you do not consistently tell stories about what you are doing, and what those who are using your platform are doing with it. Iron.io has been rocking it for five years now, and is continuing to define what serverless is all about–they just need to turn up the volume a little bit, and keep doing what they are doing. I’ll own a portion of this story, as I probably didn’t do my share to tell more stories about what they are up to, which would have helped amplify their work over the years–something I’m working to correct with a little storytelling here on API Evangelist.


Opportunity To Develop A Threat Intelligence Aggregation API

I came across this valuable list of threat intelligence resources and think that the section on information sources should be aggregated and provided as a single threat intelligence API. When I come across valuable information repos like this my first impulse is to go through them, standardize the data, and upload it as JSON and YAML to Github, making all of it forkable, and available via an API.

Of course, if I responded to every impulse like this I would never get any of my normal work done, and actually pay my bills. A second option for me is to put things out there publicly in hopes that a) someone will pay me to do the work, or b) someone else who has more time, and the rent paid, will tackle the work. With this in mind, this list of sources should be standardized, published to Github, and made available as an API:

  • Alexa Top 1 Million sites - Probable Whitelist of the top 1 Million sites from Amazon(Alexa).
  • APT Groups and Operations - A spreadsheet containing information and intelligence about APT groups, operations and tactics.
  • AutoShun - A public service offering at most 2000 malicious IPs and some more resources.
  • BGP Ranking - Ranking of ASNs having the most malicious content.
  • Botnet Tracker - Tracks several active botnets.
  • BruteForceBlocker - BruteForceBlocker is a perl script that monitors a server’s sshd logs and identifies brute force attacks, which it then uses to automatically configure firewall blocking rules and submit those IPs back to the project site, http://danger.rulez.sk/projects/bruteforceblocker/blist.php.
  • C&C Tracker - A feed of known, active and non-sinkholed C&C IP addresses, from Bambenek Consulting.
  • CI Army List - A subset of the commercial CINS Score list, focused on poorly rated IPs that are not currently present on other threatlists.
  • Cisco Umbrella - Probable Whitelist of the top 1 million sites resolved by Cisco Umbrella (was OpenDNS).
  • Critical Stack Intel - The free threat intelligence parsed and aggregated by Critical Stack is ready for use in any Bro production system. You can specify which feeds you trust and want to ingest.
  • C1fApp - C1fApp is a threat feed aggregation application, providing a single feed, both Open Source and private. Provides statistics dashboard, open API for search and is been running for a few years now. Searches are on historical data.
  • Cymon - Cymon is an aggregator of indicators from multiple sources with history, so you have a single interface to multiple threat feeds. It also provides an API to search a database along with a pretty web interface.
  • Deepviz Threat Intel - Deepviz offers a sandbox for analyzing malware and has an API available with threat intelligence harvested from the sandbox.
  • Emerging Threats Firewall Rules - A collection of rules for several types of firewalls, including iptables, PF and PIX.
  • Emerging Threats IDS Rules - A collection of Snort and Suricata rules files that can be used for alerting or blocking.
  • ExoneraTor - The ExoneraTor service maintains a database of IP addresses that have been part of the Tor network. It answers the question whether there was a Tor relay running on a given IP address on a given date.
  • Exploitalert - Listing of latest exploits released.
  • Feodo Tracker - The Feodo Tracker by abuse.ch tracks the Feodo trojan.
  • FireHOL IP Lists - 400+ publicly available IP Feeds analysed to document their evolution, geo-map, age of IPs, retention policy, overlaps. The site focuses on cyber crime (attacks, abuse, malware).
  • FraudGuard - FraudGuard is a service designed to provide an easy way to validate usage by continuously collecting and analyzing real-time internet traffic.
  • Hail a TAXII - Hail a TAXII.com is a repository of Open Source Cyber Threat Intelligence feeds in STIX format. They offer several feeds, including some that are listed here already in a different format, like the Emerging Threats rules and PhishTank feeds.
  • I-Blocklist - I-Blocklist maintains several types of lists containing IP addresses belonging to various categories. Some of these main categories include countries, ISPs and organizations. Other lists include web attacks, TOR, spyware and proxies. Many are free to use, and available in various formats.
  • Majestic Million - Probable Whitelist of the top 1 million web sites, as ranked by Majestic. Sites are ordered by the number of referring subnets. More about the ranking can be found on their blog.
  • MalShare.com - The MalShare Project is a public malware repository that provides researchers free access to samples.
  • MalwareDomains.com - The DNS-BH project creates and maintains a listing of domains that are known to be used to propagate malware and spyware. These can be used for detection as well as prevention (sinkholing DNS requests).
  • Metadefender.com - Metadefender Cloud Threat Intelligence Feeds contains top new malware hash signatures, including MD5, SHA1, and SHA256. These new malicious hashes have been spotted by Metadefender Cloud within the last 24 hours. The feeds are updated daily with newly detected and reported malware to provide actionable and timely threat intelligence.
  • NormShield Services - NormShield Services provide thousands of domain information (including whois information) that potential phishing attacks may come from. Breach and blacklist services also available. There is free sign up for public services for continuous monitoring.
  • OpenBL.org - A feed of IP addresses found to be attempting brute-force logins on services such as SSH, FTP, IMAP and phpMyAdmin and other web applications.
  • OpenPhish Feeds - OpenPhish receives URLs from multiple streams and analyzes them using its proprietary phishing detection algorithms. There are free and commercial offerings available.
  • PhishTank - PhishTank delivers a list of suspected phishing URLs. Their data comes from human reports, but they also ingest external feeds where possible. It’s a free service, but registering for an API key is sometimes necessary.
  • Ransomware Tracker - The Ransomware Tracker by abuse.ch tracks and monitors the status of domain names, IP addresses and URLs that are associated with Ransomware, such as Botnet C∓C servers, distribution sites and payment sites.
  • SANS ICS Suspicious Domains - The Suspicious Domains Threat Lists by SANS ICS tracks suspicious domains. It offers 3 lists categorized as either high, medium or low sensitivity, where the high sensitivity list has fewer false positives, while the low sensitivity list has more false positives. There is also an approved whitelist of domains. Finally, there is a suggested IP blocklist from DShield.
  • signature-base - A database of signatures used in other tools by Neo23x0.
  • The Spamhaus project - The Spamhaus Project contains multiple threatlists associated with spam and malware activity.
  • SSL Blacklist - SSL Blacklist (SSLBL) is a project maintained by abuse.ch. The goal is to provide a list of “bad” SSL certificates identified by abuse.ch to be associated with malware or botnet activities. SSLBL relies on SHA1 fingerprints of malicious SSL certificates and offers various blacklists.
  • Statvoo Top 1 Million Sites - Probable Whitelist of the top 1 million web sites, as ranked by Statvoo.
  • Strongarm, by Percipient Networks - Strongarm is a DNS blackhole that takes action on indicators of compromise by blocking malware command and control. Strongarm aggregates free indicator feeds, integrates with commercial feeds, utilizes Percipient’s IOC feeds, and operates DNS resolvers and APIs for you to use to protect your network and business. Strongarm is free for personal use.
  • Talos Aspis - Project Aspis is a closed collaboration between Talos and hosting providers to identify and deter major threat actors. Talos shares its expertise, resources, and capabilities including network and system forensics, reverse engineering, and threat intelligence at no cost to the provider.
  • Threatglass - An online tool for sharing, browsing and analyzing web-based malware. Threatglass allows users to graphically browse website infections by viewing screenshots of the stages of infection, as well as by analyzing network characteristics such as host relationships and packet captures.
  • ThreatMiner - ThreatMiner has been created to free analysts from data collection and to provide them a portal on which they can carry out their tasks, from reading reports to pivoting and data enrichment. The emphasis of ThreatMiner isn’t just about indicators of compromise (IoC) but also to provide analysts with contextual information related to the IoC they are looking at.
  • VirusShare - VirusShare.com is a repository of malware samples to provide security researchers, incident responders, forensic analysts, and the morbidly curious access to samples of malicious code. Access to the site is granted via invitation only.
  • Yara-Rules - An open source repository with different Yara signatures that are compiled, classified and kept as up to date as possible.
  • ZeuS Tracker - The ZeuS Tracker by abuse.ch tracks ZeuS Command & Control servers (hosts) around the world and provides you a domain- and a IP-blocklist.

Ideally, each source on this list would be publishing a forkable version of their data on Github and/or deploying a simple web API, but alas that isn’t the world we live in. Part of the process to standardize and normalize the threat intelligence from all of these sources would be to reach out to each provider, and take their temperature regarding working together to improve the data source on its own, as well as part of an aggregated set of data and API sources.

Similar to what I’m trying to do across many of the top business sectors being impacted by APIs, we need to work on aggregating all the existing sources of threat intelligence, and begin identifying a common schema that any new player could adopt. We need an open data schema, API definition, as well as a suite of open source server and client tooling, to emerge if we are going to stay ahead of the cybersecurity storm that has engulfed us, and will continue to surround us until we work together to push it back.
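To make the aggregation idea a little more concrete, here is a rough sketch of what normalizing a single feed into a hypothetical common schema might look like. The feed URL, field names, and schema are all made up for illustration–each source on the list above would need its own mapping:

    // A rough sketch of normalizing one threat feed into a hypothetical common schema.
    // The feed URL and field names are made up for illustration; each source above
    // would need its own mapping into the shared schema.
    const toCommonRecord = (indicator, type, source, observed) => ({
      indicator: indicator, // the IP, domain, URL, or hash being flagged
      type: type,           // e.g. 'ip', 'domain', 'url', 'hash'
      source: source,       // which feed the indicator came from
      observed: observed    // when the source last saw it
    });

    fetch('https://example.com/hypothetical-threat-feed.json')
      .then(response => response.json())
      .then(feed => feed.items.map(item =>
        toCommonRecord(item.address, 'ip', 'hypothetical-feed', item.last_seen)))
      .then(records => console.log(records.length + ' normalized records'));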


When JSON Schema Is Seen As Power

In a 30 year career as a database professional I’ve seen some extraordinary ways in which owning and controlling data is associated with power. Those who have the data leverage it against those who do not have it. Losing control means losing power, so people do whatever they can to stay in control, protecting the spreadsheets and databases at all costs. After 30 years of seeing this play out over and over again, I thought I’d seen it all, but sadly in an API era I’m just seeing new incarnations of data being wielded by those in power.

I recently came across an example where a company was holding back a series of JSON schema for a variety of public datasets, and standards in use as part of some government systems. From what I can tell, the company had been brought in to handle the systems and open data work a few years back, and with each version of the software and schema they slowly began to maintain tighter control over the schema, while they were also being mandated to be more open with the data–shifting from controlling the data, to controlling the schema.

They see the ability to validate data, API requests, and responses as something only a handful of people should be able to do. If you have the ability to validate, and say, “yes that data or API is compliant”, you are now in a position of power. This group was mandated to be open with the data, allowing it to flow freely between open source and proprietary systems, keeping in sync with laws and regulations, but they had found another way to remain the gatekeeper–I think this is what some folks call innovation, and thinking out of the box.

In my world, it is just another example of how power will always find ways to keep data from flowing, no matter how well it learns to be perceived as playing nicely in an open data and API world. Many companies are still playing by the old rules and just hoarding, locking up, and controlling data–refusing to play along in the API game. However, it is fascinating to see how power can shape shift and find new ways to protect its interest in this new landscape. After 30 years of doing this I am not surprised, but I do have to call it out when I see it because, well, it is not right. Be open and share your schema, and let everyone be able to validate that data is what it should be.
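This is the part that should not be hard. Once a schema is published, anyone can validate data against it with off the shelf tooling. Here is a minimal sketch using the Ajv JavaScript library, with a made up schema and record just to illustrate the point:

    // Validate a record against a published JSON schema using the Ajv library.
    // The schema and record here are made up just to illustrate the point.
    const Ajv = require('ajv');
    const ajv = new Ajv();

    const schema = {
      type: 'object',
      required: ['id', 'name'],
      properties: {
        id: { type: 'string' },
        name: { type: 'string' }
      }
    };

    const validate = ajv.compile(schema);
    const record = { id: '123', name: 'Example Organization' };

    if (validate(record)) {
      console.log('Record is compliant with the schema');
    } else {
      console.log('Record is not compliant', validate.errors);
    }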


The Essential API Elements In My World

In 2017 there seems to be an API for just about everything. You can make products available via an API, along with messaging, images, videos, and any of the digital bits that make up our lives. I still get excited by some new APIs, but APIs have to have real usage, and deliver real value, before I’ll get too worked up about them. I’m regularly looking down the list of my digital bits thinking about which are the most important to me, which ones I’ll keep around, and the services I’ll adopt to help me define and manage these bits.

This process has got me thinking really deeply about what I’d consider to be the three most important types of APIs in my life:

  • Compute - In my world compute is all about AWS EC2 instances, but when I think about it, Github really handles the majority of the compute for my front-end, but EC2 is the scalable compute for the backend of my world that is driving my APIs.
  • Storage - Primarily storage is all about Amazon S3, but I also depend on Dropbox, Google Drive, and I also put Github into the storage bucket because I store quite a bit of JSON, YAML, and other data there.
  • DNS - apievangelist.com and kinlane.com are very important domains in my world–they are how I make my living, and share my stories. CloudFlare is how I manage this frontline of my operations, making DNS an extremely important element in my world.

I leverage compute, storage, and DNS APIs regularly throughout each day–making them very important APIs in my existence. However, these are also the essential ingredients of my APIs as well. I consume these APIs, but I also deploy my APIs with these three elements. Each API has a compute and storage layer, with DNS as the naming, addressing, and discovery for these valuable resources in my world. This makes these three aspects of operating online, the three most essential elements in my world–even beyond images, messaging, video, and other elements that are ubiquitous across my digital presence.

It is interesting for me to think about the importance of these elements in my world, as storage and compute were the first two APIs that turned on the light bulb in my head when it came to the importance of web APIs. When Amazon launched Amazon S3 and Amazon EC2, that is when I knew APIs were going to be bigger than Flickr or Twitter. You could deploy global infrastructure with APIs–you could deploy APIs with APIs! I really enjoy thinking deeply about all my digital bits, and the role APIs are playing–regularly reassessing the value of API-driven resources in my world. It helps me think through what is important, and what isn’t–showing me that 98% of all of this tech doesn’t matter, but there is a 2% that does make an actual difference in my digital existence.


OpenAPI Leading The Open Banking API Conversation

I’ve been looking through the ecosystems of banking API platforms trying to understand the technical, business, and political approach of banks when it comes to the API conversation. While Capital One is definitely leading the conversation in the U.S., I’ve also been looking to better understand what is happening around the PSD2 banking API conversation in the EU and UK.

I was pleased to find OpenAPI present in the OpenBankProject PSD2 API Explorer, as well as leading the specification standards conversation over at Open Banking in the UK. The existence of the OpenAPI allows analysts like me to quickly load it up in an API client like Postman or Restlet, and become more intimate with what paths and definitions are available–developing my awareness of where banking API standards are headed.

OpenAPI is proving to be a great way to facilitate a conversation about an API at the team, as well as the industry level. While the learning curve involved with OpenAPI adoption is real, I’m finding it to be an essential diplomatic tool when it comes to harmonizing the industry level conversation around my Human Services Data API work. OpenAPI provides a central reference that business stakeholders can use at the 100K view, while also enabling developers and architects to discuss things at the nitty gritty technical level.

I’m bookmarking all the OpenAPIs I find around PSD2, and I’m on the hunt for more OpenAPIs around FHIR. These are the two leading API standards conversations going on at the industry level, helping define a common API interface within two heavily regulated industries–banking and healthcare. While there is still a HUGE amount of work within these communities to truly achieve the adoption everyone is envisioning, I find the fact that there are OpenAPIs being used to be a positive sign. It shows that we are moving towards more substance than just talk, and acknowledges the conversation stimulating powers of OpenAPI, in addition to the potential for delivering API deployment, management, testing, monitoring, SDKs, discovery, and other essential stops along the API lifecycle.


Does Your API Sandbox Have Malicious Users?

I have been going through my API virtualization research, expanding the number of companies I’m paying attention to, and taking a look at industry specific sandboxes, mock APIs, and other approaches to virtualizing APIs, and the data and content they serve up. I’m playing around with some banking API sandboxes, getting familiar with PSD2, and learning about how banks are approaching their API virtualization–providing me with an example within a heavily regulated industry.

As I’m looking through Open Bank Project’s PSD2 Sandbox, and playing with services that are targeting the banking industry with sandbox solutions, I find myself thinking about Netflix’s Chaos Monkey, which is “a resiliency tool that helps applications tolerate random instance failures.” Now I am wondering if there are any API sandboxes out there that have simulated threats built in, pushing developers to build more stable and secure applications with API resources.

There are threat detection solutions that have APIs, and some interesting sandboxes for analyzing malware that have APIs, but I don’t find any API sandboxes that just have general threats available in them. If you know of any sandboxes that provide simulations, or sample data, please let me know. Also if you know of any APIs that specifically provide API security threats in their sandbox environments so that developers can harden their apps–I’d love to hear more about it. I depend on my readers to let me know of the interesting details from API operations like this.

I’m on the hunt for APIs that have sandboxes that help application developers think about the resiliency and security of their applications built on top of an API. Eventually I’d also love to see a sandbox emerge that could help API providers think about the resiliency and security of their APIs. I’m feeling like this aspect of API virtualization is going to become just as critical as guidance on API design best practices, helping API operators better understand the threats they face, and quantify them against their API in a virtualized and simulated environment that isn’t going to freak them out about the availability of their production environment.
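To make the idea a little more tangible, here is a rough sketch of what simulated threats in a sandbox might look like, expressed as Express (Node.js) middleware that randomly injects failures and suspicious payloads. Everything here is hypothetical, and not based on any existing sandbox product:

    // A hypothetical sandbox API that randomly injects failures and suspicious payloads,
    // built with Express for Node.js. Not based on any existing sandbox product.
    const express = require('express');
    const app = express();

    app.use((req, res, next) => {
      const roll = Math.random();
      if (roll < 0.05) {
        // Simulate an outage so developers have to handle failure gracefully.
        return res.status(500).json({ error: 'simulated outage' });
      }
      if (roll < 0.10) {
        // Simulate a malicious response body the client should be sanitizing.
        return res.json({ account: '<script>alert("xss")</script>' });
      }
      next();
    });

    app.get('/accounts', (req, res) => res.json([{ id: 1, balance: 100.0 }]));

    app.listen(3000);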

I’ll keep an eye out for more examples of this in the wild–if you know of anything please let me know–thanks!


Standardizing and Templatizing API Design Editor Validation Tips

I’ve been playing with Apicurio, the open source API design editor I’ve been waiting for, and saw a potential opportunity for design time collaboration, instruction, and feedback loops. When you are designing an API in Apicurio it gives you alerts based upon JSON schema validation of the underlying OpenAPI, providing a nice visual feedback loop–pushing you to complete your API definition until it properly validates.

Visual alerts and feedback based upon JSON schema validation aren’t really new or that interesting–you see it in the Swagger Editor, and many other JSON tools. Where I see an opportunity is specifically when it comes to an open source visual API design editor like Apicurio, and when the JSON schema engine for the validation responses is opened up as part of the architecture, allowing users to import and export JSON schema that goes beyond the default OpenAPI schema. The default gets us to a minimum viable OpenAPI definition–while this is good, we can do better.

I’d like to see a marketplace of JSON schema emerge, helping API designers and architects push the completeness and precision of their OpenAPI definitions beyond the speed at which the core OpenAPI spec can move, and go in directions, and cover niche definitions, that the core OpenAPI schema will never cover. I want to be able to load a schema that will help me push my API responses beyond just a default 200. I want to be able to load custom JSON schema crafted by API design experts who have more skills than I do, and learn from them. I want my API design editor to help me take my APIs to the next level, while also pushing my API design skills forward along the way.

Apicurio does a good job at giving plain English responses to validation errors–much better than some tools I’ve used. You can click on the detail of each alert to get more information about what is going on. I could see this entire structure opened up as part of Apicurio’s architecture, allowing custom JSON schema templates that can be imported, and made more informative with plain English responses, more detail, and even links to learn more about how you can improve your API. That would turn the basic validation responses into more of an API design knowledge base, and even a structured curriculum walking you through best practices when it comes to API design, and what is working in the industry.
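Here is a hypothetical example of what one of these importable templates might look like, expressed as a JavaScript object so the plain English messaging can ride alongside the validation rule. None of this reflects Apicurio’s actual architecture–it is just a sketch of the concept:

    // A hypothetical design guidance template: a JSON schema fragment paired with the
    // plain English message an editor could surface when validation fails.
    // Illustrative only, and not part of Apicurio's actual architecture.
    const contactAndLicenseRule = {
      message: 'Add contact and license details so consumers know who owns this API and how they can use it.',
      learnMore: 'https://example.com/api-design/contact-and-license',
      schema: {
        type: 'object',
        required: ['info'],
        properties: {
          info: {
            type: 'object',
            required: ['contact', 'license']
          }
        }
      }
    };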

Just some thoughts as I’m playing with Apicurio. I’m very happy to see this API design editor emerge, and I have been having fun thinking about what is possible when it comes to the road map. I feel a common visual API design interface is an important part of the next step in the evolution of the API sector. Which is why I have been advocating for an open source API design editor that any API provider can use to design their APIs, and any API service provider can bake into their services and tooling–standardizing how we all craft and communicate around our API designs. The validation feedback loop during this phase will be an important channel for pushing API designers to do what is right, make their API definitions more complete, while educating them about common API design practices, and building a more literate API workforce.


Enhancing Your API SEO

One question I’m regularly getting from my readers is regarding how you can increase the search engine optimization (SEO) for your APIs–yes, API SEO (acronyms rule)! While we should be investing in API discoverability by embracing hypermedia early on, in its absence I feel we should also be indexing our entire API operations with APIs.json, and making sure we describe individual APIs using OpenAPI. The world of web APIs is still very hitched to the web, making SEO very relevant when it comes to API discoverability.

While I was diving deeper into “The API Platform”, a VERY forward leaning API deployment and management solution, I was pleased to see another mention of API SEO using JSON-LD (scroll down on the page). While I wish every API would adopt JSON-LD for their overall design, I feel we are going to have to piece SEO and discoverability together for our sites, as The API Platform demonstrates. They provide a nice example of how you can paste a JSON-LD script into the page of your API documentation, helping amplify some of the meaning and intent behind your API using JSON-LD + Schema.org.
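Here is roughly what that kind of script looks like, using placeholder values. Schema.org’s WebAPI type was still in their pending area last time I checked, so the exact type and properties you use may shift as things mature:

    <script type="application/ld+json">
    {
      "@context": "http://schema.org",
      "@type": "WebAPI",
      "name": "Example Company API",
      "description": "Programmatic access to the products, orders, and customers behind the Example Company platform.",
      "documentation": "https://developer.example.com/docs",
      "provider": {
        "@type": "Organization",
        "name": "Example Company"
      }
    }
    </script>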

I have been thinking about Schema.org’s relationship to API discovery for some time now, which is something I’m hoping to get more time to invest in further during 2017. I’d like to see Schema.org get more baked into API design, deployment, and documentation, as well as JSON-LD as part of the underlying schema. To help build a bridge from where we are at to where we need to be going, I’m going to explore how I can leverage OpenAPI tags to help autogenerate JSON-LD Schema.org tags as part of API documentation. While I’d love for everyone to just get the benefits of JSON-LD, I’m afraid many folks won’t have the bandwidth, and could use an assist from the API documentation solutions they are already using–making APIs more SEO friendly by default.

If you are starting a new API I recommend playing with “The API Platform”, as you get the benefits of Schema.org, JSON-LD, and MANY other SIGNIFICANT API concepts by default. Out of all of the API frameworks I’ve evaluated as part of my API deployment research, “The API Platform” is by far the most advanced when it comes to leading by example, and enabling healthy API design practices by default–something that will continue to bring benefits across all stops along the life cycle if you put it to work in your operations.


A Bot That Actually Does Useful Things For Me

I’m not a fan of the unfolding bot universe. I get it, you can do interesting things with them–the key word being interesting. Most of what I’ve seen done via Twitter, Facebook, and Slack bots really isn’t that interesting. Maybe it’s that I’m old and boring, or maybe it’s because people aren’t doing interesting things. When you hear me complain about bots, just remember it isn’t because I think the technology approach is dumb, it’s because I think the implementations are dumb.

After several dives into the world of bots, looking to understand how bots are using APIs, I’ve found some interesting Twitter bots, and an even smaller number of Slack bots I found to be useful–I have yet to find an interesting Facebook bot. Honestly, I think it is the constraints of each platform that incentivize the interesting things being done, as well as the uninteresting, and even dangerous things. So I find it interesting when the bot conversation moves to other platforms, bringing with it a new set of constraints, like I just saw with a new bot out of Hashicorp.

Hashicorp’s bot does mundane Github janitorial work for me! This is automation (aka bot) activity I can get behind. I feel like much of the Slack automation I’ve seen is doing things that wouldn’t actually benefit me, and would be creating more noise than any solution it would bring–this is due to how I use Slack, or rather how I don’t use Slack. I’m a HEAVY Github user, and there are MANY tasks that are left undone. Things like tagging repos, README files, licensing, and the other things we either forget about, or just don’t have the time for. If you fire up a bot to help me with these things, my ears are going to perk up a bit when it comes to the bot conversation.

In the end, I just need to remember that it is not bots that are boring and dumb–people are. ;-) That includes me. I find the concept of a Github bot infinitely more valuable than a Facebook, Twitter, or Slack bot. I’m curious to see where Hashicorp takes this, and now that the concept of a Github bot is on my radar, I’m guessing I will see other examples of it in the wild. I’m hoping this is an area where we’ll see more bot development and investment, but I also understand Facebook, Twitter, and Slack have relevance in other people’s worlds, and that I’m the oddball here who finds Github a more interesting platform.


An API Change Log And Road Map Visualization

I saw a blog post come across my feeds from the analysis and visualization API provider Qlik, about their Qlik Sense API Insights. It is a pretty interesting approach to trying to visualize the change log and road map for an API. I like it because it is an analysis and visualization API provider who has used their own platform to help visualize the evolution of their API.

I find the visualization for Qlik Sense API Insights to be a little busy, and not as interactive as I’d like to see it be, but I like where they are headed. It tries to capture a ton of data, showing the road map and changes across multiple versions of sixteen APIs, something that can’t be easy to wrap your head around, let alone capture in a single visualization. I really like the direction they are going with this, even though it doesn’t fully bring it home for me.

Qlik Sense API Insights is the first approach I’ve seen like this that attempts to quantify the API road map and change log–it makes sense that it is something being done by a visualization platform provider. With a little usage and user experience (UX) love I think the concept of analysis, visualizations, and hopefully insights around the road map, change log, and even open issues and status could be significantly improved upon. I could see something like this expand and begin to provide an interesting view into the forever changing world of APIs, and keep consumers better informed, and in sync with what is going on.

In a world where many API providers still do not even share a road map or change log, I’m always looking for examples of providers going the extra mile to provide more details, especially if they are innovating like Qlik is with visualizations. I see a lot of conversations about how to version an API, but very few conversations about how to communicate each version of your API. It is something I’d like to keep evangelizing, helping API providers understand they should at least be offering the essentials like a road map, issues, change log, and status page, but the possibility for innovation and pushing the conversation forward is within reach too!


Bringing The API Deployment Landscape Into Focus

I am finally getting the time to invest more into the rest of my API industry guides, which involves deep dives into core areas of my research like API definitions, design, and now deployment. The outline for my API deployment research has begun to come into focus and looks like it will rival my API management research in size.

With this release, I am looking to help onboard some of my less technical readers with API deployment. Not the technical details, but the big picture, so I wanted to start with some simple questions, to help prime the discussion around API deployment.

  • Where? - Where are APIs being deployed? On-premise, and in the clouds. Traditional website hosting, and even containerized and serverless API deployment.
  • How? - What technologies are being used to deploy APIs? From using spreadsheets, document and file stores, or the central database. Also thinking smaller with microservices, containers, and serverless.
  • Who? - Who will be doing the deployment? Of course, IT and developers groups will be leading the charge, but increasingly business users are leveraging new solutions to play a significant role in how APIs are deployed.

The Role Of API Definitions

While not every deployment will be auto-generated using an API definition like OpenAPI, API definitions are increasingly playing a lead role as the contract that doesn’t just deploy an API, but sets the stage for API documentation, testing, monitoring, and a number of other stops along the API lifecycle. I want to make sure to point out in my API deployment research that API definitions aren’t just overlapping with deploying APIs, they are essential to connect API deployments with the rest of the API lifecycle.

Using Open Source Frameworks

Early on in this research guide I am focusing on the most common way for developers to deploy an API, using an open source API framework. This is how I deploy my APIs, and there are an increasing number of open source API frameworks available out there, in a variety of programming languages. In this round I am taking the time to highlight at least six separate frameworks in the top programming languages where I am seeing sustained deployment of APIs using a framework. I don’t take a stance on any single API framework, but I do keep an eye on which ones are still active, and enjoying usage by developers.
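As a point of reference, here is about the smallest possible example of hand-rolling an API with one of these open source frameworks, in this case Express for Node.js. The resource and records are made up, and any of the frameworks highlighted in the guide would look similar in spirit:

    // A minimal API hand-rolled with the Express framework for Node.js.
    // The resource and records are placeholders.
    const express = require('express');
    const app = express();

    const companies = [
      { id: 1, name: 'Example Company' },
      { id: 2, name: 'Another Company' }
    ];

    app.get('/companies', (req, res) => res.json(companies));

    app.get('/companies/:id', (req, res) => {
      const company = companies.find(c => c.id === parseInt(req.params.id, 10));
      return company ? res.json(company) : res.status(404).json({ error: 'not found' });
    });

    app.listen(3000, () => console.log('API listening on port 3000'));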

Deployment In The Cloud

After frameworks, I am making sure to highlight some of the leading approaches to deploying APIs in the cloud, going beyond just a server and framework, and leveraging the next generation of API deployment service providers. I want to make sure that both developers and business users know that there are a growing number of service providers who are willing to assist with deployment, and with some of them, no coding is even necessary. While I still like hand-rolling my APIs using my preferred framework, when it comes to some simpler, more utility APIs, I prefer offloading the heavy lifting to a cloud service, and saving myself the time of getting my hands dirty.

Essential Ingredients for Deployment

Whether in the cloud, on-premise, or even on device and on the network, there are some essential ingredients to deploying APIs. In my API deployment guide I wanted to make sure and spend some time focusing on the essential ingredients every API provider will have to think about.

  • Compute - The base ingredient for any API, providing the compute under the hood. Whether it’s bare metal, cloud instances, or serverless, you will need a consistent compute strategy to deploy APIs at any scale.
  • Storage - Next, I want to make sure my readers are thinking about a comprehensive storage strategy that spans all API operations, and hopefully multiple locations and providers.
  • DNS - Then I spend some time focusing on the frontline of API deployment–DNS. In today’s online environment DNS is more than just addressing for APIs, it is also security.
  • Encryption - I also make sure encryption is baked into all API deployment by default, in both transit and storage.

Some Of The Motivations Behind Deploying APIs

In previous API deployment guides I usually just listed the services, tools, and other resources I had been aggregating as part of my monitoring of the API space. Slowly I have begun to organize these into a variety of buckets that help speak to many of the motivations I encounter when it comes to deploying APIs. While not a perfect way to look at API deployment, it helps me think about the many reasons people are deploying APIs, and craft a narrative, and provide a guide for others to follow, that is potentially aligned with their own motivations.

  • Geographic - Thinking about the increasing pressure to deploy APIs in specific geographic regions, leveraging the expansion of the leading cloud providers.
  • Virtualization - Considering the fact that not all APIs are meant for production and there is a lot to be learned when it comes to mocking and virtualizing APIs.
  • Data - Looking at the simplest of Create, Read, Update, and Delete (CRUD) APIs, and how data is being made more accessible by deploying APIs.
  • Database - Also looking at how APIs are being deployed from relational, noSQL, and other data sources–providing the most common way for APIs to be deployed.
  • Spreadsheet - I wanted to make sure and not overlook the ability to deploy APIs directly from a spreadsheet, making APIs within reach of business users.
  • Search - Looking at how document and content stores are being indexed and made searchable, browsable, and accessible using APIs.
  • Scraping - Another often overlooked way of deploying an API, from the scraped content of other sites–an approach that is alive and well.
  • Proxy - Evolving beyond early gateways, using a proxy is still a valid way to deploy an API from existing services.
  • Rogue - I also wanted to think more about some of the rogue API deployments I’ve seen out there, where passionate developers reverse engineer mobile apps to deploy a rogue API.
  • Microservices - Microservices has provided an interesting motivation for deploying APIs–one that potentially can provide small, very useful and focused API deployments.
  • Containers - One of the evolutions in compute that has helped drive the microservices conversation is the containerization of everything, something that complements the world of APIs very well.
  • Serverless - Augmenting the microservices and container conversation, serverless is motivating many to think differently about how APIs are being deployed.
  • Real Time - Thinking briefly about real time approaches to APIs, something I will be expanding on in future releases, and thinking more about HTTP/2 and evented approaches to API deployment.
  • Devices - Considering how APIs are being deployed on device, when it comes to Internet of Things, industrial deployments, as well as even at the network level.
  • Marketplaces - Thinking about the role API marketplaces like Mashape (now RapidAPI) play in the decision to deploy APIs, and how other cloud providers like AWS, Google, and Azure will play in this discussion.
  • Webhooks - Thinking of API deployment as a two way street. Adding webhooks into the discussion and making sure we are thinking about how webhooks can alleviate the load on APIs, and push data and content to external locations.
  • Orchestration - Considering the impact of continuous integration and deployment on API deployment specifically, and looking at it through the lens of the API lifecycle.

I feel like API deployment is still all over the place. The mandate for API management was much better articulated by API service providers like Mashery, 3Scale, and Apigee. Nobody has taken the lead when it comes to API deployment. Service providers like DreamFactory and Restlet have kicked ass when it comes to not just API management, but making sure API deployment was also part of the puzzle. Newer API service providers like Tyk are also pushing the envelope, but I still don’t have the number of API deployment providers I’d like when it comes to referring my readers. It isn’t a coincidence that DreamFactory, Restlet, and Tyk are API Evangelist partners–it is because they have the services I want to be able to recommend to my readers.

This is the first time I have felt like my API deployment research has been in any sort of focus. I carved this layer of my research off of my API management research some years ago, but I really couldn’t articulate it very well beyond just open source frameworks, and the emerging cloud service providers. After I publish this edition of my API deployment guide I’m going to spend some time in the 17 areas of my research listed above. All these areas are heavily focused on API deployment, but I also think they are all worth looking at individually, so that I can better understand where they also intersect with other areas like management, testing, monitoring, security, and other stops along the API lifecycle.


The Growing Importance of Geographic Regions In API Operations

I have been revisiting my earlier work on an API rating system. One area that keeps coming up as I’m working is around the availability of APIs in a variety of regions, and the cloud platforms that are driving them. I have talked about regional availability of APIs for some time now, keeping an eye on how API providers are supporting multiple regions, as well as the expanding world of cloud computing that is powering these regional examples of providing and consuming APIs.

I have been watching Amazon rapidly expand their available regions, as well as Google and Microsoft racing to catch up. But I am starting to see API providers like Digital Ocean providing APIs for getting at geographic region information, and Amazon provides API methods for getting the available regions for Amazon EC2 compute–I will have to check if this is standard across all services. Twilio has regions for their API client, and Runscope has a region API for managing how you run API tests from a variety of regions. The role of geographic regions when it comes to providing APIs, as well as consuming APIs is increasingly part of the conversation when you visit the most mature API platforms, and something that keeps coming up on my radar.
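As an example of the kind of region awareness I’m talking about, here is a small sketch using the AWS SDK for JavaScript to pull the regions available for Amazon EC2, assuming your AWS credentials are already configured in the environment:

    // List the geographic regions available for Amazon EC2 using the AWS SDK for JavaScript.
    // Assumes AWS credentials are already configured in the environment.
    const AWS = require('aws-sdk');
    const ec2 = new AWS.EC2({ region: 'us-east-1' });

    ec2.describeRegions({}, (err, data) => {
      if (err) {
        return console.error('Could not list regions', err);
      }
      data.Regions.forEach(region => {
        console.log(region.RegionName, region.Endpoint);
      });
    });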

We are still far from the average company being able to easily deploy, deprecate, and migrate APIs seamlessly across cloud providers and geographic regions, but as APIs become smaller and more modular, and cloud providers add more regions, and APIs to support automation around these regions, we will begin to see more decisions being made at deploy and run time regarding where you want to deploy or consume your API resources. To be able to do this we are going to need a lot more data and common schema regarding what geographic regions are available for deployment, what services operate in which regions, and other key considerations about exactly where our resources should operate. This is why I’m revisiting this work, to see what I can do to get API service providers to share more data from either the API provider or consumer side of the equation.

I am considering adding an area of my research dedicated to API regions, aggregating examples of how geographic regions are playing a role in API operations. I’m thinking region availability will play just as significant a role as performance, plans, security, reliability, and other areas of the API lifecycle when it comes to deciding where you deploy or consume your APIs. It feels like another one of the aspects of API operations that will overlap with many stops along the API lifecycle–not just deployment. One of the areas of the API lifecycle I’m increasingly thinking about that will affect geographic API decisions is regulations, and how governments are dictating what is acceptable when it comes to the storage, transmission, and access of digital resources. It feels like early notions of what the World Wide Web has been for the last 25 years are about to be blown out of the water, under the influence of digital nationalism, regulation, or even the Internet moving off planet, increasingly driven by satellite infrastructure.


Electronic Submission of Injury and Illness Records to OSHA

I recently learned that the Occupational Safety and Health Administration (OSHA) has issued guidance regarding the electronic submission of injury and illness records via an API, from an announcement that they have postponed the availability of electronic submissions for another six months. Regardless of the delay, it is good to see them migrating towards an API-focused approach to allowing businesses to be compliant with safety regulations and reporting guidelines.

Here are the details of the OSHA guidance, broken down into some interesting buckets:

  • Who: Establishments with 250 or more employees that are currently required to keep OSHA injury and illness records, and establishments with 20-249 employees that are classified in certain industries with historically high rates of occupational injuries and illnesses.
  • What: Covered establishments with 250 or more employees must electronically submit information from OSHA Forms 300 (Log of Work-Related Injuries and Illnesses), 300A (Summary of Work-Related Injuries and Illnesses), and 301 (Injury and Illness Incident Report). Covered establishments with 20-249 employees must electronically submit information from OSHA Form 300A.
  • When: The requirement becomes effective on January 1, 2017. The new reporting requirements will be phased in over two years. In 2017, all covered establishments must submit information from their completed 2016 Form 300A by July 1, 2017. In 2018, covered establishments with 250 or more employees must submit information from all completed 2017 forms (300A, 300, and 301) by July 1, 2018, and covered establishments with 20-249 employees must submit information from their completed 2017 Form 300A by July 1, 2018. Beginning in 2019 and every year thereafter, covered establishments must submit the information by March 2.
  • How: OSHA will provide a secure website that offers three options for data submission. First, users will be able to manually enter data into a web form. Second, users will be able to upload a CSV file to process single or multiple establishments at the same time. Last, users of automated record-keeping systems will have the ability to transmit data electronically via an API (application programming interface). We will provide status updates and related information here as it becomes available.

I think the three options available are interesting. Manual entry via a web form, and a CSV file upload–which is kind of a gateway to API-land–but they will also be providing the real deal when it comes to submitting forms using an API. All government agencies should be migrating towards this approach to handling forms, and OSHA provides us with one more blueprint to point at when convincing government to be more machine readable when it comes to forms–if there is a web form, or PDF form, there should also be an API for submitting as well.
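
Since the API itself has not been published yet, any integration is speculative, but a purely hypothetical sketch of what submitting Form 300A summary data from a record-keeping system might look like could be as simple as the following. The endpoint, field names, and authentication scheme are all invented here for illustration only.

```python
# Purely hypothetical sketch -- OSHA has not published the API yet, so the
# endpoint, payload fields, and authentication below are invented for
# illustration only.
import requests

OSHA_API = "https://example.osha.gov/api/injury-tracking/300a"  # hypothetical URL
API_TOKEN = "YOUR-TOKEN"                                        # hypothetical credential

form_300a_summary = {
    "establishment_name": "Example Manufacturing, Plant 4",  # hypothetical fields
    "year": 2016,
    "annual_average_employees": 312,
    "total_hours_worked": 624000,
    "total_deaths": 0,
    "total_cases_with_days_away": 3,
}

response = requests.post(
    OSHA_API,
    json=form_300a_summary,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
)
response.raise_for_status()
print(response.json())
```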

Now that the federal agency is on my radar I will be keeping an eye out for when their API is ready, and maybe even offer some help when it comes to the portal and presence for the API. I like this example because it is a good reference for APIs being used to deliver government forms, but also because it is a good example of API driven regulatory compliance, which I think we need more of. Not because regulations automatically equal good, but because we need regulations and business compliance to be as observable as possible–APIs will be how we do this.


Making An Account Activity API The Default

I was reading an informative post about the Twitter Account Activity API, which seems like something that should be the default for ALL platforms. In today’s cyber insecure environment, we should have the option to subscribe to a handful of events regarding our account or be able to sign up for a service that can subscribe and help us make sense of our account activity.

An account activity API should be the default for ALL the platforms we depend on. There should be a wealth of certified aggregate activity services that can help us audit and understand what is going on with our platform account activity. We should be able to look at, understand, and react to the good and bad activity via our accounts. If there are applications doing things that don’t make sense, we should be able to suspend access, until more is understood.

The Twitter Account Activity API callback request contains three levels of detail:

  • direct_message_events: An array of Direct Message Event objects.
  • users: An object containing hydrated user objects keyed by user ID.
  • apps: An object containing hydrated application objects keyed by app ID.

The Twitter Account Activity API provides a nice blueprint other API providers can follow when thinking about their own solution. While the schema returned will vary between providers, it seems like the API definition, and the webhook driven process can be standardized and shared across providers.
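
To make the webhook side of this concrete, here is a minimal sketch of a receiver for this kind of callback, written with Flask and assuming the three payload sections described above. A real integration also needs Twitter's CRC challenge handling and request signature verification, which are omitted here.

```python
# A minimal sketch of an account activity webhook receiver using Flask.
# It only logs the three payload sections described above; a production
# receiver would also handle Twitter's CRC challenge and verify the
# request signature before trusting the payload.
from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/account-activity", methods=["POST"])
def account_activity():
    payload = request.get_json(silent=True) or {}

    for event in payload.get("direct_message_events", []):
        print("direct message event:", event.get("type"))

    for user_id, user in payload.get("users", {}).items():
        print("user seen:", user_id, user.get("screen_name"))

    for app_id, app_info in payload.get("apps", {}).items():
        print("app seen:", app_id, app_info.get("name"))

    return "", 200

if __name__ == "__main__":
    app.run(port=8080)
```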

The Twitter Account Activity API is in beta, but I will keep an eye on it. Now that I have the concept in my head, I’ll also look for this type of API on other platforms. It is one of those ideas I think will be sticky, and if I can kick up enough dust, maybe other API providers will consider it. I would love to have this level of control over my accounts, and it is also good to see Twitter still rolling out new APIs like this.


Shared Publishing Of Data and API Projects, Portals, and Dashboards Using Github

Each one of the 80+ areas of my API Evangelist lifecycle research is a single Github repository to which I publish JSON or YAML data stores containing the news, organizations, tools, APIs, and patents that I’ve aggregated as part of my research. The home page of each site is a set of UI elements that take the data store and render it into something that makes the data consumable by a human. I reference each project in my storytelling, and it acts as a workbench as I craft my guides, white papers, and API strategy work as a consultant.

The news I have curated is published as a news listing. Organizations and tools are published as a listing with icons, title, description, and links. Listings of my partners, banner logos, and other elements are driven from YAML and JSON files I update as part of my continuous integration with Google Spreadsheets, Feedly, Twitter, Facebook, LinkedIn, and other services I use to manage my operations. I have a variety of manual and automated processes that run each day, publishing, syndicating, and moving the API Evangelist platform forward.

All of this is open to anyone else who wants to publish to it via a Github commit, or via the API when you possess a valid Github API token like I do. I publish data using a variety of Github accounts depending on which project or organization I am working on. In its simplest form it is collaborative website development. When you combine it with YAML and JSON data, and a Jekyll presentation layer, it can become collaborative dashboard publishing, with shared ownership of a forkable data engine. Each player can publish the YAML or JSON they are in charge of, and the Github hosted, Jekyll and Github Pages presentation layer displays it as a website, dashboard, or even machine readable, static data feeds in YAML, JSON, Atom, CSV, or another format.
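
For the API side of that, here is a minimal sketch of publishing a YAML data file into a repository using the Github contents API, which is how data can land in a Jekyll _data folder without cloning anything. The repository, file path, and YAML payload below are placeholders for illustration.

```python
# A minimal sketch of publishing a YAML data file to a Github repository
# via the contents API. Repo, path, and data are placeholders.
import base64

import requests

TOKEN = "YOUR-GITHUB-TOKEN"
REPO = "example-org/api-deployment"     # placeholder repository
PATH = "_data/organizations.yaml"       # placeholder data file path
API_URL = f"https://api.github.com/repos/{REPO}/contents/{PATH}"

yaml_body = "- name: Example API Provider\n  url: https://example.com\n"
headers = {"Authorization": f"token {TOKEN}"}

# If the file already exists we need its SHA to update it.
existing = requests.get(API_URL, headers=headers)
sha = existing.json().get("sha") if existing.status_code == 200 else None

payload = {
    "message": "Update organizations data",
    "content": base64.b64encode(yaml_body.encode()).decode(),
}
if sha:
    payload["sha"] = sha

response = requests.put(API_URL, json=payload, headers=headers)
response.raise_for_status()
print(response.json()["content"]["html_url"])
```

The same call works from a scheduled job, which is how these data stores can stay in sync with the other services feeding them.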

I have a variety of Github templates for managing my network of API research sites, as well as forkable projects that anyone can launch to support open data projects like my Adopta.Agency blueprint, or a human services API developer portal. It’s not just me doing this–you can find a forkable example of an API and developer portal over at the General Services Administration (GSA), providing a baseline API definition that other government agencies can follow when launching an API and publishing an API developer portal. Github works well for the continuous integration and deployment of open data, and API portals, projects, and dashboards that can be managed in a collaborative fashion out in the open, or behind a curtain with a private repository.

I’m working with a handful of API providers to experiment with publishing API industry data in this way. Combining the organizations, news, tools, patents, and other data I aggregate from across the API industry with additional monitoring, search, discovery, security, and other relevant data into single project sites, data and API portals, and industry dashboards. I’m looking to draw out more API service providers and encourage them to share data that could benefit the entire community. I’ve been doing this for a while with API definitions, but will be expanding into other areas of the lifecycle, and hopefully encouraging more sharing, and forcing y’all to come out of your silos a bit, and learn to work together–whether you like it or not.


Algorithmic Observability In Predictive Policing

As I study the world of APIs I am always on the lookout for good examples of APIs in action so that I can tell stories about them, and help influence the way folks do APIs. This is what I do each day. As part of this work, I am investing as much time as I can into better understanding how APIs can be used to help with algorithmic transparency, helping us see into the black boxes that algorithms so often are.

Algorithms are increasingly driving vital aspects of our world, from what we see in our Facebook timelines, to whether or not we would commit a crime in the eyes of the legal system. I was reading about algorithms being used in policing in the Washington Monthly, and I learned about an important example of algorithmic transparency that I would like to highlight and learn more about. A classic argument regarding why algorithms should remain closed is centered around intellectual property and protecting the work that gives you your competitive advantage–if you share your secret algorithm, your competitors will just steal it. While discussing these algorithms and their vendors, Rebecca Wexler explores the competitive landscape:

But Perlin’s more transparent competitors appear to be doing just fine. TrueAllele’s main rival, a program called STRmix, which claims a 54 percent U.S. market share, has an official policy of providing defendants access to its source code, subject to a protective order. Its developer, John Buckleton, said that the key to his business success is not the code, but rather the training and support services the company provides for customers. “I’m committed to meaningful defense access,” he told me. He acknowledged the risk of leaks. “But we’re not going to reverse that policy because of it,” he said. “We’re just going to live with the consequences.”

And remember PredPol, the secretive developer of predictive policing software? HunchLab, one of PredPol’s key competitors, uses only open-source algorithms and code, reveals all of its input variables, and has shared models and training data with independent researchers. Jeremy Heffner, a HunchLab product manager, and data scientist explained why this makes business sense: only a tiny amount of the company’s time goes into its predictive model. The real value, he said, lies in gathering data and creating a secure, user-friendly interface.

In my experience, the folks who want to keep their algorithms closed simply want to hide incompetence and shady things going on behind the scenes. If you listen to individual companies like PredPol, it is critical that algorithms stay closed, but if you look at the wider landscape you quickly realize this is not a requirement to stay competitive. There is no reason that all your algorithms can’t be wrapped with APIs, providing access to the inputs and outputs of all the active parts. Then, using modern API management approaches, these APIs can be opened up to researchers, law enforcement, government, journalists, and even the end-users who are being impacted by algorithmic results, in a secure way.
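
As a rough illustration of what that wrapping could look like, here is a minimal sketch using Flask around a placeholder scoring function. The model and its inputs are made up for illustration; the point is only that every request and response passes through a layer that can be logged, managed, and audited.

```python
# A minimal sketch of wrapping a scoring algorithm with an API so its
# inputs and outputs are observable. The scoring function and its inputs
# are placeholders for illustration.
from flask import Flask, jsonify, request

app = Flask(__name__)

def risk_score(features):
    # Placeholder for the actual model -- illustration only.
    return min(1.0, 0.1 * len(features))

@app.route("/predictions", methods=["POST"])
def predictions():
    features = request.get_json(force=True)
    score = risk_score(features)
    # In a real deployment this log line becomes the audit trail that
    # researchers, journalists, and regulators can be granted access to.
    print("inputs:", features, "output:", score)
    return jsonify({"inputs": features, "score": score})

if __name__ == "__main__":
    app.run(port=8081)
```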

I will continue to profile the algorithms being applied as part of predictive policing, and the digital legal system that surrounds it. As with other sectors where algorithms are being applied, and APIs are being put to work, I will work to find positive examples of algorithmic transparency like we are seeing from STRmix and HunchLab. I’d like to learn more about their approach to ensuring observability around these algorithms, and help showcase the benefits of transparency and observability for these types of algorithms that are impacting our world–helping make sure everyone knows that black box algorithms are a thing of the past, and the preferred approach of snake oil salesmen.


Continue To Explore Restaurant Menu as an Analogy for API Copyright and Patents

While working on my feedback to the EFF for the first response to the Oracle v Google API copyright case, one of the stories I published used the restaurant menu as an analogy for API copyright. This example was used in the most recent response by Google’s lawyers as they defended themselves in court, and as I work on my API patent research, I wanted to revisit this analogy in the same way, to help focus attention on why API patents are such a bad idea.

Building on my previous analogy, as a restaurant, imagine your restaurant specialty is delivering meat-centric dishes. Your burgers and steaks are da bomb! You literally have several “secret sauces”, some unique preparation processes, as well as some very appealing ways of naming and describing your dishes. Not that different from many API providers, who have some “secret sauces”, some unique process, as well as some very appealing ways of naming and describing the very useful APIs they are offering.

In regards to copyright, why would you want to lock up the naming and ordering of what you are offering? Even if your competitor copies the exact wording on their menu (documentation), their burgers and steaks do not have your secret sauce or unique processes. Also, why would you want to use copyright to keep food delivery services from aggregating your menu (documentation) alongside other restaurants? Don’t restrict how the local paper or food rag can reference your menu (documentation), and publish it on and offline–it is unnecessary and will do nothing to protect your business.

In regards to patents, why would you want to lock up the menu to your burgers and steaks alongside your secret sauce(s) and unique process? Could you imagine if McDonald’s sued everyone for patent infringement because they had a burger section on their menu? Someone comes up with a unique burger, and now nobody can have a specific meat dish section on their menu? The menu and the ingredients of your recipe shouldn’t be included in your patent. If your process is truly that unique and remarkable, then patent that–you shouldn’t be locking up the ingredients, and the ways of naming, describing, and providing a menu (documentation) for your dish (API).

I am not anti-patent (well, almost), but I am 100% anti API patent. APIs are not your secret sauce or process. The URL, parameters, headers, body, response, and other elements of your API are no more patentable than hamburger, buns, mustard, and ketchup are for your killer burger. The reason we have so many API patents is that we have very greedy, short-sighted companies who are just racing to get a piece of the action, and they have been taught that patents are how you make a grab for all the digital dishes on the table, or that might possibly be on the table in the future. They see things moving in a particular direction, and rather than doing those things well, they focus on locking up the doing of that thing.

If you are in the business of patenting your company’s technology, please focus on patenting your secret sauce and truly unique processes, not the method for exchanging, selling, and baking your solution into other systems and applications. APIs should not be patented. APIs, no matter how unique they might be, are not the thing you should be defending. You should be making them accessible, and defending the unique and valuable thing you do behind them. Please stop including APIs in your patent filings–it goes against everything that makes APIs work.


API Preparation At The Bureau For The 2020 Census

I was reading about what the Census Bureau is doing to prepare for the 2020 census over at GCN. I’ve been invested in what they are doing at Census for some time now, so it makes me happy to see where they are headed with their preparation for the 2020 census. From what I’ve read, and what I’ve seen with their existing API efforts, they have really taken APIs to heart and are working to bake APIs into everything they do.

According to GCN: Through the site’s application programming interface, users will be able to download and manipulate the data to serve their own purposes; ensuring that the API can drive all of data.census.gov’s core functions means outside users will have more power as well. “The more that we make this API capable, then we can serve our customers better by providing them with ways to extend the API in their own platforms for their customer base,” said Census Bureau Chief Data Officer Zach Whitman. Continuing to show that the folks at Census get APIs.
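
The Bureau's existing data API already gives a feel for how this works. Here is a minimal sketch of pulling state-level population, assuming the ACS 5-year dataset and its total population variable B01003_001E, both of which should be verified against the variables listing at api.census.gov; heavier usage requires a free API key.

```python
# A minimal sketch of pulling state-level population from the existing
# Census data API. The dataset (ACS 5-year) and variable (B01003_001E,
# total population) are assumptions to verify against the variables
# listing; a free API key is required for heavier use.
import requests

URL = "https://api.census.gov/data/2016/acs/acs5"
params = {"get": "NAME,B01003_001E", "for": "state:*"}

rows = requests.get(URL, params=params).json()
header, data = rows[0], rows[1:]  # first row is the column header

for name, population, state_code in data:
    print(f"{name}: {population}")
```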

The Census Bureau is the most important API out there for helping us understand the people of the United States, and how the economy is working, or not working, for us. When you look at the landing page they are working on in preparation for the 2020 census, you can tell they continue to work hard to find new ways of exploring and visualizing the huge amount of data they have gathered through the censuses of the past. I’m glad the Census Bureau has been on their API journey for several years now, as what they have learned will go a long way towards making the 2020 census have a more meaningful impact.

APIs are not just about providing access to data. They are also about allowing many 3rd parties to add and update data, as well as access it and put it to use. Having this infrastructure and practice in place will contribute to a more collaborative and interactive census in 2020, allowing local and regional players to actively play a role in the census as it plays out. What the Census teams have learned over the last couple of years operating their API platform will improve the results and the reach of the 2020 census in a way that will make it inclusive and relevant for the average citizen across the country.

I’ve met the Census data and API teams, so I’m not worried about their preparation for the 2020 census. I do worry about whether or not they will have the resources they need to get the job done, given the current discussions around funding the agency, as well as supporting the proper leadership at the agency. Like every other API effort I’ve come across, the success of the 2020 census won’t just be a technical thing–it will also be about having the right business and political platform to ensure APIs are done right and are serving all stakeholders, which in the case of the 2020 census includes every citizen in the United States.


I Have Two APIs I Am Interested In And I Am Not A Developer--What Do I Do?

My friend David Kernohan (@dkernohan) emailed me the other day asking for some advice on where to get started working with some data APIs he had been introduced to. This is such a common question for me, and surprisingly, seven years into API Evangelist, it is a question I still do not have easy answers for. Partly because I spend the majority of my time writing about providing APIs, but also because API consumption is oftentimes inconsistent, and just hard.

David provided me with two sources of data he wanted to work with, which I think help articulate the differences between APIs that can make things hard when you are just getting started with any API. Let’s break down the two APIs he wants to work with:

  • UNISTATS
    • Description: Compare official course data from universities and colleges.
    • URL: http://dataportal.unistats.ac.uk/Pages/ApiDocumentation
    • Details: It is an API with 8 separate paths to get what you need.
    • Resources: Institution, Course, Stages, Accreditations, Locations, Statistics
    • Data Type: XML & JSON
    • Authentication: Basic Auth
  • Higher Education Funding Council for England (HEFCE) Register of Higher Education Providers
    • Description: The HEFCE Register is a searchable tool that shows how the Government regulates higher education providers in England.
    • URL: http://www.hefce.ac.uk/reg/register/data/
    • Details: Downloadable files with 6 URLs available.
    • Resources: Providers, Courses
    • Data Type: XML, CSV
    • Authentication: None

Here you have two sources of data that overlap. One is actually an API, where you can change paths and parameters, and get different JSON or XML results. The other is just a download of an XML or CSV file. One has authentication using Basic Auth, which is a standard way of logging into websites that often is reappropriated for accessing web APIs. You can start to see why API consumption can become pretty overwhelming, pretty quickly.

CSV Is Easier

So where do we start? Well, with the HEFCE downloads you get the results in CSV, something you can quickly upload into a spreadsheet and get to work on. This is pretty straightforward data 101 stuff, which keeps CSV and the spreadsheet number one when it comes to working with data, reaching the widest audience. However, the core dataset David wanted to work with, from UNISTATS, is an API, with JSON and XML returned. I know us API folk like to think of APIs as opening up access to data, but this chasm is one that many folks aren’t going to be able to step over.

XML Is Harder

Let’s begin with pulling a list of institutions in XML. Before I can get at the data I need to sign up for an account. After signing up I am given a key which I can use to authenticate the first time I try to load the URL http://data.unistats.ac.uk/api/v3/KIS/Institutions.xml–there are more details about authentication as part of their documentation. As soon as I download this file, I double click on it to see which application will try to load it–Microsoft Word happily steps forward to assume the responsibility. However, it does me little good, and just loads a big XML blob into a document–what do I do with that? Next, I try Microsoft Excel, with the same results. Google Drive gives me the same response, uploading the XML and loading it as a blob, with no recognition by Google Docs or Sheets. So what now?

We have a set of CSV files, and potentially a set of XML files, after making our way through each available path of the UNISTATS API. We need to get the XML into Google Sheets or Microsoft Excel. The CSV is easy, the XML is hard–we will need to convert from XML to CSV. We could accomplish this with the Google Sheets importXML function, but because the API requires authentication it would need some programming–I wanted to keep this code free if possible for now. We are able to authenticate with the UNISTATS API via our browser and download the XML, so writing code, even a Google Apps Script, seems over the top.

XML To CSV Conversion

I recommend using a simple tool like XML To CSV Converter, and not overengineering things this time. You can just download the XML returned from the UNISTATS API, save it to your desktop, and upload it to the XML To CSV Converter to get the CSV edition. Then the data from the UNISTATS API and the HEFCE downloads, all in CSV format now, can be uploaded to Google Sheets, or imported into Microsoft Excel to work with. This process can be repeated as necessary, whenever the data is updated on each of the sites–no coding necessary.
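
That said, if you, or a developer you rope in, eventually want to script the conversion, a minimal sketch might look like the following. It assumes the UNISTATS key is sent as the Basic Auth username (confirm this against their documentation) and that each record in the XML is reasonably flat; deeply nested records would need more work.

```python
# A minimal sketch, not a finished tool: fetch XML from an API that uses
# Basic Auth (here the UNISTATS institutions endpoint, assuming the key
# is sent as the Basic Auth username), then flatten each record's child
# elements into CSV columns.
import csv
import xml.etree.ElementTree as ET

import requests

API_KEY = "YOUR-UNISTATS-KEY"  # issued when you sign up for an account
URL = "http://data.unistats.ac.uk/api/v3/KIS/Institutions.xml"

response = requests.get(URL, auth=(API_KEY, ""))
response.raise_for_status()

root = ET.fromstring(response.content)
records = list(root)  # assumes each direct child of the root is one record

# Collect the set of child tag names to use as CSV columns.
columns = sorted({child.tag for record in records for child in record})

with open("institutions.csv", "w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=columns)
    writer.writeheader()
    for record in records:
        writer.writerow({child.tag: (child.text or "").strip() for child in record})
```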

Reusing This Process For Future APIs

This process worked with these two APIs. Next time you are working on a project the APIs could have different types of paths available, returning XML, JSON, CSV, or other formats. They might have different types of authentication, requiring API keys as a parameter, or maybe even OAuth–raising the bar even higher when it comes to connecting. Most importantly, sometimes the data returned might not be neat columns and rows, and might not be compatible with working within a spreadsheet. Many APIs return “flat” data like we encountered this round, but an increasing number of APIs return much more structured data that won’t simply import into a spreadsheet.

For this API exercise, we were able to take care of business using the browser. We were able to download the CSV, and the XML from the API, using the browser as a client–no coding necessary. This is a really important element of understanding APIs. Websites return HTML for browsers to show to humans, and web APIs return XML, JSON, CSV, or another machine readable format for use in ANY application–in this case, it was a spreadsheet that will help us take things to the next level, and figure out what we want to do next with this data, to make sense of things.

Look At Leading API Clients – No Code Necessary

For future API projects, I recommend taking a look at the Postman or Restlet API clients, which will act as a client for working with simple, and more complex, APIs–helping you with authentication, headers, and other aspects of consuming a diverse range of APIs. These clients allow you to connect to APIs, and work with the XML, JSON, CSV, and other responses you will receive. Of course you will still have to download, convert, and upload the resulting data into whatever application you intend to work with the data within. These are simply clients, not applications that will help you transform, analyze, visualize, and make sense of your data–it is up to you to do this.

Some Final Thoughts On API Consumption

Despite almost 17 years of evolution, web API consumption is still hard, and largely in the realm of programmers. Google Sheets and Microsoft Excel have tools to help you pull in data from APIs, but authentication and complex data structures will always be an obstacle in this environment. For more complex API integrations I recommend adopting an API client like Postman or Restlet, which will augment Google and Excel spreadsheets in your toolbox. Beyond that, I encourage using Github for publishing the CSV, JSON, or YAML that is returned from APIs, and telling stories around the data using Github Pages. Github has been working hard to build in features for working with CSV, JSON, and YAML data, again making it possible to work with data returned from APIs with no, or minimal, coding–Github employs Jekyll, which in turn uses Liquid + HTML, but it is something still within the realm of non-programmers.

Learning how to consume APIs is a journey, not a destination. APIs come in many shapes and sizes, but if you grasp the basics of the web, and have some of the right tools in your toolbox, you can navigate them and put them to work. I’m going to work on some more Google Spreadsheet examples, as well as some Postman and Restlet examples, using some of the most common APIs out there like Twitter, Flickr, and others. I’ll check back with David in a couple of weeks to see how he is doing when it comes to onboarding with the APIs he has targeted for this project, and see where I can help him in his journey.


The Open Service Broker API

Jerome Louvel from Restlet introduced me to the Open Service Broker API the other day, a project that “allows developers, ISVs, and SaaS vendors a single, simple, and elegant way to deliver services to applications running within cloud-native platforms such as Cloud Foundry, OpenShift, and Kubernetes. The project includes individuals from Fujitsu, Google, IBM, Pivotal, RedHat and SAP.”

Honestly, I only have so much cognitive capacity to understand everything I come across, so I pasted the link into my super secret Slack group for API super heroes to get additional opinions. My friend James Higginbotham (@launchany) quickly responded with, “if I understand correctly, this is a standard that would be equiv to Heroku’s Add-On API? Or am I misunderstanding? The Open Service Broker API is a clean abstraction that allows ‘services’ to expose a catalog of capabilities, as well as the ability to create, use and delete those services. Sounds like add-on support to me, but I could be wrong[…]But seems very much like vendor-to-vendor. Will be interesting to track.”

At first glance, I thought it was more of an aggregation and/or discovery solution, but I think James is right. It is an API scaffolding that SaaS platforms can plug into their platforms to broker other 3rd party API services. It allows any platform to offer an environment for extending your platform like Heroku does, as James points out. It is something that adds an API discovery dimension to the concept of offering up plugins, or I guess what could be an embedded API marketplace within your platform. Opening up wholesale and private label opportunities for API providers to sell their warez directly on other people’s platforms.
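
To make the abstraction a little more concrete, here is a minimal sketch of asking a broker for its catalog of services and plans, per the Open Service Broker API. The broker URL and credentials are placeholders, and the X-Broker-API-Version header value should match whichever version of the spec the broker implements.

```python
# A minimal sketch of fetching a service broker's catalog via the
# Open Service Broker API. Broker URL and credentials are placeholders.
import requests

BROKER_URL = "https://broker.example.com"   # placeholder broker
AUTH = ("broker-user", "broker-password")   # placeholder credentials

response = requests.get(
    f"{BROKER_URL}/v2/catalog",
    auth=AUTH,
    headers={"X-Broker-API-Version": "2.13"},
)
response.raise_for_status()

for service in response.json().get("services", []):
    plans = [plan["name"] for plan in service.get("plans", [])]
    print(service["name"], "-", service.get("description", ""), "plans:", plans)
```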

The concept really isn’t anything new. I remember developing document print plugins for Box back when I worked with the Mimeo print API in 2011. The Open Service Broker API is just looking to standardize this approach, so that API providers could bake a set of 3rd party partner APIs directly into their platform. I’ve recently added a plugin area to my API research, and I will add the Open Service Broker API as an organization within this research. I’m probably also going to add it to my API discovery research, and I’m even considering expanding it into an API marketplace section of my research. I can see add-ons, plugins, marketplaces, and API brokering like this growing into its own discipline, with a growing number of definitions, services, and tools to support it.