The API Evangelist Blog

This blog represents the thoughts I have while I'm researching the world of APIs. I share what I'm working on each week, and publish daily insights on a wide range of topics from design to deprecation, spanning the technology, business, and politics of APIs. All of this runs on Github, so if you see a mistake, you can either fix it by submitting a pull request, or let me know by submitting a Github issue for the repository.


API Transit Basics: Training

This is a series of stories I'm doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I'm using a subway map visual and experience to help map out the journey, which I'm calling API transit–leveraging the verb form of transit to describe what every API should go through.

Think of support as the reactive area, while training is the proactive area of the API life cycle, ensuring there is a wealth of up to date material for API developers and consumers across all stops along the way. Investing in internal and partner capacity when it comes to the fundamentals of APIs, as well as the finer details of each stop along the API life cycle and your CI/CD pipelines, will pay off big time down the road.

Every API should be included in training materials, workshops, and potentially be part of conference talks given by product owners. If your API delivers value within your organization, to partners, and to 3rd party developers, you should be training folks on putting this value to use. Here are some of the training areas I'm seeing emerge within successful API operations:

  • Workshops - Conduct more workshops, both with external consultants like me, and internally by each group. The first day of our workshop was a great example of this in action.
  • Curriculum - Establish common approaches to designing, developing, and evolving curriculum for teaching about the API lifecycle, as well as using each individual API. Provide forkable templates that developers can easily put to use as part of their work, and make support materials a pipeline asset that gets deployed along with documentation, and other assets.
  • Conferences - Make sure you are sending team members to the latest conferences. In my conversations during the workshop, this didn't seem like a problem, but it is something I think should be included anyway.

Like every other stop along this journey, API training can be a pipeline artifact and be developed, deployed, and evolved alongside all other code, documentation, and available solutions. Pushing everyone to not just attend trainings, but also work to develop and deliver curriculum as part of trainings helps make sure everyone is able to communicate what they do across the company. Everyone should be contributing to, executing, and participating in API training, no matter what their skill level, or ability to get up in front of people and speak.

API training will increase adoption, and save resources down the road. It will complement platform communications, and strengthen support, while providing valuable feedback that can be included as part of the road map. My favorite part about doing API training is that it forces me as a developer to think through my ideas, and consider how I can articulate and share them with others, which if you think about it, is an essential part of the API journey, and something we should bake into operations by default.


Transit Authorities Need to Understand that API Management Means More Control and Revenue

As I look through the developer, data, and API portals of transit authorities in cities around the United States, one thing is clear–they are all underfunded, understaffed, and not a priority. This is something that will have to change if transit authorities are expected to survive, let alone thrive in a digital world. I’m working on a series of white papers as part of my partnership with Streamdata.io, on transit data, and the monetization of public data. I’ll be writing more on this subject as part of this work, but I needed to start working through my ideas, and begin crafting my narrative around how transit authorities can generate much needed revenue and compete in this digital world.

I know that many transit authorities often see data as a byproduct, and something that is occasionally useful, and that their main objective is to keep the trains running. However, every tech company, and developer out there understands the value of the data being generated by the transit schedule, the trains, and most importantly the ridership. The Googles, Ubers, and data brokers are mining this data, enriching their big data warehouses, and actively working to generate revenue from transit authorities' most valuable assets–their riders. Ticket sales, and people riding the trains seem like the direct value generation, but in the era of big data, where those people are going, what they are doing, reading, thinking, and who they are doing it with is equally or more valuable than the price of the ticket to ride.

Most of the transit APIs I'm using require you to sign up for a key, so there is some sort of API management in place. However, there are still huge volumes of feeds and downloads that are not being managed, and there is no evidence that there is any sort of analysis, reporting, or other intelligence being gathered from API consumption. This demonstrates that API management is seen as being more about rate limiting, and keeping servers up, than ever realizing what API management is truly about in the mainstream API world–value and revenue generation. Transit authorities need to understand that API management isn't just about restricting access, it is about encouraging access, and developing awareness regarding the value of the data resources they have in their possession.

I'm sure there is so much more data available behind the scenes beyond the schedules, vehicles, or even the real time information some transit authorities are making available. There is no reason that fare purchases, user demographics, neighborhood data, sensors, ticket swipes, and other operational data can't be made available, while of course taking care of privacy and security concerns. Technology platforms, and application developers have an insatiable appetite for this type of data, and are willing to pay for it. The greatest lie the devil has told is that public data should always be free, even for commercial purposes, allowing tech companies to freely mine, and generate revenues while transit authorities and municipalities struggle to make ends meet.

I'm not saying that transit data shouldn't be freely available, and accessible by everyone, but there needs to be more balance in the system, and API management is how you do this. You measure who takes from the system, and how much, and you pay or charge accordingly. Right now, there are applications mining transit data schedules and rider details, and selling leads to Uber, Lyft, and other ride share providers, further cannibalizing our public transit system. Transit authorities need to realize the value they generate on a daily basis, and understand that API management is key to quantifying this value, and generating the revenue they need to stay in operation, and even grow and thrive, better serving their constituents at a time when they need them the most.


The Value of Historical Transit Data When it Comes to Machine Learning

I'm working through the different ways that transit authorities can generate more revenue from their data using APIs as part of my work with Streamdata.io. Making data streaming, and truly more real time, is the obvious goal of this research, but Streamdata.io is also invested in transit authorities taking more control over their data resources, and using APIs to generate revenue at a time when they need all the revenue they can possibly get their hands on.

One overlap in the projects I'm working on with Streamdata.io is where transit data intersects with machine learning, and artificial intelligence. I'm not sure what transit authorities are doing with their historical data, but I know that it isn't available via their APIs, and developer portals. I'm guessing they see historical data about schedules, vehicles, ridership, and other data points as a burden, and once they've generated the reports they need, don't do anything else with it. This historical data is a goldmine of information when it comes to training machine learning models, which could then in turn be used to better understand ridership, make predictions, and understand maintenance, scheduling, and other aspects of transit operations–let alone commerce, real estate, and other demographic data.

There is a dizzying amount of investment going into machine learning and artificial intelligence right now, and some of it could be routed to transit authorities to help boost revenue. If all historical data on transit operations was digitized and available via APIs, then metered using modern API management approaches, it could be an entirely new revenue opportunity for transit authorities. Transit systems are the heartbeat of the cities they operate within, and historical data is the record of everything that occurs, which can be used to develop machine learning models for the transit industry, as well as real estate, commerce, and other sectors that transit systems feed into on a daily basis, and have for years.

I do not know what data transit authorities possess. I don't know how much historical data they keep around, or what is required by government regulators, but I do know that whatever is there, it has value. I've studied how API management is being used by tech companies for almost 8 years now, and it is how value is created, and revenue is generated, something that transit authorities and their leadership need to realize applies to them in a digital age. They are sitting on a wealth of historical data that would be of value to tech companies who are already mining their existing schedules, and real time vehicle data. Historical transit data and machine learning represent just one of many opportunities on the table for transit authorities to tap when it comes to looking for new revenue opportunities in the future.


Moving Beyond A Single API Developer Portal

I am working with a number of different groups who are using developer portals in some very different ways once they have moved beyond the concept that API developer portals are just for use in the public domain. With the static site shift in the CMS landscape, and the introduction of Jekyll and Github Pages as part of the Github project workflow, the concept of the API developer portal is beginning to mean a lot of different things, depending on what the project objective is.

An API portal is becoming something that can reflect a specific project, or group, and isn’t something that always has a public URL. Here are just a few of the ways in which I’m seeing portals being wielded as part of API operations.

  • Individual Portals - Considering how developers and business users can leverage portals to push forward conversations around the APIs they own and are moving forward.
  • Team Portals - Thinking about how different groups and teams can have their own portals which aggregate APIs and other portals from across their project.
  • Partner Portals - Leveraging a single portal, or even partner specific portals that are public or private, for engaging in API projects with trusted partners.
  • Public Portal - Begin the process of establishing a single Mutual of Omaha developer portal to provide a single point of entry for all public API efforts across the organization.
  • Pipeline Integration - How can BitBucket be leveraged for deploying individual, team, partner, and even the public portal, making portals another aspect of the continuous deployment pipeline?

One of the most interesting shifts that I am seeing is the deployment of portals as part of continuous deployment and integration pipelines. Since you can host a portal on Github, why not deploy it, manage it, and evolve it as its own pipeline, or as part of individual projects, and partner integrations? This is something that static CMSs have had a profound effect on, as well as the integration of YAML, JSON, and CSV static data formats, which can be used to deliver data, content, and configurations that can be used throughout project pipelines.
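
To make that concrete, here is a minimal sketch of the kind of static data file that can drive a portal, assuming a Jekyll based portal hosted using Github Pages. The file name and API entries are hypothetical, but Jekyll exposes anything in the _data folder to templates as site.data, so the portal's API listing becomes just another artifact the pipeline can update and deploy.

```yaml
# _data/apis.yml - a hypothetical data file driving the portal's API listing.
# Jekyll makes this available to templates as site.data.apis, so updating the
# portal is just a matter of committing a change to this file in the pipeline.
- name: Orders API
  openapi: /definitions/orders-openapi.json
  documentation: /docs/orders/
  owner: orders-team@example.com
- name: Accounts API
  openapi: /definitions/accounts-openapi.json
  documentation: /docs/accounts/
  owner: accounts-team@example.com
```

A template can then simply loop over site.data.apis to render the listing, keeping the portal itself completely static.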

Like every other stop along the API life cycle, API portals are quickly becoming more modular, and oftentimes more ephemeral, shifting from the days where we just had one single API portal. We are beginning to move beyond just the concept of a single portal, and seeing a mix of API centric destinations that are public or private, and reflect the changing objectives of different teams, partners, and even outside industry influences.


API Transit Basics: Support

This is a series of stories I'm doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I'm using a subway map visual and experience to help map out the journey, which I'm calling API transit–leveraging the verb form of transit to describe what every API should go through.

Beyond communication, make sure there is adequate support across API teams, and the services, tooling, and processes involved with operations. If an API goes unsupported, it might as well not exist at all, making standardized, comprehensive support practices essential. Every API should have an owner, with contact information published as part of all API definitions. Ideally, every API owner has a backup point of contact to go to if someone leaves the company, or is out sick.

Similar to the communications stop along this journey, there are a handful of common support building blocks you see present within the leading API pioneers' portals. These are the four I recommend baking in by default to all of your API portals, and anywhere API discovery and documentation exists.

  • Email - Make sure there is support available via email channels, with a responsive individual on the other end–with accountability.
  • Tickets - Consider a ticketing system for submitting and supporting requests around API operations.
  • Dedicated - Identify one, or many individuals who can act as internal, partner, and public support when it makes sense.
  • Office Hours - Consider a time each week when there is a human being available in person, or online, to answer direct questions.

Every single API should have a support element present as part of its operations. Each API definition allows for the inclusion of a responsible point of contact via email, and other channels–make use of it. Support should be baked into each API's definition, making it accessible across the API lifecycle, in a machine readable way. Ideally, every API developer provides support for their own work, bringing them closer to how their solutions are working, or not working, for consumers.
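
As a rough illustration of what I mean by baking support into the definition, here is a minimal OpenAPI (fka Swagger) info block using the specification's standard contact object. The names and addresses are placeholders.

```yaml
swagger: "2.0"
info:
  title: Orders API                              # hypothetical API
  version: "1.0.0"
  contact:
    name: Orders API Team                        # the accountable owner
    email: orders-api-support@example.com        # support email published with the definition
    url: https://developer.example.com/support
paths: {}
```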

API support isn’t rocket surgery, but will make or break your API operations if not done well. The APIs that perform the best, will have strong support behind them. The APIs that go dormant, or aren’t reliable will not have proper support. Developers do not think about support by default, which means it will often go overlooked unless a more senior developer, business user, or manager steps up. This is why API operations is a team sport, because there are different skill levels, and personalities at play, and while proper support won’t be everyone’s strength, it is something everyone should be responsible for.


Providing a Guest API Key

I'm spending time immersed in the world of transit data and APIs lately, and found a simple, yet useful approach to helping onboard developers over at the Washington Metropolitan Area Transit Authority (WMATA) API. When you click on their products page (not sure why they use this name), you get a guest API key which allows you to try out the API, and kick the tires. Of course, you can't use the key in production applications, as it is rate limited and can change at any time, but the concept is simple, and provides an example which other API providers might want to consider.

In my days as the API Evangelist I've seen API providers do this in a variety of ways, by providing sample API URLs complete with an API key, and by embedding a key in the OpenAPI definition, so that the interactive documentation picks up the key and allows developers to make live, interactive API calls–to name a few. No matter what your approach, providing a guest key for users to play around without signing up makes a lot of sense. Of course, you want to rotate this key regularly, or at least be monitoring it to see what IP addresses it is being called from, and maybe understand how it's being used–you never know, it might reveal some interesting use cases.
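
To show what I mean by embedding a key, here is one possible approach using a standard Swagger 2.0 security definition plus a vendor extension. The x-guest-key extension is something I am making up for illustration, not a standard, and your documentation tooling would need to know to read it and pre-populate the key for visitors.

```yaml
securityDefinitions:
  api_key:
    type: apiKey            # standard Swagger 2.0 API key definition
    name: api_key           # the query parameter carrying the key
    in: query
# Hypothetical vendor extension holding the rotating guest key, which
# interactive documentation could read and pre-populate for visitors.
x-guest-key: "REPLACE-WITH-ROTATING-GUEST-KEY"
```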

Whatever your approach, get out of the way of your consumers. Don’t expect that everyone is going to want to sign up for an account so that they can learn about what your API does. Not everyone is interested in handing over their email address, and other information just so they can test drive. If you have API management in place (which you should), it really doesn’t take much to generate test users, applications, and keys, monitor them, and rotate them on a regular basis. I recommend using them like you would marketing tags, and strategically place them into different blog posts, documentation, widgets, and other resources–you never know what other insight you might learn about how people are putting your resources to use.


API Transit Basics: Communication

This is a series of stories I'm doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I'm using a subway map visual and experience to help map out the journey, which I'm calling API transit–leveraging the verb form of transit to describe what every API should go through.

Moving further towards the human side of this API transit journey, I'd like to focus on one of the areas that I see cause the failure and stagnation of many API operations–basic communications. A lack of communication, and one way communication, are the most common contributors to APIs not reaching their intended audience, and not establishing the much needed feedback loops that contribute to the API road map. This portion of the journey is not rocket science, it just takes stepping back from the tech for a moment and thinking about the humans involved.

When you look at Twitter, Twilio, Slack, Amazon, SalesForce, and the other leading API pioneers you see a handful of communication building blocks present across all of them. These are just a few of the communication elements that should be present in both internal, as well as external or publicly available API operations.

  • Blogs - Make blogs a default part of ALL portals, whether partner, public, or internal. They don’t have to be grand storytelling vehicles, but can be used as part of communicating around updates within teams, groups, and for projects.
  • Twitter - Not required for internally focused APIs, but definitely essential if you are running a publicly available API.
  • Github - Github enables all types of communication around repos, issues, wikis, and other aspects of managing code, definitions, and content on the social coding platform.
  • Slack - Leverage Slack for communicating around APIs throughout their life cycle, providing a history of what has occurred from start to finish.
  • API Path IDs - Establish common DNS + API path identifiers for creating threads around each API, allowing for discussions on BitBucket, Slack, and in email around each one.

I recommend making communication a default requirement for all API owners, stewards, and evangelists who work internally, as well as externally. Ensure that there is communication around the existence, and life cycle, of each API, and help make sure there is awareness across teams, as well as up the management chain. It's not rocket science, but it is essential to doing business around programmatic interfaces. You don't have to be a poet, or prolific blogger, but you do have to care about keeping your API consumers informed.

Communication around API operations is easily overlooked, and difficult to recreate down the road. Just put it in place from the beginning, and don’t worry about activity levels. Be genuine in what you publish and share, and be responsive and open with your readers. Whenever possible make things a two-way street, allowing readers to share their thoughts. Track everything, and route it back into your road map, leveraging all communications as part of the API feedback loop.


Can I Resell Your API?

Everyone wants their API to be used. We all suffer from “if we build it, they will come” syndrome in the world of APIs. If we craft a simple, useful API, developers will flock to it and integrate it into their applications. However, if you operate an API, you know that getting the attention of developers, and standing out amongst the growing number of APIs is easier said than done. Even if your API truly does bring the value you envision to the table, getting people to discover this value, and invest the time into integrating it into the platforms, products, and services takes a significant amount of work–requiring that you remove all possible obstacles and friction from any possible integration opportunity.

One way we can remove obstacles for possible integrations is by allowing for ALL types of applications–even other APIs. If you think about it, APIs are just another type of application, and one that many API providers I've talked with either haven't thought about at all, or haven't thought about very deeply and restrict this use case, as they see it as directly competing with their interests. Why would you want to prevent someone from reselling your API, if it brings you traffic, sales, and the other value your API delivers for your company, organization, institution, or government agency? If a potential API consumer has an audience, and wants to private label your API, how does that hurt your business? If you have proper API management in place, and have a partner agreement in place with them, how is it different than any other application?

I've been profiling companies as part of my partnership with Streamdata.io, looking for opportunities to deliver real time streaming APIs on top of existing web APIs. Ideally, API providers become a Streamdata.io customer, but we are also looking to enable other businesses to step up and resell existing APIs as a streaming version. However, in some of the conversations I'm having, people are concerned about whether or not API providers' terms of service will allow this. These developers are worried that generating revenue through the reselling of an existing API is something that would ruffle the feathers of their API provider, and result in getting their API keys turned off. Which is a completely valid concern, and something that is spelled out in some terms of service, but I'd say is often left more as an unknown, resulting in this type of apprehension from developers.

Reselling APIs is something I'm exploring more as part of my API partner research. Which APIs encourage reselling, white and private labeling, and OEM partnerships? Which APIs forbid the reselling of their API? As well as which APIs have not discussed it at all. I'd love to hear your thoughts as an API provider, or someone who is selling their services to the API space. What are your thoughts on reselling your API, and have you had conversations with potential providers on this subject? I am going to explore this with Streamdata.io, APIMATIC, and other companies I already work with, as well as reach out to some API providers I'd like to resell as a streaming API using Streamdata.io and see what they say. It's an interesting conversation, which I think we'll see more discussion around in 2018.


API Transit Basics: Security

This is a series of stories I'm doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I'm using a subway map visual and experience to help map out the journey, which I'm calling API transit–leveraging the verb form of transit to describe what every API should go through.

Hopefully you already have your own security practices in place, with the ability to scan for vulnerabilities, and understand where security problems might exist. If you do, I’m guessing you probably already have procedures and protocols around reporting, and handling security problems across teams. Ideally, your API security practices are more about prevention than they are about responding to a crisis, but your overall strategy should have plans in place for addressing both ends of the spectrum.

Unfortunately, in the wider API space, much of the conversation around API security has been slowed by many people feeling like their API management solutions were doing everything that was needed. Luckily, in 2017 we began to see this thaw a bit, and some API security focused solutions began to appear on the market, as well as some existing players tuning in to address the specific concerns of API security, beyond the desktop, web, and other common areas of concern.

  • Scanning APIs with OWASP Zap - OWASP is the top place for understanding security vulnerabilities of web applications, and they are expanding their focus to include APIs.
  • 42 Crunch - A new, OpenAPI driven API security solution for helping deliver policies across API operations.
  • OWASP REST Security Cheat Sheet - A checklist of considerations when it comes to API security out of OWASP.

After crafting this stop along the API lifecycle I wanted to make sure and include API discovery in the conversation. API definitions like OpenAPI, and a solid API discovery strategy, help provide the details of the surface area of API operations, allowing for easier scanning and securing of existing infrastructure. Another area that introduces significant security benefits is making logging a first class citizen, allowing the DNS, gateway, code, server, and database layers to be analyzed for vulnerabilities.

I prefer keeping this security stop short and sweet, as I know from experience that not all my readers have a strategy in place, and I want to give them a handful of options to consider as they look to get started. Many groups have been focusing on web and mobile security, but are just getting started thinking about API security. As APIs move out of the shadows behind mobile applications, and the number of threats increase, companies, institutions, and government agencies are getting more nervous, increasing the need for more API security storytelling here on my site.


An Organized Approach to OpenAPI Vendor Extensions Across API Teams

One aspect of a project I've been assisting with recently involves helping define, implement, and organize the usage of OpenAPI vendor extensions across a distributed microservice development team. When I first began advising this group, I introduced them to the concept of extending the OpenAPI definition using the x-extension format, expanding the team's approach to how they use the OpenAPI specification. They hadn't heard that you can extend OpenAPI beyond what the specification brings to the table, allowing them to make it deliver exactly what they needed.

Within this project each microservice exists in its own Github repository, with an OpenAPI definition available in the root, defining the surface area of the API. At the organizational level, I have a script that will loop through all Github repos, spider each OpenAPI looking for any x-extensions, and then aggregate them into a single master list of OpenAPI vendor extensions that are used across all APIs. I then take the list, and add descriptions to each verified extension, and when I come across ones I haven't seen before I reach out to microservice owners to understand what the vendor extension is used for, and ensure it is in alignment with overall API governance for the project.
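
For anyone wanting to do something similar, here is a rough sketch of what that aggregation script can look like. I am assuming a hypothetical Github organization, a personal access token, and that each repository keeps a JSON edition of its OpenAPI named openapi.json in the root, so adjust for YAML or different file names as needed.

```php
<?php
// Rough sketch of the aggregation script described above. Assumptions: a
// hypothetical Github organization, a personal access token, and a JSON
// edition of each OpenAPI named openapi.json in the root of every repo.
$org   = 'your-org';
$token = 'YOUR-GITHUB-TOKEN';

function httpGet($url, $token) {
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_USERAGENT      => 'x-extension-aggregator',
        CURLOPT_HTTPHEADER     => ['Authorization: token ' . $token],
    ]);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}

// Recursively collect any keys that begin with "x-" and count how often they appear.
function collectExtensions($node, array &$found) {
    if (!is_array($node)) {
        return;
    }
    foreach ($node as $key => $value) {
        if (is_string($key) && strpos($key, 'x-') === 0) {
            $found[$key] = ($found[$key] ?? 0) + 1;
        }
        collectExtensions($value, $found);
    }
}

$repos      = json_decode(httpGet("https://api.github.com/orgs/{$org}/repos?per_page=100", $token), true);
$extensions = [];

foreach ($repos as $repo) {
    // For private repositories you may need the Github contents API instead of raw URLs.
    $raw     = httpGet("https://raw.githubusercontent.com/{$org}/{$repo['name']}/master/openapi.json", $token);
    $openapi = json_decode($raw, true);
    if ($openapi) {
        collectExtensions($openapi, $extensions);
    }
}

arsort($extensions);
print_r($extensions); // the master list of vendor extensions in use across all APIs
```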

I am looking to encourage API designers, developers, and architects to extend OpenAPI. I am also looking to help them be responsible with this power, and make sure they are doing it for good reasons. I am also looking to organize, then educate across teams regarding how different groups are using OpenAPI vendor extensions, and incentivize the reuse and standardization of these extensions. I see vendor extensions as an area for potential innovation when it comes to defining what an API is capable of, as well as the relationship each microservice has with its supporting architecture, acting as a relief valve for the often perceived constraints of the OpenAPI specification.

This project reminded me that I need to pick my head up each week and spend time aggregating vendor extensions from across the API space. I have a section dedicated to them in my OpenAPI toolbox, but I haven't added anything to it recently. OpenAPI vendor extensions are an important way to learn about how companies are extending the specification, and similar to what I'm doing for this individual project, it is important to aggregate and organize them so people in the community can learn from them, and reuse them whenever possible.


Orchestrating API Integration, Consumption, and Collaboration with the Postman API

You hear me say it all the time–if you are selling services and tooling to the API sector, you should have an API. In support of this way of thinking I like to highlight the API service providers I work with who follow this philosophy, and today's example is from [Postman](https://getpostman.com). If you aren't familiar with Postman, I recommend getting acquainted. It is an indispensable tool for integrating, consuming, and collaborating around the APIs you depend on, and are developing. Postman is essential to working with APIs in 2018, no matter whether you are developing them, or integrating with 3rd party APIs.

Further amplifying the usefulness of Postman as a client tool, the Postman API reflects the heart of what Postman does as not just a client, but a complete life cycle tool. The Postman API provides five separate APIs, allowing you to orchestrate your API integration, consumption, and collaboration environment. A simple example of calling it follows the list below.

  • Collections - The /collections endpoint returns a list of all collections that are accessible by you. The list includes your own collections and the collections that you have subscribed to.
  • Environments - The /environments endpoint returns a list of all environments that belong to you. The response contains an array of environments’ information containing the name, id, owner and uid of each environment.
  • Mocks - This endpoint fetches all the mocks that you have created.
  • Monitors - The /monitors endpoint returns a list of all monitors that are accessible by you. The response contains an array of monitors information containing the name, id, owner and uid of each monitor.
  • User - The /me endpoint allows you to fetch relevant information pertaining to the API Key being used.
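
To give a feel for how simple the API is to use, here is a minimal sketch of calling the collections endpoint with an API key generated from your Postman account. The key value is a placeholder.

```php
<?php
// Minimal sketch of listing your Postman collections via the Postman API.
// The base URL and X-Api-Key header come from the Postman API documentation;
// the key value below is a placeholder for one generated in your account.
$apiKey = 'YOUR-POSTMAN-API-KEY';

$ch = curl_init('https://api.getpostman.com/collections');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_HTTPHEADER     => ['X-Api-Key: ' . $apiKey],
]);
$response = curl_exec($ch);
curl_close($ch);

// Each collection includes an id, name, owner, and uid.
$collections = json_decode($response, true);
foreach ($collections['collections'] as $collection) {
    echo $collection['name'] . ' (' . $collection['uid'] . ')' . PHP_EOL;
}
```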

The user, collections, and environments APIs reflect the heart of the Postman API client, where mocks and monitors reflect its move to be a full API life cycle solution. This stack of APIs, and Postman as a client tool, reflects how API development, as well as API operation, should be conducted. You should be maintaining collections of APIs that exist within many environments, and you should always be mocking interfaces as you are defining, designing, and developing them. You should then also be monitoring all the APIs you depend on–whether or not the APIs are yours. If you depend on APIs, you should be monitoring them.

I've long been advocating that someone develop an API environment management solution for API developers, providing a single place we can define, store, and share the configuration, keys, and other aspects of integration with the APIs we depend on. The Postman collections and environments APIs are essentially this, plus you get all the added benefits of the services and tooling that already exist as part of the platform. This demonstrates why, as an API service provider, you want to be following your own advice and having an API, because you never know when the core of your solution, or even one of its features, could potentially become baked into other applications and services, and be the next killer feature developers can't do without.


API Life Cycle Basics: Testing

Every API should be tested to ensure it delivers what is expected of it. All code being deployed should meet required unit and code tests, but increasingly API testing is adding another layer of assurance to existing build processes, even going so far as halting CI/CD workflows if tests fail. API testing is another area where API definitions are delivering, allowing tests to be built from existing artifacts, and allowing detailed assertions to be associated with tests to add to and evolve the existing definitions.

API testing has grown over the last couple of years to include a variety of open source solutions, as well as cloud service providers. Most of the quality solutions allow you to import your OpenAPI, and automate the testing via APIs. Here are a few of the solutions I recommend considering as you think about how API testing can be introduced into your API operations.

  • Runscope - An API testing service that uses OpenAPI for importing and exporting of API tests and assertions.
  • Hippie-Swagger - An open source solution for testing your OpenAPI defined APIs.
  • Spring Cloud Contract - Spring Cloud Contract is an umbrella project holding solutions that help users in successfully implementing the Consumer Driven Contracts approach.
  • Postman Testing - With Postman you can write and run tests for each request using the JavaScript language.
  • Frisby.js - Frisby is a REST API testing framework built on Node.js and Jasmine that makes testing API endpoints easy, fast, and fun.
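
To make the assertions side of this conversation more concrete, here is a minimal, hand-rolled sketch of the kind of checks these services and frameworks automate, written as a PHPUnit test. The endpoint and expected response key are hypothetical placeholders.

```php
<?php
// A minimal, hand-rolled example of the kind of assertions API testing
// services and frameworks automate. The endpoint URL and the expected
// "users" key are hypothetical placeholders.
use PHPUnit\Framework\TestCase;

class UsersApiTest extends TestCase
{
    public function testListUsersRespondsWithJson()
    {
        $ch = curl_init('https://api.example.com/v1/users');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $body        = curl_exec($ch);
        $status      = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        $contentType = curl_getinfo($ch, CURLINFO_CONTENT_TYPE);
        curl_close($ch);

        // The basics every API test should cover: status code, media type,
        // and the shape of the response body.
        $this->assertEquals(200, $status);
        $this->assertStringStartsWith('application/json', $contentType);

        $data = json_decode($body, true);
        $this->assertTrue(is_array($data));
        $this->assertArrayHasKey('users', $data);
    }
}
```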

There are numerous ways to augment API testing on top of your existing testing strategy. More of these providers are integrating with Jenkins and other CI/CD solutions, allowing API testing to deeply integrate with existing pipelines. My recommendation is that the artifacts from these tests and assertions also live alongside OpenAPI and other artifacts and are used as part of the overall definition strategy, widening the meaning of “contract” to apply across all stops along the lifecycle–not just testing.

While API testing may seem like common sense, I'd say that more than 50% of the organizations I'm talking with do not actively test all their APIs. Of the ones that do, I'd say less than half get very granular in their testing or follow test driven development philosophies. This is where service providers like Runscope deliver, helping bring the tools and expertise to the table, allowing you to get up and running in a cloud environment, building on a platform, rather than starting from scratch when it comes to your API testing.


API Transit Basics: SDKs

Software Development Kits (SDKs), and code libraries in a variety of programming languages have always been a hallmark of API operations. Some API pundits feel that SDKs aren’t worth the effort to maintain, and keep in development alongside the rest of API operations, while others have done well delivering robust SDKs that span very valuable API stacks–consider the AWS JavaScript SDK as an example. Amidst this debate, SDKs continue to maintain their presence, and even have been evolving to support a more continuous integration (CI) and continuous deployment (CD) approach to delivering APIs and the applications that depend on them.

Supporting SDKs in a variety of programming languages can be difficult for some API providers. Luckily there is tooling available that helps auto-generate SDKs from API definitions, helping make the SDK part of the conversation a little smoother. Of course, it depends on the scope and complexity of your APIs, but increasingly auto-generated SDKs and code as part of a CI/CD process is becoming the normal way of getting things done, whether you are just making them available to your API consumers, or you are actually doing the consuming yourself.

  • Swagger Codegen - The leading open source effort for generating SDKs from OpenAPI.
  • APIMATIC - The leading service for generating SDKs from OpenAPI, and including as part of existing CI/CD efforts.
  • RESTUnited - The easiest way to generate SDKs (REST API libraries): PHP, Python, Ruby, ActionScript (Flash), C#, Android, Objective-C, Scala, Java

Depending on your versioning and build processes, SDK generation can be done alongside all the other stops along this life cycle. When you iterate on an API, you simply auto-generate documentation, tests, SDKs, and other aspects of supporting your services. Not all providers I talk with are easily able to jump into this aspect of producing code, as their build processes aren't as streamlined, and some of their APIs are too large to expect auto-generated code to perform as expected. However, it is something they are working towards, along with other microservices, and decoupling efforts going on across their teams.
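
As a simple illustration of what that can look like as a build step, here is roughly how the Swagger Codegen CLI gets invoked to regenerate a client library whenever the definition changes. The spec path, target language, and output directory are placeholders.

```bash
# Regenerate a PHP client library from the current OpenAPI definition as part of the build.
# The spec path, target language, and output directory are placeholders.
java -jar swagger-codegen-cli.jar generate \
  -i ./definitions/openapi.json \
  -l php \
  -o ./generated/sdk-php
```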

Once you realize an API definition driven approach to delivering APIs, the line between deployment and SDKs blurs–it is all about generating code from your definitions. Sometimes the code is providing resources, and other times it is consuming them. It just comes down to whether you are deploying code server or client side. Another significant shift I'm seeing in the landscape with SDKs is things moving beyond just programming languages, and providing platform specific libraries for managing SalesForce, AWS, Docker, and other common components of our operations–further evolving the notion of what an SDK is and does in 2018.


AWS Has A Head Start Helping Navigate Regulatory Compliance In The Cloud

I'm providing API guidance on a project being delivered to a government agency, as part of my Skylight partnership, and found myself spending more time looking around the AWS compliance department. You can find details on certifications, regulations, laws, and frameworks ranging from HIPAA and FERPA to FedRAMP, so that it can be used by federal government agencies in the United States, and other countries. You can find a list of services that are in scope, and track their progress when it comes to compliance across this complex web of compliance rules. I've been primarily tracking the progress of the AWS API Gateway, which is currently in progress when it comes to FedRAMP compliance.

When it comes to regulatory compliance, AWS has a significant leg up on its competitors, Google and Microsoft. Both of these cloud platforms have existing regulatory efforts, but they aren’t as organized, or as far along as AWS’s approach to delivering in this area. Delivering cloud solutions that are compliant gives AWS a pretty significant advantage when it comes to first impressions with government agencies, and enterprise organizations operating within heavily regulated industries. Once this impression is made, and these groups have gotten a taste of AWS, and migrated systems, and data to their cloud, it will be hard to change their behavior.

As this whole Internet thing grows up, regulatory compliance is unavoidable. Many companies, organizations, institutions, and government agencies we are selling to are already needing to deliver when it comes to compliance, but even for the shiny new startups breaking new ground, at some point you will have to mature and deliver within regulatory constraints. Making AWS a pretty appealing place to be publishing databases, servers, and I'm hoping pretty soon, APIs using AWS API Gateway. If you are on the AWS API Gateway team, I'd love to get an update on the status, as I have a big government project I'd love to deploy using the API Gateway, instead of another industry provider gateway solution.


API Life Cycle Basics: Clients

I broke this area of my research into a separate stop a couple years back, as I saw several new types of service providers emerging to provide a new type of web-based API client. These new tools allowed you to consume, collaborate, and put APIs to use without writing any code. I knew that this shift was going to be significant, even though it hasn't played out as I expected, with most of the providers disappearing, or being acquired, and leaving just a handful of solutions that we see today.

These new web API clients allow for authentication, and the ability to quickly copy and paste API URLs, or import API definitions, to begin making requests and seeing responses for targeted APIs. These clients were born out of earlier API explorers and interactive API documentation, but have matured into standalone services that are doing interesting things to how we consume APIs. Here are the three web API clients I recommend you consider as part of your API life cycle.

  • Postman - A desktop and web client for working with and collaborating in a team environment around APIs.
  • PAW - Paw is a full-featured HTTP client that lets you test and describe the APIs you build or consume. It has a beautiful native macOS interface to compose requests, inspect server responses, generate client code and export API definitions.
  • RESTFiddle - An easy-to-use platform to work with APIs. It simplifies everything from API exposure to API consumption. RESTFiddle is an Enterprise-grade API Management Platform for teams. It helps you to design, develop, test and release APIs.

Using web API clients allows for APIs to be easily defined, mocked, collaborated around, and leveraged as part of an API definition driven life cycle. This approach to integration saves significant cycles by allowing APIs to be designed, developed, and integrated with before any code gets written. Plus, the team and collaboration features that many of them possess can significantly benefit the process of not just consuming APIs, but also developing them. Making API clients an essential part of any development team, no matter what you are building.

Using API clients, bundled with an API definition-driven approach, and a healthy API mocking setup, can save you significant time and money when it comes to crafting the right API. What used to take years of development to iterate around, can take days or weeks, allowing you to define, design, mock, consume, collaborate, communicate, and iterate until exactly the right API is delivered. This approach to API development is changing how we deliver APIs, making operations much more flexible, agile, and fast moving, over the historically rigid, brittle, and slow moving approach to delivering API resources.


The Metropolitan Transportation Authority (MTA) Bus Time API Supports Service Interface for Real Time Information (SIRI)

The General Transit Feed Specification (GTFS) format for providing access to transit data has dominated the landscape for most of the time I have been researching transit data and APIs over the last couple of weeks. A dominance led by Google and their Google Maps, who is the primary entity behind GTFS. However, the tech team at Streamdata.io brought it to my attention the other day that the Metropolitan Transportation Authority (MTA) Bus Time API supports the Service Interface for Real Time Information (SIRI), another standard out of Europe. I think MTA's introduction to SIRI, and the story behind their decision, tells a significant tale about how standards are viewed.

According to the MTA, SIRI (Service Interface for Real Time Information) is “a standard covering a wide range of types of real-time information for public transportation. This standard has been adopted by the European standards-setting body CEN, and is not owned by any one vendor, public transportation agency or operator. It has been implemented in a growing number of different projects by vendors and agencies around Europe.” I feel like their thoughts about SIRI not being owned by any one vendor are an important thing to take note of. While GTFS is an open standard, it is clearly a Google-led effort, and I'd say their decision to use Protocol Buffers reflects the technology, business, and politics of Google's vision for the transit sector.

The MTA has evolved SIRI as part of their adoption, opting to deliver APIs as more of a RESTful interface as opposed to SOAP, and providing responses in JSON, which makes things much more accessible to a wider audience. While they may be technologically sound decisions, I think using Protocol Buffers or even SOAP has political implications when you do not deeply consider your API consumers during the planning phases of your API. I feel like MTA has done this, and understands the need to lower the bar when it comes to the access of public transit data, ensuring that as wide of an audience as possible can put the real time transit data to use–web APIs, plus JSON, just equals an easier interface to work with for many developers.

I’m getting up to speed with GTFS and GTFS Realtime, and I am also getting intimate with SIRI, and learning how to translate from GTFS into SIRI. I’m looking to lower the bar when it comes to accessing real time transit data. Something simple web APIs excel at. I’ve been able to pull GTFS and GTFS Realtime data and convert into simpler JSON. Now that MTA has introduced me to SIRI, I’m going to get acquainted with this specification, and understand how I can translate GTFS into SIRI, and then stream using Server-Sent Events (SSE) and JSON Patch using Streamdata.io. Truly making these feeds available in real time, using common web standards.


API Life Cycle Basics: Documentation

API documentation is the number one pain point for developers trying to understand what is going on with an API, as they work to get up and running consuming the resources it possesses. From many discussions I've had with API providers, it is also a pretty big pain point for many API developers when it comes to trying to keep documentation up to date, and delivering value to consumers. Thankfully API documentation has been driven by API definitions like OpenAPI for a while, helping keep things up to date and in sync with changes going on behind the scenes. The challenge for many groups who are only doing OpenAPI to produce documentation, is that if the OpenAPI isn't used across the API life cycle, it will often become forgotten, recreating that timeless challenge with API documentation.

Thankfully in the last year or so I'm beginning to see more API documentation solutions emerge, getting us beyond the Swagger UI age of docs. Don't get me wrong, I'm thankful for what Swagger UI has done, but I'm finding it very difficult to get people to understand that OpenAPI (fka Swagger) isn't the same thing as Swagger UI, and that generating documentation isn't the only reason you create API definitions. There are a number of API documentation solutions to choose from in 2018, but Swagger UI still remains a viable choice for making sure your APIs are properly documented for your consumers.

  • Swagger UI - Do not abandon Swagger UI, keep using it, but decouple it from existing code generation practices.
  • Redoc - Another OpenAPI driven documentation solution.
  • Read the Docs - Read the Docs hosts documentation, making it fully searchable and easy to find. You can import your docs using any major version control system, including Mercurial, Git, Subversion, and Bazaar.
  • ReadMe.io - ReadMe is a developer hub for your startup or code. It’s a completely customizable and collaborative place for documentation, support, key generation and more.
  • OpenAPI Specification Visual Documentation - Thinking about how documentation can become visualized, not just text and data.

API documentation should not be static. It should always be driven from OpenAPI, JSON Schema, and other pipeline artifacts. Documentation should be part of the CI/CD build process, and published as part of an API portal life cycle as mentioned above. API documentation should exist for ALL APIs that are deployed within an organization, and used to drive conversations across development as well as business groups–making sure the details of API design are always in as plain language as possible.

I added the visual documentation as a link because I'm beginning to see hints of API documentation moving beyond the static, and even dynamic, realm, and becoming something more visual. It is an area I'm investing in with my subway map work, trying to develop a consistent and familiar way to document complex systems and infrastructure. Documentation doesn't have to be a chore, and when done right it can make a developer's day brighter, and help them go from learning to integration with minimal friction. Take the time to invest in this stop along your API life cycle, as it will help both you, and your consumers, make sense of the resources you are producing.


Working With General Transit Feed Specification (GTFS) Realtime Data

I've been diving into the world of transit data, and learning more about GTFS and GTFS Realtime, two of the leading specifications for providing access to static and real time transit data. I've been able to take the static GTFS data and quickly render it as APIs, using the zipped up CSV files provided. Next on my list, I wanted to be able to work with GTFS Realtime data, as this is the data that changes much more often, and ultimately is more valuable in applications and to consumers.

Google has developed a nice suite of GTFS Realtime bindings in a variety of programming languages, including .NET, Java, JavaScript / Node.js, PHP, Python, Ruby, and Golang. I went with the PHP bindings, which interestingly enough is the only one in its own Github repository. I’m using it because I still feel that PHP has the best opportunity for adoption within municipal organizations–something that is beginning to change, but still holds true in my experience.

The GTFS-realtime data is encoded and decoded using Protocol Buffers, which provides a compact binary representation designed for fast and efficient processing of the data. Even with the usage of Protocol Buffers, which is also used by gRPC via HTTP/2, all of the GTFS Realtime data feeds I am consuming are being delivered via regular HTTP/1.1. I'm doing all this work to be able to make GTFS Realtime feeds more accessible for use by Streamdata.io, as Protocol Buffers isn't something the service currently supports. To make the data accessible for delivery via Server-Sent Events (SSE), and for partial updates to be delivered via JSON Patch, I need the Protocol Buffer format to be reduced to a simpler JSON format–which will be my next week's worth of work on this project.

I was able to pretty quickly bind to the MTA subway GTFS Realtime feed here in NYC using the PHP bindings, and get at up to date “vehicle” and “alerts” data via the transit authority's feeds. I've just dumped the data to the screen in no particular format, but was able to prove that I am able to connect to any GTFS feed, and easily convert it to something I can translate into any format I desire. I'm opting to go with the Service Interface for Real Time Information (SIRI), which is more verbose than GTFS, but allows for availability in a JSON format. Now I just need to get more acquainted with the SIRI standard, and understand how it maps to the GTFS format.
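
For anyone wanting to follow along, here is roughly what that connection looks like using the PHP bindings. The feed URL and key are placeholders, and the accessor names follow the generated classes in the google/gtfs-realtime-bindings package, so double check them against the package's README.

```php
<?php
// Rough sketch of pulling and decoding a GTFS Realtime vehicle positions feed.
// The feed URL and key are placeholders, and the accessor names follow the
// generated classes in the google/gtfs-realtime-bindings PHP package.
require_once 'vendor/autoload.php';

use transit_realtime\FeedMessage;

$url  = 'https://transit.example.com/gtfs-realtime/vehicle-positions?key=YOUR-KEY';
$data = file_get_contents($url);

$feed = new FeedMessage();
$feed->parse($data); // decode the Protocol Buffers payload

$vehicles = [];
foreach ($feed->getEntityList() as $entity) {
    if ($entity->hasVehicle() && $entity->getVehicle()->hasPosition()) {
        $position   = $entity->getVehicle()->getPosition();
        $vehicles[] = [
            'id'        => $entity->getId(),
            'latitude'  => $position->getLatitude(),
            'longitude' => $position->getLongitude(),
        ];
    }
}

// Dump the decoded entities as simple JSON, the first step towards a SIRI
// shaped response that can be streamed using Server-Sent Events and JSON Patch.
header('Content-Type: application/json');
echo json_encode(['vehicles' => $vehicles]);
```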

I'm looking to have a solid approach to proxying a GTFS, and GTFS Realtime feed, and deploying it as a SIRI compliant API that returns JSON in the coming weeks, so that I can quickly proxy it using Streamdata.io and deliver updates in true real time. Where transit vehicles are located at any particular moment, and details about alerts coming out of each transit authority, are the most relevant, and real time, aspects of transit operations. While the GTFS Realtime format is real time in name, it really isn't in how it's delivered. You still have to poll the feeds for changes, which is a burden on both the client and server, making Server-Sent Events, and JSON Patch, a much more desirable, and cost effective way to get the job done.


OpenAPI Will Help You Get Your API House In Order

I wrote a piece previously about Consul not supporting Swagger documentation at this time, and the API provider and consumer impact of this decision. I'm going to continue picking on Consul with another API definition story, because they are forcing me to hand-craft my own OpenAPI. If I just had to learn about the API, and load the OpenAPI (fka Swagger) definition into my Postman client, publish it to the repo for the project, and import it into other tooling I'm using, I wouldn't be so critical. However, since I'm having to take their static, and often incomplete, documentation, and generate an OpenAPI for my project, I'm going to vent some about their API and schema design, and their short-sighted view of OpenAPI (cough, cough, fka Swagger).

I just finished an OpenAPI for the Consul ACLs API path, and am currently working on one for the Agent API path. I have already distilled down their static documentation into something that I can easily parse and translate into OpenAPI, I just need to finish doing the manual work. It is something I normally do with a scrape script, but the difficulties in consistently parsing their docs, combined with the scope of the docs, made me go with hand-crafting after distilling down the documentation for easier handling. I am familiar with the entire surface area of the Consul API, and now I'm getting to its intimate details, which also includes its intimate lack of details.

The first area I begin stumbling on is with the design of the Consul API. While it is a web or HTTP API, it isn't following most of the basics of REST, which would significantly help things be a little more intuitive. A simple example of this in action would be the LAN coordinates for a node path, which is a PUT with the following path /coordinate/update – making for verb redundancy in the HTTP verb and path. There also isn't a consistent approach to ids and how they are used in the paths, which is a side effect of no real focus on resources, or use of REST principles. Ultimately the API design isn't as bad as many APIs that I consume, but its inconsistencies do make it difficult to learn about.

The next area I find myself stumbling with is when it comes to the schema and response structure for the Consul API. In the documentation, some APIs provide no insight on what schema is returned, while others show a sample response, and others actually provide detail on the fields, types, and descriptions. Some paths will utilize the same schema, but only reference the need for part of it, and return only what it needs, with no consistency in how it does this. Again, a lack of coherency around any resource model, and just requesting and responding with what is perceived in the moment. The challenge is not all of us new developers are “in the moment”, and the lack of consistency makes it difficult to understand just exactly what is going on.

As I mentioned in my previous piece, OpenAPI (fka Swagger) is much, much more than just API documentation. In this example, if present, it would act as a design scaffolding for Consul to think through the overall design patterns they use for each of their APIs and their paths, as well as consistently use the schema across the request and response structure of each API. The only reason I'm writing this story is because I needed a break from documenting the inconsistent details of the API, piecing together a complete picture of the schema from one piece over here, and one piece over there, just so I can learn about the API and put it to use correctly in a project. I've read 100% of their API documentation, and hand-crafted an OpenAPI for about 30% of their API, and now I'm wishing that Consul had embarked on this journey, instead of me.

OpenAPI is much, much more than just documentation. It would give that strategic polish to a very tactically designed API. The dismissal of OpenAPI, because they've already done API documentation, is just one symptom of a very tactically operated API. If Consul had used OpenAPI, it would have given them scaffolding to help them get their API house in order. It would have allowed them to think through the details of their API, which now as a consumer I'm having to do, and provide feedback on, which hopefully will get included in the next version. With OpenAPI you have the opportunity to expedite this feedback loop, and see the missing details yourself, or potentially mock and provide the interface to a select group of users to provide feedback before you ever begin development.

My frustration with the Consul API isn't entirely about the design of it. It is mostly the incomplete design of it, and their unwillingness to pick up their head and look around, understanding what good design is, as well as what OpenAPI (fka Swagger) is, before responding that they don't support it. I'm going to keep hand-crafting my Consul OpenAPI, because I see the value of the service, even if I don't like their API, or the effort that was put into it. I'm hoping that they'll see the light with OpenAPI, and maybe my hand-crafted edition will turn on this light. If nothing else, at least it will provide an OpenAPI that other developers can use, even if Hashicorp's Consul team doesn't see the value.


How Do We Keep Teams From Being Defensive With Resources When Doing APIs?

I was talking with the IRS about their internal API strategy before Christmas, reviewing the team's current proposal before they pitched it across teams. One of the topics that came up, which I thought was interesting, was how to prevent some teams from taking up a defensive stance around their resources when you are trying to level the playing field across groups using APIs and microservices. They had expressed concern that some groups just didn't see APIs as a benefit, and in some cases perceived them as a threat to their current position within the agency.

This is something I see at almost EVERY SINGLE organization I work with. Most technical groups who have established control over some valuable data, content, or other digital resource, have entrenched themselves, and become resistant to change. Often times these teams have a financial incentive to remain entrenched, and see API efforts as a threat to their budget and long term viability. This type of politics within large companies, organizations, institutions, and government agencies is a bigger threat to change than technology ever is.

So, what can you do about it? Well, the most obvious thing is you can get leadership on your team, and get them to mandate change. Often times this will involve personnel change, and can get pretty ugly in the end. Alternately, I recommend trying to build bridges, by understanding the team in question, and finding ways you can do API things that might benefit them. Maybe more revenue and budget opportunities. Reuse of code through open source, or reusable code and applications that might benefit their operations. I recommend mapping out the group's structure and needs, and putting together a robust plan regarding how you can make inroads, build relationships, and potentially change behavior, instead of taking an adversarial tone.

Another way forward is to ignore them. Focus on other teams. Find success. Demonstrate what APIs can do, and make the more entrenched team come to you. Of course, this depends on the type of resources they have. Depending on the situation, you may or may not be able to ignore them completely. Leading by example is the best way to take down entrenched groups. Get them to come out of their entrenched positions, and lower their walls a little bit, rather than trying to breach them. You are better off focusing on doing APIs and investing in moving forward, rather than battling with groups who don't see the benefits. I guarantee they can last longer than you probably think, and have developed some pretty crafty ways of staying in control over the years.

Anytime I encounter entrenched team stories within organizations I get sad for anyone who has to deal with these situations. I've had some pretty big battles over my career, which ended up in me leaving good jobs, so I don't take them lightly. However, it makes me smile a little to hear one out of the IRS, especially internally. I know plenty of human beings who work at the IRS, but with the hard-ass reputation they have from the outside, you can't help but smile just a bit thinking of them facing the same challenges that the rest of us do. ;-) I think one of the most important lessons of microservices, and APIs, is that we can't let teams ever get this big and entrenched again. Once we have decoupled, let's keep things in small enough teams that this type of power can't aggregate and become too big to evolve again.


Building An API Partner Program For Streamdata.io

I am working with Streamdata.io on a number of fronts when it comes to our partnership in 2018. One of the areas I'm helping them build is the partner program around their API service. I'm taking what I've learned from studying the partner programs of other leading APIs, and I am pulling it together into a coherent strategy that Streamdata.io can put to work over time. Like any other area of operations, we are going to start small, and move forward incrementally, making sure we are doing this in a sensible, and pragmatic way.

First up are the basics. What are we trying to accomplish with the Streamdata.io partner program? I want to have a simple and concise answer to what their partner program does, and is designed to accomplish.

The Streamdata.io partner program is designed to encourage deeper engagement with companies, organizations, institutions, and government agencies that are putting Streamdata.io solutions to work, or are already operating within industries where Streamdata.io services will complement what they are already doing. This partner program is meant to encourage continued participation by our customers, through offering them exposure, storytelling opportunities, referrals, and even new revenue opportunities. The Streamdata.io partner program is open to the public, just reach out and we'll let you know if there is a fit between what our organizations are doing.

It is a first draft, and not the official description, but it takes a crack at describing why we are doing it, and some of what it is. After looking through our existing partner list, and talking about the objectives of the Streamdata.io partner program, we have settled on a handful of types of partner opportunities available with the program.

  • OEM - Our deeply integrated partners who offer a white label version of Streamdata.io services to their customers.
  • Mutual Lead Referral - Partners with whom we are looking to generate business for each other, sending leads back and forth to help drive growth.
  • Co-Marketing - Partners who include Streamdata.io in their marketing, as well as participate in our partner marketing opportunities.
  • Reseller / Distribution - Streamdata.io customers who are authorized to resell and distribute Streamdata.io services alongside their work.
  • Marketplace - Making sure Streamdata.io is in leading application and API marketplaces, such as Amazon Marketplace.
  • System Integrators - Agencies who offer consulting services, are up to speed on what Streamdata.io does, and can intelligently offer it to their customers.

We are currently going through our existing partner list, and preparing the website page that articulates what the partner program is, and who our existing partners are. Then we are looking to craft a road map for what the future of the Streamdata.io partner program will look like.

  • Storytelling - Opportunities to include their products, services, and use cases in storytelling on the Streamdata.io blog, and via white papers, guides or other formats.
  • Newsletter - Including partners in the newsletter we are launching later this month, allowing for regular exposure via this channel.
  • Testimonials - Getting, and showcasing the testimonials of our partners, and also giving testimonials regarding their solutions.
  • Case Studies - Crafting of full case studies about how a partner has applied Streamdata.io solutions within their products, and services.

We have other ideas, but this represents what we are capable of in the first six months of the Streamdata.io partner program. It gets the program off the ground, and has us reaching out to existing partners, letting them know of the opportunities on the table. It begins showcasing the program, and existing partners on the showcase page via the website, and invites other customers, and potential customers to participate. Once we get the storytelling drumbeat going for existing partners, and start driving traffic and links their way, we can look at what it will take to attract and land new partners as part of the effort.

Adding another dimension to this conversation, the telling of this story is directly related to my partnership with Streamdata.io. Which means I am a Streamdata.io partner, and they are one of my partners, and this storytelling is part of the benefits of being an API Evangelist partner. As part of the work on the Streamdata.io partner program we are looking at not just how the storytelling can be executed on Streamdata.io, but also here on API Evangelist. We are also looking into how the reseller, distribution, and integrator tiers of partnership can coexist with the consulting, speaking, and workshops I'm doing as part of my work as the API Evangelist. We will be revisiting this topic each week, and when I have relevant additions, or movements, I will tell the story here on the blog, as I do with most of my work.


API Life Cycle Basics: Portal

A coherent strategy for delivering and operating API portals is something that gets lost in a number of the API operations I am asked to review. It is also one of the more interesting aspects of the successful strategies I track on, something that, when done right, can become a vibrant source of information, and when done wrong, can make an API a ghost town that people back away from when they find it. As part of my research I think a lot about how API portals can be used as part of each API's lifecycle, as well as at the aggregate level across teams, within groups, between partners, and with the public.

The most common form of the API portal is the classic public developer portal you find with Twitter, Twilio, Facebook, and other leading API pioneers. These portals provide a wealth of healthy patterns we can emulate, as well as some not so healthy ones. Beyond these public portals, I also see other patterns within the enterprise organizations I work with, that I think are worth sharing, showing how portals aren't always just a single public destination, and can be much, much more.

  • Individual Portals - Considering how developers and business users can leverage portals to push forward conversations around the APIs they own and are moving forward.
  • Team Portals - Thinking about how different groups and teams can have their own portals which aggregate APIs and other portals from across their projects.
  • Partner Portals - Leveraging a single portal, or even partner specific portals that are public or private, for engaging in API projects with trusted partners.
  • Public Portal - Beginning the process of establishing a single public developer portal to provide a single point of entry for all public API efforts across the organization.
  • Pipeline Integration - How BitBucket can be leveraged for deploying individual, team, partner, and even public portals, making portals another aspect of the continuous deployment pipeline.

Portals can be used as the storage for the central truth of OpenAPI definitions, and their JSON schema. They can be where documentation, code, tooling, and other stops along the life cycle live. They also provide an opportunity for decentralization of API deployment, but done in a way that can be evolved alongside the existing CI/CD evolution occurring within many organizations, as well as aggregated and made available as part of company wide public, partner, or private discovery portals. Portals can be much more than just a landing page, and can act as a doorway to a vibrant ecosystem within an organization.

I admit, it can be tough to turn a landing page for a portal into an active source of information, but with the right investment over time, it can happen. I maintain almost 200 separate portals as part of my work as the API Evangelist. Not all of them are active and vibrant, but they all serve a purpose. Some are meant to be static and never changing, with others being more ephemeral and meant to eventually go away. Others, like the home pages for each stop along my API life cycle research, have stayed active for almost eight years now, providing a wealth of information on not just a single API, but an entire industry.


API Life Cycle Basics: DNS

DNS is one of those shadowy things that tends to be managed by a select few wizards, and the rest of an organization doesn't have much knowledge, awareness, or access at this level. APIs have shifted this reality for me, and it is something I'm also seeing at organizations who are adopting a microservices, and devops approach to getting things done. DNS should be a first class citizen in the API toolbox, allowing for well planned deployments supporting a variety of services, but also allowing for logging, orchestration, and most importantly security, at the frontline of our API operations.

There are some basics I wanted to introduce to my readers when it comes to DNS for their API operations, but I also wanted to shine a light on where the DNS space is headed because of APIs. Some DNS and cloud providers are taking things to the next level, and APIs are central to that. Like most other stops along the API life cycle DNS is not just about doing DNS for your APIs, it is also about doing APIs for your DNS.

  • Dedicated API DNS - I'm not in the business of telling you how to name the domain, or subdomain for your API, but you should have a plan, and also be considering having multiple subdomains, separating concerns across operations.
  • API Control Over DNS - DNS is the frontline for your API infrastructure, even internally, and you should be able to programmatically configure, audit, orchestrate, and manage the DNS for your APIs using APIs.
  • Regional Consideration - Begin thinking about how you name and manage your DNS with multiple zones and regions in operations–even if you aren’t quite ready, you should be thinking in this way.
  • Amazon Route 53 Auto Naming API - Amazon's auto naming API for service name management and discovery is worth studying, thinking about how service addressing can be automated, as well as standardized as part of the life cycle.
  • CloudFlare - You may not use them as a provider, but tune in and study the way CloudFlare does their DNS, as well as provides APIs for managing DNS.

DNS should be a prominent part of API operations, even with internal APIs. It is the first line of defense when it comes to security, as well as discovery, and allowing developers and partners to put APIs to work. DNS shouldn't be separate from the rest of the API life cycle, and should be reachable by all developers, with logging at this layer shipped so it can be included within API life cycle operations. DNS needs to come out of the shadows and be something your entire team is aware of, with transparency around configuration, as well as standard practices for usage across services.

I can't emphasize enough how DNS providers like CloudFlare have shifted my view of DNS. Even if you aren't using them for your primary DNS, I recommend setting up a domain and playing around with what they have to offer. At least tune into their blog and Twitter account, as they are pushing the conversation forward when it comes to DNS, and API access to this layer. DNS in 2018 is much more than just addressing for your APIs, it is about logging, security, and much, much more. Bring it out of the background, and take another look at how it can make a bigger impact on what you are looking to achieve with APIs.
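
To make the APIs for your DNS point concrete, here is a minimal sketch of programmatically auditing the DNS records for a zone using the CloudFlare v4 API, assuming you have an API token and zone ID from their dashboard. Treat the endpoint details as assumptions to verify against their documentation.

```python
# A minimal sketch of programmatically auditing DNS records for a zone via
# the CloudFlare v4 API. The token and zone ID are placeholders, and the
# endpoint details are assumptions to verify against their documentation.
import requests

API_TOKEN = "YOUR_CLOUDFLARE_API_TOKEN"  # placeholder
ZONE_ID = "YOUR_ZONE_ID"                 # placeholder

response = requests.get(
    f"https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/dns_records",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=10,
)
response.raise_for_status()

# Log each record so the whole team can see what is configured at this layer.
for record in response.json().get("result", []):
    print(record["type"], record["name"], "->", record["content"])
```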


Some Of The Thinking Behind The Protocols Used By Kafka

I've been studying the overall Apache Stack a lot lately, with an emphasis on Kafka. I'm trying to understand what the future of APIs will hold, and where the leading edge of real time, event-driven architecture is at these days. I'm going through the protocol page for Kafka, learning about exactly how they move data around, and found the reasoning behind the protocol decisions they've made along the way to be very interesting.

All the way at the bottom of the Kafka protocol page you can find the following under "Some Common Philosophical Questions", providing some interesting backstory on the decisions behind the very popular platform.

Some people have asked why we don’t use HTTP. There are a number of reasons, the best is that client implementors can make use of some of the more advanced TCP features–the ability to multiplex requests, the ability to simultaneously poll many connections, etc. We have also found HTTP libraries in many languages to be surprisingly shabby.

Others have asked if maybe we shouldn’t support many different protocols. Prior experience with this was that it makes it very hard to add and test new features if they have to be ported across many protocol implementations. Our feeling is that most users don’t really see multiple protocols as a feature, they just want a good reliable client in the language of their choice.

Another question is why we don’t adopt XMPP, STOMP, AMQP or an existing protocol. The answer to this varies by protocol, but in general the problem is that the protocol does determine large parts of the implementation and we couldn’t do what we are doing if we didn’t have control over the protocol. Our belief is that it is possible to do better than existing messaging systems have in providing a truly distributed messaging system, and to do this we need to build something that works differently.

A final question is why we don’t use a system like Protocol Buffers or Thrift to define our request messages. These packages excel at helping you to managing lots and lots of serialized messages. However we have only a few messages. Support across languages is somewhat spotty (depending on the package). Finally the mapping between binary log format and wire protocol is something we manage somewhat carefully and this would not be possible with these systems. Finally we prefer the style of versioning APIs explicitly and checking this to inferring new values as nulls as it allows more nuanced control of compatibility.

It paints an interesting picture of the team, the technology, and I think the other directions the API sector is taking when it comes to which protocols are being used. I don't know enough about how Kafka works to take any stance on their decisions. I'm just looking to take a snapshot of their stance, so that I can come back to it at some point in the future, when I do.

I published my diverse toolbox diagram a couple weeks back, which includes Kafka. As I continue to develop my understanding of the Apache Stack, and Kafka, I will further dial in the story that my API toolbox visual tells. The answers above further muddy the water for me about where it fits into the bigger picture, but I'm hoping it is something that will clear up with more awareness of what Kafka delivers.


We Are Not Supporting OpenAPI (fka Swagger) As We Already Published Our Docs

I was looking for an OpenAPI for the Consul API to use in a project I'm working on. I have a few tricks for finding OpenAPIs out in the wild, which always starts with looking over at APIs.guru, then secondarily Githubbing it (are we to verb status yet?). From a search on Github I came across an issue on the Github repo for Hashicorp's Consul, which asked for "improved API documentation", to which a Hashicorp employee ultimately responded with "we just finished a revamp of the API docs and we don't have plans to support Swagger at this time." This highlights the continued misconception of what "OpenAPI" is, what it is used for, and how important it can be not just to providing an API, but also to consuming it.

First things first. Swagger is now OpenAPI (has been for a while), an API specification format that is in the Open API Initiative (OAI), which is part of the Linux Foundation. Swagger is proprietary tooling for building with the OpenAPI specification. It’s an unfortunate and confusing situation that arose out of the move to the Open API Initiative, but it is one we need to move beyond, so you will find me correcting folks more often on this subject.

Next, let's look at the consumer question, asking for "improved API documentation". OpenAPI (fka Swagger) is much more than documentation. I understand this position, as much of the value it delivers to the API consumer is often what we associate with documentation delivering. It teaches us about the surface area of an API, detailing the authentication, request, and response structure. However, OpenAPI does this in a machine readable way that allows us to take the definition with us, load it up in other tooling like Postman, as well as use it to autogenerate code, tests, monitors, and many other time saving elements when we are working to integrate with an API. The lesson for API consumers here is that OpenAPI (fka Swagger) is much, much, more than just documentation.
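
To make the machine readable point more concrete, here is a minimal sketch of what I mean by taking the definition with you, loading an OpenAPI and enumerating the surface area of an API, the kind of output you can then feed into Postman imports, code generation, tests, and monitoring. The file name is just a placeholder for whatever definition you are working with.

```python
# A minimal sketch of treating an OpenAPI definition as more than documentation:
# load it, and enumerate the surface area so it can drive other tooling. The
# file name is a placeholder for whatever definition you are working with.
import yaml  # pip install pyyaml

with open("openapi.yaml") as handle:
    definition = yaml.safe_load(handle)

HTTP_VERBS = {"get", "put", "post", "delete", "patch", "options", "head"}

for path, operations in definition.get("paths", {}).items():
    for verb, details in operations.items():
        if verb in HTTP_VERBS:
            print(verb.upper(), path, "-", details.get("summary", "no summary"))
```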

Then, let's look at it from the provider side. It looks like you just revamped your API documentation, without much review of the state of things when it comes to API documentation. Without being too snarky, after learning more about the design of your API, I'm guessing you didn't look at the state of things when it comes to API design either. My objective is not to shame you for poor API design and documentation practices, just to point out that you are not picking your head up and looking around much when you develop a public facing API that many "other" people will be consuming. It is precisely the time you should be picking up your head and looking around. The lesson for the API provider here is that OpenAPI (fka Swagger) is much, much, more than just documentation.

OpenAPI (fka Swagger) is much, much, more than just documentation! Instead of being able to fork an OpenAPI definition and share it with my team members, allowing me to drive interactive documentation within our project portal, and empower each team member to import the definition and get up and running in Postman, I'm spending a couple of hours creating an OpenAPI definition for YOUR API. Once done I will have the benefits for my team that I'm seeking, but I shouldn't have to do this. As an API provider, Consul should provide us consumers with a machine readable definition of the entire surface area of the API. Not just static documentation (that is incomplete). Please API providers, take the time to look up and study the space a little more when you are designing your APIs, and learn from what others are doing when it comes to delivering API resources. If you do, you'll be much happier for it, and I'm guessing your API consumers will be as well!


API Life Cycle Basics: API Logging

Logging has always been in the background of other stops along the API lifecycle, most notably the API management layer. However, increasingly I am recommending pulling logging out of API management, and making it a first-class citizen, ensuring that the logs of all systems across the API lifecycle are aggregated, and accessible, allowing them to be accessed alongside other resources. Almost every stop in this basics of an API life cycle series will have its own logging layer, providing an opportunity to better understand each stop individually, but also side by side as part of the bigger picture.

There are some clear leaders when it comes to logging, searching, and analyzing large volumes of data generated across API operations. This is one area you should not be reinventing the wheel in, and you need to be leveraging the experience of the open source tooling providers, as well as the cloud providers who have emerged across the landscape. Here is a snapshot of a few providers who will help you make logging a first class citizen in your API life cycle.

  • Elastic Stack - Formerly known as the ELK Stack, the evolved approach to logging, search, and analysis out of Elastic. I recommend incorporating it into all aspects of operations, and deploying APIs to make them first class citizens.
  • Logmatic - Whatever the language or stack, staging or production, front or back, Logmatic.io centralizes all your logs and metrics right into your browser.
  • Nagios - Nagios Log Server greatly simplifies the process of searching your log data. Set up alerts to notify you when potential threats arise, or simply query your log data to quickly audit any system.
  • Google Stackdriver - Google Stackdriver provides powerful monitoring, logging, and diagnostics.
  • AWS CloudWatch - Amazon CloudWatch is a monitoring service for AWS cloud resources and the applications you run on AWS.

I recommend cracking open logging from EVERY layer, and shipping it into a central system like Elastic to make it accessible. Each stop along the API life cycle will have its own tooling and services, which will most likely come with their own logging and analysis solutions. Use those solutions. However, don't stop there, and consider the benefits of looking at log data side by side, shipping logs into a central system for analysis at the bigger picture level.
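
To show what shipping a log event from any one of these layers might look like, here is a minimal sketch using the Elasticsearch Python client, with a placeholder index name and event fields. In practice you would more likely ship logs with something like Filebeat or Logstash, but the idea of centralizing structured events is the same.

```python
# A minimal sketch of shipping a structured log event into a central
# Elasticsearch cluster. Index name and fields are placeholders; a shipper
# like Filebeat or Logstash would usually sit in this position in practice.
from datetime import datetime, timezone
from elasticsearch import Elasticsearch  # pip install elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder cluster address

event = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "layer": "dns",              # which stop along the life cycle produced this
    "message": "zone record updated",
    "api": "example-api",        # placeholder API identifier
}

es.index(index="api-lifecycle-logs", document=event)
```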

Logging will significantly overlap with the security stop along the API life cycle. The more logging you are doing, and the more accessible these logs are, the more comprehensive your API security will become. You'll find this becomes true at other stops along the API life cycle, and you will be able to better deliver on discovery, testing, definition, and other areas, with a more comprehensive logging strategy. Remember, logging isn't just about providing a logging layer, it is also about having APIs for your logging, providing a programmatic layer to understand how things are working, or not.


A Blueprint For An Augmented Transit API

I’m working through research on the world of transit APIs as part of my partnership with Streamdata.io. From what I’ve gathered so far, the world of transit data and APIs is quite a mess, and there is a pretty significant opportunity to improve upon what already exists. In the course of my research, I stumbled across MetroHero, which is an application and API provider that operates on top of the Washington Metropolitan Area Transit Authority data and API feeds.

I’m still working my way through their website, services, APIs, as well as talking with their team, but I’m fascinated with what they are doing, and wanted to think a little more about it before I talk with them this week. While their approach to improving upon WMATA applications is interesting, I think applying this way of thought to a government API is more interesting (surprise). The MetroHero API is what I’d consider to be an augmented API, operating on top of the WMATA API, and improving upon the data and services they make available about the Washington DC transit system.

The MetroHero APIs, as their developer portal states directly, "are available for free. In return, we require the following":

  • You must abide by WMATA’s Transit Data Terms of Use; by using our APIs, you agree to these terms of service.
  • Any data returned by or derived from data returned by our APIs must be freely available to all users of your application. Any paywalled application that utilizes our APIs must also provide a free tier with access to the same data returned by or derived from our APIs.
  • Any data returned by or derived from data returned by our APIs must be prominently credited back to MetroHero. For example, if this data is being displayed to users on a website or in an application, MetroHero must always be visually credited wherever and whenever the data appears or is used.

The MetroHero API is not sanctioned by WMATA. The MetroHero team doesn't charge for their API, while also being very passionate about improving upon the WMATA APIs. I've been tuned into what MetroHero does since I first wrote about WMATA's terms of service changes impacting them a while back, and I have been intrigued by an API that augments, and improves upon, a government agency's API. This is a topic I've been thinking about since my earlier frustrations with the federal government APIs I had been working on getting shut down during the fall of 2013. An experience that has pushed me to think more about ways in which we can improve upon existing government services using APIs.

I'm looking to craft a blueprint that reflects what MetroHero is already doing. Something that is forkable, and executable. I would like to see what MetroHero is doing become the default in communities. While transit is first in line, I'm envisioning a model that goes well beyond just transit, and into other 511 information like automobile traffic and incidents, then moves into 311, and 911. I will be adding this blueprint to my Adopta.Agency project, and focusing it on the local level, rather than at the federal level. The sub-domain for the project will be Transit.Adopta.Agency (not set up yet), then I'll add 511, 911, 311, and others later. I still have a lot of work to do on the transit portal blueprint, so first things first.


API Life Cycle Basics: API Management

The need to manage APIs is one of the older aspects of doing business with web APIs, beginning around 2006, then maturing, and being baked into the cloud and markets by 2016. Whether it is through a management gateway that proxies existing APIs, natively as part of the gateway that is used to deploy the APIs themselves, or as a connective layer within the code, API management is all about authenticating, metering, logging, analyzing, reporting, and even billing against API consumption. This landscape has significantly shifted lately, with the bottom end of the market becoming more competitive, but luckily there are enough open source and cloud solutions available to get the job done.

Over the last decade API management providers have collectively defined some common approaches to getting business done using web APIs. While still very technical, API management is all about the business of APIs, and managing the value generated from providing access to data, content, algorithms, and other digital resources using the web. Here are the handful of common aspects of API management, which are being baked into the cloud, and made available across a number of open source solution providers catering to the API space:

  • Authentication - Requiring all developers to register, obtain keys, and provide unique identification with each API request they make.
  • Service Composition - Allowing for the organizing and breaking down of APIs into meaningful lines of business, and allowing for different tiers of access to these products.
  • Rate Limiting - Limiting access to digital resources, protecting their value, and only allowing access to those who have been approved (see the sketch after this list).
  • Metering - Measuring each call that is made to APIs, and applying service composition, and pricing to all API traffic, quantifying the value of business being conducted.
  • Reporting - Providing analysis and reporting on all activity, enabling API providers to develop awareness, and drill down regarding how resources are being used.
  • Invoicing - Accounting for all API traffic and invoicing, charging, and crediting for API consumption, to generate revenue and incentivize the desired behavior among consumers.
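
To make a couple of these concepts more tangible, here is a minimal, illustrative sketch of rate limiting and metering in plain Python, using a simple token bucket and a per-key counter. It is not how any particular gateway or management provider implements these features, it is just the idea boiled down.

```python
# A minimal, illustrative sketch of two API management concepts: rate
# limiting (a token bucket) and metering (counting calls per API key). Not
# how any particular gateway implements it, just the idea in plain Python.
import time
from collections import defaultdict


class TokenBucket:
    def __init__(self, rate_per_second: float, capacity: int):
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens based on elapsed time, up to the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


buckets = defaultdict(lambda: TokenBucket(rate_per_second=5, capacity=10))
meter = defaultdict(int)  # total calls per API key, for reporting and invoicing


def handle_request(api_key: str) -> int:
    """Return an HTTP status code for a request from the given API key."""
    meter[api_key] += 1
    if not buckets[api_key].allow():
        return 429  # Too Many Requests
    return 200


print(handle_request("consumer-123"), meter["consumer-123"])
```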

APIs use the web, and API management allows companies, organizations, institutions, and government agencies to provide secure access to valuable resources in this environment. API management is about developing an awareness of who has access to resources, understanding how they are using them, and charging or compensating for the value consumed or generated via API access. While much of the conversation in the tech sector is focused on revenue generation at this layer, in reality it is about understanding value generation and exchange around valuable resources, with revenue generation being just one aspect of doing business using APIs on the web.

When it comes to selecting an API management solution, my recommendation is to always keep the relationship small, modular, and decoupled from other stops along the API lifecycle, keeping the business engagement limited as well, allowing you to grow and evolve without lengthy contracts. Every stop along the API life cycle should reflect the API philosophy, and be kept small, decoupled, and doing one thing well. This goes for the technical, as well as the business side of doing APIs. Increasingly my storytelling about API management is absent of vendors, and more focused on the nuts and bolts of managing APIs, reflecting the maturing and weaving of API management into the fabric of the cloud.


The Data Behind The Washington Post Story On Police Shootings in 2017

I was getting ready to write my usual, “wish there was actual data behind this story about a database” story, while reading the Fatal Force story in the Washington Post, and then I saw the link! Fatal Force, 987 people have been shot and killed by police in 2017. Read about our methodology. Download the data. I am so very happy to see this. An actual prominent link to a machine readable version of the data, published on Github–this should be the default for ALL data journalism in this era.

I see story after story reference the data behind it, without providing any links to the data. As a database professional this practice drives me insane. Every single story that provides data driven visualizations, statistics, analysis, tables, or any other derivative of data journalism, should provide a link to a Github repository which contains at least CSV representations of the data, if not JSON. This is the minimum for ALL data journalism going forward. If you do not meet this bar, your work should be in question. Other analysts, researchers, and journalists should be able to come in behind your work and audit, verify, validate, and even build upon and augment your work, for it to be considered relevant in this time period.

Github is free. Google Sheets is free. There is no excuse for you not to be publishing the data behind your work in a machine readable format. It makes me happy to see the Washington Post using Github like this, especially when they do not have an active API or developer program. I’m going to spend some time looking through the other repositories in their Github organization, and also begin tracking on which news agencies are actively using Github. Hopefully, in the near future, I can stop ranting about reputable news outlets not sharing their data behind stories in machine readable formats, because the rest of the industry will help police this, and only the real data-driven journalists will be left. #ShowYourWork
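
To show how low the bar is for the rest of us to audit and build upon this work, here is a minimal sketch of pulling the data straight from their Github repository and reproducing the headline number. The raw file URL is my assumption based on the repository layout at the time of writing, so verify the current path before relying on it.

```python
# A minimal sketch of auditing the data behind the story, pulled straight
# from Github. The repository and file path are assumptions based on the
# Washington Post's data-police-shootings repo; verify the current location.
import pandas as pd  # pip install pandas

URL = (
    "https://raw.githubusercontent.com/washingtonpost/"
    "data-police-shootings/master/fatal-police-shootings-data.csv"
)

data = pd.read_csv(URL)

# Reproduce the headline number for 2017 from the raw data.
shootings_2017 = data[data["date"].str.startswith("2017")]
print(len(shootings_2017), "records in 2017")
```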


Generating Revenue From The Support Of Public Data Using APIs

I'm exploring the different ways that public data access via APIs can be invested in, even with the opportunity for generating revenue, but without charging for access to the data itself. I want public data to remain accessible, but in reality it costs money to provide access to public data, and to refine and evolve it. This is meant to be an exploration of how public data is invested in, not how you lock up public data, so please read everything before commenting, as every time I write about this subject, there are folks who blindly declare that ALL public data has to be free, no matter what. I agree (mostly), but there have to be commercial monetization opportunities around public data, otherwise it will never evolve, improve, be enriched, and in some cases be available at all.

I am looking for opportunities for public data stewards / owners to generate much needed revenue, as well as for commercial interests to come in and augment, and build upon what is already available, going beyond what cash and resource strapped data stewards / owners might be able to do on their own. Here are a couple of areas I'm documenting a little more right now.

  • Wholesale APIs - Deploying, managing, and providing access to public data via APIs that are designed specifically for individual consumers. This could be deploying an API on AWS, Google, Azure, or other infrastructure provider, and providing private, or even I guess public access to the data. Delivering a personalized, customized, and performant public data API experience.
  • Real Time APIs - Providing a proxied stream of data from one or many public data sources, going beyond what the data steward / owner is capable of delivering. Charging for the technology, not access to the data itself.
  • Cached - Delivering a cached experience, so that when the primary source goes down, all API consumers are still able to get at historical, or other relevant data without interruption.
  • Transformation - Providing access to the data but in a different format than the source can provide. This is where the line begins to get blurry because technically you would be charging for access to the data here, albeit in a different format. Maybe it could be done as a wholesale API, where you charge for the service, not the data?
  • Enriched - Enriching public data, adding in additional data points, and other relevant data and content that makes it better. Another area things start to get blurry, because again, you are beginning to charge for access to the data. Maybe you would need to have a way to separate public, from the enrichment, I guess?

Those are just a couple of examples I'm thinking about. Most public data sources restrict the ability to charge for access to their data by any 3rd party, which makes sense. In this particular exercise I'm looking at transit data, and how the overall data and API access experience can be improved upon. There are augmented APIs built upon public transit feeds, like we see from MetroHero, which is built upon WMATA. They clearly state that they are not sanctioned by WMATA, and could go away at any point, and make it known the data is freely available, and should remain that way downstream. I'm also looking at building upon MTA transit feeds in NYC, making the GTFS Realtime feeds available as SIRI feeds, but it is something that will cost me money, and I can't afford to just provide access for free, subsidizing other applications without some return.

I'd like to improve upon MTA transit feeds. Make them more usable, and available in a streaming format using Streamdata.io. I can't do this in their current format. I can easily proxy their feeds, and transform them into a simpler SIRI JSON feed, but then I'm expected to just make them freely available. I'm looking to see how I can do this as a wholesale API, as well as make it available in a paid streaming format. I'm not charging for the data, I'm charging for the serving of it in real time? IDK. I am just exploring these thoughts right now. I get where all the concerns come into the picture around making money off public data, and limiting who gets access, however I really want public data to remain available, and improve–this is something that takes money, investment, and the ability to generate revenue.
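
To show what the proxy and transform piece looks like in practice, here is a minimal sketch that reads a GTFS Realtime feed using the official protocol buffer bindings, and flattens it into a simpler JSON shape. The feed URL is a placeholder, and the output structure is illustrative, not the full SIRI schema.

```python
# A minimal sketch of proxying a GTFS Realtime feed and flattening it into a
# simpler JSON shape. The feed URL is a placeholder, and the output is an
# illustrative structure, not the full SIRI schema.
import json
import requests
from google.transit import gtfs_realtime_pb2  # pip install gtfs-realtime-bindings

FEED_URL = "https://example.com/gtfs-realtime-feed"  # placeholder feed URL

feed = gtfs_realtime_pb2.FeedMessage()
feed.ParseFromString(requests.get(FEED_URL, timeout=10).content)

vehicles = []
for entity in feed.entity:
    if entity.HasField("vehicle"):
        vehicles.append({
            "trip": entity.vehicle.trip.trip_id,
            "latitude": entity.vehicle.position.latitude,
            "longitude": entity.vehicle.position.longitude,
        })

print(json.dumps(vehicles, indent=2))
```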


The Child Welfare Digital Services (CWDS) Certification, Approval, and Licensing Services (CALS) API

My partner Chris Cairns (@cscairns) over at Skylight sent me a link to the Child Welfare Digital Services (CWDS) Certification, Approval, and Licensing Services (CALS) API on Github the other day. The API isn't your traditional public API, but it shows what is possible when it comes to APIs at government agencies. The group behind the API has published their Digital Service Development Standards, and is actively using a Github Wiki to lay out the API strategy for the organization.

To give some background, the Child Welfare Digital Services (CWDS) is for "state and county workers who ensure that safe and quality licensed facilities and approved homes are available for the children and nonminor dependents who need them, the CALS Digital Service Team will facilitate activities related to ensuring that licensed facilities, approved homes and associated adults meet and maintain required standards." It makes me happy to see that they are investing so heavily in APIs, in support of such a worthy cause.

Looking around their wiki I found a handful of APIs:

You can also find more about their development process, data model, and approach to security on the Github Wiki for the organization. After looking around more at their Github organization, I found a handful of other operational APIs:

There was also a Core API which they use as a base across all API projects, standardizing how they do things. Smart! I also found their testing strategy worthy of noting, just so I can add to my research.

  • Integration Testing - To run Integration tests set property cals.api.url to point to environment host. Use gradle integrationTest task. In this case token will be generated for default test user, so it’s possible to test environment with Perry running in dev mode.
  • Smoke Testing - Smoke test suite is part of integration tests. Set cals.api.url, use gradle smokeTestSuite task. Smoke test endpoint is not protected by Perry.

I like the style of the Child Welfare Digital Services (CWDS) team. They are investing heavily in APIs, and aren't afraid of doing it out in the open. As it should be. It is important that ALL government agencies do this, so that other agencies can come along and build on their work, making government more efficient, and cost effective when getting business done. All of the APIs above can, and should be forked, and put to use across other child welfare organizations. I notice they are also using OpenAPI (fka Swagger), but haven't published the definitions as part of some of their projects. I will keep an eye on it and update when they do, as reuse of API definitions matters even more than reuse of code.

If you know of any government agency that is this progressive, and publishing their API strategy, processes, definitions, and code on Github, please let me know! This type of behavior needs showcasing, and I like to have a wealth of references on my blog which I can cite as I'm traveling around speaking, and consulting with government agencies. This type of API effort should be the default across all city, county, state, and federal government agencies.


Seeing Reflections From The Past Rippling In The API Pool When I Translated WSDL To OpenAPI

I was looking for the API definition and schema for the Service Interface for Real Time Information (SIRI) standard, but all they had were WSDLs and XSDs. I am working with the Metropolitan Transportation Authority (MTA) SIRI feed, which returns JSON, and I wanted to have a JSON schema reflecting the responses I was working with. So I took the WSDLs and converted them to OpenAPI using the API Transformer, which kind of felt like that scene in The Matrix where the world around Neo begins to turn to liquid as he pokes at it, right before he exits the matrix.

I regularly get emails and tweets from folks telling me "we've done all this before" when I talk about OpenAPI. I'm fully aware that we have, and have even written a few stories about it. However, translating my first WSDL into an OpenAPI was a new experience for me, and made me think a little more deeply about where we are at in 2018. I was reminded once again of how much we've left behind, and how much of this we are slowly recreating as part of the OpenAPI specification, and the other definitions and tooling we are developing. I don't think all of this is bad, but I do think we've never been able to have an honest conversation about all of this.

Over the last decade I feel like there have been two camps, those still committed to web services, and those who had invested in this new paradigm. Web service folks have mostly dug in their heels, and proclaimed, "we've already done all this". While we web API folks moved forward with this new realm, somewhat in denial about where we'd been. Most discussions have been pretty black or white, meaning it was all new, or all old, and we couldn't really ever talk about things at a granular level, exploring the grey layers in between. There was a lot of good in the web services model, and I don't think we have ever processed what that good was, we just moved on.

When I work with WSDL and XSD it is clear that this stuff wasn't meant for humans. Something I think web APIs, and using YAML for describing interfaces and schema, have significantly improved upon. I think an API design first approach has also helped out significantly, making the interfaces, and schema much softer, and easier to sit down with and understand WTF is going on. However I think a lot of the scaffolding and structure that existed in the web services world is being recreated, and we are doing a lot of the same work all over again. I feel like the polarizing conversations we see around hypermedia, GraphQL, microservices, and other areas of evolution in the space reflect the same damaging effects of technological dogma that keep us from being sensible about this stuff.

Anyhoo, I don't think there are any solutions here. I'm just going to develop a robust toolbox for helping me map web services to a fresher web API implementation. I'm not a fan of translating SOAP to REST, but I do like having tools that help me make sense of what was, and create a scaffolding for what can be. Then I just dive in and clean up, polish, and move forward as I see fit. I just had to share my moment of anxiety as I translated that WSDL into OpenAPI. For a moment I felt like Neo in the Matrix, and wasn't entirely sure whether this was real, or a simulation. I'm betting this is all real, we are just humans, and enjoy being our own worst enemies most of the time, doing work over and over and over, rarely ever learning from the past. ;-(


API Life Cycle Basics: Gateway

API gateways have long played a role in providing access to backend resources via web services and APIs. This is how web services have historically been deployed, but it is also how modern web APIs are being managed. Providing a gateway that you can stand up in front of existing web APIs, and proxy them through a single gateway that authenticates, logs, and manages the traffic that comes in and out. There are many management characteristics of API gateways, but I want to provide a stop along the API lifecycle that allows us to think about the API deployment, as well as the API management aspects of delivering APIs.

I wanted to separate out the API gateway discussion from deploy and manage, focusing specifically on the opportunities to deploy one or many gateways, while also looking at it separately as a pattern in service of microservices. While code generation for API deployment is common, gateways are making a resurgence across the sector when it comes to working with a variety of backend systems, on-premise and in the cloud. There are many API gateway solutions available on the market, but I wanted to focus in on a handful that help span deployment and management, as well as allowing for new types of routing, and transformation patterns to emerge. Here are a couple of the gateway solutions I’m studying more these days:

  • AWS API Gateway - The Amazon API Gateway allows for the ingestion of OpenAPIs (Swagger) and the deployment of APIs that connect to a variety of backend services defined as part of the AWS infrastructure.
  • Kong - Quickly build API-centric applications. Leverage the latest microservice and container design patterns. And tie it all together with the Kong microservice API gateway.
  • Zuul - I'm putting Zuul here because it has some routing characteristics which make it a deployment, as well as a management solution. Once you begin routing, you start to do some of the heavy lifting of design and deployment of resources.
  • API Gateway Pattern - A pattern for delivering microservices that use an API gateway, and support a variety of applications.
  • Building Microservices Using an API Gateway - Designing, building, and deploying microservices introduced the Microservices Architecture pattern.

You'll notice there is a mix of several concepts here. API gateway as a pattern, in service of delivering microservices, as well as deploying and managing your APIs. I'm doing this on purpose, to try and show how the API gateway landscape is shifting, and evolving with the microservices evolution, as well as in service of devices, and the Internet of Things (IoT). I think Netflix's approach to using Zuul reflects the shifting middleware roots of the gateway, while working hard to establish the right set of patterns to meet the demanding needs of a growing variety of clients who are consuming our APIs. You see this landscape shifting with new providers like Kong, as well as other leading web server platforms like NGINX, outlining how we navigate this new world.

The API gateway was always a single, monolithic point of entry in my mind. However, as I use AWS API Gateway in a variety of geographic regions and client accounts, and do more deploying of Kong wherever I need it, I'm beginning to change my tune. I'm working with more enterprise groups who have multiple API gateway solutions in play, the result of many disparate teams, as well as acquisitions, and incongruous evolution along the way. Sometimes this is seen as a bad thing, but when it is embraced as part of a larger API life cycle strategy, and driven by an API definition approach to doing APIs, you begin to see a method to the madness. Something that you can even begin to govern, and orchestrate at scale across many different groups, and thousands of APIs.
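
To give a sense of how lightweight standing up a gateway in front of an existing API can be, here is a minimal sketch that registers a service and a route using the Kong Admin API, assuming a local Kong instance with the Admin API on its default port. Treat the details as assumptions, and verify against the documentation for the version of Kong you are running.

```python
# A minimal sketch of proxying an existing API through Kong by registering a
# service and a route via the Admin API. Assumes a local Kong instance with
# the Admin API on its default port (8001); verify against your Kong version.
import requests

KONG_ADMIN = "http://localhost:8001"  # placeholder Admin API address

# Register the upstream API as a Kong service.
service = requests.post(
    f"{KONG_ADMIN}/services",
    json={"name": "example-api", "url": "https://api.example.com"},
    timeout=10,
).json()

# Attach a route so traffic to /example on the gateway proxies to the service.
route = requests.post(
    f"{KONG_ADMIN}/services/example-api/routes",
    json={"paths": ["/example"]},
    timeout=10,
).json()

print(service.get("id"), route.get("id"))
```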


Looking For THE Answer Instead Of Developing An Understanding Of Good API Design

I worked with a large team on a microservices and API governance training a couple months back, where I saw a repeating pattern that I've experienced with other large enterprise groups. They seemed to want the right answers to doing APIs and microservices, instead of developing an understanding of good (or bad) API design, and determining the right way for themselves. The biggest challenge with this perception amongst development groups is that there is no right answer, or one way of doing APIs and microservices, so you need to find the right way forward for your team, and for each project.

I get this a lot within large organizations who are just beginning their API journey, just show us the right way of doing it! To which I usually reply, let's roll up our sleeves and get to work on one of your services, and we will start showing you how to craft a simple, sensible, yet robust API that meets the needs of the project. I can't show you the "right way" until I understand what the particular needs are, and zeroing in on what is the "right way" takes work, refinement, and the crafting of a robust definition for the service.

API design isn't easy. It takes time to understand what schema is involved, and what role that schema will play in the request and response of an API. It takes work to understand whether the complexity should be spread horizontally across API paths, or vertically within a single path, using parameters, the body, and other aspects of the surface area of the API. Until I understand the client needs I can't fully articulate whether we should stick with a simple JSON response, or a more sophisticated hypermedia media type, or possibly even go with something like GraphQL.

Ultimately, I need you to go on this journey with me. I can help ask hard questions, and provide relevant answers about best practices I see across the API space, but I do not know THE answer. Sometimes there might be multiple right answers, or we won't know fully until we mock the interface and get to work playing with it. I know you want quick and direct solutions to how you should be doing this, but ultimately it is up to you to do the hard work of learning about what good API design is, and how we can apply it to this particular project. Each microservice within this project might have specific needs, and we'll have to come up with a variety of solutions that we can apply consistently across the entire project.

I am spending more time helping train teams using OpenAPI as a scaffolding for walking through practical API design, in the context of each service. Going through the paths, verbs, parameters, body, responses, status codes, schema, and fields for each service, one step at a time. It is providing a great way to teach them good API design, in the context of each service they are looking to deliver. Helping them understand that there is no one right answer to all of this. That we just have to invest in their overall API design toolbox. Understand the pros and cons of each decision they make, and develop a healthy understanding of what API design is, both good, and bad–over time.


API Life Cycle Basics: Deployment

There are many ways to deploy an API, making this another confusing stop along the API life cycle for some of my readers. My goal in having this be a separate stop from design, or possibly management, is to help API providers think about where and how they deploy APIs. From my perspective, API deployment might be about which framework and language you choose to deploy in, spanning all the way to where you might deploy it, either on-premise, on-device, or in the cloud. The how and why of deploying your API will play a significant role in determining how stably and consistently you are able to deliver API resources, impacting almost every other stop along the API life cycle.

Many API providers still think of API deployment in the context of their internal operations, as opposed to thinking about how their APIs will be put to use. The providers I see enjoying more flexibility and agility when it comes to API consumption are able to deploy APIs in a variety of languages, supporting a variety of existing platforms, and in any environment where they are needed. There are several concepts that are beginning to define API deployment in this new generation of compute in the cloud; here are just a handful of them.

  • Polyglot Deployment - The ability to deploy APIs in a variety of programming languages.
  • Multi-Platform - The ability to deploy APIs on a variety of platforms, and using existing systems.
  • Multi-Cloud - The ability to deploy APIs within Amazon, Azure, Heroku, and Google environments.
  • Frameworks - Leveraging a variety of open source API frameworks for deploying APIs (a minimal sketch follows this list).
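
As one small example of the frameworks piece, here is a minimal sketch of an API deployed with Flask, one of many open source options. The resource is a placeholder, and the point is simply how small the footprint of a single API can be.

```python
# A minimal sketch of deploying an API with one of many open source frameworks
# (Flask, in this case). The resource is a placeholder, just to show how small
# the footprint of a single API can be.
from flask import Flask, jsonify  # pip install flask

app = Flask(__name__)

ORGANIZATIONS = [
    {"id": 1, "name": "Example Agency"},
    {"id": 2, "name": "Example Company"},
]


@app.route("/organizations", methods=["GET"])
def list_organizations():
    """Return the list of organizations as JSON."""
    return jsonify({"organizations": ORGANIZATIONS})


if __name__ == "__main__":
    app.run(port=8080)
```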

I normally would put API gateways here as well, but because of renewed energy around gateway solutions actually deploying APIs instead of just managing and securing them, I’m breaking out gateway into its own stop along the API lifecycle. Gateway spans API deployment and management in my opinion, and while it should be considered alongside these elements, it is increasingly becoming its own stop. Ideally, teams are able to use a combination of gateway, as well as hand-rolled, and auto-generated approaches to deploying APIs, with the diversity I mentioned above.

Most groups I work with have one way to deploy APIs, using a single programming language. This results in many of them thinking about API consumption on the same terms. When you allow for, and support a variety of languages, platforms, and cloud environments, you open up a new world of possibilities when it comes to scaling, migrating, and hiring talent as part of your API operations. API deployment will be a new concept to many of my readers, and something not all will be ready for, but being able to think outside the API box you’ve been operating within until now is one of the basic aspects of the API life cycle you should be looking at evolving.


I Created An OpenAPI For The Hashicorp Consul API

I was needing an OpenAPI (fka Swagger) definition for the Hashicorp Consul API, so that I could use it in a federal government project I'm advising on. We are using the solution for the microservices discovery layer, and I wanted to be able to automate using the Consul API, publish documentation within our project Github, import into Postman across the team, as well as several other aspects of API operations. I'm working to assemble at least a first draft OpenAPI for the entire technology stack we've opted to use for this project.

First thing I did was Google, “Consul API OpenAPI”, then “Consul API Swagger”, which didn’t yield any results. Then I Githubbed “Consul API Swagger”, and came across a Github Issue where a user had asked for “improved API documentation”. The resulting response from Hashicorp was, “we just finished a revamp of the API docs and we don’t have plans to support Swagger at this time.” Demonstrating they really don’t understand what OpenAPI (fka Swagger) is, something I’ll write about in future stories this week.

One of the users on the thread had created an API Blueprint for the Consul API, and published the resulting documentation to Apiary. Since I wanted an OpenAPI, instead of an API Blueprint, I headed over to APIMATIC API Transformer to see if I could get the job done. After trying to transform the API Blueprint to OpenAPI 2.0 I got some errors, which forced me to spend some time this weekend trying to hand-craft / scrape the static API docs and publish my own OpenAPI. The process was so frustrating I ended up pausing the work, and writing two blog posts about my experiences, and then this morning I received an email from the APIMATIC team that they caught the errors, updated the API Blueprint, allowing me to continue transforming it into an OpenAPI definition. Benefits of being the API Evangelist? No, benefits of using APIMATIC!

Anyways, you can find the resulting OpenAPI on Github. I will be refining it as I use it in my project. Ideally, Hashicorp would take ownership of their own OpenAPI, providing a machine readable API definition that consumers could use in tooling, and other services. However, they are stuck where many other API service providers, API providers, and API consumers are–thinking OpenAPI is still Swagger, which is just about API documentation. ;-( I try not to let this frustrate me, and will write about it each time I come across it, until things change. OpenAPI (fka Swagger) is so much more than just API documentation, and is such an enabler for me as an API consumer when I’m getting up and running with a project. If you are doing APIs, please take the time to understand what it is. It is something that could be the difference between me using your API, or moving on to find another solution. It is that much of a timesaver for me.
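
To give you a sense of the automation I’m after, here is a rough sketch of calling the Consul catalog API from Python to list registered services. It assumes a local Consul agent listening on the default port 8500, and is just the starting point for the kind of client work a complete OpenAPI definition would let me generate instead of hand-write.

```python
# A rough sketch of automating against the Consul HTTP API: listing the
# services registered in the catalog. Assumes a local Consul agent on the
# default port 8500.
import requests

CONSUL = "http://localhost:8500"

def list_services():
    # The catalog endpoint returns a map of service name -> tags.
    response = requests.get(f"{CONSUL}/v1/catalog/services", timeout=5)
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    for name, tags in list_services().items():
        print(name, tags)
```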


Understanding Events Across Your API Platform In Real Time

I spend a lot of time trying to understand and define what an API is. With my new partnership with Streamdata.io I’m pushing that work into understanding APIs in real time, and as part of event-driven architecture. As the Streamdata.io team and I work to identify interesting APIs out there that would benefit from streaming using the service, a picture of the real time nature of API platforms begins to emerge. I’m beginning to see all of this as a maturity aspect of API platforms, and those who are further along in their journey have a better understanding of the meaningful events that are occurring via their operations.

As part of this research I’ve been studying the Stripe API, looking for aspects of the platform that you could make more real time, and streaming. Immediately I come across the Stripe Events API, which is a “way of letting you know when something interesting happens in your account”. Using the Stripe Events API, “you can retrieve an individual event or a list of events from the API. We also have a separate system for sending the event objects directly to an endpoint on your server, called webhooks.” This is the heartbeat of the Stripe platform, and represents the “events” that API providers and consumers want to know about, and understand across platform usage.

I think about the awareness API management has brought to the table in the form of metrics and analytics. Then I consider the blueprint the more mature platforms like Stripe have established when it comes to codifying this awareness in a way that can be accessed via API, and made more real time, or at least asynchronous using webhooks. Then I think about what Streamdata.io provides with Server-Sent Events, and JSON Patch, providing a stream of these meaningful events in real time, as soon as these events happen–no polling necessary. This is what I find interesting about what they do, and why I’ve signed up to partner with them. Well, that combined with them supporting me financially. ;-)
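
To illustrate the pattern, here is a generic sketch of consuming a Server-Sent Events stream and applying JSON Patch operations to keep a local copy of a resource up to date. The stream URL is hypothetical, and this is the general pattern rather than the exact Streamdata.io or Stripe implementation.

```python
# A generic sketch of consuming a Server-Sent Events stream and applying
# JSON Patch operations to maintain a local copy of a resource.
# The stream URL is hypothetical.
import json

import jsonpatch   # pip install jsonpatch
import requests    # pip install requests

STREAM_URL = "https://example.com/stream/events"  # hypothetical

def consume(url):
    state = {}  # local copy of the resource being streamed
    headers = {"Accept": "text/event-stream"}
    with requests.get(url, stream=True, headers=headers) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines(decode_unicode=True):
            if not line or not line.startswith("data:"):
                continue  # skip comments, event names, and keep-alives
            payload = json.loads(line[len("data:"):].strip())
            if isinstance(payload, list):
                # Subsequent messages arrive as JSON Patch documents.
                state = jsonpatch.apply_patch(state, payload)
            else:
                # The first message is typically a full snapshot.
                state = payload
            print("current state:", state)

if __name__ == "__main__":
    consume(STREAM_URL)
```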

Even before working with Streamdata.io, I have been studying the value of API driven events, and working to identify the API platforms who are mature enough to be mapping this value exchange out. This has led me to want a better understanding of event-driven architecture, not necessarily because it is the next thing with APIs, but because there is some substance in there. There is a reason why events matter. They represent the meaningful exchanges that are occurring via platforms. The valuable ones. The percentage of transactions we should be tuning into. I want to better understand this realm, and continue collecting a wealth of blueprints regarding how companies like Stripe are maximizing these exchanges.

Disclosure: In case it isn’t clear, Streamdata.io is my primary partner, paying me to do research in this area, and helping support me as the API Evangelist.


API Discovery Is Mostly About You Sharing Stories About The APIs You Use

I do a lot of thinking about API discovery, and how I can help people find the APIs they need. As part of this thinking I’m always curious why API discovery hasn’t evolved much in the last decade. You know, no Google for APIs. No magical AI, ML, AR, VR, or Blockchain for distributed API mining. As I’m thinking, I ask myself, “how is it that the API Evangelist finds most of his APIs?” Well, word of mouth. Storytelling. People talking about the APIs they are using to solve a real world business problem.

That is it! API storytelling is API discovery. If people aren’t talking about your API, it is unlikely it will be found. Sure people still need to be able to Google for solutions, but really that is just Googling, not API discovery. It is likely they are just looking for a company that does what they need, and the API is a given. We really aren’t going to discover new APIs. I don’t know many people who spend time looking for new APIs (except me, and I have a problem). People are going to discover new APIs by hearing about what other people are using, through storytelling on the web and in person.

In my experience as the API Evangelist I see three forms of this in action:

1) API providers talking about their API use cases on their blog
2) Companies telling stories about their infrastructure on their blog
3) Individuals telling stories about the APIs they use in their jobs, side projects, and elsewhere

This represents the majority of ways in which I discover new APIs. Sure, as the API Evangelist I will discover new APIs occasionally by scouring Github, Googling, and harvesting social media, but I am an analyst. These three ways will be how the average person discovers new APIs. Which means, if you want your API to be discovered, you need to be telling stories about it. If you want the APIs you depend on to be successful and find new users, you need to be telling stories about them.

Sometimes in all of this techno hustle, good old fashioned storytelling is the most important tool in our toolbox. I’m sure we’ll keep seeing waves of API directories, search engines, and brain wave neural networks emerge to help us find APIs over the next couple of years. However, I’m predicting that API discovery will continue to be defined by human beings talking to each other, telling stories on their blogs, via social media, and occasionally through brain interfaces.


API Transit Basics: Mocking

This is a series of stories I’m doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I’m using a subway map visual, and experience to help map out the journey, which I’m calling API transit–leveraging the verb form of transit, to describe what every API should go through.

One key deficiency I see in organizations that I work with on a regular basis is the absence of the ability to quickly deploy a mock version of an API. Meaning, the ability to deliver a virtualized instance of the surface area of an API that will accept requests, and return responses, without writing or generating any backend code. Mocking an API requires an API definition, and with many groups still producing these definitions from code, the ability to mock an API is lost in the shuffle. This leaves out the ability to play with an API before it ever gets built–which if you think about it, goes against much of why we design APIs in the first place.

Mocking of an API goes hand in hand with a design first approach. Being able to define, design, mock, and then receive feedback from potential consumers, then repeat until the desired API is delivered is significantly more efficient than writing code, deploying an API, and iterating on it over a longer time frame. Over the last couple of years, a growing number of services and tooling have emerged to help us mock our APIs, as well as the schema that are used as part of their requests and responses, giving birth to this entirely new stop along the API life cycle.

  • Mockable - A simple service for mocking web and SOAP APIs
  • Sandbox - A simple service for generating sandboxes using a variety of formats.
  • Stoplight Prism - An open source tool for mocking and transforming from OpenAPIs.
  • Wiremock - An open source tool for mocking APIs.
  • Postman Mock Server - You can set up mock servers from within the Postman environment.

Mocking of APIs is something that organizations who have not adopted an API definition approach to delivering APIs cannot ever fully realize. When you have a robust, machine readable definition of the surface area of your API, it allows you to quickly generate sandboxes, mocks, and virtualized instances of an API. These interfaces can then be consumed, and played with, allowing API definitions to be adjusted, tweaked, and polished until they meet the needs of consumers. This shortens the feedback loop between each iteration, and version of an API–saving both time and money.
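
As a rough illustration of what an API definition unlocks, here is a bare-bones sketch of generating a mock API from an OpenAPI 3.0 file, returning whatever example is defined for each path’s 200 response. The file name and the structural assumptions about where examples live are mine, and real mocking tools like Prism or the Postman mock servers do this far more completely.

```python
# A bare-bones sketch of generating a mock API from an OpenAPI 3.0 file,
# serving the example defined for each GET path's 200 response.
# "openapi.yaml" and its example layout are assumptions for illustration.
import yaml                       # pip install pyyaml
from flask import Flask, jsonify  # pip install flask

app = Flask(__name__)

with open("openapi.yaml") as f:  # hypothetical definition file
    spec = yaml.safe_load(f)

def make_handler(example):
    def handler(**kwargs):
        # Ignore path parameters and return the canned example.
        return jsonify(example)
    return handler

for path, methods in spec.get("paths", {}).items():
    get = methods.get("get")
    if not get:
        continue
    example = (
        get.get("responses", {})
           .get("200", {})
           .get("content", {})
           .get("application/json", {})
           .get("example", {})
    )
    # Convert OpenAPI {id} templating to Flask's <id> converter syntax.
    rule = path.replace("{", "<").replace("}", ">")
    app.add_url_rule(rule, endpoint=path, view_func=make_handler(example))

if __name__ == "__main__":
    app.run(port=4010)
```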

The API developers I’ve seen who have become proficient in defining and designing their APIs, and delivering mock APIs, have also begun to be more agile when it comes to mocking and virtualizing the data that gets returned as part of mock API responses. This pushes mocking and virtualization further into testing, security, and other critical aspects of API operations. Being able to mock API interfaces is a sign that API operations is maturing, allowing for costly mistakes to be eliminated, or identified long before anything goes into production, making sure APIs meet the needs of both providers and consumers long before anything gets set into stone.


Keeping Things In The Club By Drowning Everyone In API Complexity

After seven years of doing API Evangelist I have learned a lot about the realities of the technology sector, versus my own beliefs. One of the things that attracted me to web APIs in the first place was the ability to simplify access to data, content, algorithms, and other resources using a web URL. I could get a list of news articles, post a picture, launch a server in the cloud, and accomplish many other common business tasks using a simple URL. To me, good API design was always more about simplicity than it was about REST, or any other dogmatic approach to doing APIs. However, after seven years of doing this, I’m pretty convinced that most folks have very little interest in truly making things simple for anyone.

As the API space continues to move forward with efforts to address technical debt, and the cultural issues involved with the technology we are using within large enterprises as a part of the microservices movement–we are simultaneously seeing other fronts where leading edge practitioners are embracing technical complexity in service of scope, volume, and satisfying the requests of developers down in the weeds, not taking time to consider the big picture. You see this with trends like Kafka, GraphQL, and other areas, where we are moving forward with technology that isn’t entirely embracing the web, and introducing some pretty complex approaches to getting the job done.

I get it. The problems being solved are big. There is a lot of data. Complex delivery models. Robust, and highly functional applications. Simple web APIs can’t always deliver at the scope, scale, and satisfaction of the very technical folks involved. I’m not knocking things moving forward, but I am asking if everyone involved is thinking seriously about the big picture, and assessing the costs down the road–as well as those who get left behind. Not everyone will have the resources, knowledge, and ability to keep up, and I question whether this pace and complexity is actually required–then I get the feeling that maybe it is intentional. Survival of the fittest, meritocracy, unicorns, and all the competitiveness that the tech sector loves about itself.

My assessment is that not everyone is intentionally choosing complexity over simplicity. It is just that the current environment doesn’t afford taking a breath to think about it, and considering whether operating at this scope makes sense, whether all this data is actually needed, or if down the road this will all truly pencil out. For me, this is when we get to each point in time where we have to stop and ask, how did we accumulate all this technical debt? How did things get so bloated and complex? Oh yeah, we didn’t ever stop and ask ourselves the right questions along the way, and the folks who made the decisions have long moved on, and are enjoying greener pastures. Nobody is ever really ensuring there is accountability for any of this, but I guess that is how it all works, right? Moving forward, at all costs. It will be someone else’s problem down the road.

My guess is that folks who are in the business of mastering or developing the latest technology, and always moving on to the next big thing, will not see eye to eye with me. However, when you are someone like me, who has been doing this for 30+ years, and has come into many large organizations to help clean up legacy messes, you might agree with me a little more. I’m guessing this is why some high tech companies who are selling the next thing to the enterprise prefer hiring young whippersnappers, who like to move fast and break things. There will always be endless waves of these techpreneurs to hire out of college, as well as companies selling the latest technological solution that will magically fix all our legacy technical debt challenges. Circle of life, and all that. Keeping things in the club, by drowning everyone in complexity, and getting rich along the way.


API Life Cycle Basics: Database

This is a series of stories I’m doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I’m using a subway map visual, and experience to help map out the journey, which I’m calling API transit–leveraging the verb form of transit, to describe what every API should go through.

Deploying an API from a database is the most common approach to delivering APIs today. Most of the data resources we are making available to partners and 3rd party developers via APIs live in databases behind our firewall(s). While we have seen database platform providers begin to take notice of the need to make data available using the web, most APIs get deployed through custom frameworks, as well as gateways that expose backend systems as web APIs.

If you are deploying APIs from a centralized legacy database, there will be significantly more security, performance, and other operational concerns than if your database is dedicated to providing a backend to your API. There are a growing number of open source tools for helping broker the relationship between your API and the database, as well as evolving services, and entire database platforms that are API-centric. Here are just a handful of what I’m seeing out there to support the database stop along an API journey.

  • DataBeam - Generic RESTful Interface for databases.
  • Arrest-MySQL - A plug-n-play RESTful API for your MySQL database.
  • Postgrest - REST API for any Postgres database.
  • Restheart - RESTHeart, the automatic REST API Server for MongoDB.
  • NodeAPI - Simple RESTful API implementation on Node.js + MongoDB.
  • PHP CRUD API - Single file PHP script that adds a REST API to a SQL database
  • Google Cloud Spanner - Cloud Spanner is the first and only relational database service that is both strongly consistent and horizontally scalable.

There are many database-to-API tools and services available out there. There are also many cloud-native solutions available to help you generate APIs from your preferred cloud provider. Amazon, Azure, and Google all provide API deployment and management solutions directly from their database solutions. The most difficult part about helping folks think about this stop along the API journey is the many different scenarios for how data is stored, and the limitations on how that data can be made available via APIs.

Ideally you are starting from scratch with your API, and you can deploy a new database, with a brand new API layer exposing your data store within. If you are deploying from a legacy database which serves other systems and applications, I recommend thinking about replicating the database and creating read only instances for accessing via the API, or if you need read / write capabilities, then take a look at many of the gateway solutions available today. Beyond that, if you have the skills to securely connect directly to your database, there are many more options on the table to help you get the job done in today’s web-centric, data-driven world.
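
For illustration, here is a minimal sketch of a read-only API layer over a database, in the spirit of the tools listed above. It uses SQLite and a hypothetical agencies table purely as an example; a replicated, read-only instance of a production database would sit behind something similar.

```python
# A minimal sketch of a read-only API layer over a database.
# The SQLite file and "agencies" table are hypothetical examples.
import sqlite3

from flask import Flask, jsonify

app = Flask(__name__)
DB_PATH = "transit.db"  # hypothetical database file

def query(sql, params=()):
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row  # rows behave like dicts
    try:
        return [dict(row) for row in conn.execute(sql, params).fetchall()]
    finally:
        conn.close()

@app.route("/agencies", methods=["GET"])
def list_agencies():
    return jsonify(query("SELECT id, name, city FROM agencies"))

@app.route("/agencies/<int:agency_id>", methods=["GET"])
def get_agency(agency_id):
    rows = query("SELECT id, name, city FROM agencies WHERE id = ?", (agency_id,))
    if not rows:
        return jsonify({"error": "not found"}), 404
    return jsonify(rows[0])

if __name__ == "__main__":
    app.run(port=5000)
```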


API Transit Basics: API Design

This is a series of stories I’m doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I’m using a subway map visual, and experience to help map out the journey, which I’m calling API transit–leveraging the verb form of transit, to describe what every API should go through.

API design is not just about REST. Sure, a great deal of this stop along the API journey will be focused on REST, but this is because it is the dominant methodology at this moment in time. API design is about establishing a framework for how you will consistently craft your APIs across teams, whether they are REST, GraphQL, Microservices, or even gRPC. Your API design strategy might be dominated by RESTful practices, especially early on in your journey, but API design should not be considered to be only REST methodologies.

In the last five years API design has matured into its own discipline, focusing on a define and design first approach to developing APIs, shifting away from the code then document approach we’ve seen dominate for the last decade, and which is still commonplace at many organizations. A handful of tools and websites have emerged to help API providers, architects, developers, and designers get a handle on this stop along the API journey–here are just a few.

  • Swagger Editor - Leverage the Swagger editor for manually working with OpenAPI definitions.
  • Apicurio - A robust, and beautiful open source tool for designing APIs.
  • API Design Guide - Continue to establish, evolve, and disseminate the organizational API design guide, providing guidance for all teams–make sure there is a feedback loop involved with its development.
  • API Stylebook - Learning and extracting from other companies’ API design guides.
  • Designing and Implementing Hypermedia APIs - A thoughtful post on how to design hypermedia APIs.
  • GraphQL Design - The best practices for designing a GraphQL API.
  • GRPC API Design - A guide to designing gRPC APIs that leverage HTTP/2 and Protocol Buffers.

API design is as much of a discipline as it is a methodology rooted in a specific standard, protocol, or history. It is about having the discipline to document current practices for designing APIs that are in production, then standardize, communicate, and evolve those practices in a formal way. While also studying and learning from other leading API providers and practitioners regarding how they are designing their APIs. Which is why I include the API Stylebook in this stop along the API journey–everyone should be learning from each other, which also includes sharing your API design guide when it is ready.

We need to move beyond API design meaning REST in the API community. This is something that has caused significant damage to the health of many API operations, and is a dogmatic approach that has replicated itself in hypermedia, and GraphQL–it needs to stop. API design is about defining a common framework for designing your APIs, no matter which approach you adopt internally. Ideally, your API design philosophy is multi-approach, allowing you to apply the right pattern where it is needed, and not viewing API design as a one size fits all set of rules. When it comes to API design within your organization, start small, keep things loose, learn from others, and begin documenting your approach in a guide that can eventually grow into a wider set of API governance practices that will allow your operations to grow in the way you envision.


The Transit Feed API Is A Nice Blueprint For Your Home Grown API Project

I look at a lot of APIs. When I land on the home page of an API portal, more often than not I am lost, confused, and unsure of what I need to do to get started. Us developers are very good at complexifying things, and making our API implementations as messy as our backends, and the API ideas in our heads. I suffer from this still, and I know what it takes to deliver a simple, useful API experience. It just takes time, resources, as well as the knowledge to do it properly, and simply. Oh, and caring. You have to care.

I am always on the hunt for good examples of simple API implementations that people can emulate, that aren’t the API rockstars like Twilio and Stripe who have crazy amounts of resources at their disposal. One good example of a simple, useful, well presented API can be found with the Transit Feeds API, which aggregates the feeds of many different transit providers around the world. When I land on the home page of Transit Feeds, I immediately know what is going on, and I go from home page to making my first API call in under 60 seconds–pretty impressive stuff, for a home grown API project.

While there are still some rough edges, Transit Feeds has all the hallmarks of a quality API implementation. Simple UI, with a clear message about what it does on the home page, but most importantly an API that does one thing, and does it well–providing access to transit feeds. The site uses Github OAuth to allow me to instantly sign up and get my API key–which is how ALL APIs should work. You land on the portal, you immediately know what they do, and you have your keys in hand, making an API call, all without having to create yet another API developer account.

The Transit Feed API provides an OpenAPI for their API, and uses it to drive their Swagger UI API documentation. I wish the API documentation was embedded onto the docs page, but I’m just thankful they are using OpenAPI, and provide detailed, interactive API documentation. Additionally, they have a great updates page, providing recent site, feed, and data updates across the project. To provide support they wisely use Github Issues to help provide a feedback loop with all their API consumers.

It isn’t rocket surgery. Transit Feeds makes it look easy. They provide a pretty simple blueprint that the rest of us can follow. They have all the essential building blocks, in an easy to understand, easy to get up and running format. They leverage OpenAPI and Github, which should be the default for any public API. I’d love to see some POST and PUT methods for the API, encouraging more engagement with users, but as I said earlier, I’m pretty happy with what is there, and just hope that the project owners keep investing in the Transit Feeds API. It provides a great example for me to use when working with transit data, but also gives me a home grown example of an API project that any of my readers could emulate.


Alexa Voice Skills Are The Poster Child For Your Enterprise API Efforts

I was sitting in an IT architectural planning meeting for a large enterprise organization the other day, and one of the presentations from one of the executives contained a complex diagram of their IT infrastructure, with a column to the right showing a simple five step Alexa conversation, where a customer asks a specific question. Each question posed as part of the Alexa conversation theoretically accessed a different system, weaving a pretty complex web of IT connections, to enable this simple conversation.

This presentation reflects why I feel that Alexa Skills development poses some interesting questions in the API world, and why the platform becomes interesting to so many business users. It reflects the end goal of why we are doing all of this (in theory), but then quickly illustrates how complicated we’ve actually made all of this, demonstrating how challenging delivering conversational interfaces will be in reality. There are many conversational challenges in enabling our systems to talk with humans, but I think many of the most daunting challenges companies will face in coming years will be to actually get at the right data to provide a relevant answer to questions posed in voice, bot, and other conversationally-enabled solutions.

Being able to quickly respond to information requests is why many companies, organizations, institutions, and government agencies are doing APIs. Being able to respond to them in real time conversations is definitely a question of doing APIs, but I’m finding in most organizations it is more about solving human and political questions, than it is just a technical one. Sure, you can envision the most beautiful stack of microservices reaching into every aspect of your organization(s), and even develop a robust conversational layer for answering questions posed across that stack, but delivering it all consistently, at scale, across multiple teams of human beings will never be easy, or quick.

I think that conversational interfaces provide an excellent exercise for companies, helping them map out the complexities of their backend systems, and understand how to deliver more real time solutions. Personally, I’m not a big fan of bot or voice-enablement, but I know others are. I’m more interested in them because of the technical challenges in delivering them, and the business and cultural hurdles they put in front of development teams. It isn’t easy to deliver meaningful, relevant, and intelligent conversations via these new mediums, and I think the Alexa Skills framework provides a useful way to frame these conversations regarding our IT resources.

While the majority of APIs are still about delivering data and content to web and mobile applications, I think conversational interfaces are showing the future of where things are headed. I don’t think we’ll get there as fast as we would like, or as quickly as the vendors are promising us, but I do think we will make movement towards delivering more meaningful conversational interfaces in coming years. Mostly it will be due to the availability of API resources. If we can get at the data and content, we can usually answer questions regarding that data and content. The problem will be that not everything is digitized, and easily accessible. Despite the promises of artificial intelligence, and voice-enabled platforms like Alexa, humans will prove to be the biggest obstacle to realizing the visions of business leaders, even for the most basic questions we are looking to answer.


API Transit Basics: API Definitions

This is a series of stories I’m doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I’m using a subway map visual, and experience to help map out the journey, which I’m calling API transit–leveraging the verb form of transit, to describe what every API should go through.

Defining an API is the first stop along any API journey. When I say definitions, I’m not just talking about OpenAPI (fka Swagger), and specifically definitions for the surface area of your API. I’m talking about defining your idea, your goals, and the standard aspects of doing business with APIs. By API definitions, I mean having a robust toolbox of definitions for everything that is going into your API operations, from standardized dates and currencies, to common data schema, and yes to making sure there is an active OpenAPI definition for every single one of your APIs.

I’d say that 75% of the companies, organizations, institutions, and government agencies I’m talking with about APIs begin API development by coding. A very costly, and rigid approach to defining a solution to a problem. Many of the groups I know who are using OpenAPI in their operations still rely on it being generated from systems and code, and do not actually hand-define, or hand-craft the definitions for their APIs, which should be applied across API operations, not just for delivering documentation. When it comes to establishing a robust API definition strategy for operations, I recommend starting with a handful of tools and concepts.

  • OpenAPI - Ensuring there are OpenAPI definitions for ALL APIs / microservices.
  • JSON Schema - Ensuring there are robust JSON schema for all data in use.
  • Postman Collections - Postman’s proprietary format for defining APIs, which can be translated to and from OpenAPI.
  • API Transformer - Opening up the ability to transform APIs across formats.
  • Multi-Format - Being able to speak XML, JSON, and YAML fluently and seamlessly across groups.

There are many other tools to assist you in crafting, generating, managing, and evolving the definitions used as part of your API operations. API definitions aren’t just one stop along this API journey, and I will be exploring ideas for how API definitions can be applied to each stop along this API journey, in a separate line of thought that runs parallel to what I consider to be my API Transit basics. These five areas represent what I think are the basics of API definitions for ANY API operations, and should be where any API provider begins their journey–by defining the moving parts of each API, and what it will do from define to deprecation.
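
As a small example of what robust JSON Schema for all data in use can look like in practice, here is a sketch of validating a payload against a schema before it ever reaches an API. The schema and record below are hypothetical.

```python
# A small sketch of using JSON Schema to validate a payload before it is
# sent to, or accepted by, an API. The schema and record are hypothetical.
from jsonschema import ValidationError, validate  # pip install jsonschema

currency_amount_schema = {
    "type": "object",
    "required": ["amount", "currency"],
    "properties": {
        "amount": {"type": "number", "minimum": 0},
        "currency": {"type": "string", "pattern": "^[A-Z]{3}$"},  # ISO 4217 style codes
    },
}

record = {"amount": 19.99, "currency": "USD"}

try:
    validate(instance=record, schema=currency_amount_schema)
    print("record is valid")
except ValidationError as err:
    print("record is invalid:", err.message)
```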

The biggest threat to properly defining APIs is too much automation, and thinking that they only apply to one stop of the API Transit process. OpenAPI is not just about generating documentation. JSON Schema is not just about completing your OpenAPI definition. Not all APIs are purely JSON, and teams should be multi-lingual when it comes to the definitions they use across API operations. API definitions are essential to not just delivering your APIs, but also communicating and supporting them, and evolving them as part of your road map. API definitions are essential to establishing healthy API operations, and without them, things will easily break down for a single API, and be near impossible to deliver APIs consistently at scale across any organization.


My Evolving Definition Of A Robust And Diverse API Toolbox

It is always telling when folks assume I mean REST when I say API. While the web dominates my definition of API, and REST is definitely a leading architectural style, these assumptions always define the people who bring them to the table, more than they ever do me. I’m in the business of studying how people are applying programmatic interfaces using the web. To reflect my research I’ve been evolving a diagram of my toolbox that I’ve been publishing as part of workshops, presentations, and some talks I’m preparing for 2018. It reflects the evolving API toolbox I’m seeing companies working with, and a diversity I’m encouraging others to think about more, as we choose to ignore the polarizing forces in the API sector.

To set the tone for any API conversation I am participating in, I prefer to introduce the concept of the API toolbox including more tools than just REST, acknowledging that there are a growing number of tools in our API infrastructure toolbox which can be applied to different APIs, to solve a variety of problems and challenges we face. We also need to be more honest about the fact that there are many legacy solutions still in use across large organizations, even as we consider adopting the latest in leading edge approaches to API deployment in newer projects.

  • HTTP - Leverage the web, and the HTTP standard across ALL API efforts.
  • SOAP - Acknowledging there are still a number of SOAP services in use.
  • RPC - Understand how and why RPC APIs still might be viable in production.
  • REST - Making REST, and a resource-centered approach the focus of the operations.
  • Microservices - Emphasis on independently deployable and modular API services.
  • Verbs - Knowing, and putting to use HTTP verbs across API implementations.
  • Content-Type - Understanding the negotiation between XML, JSON, and other types.
  • Hypermedia - Considering how hypermedia design, and content types play a role.
  • GraphQL - Thinking about GraphQL when it comes to data intensive API projects.
  • HTTP/2 - Understanding and embracing the evolution of the HTTP standard.
  • gRPC - Considering two-speed APIs, and using gRPC for higher volume API implementations.
  • Webhooks - Seeing APIs as a two-way street, and pushing data to APIs as well as receiving.
  • Server-Sent Events (SSE) - Leveraging HTTP push technology to make things real time.
  • Websockets - Opening up two streams that allow for bi-directional API interactions.
  • PubSubHubbub - Considering a distributed publish-subscribe approach to API interactions.
  • Apache - Being aware of the Apache stack which includes Spark, Kafka, and other real time data solutions.

While HTTP and REST are definitely the focal point of many API conversations I am in, SOAP and RPC are legacy realities we must accept are still getting the job done in many environments, and we shouldn’t be shaming the folks who own this infrastructure. At the same time I’m helping folks unwind this legacy infrastructure, I also find myself participating in discussions around event-driven architecture, streaming, and HTTP/2 which represent where API architecture is headed. I’m needing a toolbox that reflects this spectrum of API tooling, as well as where we’ve been, and find ourselves still supporting in 2018.
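
As one small example from the toolbox above, here is a sketch of the webhook pattern: an endpoint that receives pushed events instead of polling for them. The payload fields, header name, and shared secret are hypothetical.

```python
# A tiny sketch of the webhook pattern: an endpoint that receives pushed
# events rather than polling for them. The header name, payload fields,
# and shared secret are hypothetical.
import hashlib
import hmac

from flask import Flask, abort, request

app = Flask(__name__)
SHARED_SECRET = b"replace-me"  # hypothetical secret agreed with the sender

@app.route("/webhooks/events", methods=["POST"])
def receive_event():
    # Verify a signature header so we only accept events from the sender.
    signature = request.headers.get("X-Signature", "")
    expected = hmac.new(SHARED_SECRET, request.data, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        abort(401)
    event = request.get_json(force=True)
    # In a real system this would be queued or routed to a handler.
    print("received event:", event)
    return {"status": "accepted"}, 202

if __name__ == "__main__":
    app.run(port=8080)
```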

I’m still evaluating the Apache stack, as well as GraphQL and gRPC, to better understand how they fit in my definition. This work, as well as my new partnership with Streamdata.io, is pushing me to re-evaluate exactly what is real time and streaming using webhooks and server-sent events, alongside a more event-driven approach I am seeing emerge within many leading organizations. People love to say that APIs are done. I wish I could show how silly this way of thinking makes y’all look. The idea that using the web to exchange data, content, and algorithms in machine readable formats is going anywhere is laughable. My objective is to keep tracking the tools people are using to get this job done, and help folks ensure their toolbox is as robust and diverse as possible, not traffic in silly dogmatic fantasies about API trends and religions.


Treating All APIs Like They Are Public

I was talking with the Internal Revenue Service (IRS) about their internal API strategy the week before Christmas, sharing my thoughts on the strategy that they were pitching internally when it comes to the next phase of their API journey. One topic that kept coming up is the firm line of separation between public and private APIs, which you kind of get at an organization like the IRS. It isn’t really the type of organization where you want to be vague about this line; everyone needs to understand where an API should be consumed, and where it should not be consumed.

Even with that reality, I still made the suggestion that they should be treating ALL APIs like they are public. I clarified by saying you shouldn’t be getting rid of the hard line dictating whether or not an API is internal or external, but if you treat them all like they are public, and act like they are all under threat, you will be better off for it. This piqued their interest, was something they did not expect to hear from me, and was something they would be adding to their recommendations for the next version of their API strategy.

The first benefit of treating your internal APIs like they are public is when it comes to security, logging, and overall API management. You have the tools in place to catch any threats, and develop awareness regarding how an API is being used, both good and bad. While the threats might be minimized internally, developing the same awareness, and having the tools to identify who is using what, and respond accordingly will benefit operations. API security isn’t just about firewalls, it is about an awareness of who is using what.

The next benefit is about the future of your APIs. If you treat APIs like they are public, and you ever want to make one public, you will be in much better shape. You will have proper authentication, management, logging, and security controls already in place. You can cross the line between internal and external with much less friction. When you are ready to work with partners on a project, the time to make resources available can be significantly reduced, making things more efficient and agile when it comes to working with partners.
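
As a hedged sketch of what treating internal APIs like they are public can mean in code, here is an internal-only endpoint that still requires an API key and logs who called what. The key store, consumer names, and endpoint are all hypothetical.

```python
# A sketch of treating an internal API like it is public: every call still
# requires a key, and every call gets logged with the consuming team.
# The key store and endpoint below are hypothetical.
import logging

from flask import Flask, abort, jsonify, request

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

# Hypothetical key -> consumer mapping; in practice this lives in your
# API management layer, not in code.
API_KEYS = {"abc123": "payments-team", "def456": "reporting-team"}

@app.before_request
def authenticate_and_log():
    key = request.headers.get("X-Api-Key")
    consumer = API_KEYS.get(key)
    if consumer is None:
        abort(401)
    # Log every call the same way you would for a public API.
    logging.info("consumer=%s method=%s path=%s", consumer, request.method, request.path)

@app.route("/internal/reports", methods=["GET"])
def reports():
    return jsonify([{"id": 1, "name": "quarterly"}])

if __name__ == "__main__":
    app.run(port=5000)
```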

I get the hard line between internal and external. However, I don’t get having two separate API strategies. Have one strategy. Treat everything like it is public, but then be very strict, and explicit about who has access to an API, and monitor, audit, analyze, and report on who has access to API resources in real time. These are web APIs. Let’s treat them all the same, and expect that there will be threats and misuse of varying degrees. Let’s treat all APIs equally, and reduce the chance people will become complacent with API management and security just because it is an “internal” API.


The National Transit Database (NTD) Needs To Be An API

I’ve been looking for sources of transit data as part of some research I’m doing with Streamdata.io. Like most industries I study as part of my API research, it is a mess. There is no single source of truth, a lack of robust open source solutions, government PDFs acting as databases, and tech companies extracting as much value as they can, while giving as little in return as they possibly can. Today’s frustration centers around the unfortunately common federal government PDF database, or more specifically, the National Transit Database (NTD).

In 2017, when you publish something to the web as a “database”, it should be machine readable. There is some valuable data in the agency profile reports for the 800+ transit agencies available in the database, but this information is locked up in PDFs. You can find machine readable, historic versions of this data up to 2015 in data.gov, but for 2016, and 2017, the data is only available in individual PDFs for each agency profile. To make things more difficult, the listing of transit agencies uses some Ajax voodoo for its pagination and detail pages, making it even harder to scrape, on top of rendering each agency’s details useless by storing them as PDFs.

I understand why government is stuck in this mode. The systems they use only provide them with PDF as their primary output. Staff hasn’t been trained on the importance of making data available in machine readable formats. People just don’t understand the negative impact they are making on the life of their data, and how it restricts people putting it to work. In some cases, people are fully aware of this, and want to limit how the data gets used and interpreted, keeping themselves as the definitive source of truth. I’m not saying this is what the Federal Transit Administration (FTA) is up to, but I’m saying it is the effect of their actions, which is having a chilling effect on folks like me using this valuable data to help the communities served by these transit agencies.

I emailed the FTA asking if they have a machine readable copy of the database. This information should be published by default as CSV to the agency’s Github account. I’m sure the data is available in a spreadsheet somewhere, before it becomes a PDF. It wouldn’t be very hard to save this data as CSV and publish it to Github, where it could then be easily converted into JSON, or other machine readable formats. I’m happy with CSV. I’m just not happy with PDF being called a database. Database implies that I can put the data to work, and in its current format the National Transit Database (NTD) isn’t usable as data–hence not a database. It is just too much work to get the data out of the PDFs and make it usable again, forcing me to step away from my project to understand how communities are investing in transit–I am hoping I can find the data some other place.
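
The conversion I’m asking for is trivial once the data exists as CSV. Here is a quick sketch, assuming a hypothetical agency_profiles.csv export.

```python
# A quick sketch of the CSV to JSON conversion described above, assuming a
# hypothetical agency_profiles.csv export of the agency profile data.
import csv
import json

with open("agency_profiles.csv", newline="") as f:
    rows = list(csv.DictReader(f))  # each row becomes a dict keyed by column name

with open("agency_profiles.json", "w") as f:
    json.dump(rows, f, indent=2)

print(f"converted {len(rows)} agency profiles to JSON")
```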


API Transit - The Basics

I have been evolving my approach to mapping out all the stops along my API research, using a subway map visual lately. It has been something I’ve been working on since 2014, and had developed into a keynote talk in 2015. My goal is to be able to lay out simple, as well as increasingly complex aspects of consistently operating an API. Something I’ve historically called the API life cycle, but will work to call API transit in the future.

Right now, I have two main approaches to delivering the API Transit maps. 1) API Life Cycle, and 2) API Documentation. The first is about applying consistent practices to API operations, and the second is about understanding API operations as they happen. In my mind, both these types of API Transit maps will eventually work in sync, but I have to work my way up to that. Right now, I’m focusing on the API Life Cycle version, which is becoming more about API governance, but I’m going to try and rebrand as API Transit. I’m using transit as a verb, “pass across or through” a standard, and consistent way of doing APIs. What some might consider API design, or governance, but I’m considering more holistically.

To support a couple of my consulting projects I am working on at the moment, I have published a simple API Transit project to help navigate some API teams through what I’d consider to be the basics they should be considering as they look to standardize how they deliver APIs across teams. It’s a basic single line, 19 stop API Transit map. It is something I will keep adding stops to, and expand many into their own lines, serving up much more detail, but for this first project I wanted to keep it simple, speaking to a specific enterprise audience. I don’t want to overwhelm them with information as they are just getting started on their API journey. They still have so much work to do in these 19 areas, I don’t want them to get distracted with other areas, or feel like they are drowning in information.

My API Transit maps all run on Github, using Jekyll as the client. Each transit line, and stop, is stored as Siren hypermedia in a Jekyll Collection. The resulting transit map, and the details of each stop, are just a simple HTML client which uses Liquid to render the data. This allows me to add stops, and lines as I need, expanding the API journey for each API Transit implementation. I still have routing challenges for the lines on the map. I have an editor for helping me plot where each line should go, but there are no easy answers when it comes to transit map layout, which is proving to be more art than science, so I’m refraining from automating too much at the moment. I’m working on a routing algorithm, but just don’t have the time to perfect it right now.
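
For anyone curious, here is a rough sketch of how a single stop might look as a Siren entity. The specific properties and URLs are hypothetical, but the class, properties, and links structure follows the Siren specification.

```python
# A rough sketch of a single API Transit stop represented as a Siren
# entity. The properties and URLs are hypothetical; only the overall
# class / properties / links structure follows the Siren specification.
import json

stop = {
    "class": ["stop"],
    "properties": {
        "line": "basics",
        "name": "Definitions",
        "order": 1,
        "description": "Defining an API is the first stop along any API journey.",
    },
    "links": [
        {"rel": ["self"], "href": "https://example.github.io/api-transit/basics/definitions/"},
        {"rel": ["next"], "href": "https://example.github.io/api-transit/basics/design/"},
    ],
}

print(json.dumps(stop, indent=2))
```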

Next, I’m working on more complex iterations of existing APIs, so more about documentation than governance, life cycle, or transit. I’m doing this with PSD2 as an exercise. Once I’ve done some more complex transit and specific API maps, I will work on combining the two, and applying the governance that exists in the transit map to a specific API, or set of APIs. Not sure where all of this is going, it is just a work in progress right now. It has been for almost three years, and I expect it will continue for many more years. If you are interested in having an API Transit map created for an existing API, or for a specific API governance process, feel free to reach out. I’m looking for more paid work to help push this work forward. Otherwise, it will just move along at whatever pace I can on my own steam!


API Discovery Will Be About Finding Companies Who Do What You Need And API Is Assumed

While I’m still investing in defining the API discovery space, and seeing some improvements from other API service and tooling providers when it comes to finding, sharing, indexing, and publishing API definitions, I honestly don’t think API discovery will ever end up being a top-level concern. While API design, deployment, management, and even testing and monitoring have floated to the top as primary discussion areas for API providers, and consumers, the area of API discovery never has quite become a priority. There is always lots of talk about API discovery, mostly about what is broken, rarely about what is needed to fix it, with regular waves of directories, marketplaces, and search solutions emerging to attempt to fix the problem, but always falling short.

As I watch more mainstream businesses on-board with the world of APIs, and banks, healthcare, insurance, automobile, and other staple industries work to find their way forward, I’m thinking that the mainstreamification of APIs will surpass API discovery. Meaning that people will be looking for companies who do the thing that they want, and the API is just assumed. Every business will need to have an API, just like every business is assumed to have a website. Sure, there will be search engines, directories, and marketplaces to help us find what we are looking for, but we just won’t always be looking for APIs, we will be looking for solutions. The presence of an API will be assumed, and if it doesn’t exist we will move on, looking for other companies, organizations, institutions, and agencies who do what we need.

I feel like this is one of the reasons API discovery really became a thing. It doesn’t need to be. If you are selling products and services online you need a website, and as the web has matured, you need the same data, content, media, and algorithms available in a machine readable format so they can be distributed to other websites, used within a variety of mobile applications, and available in voice, bot, device, and other applications. This is just how things will work. Developers won’t be searching for APIs, they’ll be searching for the solution to their problem, and the API is just one of the features that have to be present for them to actually become a customer. I’ll keep working to evolve my APIs.json discovery format, and incentivize the development of client, IDE, CI/CD, and other tooling, but I think these things will always be enablers, and not ever a primary concern in the API lifecycle.

