The API Evangelist Blog

This blog represents the thoughts I have while I'm researching the world of APIs. I share what I'm working on each week, and publish daily insights on a wide range of topics from design to deprecation, spanning the technology, business, and politics of APIs. All of this runs on Github, so if you see a mistake, you can either fix it by submitting a pull request, or let me know by submitting a Github issue for the repository.


Round Two Of The Department of Veterans Affairs Lighthouse Platform RFI

I’m spending some more time thinking about APIs at the Department of Veterans Affairs (VA), in response to round two of their request for information (RFI). A couple months back I responded to an earlier RFI, providing as much information as I could think of, for consideration as part of their API journey. As a former VA employee, and son of two Vietnam vets (yes, two), you can say I’m always willing to invest some time in APIs over at the VA.

To provide a response, I have taken the main questions they asked, broken them out here, and provided answers to the best of my ability. In my style, the answers are just free form rants, based upon my knowledge of the VA, and the wider API space. It is up to the VA to decide what is relevant to them, and what should be included in their agency API strategy.

2. Current Scope
While the acquisition strategy for Lighthouse has not yet been formalized, VA envisions that the program will consist of multiple contracts. For example, a contract for recommending policy and standards to form governance would likely be separate from an API build team. The key high level activities below are anticipated to be included within these contracts, and VA is requesting feedback from industry on how these activities should be aligned between multiple contracts. The list below is not inclusive of all tasks required to support this program. Additionally, VA intends to provide the IAM solution and the provisioning of necessary cloud resources to host the proposed technology stack. VA’s current enterprise cloud providers are Microsoft Azure and Amazon Web Services.

Microservice Focused Operational & Implementation
Lighthouse should embrace a microservices way of doing things, so that the platform can avoid the legacy trappings of delivering software at the VA, which have resulted in large, monolithic systems, with enormous budgets, and entrenched teams that are able to develop a resistance to change and evolution. This microservices way of doing things should be adopted internally, as well as externally, then applied to the technology, business, and politics of delivering ALL Lighthouse infrastructure.

All contracts should be defined and executed in a modular way, with the only distinction between projects being operational, or for specific project implementations. Everything should be delivered as microservices, no matter whether it is in support of operating the Lighthouse platform, or delivering services to Lighthouse-driven applications. The technology and business of each service should be self-contained, modular, and focused on doing one thing, and doing it well. All services executed as part of Lighthouse operations should be decoupled, working independently, allowing for the easy definition, delivery, management, evolution, and deprecation of every operational and implementation service that makes Lighthouse work.

Operational services will be the first projects delivered via the platform, and will be used to establish and mature the Lighthouse project delivery workflow. Going forward, every additional operational, as well as implementation focused, service will utilize the same workflow and life cycle.

  • Definitions - Everything begins as a set of definitions. Leveraging OpenAPI, JSON Schema, Dockerfiles, and other common definitions to provide a human and machine readable definition of every project, which is ultimately delivered as a microservice.
  • Github - Each microservice begins as either a public or private Github repository, with a README index of the definition of what a service will deliver. Providing a self-contained, continuously deployed and integrated blueprint of what a service does.
  • Architecture - Always providing a comprehensive outline of all backend architecture used to support a specific service, including the technical, as well as the business and security policy elements of what it takes to deliver the required service.
  • Tooling - Always providing a comprehensive outline of any tools used as part of delivering a service, to provide what is needed from a front-end delivery and execution vantage point.
  • Lifecycle - Establish a lifecycle that each service will need to pass through, ensuring consistent delivery and management of services that adhere to governance standards.
    • define - What definitions are required for services?
    • design - What is the API design guidance involved?
    • mock - How are APIs and data virtualized as part of development?
    • portal - Which portals are services published to, or will each service possess its own?
    • document - What documentation is required and delivered?
    • test - Where are the code, as well as interface level tests?
    • clients - What client environments are in use for design, development, and testing?
    • *** - Pause there, and repeat until the desired service is realized…
    • deploy - How are services delivered as part of a containerized, continuous deployment pipeline?
    • dns - What DNS is needed to address and route traffic to services?
    • manage - What API management level services are in place to secure, log, limit, and report on API and service consumption?
    • logging - What is the logging stack, how is it shipped, analyzed, and reported upon?
    • monitor - What monitors are required and in place for each service?
    • performance - How is performance measured and reported upon?
    • sdk - What client libraries, SDKs, and samples are in place for service integration?
    • dependencies - What internal service and external API dependencies are in play?
    • licensing - What is the data, code, interface, and other licensing that apply?
    • privacy - Are privacy policies in place, and considered for the platform, partners, developers, and end-users?
    • terms - Are terms of service in place, and independently considered for each service?
    • monetization - What are the operating costs, and other monetization considerations?
    • plans - What API consumption plans, rate limits, and policies are in place to govern service usage?
    • support - What support mechanisms are in place, with relevant point of contacts?
    • communication - What communication channels are in place, such as blogs, social, and messaging channels?
    • observability - What is the observability of each service, from open source to monitoring, and CI/CD workflows, ensuring it can be audited?
    • discovery - What is required to register, route, and discover an API as part of overall operations?
    • evangelism - What is the plan for making sure a service is known, used, and evangelized amongst its target audience?
  • Governance - How is each step along the life cycle measured, reported upon, and audited as part of governance, to understand how a service is meeting platform requirements, and evolving along a maturity path, allowing for innovation to occur, and newer ideas to flourish, while also allowing more hardened, secure, and mature services to rise to the top.

The OpenAPI, JSON Schema, and other definitions for each microservice will ultimately be the contract for each project. Of course, to deliver the first set of operational platform services (compute, storage, DNS, pipeline, logging, etc.) these independent contracts might need to be grouped into a single, initial contract. Something that will also occur around different groups of services being delivered at any point in the future, but each individual service should be self-contained, with its own contract definition existing at the core of its Github repository.
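
To make the idea of a self-contained contract a little more concrete, here is a minimal sketch of a governance check that could be run against each service repository. The file names (README.md, openapi.yaml, Dockerfile) and required fields are my assumptions for illustration, not an established Lighthouse convention.

```python
# A minimal sketch of a governance check that could run against each service
# repository, verifying the contract artifacts described above are present and
# complete. File names (README.md, openapi.yaml, Dockerfile) are assumptions,
# not a VA or Lighthouse standard.
import os
import sys
import yaml  # requires PyYAML

REQUIRED_FILES = ["README.md", "openapi.yaml", "Dockerfile"]

def check_service_repo(path):
    problems = []
    for name in REQUIRED_FILES:
        if not os.path.exists(os.path.join(path, name)):
            problems.append(f"missing {name}")

    openapi_path = os.path.join(path, "openapi.yaml")
    if os.path.exists(openapi_path):
        with open(openapi_path) as f:
            spec = yaml.safe_load(f)
        # The OpenAPI definition is the contract, so it should at least
        # describe the service and expose one or more paths.
        if not spec.get("info", {}).get("description"):
            problems.append("openapi.yaml has no info.description")
        if not spec.get("paths"):
            problems.append("openapi.yaml defines no paths")

    return problems

if __name__ == "__main__":
    issues = check_service_repo(sys.argv[1])
    for issue in issues:
        print(f"FAIL: {issue}")
    sys.exit(1 if issues else 0)
```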

Question: API Roadmap Development (Backlog, Future)
Each service being delivered via Lighthouse will possess its own self-contained road map as part of its definition. Providing a standardized, yet scalable way to address what is being planned, what is being delivered and operated, and when anything will ultimately be deprecated.

  • Github Issues - Each Github repository has its own issues for managing all conversations around the service road map. Tags and milestones can be used to designate the past, present, future, and other relevant segmentation of the road map.
  • Micro / Macro - Each service possesses micro level detail about the road map, which is available via the Github APIs, in a machine readable way for inclusion at the macro level, serving governance, reporting, and higher level road map considerations.
  • Communication - Each service owner is responsible for road map related communication, support, and management, providing their piece of the overall road map puzzle.

The Lighthouse platform road map should work like an orchestra, with each participant bringing their contribution, but platform operators and conductors defining the overall direction the platform is headed. At scale, Lighthouse will be thousands of smaller units, organized by hundreds of service owners and stewards, serving millions of end-users, with feedback loops in place throughout the stack.
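
As a rough sketch of how the micro level rolls up to the macro level, the Github API can be used to pull each service’s road map issues and milestones, and aggregate them into a platform wide view. The organization name, repository names, and the "roadmap" label here are hypothetical placeholders.

```python
# A rough sketch of rolling micro level road maps up to the macro level using
# the Github API. The "roadmap" label and the lighthouse-va organization name
# are hypothetical placeholders.
import requests

def service_roadmap(org, repo, token):
    """Pull open road map issues, grouped by milestone, for a single service."""
    resp = requests.get(
        f"https://api.github.com/repos/{org}/{repo}/issues",
        params={"labels": "roadmap", "state": "open"},
        headers={"Authorization": f"token {token}"},
    )
    resp.raise_for_status()
    roadmap = {}
    for issue in resp.json():
        milestone = (issue.get("milestone") or {}).get("title", "unscheduled")
        roadmap.setdefault(milestone, []).append(issue["title"])
    return roadmap

def macro_roadmap(org, repos, token):
    """Aggregate each service's road map into a single platform wide view."""
    return {repo: service_roadmap(org, repo, token) for repo in repos}

if __name__ == "__main__":
    print(macro_roadmap("lighthouse-va", ["veteran-verification", "facilities"], "GITHUB_TOKEN"))
```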

Question: Outreach (Internal & External Parties)
Outreach is essential to the viability of any platform, and represents the business and political challenges that lie ahead for the VA, or any government agency looking to work seamlessly with public and private sector partners, as well as the public at large. There will be many services involved with Lighthouse operations that will need to be private, but the default should always be public, allowing for as much transparency and observability as possible, which will feed platform outreach in a positive way.

  • Github Project Pages - Each Github repository can have a public facing Github Pages static site portal and landing page. Allowing for individual service, or group portals to exist, providing a destination for all stakeholders to get involved.
  • Github Social Framework - Github provides a wealth of outreach and communication solutions from organization and repository search, to issues and wikis, and tagging services with individual topics. All of which can be used as part of outreach and engagement in a private or public setting.
  • Twitter - Microblogging provides a great way to publish regular updates, and provide communication around platform operations.
  • Linkedin - Enterprise development groups, especially those in service of the government, tend to use LinkedIn for establishing their profile, and maintaining their presence, which can be incorporated into all outreach efforts.
  • Blogs - The platform should possess its own public and / or private blogs, as well as potentially more topical, service, or project based blogs that expand outreach to the long tail of platform operations.

This type of outreach around platform operations is something that scares the hell out of government folks, and the majority of government API operations are critically deficient in the area of outreach. This has to change. If there is no feedback loop in place, and outreach doesn’t occur regularly and consistently, the platform will not succeed. This is how the API world operates.

Question: Management of API Request Process (Internal (VA)/External (Non-VA))
New services will always be needed. Operational and implementation related requests should all be treated the same. Obviously there will be different prioritization mechanisms in place, but API requests should just be the birth of any new service, allowing it to begin its journey, and transit through the API lifecycle described above. Not all requests will reach deployment, and not all deployments will reach maturity, but all API requests should be treated equally.

  • Definitions - Each API request begins with a definition. A simple description of what a service will do.
  • Github - Each API request begins its journey as a Github repository, with a README containing its basic definition, and conversation around its viability within Github issues.
  • JSON Schema - As part of each request, all data that will be leveraged as part of service operations should be defined as JSON Schema, and included in the Github repository.
  • OpenAPI - Additionally, the access to the service, and its underlying data and resources, should be defined using a machine readable OpenAPI definition, outlining the contract of the service.
  • Certification - Some stakeholders will have submitted API requests before, and better understand the process, and be certified owners of existing services, working as part of trusted organizations, expediting and prioritizing the request process.
  • Template(s) - The most common service patterns to emerge should be defined as templates, providing seeds and starter projects to help expedite and streamline the API request process, ensuring all the moving parts are there to make a decision, in a forkable, replicable package.

New API requests should be encouraged. Anyone should be able to submit a new service, replicate or augment an existing service, or respond to a platform API RFP. The life cycle described above should be open to everyone looking to submit an API request. Allowing them to define, design, mock, and iterate their submission. Even providing a nearly usable representation of a service, before the idea or service is accepted. Forcing everyone to flesh out their service, and deliver a viable proof of concept that will streamline the API acceptance process.
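
As a hedged sketch of how a templated API request could be checked automatically before it enters the lifecycle, a JSON Schema can be used to validate each submission. The schema fields below are hypothetical, meant only to show the mechanics.

```python
# A hedged sketch of validating a new API request submission against a JSON
# Schema before it enters the lifecycle. The schema fields are hypothetical,
# meant only to show how a templated request could be checked automatically.
from jsonschema import validate, ValidationError  # pip install jsonschema

API_REQUEST_SCHEMA = {
    "type": "object",
    "required": ["name", "description", "audience", "repository"],
    "properties": {
        "name": {"type": "string", "minLength": 3},
        "description": {"type": "string", "minLength": 25},
        "audience": {"type": "string", "enum": ["internal", "partner", "public"]},
        "repository": {"type": "string", "format": "uri"},
    },
}

def validate_api_request(request):
    try:
        validate(instance=request, schema=API_REQUEST_SCHEMA)
        return True, "request accepted into the lifecycle"
    except ValidationError as err:
        return False, f"request rejected: {err.message}"

print(validate_api_request({
    "name": "veteran-verification",
    "description": "Verify the status of a veteran from authoritative VA systems.",
    "audience": "public",
    "repository": "https://github.com/lighthouse-va/veteran-verification",
}))
```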

Question: Propose, Implement and Manage the PaaS (technology stack)
As mentioned before, this aspect of Lighthouse should be delivered as microservices, alongside every other service being delivered via the platform. It just so happens that this portion of the stack will be the first to be delivered, and it will be iterated upon, evolved, and deprecated just like any other service. To put this in perspective, I will outline the AWS, and Azure infrastructure needed to support management of the platform later on in this post, while considering the fact that AWS and Azure have been on the same journey that the VA is on with Lighthouse, something that has been playing out for the last decade.

The VA wants to be the Amazon of serving veterans. They want internal groups, vendors, contractors, veteran health and service organizations, and independent developers to come build their solutions on the Lighthouse platform. The VA should use its own services for internal service delivery, as well as for supporting external projects. The operational side of the Lighthouse platform should be all microservice projects, with the underlying infrastructure being Azure or AWS solutions, providing a common platform as a service stack that can be leveraged, no matter where the actual service is deployed.

Question: DevOps Continuous Integration and Continuous Delivery (CI/CD) of APIs
Every service in support of operations or implementations via the Lighthouse platform will exist as a self-contained Github repository, with all the artifacts needed to be included in any application pipeline. The basic DNA blueprint for each service should be crafted to support any single CI/CD service, or ideally even multiple types of CI/CD and orchestration solutions like AWS and Azure both support.

  • Microservices - Lighthouse CI/CD will be all about microservice orchestration, and using a variety of pipelines to deliver and evolve initially hundreds, and eventually thousands of services in concert.
  • Github - Github will be the cellular component driving the Lighthouse CI/CD workflow, providing individual service “legos” that can be composed, assembled, disassembled, and delivered in any way.
  • Definitions - Each microservice will contain all the artifacts needed for supporting the entire life cycle listed above, driven by a variety of CI/CD pipelines. Leveraging dockerfiles, build packages, OpenAPI definitions, schema, and other definitions to continuously deliver and integrate across platform operations.

Both AWS and Azure provide CI/CD workflows, which can be used to satisfy this portion of the RFI. I will list out all the AWS and Azure services I think should be considered below. Additionally, Jenkins, CircleCI, or other 3rd party CI/CD solutions could easily be brought in to deliver on this aspect of platform delivery. The microservices core can be used as part of any pipeline delivery model.

Question: Environment Operations & Maintenance (O&M)
Again, everything operates as microservices, and gets delivered independently as services that can be configured and maintained as part of overall platform operations and maintenance, or in service of individual services, and groups of services supporting specific implementations.

  • Microservices - Everything is available as microservices, allowing the underlying environment operations and maintenance to be orchestrated, and optimized in real time.

Each of the AWS and Azure services listed below are APIs. They allow for the configuration and management of each service via API or CLI, allowing the architecture to be seamlessly managed as part of the overall API stack, as well as the CI/CD pipeline. Making environment operations and maintenance just part of the continuous delivery cycle(s).

Question: Release Management
Release occurs at the granular service level. With Github and CI/CD as the vehicle for moving releases forward daily, versioning, defining, and communicating all the way. With the proper code and API level testing in place, release management can happen safely at scale.

  • Github - Github version control, branches, and release management should be used as part of the overall release management strategy.
  • Versioning - Establishment of a service versioning strategy for minor and major code, and interface releases, allowing independent release management that can occur at the higher orchestration level.
  • CI/CD Pipelines - Everything should be a pipeline, broken down by logical operational, organization, and project boundaries, operating on a continuous release cycle.
  • Microservices - Everything is operated independently, and released independently via containers, with appropriate dependency management as part of each release.
  • Definitions - OpenAPI and JSON Schema are versioned and used to act as the contract for each release.
  • Communications - Along with each release, comes a standard approach to notification, communication, and support.

Release management horizontally, across many services, will take a significant amount of time to wrap your head around. Moving forward hundreds, and thousands of services in concert won’t be easy. However, it will be more resilient, and forgiving than moving forward a single monolith.

Question: API Analytics
Awareness should be baked in by default to the Lighthouse platform, measuring everything, and reporting on it consistently, providing observability across all aspects of operations in alignment with security policies. Analysis should be its own set of operational services, that span the entire length of the Lighthouse platform.

  • Log Shipping - The database, container, web server, management, and DNS logs for ALL services should be shipped, and centralized, for complete access and analysis.
  • APIs - Centralized logging should be its own service, with programmatic access to logs for all platform services.
  • Modular - Analytics should be modular, bite-size API-driven elements that can be mixed, composed, published, and visualized in reusable ways.
  • Embeddables - Modular, embeddable UI elements should be developed as applications on top of platform analytics APIs, allowing for portable dashboards that can be remixed, reused, and evolved.
  • Search - The logging and reporting layer of the platform should have a core search element, allowing all logs to be searched, as well as the logs for how API consumers are analyzing logs (mind blown).
  • Continuous - As with all other services, analytics, reporting, and visualizations should be continuous, ever evolving, and deployed on a day to day, week to week basis.

A standard logging strategy across all services is how we achieve a higher level of API analytics, going beyond just database or web server statistics, and even API management analytics, providing end to end, comprehensive platform service measurement, analysis, reporting, and visualization. Allowing platform operators, consumers, and auditors to access and understand how all services are being used, or not being used.
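
To give a sense of what programmatic access to centralized logs could look like on the AWS side, here is a sketch using CloudWatch Logs Insights via boto3. The log group name, and the assumption that access logs are written as JSON with path and status fields, are mine for illustration.

```python
# A sketch of what the centralized logging API could look like on AWS, using
# CloudWatch Logs Insights via boto3 to report on API Gateway access logs.
# The log group name, and the assumption that access logs are JSON records
# with "path" and "status" fields, are placeholders for illustration.
import time
import boto3

logs = boto3.client("logs")

def top_paths(log_group="/aws/api-gateway/lighthouse-access-logs", hours=24):
    """Return the most requested API paths over the last N hours."""
    query = """
        fields path, status
        | stats count(*) as calls by path
        | sort calls desc
        | limit 20
    """
    start = logs.start_query(
        logGroupName=log_group,
        startTime=int(time.time()) - hours * 3600,
        endTime=int(time.time()),
        queryString=query,
    )
    while True:
        result = logs.get_query_results(queryId=start["queryId"])
        if result["status"] in ("Complete", "Failed", "Cancelled"):
            return result["results"]
        time.sleep(1)
```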

Question: Approval to Operate (ATO) Support for Environments
Every service introduced as part of the Lighthouse platform should have all the information required to support ATO, with it baked into the governance and maturity life cycle for any service. It actually lends itself well to the maturity elements of the lifecycle above, ensuring there is ATO before anything is deployed.

  • Definitions - All definitions are present for satisfying ATO.
  • Github - Everything is self-contained within a single place for submission.
  • Governance - ATO is part of the governance process, while still allowing for innovation.
  • Micro / Macro - ATO for each individual service can be considered, as well as at the project and group levels, understanding where services fit in at the macro level.

ATO can be built into the templated API request and submission process discussed earlier, allowing for already approved architecture, tooling, and patterns to be used over and over, streamlining the ATO cycle. Helping service developers enjoy more certainty around the ATO process, while still allowing for innovation to occur, pushing the ATO definition and process when it makes sense.

Question: Build APIs including system level APIs that connect into backend VA systems
Everything is a microservice, and there are plenty of approaches to ensure that legacy backend systems can enjoy continued use and evolution through evolved APIs. The API life cycle allows for the evolution of existing backend systems that operate in the cloud and on-premise, in small, bite-size service implementations.

  • Gateway - AWS API Gateway and Azure API Management make it easy to publish newer APIs on top of legacy backend systems.
  • Facades - Establishing facade patterns for modernizing, and evolving legacy systems, allowing them to take on a new interface, while still maintaining the existing system.
  • OpenAPI - Mapping out newer APIs using OpenAPI, then importing them into gateways and wiring them up with backend systems.
  • Schema - Mapping out the schema transformations from backend systems to front-end API requests and responses using JSON Path, and JSON Schema.
  • Microservices - Delivering newer APIs on top of legacy systems in smaller, more evolvable services.

From the frontend, you shouldn’t be able to tell whether a legacy VA system is in use, or newer cloud infrastructure. All applications should be using APIs, and all APIs should be delivered as individual or groups of microservices, that do one thing and do it well. As APIs evolve, the backend systems should be decoupled and evolved as well, but until that becomes possible, all consumption of data, content, and other resources will be routed through the Lighthouse API stack.
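
As a minimal sketch of the facade pattern described above on AWS, an OpenAPI definition can be imported into AWS API Gateway and deployed as a stage, leaving the legacy backend untouched behind it. The file name and stage name are assumptions.

```python
# A minimal sketch of the facade pattern described above: define the modern
# API with OpenAPI, then publish it on top of a legacy backend using the AWS
# API Gateway import API. The file name and stage are assumptions.
import boto3

apigateway = boto3.client("apigateway")

def deploy_facade(openapi_path="openapi.yaml", stage="dev"):
    """Import an OpenAPI definition and deploy it as a gateway-managed facade."""
    with open(openapi_path, "rb") as f:
        api = apigateway.import_rest_api(
            failOnWarnings=True,
            parameters={"endpointConfigurationTypes": "REGIONAL"},
            body=f.read(),
        )
    # Deploying a stage makes the facade callable, while the legacy system
    # behind it stays untouched and can be evolved or swapped out later.
    apigateway.create_deployment(restApiId=api["id"], stageName=stage)
    return api["id"]
```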

Question: API key management or managing third party access (authorization, throttling, etc.)
Both the AWS API Gateway, and Azure API Management allow for the delivery of modern API management infrastructure that can be used to govern internal, partner, and 3rd party access to resources. All applications should be using APIs, and ALL APIs should be using a standardized API management approach, no matter whether the consumption is internal or external. Ensuring consistent authorization, throttling, logging, and other aspects of doing business with APIs.

  • IAM - Leverage API keys, JWT, and OAuth in conjunction with IAM policies governing which backend resources are available to API consumers.
  • Gateway - All API traffic is routed through the AWS API Gateway and Azure API management layers, allowing for consistent and comprehensive management across all API consumption.
  • Management - Apply consistent logging, rate limiting, transformations, error handling, and security at the API management level, ensuring all services behave in the same way.
  • Plans - Establishing a variety of API plans that dictate levels of access, and which services are accessible to different API key levels, in sync with backend IAM policies.
  • Logging - Every API call is logged, and contains user and application keys, allowing ALL API consumption to be audited, reported upon, and responded to in real time.
  • Security - Providing a single point of entry, and the ability to shut down access, striking the balance between access and security which is the hallmark of doing APIs.

API management is baked into the cloud. It is a discipline that has been evolving for over a decade, and is available on both the AWS and Azure platforms. The tools are there, Lighthouse just needs to establish a coherent strategy for authentication, service composition, logging, reporting, and responding to API consumption at scale in real time. Staying out of the way of consumers, while also ensuring that they only have access to the data, content, and other resources they are allowed to, in alignment with overall governance.
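
As a rough illustration of what a standardized management layer could look like on the AWS side, here is a sketch that creates a usage plan with throttling and quotas, and keys a consumer into it using boto3. The plan names, limits, and identifiers are placeholders.

```python
# A sketch of the standardized management layer described above, using boto3
# against AWS API Gateway. The plan names, limits, and API identifiers are
# placeholders meant to show how consistent plans and keys could be applied.
import boto3

apigateway = boto3.client("apigateway")

def create_partner_plan(api_id, stage):
    """Create a usage plan with throttling and quota, then key a consumer into it."""
    plan = apigateway.create_usage_plan(
        name="partner",
        description="Partner tier access to Lighthouse services",
        apiStages=[{"apiId": api_id, "stage": stage}],
        throttle={"rateLimit": 50.0, "burstLimit": 100},
        quota={"limit": 1000000, "period": "MONTH"},
    )
    key = apigateway.create_api_key(name="veteran-service-org-app", enabled=True)
    apigateway.create_usage_plan_key(
        usagePlanId=plan["id"], keyId=key["id"], keyType="API_KEY"
    )
    return plan["id"], key["value"]
```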

Question: Management of API lifecycle in cloud, hybrid, and/or on premise environments
All operational aspects of the Lighthouse platform should be developed as independent microservices, with a common API–no matter what the underlying architecture is. The DNS service API should be the same, regardless of whether it is managing AWS or Azure DNS, or possibly any other on-premise or 3rd party service–allowing for platform orchestration using a common API stack.

  • Microservices - Each operational service is a microservice, with possibly multiple versions, depending on the backend architecture in use.
  • Containers - Every operational service is operated as a container, allowing it to run in any cloud environment.
  • Github - All services live as Github repositories, allowing them to be checked out and forked via any cloud platform.

The modular, containerized, microservice approach to delivering the Lighthouse platform will allow for the deployment, scaling, and redundant implementation of services in any cloud environment, as well as on-premise, or hybrid scenarios. All services operate using the same microservice footprint, using containers, and a consistent API surface area, allowing for the entire platform stack to be orchestrated against no matter where the actual service resides.
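
Here is a sketch of the common API, different backends idea, using DNS as the example operational service. The class and method names are hypothetical, and the cloud calls are stubbed out, but it shows how AWS Route 53 or Azure DNS could sit behind a single interface that the platform orchestrates against.

```python
# A sketch of the "common API, different backends" idea using DNS as the
# example operational service. The class and method names are hypothetical,
# showing how AWS Route 53 or Azure DNS could sit behind one interface.
from abc import ABC, abstractmethod

class DnsService(ABC):
    """The common contract every DNS implementation exposes to the platform."""

    @abstractmethod
    def create_record(self, zone: str, name: str, value: str) -> None: ...

    @abstractmethod
    def delete_record(self, zone: str, name: str) -> None: ...

class Route53DnsService(DnsService):
    def create_record(self, zone, name, value):
        # Would call the AWS Route 53 API here.
        print(f"route53: {name}.{zone} -> {value}")

    def delete_record(self, zone, name):
        print(f"route53: removing {name}.{zone}")

class AzureDnsService(DnsService):
    def create_record(self, zone, name, value):
        # Would call the Azure DNS API here.
        print(f"azure dns: {name}.{zone} -> {value}")

    def delete_record(self, zone, name):
        print(f"azure dns: removing {name}.{zone}")

def provision(dns: DnsService):
    """Platform orchestration only ever sees the common DnsService API."""
    dns.create_record("example-lighthouse.gov", "veteran-verification", "10.0.0.12")
```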

Question: 3. Use Case
To better provide insight into aligning activities to contracts, VA has provided the use case below. Please walk through this use case discussing each activity and the contract it would be executed under.

Veteran Verification Sample Use Case: VA has a need for a Veteran Verification API to verify a Veteran status from a number of VA backend systems to be shared internally and externally as an authoritative data source. These backend systems potentially have conflicting data, various system owners, and varying degrees of system uptime.

This is a common problem within large organizations, institutions, and government agencies. This is why we work to decouple, modularize, and scale not just the technology of building applications on backend systems, but also the business, and politics of it all. Introducing a competitive element when it comes to data management access, and building in redundancy, resilience, and a healthier incentive model into how we provide access to critical data, content, and other resources.

I have personal experience with this particular use case. One of the things I did while working at the VA was conduct a public data inventory, and move forward the conversation around a set of veteran benefit web services, which included asking the question–who had the authoritative record for a veteran? Many groups felt they were the authority, but in my experience, nobody actually did entirely. The incentives in this environment weren’t about actually delivering a meaningful record on a veteran, it was all about getting a significant portion of the budget. I recommend decoupling the technology, business, and politics of providing access to veterans data using a microservices approach.

  • Microservices - Break the veterans record into separate, meaningful services.
  • Definitions - Ensure the definitions for the schema and API are open and accessible.
  • Discovery - Make sure that the Veteran Verification API is fully discoverable.
  • Testing - Make sure the Verification API is fully tested on a regular basis.
  • Monitoring - Ensure that there are regular monitors for the Verification API.
  • Redundancy - Encourage multiple implementations of the same API to be delivered and owned by separate groups in separate regions, with circuit breaker behavior in APIs and applications.
  • Balancing - Load balance between services and regions, allowing for auto-scaled APIs.
  • Aggregation - Encourage the development of aggregate APIs that bridge multiple sources, providing aggregate versions of the veteran’s record, encouraging new service owners to improve on existing services.
  • Reliability - Incentivize and reward reliability with Verification API owners, through revenue and priority access.

There should be no single owner of any critical VA service. Each service should have redundant versions, available in different regions, and managed by separate owners. Competition should be encouraged, with facade and aggregate APIs introduced, putting pressure on core service providers to deliver quality, or their service(s) will be de-prioritized, and newer services will be given traffic and revenue priority. The same backend database can be exposed via many different APIs, with a variety of owners and incentives in place to encourage quality of service.
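
As a simplified sketch of the redundancy and circuit breaker behavior suggested above, an application could try redundant Veteran Verification deployments in order, skipping any that have recently failed. The endpoints below are placeholders.

```python
# A simplified sketch of the redundancy and circuit breaker behavior suggested
# above: an application tries redundant Veteran Verification API deployments,
# in order, skipping any that have recently failed. Endpoints are placeholders.
import time
import requests

ENDPOINTS = [
    "https://verification.east.example-lighthouse.gov",
    "https://verification.west.example-lighthouse.gov",
]
COOL_OFF_SECONDS = 60
_last_failure = {}

def verify_veteran(veteran_id):
    """Call whichever redundant verification service is currently healthy."""
    for base in ENDPOINTS:
        if time.time() - _last_failure.get(base, 0) < COOL_OFF_SECONDS:
            continue  # circuit is open for this deployment, skip it for now
        try:
            resp = requests.get(f"{base}/veterans/{veteran_id}/verification", timeout=3)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            _last_failure[base] = time.time()  # trip the circuit for this endpoint
    raise RuntimeError("no verification service is currently available")
```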

APIs, coupled with the proper terms of service, can eliminate an environment where defensive data positions are established. If other API owners can get access to the same data, offer a better quality API, then evangelize and gain traction with application owners, entrenched API providers will no longer flourish. Aggregate and facade APIs allow for the evolution of existing APIs, even if the API owners are unwilling to move and evolve. Shifting the definition of what is authoritative, making it much more liquid, allowing it to shift and evolve, rather than just be diluted and meaningless, as it is often seen in the current environment.

Question: 4. Response
In addition to providing the requested content above, VA asks for vendors to respond to the following questions:

Describe how you would align the aforementioned activities between contracts, and the recommended price structure for contracts?

Each microservice would have its own technical, business, and political contract, outlining how the service will be delivered, managed, supported, communicated, and versioned. These contracts can be realized individually, or grouped together as a larger, aggregate contract that can be submitted, while still allowing each individual service within that contract to operate independently.

As mentioned before, the microservices approach isn’t just about the technical components. It is about making the business of delivering vital VA services more modular, portable, and scalable. Something that will also decouple and shift the politics of delivering critical services to veterans. Breaking things down into much more manageable chunks that can move forward independently at the contract level.

  • Micro Procurement - One of the benefits of breaking down services into small chunks, is that the money needed to deliver the service can become much smaller, potentially allowing for a much smaller, more liquid and flowing procurement cycle. Each service has a micro definition of the monetization involved with the service, which can be aggregated by groups of services and projects.
  • Micro Payments - Payments for service delivery can be baked into the operations and life cycle of the service. API management excels at measuring how much a service is accessed, and testing, monitoring, logging, security, and other stops along the API life cycle can all be measured, and payments can be delivered depending on quality of service, as well as volume of service.

Amazon Web Services already has the model for defining, measuring, and billing for API consumption in this way. This is the bread and butter of the Amazon Web Services platform, and the cornerstone of what we know as the cloud. This approach to delivering, scaling, and ultimately billing or payment for the operation and consumption of resources, just needs to be realized by the VA, and the rest of the federal government. We have seen a shift in how government views the delivery and operation of technical resources using the cloud over the last five years, we just need to see the same shift for the business of APIs over the next five years.

Question: The Government envisions a managed service (ie: vendor responsible for all aspects including licenses, scaling, provisioning users, etc.) model for the entire technology stack. How could this be priced to allow for scaling as more APIs are used? For example, would it be priced by users, API calls, etc.?

API management is where you start this conversation. It has been used for a decade to measure, limit, and quantify the value being exchanged at the API level. Now that API management has been baked into the cloud, we are starting to see the approach being scaled to deliver at a marketplace level. With over ten years of experience with delivering, quantifying, metering and billing at the API level, Amazon is the best example of this monetization approach in action, with two distinct ways of quantifying the business of APIs.

  • AWS Marketplace Metering Service - SaaS style billing model which provides a consumption monetization model in which customers are charged only for the number of resources they use–the best known cloud model.
  • AWS Contract Service - Billing customers in advance for the use of software, providing an entitlement monetization model in which customers pay in advance for a certain amount of usage, which could be used to deliver a certain amount of storage per month for a year, or a certain amount of end-user licenses for some amount of time.

This provides a framework for thinking about how the business of microservices can be delivered. Within these buckets, AWS provides a handful of common dimensions for thinking through the nuts and bolts of these approaches, quantifying how APIs can be monetized, in nine distinct areas:

  • Users – One AWS customer can represent an organization with many internal users. Your SaaS application can meter for the number of users signed in or provisioned at a given hour. This category is appropriate for software in which a customer’s users connect to the software directly (for example, with customer-relationship management or business intelligence reporting).
  • Hosts – Any server, node, instance, endpoint, or other part of a computing system. This category is appropriate for software that monitors or scans many customer-owned instances (for example, with performance or security monitoring). Your application can meter for the number of hosts scanned or provisioned in a given hour.
  • Data – Storage or information, measured in MB, GB, or TB. This category is appropriate for software that manages stored data or processes data in batches. Your application can meter for the amount of data processed in a given hour or how much data is stored in a given hour.
  • Bandwidth – Your application can bill customers for an allocation of bandwidth that your application provides, measured in Mbps or Gbps. This category is appropriate for content distribution or network interfaces. Your application can meter for the amount of bandwidth provisioned for a given hour or the highest amount of bandwidth consumed in a given hour.
  • Request – Your application can bill customers for the number of requests they make. This category is appropriate for query-based or API-based solutions. Your application can meter for the number of requests made in a given hour.
  • Tiers – Your application can bill customers for a bundle of features or for providing a suite of dimensions below a certain threshold. This is sometimes referred to as a feature pack. For example, you can bundle multiple features into a single tier of service, such as up to 30 days of data retention, 100 GB of storage, and 50 users. Any usage below this threshold is assigned a lower price as the standard tier. Any usage above this threshold is charged a higher price as the professional tier. Tier is always represented as an amount of time within the tier. This category is appropriate for products with multiple dimensions or support components. Your application should meter for the current quantity of usage in the given tier. This could be a single metering record (1) for the currently selected tier or feature pack.
  • Units – Whereas each of the above is designed to be specific, the dimension of Unit is intended to be generic to permit greater flexibility in how you price your software. For example, an IoT product which integrates with device sensors can interpret dimension “Units” as “sensors”. Your application can also use units to make multiple dimensions available in a single product. For example, you could price by data and by hosts using Units as your dimension. With dimensions, any software product priced through the use of the Metering Service must specify either a single dimension or define up to eight dimensions, each with their own price.

These dimensions reflect the majority of API services being sold out there today, so we don’t find ourselves in a rut with measuring value, like just paying per API call. Allowing Lighthouse API plans to possess one or more dimensions, beyond any single use case.

  • Single Dimension - This is the simplest pricing option. Customers pay a single price per resource unit per hour, regardless of size or volume (for example, $0.014 per user per hour, or $0.070 per host per hour).
  • Multiple Dimensions – Use this pricing option for resources that vary by size or capacity. For example, for host monitoring, a different price could be set depending on the size of the host. Or, for user-based pricing, a different price could be set based on the type of user (admin, power user, and read-only user). Your service can be priced on up to eight dimensions. If you are using tier-based pricing, you should use one dimension for each tier.

This provides a framework that Lighthouse can offer to 3rd party developers, allowing them to operate their services within a variety of business models. Derived from many of the hard costs they face, and providing additional volume based revenue, based upon how many API calls any particular service receives.
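
To show how consumption based micro payments could be reported in practice, here is a sketch using the AWS Marketplace Metering Service via boto3. The MeterUsage call is a real AWS API, but the product code and dimension names are placeholders.

```python
# A sketch of how micro payments tied to consumption could be reported using
# the AWS Marketplace Metering Service. The product code and dimension names
# are placeholders; the MeterUsage call itself is a real AWS API.
import datetime
import boto3

metering = boto3.client("meteringmarketplace")

def report_service_usage(product_code, dimension, quantity):
    """Send one hourly usage record for a single microservice dimension."""
    return metering.meter_usage(
        ProductCode=product_code,
        Timestamp=datetime.datetime.utcnow(),
        UsageDimension=dimension,   # e.g. "requests" or "users"
        UsageQuantity=quantity,
        DryRun=False,
    )

# Example: a service owner reports 12,500 API requests for the past hour.
# report_service_usage("veteran-verification-prod", "requests", 12500)
```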

Beyond this basic monetization framework, I’d add in an incentive framework that would dovetail with the business models proposed, but then provide different pricing levels depending on how well the services perform, and deliver on the agreed upon API contract. There are a handful of bullets I’d consider here.

  • Design - How well does a service meet API design guidelines set forth in governance guidance.
  • Monitoring - Has a service consistently met its monitoring goals, delivering against an agreed upon service level agreement (SLA).
  • Testing - Beyond monitoring, are APIs meeting granular interface testing, along a regular testing & monitoring schedule.
  • Communication - Are service owners meeting expectations around communication around a service operations.
  • Support - Does a service meet required support metrics, making sure it is responsive and helpful.
  • Ratings - Provide a basic set of metrics, with accompanying ratings for each service.
  • Certification - Allowing service providers to get certified, receiving better access, revenue, and priority.

All of this incentive framework is defined and enforced via the API governance strategy for the platform. Making sure all microservices, and their owners, meet a base set of expectations. When you take the results and apply them weekly, monthly, and quarterly against the business framework, you can quickly begin to see some different pricing levels, and revenue opportunities around all microservices emerge. If you deliver consistent, reliable, highly ranked microservices, you get paid higher percentages, enjoy greater access to resources, and prioritization in different ways via the platform–if you don’t, you get paid less, and operate fewer services.
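
To make this a bit more tangible, here is a back of the napkin sketch of scoring a service against the governance areas above, and mapping the score to a payout tier. The weights and thresholds are entirely hypothetical.

```python
# A back of the napkin sketch of the incentive framework: score each service
# against the governance areas above, and map the score to a payout tier.
# The weights and thresholds are entirely hypothetical.
WEIGHTS = {
    "design": 0.15,
    "monitoring": 0.20,
    "testing": 0.20,
    "communication": 0.10,
    "support": 0.15,
    "ratings": 0.20,
}

def payout_tier(scores):
    """Scores are 0.0-1.0 per governance area; returns a revenue share tier."""
    total = sum(WEIGHTS[area] * scores.get(area, 0.0) for area in WEIGHTS)
    if total >= 0.85:
        return "certified", 0.80   # highest revenue share, priority access
    if total >= 0.60:
        return "standard", 0.65
    return "probation", 0.50       # lower share, fewer services going forward

print(payout_tier({"design": 0.9, "monitoring": 1.0, "testing": 0.8,
                   "communication": 0.7, "support": 0.9, "ratings": 0.95}))
```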

This model is already visible on the AWS platform. All the pieces are there to make it happen for any platform operating on top of AWS. The marketplace, billing, and AWS API Gateway connection to API plans exists. When you combine the authentication and service composition available at the AWS API Gateway layer, with the IAM policy solutions available via AWS, an enterprise grade solution for delivering this model securely at scale comes into focus.

Question: Is there a method of paying or incentivizing the contractor based on API usage?
I think I hit on this with the above answer(s). Keep payments small, and well defined. Measured, reported upon, and priced using the cloud model, connecting to a clear set of API governance guidance and expectations. The following areas can support paying and incentivizing contractors based upon not just usage, but also meeting the API contract.

  • Management - API management puts all microservices into plans, then logs, meters, and tracks the value exchanged at this level.
  • Marketplace - Turning the platform into a marketplace that can be occupied by a variety of internal, partner, vendor, 3rd party, and public actors.
  • Monetization - Granular understanding of all the resources it takes to deliver each individual service, and an understanding of the costs associated with operating at scale.
  • Plans - A wealth of API plans in place at the API gateway level, something that is tied to IAM policies, and in alignment with API governance expectations.
  • Governance - Providing a map, and supporting guidance around the Lighthouse platform API governance. Understanding, measuring, and enforcing consistency across the API lifecycle–platform wide.
  • Value Exchange - Using the cloud model, which is essentially the original API management, marketplace, and economy model. Not just measuring consumption, but used to maximize and generate revenue from the value exchanged across the platform.

When you operate APIs on AWS and Azure, the platform as a service layer can utilize and benefit from the underlying infrastructure as a service monetization framework. Meaning, you can use AWS’s business model for managing the measuring, paying, and incentivizing of microservice owners. All the gears are there, they just need to be set in motion to support the management of a government API marketplace platform.

Question: Based on the information provided, please discuss your possible technology stack and detail your experience supporting these technologies.
Both Amazon Web Services and Azure provide the building blocks of what you need to execute the above. Each cloud platform has its own approach to delivering infrastructure at scale. Providing an interesting mix of API driven resources you can use to jumpstart any project.

AWS

First, let’s take a look at what is relevant to this vision from the Amazon Web Services side of things. These are all the core AWS solutions on the table, with dashboard, API, and command line access to get the job done.

Compute

Storage

  • Amazon S3 - Scalable Storage in the Cloud
  • Amazon EBS - Block Storage for EC2
  • Amazon Elastic File System - Managed File Storage for EC2
  • Amazon Glacier - Low-cost Archive Storage in the Cloud
  • AWS Storage Gateway - Hybrid Storage Integration

Database

Authentication

  • AWS Identity & Access Management - Manage User Access and Encryption Keys
  • Amazon Cognito - Identity Management for your Apps
  • AWS Single Sign-On - Cloud Single Sign-On (SSO) Service
  • AWS CloudHSM - Hardware-based Key Storage for Regulatory Compliance

Management

Logging

Network

Discovery

Migration

Orchestration

Monitoring

Security

Analytics

Integration

I’m a big fan of the AWS approach. Their marketplace, and AWS API Gateway provide unprecedented access to backend cloud, and on-premise resources, which can be secured using AWS IAM. Amazon Web Services provides a robust infrastructure as a service, adequate enough to deliver any platform as a service solution.

Azure

Next, let’s look at the Azure stack to see what they bring to the table. There is definitely some overlap with the AWS list of resources, but Microsoft has a different view of the landscape than Amazon does. However, similar to Amazon, most of the building blocks are here to deliver on the proposal above.

Compute

  • Virtual Machines - Provision Windows and Linux virtual machines in seconds
  • App Service - Quickly create powerful cloud apps for web and mobile
  • Functions - Process events with serverless code
  • Batch - Cloud-scale job scheduling and compute management
  • Container Instances - Easily run containers with a single command
  • Service Fabric - Develop microservices and orchestrate containers on Windows or Linux
  • Virtual Machine Scale Sets - Manage and scale up to thousands of Linux and Windows virtual machines
  • Azure Container Service (AKS) - Simplify the deployment, management, and operations of Kubernetes
  • Cloud Services - Create highly-available, infinitely-scalable cloud applications and APIs
  • Linux Virtual Machines - Provision virtual machines for Ubuntu, Red Hat, and more
  • Windows Virtual Machines - Provision virtual machines for SQL Server, SharePoint, and more

Storage

  • Storage - Durable, highly available, and massively scalable cloud storage
  • Backup - Simple and reliable server backup to the cloud
  • StorSimple - Lower costs with an enterprise hybrid cloud storage solution
  • Site Recovery - Orchestrate protection and recovery of private clouds
  • Data Lake Store - Hyperscale repository for big data analytics workloads
  • Blob Storage - REST-based object storage for unstructured data
  • Disk Storage - Persistent, secured disk options supporting virtual machines
  • Managed Disks - Persistent, secured disk storage for Azure virtual machines
  • Queue Storage - Effectively scale apps according to traffic
  • File Storage - File shares that use the standard SMB 3.0 protocol

Deployment

  • API Apps - Easily build and consume Cloud APIs

Containers

Databases

Authentication

  • Azure Active Directory - Synchronize on-premises directories and enable single sign-on
  • Multi-Factor Authentication - Add security for your data and apps without adding hassles for users
  • Key Vault - Safeguard and maintain control of keys and other secrets
  • Azure Active Directory B2C - Consumer identity and access management in the cloud

Management

  • API Management - Publish APIs to developers, partners, and employees securely and at scale

Logging

  • Log Analytics - Collect, search, and visualize machine data from on-premises and cloud
  • Traffic Manager - Route incoming traffic for high performance and availability

Monitoring

  • Azure Monitor - Highly granular and real-time monitoring data for any Azure resource
  • Microsoft Azure portal - Build, manage, and monitor all Azure products in a single, unified console

Analytics

Network

  • Content Delivery Network - Ensure secure, reliable content delivery with broad global reach
  • Azure DNS - Host your DNS domain in Azure
  • Virtual Network - Provision private networks, optionally connect to on-premises datacenters
  • Traffic Manager - Route incoming traffic for high performance and availability
  • Load Balancer - Deliver high availability and network performance to your applications
  • Network Watcher - Network performance monitoring and diagnostics solution

Orchestration

  • Scheduler - Run your jobs on simple or complex recurring schedules
  • Automation - Simplify cloud management with process automation
  • Automation & Control - Centrally manage all automation and configuration assets

Integration

  • Data Factory - Orchestrate and manage data transformation and movement
  • Logic Apps - Automate the access and use of data across clouds without writing code
  • Event Grid - Get reliable event delivery at massive scale

Search

Discovery

Security

  • Security Center - Unify security management and enable advanced threat protection across hybrid cloud workloads
  • Security & Compliance - Enable threat detection and prevention through advanced cloud security
  • Azure DDoS Protection - Protect your applications from Distributed Denial of Service (DDoS) attacks

Governance

  • Azure Policy - Implement corporate governance and standards at scale for Azure resources

Monetization

  • Cost Management - Optimize what you spend on the cloud, while maximizing cloud potential

Experience
I have been studying Amazon full time for almost eight years. I’ve been watching Azure play catch up for the last three years. I run my infrastructure, and a handful of clients, on AWS. I understand the API landscape of both providers, and how they can be woven into the vision proposed so far.

I see the AWS API stack, and the Azure API stack, as a starter set of services that can be built upon to deliver the base Lighthouse implementation. All the components are there. It just needs the first set of Lighthouse services to be defined, delivering the essential building blocks any platform needs, things like compute, storage, DNS, messaging, etc. I recommend that the VA Lighthouse team take the AWS API stack, and mold it into v1 of the Lighthouse API stack. Take the momentum from AWS’s own API journey, build upon it, and set into motion the VA Lighthouse API journey.

Enable VA services to be delivered as individual, self-contained units, that can be used as part of a larger VA orchestration of veteran services. Open up the VA and let some sunlight in. Think about what Amazon has been able to achieve by delivering its own internal operations as services, remaking not just retail, but also touching almost every other industry with Amazon Web Services. The Amazon Web Services myth story provides a powerful and compelling narrative for any company, organization, institution, or government agency like the VA to emulate.

This proposal is not meant to be a utopian vision for the VA. However it is meant to, as the name of the project reflects, shine a light on existing ways of delivering services via the cloud. Helping guide each service in its own individual journey, while also serving the overall mission of the platform–to help the veteran be successful in their own personal journey.


AWS IAM-Like Policies For AWS API Gateway And Marketplace Billing

The primary reason I’ve been adopting more AWS solutions as part of my API stack, and using tools I have historically felt lock me into the AWS ecosystem, is the availability of AWS identity and access management (IAM). I just cannot deliver security at this level as a small business owner, and their robust solution lets me dial in exactly what I need when it comes to defining who has access to what across my API infrastructure. I can define different policies, and apply them at the API management layer using both AWS Lambda and AWS API Gateway. Keeping everything separated, yet with a single API stack as the point of entry for all consumers and applications.

I want all of this security goodness, but for the business of my APIs. Similar to the engine that drives the relationship between me as an AWS Marketplace user and AWS, I want a framework for applying business policies at the plan level within AWS API Gateway. I want to determine who has access to which resources, as well as what they can use, but I want to be able to meter this usage, and charge different rates. Compute, storage, and bandwidth for my partners is different than for retail API consumers, with a mix of resource and API call based metrics.

The AWS monetization policies would reflect the AWS Marketplace framework, giving me a mix of metering and contract based billing, reflecting single or multi-dimensional usage across the eight areas of consumption they support currently. I want to be able to establish common monetization policies across all my microservices, and allow product managers to implement them consistently at scale using AWS API Gateway. Like security, these API product managers shouldn’t be experts in the economics of the services being offered, they should just be able to apply from a common pool of business policies, and provide feedback on how to evolve, when appropriate.

This concept is very much in the realm of traditional API management service composition, but would possess a machine readable policy format just like IAM policies. API monetization policies could be reported upon, providing a breakdown of the consumption of resources at the backend system, or front-end API path level, helping translate the monetization side of our API strategy into actual API plans that can be executed at run-time. Providing a standardized, scalable, quantifiable way to measure the value exchange that occurs at the API level. Done in a way that could be applied internally, or externally with partners, and 3rd party developers. Making the business of my APIs more consistent, modular, and reusable–just like the rest of my API infrastructure.
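
Purely as a thought experiment, here is what such a machine readable monetization policy could look like if it borrowed the shape of an AWS IAM policy, applied at the API Gateway plan level. None of these fields exist in AWS today.

```python
# Purely hypothetical: what a machine readable monetization policy could look
# like if it followed the shape of an AWS IAM policy, applied at the API
# Gateway plan level. None of these fields exist in AWS today.
MONETIZATION_POLICY = {
    "Version": "2018-02-01",
    "Statement": [
        {
            "Sid": "PartnerComputeAndCalls",
            "Effect": "Meter",
            "Plan": "partner",
            "Dimension": ["requests", "data"],
            "Rate": {"requests": 0.0005, "data_gb": 0.02},
            "Resource": ["arn:aws:execute-api:*:*:*/prod/GET/veterans/*"],
        },
        {
            "Sid": "RetailRequestsOnly",
            "Effect": "Meter",
            "Plan": "retail",
            "Dimension": ["requests"],
            "Rate": {"requests": 0.002},
            "Resource": ["arn:aws:execute-api:*:*:*/prod/GET/veterans/*"],
        },
    ],
}
```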

I think AWS has a significant advantage in this area. They have the advanced resource management infrastructure, as well as the business side of all of this from managing their own APIs, but also from slowly rolling it out as part of the AWS Marketplace. AWS API Gateway has the plan, and marketplace key, providing the beginning of the implementation. All we need is the standardized policies based upon their existing pricing framework, and the ability to measure and report upon them at the AWS API Gateway plan level. The working parts are there, they just need to be brought together. It might also be something someone could piece together from logging, and other existing outputs on the AWS platform, creating an external reporting and billing solution. IDK. Just brainstorming what I’d like to see, and getting it here on the blog before the thought passes.


Provisioning A Default App And Keys For Your API Consumers On Signup

I sign up for a lot of APIs. I love anything that reduces friction when on-boarding, and allows me to begin making an API call in 1-3 clicks. I’m a big fan of API providers that allow me to signup using my Github OAuth, preventing me from having to sign up for yet another account. I’m also a big fan of providers who automatically provision an application for me as part of the signup, and have my API keys waiting for me as soon as I’ve registered.

While signing up for the Sabre travel API I saw that they provisioned my application as part of the API sign up process in a way that was worth showcasing. Saving me the time and hassle of having to add a new application after I’ve signed up. Stuff like this might seem like a pretty small detail when developing an API on-boarding process, but when you are signing up for many different APIs, and trying to manage your time–these little details add up to be a significant time saver.

Ideally, API providers would auto-provision a default application along with the signup, but I like the idea of also giving me the option to name my application while registering. When crafting your API registration flow, make sure you spend time signing up multiple times, and try to put yourself in your API consumers’ shoes. I even recommend signing up for an account each week, repeatedly experiencing what your consumers will be exposed to. I also recommend spending time signing up for other APIs on a regular basis, to experience what they offer–you will always be surprised by what you find.
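
For providers looking to implement this, here is a small sketch of the pattern: when a developer signs up, provision a default application and API key in the same step. The data store and key generation below are simplified placeholders.

```python
# A small sketch of the pattern described above: when a developer signs up,
# provision a default application and API key for them right away. The data
# store and key generation are simplified placeholders.
import secrets

APPLICATIONS = {}  # stand-in for a real application registry

def register_consumer(email, app_name=None):
    """Create the account and a default app with keys in one step."""
    app = {
        "name": app_name or f"{email.split('@')[0]}-default-app",
        "api_key": secrets.token_urlsafe(32),
        "plan": "starter",
    }
    APPLICATIONS[email] = app
    # The welcome screen (or email) can show the key immediately, so the
    # consumer can make their first API call without any extra steps.
    return app

print(register_consumer("developer@example.com"))
```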


An Open Banking in the UK OpenAPI Template

After learning more about what Open Banking is doing for APIs in the UK, I realized that I needed an OpenAPI template for the industry specification. There are six distinct schema available as part of the project, and I wanted a complete OpenAPI to describe which paths were available, as well as the underlying response schema. I got to work crafting one from the responses that were available within the Open Banking documentation.

Open Banking had schema available for their API definitions, but OpenAPI is the leading API and data specification out there today, so it makes sense that there should be an OpenAPI available, helping all participating banking API providers take advantage of all the tooling available within the OpenAPI community. To help support this, I have published my Open Banking OpenAPI definition as a Github Gist.

I’ve applied this OpenAPI definition to the 17 banks they have listed, and will be including them in the next publishing of my API Stack project. Open Banking provides a common definition that can be used across many banks, and an OpenAPI template allows me to quickly apply the common template to each individual bank. Generating bank-specific documentation, SDKs and code samples, monitoring, tests, and other client tooling. Helping me put the valuable data being made available via each API to work.
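Here is a rough sketch of how I think about applying a common template like this across individual banks: load the shared OpenAPI definition, swap in the bank specific host and title, and write out a bank specific copy. The template file name and hosts are hypothetical placeholders, and the real Open Banking definition carries far more detail than this implies.

```python
# Sketch: stamp out bank specific OpenAPI definitions from a shared template.
# File names and hosts are hypothetical examples, not actual bank endpoints.
import copy
import json

with open("open-banking-template.openapi.json") as handle:
    template = json.load(handle)

banks = {
    "example-bank-one": "api.examplebankone.co.uk",
    "example-bank-two": "api.examplebanktwo.co.uk",
}

for slug, host in banks.items():
    definition = copy.deepcopy(template)
    definition["info"]["title"] = f"{slug.replace('-', ' ').title()} Open Banking API"
    definition["host"] = host  # OpenAPI 2.0 style host; use servers[] for 3.0
    with open(f"{slug}.openapi.json", "w") as out:
        json.dump(definition, out, indent=2)
```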

I’d like to see more organizations like Open Banking emerge. I’d also like to help ensure they all make OpenAPI templates available for any API and schema specifications they establish. The API lifecycle is increasingly OpenAPI defined, and when you make your guidance available in the OpenAPI format, you are enabling actors within any industry to quickly get up and running with designing, deploying, managing, testing, monitoring, and almost every other stop along a modern API lifecycle. Increasing the chances of adoption for any API standards you are putting out there.


An Opportunity Around Providing A Common OpenAPI Enum Catalog

I’m down in the details of the OpenAPI specification lately, working my way through hundreds of OpenAPI definitions, trying to once again make sense of the API landscape at scale. I’m working to prepare as many API path definitions as I possibly can to be runnable within one or two clicks. OpenAPI definitions, and Postman Collections are essential to making this happen, both of which require complete details on the request surface area for an API. I need to know everything about the path, as well as any headers, path, or query parameters that need to be included. A significant aspect of this definition being complete includes default, and enum values being present.

If I can’t quickly choose from a list of values, or run with a default value, when executing an API, the time to seeing a live response grows significantly. If I have to travel back to the HTML documentation, or worse, do some Googling before I can make an API call, I just went from seconds to potentially minutes or hours before I can see a real world API response. Additionally, if there are many potential values available for each API parameter, enums become critical building blocks to helping me understand all the dimensions of an API’s surface area. Something that should have been considered as part of the API’s design, but often just gets left as part of API documentation.

When playing with a Bitcoin API with the following path /blocks/{pool_name}, I need the list of pools I can choose from. When looking to get a stock market quote from an API with the following path, /stock/{symbol}/quote, I need a list of all the ticker symbols. Having, or not having, these enum values at documentation and execution time is essential. Many of these lists of values are so common, developers take them for granted. Assuming that API consumers just have them laying around, and that they really aren’t worth including in documentation. You’d think we all have lists of states, countries, stock tickers, Bitcoin pools, and other data just laying around, but even as the API Evangelist, I often find myself coming up short.

All of this demonstrates a pretty significant opportunity for someone to create a Github hosted, searchable, forkable list of common OpenAPI enum lists. Providing an easy place for API providers, and API consumers to discover simple, or complex lists of values that should be present in API documentation, and included as part of all OpenAPIs. I recommend just publishing each enum JSON or YAML list as a Github Gist, and then publishing as a catalog via a simple Github Pages website. If I don’t see something pop up in the next couple of months, I’ll probably begin publishing something myself. However, I need another API related project like I need a hole in the head, so I’m holding off in hopes another hero or champion steps up and owns the enum portion of the growing OpenAPI conversation.
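To show how little tooling is actually involved, here is a sketch of the kind of build script that could assemble a catalog index from a folder of enum list files, ready to publish via Github Pages. The folder layout and fields are my own invention, not an existing project.

```python
# Sketch: build a simple catalog index from a folder of enum list files,
# ready to publish via GitHub Pages. Folder layout and fields are hypothetical.
import json
import pathlib

catalog = []
for path in sorted(pathlib.Path("enums").glob("*.json")):
    entry = json.loads(path.read_text())
    catalog.append({
        "name": entry.get("name", path.stem),
        "description": entry.get("description", ""),
        "count": len(entry.get("values", [])),
        "file": path.name,
    })

pathlib.Path("catalog.json").write_text(json.dumps(catalog, indent=2))
print(f"Indexed {len(catalog)} enum lists.")
```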


What Is Open Banking In The UK?

I am profiling banks in the UK as part of an effort to move forward my API Stack work, and populate the Streamdata.io API Gallery. One significant advantage that banks in the UK have over other countries in the EU, and even in the US, is the help of Open Banking. To help profile the organization, I’ll just borrow from their website to define who they are and what they do.

The Open Banking Implementation Entity was created by the UK’s Competition and Markets Authority to create software standards and industry guidelines that drive competition and innovation in UK retail banking.

In 2016, The Competition and Markets Authority (CMA) published a report on the UK’s retail banking market which stated that older, larger banks do not have to compete hard enough for customers’ business, and smaller and newer banks were finding it difficult to grow and access the UK banking market. To solve this problem, they proposed a number of remedies including Open Banking, which defines API standards that are intended to help level that playing field.

The role of Open Banking is to:

  • Design the specifications for the Application Programming Interfaces (APIs) that banks and building societies use to securely provide Open Banking
  • Support regulated third party providers and banks and building societies to use the Open Banking standards
  • Create security and messaging standards
  • Manage the Open Banking Directory which allows regulated participants like banks, building societies and third party providers to enroll in Open Banking
  • Produce guidelines for participants in the Open Banking ecosystem
  • Set out the process for managing disputes and complaints

This approach to standardizing API definitions is the type of leadership that is needed to move the API conversation forward in ALL industries. I know in the US, many enjoy viewing regulations as always bad, but this type of organizational designation can go a long way towards moving an industry forward in a concerted fashion. Doing the hard work to establish a common API definition, and playing a central role in helping ensure each actor within an industry is implementing the definition as expected.

I’d like to see more organizations emerge that reflect Open Banking’s mission, in a variety of industries. Many companies do not have the time, expertise, or desire to do the homework and understand what needs to occur on the API front. Speaking from experience, there isn’t a lot of vendor-free funding to do this kind of work, and it is something that will require public sector investment. In my opinion, this doesn’t always have to be government led, but there should be industry neutral funding available to move forward the conversation in a way that benefits everyone involved, without a focus on any single product or service.


Relationship Between OpenAPI Path, Summary, Tags and AsyncAPI Topics

I’m working my way through several hundred OpenAPI definitions that I have forked from APIs.guru, Any API, and have automagically generated from API documentation scrape scripts I have developed over time. Anytime I evolve a new OpenAPI definition, I first make sure the summary, description, and tags are as meaningful as they possibly can be. Sadly this work is also constrained by how much time I have to spend with each API, as well as how well designed their API is in the first place. I have a number of APIs that help me enrich this automatically, by mining the API path and applying regular expressions, but oftentimes it takes a manual review to add tags, polish summaries, and make the OpenAPI details as meaningful as I possibly can, in regards to what an API does.

As I’m taking a break from this work, I’m studying up on AsyncAPI, trying to get my head around how I can be crafting API definitions for the message-based, event-driven, streaming APIs I’m profiling alongside my regular API research. One of the areas the AsyncAPI team is pushing forward is around the concept of a topic–_“to create a definition that suites most use cases and establish the foundation for community tooling and better interoperability between products when using AsyncAPI.”_ or to elaborate further, “a topic is a string representing where an AsyncAPI can publish or subscribe. For the sake of comparison they are like URLs in a REST API.” Now I’m thinking about the relationships between the API design elements I’m wrestling with in my API definitions, and how the path, summary, and tags reflect what AsyncAPI is trying to articulate with their topics discussion.

{organization}.{group}.{version}.{type}.{resources}.{event}

  • organization - the name of the organization or company.
  • group - the service, team or department in charge of managing the message.
  • version - the version of the message for the given service. This version number should remain the same unless changes in the messages are NOT backward compatible.
  • type - the type of the message, e.g., is it a command or an event? This value should always be event unless you’re trying to explicitly execute a command in another service, i.e., when using RPC.
  • resources - resources and sub-resources, in a word (or words) describing the resource the message refers to. For instance, if you’re sending a message to notify a user has just signed up, the resource should be user. But, if you want to send a message to notify a user has just changed her full name, you could name it as user.full_name.
  • event - an event or command name, in case message type is event, this should be a verb in past tense describing what happened to the resource, and in case message type is command, this should be a verb in infinitive form describing what operation you want to perform.

Example(s):

  • hitch.accounts.1.event.user.signedup
  • hitch.email.1.command.user.welcome.send

As I’m crafting OpenAPI definitions, and publishing them to Github, I’m using Jekyll to give me access to the large numbers of OpenAPI definitions I’ve published, and indexed using APIs.json, as Liquid objects. For each site, I can reference API paths using a dotted notation, such as site.twilio.send-sms-get. I haven’t polished my naming conventions, and am simply taking the path, stripping out everything but the alphanumeric characters for the file names, but it got me thinking about how I might want to get more structured in how I name the individual units of compute I’m publishing using OpenAPI, and often times as Postman Collections.
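As a thought experiment, here is a rough sketch of how I might derive AsyncAPI style topic names from the OpenAPI paths I am already publishing, following the pattern above. The organization, group, and event verbs are my own hypothetical choices, not anything prescribed by the AsyncAPI specification.

```python
# Sketch: derive an AsyncAPI style topic name from an OpenAPI path and method,
# following the {organization}.{group}.{version}.{type}.{resources}.{event}
# pattern. Organization, group, and event verbs are hypothetical choices.
import re

EVENT_VERBS = {"get": "retrieved", "post": "created", "put": "updated", "delete": "deleted"}

def topic_for(organization, group, version, path, method):
    # Drop parameter markers and non-alphanumerics, then join the path segments.
    segments = [re.sub(r"[^a-z0-9]", "", seg.lower())
                for seg in path.strip("/").split("/") if not seg.startswith("{")]
    resources = ".".join(segments) or "root"
    event = EVENT_VERBS.get(method.lower(), method.lower())
    return f"{organization}.{group}.{version}.event.{resources}.{event}"

print(topic_for("twilio", "messaging", 1, "/Messages/{sid}", "get"))
# -> twilio.messaging.1.event.messages.retrieved
```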

As I publish these API definitions to Github, as part of my API profiling for inclusion in the Streamdata.io API Gallery, I’m looking to establish a map of the surface area that I can potentially turn into webhooks, streams, and other approaches to real time message delivery. This is why I’m looking to understand AsyncAPI, to help quantify the result of this work. After I map out the surface area of the APIs, quantify the topics at play, and obtain an API key, I need a way to then map out the real time streams of messages that will get passed around. To do this, I will need a way to turn each potential API request and its resulting response into a topic definition–a well defined, measurable input or output–AsyncAPI is going to help me do this.


People Who Provide Enum For Their OpenAPI Definitions Are Good People

I’m processing a significant number of OpenAPI definitions currently, as well as crafting a number of them from scraped API documentation. After you work with a lot of OpenAPI definitions, aiming to achieve a specific objective, you really get to know which aspects of the OpenAPI are the most meaningful, and helpful when they are complete. I talked about the importance of summary, description, and tags last week, and this week I’d like to highlight how helpful it is when the stewards of OpenAPI definitions include enum values for their parameters, and I think they are just good people. ;-)

Enums are simply a list of potential values for each of the parameters you outline as part of your API definition. So if you have state as a parameter for use in the request of your API, you have a list of the 50 US states as the enum. If the parameter is color, you have just the color black, because we all know it is the only color (all the colors). ;-) If you provide a parameter that will accept a standard set of inputs, you should consider providing an enum list to help your consumers understand the potential for that parameter. Outlining the dimensions of the parameter in a simple JSON or YAML array of every single possible value.
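As a quick illustration, here is that state parameter example expressed the way it would sit (in JSON or YAML) within an OpenAPI definition, shown here as a Python structure with the list truncated for brevity.

```python
# A string parameter with its enum values spelled out, as it would appear
# within an OpenAPI definition. The enum list is truncated for brevity.
state_parameter = {
    "name": "state",
    "in": "query",
    "description": "Two letter US state code to filter results by.",
    "required": False,
    "type": "string",
    "default": "OR",
    "enum": ["AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA"],
}
```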

I can’t tell you how many times I have had to go looking for a list of values. Sometimes it is present within the description for the OpenAPI, but often times I have to go back to the portal for the API, and follow a link to a page that lists out the values. That is, if an API provider decides to provide this information at all. The thoughtful ones do, the even more thoughtful ones put it in their OpenAPI definitions as enum values. Anytime I come across a list of enums that I can quickly use to build an array, a select box, or other common elements of doing business with APIs, I’m a happy camper.

Which is why you find me writing up enums. Boring. Boring. Boring. However, it is something that makes me happy, potentially multiple times in a single day, and imagine that multiplied by the number of developers you have, or maybe “had”, depending on how frustrating it is to find the values that can be used in your API’s parameters. In my opinion, enums add rich dimensions to what your API does, and can be as important as the overall design of your API. Depending on how you’ve designed your API, you may have invested heavily in design, or may be leaning on your API parameters to do the heavy lifting–making them even more important when it comes to documenting them as part of your API operations.


Insecurity Around Providing Algorithmic Transparency And Observability Using APIs

I’m working on a ranking API for my partner Streamdata.io to help quantify the efficiencies they bring to the table when you proxy an existing JSON web API using their service. I’m evolving an algorithm they have been using for a while, wrapping it in a new API, and applying it across the APIs I’m profiling as part of my API Stack, and the Streamdata.io API Gallery work. I can pass the ranking API any OpenAPI definition, and it will poll and stream the API for 24 hours, and return a set of scores regarding how real time the API is, and what the efficiency gains are when you use Streamdata.io as a proxy for the API.
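To give a sense of what this kind of ranking involves, here is a stripped down sketch of the general idea: poll an API path on an interval, track how often the response actually changes, and estimate how much could be saved by only sending changes. To be clear, this is not the Streamdata.io algorithm or API, just an illustration of the crude approach I am describing, with a hypothetical example URL.

```python
# Crude sketch of scoring how "real time" an API path is, and how much could be
# saved by only sending changes. Not the actual ranking algorithm, just the idea.
import json
import time
import requests

def score_api(url, polls=10, interval=5):
    previous, changes, full_bytes, changed_bytes = None, 0, 0, 0
    for _ in range(polls):
        body = requests.get(url).json()
        payload = json.dumps(body)
        full_bytes += len(payload)
        if previous is not None and body != previous:
            changes += 1
            changed_bytes += len(payload)  # naive: count the whole payload when it changes
        previous = body
        time.sleep(interval)
    realtime_score = changes / max(polls - 1, 1)
    efficiency_gain = 1 - (changed_bytes / full_bytes) if full_bytes else 0
    return {"realtime": realtime_score, "efficiency_gain": efficiency_gain}

# Hypothetical example usage:
# print(score_api("https://api.example.com/market/quotes"))
```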

As I do this work, I find myself thinking more deeply about the role that APIs can play in helping make algorithms more transparent, observable, and accountable. My API ranking algorithm is pretty crude, but honestly it isn’t much different than many other algorithms I’ve seen companies defend as intellectual property and their secret sauce. Streamdata.io is invested in the ranking algorithm and API being as transparent as possible, so that isn’t a problem here, but each step of the process allows me to think through how I can continue to evangelize other algorithm owners to use APIs, to make their algorithms more observable and accountable.

In my experience, most of the concerns around keeping algorithms secret stem from individual insecurities, and nothing actually technical, mathematical, or proprietary. The reasons for the insecurities are usually that the algorithm isn’t that mathematically sophisticated (I know mine isn’t), or maybe it is pretty flawed (I know mine is currently), and people just aren’t equipped to admit this (I know I am). I’ve worked for companies who vehemently defend their algorithms and refuse to open them up, because in the end they know they aren’t defensible on many levels. The only value the algorithm possesses in these scenarios is secrecy, and the perception that there is magic going on behind the scenes. When in reality, it is a flawed, crude, simple algorithm that could actually be improved upon if it was opened up.

I’m not insecure about my lack of mathematical skills, or the limitations of my algorithm. I want people to point out its flaws, and improve upon my math. I want the limitations of the algorithm to be pointed out. I want API providers and consumers to use the algorithm via the API (when I publish it) to validate, or challenge the algorithmic assumptions being put forth. I’m not in the business of selling smoke and mirrors, or voodoo algorithmics. I’m in the business of helping people understand how inefficient their API responses are, and how they can possibly improve upon them. I’m looking to develop my own understanding of how I can make APIs more event-driven, real time, and responsive. I’m not insecure about providing transparency and observability around the algorithms I develop, using APIs–all algorithm developers should be this open and confident in their own work.


Using Jekyll And OpenAPI To Evolve My API Documentation And Storytelling

I’m reworking my API Stack work as independent sets of Jekyll collections. Historically I just dumped all the APIs.json and OpenAPI files into the central data folder, and grouped them into folders by company name. Now I am breaking them out into tag based collections, using a similar structure. Further evolving how I document and tell stories using each API. I had been publishing a single OpenAPI for each platform, but now I’m publishing a separate OpenAPI for each API path–we will see where this goes, it might ultimately end up biting me in the ass. I’m doing this because I want to be able to talk about a single API path, and provide a definition that can be viewed, interpreted, and executed against, independent of the other paths–Jekyll+OpenAPI is helping me accomplish this.
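For anyone curious what this per-path decoupling looks like in practice, here is a rough sketch of how I would split a provider level OpenAPI definition into a separate definition for each path, ready to drop into a Jekyll collection. The file names and folder layout are placeholders, not my final naming convention.

```python
# Sketch: split one provider level OpenAPI (2.0 style) definition into a
# separate definition per path, for use as individual Jekyll collection files.
# File and folder names are placeholders.
import copy
import json
import pathlib
import re

source = json.loads(pathlib.Path("twilio.openapi.json").read_text())
out_dir = pathlib.Path("_social")
out_dir.mkdir(exist_ok=True)

for path, operations in source.get("paths", {}).items():
    single = copy.deepcopy(source)
    single["paths"] = {path: operations}           # keep only this one path
    slug = re.sub(r"[^a-z0-9]+", "-", path.lower()).strip("-") or "root"
    (out_dir / f"twilio-{slug}.json").write_text(json.dumps(single, indent=2))
```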

With each API provider possessing its own APIs.json index, and each API path having its own OpenAPI definition, I’m able to mix up how I document and tell stories around these APIs. I can list them by API provider, or by individual API path. I can filter based upon tags, and provide execute-time links that reference each individual unit of API. I have separate JavaScript functions that can be referenced if the API path is GET, POST, or PUT. I can even inherit other relevant links like API sign up or terms of service as part of its documentation. I can reference all of this as part of larger documentation, or within blog posts, and other pages throughout the website–which will be refreshed whenever I update the OpenAPI definition.

If you aren’t familiar with how Jekyll works, it is a static content solution that allows you to develop collections. You can put CSV, JSON, or YAML into these collections (folders), and they become objects you can reference using Liquid syntax. So if I put Twitter’s APIs.json, and OpenAPI into a folder within my social collection, I can reference site.social.twitter as the APIs.json for Twitter’s entire API operations, and I can reference individual APIs as site.social.twitter.search for the individual OpenAPI defining the Twitter search API path. This decouples API documentation for me, and allows me to not just document APIs, but tell stories with API definitions, making my API portals much more interactive, and hopefully engaging.

I just got my API Stack approach refreshed using this new format. Now I just need to go through all my APIs and rebuild the underlying Github repository. I have thousands of APIs that I track on, and I’m curious how this approach holds up at scale. While API Stack is a single repository, I can essentially publish any collection of APIs I desire to any of the hundreds of repositories that make up the API Evangelist network. Allowing me to seamlessly tell stories using the technical details of API operations, and the individual API resources they serve up. Further evolving how I tell stories around the APIs I’m tracking on. While my API documentation has always been interactive, I think this newer, more modular approach reflects the value each individual unit of an API brings to the table, rather than just looking to document all the APIs a provider possesses.


The Importance of the API Path Summary, Description, and Tags in an OpenAPI Definition

I am creating a lot of OpenAPI definitions right now. Streamdata.io is investing in me pushing forward my API Stack work, where I profile APIs using OpenAPI, and index their operations using APIs.json. From the resulting indexes, we are building out the Streamdata.io API Gallery, which shows the possibilities of providing streaming APIs on top of existing web APIs available across the landscape. The OpenAPI definitions I’m creating aren’t 100% complete, but they are “good enough” for what we need to do with them, and are allowing me to catalog a variety of interesting APIs, and automate the proxying of them using Streamdata.io.

I’m finding the most important part of doing this work is making sure there is a rich summary, description, and set of tags for each API. While the actual path, parameters, and security definitions are crucial to programmatically executing the API, the summary, description, and tags are essential so that I can understand what the API does, and make it discoverable. As I list out different areas of my API Stack research, like the financial market data APIs, it is critical that I have a title, and description for each provider, but the summary, description, and tags are what provides the heart of the index for what is possible with each API.

When designing an API, as a developer, I tend to just fly through writing summaries, descriptions, and tags for my APIs. I’m focused on the technical details, not this “fluff”. However, this represents one of the biggest disconnects in the API lifecycle, where the developer is so absorbed with the technical details, we forget, neglect, or just don’t care to articulate what we are doing to other humans. The summary, description, and tags are the outlines in the API contract we are providing. These details are much more than just the fluff for the API documentation. They actually describe the value being delivered, and allow this value to be communicated, and discovered throughout the life of an API–they are extremely important.
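Because it is so easy to skip this work, it helps to have a quick check in place. Here is a minimal sketch of the kind of script I have in mind, flagging any OpenAPI operation that is missing a summary, description, or tags; the file name is just an example placeholder.

```python
# Minimal sketch: flag OpenAPI operations missing a summary, description, or tags.
# The file name is an example placeholder.
import json
import pathlib

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "options", "head"}

definition = json.loads(pathlib.Path("provider.openapi.json").read_text())

for path, operations in definition.get("paths", {}).items():
    for method, operation in operations.items():
        if method.lower() not in HTTP_METHODS:
            continue
        missing = [field for field in ("summary", "description", "tags")
                   if not operation.get(field)]
        if missing:
            print(f"{method.upper()} {path} is missing: {', '.join(missing)}")
```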

As I’m doing this work, I realize just how important these descriptions and tags are to the future of these APIs. Whenever it makes sense I’m translating these APIs into streaming APIs, and I’m taking the tags I’ve created and using them to define the events, topics, and messages that are being transacted via the API I’m profiling. I’m quantifying how real time these APIs are, and mapping out the meaningful events that are occurring. This represents the event-driven shift we are seeing emerge across the API landscape in 2018. However, I’m doing this on top of API providers who may not be aware of this shift in how the business of APIs is getting done, and are just working hard on their current request / response API strategy. These summaries, descriptions, and tags, represent how we are going to begin mapping out the future that is happening around them, and begin to craft a road map that they can use to understand how they can keep evolving, and remain competitive.


A Really Nice API Application Showcase Over At The Intrinio Market Data API

I am profiling financial market data APIs currently, and as I’m doing my work profiling APIs, I’m always on the hunt for interesting elements of their API operations that I can showcase for my readers. While looking at the financial market data API from Intrinio, I found that I really, really like their application showcase, which provides a pretty attractive blueprint for how we can showcase what is being developed on top of our APIs.

The Intrinio application showcase is just clean looking, and has the bells and whistles you’d expect like categories, search, detail or list view, and detail pages providing you all the information you need about the application, and where you can find tutorials, code, and other relevant resources.

Another thing I really like is that it isn’t just about web and mobile applications. They have spreadsheet integrations, and help walk you through how to “apply” each type of integration. This is what the application in API means to me. It isn’t always just about finished web, mobile, and device applications. It is about applying the resources available via the programmatic interfaces to some problem you have in your world.

Anyways, the Intrinio application showcase is totally worth profiling as part of my research. It is a great blueprint for other API providers to follow when crafting their own application showcases. This post gives me a single URL that I can share with folks, and reference throughout my stories, white papers, guides, and talks. I’d love to see this become the standard for how API providers showcase their applications, keeping things simple, clean, and bringing value to their consumers.


How Big Or Small Is An API?

I am working to build out the API Gallery for Streamdata.io, profiling a wide variety of APIs for inclusion in the directory, adding to the wealth of APIs that could be streamed using the service. As I work to build the index, I’m faced with the timeless question: what is an API? Not technically what an API does, but what is an API in the context of helping people discover the API they are looking for. Is Twitter an API, or is the Twitter search/tweets path an API? My answer to this question always distills down to a specific API path, or as some call it, an API endpoint. Targeting a specific implementation, use case, or value generated by a single API provider.

Like most things in the API sector, words are used interchangeably, and depending on how much experience you have in the business, you will have much finer grained definitions about what something is, or isn’t. When I’m talking to the average business user, the Twitter API is the largest possible scope–the entire thing. In the context of API discovery, and helping someone find an API to stream or to solve a specific problem in their world, I’m going to resort to a very precise definition–in this case, it is the specific Twitter API path that will be needed. Depending on my audience, I will zoom out, or zoom in on what constitutes a unit of API. The only consistency I’m looking to deliver is around helping people understand, and find what they are looking for–I’m not worried about always using the same scope in my definition of what an API is.

You can see an example of this in action with the Alpha Vantage market data API I’m currently profiling, and adding to the gallery. Is Alpha Vantage a single API, or 24 separate APIs? In the context of the Streamdata.io API Gallery, it will be 24 separate APIs. In the context of telling the story on the blog, there is a single Alpha Vantage API, with many paths available. I don’t want someone searching specifically for a currency API to have to wade through all 24 Alpha Vantage paths, I want them to find the specific currency API path they need. When it comes to API storytelling, I am fine with widening the scope of my definition, but when it comes to API discovery I prefer to narrow the scope down to a more granular unit of value.

For me, it all comes down to the definition of what an API is. It is all about applying a programmatic interface. If I’m applying it in a story that targets a business user, I can speak in general terms. If I’m applying it to solve a specific business problem, I’m going to need to get more precise. This precision can spin out of control if you are dealing with developers who tend to get dogmatic about programming languages, frameworks, platforms, and the other things that make their worlds go round. I’m not in the business of being “right”. I’m in the business of helping people understand, and solve the problems they have. Which gives me a wider license when it comes to defining how big or small an API can be. It is a good place to be.


Some Common Features Of An API Application Review Process

I received a tweet from my friend Kelly Taylor with USDS, asking for any information regarding establishing an “approve access to production data” process for developers. He is working on an OAuth + FHIR implementation for the Centers for Medicare and Medicaid Services (CMS) Blue Button API. Establishing a standard approach for on-boarding developers into a production environment always makes sense, as you don’t want to give access to sensitive information without making sure the company, developer, and application has been thoroughly vetted.

As I do with my work, I wanted to think through some of the approaches I’ve come across in my research, and share some tips and best practices. The Blue Button API team has a section published regarding how to get your application approved, but I wanted to see if I could expand on it, while also sharing this information with other readers. This is a relevant use case that I see come up regularly in healthcare, financial, education, and other mainstream industries.

Virtualization & Sandbox
The application approval conversation usually begins with ALL new developers being required to work with a sandboxed set of APIs, only providing production API access to approved developers. This requires having a complete set of virtualized APIs, mimicking exactly what would be used in production, but in a much safer, protected environment. One of the most important aspects of this virtualized environment is that there also needs to be robust sets of virtualized data, providing as much parity as possible with what developers will experience when they enter the production environment. The sandbox environment needs to be as robust and reliable as production, which is a mistake I see made over and over by providers, where the sandbox isn’t reliable, or as functional, and developers are never able to reach production status in a consistent and reliable way.

Doing a Background Check
Next, as reflected in the Blue Button team’s approach, you should be profiling the company and organization, as well as the individual behind each application. You see companies like Best Buy refusing any API signup that doesn’t have an official company domain that can be verified. In addition to requiring developers to provide a thorough amount of information about who they are, and who they work for, many API providers are using background and profiling services like Clearbit to obtain more details about a user based upon their email, IP address, and company domain. Enabling different types of access to API resources depending on the level of scrutiny a developer is put under. I’ve seen this level of scrutiny go all the way up to requiring the scanning of a driver’s license, and providing corporate documents before production access is approved.

Purpose of Application
One of the most common filtering approaches I’ve seen centers around asking developers about the purpose of their application. The more detail the better. As we’ve seen from companies like Twitter, the API provider holds a lot of power when it comes to deciding what types of applications will get built, and it is up to the developer to pitch the platform, and convince them that their application will serve the mission of the organization, as well as any stakeholders, and end-users who will be leveraging the application. This process can really be a great filter for making sure developers think through what they are building, requiring them to put forth a coherent proposal, otherwise they will not be able to get full access to resources. This part of the process should be conducted early on in the application submission process, reducing frustrations for developers if their application is denied.

Syncing The Legal Department
Also reflected in the Blue Button team’s approach is the syncing of the legal aspects of operating an API platform and its applications. Making sure the application’s terms of service, privacy, security, cookie, branding, and other policies are in alignment with the platform. One good way of doing this is offering a white label edition of the platform’s legal documents for use by each application. Doing the heavy legal work for the application developers, while also making sure they are in sync when it comes to the legal details. Providing legal development kits (LDK) will grow in prominence in the future, just like providing software development kits (SDK), helping streamline the legalities of operating a safe and secure API platform, with a wealth of applications in its service.

Live or Virtual Presentation
Beyond the initial pitch selling an API provider on the concept of an application, I’ve seen many providers require an in-person, or virtual demo of the working application before it can be added to a production environment, and included in the application gallery. It can be tough for platform providers to test drive each application, so making the application owners do the hard work of demonstrating what an application does, and walking through all of its features is pretty common. I’ve participated on several judging panels that operate quarterly application reviews, as well as part of specific events, hackathons, and application challenges. Making demos a regular part of the application lifecycle is easier to do when you have dedicated resources in place, with a process to govern how it will all work in recurring batches, or on a set schedule.

Getting Into The Code
As part of the application review process many API providers require that you actually submit your code for review via Github. Providing details on ALL dependencies, and performing code, dependency, and security scans before an application can be approved. I’ve also seen this go as far as requiring the use of specific SDKs, frameworks, or the inclusion of proxies within the client layer, and requiring all HTTP calls be logged as part of production applications. This process can be extended to include all cloud and SaaS solutions involved, limiting where compute, storage, and other resources can be operated. Requiring all 3rd party APIs in use be approved, or already on a white list of API providers before they can be put to use. This is obviously the most costly part of the application review process, but depending on how high the bar is being set, it is one that many providers will decide to invest in, ensuring the quality of all applications that run in a production environment.

Regular Review & Reporting
One important thing about the application review process is that it isn’t a one time process. Even once an application is accepted and added into the production environment, this process will need to be repeated for each version release of the application, along with the changes to the API. Of course the renewal process might be shorter than the initial approval workflow, but auditing and regular check-ins should be common, and not forgotten. This touches on the client level SDK, and API management logging needs of the platform, and regular reporting upon application usage and activity should be available in real time, as well as part of each application renewal. API operations is always about taking advantage of the real time awareness introduced at the API consumption layer, and staying in tune with the healthy, and not so healthy patterns that emerge from logging everything an application is doing.

Business Model
It is common to ask application developers about their business model. The absence of a business model almost always reflects the underlying exploitation and sale of data being accessed or generated as part of an application’s operation. Asking developers how they will make money and sustain their operations, along with regular check-ins to make sure it is truly in effect, is an easy way to ensure that applications are protecting the interests of the platform, its partners, and the applications’ end-users.

There are many other approaches I’ve seen API providers require before accepting an application into production. However, I think we should also be working hard to keep the process simple, and meaningful. Of course, we want a high bar for quality, but as with everything in the API world, there will always be compromises in how we deliver on the ground. Depending on the industry you are operating in, the bar will be set higher, or possibly lowered a little to allow for more innovation. I’ve included a list of some of the application review processes I found across my research–showing a wide range of approaches across API providers we are all familiar with. Hopefully that helps you think through the application review process a little more. It is something I’ll write about again in the future as I push forward my research, and distill down more of the common building blocks I’m seeing across the API landscape.

Some Leading Application Review Processes


Code Generation Of OpenAPI (fka Swagger) Still The Prevailing Approach

Over 50% of the projects I consult on still generate OpenAPI (fka Swagger) from code, rather than the other way around. When I first begin working with any API development group as an advisor, strategist, or governance architect I always ask, “are you using OpenAPI?” Luckily the answer is almost always yes. The challenge is that most of the time they don’t understand the full scope of how to use OpenAPI, and are still opting for the more costly approach–writing code, then generating OpenAPI from annotations. It has been over five years since Jakub Nesetril (@jakubnesetril) of Apiary first decoupled API design from code, but clearly we still have a significant amount of work to do when it comes to API definition and design literacy amongst development groups.

When you study where API services and tooling are headed, it is clear that API deployment, and the actual writing of code, is getting pushed further down in the life cycle. Services like Stoplight.io and Postman are focusing on enabling a design, mock, document, test, and iterate approach, with API definitions (OpenAPI, Postman, etc) at the core. The actual deployment of the API, whether using open source frameworks, API gateways, or other methods, comes into the picture further downstream. Progressive API teams are hammering out exactly the API they need without ever writing any code, making sure the API design is dialed in before the more expensive, and often permanent, code gets written and sent to production.

You will see me hammering on this line of API design first messaging on API Evangelist over the next year. Many developers still see OpenAPI (fka Swagger) as being about generating API documentation, not as the central contract that is used across every stop along the API lifecycle. Most do not understand that you can mock instead of deploying, and even provide mock data, errors, and other scenarios, allowing you to prototype applications on top of API designs. It will take a lot of education, and awareness building to get API developers up to speed that this is all possible, and begin the long process of changing behavior on the ground. Teams just aren’t used to this way of thinking, but once they understand what is possible, they’ll realize what they have been missing.

I need to come up with some good analogies for generating API definitions from code. It really is an inefficient, and a very costly way to get the job done. Another problem is that this approach tends to be programming language focused, which always leaves its mark on the API design. I’m going to be working with both Stoplight.io and Postman to help amplify this aspect of delivering APIs, and how their services and tooling helps streamline how we develop our APIs. I’m going to be working with banks, insurance, health care, and other companies to improve how they deliver APIs, shifting things towards a design-first way of doing business. You’ll hear the continued drumbeat around all of this on API Evangelist in coming months, as I try to get the attention of folks down in the trenches, and slowly shift the behavior towards a better way of getting things done.


The Growing Importance of Github Topics For Your API SEO

When you are operating an API, you are always looking for new ways to be discovered. I study this aspect of operating APIs from the flip-side–how do I find new APIs, and stay in tune with what APIs are up to? Historically we find APIs using ProgrammableWeb, Google, and Twitter, but increasingly Github is where I find the newest, coolest APIs. I do a lot of searching via Github for API related topics, but increasingly Github topics themselves are becoming more valuable within search engine indexes, making them an easy way to uncover interesting APIs.

I was profiling the market data API Alpha Vantage today, and one of the things I always do when I am profiling an API is conduct a Google, and then secondarily, a Github search for the API’s name. Interestingly, I found a list of Github Topics while Googling for the Alpha Vantage API, uncovering some interesting SDKs, CLIs, and other open source solutions that have been built on top of the financial data API. Showing the importance of operating your API on Github, but also working to define a set of standard Github Topic tags across all your projects, and helping encourage your API community to use the same set of tags, so that their projects will surface as well.
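Here is a small sketch of how I poke at this from the consumption side, searching the Github API for repositories carrying a given topic. The topic value is just an example, and the Accept header is the preview header Github expected for topic support at the time of writing.

```python
# Sketch: find repositories by Github topic using the search API.
# The topic is an example; the Accept header is the topics preview header.
import requests

def repos_for_topic(topic, per_page=10):
    response = requests.get(
        "https://api.github.com/search/repositories",
        params={"q": f"topic:{topic}", "sort": "updated", "per_page": per_page},
        headers={"Accept": "application/vnd.github.mercy-preview+json"},
    )
    response.raise_for_status()
    return [(item["full_name"], item["html_url"]) for item in response.json()["items"]]

for name, url in repos_for_topic("alpha-vantage"):
    print(name, url)
```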

I consider Github to be the most important tool in an API providers toolbox these days. I know as an API analyst, it is where I learn the most about what is really going on. It is where I find the most meaningful signals that allow me to cut through the noise that exists on Google, Twitter, and other channels. Github isn’t just for code. As I mention regularly, 100% of my work as API Evangelist lives within hundreds of separate Github repositories. Sadly, I don’t spend as much time as I should tagging, and organizing projects into meaningful topic areas, but it is something I’m going to be investing in more. Conveniently, I’m doing a lot of profiling of APIs for my partner Streamdata.io, which involves establishing meaningful tags for use in defining real time data stream topics that consumers can subscribe to–making me think a little more about the role Github topics can play.

One of these days I will do a fresh roundup of the many ways in which Github can be used as part of API operations. I’m trying to curate and write stories about everything I come across while doing my work. The problem is there isn’t a single place I can send my readers to when it comes to applying this wealth of knowledge to their operations. The first step is probably to publish Github as its own research area on Github (mind blown), as I do with my other projects. It has definitely risen up in importance, and can stand on its own alongside the other areas of my work. Github plays a central role in almost every stop along the API life cycle, and deserves its own landing page when it comes to my API research, and priority when it comes to helping API providers understand what they should be doing on the platform to help make their API operations more successful.


A Summary Of AWS API Gateway As An API Deployment and Management Solution

I was providing an overview of Kong, AWS API Gateway, and other solutions for a team I’m advising a couple weeks back. I was just looking to distill down some of the key features, and provide an overview to a large, distributed team. This work lends itself well to publishing here on the blog, so I published an overview of Kong yesterday, and today I wanted to publish the summary of the AWS API Gateway. The API gateway solution from AWS has some overlap with what Kong delivers, but I consider it to be more of an API deployment, as well as an API management gateway.

The AWS API Gateway brings API deployment front and center, allowing you to define and deploy APIs that are wired up to your backend (AWS) infrastructure:

  • API Endpoint - A host name of the API. The API endpoint can be edge-optimized or regional, depending on where the majority of your API traffic originates from. You choose a specific endpoint type when creating an API.
  • Backend Endpoint - A backend endpoint is also referred to as an integration endpoint and can be a Lambda function, an HTTP webpage, an AWS service action, or a mock interface.
  • Swagger / OpenAPI - Using Swagger to import and export API configuration and definitions.

Then the gateway brings a wealth of API management features, providing a look at how it has been baked into the cloud now:

  • Accounts - Creation and management of accounts.
  • Keys - Creation and management of API keys.
  • Certificates - Adding and management of certificates.
  • Documentation - Publishing of API documentation.
  • Domains - Mapping of domains.
  • Response - Custom Gateway responses.
  • Models - Management of schema models.
  • Validation - Validation of API requests.
  • SDK Generation - Generation of client SDKs.
  • Staging - Establishing of stages.
  • Tags - Tagging of resources.
  • Templates - Mapping templates used to transform a payload.
  • Plans - Establishing of different plans for API usage.
  • VPC - Usage of a VPC under the caller’s account in a region.
  • Regions - Deployment of gateways in different AWS regions.
  • Serverless - Usage of Lambda for serverless integration.
  • Logging - Logging using CloudWatch.
  • IAM - You can use AWS administration and security tools, such as AWS Identity and Access Management (IAM) and Amazon Cognito, to authorize access to your APIs.

Then of course, as with everything AWS, there are two separate programmatic interfaces for working with all of this (a minimal sketch follows the list below):

  • API - Programmatic access through hypermedia API.
  • Command Line - The AWS CLI is an open source tool built on top of the AWS SDK for Python that provides commands for interacting with AWS services.
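Here is that minimal sketch of the programmatic side, using boto3 to list the REST APIs, usage plans, and API keys in an account. It assumes AWS credentials are already configured locally.

```python
# Minimal sketch: list REST APIs, usage plans, and API keys via AWS API Gateway
# using boto3. Assumes AWS credentials are already configured locally.
import boto3

apigateway = boto3.client("apigateway")

for api in apigateway.get_rest_apis(limit=25)["items"]:
    print("API:", api["id"], api["name"])

for plan in apigateway.get_usage_plans(limit=25)["items"]:
    print("Usage plan:", plan["id"], plan.get("name"))

for key in apigateway.get_api_keys(limit=25)["items"]:
    print("API key:", key["id"], key.get("name"))
```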

AWS API Gateway doesn’t have some of the bells and whistles associated with other leading API management solutions, however it makes up for this with its API deployment capabilities–answering the age old question of which API management solution will help me deploy my APIs. If you are operating your infrastructure within AWS, then AWS API Gateway makes a lot of sense. The connectivity it brings to the table is hard to ignore. What really sold it for me is the IAM part of the equation. Before using AWS, I never had fine grained policies for what backend systems my APIs could or could not access.

I avoided AWS API Gateway for a while. I was waiting for it to mature, and looking for enough benefit to get me beyond my vendor lock-in fears with API infrastructure. The IAM and Serverless aspects of delivering APIs are the features that pushed me to the point where I’m now using it for about 50% of my API infrastructure. It isn’t as portable, and versatile as solutions like Kong or Tyk are, but it does provide a solid set of API deployment and management features for me to put to work on projects that are already running in the AWS cloud.


Your Microservices Effort Will Fail Because You Will Never Decouple Your Business

I’m regularly surprised by companies who are doing microservices but are failing to see the need to change organizational culture, believing that microservices will be some magic voodoo that fixes all their legacy technical debt. That simply decoupling and breaking down the technology, without any re-evaluation of the business and politics behind it, will fix everything, and set the company, organization, institution, or government agency on a more positive trajectory. In coming years, we will continue to hear stories about why microservices do not work, from endless waves of groups who were unable to do the hard work to decouple, and reorganize the operations behind the services they provide.

The monolith legacy systems I’m seeing targeted are widely seen as purely technology, which is why it is often labeled as technical debt. What is missing from this targeting and labeling is any acknowledgement of the people and decisions behind the monolith. The years of business, political, and cultural investment into the monolith. How will we ever unwind, or properly address the monolith, if we do not see the organizational, human, and business aspects of why it exists in the first place? Are we talking about the business decisions that went into creating and perpetuating the monolith? It is highly likely we will be making some of the same decisions with microservices, which could end up being worse than when we made them with a single system. A distributed mess is often more painful than a consolidated mess.

I’m seeing endless waves of large organizations mandating that their teams invest in microservices, with no mandate for microteams, microbudgets, microdecisionmaking, or any of the other decoupling needed to make microservices truly work independently. I attach micro as a joke. I really don’t feel micro is the concept that needs applying when it comes to services, or the business and organizational mechanisms behind them. However, it is the word du jour, and one that gets at some of the illnesses our organizations are facing in 2018. In reality, it is more about decoupling and decomposing the technology, business, and politics of our operations into meaningful units that can be deployed, operated, and deprecated independently of each other.

My point is that your microservices effort will fail if you aren’t addressing the business side of the equation. If your microservices team(s) still exist within your legacy organizational structure, you really haven’t decoupled or decomposed anything. The old ways of making decisions, and dealing with budget impacts, will still reflect what happened with the previous monolith. Your technology will be independently operating, but still beholden to the same ways of deciding and funding what actually happens on the ground. The result will resemble having an entirely new motor, but running without lubricant, or possibly with old, thick, expired lubricant that prevents your new motor from ever delivering at full capacity, and eventually breaking down in ways you never imagined while operating your existing monolith.


A Summary Of Kong As An API Management Solution

I was breaking down what the API management solution Kong delivers for a customer of mine, and I figured I’d take what I shared via the team portal, and publish here on the blog. It is an easy way for me to create content, and make my consulting work more transparent here on the blog. I am using Kong as part of several healthcare and financial projects currently, and I am actively employing it to ensure customers are properly managing their APIs. I wasn’t the decision maker on any of these projects when it came to choosing the API management layer, I am just the person who is helping standardize how they are using API services and tooling across the API life cycle for these projects.

First, Kong is an open source API management solution with an easy to install community edition, and enterprise level support when needed. They provide an admin interface, and developer portal for the API management proxy, but there is also a growing number of community projects like KongDash and Konga emerging, making it a much richer ecosystem. And of course, Kong has an API for managing the API management layer, as every API service and tooling provider should have.

Now, let’s talk about what Kong does to help with deploying your APIs (a rough sketch of the Admin API calls follows these feature lists):

  • API Routing - The API object describes an API that’s being exposed by Kong. Kong needs to know how to retrieve the API when a consumer is calling it from the Proxy port. Each API object must specify some combination of hosts, uris, and methods
  • Consumers - The Consumer object represents a consumer - or a user - of an API. You can either rely on Kong as the primary datastore, or you can map the consumer list with your database to keep consistency between Kong and your existing primary datastore.
  • Certificates - A certificate object represents a public certificate/private key pair for an SSL certificate.
  • Server Name Indication (SNI) - An SNI object represents a many-to-one mapping of hostnames to a certificate.

Then it focuses on the core aspects of what is needed to help manage your APIs:

  • Authentication - Protect your services with an authentication layer.
  • Traffic Control - Manage, throttle, and restrict inbound and outbound API traffic.
  • Analytics - Visualize, inspect, and monitor APIs and microservice traffic.
  • Transformations - Transform requests and responses on the fly.
  • Logging - Stream request and response data to logging solutions.

After that, it has a bunch of added features to help make it a scalable, evolvable solution:

  • DNS-based loadbalancing - When using DNS based load balancing the registration of the backend services is done outside of Kong, and Kong only receives updates from the DNS server.
  • Ring-balancer - When using the ring-balancer, the adding and removing of backend services will be handled by Kong, and no DNS updates will be necessary.
  • Clustering - A Kong cluster allows you to scale the system horizontally by adding more machines to handle more incoming requests. They will all share the same configuration since they point to the same database. Kong nodes pointing to the same datastore will be part of the same Kong cluster.
  • Plugins - lua-nginx-module enables Lua scripting capabilities in Nginx. Instead of compiling Nginx with this module, Kong is distributed along with OpenResty, which already includes lua-nginx-module. OpenResty is not a fork of Nginx, but a bundle of modules extending its capabilities.
  • API - Administrative API access for programmatic control.
  • CLI Reference - The provided CLI (Command Line Interface) allows you to start, stop, and manage your Kong instances. The CLI manages your local node (as in, on the current machine).
  • Serverless - Invoke serverless functions via APIs.
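And here is the rough Admin API sketch I mentioned above: registering an API object, enabling key-auth, and provisioning a consumer with a key. The exact paths and fields vary between Kong versions, and the service names are hypothetical, so treat this as illustrative rather than copy and paste ready.

```python
# Rough sketch: register an API, enable key-auth, and create a consumer with a
# key via the Kong Admin API (default port 8001). Paths and fields vary between
# Kong versions; names are hypothetical.
import requests

ADMIN = "http://localhost:8001"

# Register an API object that proxies to a backend service.
requests.post(f"{ADMIN}/apis", data={
    "name": "patient-records",              # hypothetical service name
    "hosts": "records.example.com",
    "upstream_url": "http://backend.internal:3000",
}).raise_for_status()

# Protect it with the key-auth plugin.
requests.post(f"{ADMIN}/apis/patient-records/plugins",
              data={"name": "key-auth"}).raise_for_status()

# Create a consumer and provision a key for them.
requests.post(f"{ADMIN}/consumers", data={"username": "example-partner"}).raise_for_status()
key = requests.post(f"{ADMIN}/consumers/example-partner/key-auth", data={}).json()
print("Provisioned key:", key.get("key"))
```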

There are a number of API management solutions available out there today. I will profile each one I am actively using as part of my work on the ground. I’m agnostic towards which provider my clients should use, but I like having the details about what features they bring to the table readily available via a single URL, so that I can share them when these conversations come up. I have many API management solutions profiled as part of my API management research, but in 2018 there are just a handful of clear leaders in the game. I’ll be focusing on the ones who are still actively investing in the API community, and the ones I have an existing relationship with in a partnership capacity. Streamdata.io is a reseller of Kong in France, making it something I’m actively working with in the financial space, and I’m also using it within the federal government, bringing it front and center for me in the United States.

If you have more questions about Kong, or any other API management solution, feel free to reach out, and I’ll do my best to answer any questions. We are also working to provide more API life cycle, strategy, and governance services along with my government API partners at Skylight, and through my mainstream API partners at Streamdata.io. If you need help understanding the landscape and where API management solutions like Kong fit in, my partners and I are happy to help out–just let us know.


Aggregating Multiple Github Account RSS Feeds Into Single JSON API Feed

Github is the number one signal in my API world. The activity that occurs via Github is more important than anything I find across Twitter, Facebook, LinkedIn, and other social channels. Commits to repositories and the other social activity that occurs around coding projects is infinitely more valuable, and more telling regarding what a company is up to, than the deliberate social media signals blasted out via other channels. I’m always working to dial in my monitoring of Github using the Github API, but also via the RSS feeds that are present on the public side of the platform.

I feel RSS is often overlooked as an API data source, but I find that RSS is not only alive and well in 2018, it is something that is actively used on many platforms. The problem with RSS for me is that the XML isn’t always easy to work with in many of my JavaScript enabled applications, and I also tend to want to aggregate, filter, and translate RSS feeds into more meaningful JSON. To help me accomplish this for Github, I crafted a simple PHP RSS aggregator and converter script which I can run in a variety of situations. I published the basic script to Github as a Gist, for easy reference.

The simple PHP script just takes an array of Github users, loops through them, pulls their RSS feeds, and then aggregates them into a single array, sorts by date, and then outputs as JSON. It is a pretty crude JSON API, but it provides me with what I need to be able to use these RSS feeds in a variety of other applications. I’m going to be mining the feeds for a variety of signals, including repo and user information, which I can then use within other applications. The best part is this type of data mining doesn’t require a Github API key, and is publicly available, allowing me to scale up much further than I could with the Github API alone.
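
For anyone who wants to riff on the idea, here is a stripped down sketch of the kind of aggregator I am describing. It pulls the public Atom feed Github publishes for each user at https://github.com/{user}.atom, and the usernames are just placeholders, so swap in whoever you want to monitor.

    <?php
    // Minimal sketch: aggregate a handful of Github user feeds into one JSON feed.
    // Requires allow_url_fopen so simplexml_load_file() can fetch remote URLs.

    $users = ['octocat', 'defunkt']; // placeholder list of Github users
    $aggregated = [];

    foreach ($users as $user) {
        $feed = @simplexml_load_file("https://github.com/{$user}.atom");
        if ($feed === false) {
            continue; // skip any feed that cannot be pulled
        }
        foreach ($feed->entry as $entry) {
            $aggregated[] = [
                'user'    => $user,
                'title'   => trim((string) $entry->title),
                'link'    => (string) $entry->link['href'],
                'updated' => (string) $entry->updated,
            ];
        }
    }

    // Sort the combined entries by date, newest first.
    usort($aggregated, function ($a, $b) {
        return strtotime($b['updated']) - strtotime($a['updated']);
    });

    header('Content-Type: application/json');
    echo json_encode($aggregated, JSON_PRETTY_PRINT);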

Next, I have a couple of implementations in mind. I’m going to be creating a Github user leaderboard, where I stream the updates using Streamdata.io to a dashboard. Before I do that, I will have to aggregate users and repos, incrementing each commit made, and publishing as a separate JSON feed. I want to be able to see the raw updates, but also just the most changed repositories, and most active users across different segments of the API space. Streamdata.io allows me to take these JSON feeds and stream them to the dashboard using Server-Sent Events (SSE), then applies each update using JSON Patch. Making for a pretty efficient way to put Github to work as part of my monitoring of activity across the API space.


Having A Developer.[YourDomain] Is A Clear Differentiator In The API Game

I am profiling US, UK, French, and German banks as part of some research I am doing for Streamdata.io. I am profiling how far along in the API journey these banks are, and one clear differentiator for me is whether a bank has a developer.[bankdomain] subdomain set up for their APIs or not. The banks that have a dedicated subdomain for their API operations have a clear lead over those who do not. The domain doesn’t do much all by itself, but it is clear that when a bank can get this decision made, many of the other decisions that need to be made are also happening in tandem.

This isn’t unique just to banking. This is something I’ve written about several times over the years, and remains constant after looking at thousands of APIs over the last eight years. When a company’s API presence exists within the help section of their website, the API is almost always secondary to the core business. When a company relies on a 3rd party service for their API and developer presence, it almost always goes dormant after a couple months, showing that APIs are just not a priority within the company. Having a dedicated subdomain, landing page, and set of resources dedicated to doing APIs goes a long way towards ensuring an API program gains the momentum it needs to be successful within an organization, and industry.

I know that having a dedicated subdomain for API operations seems like a small thing to many folks. However, it is one of the top symptoms of a successful API in my experience. Making data, content, and algorithms available in a machine readable way for use in other applications by 3rd parties via the web is something every company, organization, institution, and government agency should be doing in 2018. It is the next iteration of the web, and is not something that should be a side project. Having a dedicated subdomain demonstrates that you understand this, and an API won’t just be the latest trend at your organization. Even if your APIs are entirely private in the beginning, having a public portal for your employees, partners, and other stakeholders will go a long way towards helping you get the traction you are looking for in the API game.


Streaming And Event-Driven Architecture Represents Maturity In The API Journey

Working with Streamdata.io has forced a shift in how I see the API landscape. When I started working with their proxy I simply saw it as being about doing APIs in real time. I was hesitant because not every API had real time needs, so I viewed what they do as just a single tool in my API toolbox. While Server-Sent Events, and proxying JSON APIs is just one tool in my toolbox, like the rest of the tools in my toolbox it forces me to think through what an API does, and understand where it exists in the landscape, and where the API provider exists in their API journey. Something I’m hoping the API providers are also doing, but I enjoy doing from the outside-in as well.

Taking any data, content, media, or algorithm and exposing it as an API is a journey. It is about understanding what that resource is, what it does, and what it means to the provider and the consumer. What this looks like day one, will be different from what it looks like day 365 (hopefully). If done right, you are engaging with consumers, and evolving your definition of the resource, and what is possible when you apply it programmatically through the interfaces you provide. API providers who do this right are leveraging the feedback loops they have in place with consumers, iterating on their APIs, as well as the resources they provide access to, and improving upon them.

Just doing simple web APIs puts you on this journey. As you evolve along this road you will begin to also apply other tools. You might have the need for webhooks to start responding to meaningful events that are beginning to emerge across the API landscape, and start doing the work of defining your event-driven architecture, developing lists of most meaningful topics, and events that are occurring across your evolving API platform. Webhooks provide direct value by pushing data and content to your API consumers, but they have indirect value in helping you define the event structure across your very request and response driven resource landscape. Look at Github webhook events, or Slack webhook events to understand what I mean.

API platforms that have had webhooks in operation for some time have matured significantly toward an event-driven architecture. Streaming APIs isn’t simply a boolean thing, where you either have data that needs to be streamed, or you don’t. That is the easy, lazy way of thinking about things. Server-Sent Events (SSE) isn’t just something you need, or you don’t. It is something that you are ready for, or you aren’t. Like webhooks, I’m seeing Server-Sent Events (SSE) as having the direct benefits of delivering data and content as it is updated, to the browser or for other server uses. However, I’m beginning to see the other indirect benefits of SSE, and how it helps define the real time nature of a platform–what is real time? It also helps you think through the size, scope, and efficiency surrounding the use of APIs for making data, content, and algorithms available via the web. Helping us think through how and when we are delivering the bits and bytes we need to get business done.

I’m learning a lot by applying Streamdata.io to simple JSON APIs. It is adding another dimension to the API design, deployment, and management process for me. There has always been an evolutionary aspect of doing APIs for me. This is why you hear me call it the API journey on a regular basis. However, now that I’m studying event-driven architecture, and thinking about how tools like webhooks and SSE assist us in this journey, I’m seeing an entirely new maturity layer for this API journey emerge. It goes beyond just getting to know our resources as part of the API design, and deployment process. It builds upon API management and monitoring and helps us think through how our APIs are being consumed, and what the most meaningful and valuable events are. Helping us think through how we deliver data and content over the web in a more precise manner. It is something that not every API provider will understand right away, and only those a little further along in their journey will be able to take advantage of. The question is, how do we help others see the benefits, and want to do the hard work to get further along in their own API journey?


More Outputs Are Better When It Comes To Establishing An API Observability Ranking

I’ve been evolving an observability ranking for the APIs I track on for a couple years now. I’ve been using the phrase to describe my API profiling and measurement approach since I first learned about the concept from Stripe. There are many perspectives floating around the space about what observability means in the context of technology, however mine is focused completely on APIs, and is more about communicating with external stakeholders, more than it is just about monitoring of systems. To recap, the Wikipedia definition for observability is:

Formally, a system is said to be observable if, for any possible sequence of state and control vectors, the current state can be determined in finite time using only the outputs (this definition is slanted towards the state space representation). Less formally, this means that from the system’s outputs it is possible to determine the behavior of the entire system. If a system is not observable, this means the current values of some of its states cannot be determined through output sensors.

Most of the conversations occurring in the tech sector are focused on monitoring operations, and while this is a component of my definition, I lean more heavily on the observation occurring beyond just internal groups, and observability being about helping partners, consumers, regulators, and other stakeholders stay more aware regarding how complex systems work, or do not work. I feel that observability is critical to the future of algorithms, and making sense of how technology is impacting our world, and APIs will play a critical role in ensuring that the platforms have the external outputs required for delivering meaningful observability.

When it comes to quantifying the observability of platforms and algorithms, the more outputs available the better. Everything should have APIs for determining the inputs and outputs of any algorithm, or other system, but there should also be operational level APIs that give insight into the underlying compute, storage, logging, DNS, and other layers of delivering technological solutions. There should also be higher level business layer APIs surrounding communication via blog RSS, Twitter feeds, and support channels like email, ticketing, and other systems. The more outputs around platform operations there are, the more we can measure, quantify, and determine how observable a platform is using the outputs that already exist for ALL the systems in use across operations.

To truly measure the observability of a platform I need to be able to measure the technology, business, and politics surrounding its operation. If communication and support exist in the shadows, a platform is not observable, even if there are direct platform APIs. If you can’t get at the operational layer like logging, or possibly Github repositories used as part of continuous integration or deployment pipelines, observability is diminished. Of course, not all of these outputs should be publicly available by default, but in many cases, there really is no reason they can’t be. At a minimum there should be API access, with some sort of API management layer in place, allowing for 3rd party auditors, and analysts like me to get at some, or all of the existing outputs, allowing us to weigh in on an overall platform observability workflow.

As I continue to develop my API observability ranking algorithm, the first set of values I calculate are the number of existing outputs an API has. Taking into consideration the scope of core and operational APIs, but also whether I can get at operations via Twitter, Github, LinkedIn, and other 3rd party APIs. I find many of these channels more valuable for understanding platform operations, than the direct APIs themselves. Chatter by employees, and commits via Github can provide more telling signals about what is happening, than the intentional signals emitted directly by the platform itself. Overall, the more outputs available the better. API observability is all about leveraging existing outputs, and when companies, organizations, institutions, and government agencies are using solutions that have existing APIs, they are more observable by default, which can have a pretty significant impact in helping us understand the impact a technological solution is having on everyone involved.
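
To make the output counting a little more concrete, here is a purely hypothetical sketch of how a first pass tally might work. The channels and weights are just examples I made up for illustration, and not the actual values behind my ranking.

    <?php
    // Hypothetical first pass tally: count the outputs a platform exposes,
    // weighting operational outputs a little higher than social channels.

    $weights = [
        'core_apis'        => 3,
        'operational_apis' => 3,
        'status_page'      => 2,
        'github'           => 2,
        'blog_rss'         => 1,
        'twitter'          => 1,
        'support_channel'  => 1,
    ];

    function observabilityScore(array $outputs, array $weights)
    {
        $score = 0;
        foreach ($weights as $output => $weight) {
            if (!empty($outputs[$output])) {
                $score += $weight;
            }
        }
        return $score;
    }

    // Example profile built from the outputs I was able to find for a platform.
    $platform = ['core_apis' => true, 'github' => true, 'blog_rss' => true];
    echo observabilityScore($platform, $weights); // 6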


Labeling Your High Usage APIs and Externalizing API Metrics Within Your API Documentation

I am profiling a number of market data APIs as part of my research with Streamdata.io. As I work my way through the process of profiling APIs I am always looking for other interesting ideas for stories on API Evangelist. One of the things I noticed while profiling Alpha Vantage, was that they highlighted their high usage APIs with prominent, very colorful labels. One of the things I’m working to determine in this round of profiling is how “real time” APIs are, or aren’t, and the high usage label adds another interesting dimension to this work.

While reviewing API documentation it is nice to have labels that distinguish APIs from each other. Alpha Vantage has a fairly large number of APIs so it is nice to be able to focus on the ones that are used the most, and are more popular. For example, as part of my profiling I focused on the high usage technical indicator APIs, rather than profiling all of them. I need to be able to prioritize my work, and these labels helped me do that. Providing one example of the benefit that these types of labels can bring to the table. I’m guessing that there are many other benefits to labeling popular APIs, beyond just saving me time.

This type of labeling is an interesting way of externalizing API analytics in my opinion. Which is another interesting concept to think about across API operations. How can you take the most meaningful data points across your API management processes, and distill them down, externalize and share them so that your API consumers can benefit from valuable API metrics? In this context, I could see a whole range of labels that could be established, applied to interactive documentation using OpenAPI tags, and made available across API documentation, helping make APIs even more dynamic, and in sync with how they are actually being used, measured, and making an impact on operations.
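
As a rough sketch of what this could look like, here is a hypothetical script that takes usage counts from an API management layer and stamps a high-usage tag onto the matching operations in an OpenAPI definition. The file names, paths, and threshold are all made up for the sake of the example.

    <?php
    // Hypothetical sketch: label high usage operations in an OpenAPI definition
    // using call counts pulled from an API management layer.

    $openapi = json_decode(file_get_contents('openapi.json'), true);

    // Imaginary call counts per path, as they might come out of analytics.
    $usage = [
        '/stock/daily'    => 1250000,
        '/stock/intraday' => 890000,
        '/indicators/sma' => 4200,
    ];

    $threshold = 100000; // calls per month, picked arbitrarily

    foreach ($openapi['paths'] as $path => &$operations) {
        if (($usage[$path] ?? 0) < $threshold) {
            continue;
        }
        foreach ($operations as $method => &$operation) {
            if (!in_array($method, ['get', 'post', 'put', 'patch', 'delete'])) {
                continue; // skip path level keys like parameters
            }
            $operation['tags'][] = 'high-usage';
        }
    }
    unset($operations, $operation);

    file_put_contents('openapi.labeled.json', json_encode($openapi, JSON_PRETTY_PRINT));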

I’m a big fan of making API documentation even more interactive, alive, and meaningful to API consumers. I’m thinking that tagging and labeling is how we are going to do this in the future. Generating a very visual, but also semantic layer of meaning that we can overlay in our API documentation, making them even more accessible by API consumers. I know that Alpha Vantage’s high usage labels have saved me significant amounts of work, and I’m sure there are other approaches that could continue delivering in this way. It is something I’m keeping a close eye on in this increasingly event-driven API landscape, where API integration is becoming more dynamic and real time.


Be Clear About Your API Pricing

I’m profiling a large number of APIs right now, and I am ranking APIs based upon how easy or difficult they are to access. Whether or not an API provider has a business model is part of the ranking, and how clearly articulated the access and pricing is around that model is a critical part of my profiling algorithm. The APIs that end up included in the API gallery I’m developing for Streamdata.io, and available as part of my wider API Stack research will all have to possess easy to articulate access levels. Not all of them will be free, but the ones that cost money will have straightforward pricing that can be articulated in a single sentence–something that seems to be elusive with many of the API providers I am profiling.

I am regularly confused regarding the myriad of ways in which API providers obfuscate the pricing for their APIs. I’ve long been wary of API providers who don’t have a clear business model, but when they have a pricing page and do not consistently apply it to their APIs, I’m just left confounded. I can’t always tell if it is done maliciously, or they just haven’t approached their API through an external lens. If I find a pricing page, and the plans seem reasonable, and I’ve plugged my credit card in, but then I still don’t have access to some APIs, and there is no clear labeling of which APIs I have access to as part of my plan, I just can’t spend the afternoon testing and seeing which APIs return a 403 to understand the landscape. The API service composition and pricing tiers need to be coherent and front and center, otherwise I just have to move on. If I can’t communicate what is going on to others, it won’t be included in my work.

I do not have a problem with different tiers of access, as long as they are communicated, and information about them is accessible. I won’t complain when some APIs are out of reach to me, and placed in premium tiers–I’ll just pass that information on to my readers. However, if I have to do some sort of secret handshake, or call some special sales hotline to understand what is going on, in my experience there are usually other illnesses occurring behind the scene, and I’m pretty well conditioned to just move on. If you have a publicly available API, be clear about your pricing. Even if I need approval for higher levels of usage, or it costs me to gain access to high level tiers. Don’t play games, there are too many APIs out there to mess around with hidden API pricing plans, unless of course you really aren’t interested in folks covering your API, and putting them to use, which I feel like some of these companies are actually hoping occurs.


A Dedicated Guest Blogger Program For Your API

I get endless waves of people wanting to “guest post” on API Evangelist. It isn’t something I’m interested in because of the nature of API Evangelist, and that it really is just my own stream of consciousness, and not about selling any particular product or service. However, if you are an API provider, looking for quality content for your blog, having a formal approach to managing guest bloggers might make sense. Sure, you don’t want to accept all the spammy requests that you will get, but with the right process, you could increase the drumbeat around your API, and build relationships with your partners and API consumers.

There is an example of this in action at the financial data marketplace Intrinio, with their official blogger program. The blogging program for the platform has a set of benchmarks defined by the Intrinio team, establishing the quality required for any post that is accepted as part of the program. What I find really interesting, is that they also offer three months of free access to data feeds for API consumers who publish a post via the platform. “Exceptional” participants in the program may have their free access extended, and ALL participants will receive discounts on paid data access subscriptions via the platform’s APIs.

This is the type of value exchange I like to see via API platforms. Too many APIs are simple one way streets, with consumers paying for GET access to data, content, media, and algorithms. API management shouldn’t be just about metering one way access and charging for it. Sensible API management should measure value exchange around ALL platform resources, including blog and forum posts, and other activities API providers should be incentivizing via their platforms. This is one of the negative side effects of REST I feel–too much focus on resources, and not enough on the events that occur around these resources. Something we are beginning to move beyond in an event-driven API landscape.

Next, I will be profiling the concept of having dedicated data partner programs for your API platform. Showcasing how your API consumers can submit their own data APIs for resell alongside your own resources. In my opinion, every API platform should be opening up every resource for GET, POST, PUT, and DELETE, as well as allow for the augmenting, aggregation, enrichment, and introduction of other data, content, media, and algorithms, to add more value to what is already going on. Opening up a dedicated guest blogger program modeled after Intrinio’s is a good place to start. Learning about how to set up guidelines and benchmarks for submission, and evolving your API management to allow for incentivizing of participation. Once you get your feet wet with the blog, you may want to expand to other resources available via the platform, making your API operations much more of a community effort.


You Have to Know Where All Your APIs Are Before You Can Deliver On API Governance

I wrote an earlier article that basic API design guidelines are your first step towards API governance, but I wanted to introduce another first step you should be taking even before basic API design guides–cataloging all of your APIs. I’m regularly surprised by the number of companies I’m talking with who don’t even know where all of their APIs are. Sometimes, but not always, there is some sort of API directory or catalog in place, but often times it is out of date, and people just aren’t registering their APIs, or following any common approach to delivering APIs within an organization–hence the need for API governance.

My recommendation is that even before you start thinking about what your governance will look like, or even mention the word to anyone, you take inventory of what is already happening. Develop an org chart, and begin having conversations. Identify EVERYONE who is developing APIs, and start tracking on how they are doing what they do. Sure, you want to get an inventory of all the APIs each individual or team is developing or operating, but you should also be documenting all the tooling, services, and processes they employ as part of their workflow. Ideally, there is some sort of continuous deployment workflow in place, but this isn’t a reality in many of the organizations I work with, so mapping out how things get done is often the first order of business.

One of the biggest failures of API governance I see is that the strategy has no plan for how we get from where we are to where we want to be, it simply focuses on where we want to be. This type of approach contributes significantly to pissing people off right out of the gate, making API governance a lot more difficult. Stop focusing on where you want to be for a moment, and focus on where you are. Build a map of where people are, tools, services, skills, best and worst practices. Develop a comprehensive map of where the organization is today, and then sit down with all stakeholders to evaluate what can be improved upon, and streamlined. Beginning the hard work of building a bridge between your existing teams and what might end up being a future API governance strategy.

API design is definitely the first logical step of your API governance strategy, standardizing how you design your APIs, but this shouldn’t be developed from the outside-in. It should be developed from what already exists within your organization, and then begin mapping to healthy API design practices from across the industry. Make sure you are involving everyone you’ve reached out to as part of your inventory of APIs, tools, services, and people. Make sure they have a voice in crafting that first draft of API design guidelines you bring to the table. Without buy-in from everyone involved, you are going to have a much harder time ever reaching the point where you can call what you are doing governance, let alone seeing the results you desire across your API operations.


Riot Games Regional API Endpoints

I’m slowly categorizing all the APIs I find who are offering up some sort of regional availability as part of their operations. With the ease of deployment using leading cloud services, it is something I am beginning to see more frequently. However, there is still a wide variety of reasons why an API provider will invest in this aspect of their operations, and I’m looking to understand more about what these motivations are. Sometimes it is because they are serving a global audience, and latency kills the experience, but other times I’m seeing it is more about the maturity of the API provider, and they have such a large user base that they are getting more requests to deliver resources closer to home.

The most recent API provider I have come across who is offering regional API endpoints is from Riot Games, the makers of League of Legends, who offers twelve separate regions for you to choose from, broken down using a variety of regional subdomains. The Riot Games API provides a wealth of meta data around their games, and while they don’t state their reasons for providing regional APIs, I’m guessing it is to make sure the meta data is localized to whichever country their customers are playing in. Reducing latency across networks, making the overall gaming and supporting application experience as smooth and seamless as possible. Pretty standard reasons for doing regional APIs, and providing a simple example of how you do this at the DNS level.

Riot Games also provides a regional breakdown of the availability of their regional endpoints on their API status page, adding another dimension to the regional API delivery conversation. If you are providing regional APIs, you should be monitoring them, and communicating this to your consumers. This is all pretty standard stuff, but I’m working to document every example of regional APIs I come across as part of my research. I’m considering adding a separate research area to track on the different approaches so I can publish a guide, and supporting white papers when I have enough information organized. All part of my work to understand how the API business operates, and is expanding. Showcasing how the leaders are delivering resources via APIs in a scalable way.


Consistency in Branding Across API Portals

I recently watched a BBC documentary about the history of the branding used as part of the London Underground. I’m pretty absorbed lately with using public transit as an analogy for complex API implementations, and moving beyond just using subway maps, I thought the branding strategy for the London Underground provided other important lessons for API providers. The BBC documentary went into great detail regarding how much work was put into standardizing the font, branding, and presentation of information across the London Underground, to help reduce confusion, help riders get where they needed to go, and make the city operate more efficiently.

As I continue to study the world of API documentation, I think we have so much work ahead of us when it comes to standardizing how we present our API portals. Right now every API portal is different, even often times with multiple portals from the same company–see Amazon Web Services for example. I think we underestimate the damage this does to the overall API experience for consumers, and it is why we see API documentation tools like Swagger UI, Slate, and Read the Docs have such an impact. However this is just documentation, and we need this to occur as part of the wider API portal user experience. I’ve seen some standardized open source API portal solutions, and there are a handful of API portal services out there, but there really is no standard for how we deliver, brand, and operate the wider API experience.

I have my minimum viable API portal definition, and have been tracking on the common building blocks of API operations for eight years now, but there are no “plug and play” solutions that users can implement, following any single approach. I have the data, and I even have a simple Twitter Bootstrap version of my definition (something I’m upgrading ASAP), but in my experience people get very, very, very hung up on the visual aspects of this conversation, want different visual elements, and quickly get lost on the functional details. I’m working with my partners APIMATIC to help standardize their portal offering, but honestly it is something that needs to be wider than just me, and any single provider. It is something that needs to emerge as a common API portal standard. If we can bring this into focus, I think we will see API adoption significantly increase, reducing much of the confusion we all face getting up and running with any new API.


Keeping API Schema Simple For Wider Adoption

One aspect of my talk at APIDays Paris this last week included a slide about considering allowing API consumers to negotiate CSV responses from our APIs. Something that would probably NEVER occur to most API providers, and probably would make many even laugh at me. I’m used to it, and don’t care. While not something that every API provider should be considering, depending on the data you are serving up, and who your target API consumers are, it is something that might make sense. Allowing for the negotiation of CSV responses represents lowering the bar for API consumption, and widening the audience who can put our APIs to work.

I was doing more work around public data recently, and was introduced to an interesting look at some lessons from developing open data standards. I’m doing a deep dive into municipal data lately as part of my partnership with Streamdata.io, and I found the lessons they published interesting, and something that reflects my stance on API content negotiation.

From the development and maintenance of the API, it quickly became clear that adjusting scripts after every election (and by-election) and website modification, was quickly becoming unsustainable. To address this issue, a simple CSV schema was developed to encourage standardisation of this data from the outset. The schema was designed to be as simple and easy to understand and implement as possible. Comprised of just 21 fields, 7 of which are recommended fields, the schema does not have hierarchical relationships between terms and can be implemented in a single CSV file. By making the standard this simple, we were able to get a number of adopters onboard and outputting their lists of elected representatives on their own open data portals.

When it comes to APIs, simplicity rules. The simpler you can make your API, the more impact you will make. Allowing for the negotiation of CSV responses from your API when possible allows API consumers to go from API to a spreadsheet in just one or two clicks. This is huge when it comes to onboarding business users with the concepts of APIs, and what they do, and allows them to easily put valuable data resources to work in their native environment–the spreadsheet. This is something many API consumers won’t understand, but when it comes to seeking meaningful API adoption, it is something that expands the reach of any API beyond the developer class, putting it within reach of business users.

I am a big fan of pushing our APIs to allow for the negotiation of CSV, XML, and JSON by default, whenever possible. I’m also a fan of delivering richer experiences by allowing for the negotiation of hypermedia media types. While delivering hypermedia takes a significant amount of thought and investment, allowing for the negotiation of CSV, XML, and JSON doesn’t take a lot of work. When delivering your APIs, I highly recommend thinking about who your API consumers are, and whether offering CSV responses might shift the landscape even a little bit, making your valuable data resources a little more usable by business users who won’t necessarily be delivering web or mobile applications.
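
Negotiation like this really does not take much work to deliver. Here is a minimal sketch of what negotiating CSV, XML, and JSON responses off of the same resource might look like, using a couple of made up records standing in for whatever data source you are actually serving.

    <?php
    // Minimal content negotiation sketch: serve the same records as JSON
    // (the default), CSV, or XML depending on the Accept header.

    $records = [
        ['id' => 1, 'name' => 'Facility A', 'city' => 'Portland'],
        ['id' => 2, 'name' => 'Facility B', 'city' => 'Seattle'],
    ];

    $accept = $_SERVER['HTTP_ACCEPT'] ?? 'application/json';

    if (strpos($accept, 'text/csv') !== false) {
        header('Content-Type: text/csv');
        $out = fopen('php://output', 'w');
        fputcsv($out, array_keys($records[0])); // header row
        foreach ($records as $record) {
            fputcsv($out, $record);
        }
        fclose($out);
    } elseif (strpos($accept, 'application/xml') !== false) {
        header('Content-Type: application/xml');
        $xml = new SimpleXMLElement('<records/>');
        foreach ($records as $record) {
            $node = $xml->addChild('record');
            foreach ($record as $key => $value) {
                $node->addChild($key, htmlspecialchars($value));
            }
        }
        echo $xml->asXML();
    } else {
        header('Content-Type: application/json');
        echo json_encode($records);
    }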


API Quota API, Webhooks, and Server-Sent Events (SSE)

I am profiling market data APIs as part of my partnership with Streamdata.io. It is a process I enjoy, because it provides me with a number of interesting stories I can tell here on API Evangelist. Many of the APIs I profile just frustrate me, but there are always the gems who are doing interesting things with their APIs, and understand both providing and consuming APIs. One API that I’ve been profiling, and am able to put to use in my work to build a gallery of real time data APIs, is 1Forge.

1Forge provides dead simple APIs for accessing market data, and surprise!! – you can sign up for a key, and begin making API calls within minutes. It might not sound like that big of a deal, but after going through 25+ APIs, I only have about 5 API keys. I’m working on an OpenAPI definition for 1Forge, so I can begin to poll, and stream the data they make available, including it in the Streamdata.io API gallery I’m building. However, as I was getting up and running with the API, I noticed their quota endpoint, which allows me to check my usage quota with the 1Forge API–something that I thought was story worthy.

The idea of an endpoint to check my application’s usage quota for an API seems like a pretty fundamental concept, but sadly it is something I do not see very often. It is something that should be default for ALL APIs, but additionally I’d like to see a webhook for it, letting me know when my API consumption reaches different levels. Since I’m talking about Streamdata.io, it would also make sense to offer a Server-Sent Event (SSE) for the API quota endpoint, allowing me to bake the usage quota for all the APIs I depend on into a single API dashboard–streaming real time usage information across the APIs I depend on, and maybe displaying things in RED when I reach certain levels.

An API quota API is useful for when you depend on a single API, but is something that becomes almost critical when it comes to depending on many APIs. This is one of those APIs that API providers are going to need to realize has to be present by default for their API platforms. It is something that can keep us humans in tune with our consumption, but more importantly can help us programmatically manage our API consumption, and adjust our polling frequency automatically as we reach the limits of our API access tier, or even upgrade as we realize our rate limit constraints are too tight for a specific application. I’m going to add an API quota API to my list of default administrative APIs that API providers should be offering. Updating the default set of resources we should have available for ALL APIs we are operating.
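
To help illustrate what I mean, here is a hypothetical sketch of the kind of response a default quota API might return, so consumers can adjust their polling programmatically. The plan name, limits, and numbers are all inventions of mine for the sake of the example, not a reflection of the 1Forge quota endpoint.

    <?php
    // Hypothetical quota endpoint: report how much of the current plan an
    // application has consumed. A companion webhook might fire as percent_used
    // crosses 75, 90, and 100.

    // In a real platform these numbers would come from the API management layer.
    $limit = 10000;
    $used  = 7421;

    $quota = [
        'plan'         => 'starter',
        'limit'        => $limit,
        'used'         => $used,
        'remaining'    => $limit - $used,
        'percent_used' => round(($used / $limit) * 100, 1),
        'resets_at'    => gmdate('c', strtotime('first day of next month')),
    ];

    header('Content-Type: application/json');
    echo json_encode($quota, JSON_PRETTY_PRINT);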


The More We Know About You The More API Access You Get

I’ve been trash talking APIs that identify me as part of some sort of sales funnel, and automate the decision around whether or not I get access to their API. My beef isn’t with API providers profiling me and making decisions about how much access I get, it is about profiles so limiting that I do not get access to their APIs at all. Their narrow definitions of the type of API consumers they are seeking do not include me, even though I have thousands of regular readers of my blog who do fit their profile. In the end, it is their loss, not mine, that they do not let me in, but the topic is still something I feel should be discussed out in the open, hopefully expanding the profile definitions for some API providers who may not have considered the bigger picture.

I’ve highlighted the limiting profiling of API consumers that prevents access to APIs, but now I want to talk about how profiling can be sensibly used to limit access to API resources. Healthy API management always has an entry level tier, but what tiers are available after that often depend on a variety of other data points. One thing I see API providers regularly doing is requiring API consumers to provide more detail about who they are and what they are doing with an API. I don’t have any problem with API providers doing this, making educated and informed decisions regarding who an API consumer is or isn’t. As the API Evangelist I am happy to share more data points about me to get more access. I don’t necessarily want to do this to sign up for your entry level access tier, just so I can kick the tires, but if I’m needing deeper access, I am happy to fill out a fuller profile of myself, and what I am working on.

Stay out of my way when it comes to getting started and test driving your APIs. However, it is perfectly acceptable to require me to disclose more information, require me to reach out and connect with your team, and other things that you feel are necessary before giving me wider access to your APIs, and provide me with looser rate limits. I encourage API providers to push on API consumers before you give away the keys to the farm. Developing tiered levels of access is how you do this. Make me round out the CRM entry for my personal profile, as well as my company. Push me to validate who I am, and that my intentions are truly honest. I encourage you to reach out to each one of your API consumers with an honest “hello” email after I sign up. Don’t require me to jump on the phone, or get pushy with sales. However, making sure I provide you with more information about myself, my project, and company in exchange for higher levels of API access is a perfectly acceptable way of doing business with APIs.


Learning About The Headers Used for gRPC over HTTP/2

I am learning more about gRPC and HTTP/2, as part of the recent expansion of my API toolbox. I’m not a huge fan of Protocol Buffers, however I do get the performance gain they introduce, but I am very interested in learning more about how HTTP/2 is being used as a transport. While I’ve been studying how websockets, Kafka, MQTT, and other protocols have left the boundaries of HTTP and are embracing the performance gains available in the pure TCP realm, I’m more intrigued by the next generation of HTTP as a transport.

Part of my learning process is all about understanding the headers available to us in the HTTP/2 realm. I’ve been learning more about the next generation HTTP headers from the gRPC Github repository, which provides details on the request and response headers in play, listed below, followed by an example of what a request might look like on the wire.

HTTP/2 API Request Headers

  • Request-Headers → Call-Definition *Custom-Metadata
  • Call-Definition → Method Scheme Path TE [Authority] [Timeout] Content-Type [Message-Type] [Message-Encoding] [Message-Accept-Encoding] [User-Agent]
  • Method → “:method POST”
  • Scheme → “:scheme” (“http” / “https”)
  • Path → “:path” “/” Service-Name “/” {method name} # But see note below.
  • Service-Name → {IDL-specific service name}
  • Authority → “:authority” {virtual host name of authority}
  • TE → “te” “trailers” # Used to detect incompatible proxies
  • Timeout → “grpc-timeout” TimeoutValue TimeoutUnit
  • TimeoutValue → {positive integer as ASCII string of at most 8 digits}
  • TimeoutUnit → Hour / Minute / Second / Millisecond / Microsecond / Nanosecond
  • Hour → “H”
  • Minute → “M”
  • Second → “S”
  • Millisecond → “m”
  • Microsecond → “u”
  • Nanosecond → “n”
  • Content-Type → “content-type” “application/grpc” [(“+proto” / “+json” / {custom})]
  • Content-Coding → “identity” / “gzip” / “deflate” / “snappy” / {custom}
  • Message-Encoding → “grpc-encoding” Content-Coding
  • Message-Accept-Encoding → “grpc-accept-encoding” Content-Coding *(“,” Content-Coding)
  • User-Agent → “user-agent” {structured user-agent string}
  • Message-Type → “grpc-message-type” {type name for message schema}
  • Custom-Metadata → Binary-Header / ASCII-Header
  • Binary-Header → {Header-Name “-bin” } {base64 encoded value}
  • ASCII-Header → Header-Name ASCII-Value
  • Header-Name → 1*( %x30-39 / %x61-7A / “_” / “-“ / “.”) ; 0-9 a-z _ - .
  • ASCII-Value → 1*( %x20-%x7E ) ; space and printable ASCII

HTTP/2 API Response Headers

  • Response → (Response-Headers *Length-Prefixed-Message Trailers) / Trailers-Only
  • Response-Headers → HTTP-Status [Message-Encoding] [Message-Accept-Encoding] Content-Type *Custom-Metadata
  • Trailers-Only → HTTP-Status Content-Type Trailers
  • Trailers → Status [Status-Message] *Custom-Metadata
  • HTTP-Status → “:status 200”
  • Status → “grpc-status” 1*DIGIT ; 0-9
  • Status-Message → “grpc-message” Percent-Encoded
  • Percent-Encoded → 1*(Percent-Byte-Unencoded / Percent-Byte-Encoded)
  • Percent-Byte-Unencoded → 1*( %x20-%x24 / %x26-%x7E ) ; space and VCHAR, except %
  • Percent-Byte-Encoded → “%” 2HEXDIGIT ; 0-9 A-F
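
The grammar is easier to digest with a concrete request in front of you, so here is what the request headers might look like for a hypothetical call to a SayHello method on a HelloService. The service, authority, and user-agent values are made up, but the header names follow the specification listed above.

    :method: POST
    :scheme: https
    :path: /hello.HelloService/SayHello
    :authority: api.example.com
    te: trailers
    grpc-timeout: 5S
    content-type: application/grpc+proto
    grpc-encoding: gzip
    grpc-accept-encoding: gzip, identity
    user-agent: grpc-php/1.0.0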

I’m enjoying getting down to the nitty gritty details of how HTTP/2 works. I’m intrigued by the multi-directionality of it. Being able to use it just like HTTP/1.1 with simple requests and responses, but also being able to introduce bi-directional API calls, where you can make as many different API calls as you want. I don’t think I will get any time to play with it in the near future. I have way too much work. However, I do like learning about how it is being used, and I think Google is the most forward thinking when it comes to HTTP/2 adoption in the API sector–providing multi-speed APIs in JSON using HTTP/1.1, or Protocol Buffers using HTTP/2.


I Appreciate The Request To Jump On The Phone But I Have Other APIs To Test Drive

Streamdata.io is investing in my API Stack work as we build out their API Gallery of valuable data streaming APIs. I’m powering through hundreds of APIs and using my approach to profiling APIs that I have been developing over the last eight years of operating API Evangelist. I have a large number of APIs to get through, so I don’t have a lot of time to spend on each API. I am quickly profiling and ranking them to quickly identify which ones are worth my time. While there are many elements that get in the way of me actually being able to obtain an API key and begin using an API, one of the more frustrating elements is when API providers require me to jump on the phone with them before I can test drive any APIs.

I’ve encountered numerous APIs that require me to talk to a sales person before I can do anything. I know that y’all think this is savvy. This is how business is done these days, but it just isn’t the way you start conversations with API consumers. Sure, there should be support channels available when I need them, but it SHOULD NOT be the way you begin a conversation with us API consumers. I’ve heard all the reasons possible for why companies feel like they need to do this, and I guarantee that all of them are based upon out of date perspectives around what APIs are all about. Often times they are a byproduct of not having a modern API management solution in place, and a team that lacks a wider awareness of the API sector and how API operations work.

In 2018, I shouldn’t have to talk to you on the phone to understand what your API does, and how it fits into what I’m working on. Most of the time I do not even know what I’m working on. I’m just kicking the tires, seeing what is possible, and considering how it fits into my bigger picture. What good does it do for me to jump on the phone if I don’t even know what I’m working on? I can’t tell you much. You can’t share API responses with me. You will be able to do less than if you just give me access to APIs, and allow me to make API calls. You don’t have to allow me to make too many calls, just a handful to get going. You don’t even have to give me access to ALL the APIs, just enough of them to whet my appetite and help me understand what it is that you do. This is done using modern API management solutions, and service composition. Giving you the control over exactly how much of your resources I will have access to, until I prove myself worthy.

The APIs I come across that require me to jump on a sales call will have to wait until later. I just won’t have the time to evaluate their value, and understand where they fit into my work. Which means they probably won’t ever make it into my project, or any of my storytelling around the work. Which means many of these APIs will not get the free exposure to my readers, helping them understand what is possible. It is just one of many self-inflicted wounds API providers make along the way when they leave their enterprise blinders on, and are too restrictive around their API resources. Sales still has a place in the API game, but the overall API strategy has significantly evolved in the last five years, and is something that is pretty easy to see if you spend time playing with other leading APIs on the market. Demonstrating that these providers probably haven’t done much due diligence about what is out there, which often is just yet another symptom of a poorly run API program, making passing on it probably a good idea.


API Is Not Just REST

This is one of my talks from APIDays Paris 2018. Here is the abstract: The modern API toolbox includes a variety of standards and methodologies, which centers around REST, but also includes Hypermedia, GraphQL, real time streaming, event-driven architecture, and gRPC. API design has pushed beyond just basic access to data, and also can be about querying complex data structures, providing experience rich APIs, real-time data streams with Kafka and other standards, as well as also leveraging the latest algorithms and providing access to machine learning models. The biggest mistake any company, organization, or government agency can make is to limit their API toolbox to be just about REST. Learn about a robust and modern API toolbox from the API Evangelist, Kin Lane.

Diverse Toolbox
After eight years of evangelizing APIs, when I participate in many API conversations, some people still assume I’m exclusively talking about REST as the API Evangelist–when in reality I am simply talking about APIs that leverage the web. Sure, REST is a dominant design pattern I shine a light on, and has enjoyed a significant amount of the spotlight over the last decade, but in reality on the ground at companies, organizations, institutions, and government agencies of all shapes and sizes, I find a much more robust API toolbox is required to get the job done. REST is just one tool in my robust and diverse toolbox, and I wanted to share with you what I am using in 2018.

The toolbox I’m referring to isn’t just about what is needed to equip an API architect to build out the perfect vision of the future. This is a toolbox that is equipped to get us from present day into the future, acknowledging all of the technical debt that exists within most organizations which many are looking to evolve as part of their larger digital transformation. My toolbox is increasingly pushing the boundaries of what I’ve historically defined as an API, and I’m hoping that my experiences will also push the boundaries of what you define as an API, making you ready for what you will encounter on the ground within the organizations you are delivering APIs for.

Application Programming Interface
API is an acronym standing for application programming interface. I do not limit the scope of application in this context to just web or mobile applications. I don’t even limit it to the growing number of device-based applications I’m seeing emerge. For me, application is about applying the digital resources made available via a programmatic interface. I’m looking to take the data, content, media, and algorithms being made available via APIs and apply them anywhere they are needed on the web, within mobile and device applications, or on the desktop, via spreadsheets, digital signage, or anywhere else that is relevant, and sensible in 2018.

API does not mean REST. I’m really unsure how it got this dogmatic association, nor do I care. It is an unproductive legacy of the API sector, and one I’m looking to move beyond. Application programming interfaces aren’t the solution to every digital problem we face. They are about understanding a variety of protocols, messaging formats, and trying to understand the best path forward depending on your application of the digital resources you are making accessible. My API toolbox reflects this view of the API landscape, and is something that has significantly evolved over the last decade of my career, and is something that will continue to evolve, and be defined by what I am seeing on the ground within the companies, organizations, institutions, and government agencies I am working with.

SOAP
I have been working with databases since 1987, so I fully experienced the web services evolution of our industry. During the early years of the web, there was a significant amount of investment into thinking about how we exchanged data across many industries, as well as within individual companies when it came to building out the infrastructure to deliver upon this vision. The web was new, but we did the hard work to understand how we could make data interoperability in a machine readable way, with an emphasis on the messages we were exchanging. Looking back I wish we had spent more time thinking about how we were using the web as a transport, as well as the influence of industry and investment interests, but maybe it wasn’t possible as the web was still so new.

While web services provided a good foundation for delivering application programming interfaces, it may have underinvested in its usage of the web as a transport, and became a victim of the commercial success of the web. The need to deliver web applications more efficiently, and a desire to hastily use the low cost web as a transport quickly bastardized and cannibalized web services, into a variety of experiments and approaches that would get the job done with a lot less overhead and friction. Introducing efficiencies along the way, but also fragmenting our enterprise toolbox in a way which we are still putting back together.

XML & JSON RPC
One of the more fractious aspects of the web API evolution has been the pushback when API providers call their XML or JSON remote procedure call (RPC) APIs RESTful, RESTish, or other mixings of philosophy and ideology, which has proven to be a dogma stimulating event. RESTafarians prefer that API providers properly define their approach, while many RPC providers couldn’t care less about labels, and are looking to just get the job done. Making XML and JSON RPC a very viable approach to doing APIs, something that still persists almost 20 years later.

Amazon Web Services, Flickr, Slack, and other RPC APIs are doing just fine when it comes to getting the job done, despite the frustration, ranting, and shaming by the RESTafarians. It isn’t an ideal approach to delivering programmatic interfaces using the web, but it reflects its web roots, and gets the job done with low cost web infrastructure. RPC leaves a lot of room for improvement, but is a tool that has to remain in the toolbox. Not because I am designing new RPC APIs, but there is no doubt that at some point I will have to be integrating with an RPC API to do what you need to get done in my regular work.

REST at Center
Roy Fielding’s dissertation on representational state transfer, often referred to simply as REST, is an amazing piece of work. It makes a lot of sense, and I feel is one of the most thorough looks at how to use the web for making data, content, media, and algorithms accessible in a machine readable way. I get why so many folks feel it is the RIGHT WAY to do things, and one of the reasons it is the default approach for many API designers and architects–myself included. However, REST is a philosophy, and much like microservices, provides us with a framework to think about how we put our API toolbox to work, but isn’t something that should blind us from the other tools we have within our reach.

REST is where I begin most conversations about APIs, but it doesn’t entirely encompass what I mean every time I use the phrase API. I feel REST has given me an excellent base for thinking about how I deliver APIs, but will slow my effectiveness if I leave my REST blinders on, and let dogma control the scope of my toolbox. REST has shown me the importance of the web when talking about APIs, and will continue to drive how I deliver APIs for many years. It has shown me how to structure, standardize, and simplify how I do APIs, and help my applications reach as wide an audience as possible, using commonly understood infrastructure.

Negotiating CSV
As the API Evangelist, I work with a lot of government, and business users. One thing I’ve learned working with this group is the power of using comma separated values (CSV) as a media type. I know that us developers and database folks enjoy a lot more structure in our lives, but I have found that allowing for the negotiation of CSV responses from APIs can move mountains when it comes to helping onboard business users, and decision makers to the potential of APIs–even if the data format doesn’t represent the full potential of an API. CSV responses are the low bar I set for my APIs, making them accessible to a very wide business audience.

CSV as a data format represents an anchor for the lowest common denominator for API access. As a developer, it won’t be the data format I personally will negotiate, but as a business user, it very well could mean the difference between using an API or not. Allowing me to take API responses and work with them in my native environment, the Excel spreadsheet, or Google Sheets environment. As I am designing my APIs, I’m always thinking about how I can make my resources available to the masses, and enabling the negotiation of CSV responses whenever possible helps me achieve my wider objectives.

Negotiating XML
I remember making the transition from XML to JSON in 2009. At first I was uncomfortable with the data format, and resisted using it over my more proven XML. However, I quickly saw the potential for the scrappy format while developing JavaScript applications, and when developing mobile applications. While JSON is my preferred, and default format for API design, I am still using XML on a regular basis while working with legacy APIs, as well as allowing for XML to be negotiated by the APIs I’m developing for wider consumption beyond the startup community. Some developers are just more comfortable using XML over JSON, and who knows, maybe by extending an XML olive branch, I might help developers begin to evolve in how they consume APIs.

Similar to CSV, XML represents support for a wider audience. JSON has definitely shifted the landscape, but there are still many developers out there who haven’t made the shift. Whether we are consumers of their APIs, or providing APIs that target these developers, XML needs to be on the radar. Our toolbox needs to still allow for us to provide, consume, validate, and transform XML. If you aren’t working with XML at all in your job, consider yourself privileged, but also know that you exist within a siloed world of development, and you don’t receive much exposure to many systems that are the backbone of government and business.

Negotiating JSON
I think about my career evolution, and the different data formats I’ve used in 30 years. It helps me see JSON as the default reality, not the default solution. It is what is working now, and reflects not just the technology, but also the business and politics of doing APIs in a mobile era, where JavaScript is widely used for delivering responsive solutions via multiple digital channels. JSON speaks to a wide number of developers, but we can’t forget that it is mostly comprised of developers who have entered the sector in the last decade.

JSON is the default media type I use for any API I’m developing today. No matter what my backend data source is. However, it is just one of several data formats I will potentially open up for negotiation. I feel like plain JSON is lazy, and whenever possible I should be thinking about a wider audience by providing CSV and XML representations, but I should also be getting more structured and standardized in how I handle the requests and responses for my API. While I want my APIs to reach as wide an audience as possible, I also want them to deliver rich results that best represent the data, content, media, algorithms, and other digital resources I’m serving up.

Hypermedia Media Types
Taking the affordances present when humans engage with the web via browsers for granted is one of the most common mistakes I make as an API designer, developer, and architect. This is a shortcoming I am regularly trying to make up for by getting more sophisticated in my usage of existing media types, and allowing for consumers to negotiate exactly the content they are looking for, and achieve a heightened experience consuming any API that I deliver. Hypermedia media types provide a wealth of ways to deliver consistent experiences that help me deliver many of the affordances we expect as we make use of data, content, media, and algorithms via the web.

Using media types like HAL, Siren, JSON API, Collection+JSON, and JSON-LD is allowing me to deliver a much more robust API experience to a variety of API clients. Hypermedia reflects where I want to be when it comes to API design and architecture that leverages the web, but it is a reflection I have to often think deeply about as I still work to reach out to a wide audience, forcing me to make it one of several types of experience my consumers can negotiate. While I wish everyone saw the benefits, sometimes I need to make sure CSV, XML, and simpler JSON are also on the menu, ensuring I don’t leave anyone behind as I work to bridge where we are with where I’d like to go.
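
To ground this a bit, here is a minimal sketch of what a HAL style response might look like for a hypothetical orders collection, built up as a plain PHP array. The resource and its relations are made up, and a real implementation would likely lean on one of the existing HAL libraries.

    <?php
    // Minimal HAL style response sketch for a hypothetical orders collection.
    // _links carries the affordances (self, next, a templated find), while
    // _embedded carries the actual order representations.

    $response = [
        '_links' => [
            'self' => ['href' => '/orders?page=2'],
            'next' => ['href' => '/orders?page=3'],
            'find' => ['href' => '/orders{?id}', 'templated' => true],
        ],
        '_embedded' => [
            'orders' => [
                [
                    '_links' => ['self' => ['href' => '/orders/123']],
                    'total'  => 30.00,
                    'status' => 'shipped',
                ],
                [
                    '_links' => ['self' => ['href' => '/orders/124']],
                    'total'  => 20.00,
                    'status' => 'processing',
                ],
            ],
        ],
    ];

    header('Content-Type: application/hal+json');
    echo json_encode($response, JSON_PRETTY_PRINT);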

API Query Layers
Knowing my API consumers is an important aspect of how I use my API toolbox. Depending on who I’m targeting with my APIs, I will make different decisions regarding the design pattern(s) I put to work. While I prefer investing resources into the design of my APIs, and crafting the URLs, requests, and responses my consumers will receive, in some situations my consumers might also need much more control over crafting the responses they are getting back. This is when I look to existing API query languages like Falcor or GraphQL to give my API consumers more of a voice in what their API responses will look like.

API query layers are never a replacement for more RESTful, or hypermedia, approaches to delivering web APIs, but they can provide a very robust way to hand over control to consumers. API design is important for providers to understand and define the resources they are making available, but a query language can be very powerful when it comes to making very complex data and content resources available via a single API URL. Of course, as with each tool present in this API toolbox, there are trade-offs in deciding to use an API query language, but in some situations it can make the development of clients much more efficient and agile, depending on who your audience is, and the resources you are looking to make available.
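
As a quick illustration of handing consumers that control, here is a hypothetical GraphQL query sent over HTTP with the requests library. The endpoint and fields are invented, but it shows how the consumer, not the provider, decides the shape of the response.

```python
# Sending a hypothetical GraphQL query; the consumer chooses exactly which fields come back.
import requests

query = """
{
  company(id: "acme") {
    name
    contacts(first: 5) {
      email
    }
  }
}
"""

response = requests.post(
    "https://api.example.com/graphql",  # illustrative endpoint
    json={"query": query},
    timeout=10,
)
print(response.json())
```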

Webhooks
In my world APIs are rarely a one-way street. My APIs don’t just allow API consumers to poll for data, content, and updates. I’m looking to define and respond to events, allowing data and content to be pushed to consumers. I’m increasingly using Webhooks as a way to help my clients make their APIs a two-way street, and limit the amount of resources it takes to make digital assets available via APIs. I work with them to define the meaningful events that occur across the platform, and allow API consumers to subscribe to these events via Webhooks, opening the door for API providers to deliver a more event-driven approach to doing APIs.

Webhooks are the 101 level of event-driven API architecture for API providers. They are where you get started trying to understand the meaningful events that are occurring via any platform. Webhooks are how I am helping API providers understand what is possible, but also how I’m training API consumers in a variety of API communities about how they can deliver better experiences with their applications. I see webhooks, alongside API design and management, as a way to help API providers and consumers better understand how API resources are being used, developing a wider awareness around which resources actually matter, and which ones do not.
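
Here is a minimal sketch of the consumer side of that two-way street, a small Flask endpoint that receives webhook deliveries. The event names and the header used to identify the event are assumptions for illustration, since every provider does this a little differently.

```python
# A minimal webhook receiver sketch; event names and the event header are assumptions.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/webhooks/orders", methods=["POST"])
def order_events():
    event_type = request.headers.get("X-Event-Type", "unknown")  # hypothetical header
    payload = request.get_json(silent=True) or {}

    if event_type == "order.created":
        print("new order:", payload.get("id"))
    elif event_type == "order.updated":
        print("order changed:", payload.get("id"))

    # Acknowledge quickly so the provider does not keep retrying the delivery.
    return jsonify({"received": True}), 200
```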

Websub
In 2018, I am investing more time in putting Websub, formerly known as the word none of us could actually pronounce, PubSubHubbub, to work. This approach to making content available by subscription as things change has finally matured into a standard, and reflects the evolution of how we deliver APIs in my opinion. I am using Websub to help me understand not just the event-driven nature of the APIs I’m delivering, but also the intersection of how we make API infrastructure more efficient and precise in doing what it does. It helps us develop meaningful subscriptions to data and content, adding another dimension to the API design and even query conversation.

Websub represents the many ways we can orchestrate our API implementations using a variety of content types, and push and pull mechanisms, all leveraging the web as the transport. I’m intrigued by the distributed aspect of API implementations using Websub, and the discovery that is built into the approach. The remaining pieces are pretty standard API stuff, using GETs, POSTs, and content negotiation to get the job done. While not an approach I will be using by default, for specific use cases, delivering data and content to known consumers, I am beginning to put Websub to work alongside API query languages, and other event-driven architectural approaches. Now that Websub has matured as a standard, I’m even more interested in leveraging it as part of my diverse API toolbox.
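
To show how simple the subscription side of Websub is, here is a sketch of a subscriber registering with a hub using the requests library. The hub, topic, and callback URLs are all illustrative, but the hub.mode, hub.topic, and hub.callback parameters come straight from the Websub specification.

```python
# Subscribing to a topic at a Websub hub (hub, topic, and callback URLs are illustrative).
import requests

subscription = requests.post(
    "https://hub.example.com/",  # the hub advertised by the publisher
    data={
        "hub.mode": "subscribe",
        "hub.topic": "https://feeds.example.com/banking-news",
        "hub.callback": "https://myapp.example.com/websub/callback",
        "hub.lease_seconds": "86400",
    },
    timeout=10,
)

# The hub then verifies intent by GETting the callback with a hub.challenge
# parameter, which the callback simply echoes back.
print(subscription.status_code)
```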

Server Sent Events (SSE)
I consider webhooks to be the gateway drug for API event-driven architecture, making API integrations a two-way street, while also making them more efficient, and potentially real time. After webhooks, the next tool in my toolbox for making API consumption more efficient and real time is server-sent events (SSE). Server-sent events (SSE) is a technology where a browser receives automatic updates from a server via a sustained HTTP connection, which has been standardized as part of HTML5 by the W3C. The approach is primarily used to establish a sustained connection between a server and the browser, but can just as easily be used server to server.

Server-sent events (SSE) delivers one-way streaming APIs which can be used to send regular, sustained updates, which can be more efficient than regular polling of an API. SSE is an efficient way to go beyond the basics of the client-server request and response model and begin pushing the boundaries of what APIs can do. I am using SSE to make APIs much more real time, while also getting more precise with the delivery of data and content, leveraging other standards like JSON Patch to only provide what has changed, rather than sending the same data out over the pipes again, making API communication much more efficient.
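
Here is a minimal sketch of what the provider side of a Server-Sent Events stream can look like using Flask. The event name and payload are made up, but the text/event-stream media type and the "data:" framing are what the standard defines.

```python
# A minimal Server-Sent Events endpoint sketch using Flask.
import json
import time

from flask import Flask, Response

app = Flask(__name__)

def event_stream():
    counter = 0
    while True:
        counter += 1
        payload = json.dumps({"tick": counter})
        # SSE frames are plain text: "event:" and "data:" lines followed by a blank line.
        yield f"event: update\ndata: {payload}\n\n"
        time.sleep(1)

@app.route("/stream")
def stream():
    return Response(event_stream(), mimetype="text/event-stream")
```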

Websockets
Shifting things further into real time, websockets is what I’m using to deliver two-way API streams that require data be both sent and received, providing full-duplex communication channels over a single TCP connection. WebSocket is a different TCP protocol from HTTP, but is designed to work over HTTP ports 80 and 443 as well as to support HTTP proxies and intermediaries, making it compatible with the HTTP protocol. To further achieve compatibility, the WebSocket handshake uses the HTTP Upgrade header to change from the HTTP protocol to the WebSocket protocol, pushing the boundaries of APIs beyond HTTP in a very seamless way.

SSE is all about the one-way efficiency, and websockets is about two-way efficiency. I prefer keeping things within the realm of HTTP with SSE, unless I absolutely need the two-way, full-duplex communication channel. As you’ll see, I’m fine with pushing the definition of API out of the HTTP realm, but I’d prefer to keep things within bounds, as I feel it is best to embrace HTTP when doing business on the web. I can accomplish a number of objectives for data, content, media, and algorithmic access using the HTTP tools in my toolbox, leaving me to be pretty selective when I push things out of this context.
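
When I do need that two-way channel, the client side can be as simple as this sketch using the websockets library, with the stream URL and subscription message invented for illustration.

```python
# A two-way WebSocket client sketch; the URL and message format are illustrative.
import asyncio
import json

import websockets

async def trade_stream():
    async with websockets.connect("wss://stream.example.com/trades") as ws:
        # Send a subscription message up the same connection we read from.
        await ws.send(json.dumps({"action": "subscribe", "channel": "trades"}))
        while True:
            message = await ws.recv()
            print("received:", message)

if __name__ == "__main__":
    asyncio.run(trade_stream())
```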

gRPC Using HTTP/2
While I am forced to use Websockets for some existing integrations such as with Twitter, and other legacy implementations, it isn’t my choice for next generation projects. I’m opting to keep things within the HTTP realm, embracing the next evolution of the protocol, and following Google’s lead with gRPC. As with other RPC approaches, gRPC is based around the idea of defining a service, specifying the methods that can be called remotely with their parameters and return types. gRPC embraces HTTP/2 as its next generation transport protocol, while also employing Protocol Buffers, Google’s open source mechanism for the serialization of structured data.

At Google, I am seeing Protocol Buffers used in parallel with OpenAPI for defining JSON APIs, providing two speed APIs using HTTP/1.1 and HTTP/2. I am also seeing Protocol Buffers used with HTTP/1.1 as a transport, making it something I have had to integrate with alongside SOAP, and other web APIs. While I am integrating with APIs that use Protocol Buffers, I am most interested in the usage of HTTP/2 as a transport for APIs, and I am investing more time learning about the next generation headers in use, and the variety of ways in which HTTP/2 is used as a transport for traditional APIs, as well as multi-directional, streaming APIs.
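
To give a sense of the developer experience, here is a rough sketch of calling a hypothetical gRPC service from Python. The Contacts service, its methods, and the generated contacts_pb2 modules are all assumptions for illustration; in practice they would be generated from whatever .proto file defines the service.

```python
# A sketch of calling a hypothetical gRPC service, defined in a .proto file roughly like:
#
#   service Contacts {
#     rpc GetContact (ContactRequest) returns (ContactReply);
#   }
#
# contacts_pb2 and contacts_pb2_grpc would be generated from that file by protoc.
import grpc

import contacts_pb2
import contacts_pb2_grpc

channel = grpc.insecure_channel("localhost:50051")
stub = contacts_pb2_grpc.ContactsStub(channel)

# Remote calls look like local method calls, with typed request and response messages.
reply = stub.GetContact(contacts_pb2.ContactRequest(id="contact-123"))
print(reply)
```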

Apache Kafka
Another shift I could not ignore across the API landscape in 2017 was the growth in adoption of Kafka as a distributed streaming API platform. Kafka focuses on enabling providers to read and write streams of data like a messaging system, and develop applications that react to events in real-time, and store data safely in a distributed, replicated, fault-tolerant cluster. Kafka was originally developed at LinkedIn, but is now an Apache open source product that is in use across a number of very interesting companies, many of which have been sharing their stories of how efficient it is for developing internal data pipelines. I’ve been studying Kafka throughout 2017, and I have added it to my toolbox, despite it pushing the boundaries of my definition of what is an API beyond the HTTP realm.

Kafka has moved out of the realm of HTTP, using a binary protocol over TCP, defining all APIs as request response message pairs, using its own messaging format. Each client initiates a socket connection and then writes a sequence of request messages and reads back the corresponding response message–no handshake is required on connection or disconnection. TCP is much more efficient than HTTP because it allows you to maintain persistent connections used for many requests. This takes streaming APIs to new levels, providing a super fast set of open source tools you can use internally to deliver the big data pipeline you need to get the job done. My mission is to understand how these pipelines are changing the landscape, and which tools in my toolbox can help augment Kafka and deliver the last mile of connectivity to partners, and public applications.
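
Here is a rough sketch of what reading and writing one of those streams looks like with the kafka-python client. The broker address, topic name, and message are assumptions for illustration.

```python
# Producing to and consuming from a Kafka topic with kafka-python (broker and topic assumed).
import json

from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda value: json.dumps(value).encode("utf-8"),
)
producer.send("account-events", {"account": "abc", "action": "deposit", "amount": 100})
producer.flush()

consumer = KafkaConsumer(
    "account-events",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
    auto_offset_reset="earliest",
)
for record in consumer:
    print(record.value)
```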

Message Queuing Telemetry Transport (MQTT)
Continuing to round off my API toolbox in a way that pushes the definition of APIs beyond HTTP, and helping me understand how APIs are being used to drive Internet-connected devices, I’ve added Message Queuing Telemetry Transport (MQTT), an ISO standard publish-subscribe messaging protocol, to my toolbox. The protocol works on top of the TCP/IP protocol, and is designed for connections with remote locations where a light footprint is required because compute, storage, or network capacity is limited. This makes MQTT worth considering when you are connecting devices to the Internet, and are unsure of the reliability of your connection.

Both Kafka and MQTT have shown me, in the last couple of years, the limitations of HTTP when it comes to the high and low volume aspects of moving data around using networks. I don’t see this as a threat to APIs that leverage HTTP as a transport, I just see them as additional tools in my toolbox, for projects that meet these requirements. This isn’t a failure of HTTP, this is simply a limitation, and when I’m working on API projects involving internet connected devices I’m going to weigh the pros and cons of using simple HTTP APIs alongside using MQTT, and be a little more considerate about the messages I’m sending back and forth between devices and the cloud over the network I have in place. MQTT reflects the robust and diverse nature of my API toolbox, one that gives me a wide variety of tools I’m familiar with and can use in different environments.
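
For a feel of how light the footprint is, here is a small publish and subscribe sketch using the paho-mqtt client (the 1.x style API), with the broker and topic names invented for illustration.

```python
# A lightweight MQTT sketch using paho-mqtt 1.x (broker and topic names are assumptions).
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # Subscribe once the connection to the broker is established.
    client.subscribe("devices/thermostat/temperature")

def on_message(client, userdata, message):
    print(message.topic, message.payload.decode("utf-8"))

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.com", 1883, keepalive=60)

# Publish a small reading, then process incoming messages.
client.publish("devices/thermostat/temperature", "21.5")
client.loop_forever()
```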

Mastering My Usage Of Headers
One thing I’ve learned over the years while building my API toolbox is the importance of headers, and that they are not just about HTTP, but about the more general usage of networks. I have to admit that I understood the role of headers in the API conversation, but had not fully understood the scope of their importance when it comes to taking control over how your APIs operate within a distributed environment. Knowing which headers are required to consume APIs is essential to delivering stable integrations, and providing clear guidance on headers from a provider standpoint is essential to APIs operating as expected on the open web.

Content negotiation was the doorway I walked through that demonstrated the importance of HTTP headers when it comes to delivering meaningful API experiences, allowing me to negotiate CSV, XML, and JSON message formats, as well as engage with my digital resources in a deeper way using hypermedia media types. My header mastery is allowing me to better orchestrate an event-driven experience via webhooks, and long running HTTP connections via Server-Sent Events. Headers are also taking me into the next generation of connectivity using HTTP/2, making them a critical aspect of my API toolbox. Historically, headers have often been hidden in the background of my API work, but increasingly they are front and center, and essential to me getting the results I’m looking for.
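
A small sketch of what being deliberate about headers looks like from the consumer side, using the requests library. The endpoint, token, and rate limit header names are placeholders, since those vary from provider to provider.

```python
# Being explicit about request headers, and reading the response headers that come back.
import requests

response = requests.get(
    "https://api.example.com/companies",  # illustrative endpoint
    headers={
        "Accept": "application/hal+json",      # negotiate the representation I want
        "Authorization": "Bearer YOUR_TOKEN",   # placeholder credential
        "User-Agent": "api-toolbox-demo/1.0",
    },
    timeout=10,
)

print(response.headers.get("Content-Type"))
print(response.headers.get("ETag"))
# Many providers expose rate limit details in headers; the names vary by provider.
print(response.headers.get("X-RateLimit-Remaining"))
```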

Standardizing My Messaging
I have to admit I had taken the strength of message formats present in my web service days for granted. While I still think they are bloated and too complex, I feel like we threw out a lot of benefits when we made the switch to more RESTful APIs. Overall I think the benefits of the evolution were positive, and media types provide us with some strong ways to standardize the messages we pass back and forth. I’m fine operating in a chaotic world of message formats and schema that are developed in the moment, but I’m a big fan of all roads leading to standardization and reuse of meaningful formats, so that we can try to speak with each other via APIs in more common formats.

I do not feel that there is one message format to rule them all, or even one for each industry. I think innovation at the message layer is important, but I also feel like we should be leveraging JSON Schema to help tame things whenever possible, and standardize as media types. Reusing existing standards from day one is preferred whenever possible, but I get that this isn’t always the reality, and in many cases we are handed the equivalent of a filing cabinet filled with handwritten notes. In my world there will always be a mix of known and unknown message formats, something I will always work to tame, as well as increasingly apply machine learning models to help me identify, evolve, and make sense of–standardizing things in any way I possibly can.
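
Here is a small sketch of what leaning on JSON Schema looks like in practice, validating a message against a shared schema with the jsonschema library. The schema and message are invented for illustration.

```python
# Validating a message against a shared JSON Schema using the jsonschema library.
from jsonschema import ValidationError, validate

contact_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
    },
    "required": ["name", "email"],
}

message = {"name": "Jane Example", "email": "jane@example.com"}

try:
    validate(instance=message, schema=contact_schema)
    print("message matches the shared schema")
except ValidationError as error:
    print("message drifted from the schema:", error.message)
```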

Knowing (Potential) Clients
I am developing APIs for a wide variety of clients. Some are designed for web applications, others are mobile applications, and some are devices. They could be spreadsheets, widgets, documents, and machine learning models. The tables could be flipped, and the APIs exist on device, and the cloud becomes the client. Sometimes the clients are known, other times they are unknown, and I am looking to attract new types of clients I never envisioned. I am always working to understand what types of clients I am looking to serve with my APIs, but the most important aspect of this process is understanding when there will be unknown clients.

When I have a tightly controlled group of target clients, my world is much easier. When I do not know who will be developing against an API, and I am looking to encourage wider participation, this is when my toolbox comes into action. This is when I keep the bar as low as possible regarding the design of my APIs, the protocols I use, and the types of data formats and messages I use. When I do not know my API consumers and the clients they will be developing, I invest more in API design, and keep my default requests and responses as simple as possible. Then I also allow for the negotiation of more complex, higher speed, and more controlled aspects of my APIs by consumers who are in the know, targeting more specific client scenarios.

Using The Right Tools For The Job
API is not just REST, and REST is just one tool in my toolbox. API deployment and integration is about having the right tool for the job. It is a waste of my time to demand that everyone understand one way of doing APIs, or my way of doing APIs. Sure, I wish people would study and learn about common API patterns, but in reality, on the ground in companies, organizations, institutions, and government agencies, this is not the state of things. Of course, I’ll spend time educating and training folks wherever I can, but my role is always more about delivering APIs, and integrating with existing APIs, and my API toolbox reflects this reality. I do not shame API providers for their lack of knowledge and available resources, I roll up my sleeves, put my API toolbox on the table, and get to work improving any situation that I can.

My API toolbox is crafted for the world we have, as well as the world I’d like to see. I rarely get what I want on the ground deploying and integrating with APIs. I don’t let this stop me. I just keep refining my awareness and knowledge by watching, studying, and learning from what others are doing. I often find that when someone is in the business of shutting down a particular approach, or being dogmatic about a single approach, it is usually because they aren’t on the ground working with average businesses, organizations, and government agencies–they enjoy a pretty isolated, privileged existence. My toolbox is almost always open, constantly evolving, and perpetually being refined based upon the reality I experience on the ground, learning from people doing the hard work to keep critical services up and running, not simply dreaming about what should be.


A Regulatory Subway Map For PSD2

This is one of my talks from APIDays Paris 2018. Here is the abstract: Understanding the PSD2 regulations unfolding for the banking industry is a daunting challenge for banks, aggregators, and regulators, let alone when you are an average user, journalist, or analyst trying to make sense of things. As the API Evangelist I have used a common approach to mapping out transit systems, the universally recognized subway map introduced by Harry Beck in London in the early 20th century. I’m taking this mapping technique and applying it to the PSD2 specification, helping provide a visual, and interactive way to navigate this new regulatory world unfolding in Europe. Join me for an exploration of API regulations in the banking industry, and how we can use a subway and transit map to help us quickly understand the complexities of PSD2, and learn what it takes to get up and running with this new definition for the API economy.

Using Transit As API Analogy
I’ve been fascinated with transit systems and subway maps for most of my adult life. Recently I’ve begun thinking about the parallels between complex municipal transit systems, and complex API infrastructure. Beginning in 2015, I started working to see if I could apply the concept of a subway map to the world of APIs, not just as a visualization of complex API systems, but as a working interface to discover, learn about, and even govern hundreds, or even thousands of APIs working in concert. While very much a physical world thing, transit systems still share many common elements with API infrastructure, spanning multiple disparate systems, operated by disparate authorities, possessing legacy as well as modern infrastructure, and being used by many people for work and in their personal lives.

While the transit system isn’t a perfect analogy for modern API infrastructure, there is enough there to keep me working to find a way to apply the concept, and specifically the transit map to helping us make sense of our API-driven systems. I’ve called my work API Transit, leaning on both the noun definition of transit, “the carrying of people, goods, or materials from one place to another”, as well as the verb, “pass across or through (an area)”. The noun portion reflects the moving of our digital bits around the web using APIs, while the verb dimension helps us understand what is often called the API lifecycle, but more importantly the governance that each API should pass through as it matures, and continues to move our important digital bits around.

The History of Subway Maps
The modern approach to mapping transit systems can be traced back to Harry Beck, who created the now iconic map of the London Underground in 1933. The methodology was decoupled from earlier ways of mapping and communicating around transit resources, in a way that was focused on how the resources would be experienced by end-users, evolving beyond a focus on the transit resources themselves, and their location in our physical world. At the beginning of the 20th century, subway maps were still being plotted along roads and rivers, leaving them coupled to legacy elements, and ways of thought. Over the following decades, designers like Beck began to rethink how they mapped transit infrastructure, in a way that helped them better communicate their increasingly complex infrastructure internally, but most importantly, in a way that helped them communicate externally to the public.

This is what I’m looking to do by taking 20th century transit infrastructure mapping and applying it to 21st century digital API infrastructure. We need to get people’s bits, and digital goods and materials, around the web, while also understanding how to develop, discover, operate, and manage this infrastructure in a consistent, and intuitive way. However, API infrastructure is rapidly growing, and with the introduction of microservices it is becoming increasingly complex, and difficult to logically map out this evolving, shifting, and moving API landscape. I’m hoping to suspend reality a little bit as Beck did, and utilize the same visual cues to begin to visualize API infrastructure in a way that is familiar to not just developers, but hopefully regulators, and average consumers. Much like the transit system of any major city, initially an API transit system will seem overwhelming, and confusing, but with time and experience it will come into focus, and become a natural part of our daily lives.

API Transit Applied To Governance
Over the last two years I have been actively working to apply the API transit model to the concept of API governance. Beginning with API design, I have been looking for a way to help us understand how to consistently craft APIs that meet a certain level of quality set by a team, company, or even an entire industry. I’ve pushed this definition to include many of the almost 100 stops along the API lifecycle I track on as part of my work as the API Evangelist. This allows API governance transit maps to be crafted, which can be used to help understand API governance efforts, but also be applied to actually executing against this governance framework. Delivering the verb dimension of API transit, to “pass across or through (an area)”, allowing each API or microservice to regularly pass through each area, line, or stop of each API Transit governance map.

API Transit began as a way to define and visualize the API lifecycle for me, acknowledging that the lifecycle is rarely ever a straight line. Oftentimes it begins as one, but then over time it can become a mashup of different areas, lines, and stops that will need to be realized on a regular basis, on a one time basis, or any other erratic combination our operations can conceive of. With API Transit applied as an API governance model, I wanted to push the analogy even further, and see if I could push it to accommodate an individual API definition, or even a complex set of microservices working in concert. I began playing with a variety of existing API implementations, looking for one that could be used to tell a bigger story of how the transit model could be used beyond mapping how buses, trains, and other transit vehicles operate.

API Transit For PSD2 Landscape
To push the API transit concept further, I took the Payment Services Directive 2 (PSD2) governing banking APIs in Europe, to see what might be possible when it comes to mapping out API infrastructure in an impactful way. I was able to easily map out lines for some of the most common aspects of PSD2, visualizing accounts, transactions, customers, and then eventually every other stop I wanted to include. The trick now is how I articulate shared stops, transfer stations, and other common patterns that exist as part of the PSD2 specification. I didn’t want to just visualize the banking regulations, I wanted to create an interactive visualization that can be experienced by developers, regulators, and even end-users.

The HTML Canvas transit map solution I’m using allows me to plot each line and stop, and give each stop a label as well as a link. This allows me to provide much more detail about each stop, allowing someone exploring the specification to obtain much more information about each aspect of how the PSD2 specification is put to use. The resulting API transit map for the PSD2 landscape is pretty crude, and not very attractive. It will take much more work to bring each line into alignment, and allow for overlap and redundancy to exist across lines. The reasons behind each line and stop will be different than on a physical transit map, but there are still existing constraints I need to consider as I craft each transit map. The process is very rewarding, and helps me better understand each intimate detail of the PSD2 specification, something I’m hoping will eventually be passed on visually, and experientially, to each map user.

API Transit Is OpenAPI Defined
As part of my rendering of the API Transit map for the PSD2 landscape, I was able to generate the HTML5 Canvas using the OpenAPI for the PSD2 API. Each line was created from the tags applied across APIs, and each stop was created to reflect each individual API method available, providing a machine readable set of instructions to quantify what the structure of the landscape should be. I feel that the tagging of each API method can be improved, and used as the key driver of how each API transit map is rendered. Something that will be used to make sense of complex renderings, as well as deliver much more simplified, distilled versions to help make the landscape more accessible to non-developers.
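
Here is a rough sketch of how that derivation can work, grouping an OpenAPI definition's paths into lines by tag, with each method becoming a stop. The file name is an assumption, standing in for whatever copy of the PSD2 OpenAPI is being rendered.

```python
# Deriving transit "lines" (tags) and "stops" (methods) from an OpenAPI definition.
from collections import defaultdict

import yaml

with open("psd2-openapi.yaml") as handle:  # assumed local copy of the PSD2 OpenAPI
    spec = yaml.safe_load(handle)

lines = defaultdict(list)
for path, methods in spec.get("paths", {}).items():
    for verb, operation in methods.items():
        if verb.lower() not in {"get", "post", "put", "patch", "delete"}:
            continue
        for tag in operation.get("tags", ["untagged"]):
            # Each tag becomes a line, each method becomes a stop on that line.
            lines[tag].append(f"{verb.upper()} {path}")

for line, stops in lines.items():
    print(line, "->", stops)
```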

The details present as part of each PSD2 API path, method, parameters, responses, and schema can be rendered as part of the information provided with each stop. Providing a clear, but detailed representation of each stop within the API Transit system, from a machine readable core which will be used to make the API Transit map more interactive, as well as part of a larger continuous deployment and integration experience. While each stop along each PSD2 line will not always be linear in nature, this approach allows for stops to be organized in a meaningful order, driven by a set of common machine readable rules that can be versioned, evolved over time, and used throughout the API Transit system, and tooling that is deployed to keep the API Transit system operational.

API Transit Has Hypermedia Engine
With the ability to map out an overall API governance system, as well as a set of microservices that meet a specific industry objective, I needed a way to put it all into motion, so it could work together as a complete API transit system. I chose hypermedia to be the engine for the interaction layer, and OpenAPI to be the definition for each individual API. Hypermedia is used as a scaffolding for each area, line, and stop along the transit system, allowing each API to be navigated, but in a way that each API could also navigate any overall API governance model. Introducing the concept that an API could be experienced by any human, and any API governance model could be experienced by an API–again, allowing it to “pass across or through (an area)”, thus API Transit.

Each API has a machine readable OpenAPI definition allowing requests and responses to be made against the surface area of the API as it is explored via the API Transit experience, much like interactive documentation provides already. However, the same OpenAPI definition allows the API to also be validated against overall API governance practices, ensuring certain design standards are met. I’m also playing with the use of a machine readable APIs.json index for each API, in addition to the OpenAPI definition, allowing governance to be easily expanded to other API governance lines like management, documentation, pricing, support, terms of service, and other aspects of API operations not defined in an OpenAPI, but which could be indexed within an APIs.json document.
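
To make the scaffolding a little more tangible, here is a rough sketch of a Siren entity for a single stop, expressed as a Python dictionary. The properties, actions, and links are invented for illustration, not the actual API Transit definitions.

```python
# A rough Siren (application/vnd.siren+json) entity for a single hypothetical stop.
stop_entity = {
    "class": ["stop"],
    "properties": {
        "name": "List Account Transactions",
        "line": "transactions",
        "method": "GET",
        "path": "/accounts/{account-id}/transactions",
    },
    "entities": [
        {"rel": ["describedby"], "href": "/openapi/psd2.yaml"},
    ],
    "actions": [
        {
            "name": "validate-design",
            "method": "POST",
            "href": "/governance/design/validate",  # hypothetical governance endpoint
            "type": "application/json",
        }
    ],
    "links": [
        {"rel": ["self"], "href": "/transit/lines/transactions/stops/list-transactions"},
        {"rel": ["next"], "href": "/transit/lines/transactions/stops/get-transaction"},
    ],
}
```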

API Transit Runs As Github Repository
To continue evolving on the API Transit experience, I’ve begun deploying the API Transit map, with OpenAPI core, and hypermedia engine to Github for continuous deployment, and hosting of the API Transit map using Github Pages. Github, combined with Github Pages, and the static CMS solution Jekyll introduces an interesting opportunity for bringing each API Transit map to life. Jekyll transforms the hypermedia engine, and each API definition into a collection of objects which can be referenced with the API Transit map using the Liquid syntax. This allows each area, line, and stop to be easily traveled, moving from stop to stop, transferring between lines, and consuming rich information all along the way.

Running each API Transit map this way makes exploring an interactive experience, but it also makes the experience easily part of any continuous integration or deployment pipeline, using Git or the Github API to engage with, or evolve the API Transit definition. This makes each API definition seamlessly integrated with actual deployment workflows, while also making API governance part of these workflows, helping realize API testing, monitoring, validation, and other elements of the build process. With seamless integration as part of existing continuous deployment and integration workflows, I can easily envision a future where API Transit maps aren’t just interactive, but are continually changing, streaming, and visually updated in real time.

Each Stop Filled With Information
Each stop along an API Transit line is defined by its OpenAPI path, and navigated to by the hypermedia engine, but that is just the beginning. The hypermedia object for each stop contains the basics like the name and description of the stop, as well as the reference to the path, method, parameters, responses, and schema in the OpenAPI, but it also can possess other fields, images, video, audio, and links that deliver rich information about each individual stop. The hypermedia engine allows us to define and evolve these over time without breaking the client, which is the API Transit map. Providing a huge opportunity to educate as well as implement governance at each stop along the API Transit line(s).

Since a stop can be along an API governance line, or along an individual API line, the opportunity for education is enormous. We can educate developers about healthy API design practices, or we can educate a developer about a specific API method, where the healthy API design practices can be fully realized. With this approach to using Github Pages, Jekyll, and a hypermedia engine as an interactive CMS, there is a pretty significant opportunity to make the API Transit experience educational and informative for anyone exploring. Each developer, aggregator, regulator, and even end-user can take control over which aspects of the PSD2 specification they wish to engage with and learn about, allowing the PSD2 API Transit to serve as wide an audience as possible.

Map Progress Of Individual Banks
While there is an OpenAPI, and now an API Transit map, for the overall PSD2 specification, I will be using the approach to track on the evolution of each individual bank in Europe being impacted by the regulation. Ideally each bank will actively maintain and publish their own OpenAPI definition, something I will be actively advocating for, but ultimately I know the way things end up working in the real world, and I’ll have to maintain an OpenAPI definition for many of the banks, scraping them from the documentation each bank provides, and the SDKs available on Github. I will use this as a literal map of the progress of each individual bank, and how compliant they are with EU regulation.

Each individual bank PSD2 API Transit map will allow me to navigate the APIs for each bank, including URLs, and other information that is unique to each provider. I will be using APIs.json to index API operations, which I will work to bring out as visual elements through the API Transit map experience. In theory, each bank’s APIs will be an exact copy of the master PSD2 specification, but I’m guessing more often than not, when you overlay the master PSD2 API Transit map with each individual bank’s API Transit map, you are going to see some differences. Some differences will be positive, but most will demonstrate deficiencies and deviation from the overall guidance provided by EU regulatory agencies.

Compare Over PSD2 To Individual Banks
I am using the hypermedia engine for the API Transit map to crawl each individual bank’s implementation to deliver the overlay map I discussed. I’m working on a way to show the differences between the master PSD2 specification, and one or many individual PSD2 API implementations. I’m not just looking to create an overlay map for visual reference, I will actually crawl the OpenAPI for each individual bank and output a checklist of which paths, parameters, and schema they do not support, ensuring their definition matches the master PSD2 specification. The first approach to doing this will be about comparing OpenAPI definitions, with a second dimension actually running assertions against the APIs of those that I have access to.

There will be too much information to show on a single API Transit map, especially if we are comparing multiple banks at once. There will need to be a report format, accompanying a PSD2 API Transit map overlay, showing the differences at the overall line and stop level. The programmatic comparison between the master PSD2 API and each of the banks will be easy to accomplish using the hypermedia engine, as well as the detailed OpenAPI definition. What will prove to be more challenging is to create a meaningful representation of that map that allows the differences to be visualized, as well as the detail that is missing or present to be navigated via an aggregate Github repository.
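
The comparison piece itself is fairly mechanical. Here is a rough sketch of diffing the paths and methods in a bank's OpenAPI against the master PSD2 definition, with the file names standing in for whichever definitions are being compared.

```python
# Comparing a bank's OpenAPI paths and methods against the master PSD2 definition.
import yaml

def load_paths(filename):
    with open(filename) as handle:
        spec = yaml.safe_load(handle)
    return {
        (path, verb.lower())
        for path, methods in spec.get("paths", {}).items()
        for verb in methods
        if verb.lower() in {"get", "post", "put", "patch", "delete"}
    }

master = load_paths("psd2-master.yaml")   # assumed file names
bank = load_paths("bank-example.yaml")

missing = sorted(master - bank)
extra = sorted(bank - master)

print("missing from the bank's implementation:", missing)
print("present at the bank but not in PSD2:", extra)
```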

Apply Governance To Individual Banks
In addition to comparing the OpenAPI definitions of individual banks against the master PSD2 OpenAPI definition, I’m using the API Transit hypermedia engine to compare each individual API against the overall API governance. At first the API governance will just be about the API design elements present in the OpenAPI definition for each bank’s API. I will be looking for overall design considerations that may or may not be present in the master PSD2 specification, as well as beginning to add in additional API governance areas like testing, monitoring, performance, documentation, and other areas I mentioned earlier, which will be tracked via an APIs.json index.

There are a number of details I’d like to see banks provide that aren’t covered as part of the PSD2 guidance. Having an API is only the beginning of a usable banking solution for 3rd party applications. There needs to be documentation, support, and other elements present, or a well designed, PSD2 compliant API doesn’t mean much. I’m using API Transit to take the governance of PSD2 implementations beyond what the EU has put into motion. Taking what I know from the wider API sector and getting to work mapping out, validating, and ranking banking APIs regarding how easy to use and responsive their PSD2 APIs are. Making sure 3rd party developers, aggregators, and regulators can get at the APIs that are supposed to be defining their compliance.

Use to Map Out Countries & Regions
Using API Transit, I will also be zooming out and mapping the PSD2 landscape for individual countries, and regions. Creating separate transit maps that allow each country to be navigated, using the API Transit hypermedia engine to crawl each bank’s API Transit system map, and eventually provide aggregate comparison tools to see how different banks compare. With a machine readable index of each country, as well as individual indexes of each bank, the entire PSD2 regulatory landscape will become explorable via a series of API transit maps, as well as programmatically using the hypermedia engine present for each individual bank API implementation.

API Transit is meant to help map out the complexity of individual API systems, as well as the overall API governance in place. It will also enable the mapping of the complexity of many implementations across cities, regions, and countries. Allowing for regulatory tooling to eventually be developed on top of the thousands of Github repositories that will be created, housing each individual bank’s API Transit maps. Leveraging the Github API, Git, and the hypermedia and OpenAPI core of each individual API Transit implementation. Allowing for searching, reporting, auditing, and other essential aspects of making sure PSD2 is working.

Making Sense Of Complex API Landscapes
The goal of API Transit is to make sense of complex API landscapes. Tracking the details of thousands of API operations across the EU countries whose banks are required to be compliant in 2018. Providing machine readable blueprints of the regulation, as well as each individual bank that is required to be compliant, and each individual detail of what that compliance means. Then tying it all together with a hypermedia engine that allows it to be explored by any system or tooling, with a familiar transit map on top of it all–providing a way for humans to explore and navigate this very complex banking API landscape.

As I am evolving the process of creating API Transit maps, the API Transit hypermedia engine allows for the development of a gradual awareness of a very complex and technical landscape. It all feels the same as when I land in any new city and begin to understand the local transit system. At first, I am overwhelmed and confused, but the more time I spend exploring the transit maps, riding the subway and buses, and developing a familiarity with each system, the easier it all becomes. There is a reason transit maps are used to provide access to complex municipal transit systems, as they provide a comprehensive view of the landscape that allows you to find your way around, and serves as a reference as you develop an awareness–this is what I’m looking to build with API Transit for the banking sector.

Tours For Individual Roles
One of the benefits of an API Transit map client that is built on a hypermedia engine is that you can take advantage of the ability to change the map, and the experience, depending on who you are targeting. The concept of the transit map allows us to suspend realities of the physical world, and the hypermedia client for the API Transit map can suspend realities of the virtual world, allowing for the creation of unique tours and experiences depending on who the end user is. The complete API Transit map for each bank’s implementation will still exist, but the ability to suspend some details to distill complexity down into simpler experiences will be possible.

Individual API Transit tours and experiences will allow for onboarding new developers, business users, or possibly regulatory oversight looking at just a specific dimension of the PSD2 guidance. Allowing for the ability to zoom in at just the customer experience layer, or possibly just the transactional capabilities across many banking API providers, or even a city, region, or country. Think of how a city might have separate transit maps for the subway, buses, and other systems, but still have the overall system map. We can do the same with API Transit, further reducing complexity, and helping users make sense of complex banking API systems.

Interactive Industry API Governance
For API governance to be effective it needs to be understood by everyone who is impacted, and required to participate. It has to be able to be measured, quantified, tested, audited, and visualized in real time, to understand how it is working, and not working. Modern API operations excel at doing this at the company, organization, institutional, and government level, and we should be using API solutions and tooling to help us understand how governance is being applied. There should be machine readable definitions, and common media types, as well as API driven tooling for educating, implementing, and measuring API governance–completing the API loop.

This is another reason why the transit analogy is working so well for me when it comes to making API governance an interactive experience. I’m able to develop not just linear experiences where participants can click next, next, next and walk through everything. They can choose their own paths, experience the lines they feel need attention, while allowing managers, auditors, and regulators to understand how well each implementation is responding, evolving, and interacting with regulations. Making API governance at the industry level interactive not just for the banks and the EU regulatory body, but also for every other participant in between trying to make sense of what is happening with PSD2 in 2018.

Continuously Deployed And Integrated
As I mentioned before, API Transit runs 100% on Github, within a Github repository. The API Transit map client is unaware of each individual bank’s API implementation. The uniqueness of each implementation resides in a series of Siren hypermedia, OpenAPI, and APIs.json files, that provide a machine readable snapshot of the implementation. The combination of Github, and the machine readable YAML core of each API Transit instance, makes it all able to be integrated into continuous integration and deployment pipelines. Evolving each API Transit instance as each actual bank’s API implementation is built and released–hopefully ensuring they are kept in sync with the reality on the ground at each bank.

At this point, API Transit has become another application being developed upon the banking APIs that PSD2 is designed to expose. It ultimately will be an aggregated application that uses not just one banking API, but all the banking APIs. The difference from other aggregators is API Transit is not interested in the data each bank possesses. It is interested in continuously understanding how well each bank’s API is doing when it comes to complying with PSD2 regulations, while also continuously helping developers, aggregators, regulators, and anyone else make sense of the complexities of banking APIs and the PSD2 regulations.

Mapping Technology, Business, & Politics
While API Transit seems very technical at first look, the solution is meant to help make sense of the business and politics around the PSD2 rollout. The first part of the conversation might seem like it is about ensuring each bank has a compliant API, but it is more about ensuring they have the common operational components of API operations like a developer portal, documentation, support, and other business aspects of doing business with APIs. Next, it will be about ensuring all the technical aspects of PSD2 compliant APIs are in place, validating each API path, as well as its responses, schema, and other components.

While we need to make sure 100% of the PSD2 OpenAPI definition is represented by each bank, we also need to make sure each API is accessible, secure, and usable, otherwise none of this matters. If a bank plays politics by making their API unusable to aggregators and 3rd party developers, yet appears on the surface to have a functioning API that is compliant with the PSD2 specification, we need to be able to identify that this is the case, make sure we are testing for these scenarios on a recurring basis, and be able to visualize and report upon it to regulators. Properly addressing the technical, as well as the business and politics, of API operations.

Making API Regulation More Familiar
The main reason I’m using the transit map approach is because it is familiar. Secondarily, because the concept continues to work after two years of pushing it forward. Everyone I have shown it to immediately responds with, “well the trick will be to not make things too overwhelming and complicated”. To which I respond, “do you remember the first time you used the transit system in your city? How did you feel?” Overwhelmed, and thinking this was complicated–because it is. However, because there is a familiar, universal map that helps you navigate this complexity, eventually you begin to learn more about the complexity of the city you live in, or are visiting. Helping bridge the complexity, making it all a little more familiar over time.

The transit map concept is universal and familiar. It is in use across hundreds of transit systems around the globe. The transit map is familiar, but it also has the ability to help make a complex system more familiar with a little exploration, and over the course of time. This is why I’m extending the transit approach to the world of APIs, and using it to make complex API systems like the PSD2 banking API regulation more familiar. Banking APIs aren’t easy. Banking regulations aren’t easy. We need a map to help us navigate this world, and be able to make sense of the complexity, even as it continues to evolve over time–API Transit is that.

Helping API Regulation Be More Consistent
API Transit is machine readable, driven by existing specifications, including Siren (a hypermedia media type), OpenAPI, and APIs.json. The first machine readable definition I started with was the OpenAPI definition for the PSD2 regulation. This was the seed, then I generated the hypermedia engine from that, and took the API governance engine I had been working on for the last couple of years and built it into the existing API Transit core. The objective of all of this work is to use the framework to introduce more consistency into how I am mapping out the rollout of PSD2 across the European banking landscape. This is something that is too big to do manually, and something that requires a machine readable, API driven approach to get the job done properly.

The PSD2 OpenAPI definition provides a way to consistently measure, and report on, whether each bank’s APIs are in compliance with PSD2 regulations. The APIs.json is meant to make sure each bank’s APIs are in compliance with API Evangelist’s guidelines for API operations. The Siren hypermedia engine is meant to enable the consistent crawling and auditing of everything as it evolves over time by other systems and applications, and when combined with the HTML5 Canvas API Transit map, it allows humans to explore, navigate, and visualize banking APIs, and the PSD2 regulations, in a more consistent, and organized fashion.

Seeing PSD2 In Motion In A Visual Way
Ultimately, API Transit is about being able to see API operations across an industry in a visual way, where you can see everything in motion. Applying it to the PSD2 landscape is meant to help visualize and understand everything happening. It has taken me many hours of work to get intimate with the core OpenAPI definition for PSD2, learning about all the paths and schema that make up the regulation. As I work to evaluate the compliance of hundreds of banks across France and the UK, I’m quickly realizing that it will be beyond my capacity to see everything that is going on. API Transit helps me see the PSD2 landscape in a way that can be regularly updated, revisited, and experienced as the landscape evolves.

The PSD2 API Transit application is just getting started. The methodology has been proven, and I’ve begun profiling banks, collecting OpenAPI definitions for any APIs that I find, and profiling the wider presence of their API operations using APIs.json. Then I will begin improving on the mapping capabilities of the API Transit map client, making for a more visually pleasing experience. Along the way I am working with partners to better monitor the availability and performance of each bank’s APIs. Within a couple of months I expect a clearer picture to begin to come into focus regarding how the PSD2 landscape is unfolding in 2018, and by the end of the year, it will become clear which banks have emerged as competitive leaders within the API economy.


Shifting Gears Between the Technology and Politics of APIs

I’ve been working on two talks for API Days in Paris next week. These talks are at two opposite ends of the API spectrum for me. The first one, API Is Not Just REST, is rooted in the technology of APIs, but then touches lightly on the business and politics of APIs. The second one, a regulatory subway map for PSD2, is all about the politics of APIs, but then touches lightly on the business and technology of APIs. I have the outlines for both talks done, and I’m working on the narrative and slides for each. Along the way I’m really struck by how different each end of the spectrum is, and how each requires me to use a different part of my brain–something I think really defines the yin and yang of APIs.

When I’m down at the protocol level of APIs, thinking about the details of HTTP, TCP, and the nuance of how headers are used, and the messages we pass back and forth, the politics of APIs do not matter to me. My developer and architect brain is absorbed with the technical details, and the human or political consequences of my decisions really do not matter all that much. As long as things are technically correct, and my responses and requests are doing what is expected, I am good. It is easy for me to be railroaded within this technical silo, and I wouldn’t ever need to be concerned for the business and politics of it all, if my work as API Evangelist didn’t force me out of my comfort zone.

Inversely, when I’m thinking about the politics of how this all works, and the intention and impact of regulatory guidance like PSD2 in Europe, the technical details of HTTP, headers, and what messages I’m using feel less important. Sure, they still matter to what I’m trying to do, but the strictness with which I define my protocols, headers, data formats, schema, and other gears of API operations takes a looser form. When you look at banks who have NO PUBLIC API, just getting them up and running seems much more important than ensuring each HTTP response status code is present, and each schema is perfectly represented. Eventually, I will need to make sure all my technical i’s are dotted, and t’s are crossed, but I have bigger battles to wage at the moment.

As I’m pulled back and forth between the technology and the politics of APIs, I find myself becoming very, very aware of the business of APIs, and how the complexity of both technology and politics is wielded for business gain. Seeing how technology is used as a competitive advantage, as well as how politics is leveraged to get ahead in the game. You see this in the rhetoric around PSD2, where invoking the bogey man can be very telling of a company’s position, just as much as the companies who are proactively jumping on the PSD2 bandwagon, getting ahead of the game, or even being a leader when it comes to defining implementations. Competitive edges can be sharpened by embracing political shifts, as well as through the adoption of leading edge technologies present across the API space (Kafka, OpenAPI, etc.)–it really depends on your organization’s approach to the API game, and how up to speed you are on how it is played.

While I don’t feel it should be everyone’s role to be exposed to all the extremes of the API sector, I do feel like we do a poor job of exposing developers and architects to the business and politics of it all in a meaningful way. I also feel like we spend too much time either hiding from or protecting business users from the technical details. I find that I learn a lot being pulled back and forth to either end of the spectrum, something I’m hoping to share as part of my talks next week in France, as well as in the conversations I have in the hallways at the conference, and meeting rooms before and after the event. I look forward to seeing you all in Paris, and Grenoble next week.


Where Am I In The Sales Funnel For Your API?

I’m signing up for a large number of new APIs lately as part of a project I am working on. It is pretty normal for me to sign up for a couple new APIs a week, or 10-20 within a month, but right now I’m signing up for hundreds, setting up an application and getting keys for each service. I’ll share more about what I’m working on in future stories, but I wanted to talk more about the on-boarding practices of some of these APIs. It is pretty clear from the on-boarding processes that I don’t rank very high in some of these API provider’s sales funnels, making me not deserving of self-service access, or even a sales call–which is a separate topic I will talk about in a future post.

I’ve registered with a number of high value API providers who have more of an enterprise focus, but also have a seemingly self-service, public API available. After signing up for access, it becomes very clear that these APIs are anything but self-service, that there is a sales funnel in play, and that I’ve been ranked, tagged, and identified for where I am in this sales funnel, and what value I bring as a potential small business customer–which is not much by usual measurements. I’m pretty well versed in how companies set up their sales strategies, and have seen many companies think it is a good idea to translate these practices to their API operations. Only targeting the high value customers, and not really giving a shit about the rest of them. It just isn’t worth the resources to go after them, they don’t have the spending capacity of customers you want in your funnel.

I get it. You are right. I don’t have a lot of money to buy your services, and will not become a high value customer. However, I have many readers who are high value customers, and trust my opinion about which services are worth paying attention to. And guess what? I’m not going to write about your service. I’m not going to include you in my prime time storytelling, and when I do reference your API as part of my research, it will be in the club of shame, and APIs that really aren’t worth your time playing with. My enterprise readers, growing startups, university IT leadership, and government project owners with big budgets won’t ever know about your API, all because you didn’t see me as being worth your time, and your API on-boarding practices are outdated.

In a self-service API world you don’t need to be high touch with the long tail of your API consumers. This is why we have API management in place, with sensibly priced access tiers, requiring monthly levels of access, all requiring credit cards on file. While it frustrates me when companies don’t have a free tier of access for me to kick the tires, I get it. What makes a situation untenable is when I put in a credit card, and I am willing to spend a couple hundred bucks to write a story, build a prototype, or publish a landscape guide or white paper, and I still can’t get access to your resources and understand what is going on. It’s ok. I’m guessing they probably weren’t worth sharing with my readers anyways, and you belong right where you should belong in my API research that gets read by business and IT leadership around the globe.


Helping Define Stream(Line)Data.io As More Than Just Real Time Streaming

One aspect of my partnership with Streamdata.io is about helping define what it is that Streamdata.io does–internally, and externally. When I use any API technology I always immerse myself in what it does, and understand every detail regarding the value it delivers, and I work to tell stories about this. This process helps me refine not just how I talk about the products and services, but also helps influence the road map for what the products and services deliver. As I get intimate with what Streamdata.io delivers, I’m beginning to push forward how I talk about the company.

The first thoughts you have when you hear the name Streamdata.io, and learn about how you can proxy any existing JSON API, and begin delivering responses via Server-Sent Events (SSE) and JSON Patch, are all about streaming and real time. While streaming of data from existing APIs is the dominant feature of the service, I’m increasingly finding that the conversations I’m having with clients, and would be clients are more about efficiencies, caching, and streamlining how companies are delivering data. Many API providers I talk to tell me they don’t need real time streaming, but at the same time they have rate limits in place to keep their consumers from polling their APIs too much, increasing friction in API consumption, and not at all about streamlining it.
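
For anyone unfamiliar with the JSON Patch side of that combination, here is a small sketch of the general RFC 6902 mechanism, applying an incremental update to a cached response with the jsonpatch library. The document and patch are invented for illustration, and this is not Streamdata.io's exact payload format.

```python
# Applying an illustrative JSON Patch (RFC 6902) to a cached API response.
import jsonpatch

cached_response = {
    "ticker": "ACME",
    "price": 101.50,
    "volume": 12000,
}

# Only the fields that changed travel over the stream.
patch = [
    {"op": "replace", "path": "/price", "value": 101.75},
    {"op": "replace", "path": "/volume", "value": 12450},
]

updated = jsonpatch.apply_patch(cached_response, patch)
print(updated)
```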

These experiences are forcing me to shift how I open up conversations with API providers, making real time and streaming secondary to streamlining how API providers are delivering data to their consumers. Real time streaming using Server-Sent Events (SSE) isn’t always about delivering financial and other data in real time. It is about delivering data using APIs efficiently, making sure only what has been updated and is needed gets delivered when it is needed. The right time. This is why you’ll see me increasingly adding (line) to the Stream(line)data.io name, helping focus on the fact that we are helping streamline how companies, organizations, institutions, and government agencies are putting data to work–not just streaming data in real time.

I really enjoy this aspect of getting to know what a specific type of API technology delivers, combined with the storytelling that I engage in. I was feeling intimidated about talking up streaming APIs with providers who clearly didn’t need it. I’m not that kind of technologist. I have to be genuine in what I do, or I just can’t do it. So I was pleasantly surprised to find that conversations quickly became about making things more efficient, rather than ever actually getting to the real time streaming portion of things. It makes what I do much easier, and something I can continue on a day-to-day basis, across many different industries.


I Want to Be Able to POST and PUT and Receive Credits on My API Bill

I’ve been thinking a lot more about the potential for measuring value exchange at the API management level lately, and while working on a project to profile banks that need to comply with the PSD2 regulations in Europe, I’m thinking about the missed opportunities for the API providers I’m using to fuel my research to leverage the value I’m generating. I’m using a variety of data enrichment APIs to help add contact data, corporate profiles, images, documents, patents, and other valuable data to what I’m doing as part of my wider research into the API space. While these APIs have valuable services that I am paying for, all of the APIs are just using one HTTP verb–GET.

On a regular basis I come across incorrect or incomplete data, and as part of my work I dive in, correct the data, and continue to connect the dots. I often find better copies of logos, add in relevant business profile data, and I always provide very detailed information regarding a company’s API–which I find to be one of the most telling aspects of what a company does, or doesn’t do. I’m thankful for the services that the 3rd party APIs I utilize provide, but I think they are missing out on a pretty big opportunity to let trusted partners like me POST or PUT the data I am gathering back to their systems.

I would love to be able to POST and PUT information back to the APIs I am GETting my data from, and receive credits on my API bill for these contributions. It would benefit API providers by helping ensure the data they are providing is complete and accurate, and it would benefit me by helping keep my API bills as low as possible. I understand that it would take some work on the provider side to ensure I’m a trusted partner, and to verify that the POST and PUT API calls I am making actually add value, but with a proper queue, and a little bit of human power, it wouldn’t take that much. At first it may seem like the investment would be more than the value, but all you’d have to do is find a handful of partners like me who would significantly contribute to the data you provide via your APIs.
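
To make the queue idea a little more concrete, here is a rough sketch of what a trusted partner contribution queue could look like, with a human approving each POST or PUT before it earns a credit. Everything here is hypothetical, made up for illustration, and not how any specific provider actually does it.

    # A hypothetical review queue for trusted partner contributions. Nothing here reflects
    # any actual provider's implementation, it is just the shape of the idea.
    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Contribution:
        partner_id: str
        method: str        # "POST" or "PUT"
        resource: str      # e.g. "/companies/example/logo"
        credit: float      # credit, in dollars, if the contribution is approved
        approved: bool = False

    @dataclass
    class ReviewQueue:
        pending: List[Contribution] = field(default_factory=list)
        approved: List[Contribution] = field(default_factory=list)

        def submit(self, contribution: Contribution) -> None:
            # Every partner POST or PUT lands here instead of writing straight to the system of record.
            self.pending.append(contribution)

        def review(self, index: int, accept: bool) -> None:
            # The little bit of human power: someone accepts or rejects each contribution.
            contribution = self.pending.pop(index)
            if accept:
                contribution.approved = True
                self.approved.append(contribution)

        def credits_for(self, partner_id: str) -> float:
            # Total credits to apply against the partner's monthly API bill.
            return sum(c.credit for c in self.approved if c.partner_id == partner_id)

    if __name__ == "__main__":
        queue = ReviewQueue()
        queue.submit(Contribution("api-evangelist", "PUT", "/companies/example/logo", credit=0.50))
        queue.submit(Contribution("api-evangelist", "POST", "/companies/example/api-profile", credit=5.00))
        queue.review(0, accept=True)
        queue.review(0, accept=True)
        print(queue.credits_for("api-evangelist"))  # 5.5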

When it comes to the value exchanged via APIs, it always seems to be either the heavily user generated content platforms like Facebook and Twitter, or the GET-only data providers, without a lot in between. Most providers I talk with are nervous about the quality of data, and the overhead of managing contributions. That is definitely true if the flood gates are wide open, but with the proper approach to API management, and sensible access tiers for partners, you could easily identify who the most valuable API consumers are. Do not miss out on the opportunity to allow trusted partners like me to POST and PUT, and receive credits on our bills, making API management, and the value exchange that occurs via your platform, a two-way street.


The Role of European Banking Authority (EBA) When It Comes To PSD2

As part of my continued effort to break down the Payment Services Directive 2 (PSD2) in Europe, and develop my awareness of how the regulations are intended to work, as well as the reality on the ground within the industry, I am working to map out all of the players involved. This post is about understanding the role of the European Banking Authority (EBA), and clearly understanding when and where they come into the conversation.

First, what is the European Banking Authority (EBA)? They are the regulatory agency for the European Union in charge of conducting stress tests on European banks, increasing transparency in the European financial system, and identifying weaknesses in banks’ capital structures. When it comes to PSD2, their role is to:

  • develop a publicly accessible central register of authorised payment institutions, which shall be kept up to date by the national authorities
  • assist in resolving disputes between national authorities
  • develop regulatory technical standards on strong customer authentication and secure communication channels with which all payment service providers must comply
  • develop cooperation and information exchange between supervisory authorities

The catalyst for this post was that I was looking for the “central register of authorised payment institutions”, and could not find it. That register will be critical to this effort working, and evolving. I’m also on the hunt for more details regarding how they will be addressing authentication for all 3rd party API access, which will also be something that makes or breaks this effort. And, of course, as the API Evangelist I’m looking to help anyone in the position of helping “develop cooperation and information exchange”–it is what I do.

When it comes to PSD2, I have gotten to know the API definition (OpenAPI), and I am making my way through the actual set of laws, but I’m still working to understand who all the players are. I’ll keep profiling every type of participant in the PSD2 theater that is unfolding across Europe in 2018, until each of the actors makes sense in my head, and I can speak to all of them intelligently. Then I’m hoping to compare notes with my research regarding banking in the United States, and see how it all looks. I’ll be spending next week in France talking with bankers about PSD2, and giving a talk on API governance at this level. So for now, I’m going to be all about EU banking, but I’m engaged in several conversations here in the states with major banks as well, which will all make for some great financial API storytelling over the next couple of months.


Key Points From The Payment Services Directive 2 (PSD2)

I’m immersed in studying the Payment Services Directive 2 (PSD2) in Europe, which includes an API definition to help enable the interoperability they are looking to achieve as part of the regulation. I’m working to break the directive down into bite-size chunks to help me digest, and understand exactly what it does. The PSD2 directive seeks to improve the existing EU rules for electronic payments (hence the 2), and takes into account emerging approaches to payment services, such as Internet and mobile payments, with APIs at the heart of it.

The directive sets out rules concerning:

  • strict security requirements for electronic payments and the protection of consumers’ financial data, guaranteeing safe authentication and reducing the risk of fraud
  • the transparency of conditions and information requirements for payment services
  • the rights and obligations of users and providers of payment services

Additionally, “the directive is complemented by Regulation (EU) 2015/751 which puts a cap on interchange fees charged between banks for card-based transactions. This is expected to drive down the costs for merchants in accepting consumer debit and credit cards.” Fees are one of the most frustrating aspects of banking today, where you have no expectations regarding what you will be charged around every turn, as you are just trying to make ends meet.

You will be seeing a lot more posts about PSD2 as I work to absorb the regulations, and the technical guidance set forth regarding banking APIs. I’m playing around with the OpenAPI definition for PSD2, and crafting a version of my API Transit subway map to represent the technical guidance present. I’m also working to understand the business and political aspects of PSD2, which involves me breaking down the directive into these small, digestible stories, here on API Evangelist.


AWS API Gateway OpenAPI Vendor Extensions

I was doing some work on the AWS API Gateway, and as I was going through their API documentation I found some of the OpenAPI vendor extensions they use as part of operations. These vendor extensions show up in the OpenAPI you export for any API, and reflect how AWS has extended the OpenAPI specification, making sure it does what they need it to do as part of AWS API Gateway operations.

AWS has 20 separate OpenAPI vendor extensions as part of the OpenAPI specification for any API you manage using their gateway solution:

  • x-amazon-apigateway-any-method - Specifies the Swagger Operation Object for the API Gateway catch-all ANY method in a Swagger Path Item Object. This object can exist alongside other Operation objects and will catch any HTTP method that was not explicitly declared.
  • x-amazon-apigateway-api-key-source - Specify the source to receive an API key to throttle API methods that require a key. This API-level property is a String type.
  • x-amazon-apigateway-authorizer - Defines a custom authorizer to be applied for authorization of method invocations in API Gateway. This object is an extended property of the Swagger Security Definitions object.
  • x-amazon-apigateway-authtype - Specify an optional customer-defined information describing a custom authorizer. It is used for API Gateway API import and export without functional impact.
  • x-amazon-apigateway-binary-media-types - Specifies the list of binary media types to be supported by API Gateway, such as application/octet-stream, image/jpeg, etc. This extension is a JSON Array.
  • x-amazon-apigateway-documentation - Defines the documentation parts to be imported into API Gateway. This object is a JSON object containing an array of the DocumentationPart instances.
  • x-amazon-apigateway-gateway-responses - Defines the gateway responses for an API as a string-to-GatewayResponse map of key-value pairs.
  • x-amazon-apigateway-gateway-responses.gatewayResponse - Defines a gateway response of a given response type, including the status code, any applicable response parameters, or response templates.
  • x-amazon-apigateway-gateway-responses.responseParameters - Defines a string-to-string map of key-value pairs to generate gateway response parameters from the incoming request parameters or using literal strings.
  • x-amazon-apigateway-gateway-responses.responseTemplates - Defines GatewayResponse mapping templates, as a string-to-string map of key-value pairs, for a given gateway response. For each key-value pair, the key is the content type; for example, “application/json”, and the value is a stringified mapping template for simple variable substitutions. A GatewayResponse mapping template is not processed by the Velocity Template Language (VTL) engine.
  • x-amazon-apigateway-integration - Specifies details of the backend integration used for this method. This extension is an extended property of the Swagger Operation object. The result is an API Gateway integration object.
  • x-amazon-apigateway-integration.requestTemplates - Specifies mapping templates for a request payload of the specified MIME types.
  • x-amazon-apigateway-integration.requestParameters - Specifies mappings from named method request parameters to integration request parameters. The method request parameters must be defined before being referenced.
  • x-amazon-apigateway-integration.responses - Defines the method’s responses and specifies parameter mappings or payload mappings from integration responses to method responses.
  • x-amazon-apigateway-integration.response - Defines a response and specifies parameter mappings or payload mappings from the integration response to the method response.
  • x-amazon-apigateway-integration.responseTemplates - Specifies mapping templates for a response payload of the specified MIME types.
  • x-amazon-apigateway-integration.responseParameters - Specifies mappings from integration method response parameters to method response parameters. Only the header and body types of the integration response parameters can be mapped to the header type of the method response.
  • x-amazon-apigateway-request-validator - Specifies a request validator, by referencing a request_validator_name of the x-amazon-apigateway-request-validators Object map, to enable request validation on the containing API or a method. The value of this extension is a JSON string.
  • x-amazon-apigateway-request-validators - Defines the supported request validators for the containing API as a map between a validator name and the associated request validation rules. This extension applies to an API.
  • x-amazon-apigateway-request-validators.requestValidator - Specifies the validation rules of a request validator as part of the x-amazon-apigateway-request-validators Object map definition.

I keep track of these vendor extensions as part of my OpenAPI toolbox, but I also like to aggregate them, and learn from them. They tell an important story of what AWS is looking to do with the AWS API Gateway. They point to some interesting use cases for the OpenAPI specification including validation, and transforming or mapping API requests and responses, to name a few. There is always a lot to learn from API providers who are extending the OpenAPI specification.
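
To give a sense of how I aggregate these, here is a quick sketch of scanning an exported API Gateway OpenAPI for vendor extensions. The file name is a placeholder for whatever you export from your own gateway.

    # A quick sketch of inventorying the vendor extensions in an exported API Gateway
    # OpenAPI definition. The file name is a placeholder for your own export.
    import json
    from collections import Counter

    def collect_vendor_extensions(node, found):
        """Recursively count any keys that begin with "x-", the OpenAPI vendor extension prefix."""
        if isinstance(node, dict):
            for key, value in node.items():
                if key.startswith("x-"):
                    found[key] += 1
                collect_vendor_extensions(value, found)
        elif isinstance(node, list):
            for item in node:
                collect_vendor_extensions(item, found)

    if __name__ == "__main__":
        with open("exported-api-gateway-openapi.json") as handle:
            definition = json.load(handle)

        extensions = Counter()
        collect_vendor_extensions(definition, extensions)

        for name, count in extensions.most_common():
            print(f"{name}: {count}")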

I encounter a number of API designers and architects who don’t know they can extend the specification. It is important that teams realize they can not only extend the specification to fit their needs, but also learn from how other API providers are doing this. A few signs of an API provider who is further along in their API journey are 1) actively maintaining and sharing an OpenAPI definition for their APIs, and 2) actively extending and sharing the vendor extensions they use to make OpenAPI do exactly what they need.


A Health Check Response Format for HTTP APIs

My friend Irakli Nadareishvili has published a new health check response format for HTTP APIs that I wanted to make sure was documented as part of my research. The way I do this is to write a blog post, forever sealing the work in time, and adding it to the public record that is my API Evangelist brain. Since I use my blog as a reference when writing white papers, guides, blueprints, policies, and other aspects of my work, I need as many references to usable standards like this as I can get.

I am going to just share the introduction from Irakli’s draft, as it says it all:

The vast majority of modern APIs driving data to web and mobile applications use HTTP [RFC7230] as a transport protocol. The health and uptime of these APIs determine availability of the applications themselves. In distributed systems built with a number of APIs, understanding the health status of the APIs and making corresponding decisions, for failover or circuit-breaking, are essential for providing highly available solutions. There exists a wide variety of operational software that relies on the ability to read health check response of APIs. There is currently no standard for the health check output response, however, so most applications either rely on the basic level of information included in HTTP status codes [RFC7231] or use task-specific formats. Usage of task-specific or application-specific formats creates significant challenges, disallowing any meaningful interoperability across different implementations and between different tooling. Standardizing a format for health checks can provide any of a number of benefits, including:

  • Flexible deployment - since operational tooling and API clients can rely on rich, uniform format, they can be safely combined and substituted as needed.
  • Evolvability - new APIs, conforming to the standard, can safely be introduced in any environment and ecosystem that also conforms to the same standard, without costly coordination and testing requirements.

This document defines a “health check” format using the JSON format [RFC7159] for APIs to use as a standard point for the health information they offer. Having a well-defined format for this purpose promotes good practice and tooling.

Here is an example JSON response, showing the standard in action:
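
I’m not copying the draft’s example verbatim here, but a minimal response, sketched quickly in Python, looks something like this. The required status field carries pass, warn, or fail, and the other field names are my best reading of the draft, so double check them against the spec before using them.

    # A minimal approximation of a health check response, based on my reading of the draft.
    # The required status field carries pass, warn, or fail; the other field names are my
    # best reading of the draft, not an authoritative copy of its example.
    import json

    health = {
        "status": "pass",              # pass, warn, or fail
        "version": "1",                # public version of the service
        "releaseId": "1.2.2",          # internal release identifier
        "description": "health of the example streaming service",
        "checks": {                    # optional detail on downstream dependencies
            "database:connections": [{"status": "pass"}],
            "upstream-api:responseTime": [{"status": "warn"}],
        },
    }

    print(json.dumps(health, indent=2))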

I have seen a number of different approaches to providing health checks in APIs, from a single ping path, to proxying the Docker Engine API for the Docker container behind a microservice. It makes sense to have a standard for this, and I’ll reference Irakli’s important work from here on out as I’m advising on projects, or implementing my own.


Developing a Microservice to Orchestrate Long Running Background Server-Sent Events

I am working to understand the value that Streamdata.io brings to the table, and one of the tools I am developing is a set of APIs to help me measure the difference in data received for normal API calls versus when they are proxied with Streamdata.io using Server-Sent Events (SSE) and JSON Patch. Creating an API to poll any 3rd party API I plug in is pretty easy and straightforward, but setting up a server to operate long running Server-Sent Events (SSE) connections, managing for failure, and keeping an eye on the results takes a little more consideration. Doing it browser side is easy, but doing it server side removes the human from the equation, the one who usually starts and stops the process.

This post is just meant to outline what I’m looking to build, and act as a set of project requirements for what I’m going to develop–it isn’t a guide to building it. This is just my way of working through my projects, while also getting content published on the blog ;-). I just need to work out the details of what I will need to run many different Server-Sent Events (SSE) jobs for long periods of time, or even continuously, and make sure nothing breaks, or at least minimize the breakages. Half of my project will be polling hundreds of APIs, while the other half of it will be proxying those same APIs, and making sure I’m receiving those updates continuously.

I will need some basic APIs to operate each event stream I want to operate:

  • Register - Register a new API URL I wish to run an ongoing stream on.
  • Start - Kick off a new stream for any single API I’m tracking on.
  • Stop - Stop a stream from running for any single API I have streaming.

Any API I deem worthy, and have successfully proxied with Streamdata.io, will be registered, and operated as a long running background script via the AWS EC2 instances I have deployed. This is the straightforward part of things. Next, I will need some APIs to monitor these long running scripts, to make sure they are doing what they should be doing.

  • Status - Check the status of a long running script to make sure it is still running and doing what it is supposed to do.
  • Logs - View the logs of an event that has been running to see each time it has executed, and what the request and response were.
  • Notify - A notification API to send a ping to either myself, or someone else responsible for a long running script, so they can investigate further.

I think that set of APIs should give me what I need to run these long running jobs. Each API will be executing command scripts that run in the background on Linux instances. Then I’m going to need a similar set of services to assess the payload, cache, and real time status of each API, keeping in line with my efforts to break down the value of real time APIs (a rough sketch of these two services follows the list).

  • Size - A service that processes each partial API response in the bucket and calculates the size of the response. If nothing changed, there was no JSON Patch response.
  • Change - A service that determines if a partial API response has changed from the previous response from 60 seconds before, identifying the frequency of change. If nothing changed, there was no JSON Patch response.
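
Here is a rough sketch of what those two services boil down to. The storage layout and file names are hypothetical, assuming each response, full or partial, gets written to its own sequentially numbered file.

    # A rough sketch of the size and change services. The storage layout is hypothetical,
    # assuming each response, full or partial, is written to its own numbered JSON file.
    import json
    import os

    def response_size(path):
        """Size of a stored (full or partial) API response in bytes."""
        return os.path.getsize(path)

    def has_changed(previous_path, current_path):
        """Compare two consecutive stored responses, captured 60 seconds apart."""
        with open(previous_path) as prev, open(current_path) as curr:
            return json.load(prev) != json.load(curr)

    if __name__ == "__main__":
        previous = "responses/example-api/000041.json"  # hypothetical storage layout
        current = "responses/example-api/000042.json"
        size = response_size(current)
        changed = has_changed(previous, current)
        print(size, "bytes,", "changed" if changed else "no change")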

I have three goals with this long running script microservice: 1) monitor the real time dimensions of a variety of APIs over time, 2) understand the efficiencies gained with caching and streaming over polling APIs, and 3) potentially store the results on Amazon S3, which I will write about in a separate post. I will build an application for each of these purposes on top of these APIs, keeping the microservice doing one thing–processing long running scripts that receive Server-Sent Events (SSE) delivered via the Streamdata.io proxies I’ve set up for APIs I’ve targeted.

Next, I am going to get to work programming this service. I have a proof of concept in place that will run the long running scripts. I just need to shape it into a set of APIs that allow me to program against the scripts, and deliver these different use case applications I’m envisioning. Once I am done, I will run it for a few months in beta, but then probably open it up as a Server-Sent Events (SSE) as a service offering, that allows anyone to execute long running scripts on the server side. Others may not be interested in measuring the performance gains, but I am guessing they will be interested in storing the streams of responses.
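
For reference, the heart of that proof of concept is little more than a loop that keeps an SSE connection open, parses the events, and reconnects when something fails. Here is a rough sketch of what that looks like in Python. The proxy URL is a placeholder, and the parsing is a simplified take on the SSE format, handling just the data lines.

    # A rough sketch of a long running, server side SSE consumer. The URL is a placeholder,
    # and the parsing below only handles the "data:" lines of the SSE format.
    import time

    import requests

    STREAM_URL = "https://example-proxy.streamdata.io/target-api"  # placeholder, not a real endpoint

    def consume(url):
        """Open the stream, yield each event's data payload, and reconnect on failure."""
        while True:
            try:
                with requests.get(url, stream=True, timeout=(10, None)) as response:
                    response.raise_for_status()
                    buffer = []
                    for line in response.iter_lines(decode_unicode=True):
                        if line and line.startswith("data:"):
                            buffer.append(line[len("data:"):].strip())
                        elif not line and buffer:
                            # A blank line marks the end of an event.
                            yield "\n".join(buffer)
                            buffer = []
            except requests.RequestException as error:
                # Log the failure and back off before reconnecting, so one hiccup does not kill the job.
                print(f"stream error: {error}; reconnecting in 10 seconds")
                time.sleep(10)

    if __name__ == "__main__":
        for event in consume(STREAM_URL):
            # In the real service each event would be logged and stored for the size and change services.
            print(event)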


Docker Engine API Has OpenAPI Download At Top Of Their API Docs

I am a big fan of API providers taking ownership of their OpenAPI definition, which enables API consumers to download a complete OpenAPI, import it into client tooling like Postman, use it to generate client SDKs, and get up to speed regarding the surface area of an API. This is why I like to showcase API providers I come across who do this well, and occasionally shame API providers who don’t do it, and who demonstrate to their consumers that they don’t really understand what OpenAPI definitions are all about.

This week I am showcasing an API provider who does it well. I was on the hunt for an OpenAPI of the Docker Engine API, for use in a project I am consulting on, and was pleased to find that they have a button to download the OpenAPI for each version of the Docker Engine API right at the top of the page. That makes it dead simple for me, as an API consumer, to get up and running with the Docker API in my tooling. OpenAPI is about much more than just API documentation, and should be a first class companion to ALL API documentation for EVERY API provider out there–whether or not you are a devout OpenAPI (fka Swagger) believer.

The Docker API team just saved me a significant amount of time in tracking down another OpenAPI, which most likely would have been incomplete, let alone the amount of work that would have been required to hand-craft one for my project. I was able to take the existing OpenAPI and publish it to the team Github Wiki for a project I’m advising on. The team will be able to use the OpenAPI to import into their Postman client and begin to learn about the Docker API, which will be used to orchestrate the containers they are using to operate their own microservices. A subset of this team will also be crafting some APIs that proxy the Docker API, and allow for localized management of each microservice’s underlying engine.
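
To show how quickly a published OpenAPI translates into something usable, here is a rough sketch of pulling down the definition and listing the surface area of the API. The URL is the kind of link the download button points to, so grab the current version from the Docker docs rather than trusting my example.

    # A quick sketch of putting the downloaded OpenAPI to work: pull the definition and
    # list the surface area of the API. Grab the current YAML link from the Docker docs,
    # the version below is just the one I happened to be working with.
    import requests
    import yaml  # pip install pyyaml

    OPENAPI_URL = "https://docs.docker.com/engine/api/v1.37.yaml"  # check the docs for the current link

    definition = yaml.safe_load(requests.get(OPENAPI_URL, timeout=30).text)

    print(definition["info"]["title"], definition["info"]["version"])
    for path, operations in definition["paths"].items():
        for method in operations:
            if method in ("get", "post", "put", "delete", "head", "patch", "options"):
                print(method.upper().ljust(7), path)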

I had to create the Consul OpenAPI for the team last week, which took me a couple hours, so I was pleased to see Docker taking ownership of their OpenAPI. This is a drum I will keep beating here on the blog, until EVERY API provider takes ownership of their OpenAPI definition, providing their consumers with a machine readable definition of their API. OpenAPI is much more than just API documentation, and is essential to making sense of what an API does, then taking that knowledge and quickly translating it into actual integration, in as short a time as possible. Don’t make integrating with your API difficult. Reduce as much friction as possible, and publish an OpenAPI alongside your API documentation like Docker does.


Five APIs to Guide You on Your Way to the Data Dark Side

I was integrating with the Clearbit API, doing some enrichment of the API providers I track on, and I found their API stack pretty interesting. I’m just using the enrichment API, which allows me to pass it a URL, and it gives me back a bunch of intelligence on the organization behind it. I’ve added a bookmarklet to my browser, which allows me to click it, and the enriched data goes directly into my CRM system. Delivering what the title says it does–enrichment.
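
For a sense of how simple the enrichment piece is, here is a rough sketch of the domain-to-company lookup I’m describing. Check Clearbit’s documentation for the current endpoint and response fields, and the API key is obviously a placeholder.

    # A rough sketch of the domain-to-company enrichment lookup. Check Clearbit's
    # documentation for the current endpoint and response fields, and swap in a real key.
    import requests

    CLEARBIT_KEY = "sk_your_key_here"  # placeholder

    def enrich_company(domain):
        """Look up a company profile by domain name."""
        response = requests.get(
            "https://company.clearbit.com/v2/companies/find",
            params={"domain": domain},
            headers={"Authorization": f"Bearer {CLEARBIT_KEY}"},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    if __name__ == "__main__":
        company = enrich_company("streamdata.io")
        # The exact fields depend on what Clearbit returns, these are just a few I rely on.
        for field in ("name", "description", "logo"):
            print(field, ":", company.get(field))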

Next up, I’m going to be using the Clearbit Discovery API to find some potentially new companies who are doing APIs in specific industries. As I head over to the docs for the API, I notice the other three APIs, and I feel like the full set reflects the five stages of transition to the data intelligence dark side.

  • Enrichment API - The Enrichment API lets you look up person and company data based on an email or domain. For example, you could retrieve a person’s name, location and social handles from an email. Or you could lookup a company’s location, headcount or logo based on their domain name.
  • Discovery API - The Discovery API lets you search for companies via specific criteria. For example, you could search for all companies with a specific funding, that use a certain technology, or that are similar to your existing customers.
  • Prospector API - The Prospector API lets you fetch contacts and emails associated with a company, employment role, seniority, and job title.
  • Risk API - The Risk API takes an email and IP and calculates an associated risk score. This is especially useful for figuring out whether incoming signups to your service are spam or legitimate, or whether a payment has a high chargeback risk.
  • Reveal API - Reveal API takes an IP address, and returns the company associated with that IP. This is especially useful for de-anonymizing traffic on your website, analytics, and customizing landing pages for specific company verticals.

Your journey to the dark side begins innocently enough. You just want to know more about a handful of companies, and the data provided is a real time saver! Then you begin discovering new things, finding some amazing new companies, products, services, and insights. You are addicted. You begin prospecting full time, and actively working to find your latest fix. Then you begin to get paranoid, worried you can’t trust anyone. I mean, if everyone is behaving like you, then you have to be on your guard. That visitor to your website might be your competitor, or worse! Who is it? I need to know everyone who comes to my site. Then in the darkest depths of your binges you are using the reveal API and surveilling all your users. You’ve crossed to the dark side. Your journey is complete.

Remember kids, this is all a very slippery slope. With great power comes great responsibility. One day you are a scrappy little startup, and the next you’re the fucking NSA. In all seriousness, I think their data intelligence stack is interesting. I do use the enrichment API, and will be using the discovery API. However, we do have to ask ourselves, do we want to be surveilling all our users and visitors? Do we want to be surveilled on every site we visit, and in every application we use? At some point we have to stop and check how far towards the dark side we’ve gone, and ask ourselves, is this all really worth it?

P.S. This story reminds me I totally flaked on delivering a white paper to Clearbit on the topic of risk. Last year was difficult for me, and I got swamped….sorry guys. Maybe I’ll pick up the topic and send something your way. It is an interesting one, and I hope to have time at some point.


API Management and the Measurement of Value Exchanged

The concept of API management has been around for a decade now, and is something that is now part of the fabric of the cloud with services like AWS API Gateway. API management is about requiring all consumers of any API resource to sign up for access, obtain a set of keys that identify who they are, and pass these keys in with each API call they make from any application. Since every API call is logged, and every API call possesses these keys, it opens up the ability to understand exactly how your APIs are being used through the reporting and analytics packages which come with all modern API management solutions available today. These are the fundamentals of API management, allowing us to understand who is accessing our digital resources, and how they are putting them to use–in real time.

The security of API management comes from this balance of opening up access, being aware of who is accessing what, and being able to throttle or shut down the access of bad actors–more than it ever comes from authentication, and requiring keys for all API calls. If you want access to any digital resources from a company, organization, institution, or government agency in a machine readable format, for use in any other web, mobile, or device application–you use the API. This allows ALL digital assets to be made available internally, to trusted partners, and even to the public, while still maintaining control over who has access to what, and what types of applications they are able to use them in. APIs introduce more control over our digital resources, not less–which is a persistent myth when it comes to web APIs.

API management allows us to limit who has access to which APIs by putting each API into one or many “plans”. The concept of software as a service (SaaS) has dominated the discussion around plan access, establishing tiers of API access such as free, pro, and enterprise, or maybe bronze, silver, and platinum. These give users different levels of access to APIs, often depending on how much they are paying, or possibly depending upon the level of trust (i.e. partners get access to more APIs, as well as higher rate limits). While this approach still persists, much of what we see lacks imagination, due to the rules of the road dictated by the many API startup VC investors pulling the strings behind the scenes. When crafting API access plans you want to incentivize consumer behavior, but honestly, startup culture and VC dominance have limited API providers’ vision, and stagnated many of the conversations around what is possible with API management after a decade.

When you are trying to scale a startup fast you need a narrow offering of API products. If you are operating a “real” business, organization, institution, or government agency, your API plans will look much different, and you shouldn’t be emulating the startup world. We should be more concerned with value exchange between internal groups, with our partners, and potentially with 3rd party public developers. We should create API plans that incentivize, but do not limit, access, encouraging developers to innovate, while still generating sensible revenue around the digital resources we are making available. The reason many existing companies are struggling with their API programs is they are emulating startups, and not thinking bigger than the table Silicon Valley has set for us. API management is about measuring and rewarding value exchange, not rapidly growing your company so you can inflate your numbers, and sell your business to the highest bidder. That is so 2012!

When you see API management being used to measure value exchange to its fullest, you see all the HTTP verbs being used, and measured. Providers aren’t just providing GETs, measuring and charging for access. They are allowing for POST, PUT, and DELETE, and measuring that as well. If they are really progressive, they reward internal groups, partners, and 3rd party developers for the POSTs, PUTs, and DELETEs made. Incentivizing and then rewarding for desired behavior around valuable resources. POSTed a blog post? We’ll pay you $50.00. PUT 100 records, cleaning up addresses? We’ll pay you 50 cents per update. All you do is GET, GET, GET? Well, we’ll charge you accordingly, but if you also POST and PUT, we’ll charge you a lower rate for your GETs, as well as apply credits to your account. API management shouldn’t be about three plans and generating revenue, it should be about measuring the value exchange around ALL the data, content, media, and algorithmic exchanges that occur on a daily basis.
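
To make that math concrete, here is a rough sketch of what verb-aware billing could look like on top of API management logs, using the made up rates from the paragraph above. None of this reflects any actual API management product, it is just the shape of the calculation.

    # A rough sketch of verb-aware billing at the API management layer. Every rate and
    # rule here is illustrative, using the made up numbers from the paragraph above.
    GET_RATE = 0.002             # charge per GET, in dollars
    DISCOUNTED_GET_RATE = 0.001  # lower GET rate for consumers who also contribute
    POST_CREDIT = 50.00          # credit for an approved POST (e.g. a blog post)
    PUT_CREDIT = 0.50            # credit per approved PUT (e.g. a cleaned up record)

    def monthly_invoice(usage):
        """usage is a dict of HTTP verb to call count, pulled from API management logs."""
        contributes = usage.get("POST", 0) > 0 or usage.get("PUT", 0) > 0
        get_rate = DISCOUNTED_GET_RATE if contributes else GET_RATE
        charges = usage.get("GET", 0) * get_rate
        credits = usage.get("POST", 0) * POST_CREDIT + usage.get("PUT", 0) * PUT_CREDIT
        return {"charges": round(charges, 2),
                "credits": round(credits, 2),
                "balance_due": round(max(charges - credits, 0.0), 2)}

    if __name__ == "__main__":
        # A consumer who only GETs pays full price, while one who also POSTs and PUTs
        # earns credits and a lower GET rate.
        print(monthly_invoice({"GET": 100000}))
        print(monthly_invoice({"GET": 100000, "PUT": 100, "POST": 1}))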

API management has done amazing things for allowing companies, organizations, institutions, and government agencies to develop an awareness of who is accessing their resources. Look at what the Census has done with their API, what Capital One is doing with their API program, and what Oxford Dictionaries is doing with theirs. Notice I didn’t mention any startups, or tech rockstars? API management isn’t just about revenue generation. Don’t let the limited imagination of the startup space dictate otherwise. Now that API management is part of the fabric of the cloud, let’s begin to realize its full potential. Let’s not restrict ourselves to just a handful of plans. Let’s use it to broadly measure the value exchange around all the digital resources we are publishing to the web, making them available internally, to our partners, and to the public in a machine readable way, so that they can be used in any web, mobile, or device application.


API Transit Basics: Deprecation

This is a series of stories I’m doing as part of my API Transit work, trying to map out a simple journey that some of my clients can take to rethink some of the basics of their API strategy. I’m using a subway map visual and experience to help map out the journey, which I’m calling API transit–leveraging the verb form of transit to describe what every API should go through.

This is a simple one. All APIs will eventually need to be deprecated. This is how you avoid legacy systems that have been up for decades. Make sure the life span of each service is discussed as part of its conception, and put some details out there about the expected timeline for its existence. Even if this remains an unknown, at least you thought about it, and hopefully discussed it with others.

Here are just a few of the common building blocks I’m seeing with API operations that respect their users enough to plan for API deprecation:

  • Releases - Have a set release schedule, and think about what will be deprecated along with each release, allowing for future planning with each push.
  • Schedule - Have a deprecation schedule set for each API. You can always extend, or keep versions of your API beyond the date, but at least set a minimum schedule.
  • Communication - Make sure you have a communication strategy around deprecations. Post to the blog, Tweet out notices, and send emails.
  • The Sunset HTTP Header - This specification defines the Sunset HTTP response header field, which indicates that a URI is likely to become unresponsive at a specified point in the future (a minimal consumer-side sketch of honoring it follows this list).
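
Here is a minimal sketch of what honoring that header looks like from the consumer side, using a placeholder endpoint. It simply warns when a provider has announced a deprecation date.

    # A minimal sketch of honoring the Sunset header from the consumer side. The endpoint
    # is a placeholder, and this only warns when a deprecation date has been announced.
    from datetime import datetime, timezone
    from email.utils import parsedate_to_datetime

    import requests

    response = requests.get("https://api.example.com/v1/widgets", timeout=30)  # placeholder endpoint
    sunset = response.headers.get("Sunset")

    if sunset:
        deprecation_date = parsedate_to_datetime(sunset)
        days_left = (deprecation_date - datetime.now(timezone.utc)).days
        print(f"This API is scheduled to go away on {deprecation_date:%Y-%m-%d} ({days_left} days from now).")
    else:
        print("No Sunset header, so no deprecation date has been communicated for this API.")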

Another valuable concept this process will introduce is the possibility that APIs can be ephemeral, and maybe only exist for days, weeks, or months. With CI/CD cycles allowing for daily, weekly, and monthly code pushes, there is no reason that APIs can’t evolve rapidly, and deprecate just as fast. Make sure deprecation is always discussed, and thought about in the context of other legacy systems, and the technical debt that exists at the organization.

API deprecation is inevitable. We might as well start planning for it from day one. Every API definition upon inception should have an API deprecation target date, 12 months, 18 months, or whatever your time frame is. You may have future versions of the API in place, and in some cases extend the life of an API, but having a deprecation strategy shows you are thinking about the future, considering change, as well as considering the impact on your consumers.


Breaking Down The Value Of Real Time APIs

I am working to evolve an algorithm for Streamdata.io that helps measure the benefits of their streaming service. There are a couple layers to what they offer as a company, but as I dive into this algorithm, there are also multiple dimensions to what we all perceive as real time, and, adding more complexity to the discussion, it is something that can shift significantly from industry to industry. The Streamdata.io team was working to productize this algorithm for quantifying the value their service delivers, but I wanted to take some time to break it down, lay it out on the workbench, and think about it before they moved too far down this path.

Ok. To help me get my brain going, I wanted to work my way through the dictionary sites, exploring what real time actually is. Real time often seems to describe a human rather than a machine sense of time. It is about communicating, showing, or presenting something at the time it actually happens, where there is no notable delay between the action and its effect or consequence. All of this is relative to the human receiving the real time event, as well as to defining exactly when something truly happens or happened. Real time in banking is different than real time in stock trading, and will be different than in media. Each requires its own perception of what is real time, and what the effects, consequences, and perceptions are.

When it comes to the delivery or streaming of real time events, it isn’t just about delivering the event, message, or transaction. It is about doing it efficiently. The value of real time gets ruined pretty quickly when you have to wade through too much information, or you are given too many updates of events, messages, and transactions that are not relevant. That adds an efficiency element to the concept of what is real time. Real time, streaming updates of EVERYTHING are not as meaningful as streaming updates of only what just happened, staying truer to the concept of real time, in my opinion. This makes the caching and JSON Patch aspects of what Streamdata.io does relevant to delivering a true real time experience–you only get what has changed in real time, not everything else.

To help me break down the algorithm for measuring the value delivered by Streamdata.io, I’ve started by creating three simple APIs.

  • Poll API - A service for polling any API I give it. I can adjust the settings, but the default is that it polls it every 60 seconds, until I tell it to stop. Storing every response on a private Amazon S3 bucket.
  • API Size - A service that processes each API response in the bucket and calculates the size of the response.
  • API Change - A service that determines if an API response has changed from the previous response from 60 seconds before, identifying the frequency of change.

This gives me a baseline of information I need to set the stage for what is real time. I am trying to understand what changes, and potentially what the value is of precise updates, rather than sending everything over the wire with each API response. After I set this process into motion for each API, I have another set of APIs for turning on the Streamdata.io portion, which reflects the other side of the coin.

  • Stream API - This service proxies an API with Streamdata.io and begins to send updates every 60 seconds. Similar to the previous set of services, I am storing the initial request, as well as every incremental update on Amazon S3.
  • API Size - A service that processes each partial API response in the bucket and calculates the size of the response. If nothing changed, there was no JSON Patch response.
  • API Change - A service that determines if a partial API response has changed from the previous response from 60 seconds before, identifying the frequency of change. If nothing changed, there was no JSON Patch response.

This gives me all the raw data I need to calculate the value which Streamdata.io delivers for any single API. However, it also gives me the raw data I need to begin calculating what is real time, and the value of it. We are tagging APIs that we catalog, allowing us to break down by common areas like finance, banking, media, transit, etc. This will allow us to start looking at how often things change within different sectors, and begin to look at how we can measure the value brought to the table when events, messages, and transactions are efficiently delivered in real time.

I am going to build myself a dashboard to help me work with this data. I need to look at it for a couple of months, and run a number of different APIs through, until I know what dimensions I want to add next. I’m guessing I’m going to want some sort of freshness score, to see if something really truly is a new event, message, or transaction, or is possibly being recirculated, duplicated, or some other anti-pattern. IDK. I’m guessing there are a number of new questions I will have about this data before I will truly be able to feel comfortable that the algorithm defines a meaningful vision of real time. Right now the algorithm sets out to compute a couple of meaningful efficiency gains (a rough sketch of the calculation follows the list below).

  • Client Bandwidth (BW) Savings - What efficiencies are realized when working with data in the client.
  • Server Bandwidth (BW) Savings - What efficiencies are realized in bandwidth, as data is transmitted.
  • Server CPU Savings - What efficiencies are realized on the server in CPU savings.
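
Here is a rough sketch of the bandwidth piece of that calculation, using made up numbers. In practice the sizes come from the stored poll responses and the incremental JSON Patch payloads described above.

    # A rough sketch of the bandwidth savings calculation. The numbers are made up, and in
    # practice the sizes come from the stored poll responses and the JSON Patch payloads.
    def bandwidth_savings(polled_sizes, patch_sizes):
        """Compare total bytes moved by polling versus an initial payload plus incremental patches."""
        polled_total = sum(polled_sizes)
        streamed_total = polled_sizes[0] + sum(patch_sizes)  # the first response is always sent in full
        return 1 - (streamed_total / polled_total)

    if __name__ == "__main__":
        # Sixty minutes of polling a 250 KB response every 60 seconds, versus one full
        # response followed by small (or empty) JSON Patch updates for the other 59 intervals.
        polls = [250_000] * 60
        patches = [1_200, 0, 0, 3_400, 0, 800] + [0] * 53
        print(f"bandwidth savings: {bandwidth_savings(polls, patches):.1%}")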

You can see this calculated for the Washington Metropolitan Area Transit Authority (WMATA) Data APIs in a story I wrote last year. I want to be able to calculate these efficiency gains, but I want to be able to do it over time, and begin to try and understand the real time dimension of the savings, not just what is introduced through caching. These three calculations speak to the caching aspect of what Streamdata.io delivers, not the real time benefits of the service, which won’t be as straightforward to quantify, but I want to give it a try regardless.

I want the algorithm to measure these efficiency gains, but I also want to be able to capture the real time value of an API, both quantifying the real time value delivered by the API itself, as well as the real time value delivered by Streamdata.io–establishing a combined real time ranking. This moves the algorithm into territory where it isn’t just describing the value delivered by Streamdata.io to their clients, but also quantifying the value delivered by the combination of the client’s API and Streamdata.io working together. This is where I think things will start to get interesting, especially as we begin to move Streamdata.io services and our algorithm into new industries, adding new dimensions and perceptions to the discussion.

This is when things will start to get interesting, I feel. By the time we get to this point, the tagging structure I will have applied to different APIs will have evolved, and become more precise as well, allowing me to further refine the algorithm to apply a real time value ranking to specific streams within a single provider, or even in aggregate across providers. That will allow consumers to subscribe to the most precise real time streams of events, messages, and transactions, cutting out the noise and redundancy. This is when we can move things beyond just large volumes of data in real time, to precise volumes of data in real time. Consuming only what is needed, training our machine learning models on exactly what is required, and keeping them updated in real time, allowing us to deliver real time artificial intelligence streams that are updated by the second, or minute, producing the most relevant models possible.

