The API Evangelist Blog
This blog represents the thoughts I have while I'm research the world of APIs. I share what I'm working each week, and publish daily insights on a wide range of topics from design to depcration, and spanning the technology, business, and politics of APIs. All of this runs on Github, so if you see a mistake, you can either fix by submitting a pull request, or let me know by submitting a Github issue for the repository.
It Isn't Just That You Have A PDF For Your API Docs, It Is Because It Demonstrates That You Do Not Use Other APIs13 Aug 2018
I look at a lot of APIs. I can tell a lot about a company, and the people behind an API from looking at their developer portal, documentation, and other building blocks of their presence. One of the more egregious sins I feel an API provider can make when operating their API is publishing their API documentation as a PDF. This is something that was acceptable up until about 2006, but over a decade after it shows that the organization behind an API hasn’t done their homework.
The crime really isn’t the fact that an API provider is using a PDF for their documentation. I’m fine with API providers publishing a PDF version of their API, to provide a portable version of it. Where a PDF version of the documentation becomes a problem is when it is the primary version of the documentation, which demonstrates that the creators don’t get out much and haven’t used many other APIs. If an API team has done their homework, actually put other 3rd party APIs to work, they would know that PDF documentation for APIs is not the norm out in the real world.
One of the strongest characteristics an API provider can possess, is an awareness of what other API providers are doing. The leading API providers demonstrate that they’ve used other APIs, and are aware of what API consumers in the mainstream are used to. Most mainstream API consumers will simply close the tab when they encounter an API that has a PDF document for their API. Unless you have some sort of mandate to use that particular API, you are going to look elsewhere. If an API provider isn’t up to speed on what the norms are for API documentation, and are outwardly facing, the chance they’ll actively support their API is always diminished.
PDF API documentation may not seem like too big of a mistake to many enterprise, institutional, and government API providers, but it demonstrates much more than just a static representation of what an API can do. It represents an isolated, self-contained, non-interactive view of what an API can do. It reflects an API platform that is self-centered, and not really concerned with the outside world. Which often means it is an API platform that won’t always care about you, the API consumer. APIs in the age of the web are all about having an externalized view of the world, and understanding how to play nicely with large groups of developers outside of your firewall–when you publish a PDF version of your API docs, you demonstrate that you don’t get out much, and aren’t concerned with the outside world.
I spend time reviewing each wave of data API marketplaces as they emerge on the landscape every couple of years. There are a number of reasons why these data marketplaces exist, ranging from supporting government agencies, NGOs, or for commercial purposes. One of the most common elements of API-driven data marketplaces that frustrates me is when they don’t do the hard work to expose the meta data around the databases, datasets, spreadsheets, and the raw data they are providing access to–making it very difficult to actually discover anything of interest.
You can see a couple examples of this with mLab, World Health Organization, Data.World, and others. While these platforms provide (sometimes) impressive ability to manage data stores, but they don’t always do a good job exposing the meta data of their catalogs as part of the available APIs. Dynamically generating API endpoints, documentation, and other resources based upon the data that is being published to their platforms. Leaving developers to do the digging, and making the investment to understand what is available on a platform.
Some of the platforms I encounter obfuscate their data metadata on purpose, requiring developers to qualified before they get access to valuable resources. Most I think, just do not put themselves in the position of an API consumer who lands on their developer page, and doesn’t know anything about an API. They understand the database, and the API, so it all makes sense to them, and they don’t have any empathy for anyone else who isn’t in the know. Which is a common trait of database centered people who speak in acronyms, and schema that they assume other people know, and do not spend much time thinking outside of that bubble.
I could make a career out of deploying APIs on top of other data marketplace APIs. Autogenerating a more accessible, indexable, intuitive layer on top of what they’ve already deployed. I regularly find a wealth of data that is accessible through an API interface, but will most likely never be found by anyone. Before most developers will ever make the investment to onboard with an API, they need to understand what valuable resources are available. I can imagine many developers stumble across these data marketplaces, spend about 15 minutes looking around, maybe sign up for a key, but then give up because of the overhead involved with actually understanding what data is actually available.
I’m deploying three new APIs right now, using a new experimental serverless approach I’m evolving. One is a location API, another providing API access to companies, and the third involves working with patents. I will be evolving these three simple web APIs to meet the specific needs of some applications I’m building, but then I will also be selling retail and wholesale access to each API once they’ve matured enough. With all three APIs of these APIs, I began with a simple JSON schema from the data source, which I used to generate three rough OpenAPI definitions that will acts the contract seed for my three services.
Once I had three separate OpenAPI contracts for the services I was delivering, I wanted to spend some time hand designing each of the APIs before I imported into AWS API Gateway, generating Lambda functions, loading in Postman, and used to support other stops along the API lifecycle. I still use a localized version of Swagger Editor for my OpenAPI design space, but I’m working to migrate to OpenAPI-GUI as soon as I can. I still very much enjoy the side by side design experience in Swagger Editor, but I want to push forward the GUI side of the conversation, while still retaining quick access to the RAW OpenAPI for editing.
One of the reasons why I still use Swagger Editor is because of the schema validation it does behind the scenes. Which is one of the reasons I need to learn more about Speccy, as it is going to help me decouple validation from my editor, and all me to use it as part of my wider governance strategy, not just at design time. However, for now I am highly dependent on my OpenAPI editor helping me standardize and stabilize my OpenAPI definitions, before I use them along other stops along the API lifecycle. These three APIs I’m developing are going straight to deployment, because they are simple datasets, where I’m the only consumer (for now), but I still need to make sure my API contract is solid before I move to other stops along the API lifecycle.
Right now, loading up an OpenAPI in Swagger Editor is the best sanity check I have. Not just making sure everything validates, but also making sure it is all coherent, and renders into something that will make sense to anyone reviewing the contract. Once I’ve spend some time polishing the rough corners of an OpenAPI, adding summary, descriptions, tags, and other detail, I feel like I can begin using to generate mocks, deploy in a gateway, and begin managing the access to each API, as well as the documentation, testing, monitoring, and other stops using the OpenAPI contract. Making this manual stop in the evolution of my APIs a pretty critical one for helping me stabilize each API’s definition before I move on. Eventually, I’d like to automate the validation and governance of my APIs at scale, but for now I’m happy just getting a handle on it as part of this API design stop along my life cycle.
We are kicking it into overdrive now that the schedule is up for APIStrat in Nashville, TN this September 24th through 26th. From now until the event at the end of September you are going to hear me talk about all the amazing speakers we have, the companies they work for, and the interesting things they are all doing with APIs. One of the perks of being a speaker or a sponsor at APIStrat–you get coverage on API Evangelist, a become part of the buzz around the 9th edition of the API Strategy & Practice Conference (APIStrat), now operated by the OpenAPI Initiative (OAI) and the Linux Foundation.
Today’s post is about my friends over at the digital solutions and API management agency Vanick Digital. With APIStrat coming to their backyard, and their ability to capture the attention of the APIStrat program committee, Vanick Digital has two separate talks this year:
Securing the Full API Stack by Patrick Chipman - APIs open up new channels for sharing and consuming data, but whenever you open a new channel, new security risks emerge. Additionally, APIs often involve a variety of new components, such as API gateways, in-memory databases, edge caches, facade layers, and microservice-aligned data stores that can complicate the security landscape. How and where do you apply the right controls to ensure your API and your data are secure? In this session, we’ll answer that question by identifying the different components commonly used in the delivery of API products. For each layer, we’ll discuss the security risks that can and should be mitigated there, along with best practice approaches (including ABAC, OAuth2, and more) to implement those mitigations.
What Do You Mean By “API as a Product”? by Lou Powell - You may have heard the term “API Product.” But what does it mean? In this talk I will introduce the concept and explain the benefits and challenges of transforming your organization to view your APIs as measurable products that expose your companies capabilities, creating agility, autonomy, and acceleration. Traditional product manufacturers create new product and launch them into the marketplace and then measure value; we will teach you to view your APIs in the same way. Concepts covered in this presentation will be designing APIs with Design Thinking, funding your product, building teams, marketing your API, managing your marketplace, and measuring success.
Showcasing their skills as an API focused agency, by bringing it to the stage at APIStrat–smart! I am currently working with their team to understand how API Evangelist and Vanick Digital can work more closely together on projects. Helping me support the customers I’m reaching with my storytelling and workshops, delivering, scaling, and managing the day to day details I don’t have the time to provide for my customers. So it makes me happy to see them at APIStrat, sharing their wisdom, and demonstrating what they are capable of. If you are under resourced like many API providers are, I recommend coming to APIStrat and meeting with the team, or if it can’t wait until September, feel free to reach out directly–just let them know where you found them.
APIStrat is seven weeks away, so make sure you get registered. The workshop, session, and keynote lineup is locked up, but we still have a handful of sponsorship opportunities available. You can find the sponsorship prospectus on the web site, or feel free to contact me directly and I’ll get you plugged in with the events team. Make sure you don’t miss out on an opportunity to be part of this ongoing API conversation that we’ve kept going since 2013–where API developers, architects, designers, and API business leaders, evangelists, advocates, and the API curious gather to discuss where the API space is headed. Now that APIStrat is operated by the OpenAPI Initiative it makes it the place to be if you want to contribute to the road map for the OpenAPI specification, and influence the direction of the API specification. No matter how you choose to get involved, we look forward to seeing you all in Nashville next month!
I Am Speaking In Washington D.C. At The Blue Button 2.0 Developer Conference On The API Life Cycle This Monday07 Aug 2018
I’m heading to Washington D.C. this Monday to speak on the API life cycle as part of the Blue Button 2.0 Developer Conference. We’ll be coming together in the Eisenhower Executive Office Building, within the west wing complex of the White House, to better understand how we can, “bring together developers to learn and share insights on how we can leverage claims data to serve the Medicare population.”
The gathering will hear from CMS Administrator Seema Verma and other Administrator Leadership about Blue Button 2.0 and the MyHealthEData initiative, while also hosting a series of break sessions, which I’m part of:
- Blue Button 2.0 and FHIR (where it’s all heading) with Mark Scrimshire and Cat Greim
- MyHealthEData and Interoperability with Alex Mugge and Joy Day
- Overview of Medicare Claims Data with Karl Davis
- Medicare Beneficiary User Research with Allyssa Allen
- Sync for Science with Josh Mandel and Andrew Bjonnes
- API Design with Kin Lane
Registration for the gathering is now closed, but if you are a federal govy, I’m sure you can find someone to get you in. I’m looking forward to seeing the CMS, HHS, and USDS folks again, as they are doing some amazing stuff with the Blue Button API, as well as hang out with some of the VA people I know will be there. The Blue Button API is one of the more important API blueprints we have out there in the healthcare space, as well as the federal government. I’ve been a champion on Blue Button since I contributed to the project back when I worked in DC back in 2013, and will continue to invest in its success in coming years.
In my session I will be covering my API lifecycle and governance research as the API Evangelist, but I’m eager to talk with more folks involved with the Blue Button API about what is next, and better understand where HL7 FHIR is headed, while also developing my awareness of who is actively participating in the Blue Button API community. I’ll be in DC late Sunday night, through Monday, and I’m back to west coast on Tuesday AM. If you are around I’d love to connect, and if you want to tune in, I believe there will be a live stream of the event on the Blue Button API portal.
I am publishing a new API for locations. I am tired of needing some of the same location based resources across projects, and not having a simple, standardized API I can depend on. So I got to work finding the most accurate and complete data set I could find of cities, regions, and countries. I settled on using the complete, and easy to use countries-regions-cities project by David Graham–providing a straightforward SQL script I can use as the seed for my locations API database.
After crafting an API for this database using AWS API Gateway and Lambda, and working my way down my API checklist, it occurred to me that I wanted to include David Graham’s work as one of the project dependencies. Giving him attribution, while honestly acknowledging my project’s dependency on the data he provided. I’m working hard to include all dependencies within each of the microservices that I’m publishing, being mindful of every data, code, and human dependency that exists behind each service I deliver. Even if I don’t rely on regular updates from them, I still want to acknowledge their contribution, and consider attribution as one layer of my API dependency discussion.
Having a dependency section of my API checklist has helped me evolve how I think about defining the dependencies my services have. I initially began tracking all other services that my microservices were dependent on, but then I quickly began adding details about the other software, data, and people the service depends on as well. I’m also pulling together a machine readable definition for tracking on my microservice dependencies. It will be something I include in the API discovery (APIs.json) document for each service, alongside the OpenAPI, and other specifications. Allowing me to track on the dependencies (and attribution) for all of my APIs, and API related artifacts that I am producing on a regular basis. Providing data provenance for each of my services, documenting the origins of all the data I’m using across my services, and making accessible via an API.
For me, having the data provenance behind each service provides me with a nice clean inventory of all my suppliers. Understanding the data, services, open source code, and people I depend on to deliver a service is important to helping me make sense of my operations. For the people behind the data, services, and open source code I depend on it helps provide attribution, and showcase their valuable contribution to the services I offer. For partner and 3rd party consumers of my services, being observable about the dependencies that exist behind a service they are depending on, helps them make much more educated decisions around which services they put to work, and bake into their applications and systems. In the end, everyone is better off if I invest in data provenance as part of my wider API dependency efforts.
I’ve been having regular meetings with the SAPI API team lately, talking through their presence in the API space, and throwing out ideas for what the future might hold. This isn’t a paid engagement, it is just something I’m interested in investing in on my own, but like many other API service providers in the space, we are exploring what partnership opportunities might there might be. Last week, I wrote about their API Business Hub, which I’ll keep exploring, but first I wanted to address perceptions in the industry that SAP is a little late to the game, when it comes to going all in on APIs.
For me, the SAP API journey begins in 2010, when I left my role as VP of Technology at WebEvents Global, who runs all of SAP Events. By 2010, I had found success in my role by scaling up infrastructure using the AWS cloud, which was orchestrated using APIs, and I was beginning to get a taste of where things were going when it came to delivering digital resources to mobile applications using APIs. I helped lead the technology around SAPPHIRE, SAP’s flagship conference, as well as many other lesser events, and meetings. I had also gotten in trouble by SAP IT leadership for using the cloud, which they labeled a “hobby toy”, and told me I shouldn’t be using. However, I was delivering applications more quickly and cost effectively, so my leadership really couldn’t dismiss my usage of the cloud, and more specifically web APIs. Convincing me that web APIs were going to be more successful than the current web services strategy I was working under.
After seeing the success I was having using APIs to deliver SAP events, and better understanding the potential for delivering global infrastructure in the cloud, and via the increasingly ubiquitous mobile devices we had in our pockets, I left my position and started API Evangelist–eight years later, I’m still doing it. The biggest difference between now and then, is back in 2010 I had to spend a lot of time explaining what is an API, and why someone should be doing it. In 2018, I don’t have to do that, people get it, and most conversations center around how to do APIs right. Everything was a lot more work earlier on, and while there is still so much work to be done, at least now I can get down to business, focusing on the challenges that large enterprise organizations face when doing APIs successfully at scale, and less about having to convince that APIs are a thing in the first place.
When you talk to some people in the API space, they consider SAP a little late to the game as one of the leaders in the software industry. SalesForce, Amazon, Google, Microsoft all saw the signs early on, and IBM, Oracle, CA, HP, and others jumped on the bandwagon along the way, with SAP only joining the conversation in the last couple of years. There are plenty of SAP APIs, and a growing number of API focused services, but SAP just isn’t a player you hear about on a regular basis across the API sector. IBM, CA, and even Oracle have invested significant amounts into their presence in the API space, acquisitions of API talent and solutions, and overall mindshare when you think about APIs. While SAP has a significant amount of work ahead of them to turn up the volume on what they are doing with APIs, they couldn’t have picked a better time to step it up–with so many enterprise organizations finally realizing they need to go API first, SAP will see more return on every dollar they invest into their API operations.
Now is the perfect time to be entering the API game if you are enterprise API service provider. The last decade of growth in the API sector has been established on the backs of many failed and successful API startups, and things are finally beginning to mature, with so many other enterprise organizations looking for direction when it comes to their API strategy. Financial, healthcare, government, and other mainstream sectors are trying to make sense of their digital assets, and APIs are key to this evolution. We are all trying to figure this out, at an individual, professional, and business entity level. In my opinion, any company looking to do APIs, or also sell services to the API sector, should be open and transparent about where they are in their API journey. Do not be embarrassed about being late to the party, not having everything figured out and in place, because this is the exact same position everyone else is at. Sure, there are some organizations that are further along in this journey, but 95% of the mainstream business world is just getting started in this is area.
I’ve been doing API Evangelist for 8 years straight, and I’m still figuring things out. APIs are not a destination, they are a journey. Do not sweat being late for anything, just get to work investing in your journey. Get to work designing, developing, and operating the best in class APIs as possible, putting them to work internally, across your partner landscape, and make them publicly available when it makes sense. If you are selling your services to other API providers, get to work showing why your solutions matter, and are something they should be adopting. In both scenarios, get to work telling the story of your journey in real time. Be honest about how you got here, and where you are going. Tell your story regularly on and offline, and help provide a platform for your API consumers to tell their story. Do this for many year, and eventually, your brand, products, services, and personality will become more common in the ever expanding and evolving world of APIs.
As we prepare for APIStrat in Nashville, TN this September 24th through 26th, I asked my partner in crime Audrey Watters (@audreywatters) to write a post on the significance of Virginia Eubanks, the author of Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor keynoting the conference–she shared this story, of why her work is so significant, and why it is important for the API community to tune in.
Repeatable tasks can and should be automated – that’s an assertion that you’ll hear all the time in computing.
Sometimes the rationale is efficiency – it’s cheaper, faster, “labor-saving.” Automation will free up time; it will make our lives easier. Or so we’re told.
Sometimes automation is encouraged in order to eliminate human error or bias.
Increasingly, automation is eliminating human decision-making altogether. And in doing so, let’s be clear, neither bias nor error are removed; rather they are often re-inscribed. Automation – algorithmic decision-making – can obscure error; it can obscure bias.
This push for more automated decision-making works hand-in-hand with the push for more data collection, itself a process that is already shaped by precedent and by politics. And all this, of course, is facilitated by APIs.
APIs are commonly referred to as a “glue” of sorts – the implication, more often than not, is that APIs are simply a neutral technology holding larger technical systems together. But none of this is neutral – not the APIs and not the algorithms and not the databases.
These technologies are never neutral in their design, development, or implementation. The systems that technologies exist in – organizationally, economically, politically, culturally – are never neutral either.
It seems imperative that those building digital technologies begin to think much more critically about the implications of their work, recognizing that the existing inequalities in the analog systems are readily being ported to the digital sphere.
This makes the work of one of the keynote speakers at this fall’s API Strategy and Practice conference so particularly timely: Virginia Eubanks is a political science professor at the University of Albany, SUNY and the author of Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. The book is a powerful work of ethnography, chronicling the ways in which data mining, predictive modeling, and algorithmic decision-making reproduce and even exacerbate inequalities in housing, health care, and social welfare services. “The digital poorhouse” Eubanks calls it.
“When we talk about the technologies that mediate our interactions with public agencies today,” she writes, “we tend to focus on their innovative qualities, the ways they break with convention. Their biggest fans call them ‘disruptors,’ arguing that they shake up old relations of power, producing government that is more transparent, responsive, efficient, even inherently more democratic.” This argument overlooks the ways in which new technologies are necessarily entangled in old systems of power. Moreover, those building these technologies benefit from a privilege that both shields them from and blinds them to the ramifications of their work on those most marginalized politically and economically.
Without a purposeful effort to address systemic inequalities, technologies will only make things worse. APIs will only make things worse. Instead, we must be part of the work of rethinking these old systems, listening to those on the margins, and reorienting our technological practices towards equity and justice.
We need your help moving the AsyncAPI specification forward. Ok, first, what is the AsyncAPI specification? “The AsyncAPI Specification is a project used to describe and document Asynchronous APIs. The AsyncAPI Specification defines a set of files required to describe such an API. These files can then be used to create utilities, such as documentation, integration and/or testing tools.” AsyncAPI is a sister specification to OpenAPI, but instead of describing the request and response HTTP API landscape, AsyncAPI is describing the message, topic, event, and streaming API landscape across the HTTP and TCP landscape. It is how we are going to continue to ensure there is machine readable descriptions of this portion of the API landscape, for use in tooling and services.
My friend Fran Mendez (@fmvilas) is the creator and maintainer of the specification, and he is doing way too much of the work on this important specification and he needs our help. Here is Fran’s request for our help to contribute:
AsyncAPI is an open source project that’s currently maintained by me, with no company or funds behind. More and more companies are using AsyncAPI and the work needed is becoming too much work for a single person working in his spare time. E.g., for each release of the specification, tooling and documentation should be updated. One could argue that I should be dedicating full time to the project, but it’s in this point where it’s too much for spare time and very little to get enough money to live. I want to keep everything for free, because I firmly believe that engineering must be democratized. Also, don’t get me wrong, this is not a complaint. I’m going to continue running the project either with or without contributors, because I love it. This is just a call-out to you, the AsyncAPI lover. I’d be very grateful if you could lend a hand, or even raise your hand and become a co-maintainer. Up to you 😊 On the other hand, I only have good words for all of you who use and/or contribute to the project. Without you, it would be just another crazy idea from another crazy developer 😄 Thank you very much! 🙌 – Fran Mendez
When it comes to contributing to the AsyncAPI, Fran has laid out some pretty clear ways in which he needs our help, providing a range of options for you to pitch in and help, depending on what your skills are, and the bandwidth you have in your day.
1. The specification There is always work to do in the spec. It goes from fixing typos to writing and reviewing new proposals. I try to keep releases small, to give time to tooling authors to update their software. If you want to start contributing, take a look at https://github.com/asyncapi/asyncapi/issues, pick one, and start working on it. It’s always a good idea to leave a comment in the issue saying that you’re going to work on it, just so other people know about it.
2. Tooling As developers, this is sometimes the most straightforward way to contribute. Adding features to the existing tools or creating new ones if needed. Examples of tools are:
- Code generators (multiple languages):
- https://github.com/asyncapi/node-codegen (going to be deprecated soon in favor of https://github.com/asyncapi/generator)
- Documentation generators (multiple formats):
- https://github.com/asyncapi/docgen (going to be deprecated soon in favor of https://github.com/asyncapi/generator)
- Validation CLI tool (nobody implemented it yet)
- API mocking (nobody implemented it yet)
- API gateways (nobody implemented it yet)
As always, usually the best way to contribute is to pick an issue and chat about it before you create a pull request.
3. Evangelizing Sometimes the best way to help a project like AsyncAPI is to simply talk about it. It can be inside your company, in a technology meetup or speaking at a conference. I’ll be happy to help with whatever material you need to create or with arguments to convince your colleagues that using AsyncAPI is a good idea 😊
4. Documentation Oh! documentation! We’re trying to convince people that documenting your message-driven APIs is a good idea, but we lack documentation, especially in tooling. This is often a task nobody wants to do, but the best way to get great knowledge about a technology is to write documentation about it. It doesn’t need to be rewriting the whole documentation from scratch, but just identifying the questions you had when started using it and document them.
5. Tutorials We learn by examples. It’s a fact. Write tutorials on how to use AsyncAPI in your blog, Medium, etc. As always, count on me if you need ideas or help while writing or reviewing.
6. Stories You have a blog and write about the technology you use? Writing about success stories, how-to’s, etc., really helps people to find the project and decide whether they should bet on AsyncAPI or not.
7. Podcasts/Videos You have a Youtube channel or your own podcast? Talk about AsyncAPI. Tutorials, interviews, informal chats, discussions, panels, etc. I’ll be happy to help with any material you need or finding the right person for your interview.
I’m going to take the liberty and add an 8th option, because I’m so straightforward when it comes to this game, and I know where Fran needs help.
8. Money AsyncAPI needs investment to help push forward, allowing Fran to carve out time, work on tooling, and pay for travel expenses when it comes to attending events and getting the word out about what it does. There is no legal entity setup for AsyncAPI, but I’m sure with the right partner(s) behind it, we can make something happen. Step up.
AsyncAPI is important. We all need to jump in and help. I’ve been investing as many cycles as I can in helping learn about the specification, and tell stories about why it is important. I’ve been working hard to learn more about it so I can contribute to the roadmap. I’m using it as one of the key definition formats driving my Streamdata.io API Gallery work, which is all driven using APIs.json, OpenAPI, and provides Postman Collections as well as AsyncAPI definitions when a message, topic, event, or streaming API is present. AsyncAPI is where OpenAPI (Swagger) was in 2011/2012, and with more investment, and a couple more years of adoption and maturing, it will be just as important for working with the evolving API landscape as OpenAPI and Postman Collections are.
If you want to get involved with AsyncAPI, feel free to reach out to me. I’m happy to help you get up to speed on why it is so important. I’m happy to help you understand how it can be applied, and where it fits in with your API infrastructure. You are also welcome to just dive in, as Fran has done an amazing job of making sure everything is available in the Github organization for the project, where you can submit pull requests, and issues regarding whatever you are working on and contributing. Thanks for your help in making AsyncAPI evolve, and something that will continue to help us understand, quantify, and communicate about the diverse API landscape.
We are gearing up for the next edition of APIStrat in Nashville, TN this September 24th through 26th. With the conference less than two months away, and the schedule up, I’m building momentum with my usual drumbeat about the speakers, and companies involved. So you’ll be reading a lot of stories related to APIStrat in coming weeks, where I’m looking to build awareness and attendance of the conference, but more importantly showcasing the individuals and companies who are supporting it and helping making the 9th edition of APIStrat amazing.
One of innovative startups I’m partnering with right now, and who you will find speaking and sponsoring APIStrat is 42Crunch. Full disclosure, I’m regularly talking with 42Crunch regarding their road map, and I consider them an API Evangelist partner, however, this is because I find them to be one of the more progressive, and important API startups out there right now. 42Crunch is important in my opinion, because they are focusing on API security, a critical stop along the API lifecycle, and also because of the OpenAPI-driven, awareness building approach to delivering API security. 42Crunch isn’t just bringing API security solutions to the table for you to purchase as a service, they bring API security solutions to the table that help you invest in your internal API security practices–which is the most critical aspect of what they do in my opinion.
42Crunch is focused on API security, but they are what I consider to be a full API lifecycle solution. Meaning they play nicely as one of the tools in your API lifecycle toolbox. Which begins with being OpenAPI-driven, and treating your API’s definition as a contract, but with 42Crunch it is about using this contract to empower your API team to make API security a first-class citizen across all stops along the API lifecycle. Not just at the API management layer, or as an after thought later on when you scan your infrastructure–42Crunch is baking security into your OpenAPI contract, proxying your APIs, and ensuring the right security policies are being applied consistently across the API lifecycle. Helping you think of API security all along the way from design to deployment, to management, testing, monitoring, and deprecation.
If you want to learn more about what 42Crunch offers you should be registered for APIStrat, and joining us in Nashville, TN. My friend Isabelle Mauny will be giving her talk on Practical SecDevOps for APIs–here is the abstract for her talk: In an ever agile world, API security must become a commodity. By working with security “ON” as early as possible, API developers can detect vulnerabilities when they are easy to fix. By continuously testing APIs for issues, they can ensure vulnerabilities do not sneak in later in the lifecycle. In this session, Isabelle presents a SecDevOps methodology and shares practical solutions for API security assessment, API protection and security monitoring. You should be taking advantage of this opportunity to learn more about what 42Crunch has to offer, and speaking with Isabelle in person about where API security is going in 2018, because they are the people pushing forward the conversation as part of the OpenAPI Initiative, and with their API security services.
Make sure you don’t miss out on the API conversation in Nashville this fall. Get registered for APIStrat, and make sure you are participating in the ongoing API discussion that is APIStrat. While the conference will continue to be an environment where developers and business folk to gather and discuss the technology, business, and politics of APIs, this year will also have a music focus because of the venue. Making it something you will not want to miss out, with all the keynotes, sessions, workshops, and hallway and late night conversations with all API leaders from across the sector.
I’m currently learning more about SLA4OAI, an open source standard for describing SLA in APIs, which is based on the standards proposed by the OAI, adding an optional profile for defining SLA (Service Level Agreements) for APIs. “This SLA definition in a neutral vendor flavor will allow to foster innovation in the area where APIs expose and documents its SLA, API Management tools can import and measure such key metrics and composed SLAs for composed services aggregated way in a standard way.” Providing not just a needed standard for the API sector, but more importantly one that is built on top of an existing standard.
SLA4OAI, provides an interesting way to define the SLA for any API, providing a set of objects that augment and can be paired with an OpenAPI definition using an x-sla vendor extension:
- Context - Holds the main information of the SLA context.
- Infrastructure - Required Provides information about tooling used for SLA storage, calculation, governance, etc.
- Pricing - Global pricing data.
- Metrics - A list of metrics to use in the context of the SLA.
- Plans - A set of plans to define different service levels per plan.
- Quotas - Global quotas, these are the default quotas, but they could be overridden by each plan later.
- Rates - Global rates, these are the default rates, but they could be overridden by each plan later.
- Guarantees - Global guarantees, these are the default guarantees, but they could be overridden by each plan later.
- Configuration - Define the default configurations, later each plan can be override it.
These objects provide all the details you will need to quantify the SLA for any OpenAPI defined API. In order to validate each of the SLAs, a separate Basic SLA Management Services is provided to implement the services that control, manage, report and track SLAs. Providing the reporting output you will need to understand whether or not each individual API is meeting its SLA. Providing a universal SLA format that can be used to create SLA templates, applied as individual API SLAs, and then leveraging a common schema for reporting on SLA monitoring, which can actually be used in conjunction with the API management layer of your API operations.
I’m still getting familiar with the specification, but I’m impressed with what I’ve seen so far. There is a lot of detail available in there, and it provides all the context that will be needed to quantify, measure, and report upon API SLAs. My only critique at this point is that I feel the pricing, metrics, plans, and quotas elements should be broken out into a separate specification so that they can be used out of the context of just SLA management, and as part of the wider API management strategy. There are plenty of scenarios where you will want to be using these elements, but not in the context of SLA enforcement (ie. driving pricing and plan pages, billing and invoicing, and API discovery). Other than than, I’m pretty impressed with the work present in the specification.
It makes me happy to see sister specifications emerge like this, rather than also baked directly into the OpenAPI. The way they’ve referenced the SLA from the OpenAPI definition, and the OpenAPI from the SLA definition is how these things should occur, in my opinion. It has been something I’ve done for years with the APIs.json format, having seen a future where there are many sister or even competing specifications all working in unison. Which is why I feel the pricing and plan elements should be decoupled from the SLA object here. However, I think this is a great start to something that we are going to need if we expect this whole API thing to actually work at scale, and make sure the technology of APIs is in sync with the business of APIs. Without it, there will always been a gulf between the technology and business units, much like we’ve seen historically between IT and business groups across enterprise organizations today.
I am deploying a patent review API for a client, using data from the Patent Examination Data System (PEDS). You can download complete JSON or XML data from the United States Patent Office (USPTO), and they even have an API. So, why would I be launching yet another API? Well, because what they have is so cryptic, complex, and lacking in any schema or API design, there is value in me pushing the API conversation forward a bit by thinking deeply about how other people will potentially be using these resources–something the USPTO clearly hasn’t done.
The USPTO PEDS API (that is more acronyms than you can shake a stick at) is a great example of how much database people, and developers take for granted as they operate within their little bubbles, without much concern for how the rest of the world views their work–take a look at the screenshot of thee USPTO PEDS API.
There is only one telling sign on this page regarding what this API does–the email address for the contact, which has a uspto.gov address. Beyond that there is not a single sign of the resources available within this API, or the value they bring to the table. Even if you can extrapolate that this is a patent API, there is nothing to tell you that you can’t actually get patent data from this, you can only get meta data about the patents, reviewers, inventors, and the activity around the patent. For me, the API reflects many of the challenges developers and database people face when it comes to thinking out of their box, and effectively communicating with external consumers–which is the whole reason we do web APIs.
I’m pretty well versed consuming patent data, and it took me several hours to get up to speed with this set of resources. I opted to not deal with the API, which is just an ElasticSearch index on top of a patent file store, and went directly to the full-size zipped up download. Something the average user will not have the knowledge, skills, and resources to always do. Which is why I feel there is value in me investing some schema and API design cycles into making the USPTO PEDS API a little more coherent, accessible, and usable by a wider audience using a simple web API. Moving it beyond the realm of wizards (database and developers), and making it something normal people, say patent attorneys, and other business folk can put to use in their work.
The USTPO PEDS API reflects the divide between tech and business people. Some database people and developers will think the implementation is a good one, because it gives them the full download, as well as a customizable, ElasticSearch interface for querying what you want. Many though, will walk away, because they aren’t willing to make the several hour investment getting up to speed on the schema, so they can make their first API query, or load the full download into the database backend of their choosing. This is where an investment on API design begins to pay dividends, is reaching this wider audience of potential consumers who are unwilling to make the investment getting up to speed, or do not have the resources or knowledge to work with the full download or an ElasticSearch interface. Unless of course, your in the business of keeping data out of the hands of these people, which many folks are.
I am a 30 year database professional. I get databases and querying solutions. What many GraphQL and ElasticSearch believers get wrong when they rely on these solutions for delivering publicly available APIs, is that they are unwilling to come to terms with the fact they can’t see their resources through the eyes of the public. They think everyone is like them, and thus want a full blown query interface to get at a known schema. They see API design as unnecessary work, when in reality, they are just unwilling to do the heavy lifting, and they are either consciously, or unconsciously passing that work off to each individual consumer. If you are keeping your APIs available for internal use amongst controlled group of developers this isn’t a problem, but if you are making your APIs available to a wider public audience, it ends up showing that you haven’t taken the time to think deeply about how others will be using your APIs, or that you just doo not care.
While writing about the discussions I’ve been having with folks around using monorepos to manage microservices, I came across this post about whether or not people should be using a single monolithic Lambda function or multiple lambda functions with the AWS API Gateway. Again, surprising me about how lazy people are, and how difficult it is for people to think about things in a decoupled way. Which I think is the reason many people will go back to doing monolithic applications, and fail at microservices, not because it technically won’t work, it is just because it will be perceived as more work, and with a lack of imagination around how to work in a distributed way, people will give up.
First, I do not think microservices is a good idea for all applications. Second, I don’t always subscribe to microservices meaning small or micro. I think a service mindset is good, and it is healthy to decouple, and reduce the surface area of your services, minimizing dependencies, but there are many situations where a super small microservice will be a bad idea. However, if you are going to do serverless microservices with Lambda and AWS API Gateway, I do not understand why you’d want a single monolithic function behind many different API paths. I’m guessing that people who think you should do monolithic serverless haven’t thought about sensible organization of their functions, and orchestration of them using the AWS CLI or API. They are managing them through the AWS dashboard and are thinking, “man this is a lot of work, let’s just do a single function, with the routing built in.”
Similar to folks thinking a monorepo is a good idea over many different repos, without ever thinking about organizations using Github organizations, and orchestration using Git and the Github API, people aren’t getting creative with their Lamdba functions. People seem to be in love with brainstorming and dreaming about decoupled approaches to doing APIs, but when it comes to the hard work of actually doing it, and having an imagination when it comes to orchestration and reducing friction, people would rather just give up. I’m not 100% sold on serverless being the right use case for driving APIs, but I can tell you one thing, that having many different APIs with a single Lambda function behind it will not give you the granularity you need for understanding the performance, and functionality behind each API and service you are delivering–you are just going to create new problems that you won’t have the visibility into to be able to optimize.
I’m reading a lot about microservices backlash lately. I’m guessing after about 1-2 more years of serverless, we will start seeing serverless backlash. While some of this backlash will be about folks using microservices and serverless for use cases that didn’t make sense, I’m guessing a significant amount will be because people can’t decouple their imagination, and think through the necessary organization and orchestration required to think about doing distributed applications at scale. Without it, they are going to fumble, struggle, and see decoupling as all about making extra work for themselves, and go back to the way they were doing things before. In my experience these folks are always on the hunt for easy solutions to their complex problems, and when you aren’t willing to invest the time into doing it right, and properly understanding all the moving parts, you are going to fail, and revert to what you know. The problem with this, is I’m guessing you are going to also fall prey to the next trend, and not have the capacity to understand what it is all about before going all in, yet again.
I told the folks over at SAP that I would take a look at their API Business Hub. It isn’t paid work, just helping provide feedback on another addition to the API discovery front, something I’m pretty committed to helping push forward in any way that I can. They’ve pulled together a pretty clean, OpenAPI driven catalog of useful APIs for the enterprise, so I wanted to make sure I kick the tires and size it up alongside the other API discovery work I am doing.
The SAP API Business Hub is a pretty simple and clean catalog for searching and browsing applications, integrations, as well as APIs–I am going to focus in on the API section. Which at first glance looks to have about 70 separate APIs, but then you notice each of them are just umbrellas for each API platform, and some of them contain many different API endpoints. Some of the APIs are simple language translation and text extraction resources, while others provide robust access to the SAP S/4HANA Cloud, SAP Ariba, and other SAP systems. You see a lot of SAP focused solutions, but then you also see a handful of partner solutions added via their platform partner program.
I see the beginnings of a useful API catalog getting going on over at the SAP API Business Hub. Each API is well documented, and provides an OpenAPI definition for each API, complete with interactive documentation you can play within a sandbox environment. More than most API catalogs, marketplaces, and directories I profile have available. Allowing you to kick the tires and see what is going on, before working with the production version. They also provide you with a Java SDK to download for each API, something that could easily be expanded to support many different platforms, programming languages, and continuous integration cycles with solutions like APIMATIC. Making it more of a discovery, as well as integration marketplace.
Like any API marketplace effort, SAP needs to drum up activity within their catalog. They need more partners signing up to add their APIs, as well as consumers being made aware of the resources published there–something that takes a lot of work, evangelism, and storytelling. Next, I’m going to go through their partner signup and see what I can do to add some of my API resources there, and tell some stories about how they might be able to improve upon the partner flow. I like that their marketplace is OpenAPI driven. I’m curious about how much of the API publishing process is machine readable, allowing API providers to easily add their resources, without a lot of manual form work–something most are going to not have the time and resources for. I’ll keep evaluating how the SAP API Business Hub overlaps with my other API discovery work on the API Stack, the Streamdata.io API Gallery, Postman Network, and partnerships with APIs.guru, APIs.io, and others–continuing to push forward the API discovery conversation after almost 8 years.
I read a lot of API documentation, and help review API portals for clients, and one of the most common rookie mistakes I see made, is people pointing out the obvious, and writing a bunch of fluffy, meaningless content that gets in the way of people actually using an API. When the obvious API industry stuff is combined with the assumed elements of what a company does, you end up with a meaningless set of obstacles that slow API integration down. Here is the most common thing I read when entering an API portal:
“This is an API for querying data from the [Company X] platform, to get access to JSON from our system which allows you to get data from our system into yours using the web. You will need to write code to make calls to our APIs documented here on the page below. Our API uses REST to accept request and provide responses in a JSON format.”
I’ve read API after API that never tells you what the API does. It just assumes you know what the company does, and then goes into verbose explanations of what API, REST, JSON, and other things that should be intuitive if an API is well designed, and immediately accessible via an API. People tend to make to many assumptions about API consumers already knowing what a company does, while also assuming they known absolutely nothing about APIs, and burying actual API documentation behind a bunch of API blah blah blah, instead of just doing and being the API.
It is another side effect of developers, database, and IT folk not being very good at thinking outside of their bubble. It goes beyond techies not having social skills, and is more about them not having to think about other people at all. They just don’t have the ability to put themselves in the shoes of someone landing on the home page of their developer portal, and not knowing anything about the company or the API, and asking themselves, “what does this person need?”. Which I get being something developers don’t think about with internal APIs, but publishing an API publicly, and not stepping back to think about what someone is going to need isn’t acceptable.
Even with my experience, I still struggle to say exactly what needs to be said. There is no perfect introduction to a complex, often abstract set of APIs. However, you can invest a little more time thinking about what others will be needing, maybe run your portal by some external people for a little coherence testing. Most of all, just try to avoid being captain obvious, or captain assumption, and writing content that states the obvious while leaving out most of the critical details you take for granted. It really is the most important lessons we can take away from providing APIs, the ability for them to push us out of our boxes, from behind our firewalls, and have to engage with the real world.
About 60% of my work these days is building upon the last five years of my API Stack research, with a focus on building out the Streamdata.io API Gallery. We are fine tuning our approach for discovering new API-driven resources from across the landscape, while also profiling, quantifying, ranking, and publishing to the Streamdata.io API Gallery, The API Stack, and potentially other locations like the Postman Network, APIs.Guru, and other API discovery destinations I am working with. Helping us make sense of the increasingly noisy API landscape, while identifying the most valuable resources, and then profiling them to help reduce friction when it comes to potentially on-boarding and streaming data from each resource.
Discover New API-Driven Resources
Finding new APIs isn’t too difficult, you just have to Google for them. Finding new APIs in an automated way, with minimum human interaction becomes a little more difficult, but there are some proven ways to get the job done. There is no single place to go find new APIs, so I’ve refined a list of common place I use to discover new APIs:
- Search Engines - Using search engine APIs to look for APIs based upon the vocabulary we’ve developed.
- Github - Github provides a wealth of signals when it comes to APIs, and we use the Github API to discover interesting sources using our vocabulary.
- Stack Overflow - Using the Stack Exchange API, we are able to keep an eye out for developers talking about different types of interesting APIs.
- Twitter - The social network still provides some interesting signals when it comes to discussions about interesting APis.
- Reddit - There are many developers who still use Reddit to discuss technical topics, and ask questions about the APIs they are using.
Using the topic and entity vocabulary we’ve been developing, we can automate the discovery of new APIs across these sources using their APIs. Helping track on signals for the existing APIs we are keeping an eye on, but also quickly identify new APIs that we can add to the queue. Giving us the URL of companies, organizations, institutions, and government agencies who are doing interesting things with APIs.
Profile New Domains That Come In
Our API discovery engine produces a wealth of URLs for us to look at to understand the potential for new data, content, and algorithmic API resources. Our profiling process begins with a single URL, which we then use as the seed for a series of automated jobs that help us understand what an entity is all about:
- Description - Develop the most informative and concise description of what an entity does, including a set of rich meta tags.
- Developer - Identify where their developer and API program exists, for quantifying what they do.
- Blog - Find their blog, and supporting RSS feed so we can tune into what they are saying.
- Press - Also find their press section, and RSS feed so we can tune into the press about them.
- Twitter - Find their Twitter account so that we can tune into their social stream.
- LinkedIn - Find their LinkedIn account so that we can tune into their social stream.
- Github - Find their Github account so we can find more about what they are building.
- Contact - Establish a way to contact each entity, in case we have any questions or need support.
- Other - Identify other common building blocks like support, pricing, and terms of services that helps us understand what is going on.
The profiling process provides us with a framework to understand what an entity is all about, and where they fit into the bigger picture of the API landscape. Most of the sources of information we profile have some sort of machine readable component, allowing us to further quantify the entity, and better understand the value they bring to the table.
Quantify Each Entity
Next up we want to quantify each of the entities we’ve profiled, to give us a better understanding of the scope of their operations, and further define where they fit into the API landscape. We are looking for as much detail about what they are up to so we can know where we should be investing our time and energy reaching out and developing deeper relationships.
- API - We profile their APIs, generating an OpenAPI definition that describes the entire surface area of their APIs.
- Applications - Define approximately how many applications are running on an API, and how many developers are actively using.
- Blog - Pull all their blog posts, including the history, and actively pull on daily basis.
- Press - Pull all their press releases, including the history, and actively pull on daily basis.
- Twitter - Pull all their Tweets and mentions, including the history, and actively pull on daily basis.
- Github - Pull all their repos, stars, followers, and commit history, understand more about what they are building.
- Other - Pull other relevant signals from Reddit, Stack Overflow, AngelList, CrunchBase, SEC, Alexa Rank, ClearBit, and other important platform signals.
By pulling all the relevant signals for any entity we’ve profiled, we can better understand the scope of their operations, and assess the reach of their network. Helping us further quantity the value and opportunity that exists with each entity we are profiling, before we spend much more time on integrating.
Ranking Each Entity
After we’ve profiled and quantify an entity, we like to rank them, and put them into different buckets, so that we can prioritize which ones we reach out to, and which ones we invest more resources in monitoring, tracking, and integrating with. We currently rank them on a handful of criteria, using our own vocabulary and ranking formula.
- Provider Signals - Rank their activity and relevance based upon signals within their control.
- Community Signals - Rank their activity based upon signals the community generates about them.
- Analyst Signals - Rank their activity based upon signals from the analyst community.
- StreamRank - Rank the activity of their data, content, and API-driven resources.
- Topically - Understand the value of the activity based upon the topics that are available.
Our ranking of each entity gives us an overall score derived from several different dimensions. Helping us understand the scope, as well as the potential value for each set of APIs, allowing us to further prioritize which entities we invest more time and resources into, maximizing our efforts when it comes to deeper, more technical integrations, and streaming of data into any potential data lake.
Publishing APIs To The Gallery
Once an entity has been profiled, quantified, and ranked, we publish the profile to the gallery for discovery. Some of the more interesting APIs we hold back on a little bit, and share with partners and customers who are looking for interesting data sources via landscape analysis reports, but once we are ready we publish the entity to a handful of potential locations:
- Streamdata.io API Gallery - The distributed gallery owned and operated by Streamdata.io
- The API Stack - My own research area for profiling APIs that I’ve run for five years.
- APIs.guru - We are working on the best way to submit OpenAPI definitions to our friends here.
- Postman Network - For APIs that we validate, and generate working Postman Collections.
- APIs.io - Publishing to the machine readable API search engine for indexing.
- Other - We have a network of other aggregation, discovery, and related sites we are working with.
Because each entity is published to its own Github repository, with an APIs.json, OpenAPI, and Postman Collection defining its operations, once published, each entity becomes forkable. Making each gallery entry something anyone can fork, download and directly integrate into their existing systems and applications.
Keep Discovering, Profiling, Quantifying, and Publishing
This work is never ending. We’ll just keep discovery, profiling, quantifying, and publishing useful APIs to the gallery, and beyond. Since we benchmark APIs, we’ll be monitoring APIs that go away and we’ll archive them in the listings. We’ll also be actively quantifying each entity, by tuning into their blogs, press, Twitter, and Github accounts looking for interesting activity about what they are doing. Keeping our finger on the pulse of what each entity is up to, as well as what the scope and activity within their community is all about.
This project began as an API Evangelist project to understand how to keep up with the changing API space, and then evolved into a landscape analysis and lead generation tool for Streamdata.io, but now has become an engine for identifying valuable data and content resources. Providing a powerful discover engine for finding valuable data sources, but when combined with what Streamdata.io does, it also allows you to tune into the most important signals across all these entities being profiled, and stream the resulting data and signals into data lakes within your own existing cloud infrastructure, for use in training machine learning models, dashboards, and other relevant applications.
This is a report for the Department of Veterans Affairs microconsulting project, “Governance Models in Public and Private Sector”. Providing an overview of API governance to help the VA, “understand, with the intention to adopt, best practices from the private and public sector, specifically for prioritizing APIs to build, standards to which to build APIs, and making the APIs usable by external consumers.” Pulling together several years of research conducted by industry analyst API Evangelist, as well as phone interviews with API practitioners from large enterprise organizations who are implementing API governance on the ground across the public and private sector, conducted by Skylight Digital.
We’ve assembled this report to reflect the interview conversations we had with leaders from the space, helping provide a walk through of the types or roles and software architecture being employed to implement governance at large organizations. Then we walk through governance as it pertains to identifying possible APIs, developing standards around the delivery of APIs, how organizations are moving APIs into production, as well as presenting them to their consumers. Wrapping up with an overview of formal API governance details, as well as an acknowledgement that most API governance is rarely ever a fully formed initiative at this point in time. Providing a narrative for API governance, with a wealth of bulleted elements that can be considered, and assembled in the service of helping govern the API efforts across any large enterprise.
Roles Within An Organization
There are many roles being used by organizations who are leading the conversation around the delivery of high quality, industry changing APIs. Defining the personalities that are needed to make change across large organizations when it comes to delivering APIs consistently at scale. While there may be many names for the specific roles leading the charge it is clear that these people are bringing a unique blend of skills to an organization, with an emphasis in a couple of key areas:
- Leadership - Providing leadership for teams when it comes to APIs.
- Innovation - A focus on innovation using APIs across the organization.
- Communication - Facilitating communication across all teams, and projects.
- Advisory - Acting as an advisor to existing leadership and management.
- Strategy - Helping existing teams develop, evolve, and realize their strategy.
- Success - Focusing on helping existing teams be successful when it comes to APIs.
- Architect - Bringing a wide variety of software architectural skills to the table.
- Coaching - Being a coach to existing teams, and decision makers across the organization.
Bringing together a unique set of skills that range from the technical to deep knowledge of the business domain, into a concentrated, although sometimes distributed effort to bring change across an organization using APIs. Along with these roles, many large organizations are investing in new types of structure to help develop talent, take charge of new ideas, and move forward the enterprise wide API strategy, with a handful of common characteristics:
- Labs - Treating API efforts as if it is a laboratory creating new experiments.
- Center - Making it a center for API thinking, ideation, and for access to information.
- Centralized - Keeping all efforts in a single group or organization within larger entity.
- Distributed - Emphasis on keeping API knowledge distributed and not centralized at all.
- Global - Acknowledging that APIs will need to be a global initiative for larger organizations.
- Excellence - Focusing on the organization bring excellence to how APIs are delivered.
- Embedded - Making sure there is API knowledge and expertise embedded in every group.
Combing a unique set of skills and personalities, into a focused organization that takes the reins when it comes to leading API change, and digital transformation at an organization. While many of these efforts emphasize a center, or centralized presence, many are also realizing the importance of embedded and distributed approach, ensuring that talent, and ideas grow within existing teams, and are not seen as just some new, external, isolated group or initiative.
There clearly is not a single role or organization structure that brings success to API efforts at scale across the enterprise, however there are clear patterns being applied in the early stages that can be emulated. Helping ensure that API knowledge and expertise is available and accessible by all groups across an organization, and all its geographic regions, ensuring that the entire enterprise is part of the conversation and moving forward in unison.
Software Architecture Design
Governance is all about shaping and crafting the way we design and architect software, leveraging the web, and specifically web APIs to help drive the web, mobile, device, and network applications we depend on. There are a number of healthy, and not so healthy patterns across the landscape for considering as we look to shape and transform our software architecture, begin honest about the forces that influence what software is, does, and what it will ultimately become.
Software architecture is always a product of its environment, being influenced by a number of factors that already exist within any given domain. We are seeing a number of factors influence how large enterprises are investing and defining their software architecture. Here are a handful of the top areas of consideration when it comes to how the domain an enterprise exists within impact architecture:
- Resources - The types of digital resources an enterprise already possess will drive software architecture, defining how it works, grows, expands, and shifts.
- Schema - Existing schema define how data is stored, and often gathered and syndicated–even if this is abstracted away through other systems, it is still influencing architectural decisions at all levels.
- Process - Existing business process are already in motion, driving current architecture, and is something that cannot immediately be changed, without having echoes of impact on future architectural decisions.
- Industry - External industry factors are always emerging to shift how software architecture is crafted, providing design, development, and operational factors that need to be considered as architecture is refactored.
- Regulatory - Beyond more organic industry influences, there are regulatory, legal, and other government considerations that will shift how software architecture will ultimately operate.
- Definitions - The access and availability of machine readable definitions, schema, process, and other guiding structural elements that can help make software architecture operate more efficiently, or less efficiently in the absence of standardization, and portability.
Domain expertise, awareness, and structure will always shape software architecture, and the decision making process that surrounds it. Making it an imperative for there to be an investment in internal capacity as well as leveraging external expertise and vendors when it comes to shaping the enterprise architectural landscape. Without the proper internal capacity, domain knowledge can be minimized, weakening the overall architecture of the digital infrastructure nutrients an enterprise will need to move forward.
We can never escape the past when it comes to the software architectural decisions we make, and it is important that we don’t just see legacy as a negative, and also view legacy as a historical artifact that should move forward. Maybe not always the same legacy code should be in forward motion, but the wisdom, knowledge, and lessons learned around the enterprise legacy should be on display. Here are a handful of the legacy considerations we’ve identified through our discussions.
- Systems - Existing systems in operation have a significant influence over all current and future architectural decisions, making legacy system consideration a top player when it comes to decision making around software architecture conversations.
- People - Senior staff who operate and sustain legacy system, or were around when they were developed possess a significant amount of power when it comes to influencing any new system architecture, and what gets invested in or not.
- Partners - External partners who have significant history with the enterprise possess a great deal of voting power when it comes to what software architecture gets adopted or not.
- Trauma - Legacy trauma from historical outages, breaches, and bad architectural decisions will continue to influence the future, especially when legacy teams still have influence over future projects.
Systems, people, partners, and bad decisions made in the past will continue to drive, and often times haunt each wave of software architectural shifts. This influence cannot be ignored, abandoned, and needs to be transformed into positive effects on next generation investment in software architecture. Change will be inevitable, and legacy technical and cultural debt needs to be addressed, but not at the cost of repeating the mistakes of the past.
After legacy concerns, we all live in the reality have been given, and it is something that will continue to shape how we define our architecture. Throughout our discussions with companies, institutions, and government agencies regarding the challenges they face, and the current forces that shape their software architecture decisions, we found several recurring themes regarding contemporary considerations that were making the largest impact:
- Talent Available - The talent available for designing, developing, and deploying of API infrastructure dictates what is possible at all stages.
- Offshore Workers - The offshoring of work changes the governance dynamics, and requires strong processes, and a different focus when it comes to execution.
- Mainstream Awareness - Keeping software architectural practices in alignment with mainstream practices helps shape software architecture decisions, allowing them to move forward at a healthier pace.
- Internal Capacity - It has been stated several times that doing APIs at scale across the enterprise would not be possible without investing in internal capacity over outsourcing, or depending on vendor solutions.
Modern practices continue shaping how we deliver our software architecture, defining how we govern the evolution of our infrastructure, and find the resources and talent to make it happen. Keeping software architecture practices in alignment with contemporary approaches helps streamline the road map, how teams work with partners, and can outsource and work with external entities to help get the job done as efficiently as possible.
The technology we adopts helps define and govern how software architecture is delivered, and evolved. There are many evolution trends in software architecture that has moved the conversation forward, allowing teams to be more agile, consistent, and efficient in doing what they do. As we studied the architectural approaches of leading API providers across the landscape, and engaged in conversations with a handful of them, we found several technologically defined views of how software architecture is influencing future generations and iterations.
- Vendors - Specific vendors have their owning guiding principles to how software architecture gets defined, delivered, and governed. Often times given an outsized role in dictating what happens next.
- Frameworks - Software and programming language frameworks dictate specific patterns, and govern how software is delivered, and lead the conversation on how it evolves. Software frameworks can possess a significant amount of dogma that will have a gravity all its own when it comes to evolving into the future.
- Cloud Platforms - Amazon, Google, and Microsoft (Azure) have a strong voice in how software architecture is defined in the current climate, providing us with the services and tooling to govern the lifecycle. This control over how we define our infrastructure is only going to increase with their market dominance in the digital realm.
- Continuous Integration / Deployment - CI/CD services and tooling have established a new way of defining software architecture, and establishing a pipeline approach to moving it forward, building in the hooks needed to govern every step of its evolution. Reducing the cycles from annual down to monthly, weekly, and even daily cycles of change.
- Source Control - Github, Gitlab, and Bitbucket are defining how software is delivered, providing the vehicle for moving code forward, the hooks for governing each commit, and step forward any infrastructure makes as it is versioned, and evolved.
These areas are increasingly governing how we design, develop, deploy, and manage our infrastructure. Providing us with the scaffolding we need to hang our technological infrastructure on, and gives us the knobs and levers we can pull to consistently orchestrate, and move forward increasingly complex and large enterprise software infrastructure, across many teams, and geographic regions. The decisions we make around the technology we use will stick with us for years, and continue to influence decisions even after it is gone.
When it comes to delivering software architecture, not everything is governed by the technical components, and much of what gets delivered and moved forward will defined by the business side of the equation. The amount of investment by a business into its overall IT, as well as more progressive groups, will determine what gets done, and what doesn’t. With the following elements of the business governing software architecture in several cases::
- Budgets - How much money is allocated for a team to work when it comes to defining, deploying, managing, and iterating upon software architecture.
- Investors - Many groups are influenced, driven, and even restricted by outside investors, determining what software architecture is prioritized, and even dictating the decisions around what is put to work.
- Partners - External partners with strong influence over the business discussions that drive software infrastructure decision play a big role in the governance, or lack of governance involved.
- Public Image - Often times the decisions that go into software architecture, and the governance of how it moves forward will be driven by the public image concerns around company, and its stakeholders.
- Culture - The culture of a business will drive decisions being made when it comes to developing, managing, and governing software architecture, which can be more challenging to move forward than the technology in many cases.
The governance of software architecture has to be in alignment with business objectives of an enterprise. Many groups choose to begin their API journeys based upon trends, or the desire of a small group, and have encountered significant friction when trying to bring in alignment with the wider enterprise business objectives. Groups that addressed business considerations earlier on in their strategy have done much better when it came to reducing friction, and eliminating obstacles from their road map.
Almost every discussion we’ve had around governance of software infrastructure has included mentions of the importance of observability across next generation iterations. Software designed, delivered, and supported in the darkness or in isolation either fails, or is destined to become the next generation of technical debt. There were several areas of emphasis when it came to ensuring the API driven infrastructure sees the light of day from day one, and continues to operate in a way that everyone involved can see what is happening.
- Out in Open - Groups who operate out in the open, sharing their progress actively with other teams, and encouraging a transparent process find higher levels of success, adoption, and consistency across their architectural efforts.
- Co-Discovery - Ensuring that before work begins, teams are working together to discovery new ideas, learn about alternative software solutions, and working together to create buy-in, and ultimately make decisions around what gets adopted.
- Collaborative - While identified as sometimes being slower than traditional, more isolated efforts, teams who encouraged cross-team collaboration saw that their architectural decisions were sounder, more stable, and had more longevity.
- Open Source - Following open source software development practices, and working with existing open source solutions helps ensure that enterprise software architecture last longer, has more support, and follows common standards over other more proprietary approaches.
- Publicly - When it makes sense from a privacy and security standpoint, groups often articulate that being public by default helps ensure project teams behave differently, enjoy more accountability, and often attract external talent, domain expertise, and publicly opinion along the way.
Enterprise organizations that push for observability by default find that teams tend to work better together, and have a more open attitude. Attracting the right personalities, encouraging regular communication, and thinking externally by default, not as something that happens down the road. Bringing much needed sunlight and observability into processes that can often be very complex and abstract, and pushing things to speak to a wider audience beyond developer and IT groups.
Having a shared process that can be communicated across teams, going beyond just technical teams, and is something that business groups, partners, 3rd party, and all other stakeholders can follow and participate in, is a regular part of newer, API-centric, software delivery life cycles. Possessing several core elements that help ensure the process for defining, designing, delivering and evolving software architecture is shared by all.
- Contract - Crafting, maintaining, and consistently applying a common machine readable contract that is available in YAML format, is a common approach to ensuring there is a contract that can be used across all architectural projects, defining each solutions as an independent, business service.
- Pipeline - Extending the machine readable service contracts with YAML defined pipeline definitions that ensure all software is delivered in a consistent, reproducible manner across many disparate teams.
- Versioning - Establishing a common approach to versioning code, definitions, and other artifacts, providing a common semantic approach to governing how architecture is evolved in a shared manner that everyone can follow.
Historically the software development and operation lifecycle is owned by IT, and development groups. Modern approaches to delivering software at scale is a shared process, including internal business and technical stakeholders, while also sharing the process with external partners, 3rd party developers, and the public. Bringing software architecture out of the shadows, and conducting it on the open web, making it more inclusive amongst all stakeholders, but done in a way that respects privacy and security along the way.
Identifying Potential APIs
Once the architectural foundations have been laid, there are many ways in which large enterprises begin identifying the potential APIs that should be designed, deployed, and evolved supporting the many applications that will be depending on the underlying platform architecture. Depending on the organization and it’s priorities, the reasons for how new APIs are born will vary, resulting in different lifecycles, and resulting services being delivered across internal groups, partner stakeholders, and 3rd party developers.
Throughout this research, we identified that there is no single approach to identifying which APIs should be delivered, but we did work to understand a variety of approaches in use across the landscape, and by the practitioners interviewed for this project. Establishing some common areas for answering the questions around what should be an API, why are we doing APIs, leading us to the how of doing APIs, and uncovering the pragmatic reasoning behind web, mobile, device, and other applications that APIs are driving across the landscape.
Our existing realities drive the need for APIs, and reflect where we should be looking to provide new services for internal stakeholders, partners, and potentially as new revenue streams for 3rd party developers. While some APIs may be entirely new solutions, it is most likely that APIs will be born out of the realities we are already dealing with on a daily basis, based upon the digital solutions we depend on each day. We identified the most common realities that enterprise group face when it comes to their present day digital transformation challenges.
- Database - The existing databases in operation are the number one place groups are identifying potential resources for deployment of APIs. Exposing historically accumulated digital assets using the web, and making them available for use in new applications, to partners, and driving new types of data products for generating the next generation of revenue streams.
- Website - Our existing websites reflect the last 20 years of the digital evolution of our enterprises, representing the digital resources we’ve accumulated and identified as being important for sharing with partners and the public. HTML representations of our digital assets are always the 101 of API deployment, and understand what should also be available as JSON, and other more sophisticated representations.
- Integrations - Existing software integrations, system to system integrations, are represent a rich area for understanding how digital resources are delivered, accessed, and made available throughout existing applications and systems. Providing an important landscape for mapping out when understand what should be turned into APIs, having a more consistent process applied, eliminating custom integrations, and standardizing how systems speak with one another.
- Applications - Exiting web, mobile, desktop and other applications can have a variety of backend system connectivity solutions, but also might have a more custom, bespoke approach to doing APIs that is off the radar when it comes to governing how infrastructure evolves. Providing another rich area for mapping out the connections behind the common applications in use, understanding the internal, partner, and 3rd party APIs and other connections they use.
- Services - APIs have been around for a while, and exist in a variety of formats. Legacy web services, RPC, FTP, and other API and messaging formats should be mapped out and included as part of the potential API evolutionary landscape. Taking existing services, and evolving them in a consistent manner with all other API-driven services, leveraging web technologies to consistently deliver and manage digital resources.
- Spreadsheets - The majority of business in the world still happens within the spreadsheet. These portable data stores are emailed, shared, and spread around the enterprise, and represent a rich source of information when it comes to understanding which resources should be published as APIs.
You can’t govern what isn’t mapped out and known. It becomes increasingly difficult to govern software infrastructure that exists across many open and proprietary formats, and delivered as custom one-off solutions. Governance begins with a known landscape, and the greatest impedance to functional governance across organizations are the unknowns. Not knowing a solution exists, or its architectural approach being something that isn’t part of the bigger picture, leaving it to be a lone actor, in a larger landscape of known services operating in concert.
Another reason for having an open, very public approach to selecting, delivering, and operating software infrastructure, is that it establishes a public presence, across web properties, social networks, and other platforms where enterprise organizations can build community. There are a number of ways to identify potential new API resources by just being public, engaging with the community, and establishing API delivery life cycles that involve having a public presence.
- User Requests - Actively soliciting web, mobile, and developer feedback, both internally and externally is a great way to learn about potentially new API resource opportunities. Leveraging existing users as a source of insight when it comes to what services would make applications better, and demonstrating the importance of investing in an API literate user base.
- Partner Requests - Actively working with partners, conducting regular meetings regarding digital assets and transformation, seeking feedback on what types of services would improve upon existing solutions, and strengthen partner relations. Investing in existing partners, and using them as a way to evolve the API road map, and increase their dependency on enterprise resources.
- Public Feedback - Engaging with the public via websites, social networks, forums, and at events, to understand what opportunities are out there when it comes to delivering new API resources. Tapping public awareness around specific topics, within particular domains, and considering suggestions for new APIs outside the enterprise firewall and the traditional box.
- Media Coverage - Tuning on to popular tech blogs, mainstream media, and other public outlets to study and understand what opportunities are emerging for the delivery of new API services. Tuning into popular trends when it comes what is happening with APIs, or business sectors that might not have caught up to some of the modern approaches to delivering APIs.
- Feedback Loops - Cultivating trusted feedback loops with existing users, social networks, and private messaging platforms. Investing in long term feedback loops that tap the knowledge and domain expertise of trusted professionals, who can bring fresh ideas to the table when it comes to which new APIs can benefit the enterprise.
- Negative Consequences - One significant thing to note with having a public presence, is that not everything will be positive. There are serious privacy, security, and safety concerns when operating your APIs out in the public, and there should be plenty of consideration for how to do it in a healthy way. Acknowledging that not everyone in the public domain will have the enterprise’s best interest in mind.
It isn’t easy soliciting feedback from the general public when it comes to determining the direction a platform road map should head. However, with some investment, curation, and cultivation, a more reliable source of insight regarding the direction of an API platform can be established. The API community across the public and private sector has grown significantly over the last decade, providing a wealth of knowledge talent that can be tapped, if you know where to look for it.
Moving beyond the “where to look for opportunities for APIs”, and moving into the why to find, as well as how to prioritize which resources should be turned into APIs, we hear a lot about investing in the overall improvement of the enterprise when it came to the motivations behind governance. Looking at many of the common incentives behind doing APIs, but also doing it in a more consistent and scalable way that supports the mission and forward motion of the enterprise.
- Optimization - Seeking to optimize how applications are delivered, and services provided across teams. Providing consistent services that can be used across many internal groups, across partners, and that will improve the lives of 3rd party developers.
- Common Patterns - Doing APIs, and pushing for governance to help identify common patterns across how software is designed, delivered, and managed. Working to extract the existing patterns in use, help standardize and establish the common patterns, and reenforce their usage across teams, and distributed groups.
- Reusability - Encouraging reusability is the number one improvement we heard from different groups, and see across the landscape. Governing how software is not just delivered, but maximized, reused–ensuring the enterprise is maximizing software spend, as well as the revenue from services it delivers.
- Acceleration - Investing in governance to help accelerate how applications are delivered, measuring, standardizing, and optimizing along the way to improve existing efforts, as well as new projects on the horizon. Increasing the speed of not just new services, but how the services are able to be put to work by developers and integrators.
- Efficiency - Setting into motion patterns and processes that increase overall efficiency around how services are delivered, and how they enable teams to deliver on new projects, applications, integrations. Allowing IT, developers, and business users to benefit from an API focus.
- Flexibility - Increasing the flexibility of how applications operate, and teams are able to work together and deliver across the enterprise. Encouraging the design and development of APIs that work together, and flexibly achieve organizational objectives.
Providing a set of criteria that can be used to help prioritize which APIs get identified for delivery, and for evolution and versioning. If API resources help deliver on any of these areas, their benefit to the enterprise is increased, and it should be bumped up the list when looking for API opportunities. Always looking for how the enterprise can be improved upon, while also understanding which specific resources should be targeted for wrapping, and exposing as simple web APIs that can be used for both internal and external use cases.
The API journey is always full of challenges, and these areas of friction should be identified, and incorporated into the criteria for identifying new API solutions, as well as determining which APIs should be invested in and evolved. While some challenges can be minimized and overcome, many can also cause unnecessary friction throughout the API roadmap, making challenges something to be considered when putting together any API release strategy.
- Education - What education is required when it comes to acquiring the resources behind any potential API, developing and standing up a proper API, as well as deploying, managing, and supporting an API. What challenges can be foreseen, and identified early on, helping weigh what investment in education, training, and learning along the API journey for any set of services.
- Maturity - Understanding early on, and putting together a plan on what maturity will look like for any service, acknowledging that every service will begin in a juvenile state, and take time to harden, mature, and become something that is dependable, reliable, and usable in a production environment.
- Isolation - Identifying resources that are being developed, maintained, and operated in isolation, and working to move them out into the mainstream. While also ensuring that any new services being developed avoid isolation, and are developed, evolved, and managed out in the open, ensuring that services never operate alone.
- Management - Including management in discussions around which resources should be developed into APIs, including leadership in all conversations involving the targeting, evolution, and even deprecation of API services. Ensuring that the prioritization of API development is always on the radar of management, and there is a general awareness regarding the delivery of services.
- Consistency - Realizing that while consistency is the goal, it may be an elusive, non-stop chase to actually realize consistency across teams. It should be a goal, but also realistically understanding that it won’t be easy to achieve, and while we want to strive for perfection, sometimes there will be good enough achieved for some services.
- Reusability - Similar to consistency, reusability is an obvious goal, and should be worked towards, but it will also be elusive, and not always reliably achieved over time. There might still be redundancy for some services, and overlapping aspects of delivering services, while some areas reusability will be achievable.
- Build It And They Will Come - There has been a significant amount of reflection regarding targeting, developing, and publishing APIs that were not needed based upon an “if you build it, they will come” mentality–where most often, nobody came, and the work was in vain.
Challenges are a fact of life in the delivery of software, and evolving complex systems using APIs. Identifying challenges should be a natural part of targeting resources for delivery as APIs. Challenges can increase fiction when delivering service, and should be carefully evaluated before tackling the development of any new services. It is easy to identify the potential of new APIs, but it takes a more seasoned eye to understand potential challenges.
New ideas for APIs will become numerous once you begin looking. Based upon existing resources, applications, and the feedback of internal groups, partners, and the public. Along with all of the possibilities that come along, a standardized, pragmatic approach to understanding the potential, value, as well as challenges with each potential idea should be part of the equation.
Defining Data Models & Standards
To help realize and deliver upon governance at scale it will take heavy investment in standardizing data models, and incorporating existing patterns and standards throughout the API delivery lifecycle. Many enterprise API development groups are streamlining and standardization the delivery of APIs through the adoption and development of standards across operations, which is something that is also contributing to adoption, integration and removing friction for application developers.
The adoption of common data models, interfaces, media types, and web standards helps contribute to the delivery of consistent APIs at scale, but they can also prove to be a challenge for some teams, and even been seen as a threat by others. There are a number of ways in which teams are pushing for standardization across their operations, and helping achieve more consistency, reuse, and the desired results across operations. Reflecting one of the strengths of web APIs, in that they employ web standards to achieve wider adoption, and the delivery of valuable resources at web scale.
A suite of approaches have emerged in the last decade for designing, developing, evolving, and applying common API patterns across the API lifecycle. These standardized approaches to defining and delivering APIs, using common machine readable specifications, and widely used patterns, have become central to API governance discussions. Providing the fuel for the growth of the API sector to serve mobile applications, as well as the growth of other emerging channels like voice, bot automation, and the connecting of everyday objects to the net. Helping the enterprise get more organized about how services are delivered across the organization at scale.
- Resource-Defined - RESTful design patterns have provided a simple approach to taking corporate resources and defining them as an intuitive, reusable, potentially web-scale stack of API resources that can be used across a variety of applications. REST provides a philosophy that can be adopted across the enterprise to help organize digital resources as a reusable stack of resources that can be discovered and put to use across many channels.
- Schema-Driven - JSON Schema is being used to take a variety of schema and standardize them for use in RESTful API resource delivery. Providing a reusable blueprint that can be used across the request and response model for all APIs. Deriving, and standardizing existing schema in use, and making available for usage in in newly developed, and evolving APIs allows for teams to achieve many of the objectives set out as part of modern API strategies.
- Domain Driven - The business domain is used across the enterprise for guiding the identification, development, evolution, and standardization of a variety of API definitions in use across the enterprise. Lines of business, industry definitions, and a focus on the domain helps establish areas of concern, and the separation of services, allowing for the decoupling of enterprise resources used across systems, but working in unison to deliver a single set of business objectives.
- Legacy Abstraction - Continue movements to decouple, redefine, and evolve legacy systems is pushing forward the identification of common patterns, and pushing to map, transform, and give them new life as newer web APIs. Taking legacy databases, system interfaces, and distilling the wisdom that exists across them, to help drive the development of common standards.
- Vocabularies - API development teams are establishing common vocabularies based upon the standardized language already in use, but also essentially taking the slang that is used in bespoke systems and helping tame it, and add it to the common lexicon when it makes sense. Providing a standard language that can be used across the enterprise to talk about services, resources, and digital assets.
- Discovery - Many groups expressed challenges around standards not seeing the desired adoption because other teams could not find existing schema, definitions, and other existing standards. Emphasizing the importance of comprehensive, actively maintained, and evangelized catalog of core definitions across the enterprise. Providing a single, or distributed location where everyone can find and publish their common definitions.
The definitions coming out of existing API development efforts are being organized into catalog, and discovery systems that can be used to guide governance efforts. Mapping out the known landscape across the enterprise, and turning it into the common patterns that can be reused across the design, development, and operation of the next generation of APIs. Distilling down the essence of the enterprise so that it can become the building blocks of an API program, while also allowing each stop along the lifecycle to be quantified, measured, and considered as part of a wider governance strategy.
Doing Business On The Web
APis are built on the web. They use, and benefit from over 25 years of the evolution of the web. There are a number of elements to consider when working to identify and define common standards for use across the governance of any API program. While the API strategy should be rooted in definitions derived from the core of the enterprise, secondarily it should be embracing the web, and employing common patterns that make the web work to use as the foundation for delivering APIs.
- Web Standards - The web is the foundation for the delivery of APIs. Most APIs will use HTTP as a transport, and be employing URLs, HTTP verbs, headers, parameters, and other common web standards. Web standards should be part of any governance strategy to help establish common patterns and definitions for use across operations.
- Media Types - Media types are a fundamental part of the web, and help establish message formats that will be widely recognized outside the enterprise, encouraging the reuse and adoption of APIs that employ common media types. Allowing consumers to negotiate the format that makes the most sense to their team, and the types of applications they are looking to develop.
- Industry Schema - Industry level schema are emerging and maturing for use across API operations. Specifications like FHIR, PSD2, and other schema, along with API design patterns are evolving to help support industry focused API operations, while encouraging reuse and interoperability across disparate groups.
- Open Source - The usage of open source software, tooling, specifications, and processes are helping deliver on the API vision across the enterprise. Web APIs reflect the open source ethos, and plays well with the delivery of web APIs. Encouraging reuse, adoption, and bringing the observability necessary to help APIs succeed.
APIs are all about doing business on the web. The web provides the platform in which any API program will operate. When it comes to defining schema, standards, and common patterns for use across API operations, the web is always the beginning of the conversation. While enterprise defined patterns will always be front and center, the standards used to operate the web should always trump localized definitions, and be given priority whenever possible. Don’t reinvent the wheel when it comes to the web, always reuse and implement what is already known.
Under The Influence
When learning about new standards, and considering which standards to adopt, it can be easy to find yourself under the influence of specific vendors, competing standards, programming communities, and other factors. Careful evaluation of standards is important, and an awareness of what some of the common elements are that may shift your opinions one way or another, or even obfuscate what is real and prevent you from achieving objectives.
- Caught in Trends - Avoid getting caught up in the trend cycles that can often make it difficult to understand the hype around specific specifications. Do your research, understand best practices and adoption levels, and make the sensible decisions around the impact to your own efforts.
- External Entities - While engaging with external entities, understand what their priorities are when it comes to standards and specifications. Consider what affinity may exist between enterprise objectives, and any external entity that is engaged with, and makes sure there is the right alignment, and influences are pushing efforts in the right direction.
- Internal Demands - Similar to external entities, understand what the internal teams priorities are and don’t always assume internal requests will have the overall enterprise objectives in mind. Fully understanding what the awareness and motivation are around the implementation of specific standards, and how the fit into the overall strategy.
- Feedback Loops - Ensure that feedback looks are diverse, and provide a wealth of opinions around what types of standards and specifications should be supported, providing the widest possible view of the landscape when it comes to adoption and investment.
- Organic Change - Keeping an eye on vendor induced standards adoption over a more organic approach to the growth of standards, internally, as well as the outside community. Working to understand when a standard is artificially inflated, amplified, for alternative objectives beyond its core mission.
There are plenty of currents to get caught up in when it comes to identifying, defining, and evolving standards. Not all will bear fruit, or realize the type of adoption they need to be successful. Establishing a balanced view of the landscape across internal, and external actors, while keeping counsel with a diverse set of voices can help ensure you understand which API specifications, standards, and definitions will help move the enterprise forward.
Taking The Lead
While there are a number of ready to use standards available for the web, and organically grown out of the API community, these standards won’t always find their way into the enterprise. Leading organizations demonstrate that it takes a structured effort to define, disseminate, educate, and evolve standards across large organizations, with a number of proven tactics for taking the lead when it comes to standardizing API infrastructure across the enterprise.
- Workshops - Organizing, conducting, and growing the number of workshops held to introduce individuals across many teams to a variety of common standards and specifications.
- Discussions - Formalizing discussions around emerging standards, and those that are in use, to help push forward awareness, and adoption of standardized approaches across groups.
- Collaboration - Push teams to work together when it comes to sharing the standards in use, showcasing the investment they’ve made, and working together to understand the tooling, services, and standards being used.
- Event Storming - Putting event storming, a rapid, lightweight group modeling technique to help accelerate the identification, evolution, and adoption of standards that meet specific team’s needs.
- Influencers - Identifying, investing in, and cultivating influencers who exist within current groups, and encourage them to evangelize and help spread the good word about standards across the enterprise.
- Ask Questions - Always be asking questions about the standards, or lack of standards in use across the enterprise, pushing the conversation forward at all times when it comes to standards.
- Challenge Assumptions - Making sure teams don’t get complacent, and the status quo is always being challenged, and that the internal domain should always be rising to a higher level of standardization whenever possible.
It takes standards bodies to move forward common standards at the web and industry levels, and it takes the same approach to push forward the adoption and usage of standards within the enterprise. Leading enterprise organizations are able to quantify, measure and evolve the infrastructure in a more organized way through the adoption of common schemas, specifications, and standards. Providing a common vocabulary for all teams to use when designing, deploying, and managing services that can be used consistently across the enterprise, and its public interests.
Development to Production
After understanding the roles needed to realize governance, more about the underlying platform architecture that is needed, how organizations can identify where the API opportunities are, and making sure groups are putting standards to work, we scrutinized how groups are moving APIs from development to production in a more structured way. Governing how teams are efficiently moving APIs from idea and design, to actually putting services to work in a production environment at scale across large teams. Documenting the lifecycle of a service, and the common elements of how enterprises are getting the job done on a regular basis.
To be able to deliver APIs at scale in a consistent way teams are relying on a well honed, well defined lifecycle that has been defined, proven, and evolved by senior teams. Forcing structure and rigor throughout the evolution of all services, putting governance in front of teams, and forcing them to execute in a consistent way if they expect to reach a production reality with their services. Focusing on a handful of structured formats for imposing governance at the deployment levels.
- Contract - Requiring ALL services begin with a machine readable OpenAPI contract defining the entire surface area of the API and its schema. Leveraging the contract as the central truth for what the service will deliver, and how governance will be measured throughout the lifecycle of the service.
- Process - Providing a well defined process for all developers laying out how any service moves from design to production, with as much detail regarding each step along the way. Helping all developers understand what they will face as they work to move their services forward in the enterprise.
- Scorecard - Having a machine readable checklist and scorecard, with tooling to help each developer fork or establish an instance for their service. Providing a checklist of everything they need to consider, that allows them to check off what has been done, what is left to be done, and providing a definition that can be used to define and report upon governance along the way.
- Cycles - Provide a variety of cycles that every service will need to go to before they will be production worthy, forcing developers to iterate upon their services, harden and mature them before they will be considered ready for production.
- Reviews - Require all services go through a series of lifecycle reviews by other teams, pushing service owners to present their work to each review team, and work with them to satisfy any concerns, and make sure it meets all governance criteria.
- Clinics - Providing developers with a variety of clinics where they can receive feedback on their work, improve upon their service, and improve the health of their work before submitting it for inclusion in a production environment.
Enterprise organizations that provide structure for API development teams find it much easier to realize their governance aspirations. The scaffolding is already there to think about the consistency of services, and the face to face, and virtual scrutiny of services helps provide the environment for governance practices to be executed, enforced, and evolved before any service reaches a production state. A well defined API deployment lifecycle will help contribute to a well defined API governance practice.
One sign of enterprise groups who are further along in their governance journeys is when there are virtualized environments being put to use. Requiring all API developers to mock and iterate upon their APIs in a virtualized environment, presenting them as if they are real, before they ever are given a license to write any code, let alone reach a production state for their services.
- Mocking - Creating mock APIs for all endpoints, virtualizing every aspect of the surface area of an API, allowing a service to be iterated upon early on in its lifecycle.
- Data - Requiring virtualized and synthesized data be present for all mocked APIs, returning realistic data with responses, reflecting behavior encountered in a production environment.
- Sandbox - Providing a complete labs and sandbox environment for developers to publish their mocked APIs into, reflecting what they’ll encounter in a production environment, but done in a much more safer and secure way.
Virtualized environments provide an important phase in the journey for APIs moving from concept to reality. Establishing a safe environment for developers to iterate upon their work, encounter many of the challenges they’ll face in a public environment, without any of the potential for harm to users or the platform. Ensuring that when a service is ready for development, most of the rough edges have been worked out of the service contract, and for the team behind.
One of the most significant ways in which we’ve found enterprise groups governing the evolution of their APIs is through the technology they employ. This technology is providing much of the structure and discipline that organizations are depending on to help ensure that APIs are being developed, and ultimately deployed in a consistent manner. Bringing most of the governance to the table for some organizations who haven’t begun moving their governance strategy forward as a formal approach.
- Authentication - Requiring standard-based approaches to authentication using Basic Auth, API keys, OAuth, and JWT. Ensuring the teams understand when to use which protocol, and how to properly configure, and use as part of larger API governance strategy.
- Framework - Relying on the programming frameworks in use to inject discipline into the process, dictating the governance of how APIs are delivered before they are ready for a production environment.
- Gateway - Applying the policies and structure necessary to govern API services as they are made available in a production environment. Many groups also had a sandbox or development edition of their gateway emulating many of the same elements that will be found in a production world.
- Management - Similar to the gateway, groups are relying on their API management layers to help govern what APIs do, providing transformations, policy templates, and a wide variety of other standardization that occurs prior to being made available in production sense.
- Vendor - The reliance on technology to deliver governance at the API deployment level gives a lot of control to vendors when it comes to governing the API lifecycle. If a vendor doesn’t provide a specific way of doing things, it may not exist within some groups. Dictating what is governance for many enterprise groups.
- Tooling - Most groups have an arsenal of open source, and custom developed tooling for helping push code from development to production, validating, scanning, transforming, shaping, and hardening code and interfaces to be ready for production usage.
- Encryption - Requiring encryption by default for storage, and in transport, using technology to ensure security is a default parameter for everything that is exposed publicly. Reducing the possibility of a breach, and minimizing the damage when one does occur.
Demonstrating how important the technological choices we make, and the architectural planning we discussed earlier is to the overall API governance conversations. The services, tooling, and applications we adopt will either contribute, or will not contribute to our governance practices. Potentially enforcing governance for all APIs as they move from development to production, in a way that teams cannot circumvent, and often times don’t even notice is occurring behind the scenes.
Augmenting the core technology, there are a number of orchestration practices we found that help quantify and enforce governance on the road from development to production. Dictating how code, artifacts, and other elements included as part of the API lifecycle move forward, evolve, or possibly get held back until specific requirements are met to meet wider governance criteria.
- Pipeline - CI/CD services, tooling, and mindset have introduced a pipeline workflow for many teams, standardizing the API delivery process as an executable, repeatable, measurable pipeline that can be realized across any team.
- Stages - The defining of clear stages that exist after development, but before production, requiring quality assurance, security reviews, compliance audits, and other relevant governance practices to be realized.
- Hooks - Well defined pre and post commit hooks for all service repositories, requiring that governance is applied throughout a service’s pipeline, and are default for all services, no matter which organization they emerge from.
- Devops - Pushing that all teams are competent, and skilled enough to execute on behalf of their services from beginning to end, owning and executing at every stage of the life cycle. Reducing the need for reliance on special teams, and eliminating bottlenecks.
- Logging - Identifying the logging capabilities of the entire stack for each service being delivered. Making sure logging is turned on for everything, and all logs are shipped to a central location for auditing, and when possible real time analysis and response.
Orchestration provides some clarity on the automation side of moving services from development to production, while also enforcing governance along the way. Allowing for an assembly line delivery of consistent services, and the iteration of each version, in alignment with the overall governance strategy. Reducing the chance for human error, and increasing the chance for consistent execution of the enterprise API strategy at scale across many different teams.
The Legal Department
Beyond the technology, the legal department should have a significant influence over APIs going from development to production. Providing a structured framework that can generally apply across all services easily, but also providing granular level control over the fine tuning of legal documents for specific services and use cases. With a handful of clear building blocks in use to help govern the delivery of APIs from the legal side of the equation.
- Terms of Services - Have universally applicable, and modular, as well as possibly machine readable, and human readable terms of service governing all services from a central location.
- Security Policy - Provide a comprehensive security policy that governs how services are secured, reflecting the technologies, checklists, tooling, and reviews that are in use by all team members, providing an overview for all stakeholders to understand.
- Licensing - Establish clear code, data, and interface licensing to be used across the entire API stack, allowing developers to properly license their services, as well as understand the provenance for the systems and services they depend on.
- Server Level Agreements - Have universally applicable, modular, as well as possibly machine readable, and human readable service level agreement (SLA) that can be applied across all services, measured, and reported upon as part of wider governance strategy.
- Scopes - Define and publish the OAuth access scopes as part of a formal set of legal guidance regarding what data is accessible via services, and the role users and developers play in managing scope that is defined by the platform.
The legal department will play an important role in governing APIs as they move from development to production, and there needs to be clear guidance for all developers regarding what is required. Similar to the technical, and business elements of delivering services, the legal components need to be easy to access, understand, and apply, but also make sure and protect the interests of everyone involved with the delivery and operation of enterprise services.
Making APIs Available to Consumers
The next step in the life cycle of properly governed APIs is making them available to consumer after they’ve been published to a production environment. The governing of APIs is not limited to the technical side of things, and this is where we begin understanding how to consistently deliver, measure, and understand the impact of API resources across the many consumers who are integrating the valuable resources into their applications. Shining a light on the business and politics of how digital assets are being put to use across the enterprise.
This portion of this governance research is intended to provide a basic list of the building blocks used by enterprise groups to help reduce friction when putting APIs to work, but also make sense of how consumers are using API resources, establishing a feedback loop that guide the road map for the future of the platform. Taking us back to the beginning of this research and informing how we should be targeting the development of new APIs, the evolution of existing services, and in many cases the deprecation and archiving of services. Ensuring governance goes well beyond the technical details, and making sure they are benefitting the platform, as well as consumers.
Making your APIs available to consumers requires doing a lot of research on who you are marketing them to, and positioning yourself to speak to an intended audience. Tailoring not just the design of your APIs, but the overall presentation, messaging, and even portal, documentation, and other building blocks to speak to a particular audience. For many API providers, APIs might be made available to multiple audience, in a variety of ways, based upon knowing their customers, and presenting exactly the resources they are needing to get a specific job done.
- Studies - Conducting regular stories about what internal, partner, and public user groups are using, and needing when it comes to developing applications, and integrating systems.
- Landscape - Establishing an understanding of the industry landscape for the area being targeted by services, and regularly tuning into and refining the understanding of what consumers are using across that landscape.
- B2B - Positioning the API to speak to a B2B audience, providing separate portal, documentation, and other resources to cater to a business audience.
- B2C - Positioning the API to speak to a B2C audience, providing separate portal, documentation, and other resources to cater to a consumer audience.
- Partners - Providing a unique set of resources that speak to partner groups, providing separate portal, documentation, and other resources to cater to exactly what existing partners will be needing.
- Internal - Positioning the API to speak to internal groups, providing separate portal, documentation, and other resources to cater to the needs of development groups within the enterprise.
- Context - Making sure services have knowledge of the context in which they will be delivering resources into. Different patterns, processes, and practices work well within different context, while others will fail, depending on the context that is relevant to the consumer.
- Office Hours - Holding conference call, or virtual office hours for different consumer groups to be available for discussion around what APIs are available, their supporting resources, as well as contributing to the platform road map.
Knowing your existing, and potential API consumers is essential to position your API program to speak to its intended targets. It is difficult to design and present the right set of resources for an audience you do not understand. Demonstrating how knowing your consumers is something that should happen before you begin the development of services, as well as an ongoing features of a platform, and that understanding the challenges of your consumer, and shifting the road map to stay in alignment with your consumer audience is critical to realizing platform-wide governance
While studying the consumer outreach strategies of leading API providers, as well as the ones that were interviewed, there are common patterns at play defining how to best reach your audience. The consistency of governed APIs speaks to how to best reach a wide audience, helps increase the impact APIs will have in the applications they are use in, and reduce overall friction and support required to operate them. There are several common patterns present when looking at how organizations are presenting APIs to their consumers publicly and privately.
- API Design - The design of APIs, using common RESTful resource design patterns, helps present simple, intuitive, and familiar resources that speak to a wide as possible audience.
- Developer Portals - Consistently designed, easy to navigate, well branded portals provide a familiar, known destination in which consumers can discover, onboard, and stay in touch with where an API platform is going.
- Documentation - Using common open source documentation across APIs provides an interactive, hands-on way for developers to learn about APIs, understand how to integrate with them, and be able to regularly check back in for added features and benefits that they provide.
- Definitions - Providing machine readable OpenAPI, and Postman Collections for consumers gives them a portable definition of what an API does, which they can use in their client tooling, to generate code libraries, setup monitors, tests, and generally understand what is possible.
Common design and presentation patterns is one of the reasons many of the leading API providers have established their foothold with their consumers. When you study the approach of Amazon, Google, Twitter, Twilio, Stripe, and other leading API providers, you see that they all use consistent design patterns, as well as provide similar portals, documentation, and other resources for their consumers. Governing the presentation layer for their API driven services, which reflects the consistency consumers are used to when working with multiple API providers across even different business sectors.
The next aspect of presenting production APIs to consumers involves communication, and ensuring that all stakeholders are kept up to speed on what is happening. Keeping a steady stream of information flowing around the platform, blending it with, and encouraging the activity on feedback loops, with an intent to drive the platform road map.
- Updates - Providing updates on a blog, or other mechanism, helping keep consumers up to date on what is going on with the platform, and making sure everyone knows there is someone home.
- Roadmaps - Publishing a public roadmap for API consumers to help them understand what is being planned, and what the future holds. Also maintaining private versions of road map for internal groups, and potentially partners.
- Issues - Being transparent and communicative around what the current known issues are with a platform, and publishing a list of anything current.
- Change Logs - Translated the road map, and issues into a change log for the platform showcasing what has historically occurred via a platform.
- Showcase - A published listing of applications and developers, showcasing the work being done by API consumers, highlighting the value a platform brings to the table.
Maintaining a steady stream of communication around what is happening with an API platform is a clear signal coming from the strongest enterprise API platforms out there. You see regular communication around what is happening, and what is being worked on, with teams reach out to each other, and sharing healthy practices, challenges, and showcasing what is being done with their API resources.
Realizing API Governance
Everything covered so far in this document feeds into what should be considered as part of the overall governance of an API platform, but focuses on the actual delivery of APIs. This is the section where we look at what is needed specifically for governance, and what teams are doing to invest in the governance of APIs across their teams, projects, and the lifecycle of their operations. There are a number of areas we identified that were relevant for groups who are actively realizing governance across their operations.
One key component of API governance at enterprise organizations who have been doing it a while, and have made significant investment in their efforts, is the presence of organized structure and teams dedicated to advancing governance across the enterprise. While these organizational structures are often defined by many different names, they have some common elements worth noting.
- Organization - Establishing a formal organization within the enterprise that is dedicated to API infrastructure, and developing a structured approach to governance, and the shared strategy across all teams.
- Core Team - Beginning with a small, focused, core team within the API strategy and governance, then expanding and growing as it makes sense, and based upon the expansion across the larger enterprise.
- Enablement Team - Providing an enablement team that can go out and work with individual teams to help enable them to realize healthier API lifecycle practices, and achieve governance objectives.
- Advisory Board - Developing an advisory board of internal, and possibly external individuals who can provide regular feedback on the API strategy, and help move forward the governance conversation.
- Legacy Teams - Involving legacy teams in the central API strategy and governance team to make sure the legacy of the enterprise is reflected and understood as efforts evolve and move forward
There were many variations in how enterprise organizations are organizing their API teams, some with more of a centralized approach, with others possessing a decentralized, and more organic feel. Some come straight out of CIO and CTO groups, where others were more bottom-up, organically grown efforts, reflecting the tone of the conversation occurring at different types of enterprise organizations.
While there were a number of approaches used to organize and execute on the API governance vision across different enterprise, there were some common approaches, and advice regarding how to do it in a pragmatic way. Providing some key elements to consider as organizations think about forming their own strategy, and putting it into motion at their own enterprise organizations.
- Start Simple - Keeping things as simple as possible when getting going. Not trying to bite off more than you can chew, and overpromising to the organization. Start with the basics, get involvement, and buy-in, then move forward in a logical fashion.
- Not Heavy Handed - Refrain from being heavy handed with governance policing and enforcement. It is repeatedly stated that heavy handed efforts get an overwhelming amount of pushback, and can set back efforts significantly.
- Inline Defined - Provide guidance, education, and artifacts inline. Do not expect people to read governance guides, and understand what is going on from the start. Feed them information on a regular basis through the channels they are already tuned into include corporate communication channels, their integrated development environments (IDE), and other existing entry points people will stay tuned on.
- Having A Mandate - Think deeply about having a mandate regarding governance. Some people think it is better stated as a mission, rather than a mandate. Considering the negative impact a mandate might have when it comes to adoption, and participation.
- Select Enforcement - Be creative in how you enforce governance across operations. Be very selective about where you enforce and push back on users. Finding inspiring, and motivational ways to enforce governance, being more carrot than stick.
- Build Community - Working to build community around the API strategy and governance organization, building relationships across teams, recruiting advocates, and working to train and education on a regular basis.
- Evangelism - Spending a significant amount of time reaching out to internal stakeholders, partner contacts, also the public when it makes sense, but most importantly, always be evangelizing to management and leadership.
Moving beyond just a group of people in name only and having an structured and planned approach to executing on the API governance across an organization on a daily, weekly, monthly, quarterly, and annual basis. Establishing a deliberate tone to the API governance effort, measuring it’s impact across groups, and adjusting and evolving as required. Developing a strong voice, and measured approach over time, while understanding what works, and what doesn’t work-being agile and flexible in how APIs are governed.
Building upon the technological layers present in every previous section of this report, we wanted to take another look at how technology is used specifically for defining, measuring, and reporting on governance efforts. Tracking on the specific technological solutions that enterprise groups are using to understand, as well as enforce the governance strategy on the ground, in real time.
- Definitions - Use OpenAPI, Postman Collections, JSON Schema, and other machine readable artifacts available for all stages of the API lifecycle to quantify, measure, and report on how well governance efforts are being realized.
- Management - The API management layer provides a number of features that help apply policies, rate limit, log, and track on what is happening with all APIs. Primarily used to understand consumer behavior, but can also be used to understand provider, publisher, and developer behavior as well.
- Gateway - Providing the single point of entry for all services, allowing for transformation, translation, as well as all the features brought to the table by API management solutions. Providing the perfect opportunity to enforce, as well as measure how well governance is being applied across the organization.
- Logging - Logs shipped centrally will be the most important way that governance efforts measures and reports on what is happening across the enterprise platform. Without a central logging strategy, governance will be flying blind, unable to see into all the services it is supposed to be governing.
- Monitoring - Making the monitoring of ALL services the default. Tracking all services from multiple regions, and understanding if they are meeting internal or external SLAs. Providing a key benchmark for whether governance is being effective across services.
- Testing - Getting much more granular and making sure that APIs are doing what they should. Taking plain business assertions, and testing them against APIs using machine-defined tests that can be executed in real-time, and on a schedule.
- Security - Gathering as much data as possible about how security is being handled, and the results of scanning, monitoring, logging, and authentication around security checkpoints.
- Reporting - Leverage management, gateway, monitoring, testing and other technology to produce reports on how well governance benchmarks are being met. Allowing the technology to do the measurement and enforcement, as well as reporting of the numbers that can be aggregated into a single set of reports to understand the impact of governance efforts.
When making decision on what technology to use as part of the delivery of API infrastructure, it’s role in the wider governance strategy should be considered. Having services and tooling inline, that can help execute and report upon governance efforts is an important aspect of being able to move forward a governance program across the enterprise. Built in governance is much more likely to be leveraged, than externally mandated tracking and reporting.
Every organization we talked to shared their frustrations and stories around the challenges they’ve faced. Many of these have been shared as part of specific section above, but we wanted to focus on the challenges with actually implementing governance itself, and look at some of the solutions groups have found for working around challenges and roadblocks.
- Co-Creation - Isolated API governance organizations stayed isolated, and groups who co-created the strategy with other teams, and worked to share ownership over the strategy, execution, and road map had much better success meeting their objectives.
- Buy-In - Getting buy-in from teams is difficult, and many have spoke of the challenges getting buy-in to the need for a centralized, or even distributed governance approach. Making it difficult to move forward when you don’t have the buy-in of some groups or management.
- Standards - While many agreed standards were good, actually getting people on the ground to adopt, use, and realize the benefits of standards across all stops along the API lifecycle has proven elusive.
- Artifacts - There just wasn’t agreement that definitive governance artifacts like guides, prototypes, and other common solutions were necessary. Some teams just disregard these artifacts, questioning the investment of time and energy to create. While others felt strongly that they were necessary to lead through example.
- Difficult Process - Over and over, API teams express that governance, and pushing for consistency across the API lifecycle was a difficult process. It sounds easy when you plan the strategy, but actually doing it on the ground never works out as you envision.
- Refine Process - You will be constantly refining your governance strategy, adjusting, removing, tweaking, and shifting the approach until you find solutions to incremental aspects of delivering APIs.
- Takes Time - All of this will take time. Be patient. Play the long game. Understand it will take much longer than you expected to see the change you envision.
There will be more challenges along the way than there will be wins, when it comes to governance of vast, complex API infrastructure. Challenges, roadblocks, and friction will exist at all stages of standardizing how APIs are delivered across the enterprise. Dealing with failure, and recognizing challenges and the potential for them to be a roadblock is important to being able to keep moving forward at any pace.
The Road To API Governance
There has been a significant uptick in the number of companies, organizations, institutions, and government agencies doing APIs since 2010, to meet the demands of web, mobile, and device applications. A very small percentage of these entities have any sort of formal governance strategy in motion to address how APIs will be delivered across their organizations. Most API providers are living in the moment, realizing they need to be addressing governance, but struggling to overcome a handful of common roadblocks.
- People - A lack of awareness, training, and communication amongst stakeholders is the biggest challenge API governance efforts face. Do not underestimate the people when crafting a technology focused effort, otherwise the people variable will be what brings it down.
- Culture - Plan for how the governance will address the culture within an organization. This is where the studies, outreach, workshops, and planning will come into play. Plan for everything taking 5 to 10 times longer than you anticipate because of the thickness, and resistance of organizational culture.
- Problems - Count on problems coming up everywhere. Dedicate a significant amount of time and resources to identifying, thinking through, and address problems that come up. Do not let problems fester, go ignored, or address.
- Existing - Map API governance efforts to the existing realities. Yes, the objective is to move the delivery of APIs to a specific destination, but the strategy needs to be rooted in what is existing, building a bridge to where we want to be.
Not all organizations will be ready for capital “G” governance, and many will have to accept inline, ongoing, lower case “g” governance. Doing what they can, with what resources they have, evangelizing, building community, and consensus along the way. While an organized, centralized, well funded governance program is ideal, and can achieve a lot, a significant amount can be done with a scrappier approach, until more traction and resources are achieved.
This report pulls together several years of research, combined with a handful of interviews with API professionals who are pushing forward the API governance conversation at their enterprise organizations. It acknowledges that the discipline of API governance is more discussion, than it is a formal discipline as of 2018. There are many ways in which API providers are governing their APIs, but few that have a formalized API governance strategy and program, and even fewer that are sharing their strategy, or lack of one in a public manner.
The objective with this report is to pull together as much information regarding how organizations are governing their APIs, and assemble the findings in the following logical order, reflecting how an organization might approach governance on the ground:
- Roles Within An Organization - Who is needed to make this happen?
- Design Software Architecture - Laying the foundation for governance.
- Identifying Potential APIs - Defining the right resources to expose.
- Defining Data Models & Standards - working to standardize how things are done.
- Development to Production - Moving from idea to reality in a standard way.
- Making APIs Available to Consumers - Exposing resources properly to consumers.
- Realizing API Governance - Moving towards a structured vision of governance.
- The Road To API Governance - Acknowledging governance is more vision than reality.
Not every detail in this report will apply to the VA, or any other enterprise organization looking to establish a wider API governance strategy. It is meant to be educational, enlightening, and show the scope of how enterprise groups are addressing governance. Allowing enterprise API efforts to learn from each other, and hopefully even share more stories regarding the challenges they face, and the success they are finding–no matter how small.
Hopefully this report reflects a patchwork of things that should be considered, rather than a complete list of what has to be done. There is no such thing as the perfect governance strategy for any API program. There is however, a great deal of things that can be done when you have the right team, the right amount of enthusiasm, and a positive outlook on what governance means. Addressing early on some of the negative perceptions that will exist out there about governance, and how it is something that comes from the top, and how it has the potential to not give regular people at the front lines a voice in the process–this is a myth, it doesn’t have to be the reality.
A definition for governance from the Oxford English Dictionary is, “the way in which an organization is managed at the highest level, and the systems for doing this”. Don’t mistake the highest level being about the highest levels of management, and let it be more about the highest levels of strategy across the organization. It is the system for governing a complex machine of API driven gears that make systems and applications work across the enterprise. It is the governance of a machine that has the potential to allow every individual within the enterprise to play an important role in influencing, allowing everyone to contribute, even if they do not work in a technical capacity within the enterprise machine.
One of the layers of the API universe where I come across an increased number Hypermedia APIs is in the movie, television, and entertainment space. Where having a more flowing API experience makes a lot of sense, and the extra investment in link relations will pay off. One example of this I recently came across was over at TVMaze, who has a pretty robust hypermedia API, where they opted for using HAL as their media type.
Like any good hypermedia should, TVMaze begins with its root URL: http://api.tvmaze.com, and provides a robust set of endpoints from there:
- Show main information
- Show episode list
- Episode by number
- Episodes by date
- Show seasons
- Season episodes
- Show cast
- Show crew
- Show AKA’s
- Show index
The TVMaze API isn’t an overly complex hypermedia API. I think it is simple, elegant, and shows how you can use link relations to establish a more meaningful experience for API consumers. Allowing you to navigate the large, ever-changing catalog of television shows, allowing the API client to do the heavy lifting of navigating the shows, schedules, and people involved with each production.
There hasn’t been enough showcasing of the hypermedia APIs available out there. Usually once a year I remember to give the subject some attention, or when I come across interesting ones like TVMaze. Hypermedia isn’t just an academic idea anymore, and is something that has gotten traction in a number of sectors, and I keep seeing signs of growth and adoption. I don’t think it will be the API solution most hypermedia believers envisioned it, but I do think it is a viable tool in our API toolbox, and for the right projects it makes a lot of sense.
While profiling any company, a couple of the Google searches I will execute right away are for “[Company Name] Swagger” and “[Company Name] OpenAPI”, hoping that a provide is progressive enough to have published an OpenAPI definition–saving me hours of work understanding what their API does. I’ve added a third search to my toolbox, if these other two searches do not yield results, searching for “[Company Name] Postman”, revealing whether or not a company has published a Postman Collection for their API–another sign of a progressive, outward thinking API provider in my book.
A machine readable definition for an API tells me more about what a company, organization, institution, or government agency does, than anything else I can dig up on their website, or social media profiles. An OpenAPI definition or Postman Collection is a much more honest view of what an organization does, than the marketing blah blah that is often available on a website. Making machine readable definitions something I look for almost immediately, and prioritize profiling, reviewing, and understanding the entities I come across with a machine readable definition, over those that do not. I only have so much time in a day, and I will prioritize an entity with an OpenAPI or Postman, over those who do not.
The presence of an OpenAPI and / or Postman Collection isn’t just about believing in the tooling benefits these definitions provide. It is about API providers thinking externally about their API consumers. I’ve met a lot of API providers who are dismissive of these machine readable definitions as trends, which demonstrates they aren’t paying attention to the wider API space, and aren’t thinking about how they can make their API consumers lives easier–they are focused on doing what they do. In my experience these API programs tend to not grow as fast, focus on the needs of their integrators and consumers, and often get shut down after they don’t get the results they thought they’d see. APIs are all about having that outward focus, and the presence of OpenAPI and Postman Collection are a sign that a provider is looking outward.
While I’m heavily invested in OpenAPI (I am member), I’m also invested in Postman. More importantly, I’m invested in supporting well defined APIs that provide solutions to developers. When an API has an OpenAPI for delivering mocks, documentation, testing, monitoring, and other solutions, and they provide a Postman Collection that allows you to get up an running making API calls in seconds or minutes, instead of hours or days–it is an API I want to know more about. Making these potential searches the deciding factor between whether or not I will continue profiling and reviewing an API, or just flagging it for future consideration, and moving on to the next API in the queue. I can’t keep up with the number of APIs I have in my queue, and it is signals like this that help me prioritize my world, and get my work done on a regular basis.
It is interesting for me to still regularly come across so many API providers who have a public API portals, but insist on keeping most of their documentation behind a login. Stating that they are concerned with competitors getting access to the design of their API and the underlying schema. Revealing some indefensible API business models, and general paranoia around doing business on the web. Something that usually is a sign for me of a business that is working really hard to maintain a competitive grip within an industry, without actually having to do the hard work of innovating and moving the conversation forward.
Confident API providers know that you can put your API documentation out in the open, complete with schema, without giving away the farm. If your competition can take your API design, and underlying schema, and recreate your business–you should probably go back to the drawing board, and come up with a new business idea. Your API and schema definition is not your business. I’ve used this comparison may times–your API docs are like a restaurant menu. Can you imagine restaurants that kept them hidden until they were sure you are going to be a customer? If you think that your competition can read your menu and recreate all your dishes, then you won’t be in business very long, because your dishes probably weren’t that special to begin with.
For every competitor you keep out of your API documentation, you are keeping twenty new customers out as well. I’m guessing that your savvy competitors are going to be able to get in anyways with a fake account, or otherwise. Don’t waste your time on hiding your API and keeping it out of the view of your potential customers–invest your energy in making sure your APIs kick ass. To use the restaurant analogy again, make sure ingredients are the best, and your processes, and your service are top notch. Don’t make your menu hard to get, it just shows how out of touch you are with the mainstream world of APIs, and your worst fears will come true–someone will come along and do what you do, but even better, and you will become irrelevant.
Be proud of your APIs, and publish them prominently in your API portal. Make sure you have a OpenAPI definition handy, driving your documentation, tests, monitors, and other elements of your operations. Also make sure you have Postman Collections available, allowing your API definition to be portable and importable into the Postman client, allowing consumers to get up and running making calls in minutes, not hours or days. Get out of the way of your API consumers, don’t put up unnecessary, outdated obstacles in their way. I know that you feel you know best because you’ve been doing this for so long, and know your industry, but the world is moving on, and APIs are about doing business on the web in a much more open, accessible, and self-service way. If you aren’t moving in this direction, I’m guessing you won’t be doing what you do for much longer, because someone will come along who can move faster and be more open.
I’m regularly fascinated with API development teams I work with expressing their concern with working with many Github repositories. With all of the complexity I watch teams embrace when it comes to frameworks, scaffolding, continuous integration, deployment, and orchestration solutions, I’m lost on why many Github repositories suddenly become such a challenge. A Github organization is a folder, and a Github repository is a folder, that you can check out locally, on your server, or work with via an API, or Git. It is a distributed file store, that you can orchestrate with programmatically, and can be as logical or illogical as you design it to be.
I work really hard to keep technical complexity to a minimum in my world, but this means limiting unnecessary vendor lock-in, and tech just because it is the latest trend. For me, Git is just a distributed file system, with version control built in, with Github providing a nice API and network effect layer that makes it compelling. Referencing a Github folder is just a matter of using its org and repo name, checking it out, and working with the standardized layout of information I have published there. Allowing me to work with, and orchestrate thousands of separate folders (repos), across almost 50 organizations (folders), in a consistent way, as a one person team. Something that has taken me about four years to setup, and fine tune, but has become essential to what I do on a daily basis.
I’m not saying working with hundreds or thousands of individual Github repositories can’t be complex, or doesn’t take a significant amount of work. I’m just saying I’m intrigued by how technologists who manage large systems, adopt complex frameworks for delivering simple web solutions, and regularly make other complex technological investments, draw the line here. I see Github as a robust file store, with two doorways, 1) Git, and 2) API. I have a standardized structure to what I store on Github, something that is similar to my Amazon S3 store, or my backend of databases, and is something that takes discipline to maintain and keep from being unwieldy, but if I do the hard work it is possible. I’m guessing folks who see Github as a lot of work are seeing it through a web or desktop UI lens, and haven’t stopped to think about through an API or CLI lens–I find my AWS, Google, and other platforms to be complicated if I only look at them through the UI.
I enjoy being able to checkout and work with the relevant repositories in my world, and automate my backend and front end systems to automate with the same repositories. My blog schedules, checks out, and publishes posts across 250 repositories. My curation system pulls from my Feedly every day, organizing what I’ve bookmarked and tagged across almost 500 repositories. My API system updates and publishes APIs.json, OpenAPI, and Postman collections across almost 4,000 separate repositories, as I discovery, profile, and make sense of the API landscape. I don’t see this as complexity, I see it as pretty simple, Git, Jekyll, HTML, CSS, JS, JSON, and YAML driven goodness. Something I can easily migrate off of Github if I wanted, and run on AWS, Google, Azure, or my own server infrastructure. For now I’m enjoying the network effect provided by Github, and the power of their API when it comes to more granular changes, and tuning into valuable signals that are available via the social platform.
The conversations I’m having around the complexities of managing many Github repositories are a reminder for me of how technological complexity is relative. Some will see opportunity, while others see complexity, and vice versa. If you understand something, there is a lesser chance you will see complexity–you’ve made the investment already. For many, Github is a burden. I see it as liberating, and something to be orchestrated, providing me with some of the most significant signals I can find online–rivaling Twitter when it comes to driving my work. I can see managing Github through the web or desktop interface being pretty cumbersome, but once you’ve elevated beyond these tools, and work with repositories using the API and CLI, taking full advantage of the Git in Github, the landscape changes, and become less complex, and a more empowering experience. Something I don’t think everyone will fully realize, and be able to get beyond their view of Github as complex, difficult, and challenging. And, thats ok.
I am working on delivering a handful of new APIs, which I will be turning into products. I will be prototyping, developing, and operating them in production environments for myself, and for a handful of customers. To help guide my workflow, I’ve crafted this API lifecycle definition to help direct my efforts in an ongoing lifecycle approach.
Define - Define the problem and solution in a human and machine readable way.
- Repository - Establish a Github repository.
- README - Craft a README for the repository.
- Title - Provide title for the service.
- Description - Provide concise description for the service.
- Goals - Establish goals for the service.
- Schema - Organize loose, and JSON schema for the service.
- OpenAPI - Establish an OpenAPI for the service.
- Assertions - Craft a set of assertions for the service.
- Team - Define the team behind the service.
Design - Establish a base set of design practices for the service.
- Versioning - Determine how the code, schema, and the API be versioned.
- Base Path - Set the base path for the service.
- Path(s) - Define a set of resource paths for the service.
- Verb(s) - Define which HTTP, and other verbs will be used for the service.
- Parameters - Define a list of query parameters in use to work with service.
- Headers - Define the HTTP Headers that will be used to work with the service.
- Response(s) - Provide a resulting message and associated schema definition for the service.
- Media Types - Define whether service will return CSV, JSON, and / or XML responses.
- Status Codes - Define the available status code for each responses.
- Pagination - Define how pagination will be handled for requests and responses.
- Sorting - Define how sorting will be handled for requests and responses.
Database - Establish the base for how data will be managed behind the service.
- Platform - Define which platform is in use to manage the database.
- Schema - Establish the schema used for database based upon definition provided.
- Location - Define where the database is located that supports services.
- Logs - Define where the database logs are located that support services.
- Backup - Define the database backup process and location for service.
- Encryption - Define the encryption layer for the service database.
Storage - Establish how all objects will be stored for the service.
- Platform - Define which platform is used to store objects.
- Location - Define where objects are stored behind the service.
- Access - Quantify how objects access is provided behind the service.
- Backup - Define the backup process for objects behind the service.
- Encryption - Define the encryption layer for stored objects.
DNS - Establish the DNS layer for this service.
- Platform - Define which platform is used to operate DNS.
- Prototype - Provide host for prototype of service.
- Mock - Provide host for mock of service.
- Development - Provide host for development version of service.
- Production - Provide host for production version of service.
- Portal - Provide host for the portal for this service.
- Encryption - Define the encryption layer for service in transport.
Mocking - Provide a mock representation of this service.
- Paths - Providing virtualized paths for the API driving service.
- Data - Providing synthesized data behind each API response for service.
Deployment - Define the deployment scaffolding for this service.
- Platform - Define the platform used to deploy this service.
- Framework - Define the code framework used to deploy service.
- Gateway - Define the gateway used to deploy service.
- Function - Define the function(s) used to deploy service.
- Containers - Define the container used to deploy service.
- Pipeline - Define the pipeline in place to build and deploys service.
- Location - Define where the service is deployed to.
Orchestration - Define how the service will be orchestrated.
- Build - Define the build process for this service.
- Hooks - Detail the pre and post commit hooks in use for this service.
- Jobs - Define the jobs being executed as part of this service operations.
- Events - Define the events that are in play to help operate this service.
- Schedule - Details of the schedules used to orchestrate this service.
Dependencies - Providing details of the dependencies that exist for this service.
- Service - Details of other services this service depends upon.
- Software - Details of other software this service depends upon.
- People - Details of other people this service depends upon.
Authentication - Details regarding authentication in use for this service.
- Type - Define whether this service uses Basic Auth, API Keys, JWT, or OAuth for authentication.
- Overview - Provide a location of the page that delivers an overview of this services authentication.
Management - Define the management layer for this service’s API.
- Platform - Defining the platform used for the API management layer.
- Administration - Provide a location for administrating the management layer.
- Signup - Provide a location for users to signup for access to this service.
- Login - Provide a location for users to login to access to this service.
- Account - Provide a location for users to access a dashboard for this service.
- Applications - Provide the location of applications that are approved to use service.
Logging - Define the logging layer for supporting this service.
- Database - Define the logging for the database layer.
- API - Define the logging for the API access layer.
- DNS - Define the logging for the DNS layer.
- Shipping - Define how longs are shipped or centralized for auditing.
Monetization - Define the costs associated with the delivery of this service.
- Acquisition - Provide costs associated with acquisition of resources behind service.
- Development - Provide costs associated with the development of this service.
- Operation - Provide costs associated with the operation of this service.
- Value - Provide a description of the value delivered by this service.
Plans - Define the operational plan for the business of this service.
- Tiers - Define the tiers of access in place to support this service.
- Elements - Define the elements of access for each tier for this service.
- Paths - Define which API paths are available as part of each tier or service.
- Metrics - Provide a list of metrics being used to measure service access.
- Timeframes - Define the timeframes in use to measure access to this service.
- Limits - Define what limitations and constraints are in place for this service.
- Pricing - Define the monetary value in place to define the price for this service.
Portal - Define the public or private portal in use to present this service.
- Hosting - Provide details on where this service portal is hosted.
- Template - Define which graphical UI and brand template is in use for this portal.
- Analytics - Define which analytics package is used to measure traffic for portal.
Documentation - Provide documentation for this service within portal.
- Overview - Publish concise over for this service’s documentation.
- Paths - Publish an interactive list of API paths available for service.
- Examples - Provide as many examples of API requests in a variety of languages.
- Definitions - Publish a list of schema definitions in use by this service.
- Errors - Provide a list of available errors users will encounter for this service.
Getting Started - Provide a getting started page for this service within portal.
- Overview - Provide an introduction to the getting started process for this service.
- Signup - Provide a link to where users can signup for this service.
- Authentication - Provide a link to the authentication overview for this service.
- Documentation - Provide a link to the documentation for this service.
- SDKs - Provide a link to where users can find SDKs and code libraries for this service.
- FAQ - Provide a link to to the frequently asked questions for this service.
- Support - Provide a link to where users can get support for this service.
SDKs - Providing code samples, libraries, or complete software development kits (SDKs).
- PHP - Provide a PHP SDK.
- Python - Provide a Python SDK.
- Ruby - Provide a Ruby SDK.
- Go - Provide a Go SDK.
- Java - Provide a Java SDK.
- C# - Provide a C# SDK.
- Node.js - Provide a Node.js SDK.
FAQ - Publish a list of the frequently asked questions (FAQ) for this service.
- Categories - Break all questions down by logical categories.
- Questions - Publish a list of questions with answers within each category.
- Ask Question - Provide a form for users to ask a new question.
Road Map - Provide a road map for the future of this service.
- Private - Publish a private, internal version of entries for the road map.
- Public - Publish a publicly available version of entries for the road map.
- Suggest - Provide a mechanism for users to make suggestions for the road map.
Issues - Provide a list of currently known issues for this service.
- Entries - Publish a list of all known issues currently outstanding.
- Report - Provide a mechanism for users to report issues.
Change Log - Providing a historical list of what has changed for this service.
- Outline - Publish a list of all road map and issue entries that have been satisfied for this service.
Communication - Establish a communication strategy for this service.
- Blog - Provide a simple blog and update mechanism for this service.
- Twitter - Provide the Twitter handle that is used as part of this service.
- Github - Provide the Github account or organization behind this service.
- Internal - Provide a location where internal communication is available.
- External - Provide a location where public communication is available.
Support - Establish the support apparatus in place for this service.
- Email - Define the email account used to support this service.
- Issues - Provide a URL to the repository issues to support this service.
- Tickets - Provide a URL to the ticketing system used to support this service.
Licensing - Provide a set of licensing in place for this service.
- Server - Define how all backend server code is licensed for this service.
- Data - Define how all data and schema is licensed for this service.
- API - Define how the API definition is licensed for this service.
- SDK - Define how all client code is licensed for this service.
Legal - Provide a set of legal documents guiding the service.
- Terms of Service - Provide a terms of service for this service to operate within.
- Service Level Agreement - Provide a service level agreement (SLA) for this service to operate within.
Monitoring - Defining the uptime monitoring for this service.
- Monitors - Establish the monitors for this service.
- Status - Provide real time details of monitor activity.
Testing - Defining the testing for this service.
- Assertions - Provide details of the assertions being tested for.
- Results - Provide real time details of testing activity.
Performance - Defining the performance monitoring for this service.
- Tests - Provide details of the performance testing in place for this service.
- Results - Provide real time details of performance testing activity.
Security - Defining the security practices in place for this service.
- Overview - Provide the URL of the security practices overview page.
- Policies - Define the security, IAM, and other policies are in place for this service.
- Tests - Define the security tests that are conducted for this service.
- Results - Provide real time details of security testing activity.
Discovery - Defining the discovery aspects for this service.
- API Discovery - Publish an API Discovery (APIs.json) index for the project.
- OpenAPI - Provide URL for all OpenAPI definitions and index in API discovery index.
- Postman Collection - Provide URL for all Postman Collections and index in API discovery index.
Analysis - Define the analysis in play for this service.
- Traffic - Document traffic to the service portal.
- Usage - Document usage of APIs for the service.
- Activity - Document other activity around the service.
- SLA - Document whether the SLA was met for this service.
Stages - Define the stages that are applied to this lifecycle outline.
- Prototype - When a prototype of this service is being developed.
- Development - When a production instance of service is being developed.
- Production - When a production instance of service is being operated.
Maintenance - Define the maintenance cycles applied to this lifecycle outline.
- Daily - Provide a version of this outline that should be considered daily.
- Weekly - Provide a version of this outline that should be considered weekly.
- Monthly - Provide a version of this outline that should be considered monthly.
- Releases - Provide a version of this outline that should be considered for each release.
- Governance - Provide an outline of how this outline is measured, reported, and evolved.
This outline gets executed differently depending on the service stage and maintenance cycle being executed. It is meant to provide a master checklist to consider from day one, and every other time this service is being versioned, maintained, and considered as part of my overall operations. Providing a living checklist, and scorecard rubric for how well this service is doing, depending on stage and maintenance dimensions.
Ultimately the API Discovery (APIs.json) document is the heartbeat for this service checklist, which resides in the root of the Github repository will provide the machine readable index for reporting and governance, while also driving the human readable interface that is accessible via the service portal, which is driven by Jekyll running in Github Pages. Which is something I will publish more about next, as part of a working portal for one of the first services coming off the assembly line.
You don’t usually find me writing about API acquisitions unless I have a relationship with the company, or there are other interesting aspects of the acquisition that makes it noteworthy. This acquisition of Elastic Beam by Ping Identity has a little of both for me, as I’ve been working with the Elastic Beam team for over a year now, and I’ve been interested in what Ping Identity is up to because of some research I am doing around open banking in the UK, and the concept of an industry level API identity and access management, as well as API management layer. All of which makes for an interesting enough mix for me to want to quantify here on the blog and load up in my brain, and share with my readers.
From the press release, “Ping Identity, the leader in Identity Defined Security, today announced the acquisition of API cybersecurity provider Elastic Beam and the launch of PingIntelligence for APIs.” Which I think reflects some of the evolution of API security I’ve been seeing in the space, moving being just API management, and also being about security from the outside-in. The newly combined security solution, PingIntelligence for APIs, focuses in on automated API discovery, threat detection & blocking, API deception & honeypot, traffic visibility & reporting, and self-learning–merging the IAM, API management, and API security realms for me, into a single approach to addressing security that is focused on the world of APIs.
While I find this an interesting intersection for the world of APIs in general, where I’m really intrigued by the potential is when it comes to the pioneering open banking API efforts coming out of the UK, and the role Ping Identity has played. “Ping’s IAM solution suite, the Ping Identity Platform, will provide the hub for Open Banking, where all UK banks and financial services organizations, and third-party providers (TPPs) wanting to participate in the open banking ecosystem, will need to go through an enrollment and verification process before becoming trusted identities stored in a central Ping repository.” Which provides an industry level API management blueprint I think is worth tuning into.
Back in March, I wrote about the potential of the identity, access management, API management, and directory for open banking in the UK to be a blueprint for an industry level approach to securing APIs in an observable way. Where all the actors in an API ecosystem have to be registered and accessible in a transparent way through the neutral 3rd party directory, before they can provide or access APIs. In this case it is banking APIs, but the model could apply to any regulated industry, including the world of social media which I wrote about a couple months back as well after the Cambridge Analytics / Facebook shitshow. Bringing API management and security out into the open, making it more observable and accountable, which is the way it should be in my opinion–otherwise we are going to keep seeing the same games being played we’ve seen with high profile breaches like Equifax, and API management lapses like we see at Facebook.
This is why I find the Ping Identity acquisition of ElasticBeam interesting and noteworthy. The acquisition reflect the evolving world of API security, but also has real world applications as part of important models for how we need to be conducting API operations at scale. ElasticBeam is a partner of mine, and I’ve been talking with them and the Ping Identity team since the acquisition. I’ll keep talking with them about their road map, and I’ll keep understanding how they apply to the world of API management and security. I feel the acquisition reflects the movement in API security I’ve been wanting to see for a while, moving us beyond just authentication and API management, looking at API security through an external lens, exploring the potential of machine learning, but also not leaving everything we’ve learned so far behind.
There are a lot of people making money off of the acquisition, organization, and providing access to data in our digital world. While I quietly tune into what the data monetization trends are, I am also actively looking for interesting approaches to generating revenue from data, but specifically with an eye on revenue sharing opportunities for the owners or stewards of that data. You know as opposed to just the exploitation of people’s data, and generating of revenue without them knowing, or including them in the conversation. To help counteract this negative aspect of the data economy, I’m always looking to highlight (potentially) more positive outcomes when it comes to making money from data.
I was recently profiling the API of the people intelligence platform LotaData, and I came across their data monetization program, which provides an interesting look at how platforms can help data stewards generate revenue from data, but in a way that makes it accessible to individuals looking to monetize their own data as well. “LotaData’s AI platform transforms raw location signals into ‘People Intelligence’ for monetization, usually based upon the follow key attributes: latitude, longitude, timestamp, deviceID, and accuracy.”
Representing activity at a location and/or point in time, allowing LotaData’s to understand what is happening at specific places at scale, and develop meaningful insights and behavioral segments that other companies and government agencies will want to buy into. Some of the examples they provide are:
- Commuting daily on CalTrain from Palo Alto to San Francisco
- Mid-week date night at Nopa on the way back from work
- Sweating it out at Soul Cycle on Saturday mornings
- Taking the dog out for a walk on Sunday afternoons
- Season ticket holder for Warriors games at the Oakland Arena
LotaData’s location-based insights and segments are entirely inferred from raw location signals, emphasizing that they do not access or collect any personally identifiable information (PII) from mobile phones–stating that they “do not and never will collect PII such as name, email, phone number, date of birth, national identifier, credit cards, or other sensitive information”. Essentially walking on the light side of the whole data acquisition and monetization game, and playing the honest card when it comes to the data economy.
When it comes to the monetization of data, LotaData enables marketers, brands, city governments and enterprise businesses to purchase location-based insights–providing an extensive network of data buyers who are ready to purchase the insights generated from this type of data. Then the revenue generated from the sale of an insight is split proportionately and shared with the app developers who contributed their app data. With the SDK agreement with LotaData governing the payment terms, conditions and schedule for sharing revenue. However, if you are unable to integrated LotaData’s SDK in a mobile app for any reason, they can offer you alternative ways to share and monetize your location data:<p></p>
- Geo-Context API - The Geo-Context API is a simple script that you can embed in your mobile web sites and web apps. The script collects location data with explicit notice and permission obtained from end users.
- Bulk Data Transfer - Customers that are proficient in collecting location signals from their mobile apps, websites or other services, can easily upload their historical location archives to LotaData’s cloud for analyzing, inferring and monetizing mobile user segments. The data can be transferred to LotaData by configuring the appropriate access policies for AWS S3 buckets.
- Integration - LotaData can integrate with CRM and in-house data warehouse systems to ingest custom datasets or usage logs for deep analysis, enrichment and segmentation. Questions?
Providing a pretty compelling model for data providers to monetize their location based data. It is something I’ll be exploring more regarding how individuals can aggregate their own personal or professional data, as well as take advantage of the geo context API, bulk data transfer, or other integration opportunities. I have no idea how much money an individual or company could make from publishing data to LotaData, but the model provides an interesting approach that I think is worth exploring. It would be interesting to run a 30 to 90 test of tracking all of my location data, uploading it to LotaData, and then sharing the revenue details about what I can make with through a single provider like LotaData, as well as explore other potential providers so that you could sell your location data multiple times.
In a world where our data is the new oil, I’m interested in any way that I can help level the playing field, and seeing how we can put more control back into the device owners hands. Allowing mobile phone, wearable, drone, automobile, and other connected device owners to aggregate and monetize their own data in a personal or professional capacity. Helping us all better understand the value of our own bits, and potentially generating some extra cash from its existence. I don’t think any of us are going to get rich doing this, but if we can put a little cash back in our own pockets, and limit the exploitation of our bits by other companies and device manufacturers, it might change the game to be a little more in our favor.
It is funny work with companies, organizations, institutions, and government agencies of all shapes and sizes, and learn all the weird practices they have, and the strange belief systems they’ve established. One day I will be talking to a 3 person startup, the next day I’ll be talking with a large bank, and after that I’ll be working with a group at a massive government agency. I have to be mindful of my time, make sure I’m meeting my mission, having an impact, as well as paying my bills, but for the most part I don’t have any entrenched rules about who I will talk to, or who I will share my knowledge with.
One thing I chuckle at regularly is when I come across large organizations who will gladly talk with me, and tap my knowledge, but won’t work with some of the startups I work with, or the conferences I produce, because they are “too small”. They can’t waste their time working with small startups because it won’t bring the scope of revenue they need to justify the relationships, but they’ll gladly talk to me and welcome the exposure and knowledge I might bring. Sometimes I feel like telling organizations, “sorry you are just to large to work with, you are almost guaranteed to fail at this whole API thing, why should I bother?” I think I’ll say it sometimes jokingly, but not really interested in truly being a dick at that level.
Most large organizations can’t figure out how to work with me in any long term anyway, because they are too bureaucratic and slow moving. Other large organizations have no problem figuring out how to get me past legal, and getting me paid, but some just can’t figure it out. I had one large enterprise group who follows my work, wanted to get me in really badly, but their on-boarding team needed proof that I was the API Evangelist going back every year since 2010, a letter from client, tax returns, or other proof that I was who I say I was–just so I could share my knowledge with them. Um, ok? You really are going to put up so many barriers to people coming into your organization and sharing knowledge? I’m guessing you aren’t going be very good at this whole API thing, with these types of barriers in the way.
I know I can’t change the way large organization behave, but know I can influence their behavior. I’ve done it before, and I’ll keep doing it. Especially when large organizations reach out to me, asking to help them in their journey. At 99% of them I will have no impact, but it is the other 1% that I’m hoping to influence in some way. I can also regularly point out how silly their organizations are, even if the people I’m working with are well aware of the state of things. I know it isn’t how ALL large organizations have to behave because I do a lot of business with large entities, who are able to get me through legal, and able to pay me without problems. Somewhere along the way, certain organizations have made the decision to be more bureaucratic and the trick is going to be how do you begin unwinding this–this is what the API journey is all about.
I import and work with a number of OpenAPI definitions that I come across in the wild. When I come across a version 1.2, 2.0, 3.0 OpenAPI, I import them into my API monitoring system for publishing as part of my research. After the initial import of any OpenAPI definition, the first thing I look for is the consistent in the naming of paths, the availability of summary, descriptions, as well as tags. The naming conventions used is paths is all over the place, some are cleaner than others. Most have a summary, with fewer having descriptions, but I’d say about 80% of them do not have any tags available for each API path.
Tags for each API path are essential to labeling the value a resource delivers. I’m surprised that API providers don’t see the need for applying these tags. I’m guessing it is because they don’t have to work with many external APIs, and really haven’t put much thought into other people working with their OpenAPI definition beyond it just driving their own documentation. Many people still see OpenAPI as simply a driver of API documentation on their portal, and not as an API discovery, or complete lifecycle solution that is portable beyond their platform. Not considering how tags applied to each API resource will help others index, categorize, and organize APIs based upon the value in delivers.
I have a couple of algorithms that help me parse the path, summary, and description to generate tags for each path, but it is something I’d love for API providers to think more deeply about. It goes beyond just the resources available via each path, and the tags should reflect the overall value an API delivers. If it is a product, event, messaging, or other resource, I can extract a tag from the path, but the path doesn’t always provide a full picture, and I regularly find myself adding more tags to each API(if I have the time). This means that many of the APIs I’m profiling, and adding to my API Stack, API Gallery, and other work isn’t as complete with metadata as they possibly could be. Something API providers should be more aware of, and helping define as part of their hand crafting, or auto-generation of OpenAPI definitions.
Tag your APIs as part of your OpenAPI definitions! I know that many API providers are still auto-generating from a system, but once they have published the latest copy, make sure you load up in one of the leading API design tools, and give that last little bit of polish. Think of it as that last bit of API editorial workflow that ensures your API definitions speak to the widest possible audience, and are as coherent as it possibly can. Your API definitions tell a story about the resources you are making available, and the tags help provide a much more precise way to programmatically interpret what APIs actually deliver. Without them APIs might not properly show up in search engine and Github searches, or render coherently in other API services and tooling. OpenAPI tags are an essential part of defining and organizing your API resources–give them the attention they deserve.
I gave a talk early in in June at POST/CON 2018 in San Francisco. The conference was a great mix of discussions reflecting the Postman community. You can find all the talks on Google, including mine about moving towards a modern AP lifecycle.
You can find all the stop along what I consider to be a modern API lifecycle on the home page of API Evangelist, with links to any of my research, services, tooling, and other storytelling I’ve done in each area.
Thanks again to Postman for having me out!
I am shifting my long running API operations from a PHP / EC2 based implementation to a more efficient Node.js / Lambda based solution, and I promised James Higginbotham (@launchany) that I’d share a breakdown of my process with him a month or so back. I’m running 100+, to bursts of 1000+ long running API requests for a variety of purposes, and it helps me to tell the narrative behind my code, introducing some coherence into the why and how of what I’m doing, while also sharing with others along the way. I had covered my earlier process a little bit in a story a few months ago, but as I was migrating the process, I wanted to further flesh out, and make sure I wasn’t mad.
The base building block of each long running API request I am making is HTTP. The only difference between these API requests, and any others I am making on a daily basis, is that they are long running–I am keeping them alive for seconds, minutes, and historically hours. My previous version of this work ran as long running server side jobs using PHP, which I monitored and kept alive as long as I possibly could. My next generation scripts will have a limit of 5 minutes per API request, because of constraints imposed by Lambda, but I am actually find this to be a positive constraint, and something that will be helping me orchestrate my long running API requests more efficiently–making them work on a schedule, and respond to events.
Ok, so why am I running these API calls? A variety of reasons. I’m monitoring a Github repository, waiting for changes. I’m monitoring someone’s Twitter account, or a specific Tweet, looking for a change, like a follow, favorite, or retweet. Maybe I’m wanting to know when someone asks a new question about Kafka on Stack Overflow, or Reddit. Maybe I’m wanting to understand the change schedule for a financial markets API over the course of a week. No matter the reason, they are all granular level events that are occurring across publicly available APIs that I am using to keep an eye on what is happening across the API sector. Ideally all of these API platforms would have webhook solutions that would allow for me to define and subscribe to specific events that occur via their platform, but they don’t–so I am doing it from the outside-in, augmenting their platform with some externally event-driven architecture.
An essential ingredient in what I am doing is Streamdata.io. Which provides me a way to proxy any existing JSON API, and turn into a long running / streaming API connection using Server-Sent Events (SSE). Another essential ingredient of this is that I can choose to get my responses as JSON PATCH, which only sends me what has changed after the initial API response comes over the pipes. I don’t receive any data, unless something has changed, so I can proxy Github, Twitter, Stack Overflow, Reddit, and other APIs, and tailor my code to just respond to the differential updates I receive with each incremental update. I can PATCH the update to my initial response, but more importantly I can take some action based upon the incremental change, triggering an event, sending a webhook, or any other action I need based upon the change in the API space time continuum I am looking for.
My previous scripts would get deployed individually, and kept alive for as long as I directed the jobs manager. It was kind of a one size fits all approach, however now that I’m using Lambda, each script will run for 5 minutes when triggered, and then I can schedule to run again every 5 minutes–repeating the cycle for as long as I need, based upon what I’m trying to accomplish. However, now I can trigger each long running API request based upon a schedule, or based upon other events I’m defining, leveraging AWS Cloudwatch as the logging mechanism, and AWS Cloudwatch Events as the event-driven layer. I am auto-generating each Node.js Lambda script using OpenAPI definitions for each API, with a separate environment layer driving authentication, and then triggering, running, and scaling the API streams as I need, updating my AWS S3 Lake(s) and AWS RDS databases, and pushing other webhook or notifications as I need.
I am relying heavily on Streamdata.io for the long running / streaming layer on top of any existing JSON API, as well as doing the differential heavy lifting. Every time I trigger a long running API request, I’ll have to do a diff between it’s initial response, and the previous one, but every incremental update for the next 4:59 is handled by Streamdata.io. Then AWS Lambda is doing the rest of the triggering, scaling, logging, scheduling, and event management in a way more efficient way than I was previously with my long running PHP scripts running as background jobs on a Linux EC2 server. It is a significant step up in efficiency and scalability for me, allowing me to layer on an event-driven layer on top of existing 3rd party API infrastructure I am depending on to keep me informed of what is going on, and keep my network of API Evangelist research moving forward.
I’m doing a lot of thinking regarding how JSON PATCH can be applied because of my work with Streamdata.io. When you proxy an existing JSON API with Streamdata.io, after the initial response, every update sent over the wire is articulated as a JSON PATCH update, showing only what has changed. It is an efficient, and useful way to show what has changed with any JSON API response, while being very efficient about what you transmit with each API response, reducing polling, and taking advantage of HTTP caching.
As I’m writing an OpenAPI diff solution, helping understand the differences between OpenAPI definitions I’m importing, and allowing me to understand what has changed over time, I can’t help but think that JSON PATCH would be a great way to articulate change of the surface area of an API over time–that is, if everyone loyally used OpenAPI as their API contract. Providing an OpenAPI diff using JSON PATCH would be a great way to articulate an API road map, and tooling could be developed around it to help API providers publish their road map to their portal, and push out communications with API consumers. Helping everyone understand exactly what is changing in way that could be integrated into existing services, tooling, and systems–making change management a more real time, “pipelinable” (making this word up) affair.
I feel like this could help API providers better understand and articulate what might be breaking changes. There could be tooling and services that help quantify the scope of changes during the road map planning process, and teams could submit OpenAPI definitions before they ever get to work writing code, helping them better see how changes to the API contract will impact the road map. Then the same tooling and services could be used to articulate the road map to consumers, as the road map becomes approved, developed, and ultimately rolled out. With each OpenAPI JSON PATCH moving from road map to change log, keeping all stakeholders up to speed on what is happening across all API resources they depend on–documenting everything along the way.
I am going to think more about this as I evolve my open API lifecycle. How I can iterate a version of my OpenAPI definitions, evaluate the difference, and articulate each update using JSON PATCH. Since more of my API lifecycle is machine readable, I’m guessing I’m going to be able to use this approach beyond just the surface area of my API. I’m going to be able to use it to articulate the changes in my API pricing and plans, as well as licensing, terms of service, and other evolving elements of my operations. It is a concept that will take some serious simmering on the back burners of my platform, but a concept I haven’t been able to shake. So I might as well craft some stories about the approach, and see what I can move forward as I continue to define, design, and iterate on the APIs that drive my platform and API research forward.
It is tough to help developers think outside of the world they operate within. Most software is still developed and managed within silos, knowing it’s inner workings will never be seen by anyone outside of the team. This mode of operation is a rich environment for poor code quality, and teams with port communication. This is one of the reasons I’ve embraced web APIs, after running software development teams since the 1990s, I’ve been put in charge of some pretty dysfunctional teams, and some pretty unwieldy legacy codebases, so once I started working out in the open using web APIs, I did’t want to go back. Web APIs aren’t the cure for all of our technology problems, but it does begin to let some sunlight in on some messed up ways of doing things.
One common illness I still see trickling out of API operations are developers not using plain language. Speaking in acronyms, code, and other cryptic ways of articulating the resources they are exposing. I came across a set of API resources for managing a DEG the other day. You could add, updated, delete and get DEGs. You can also pull analytics, history, and other elements of a DEG. I spent about 10-15 minutes looking around their developer portal, documentation, and even Googling, but never could figure out what a DEG was. Nowhere in their documentation did they ever tell consumers what a DEG was, you just had to be in the know I guess. The API designer (if that occurred) and developer had never stopped to consider that maybe someone would stumble across their very public API and not know what a DEG was. Demonstrating how us developers have trouble thinking outside our silos, and thinking about what others will need.
There is no reason that your API paths shouldn’t be plain language, using common words. I’m not even talking about good RESTful resource design, I’m simply talking about looking at the URI for an API and being able to understand what it is because it used words we can understand. If you have trouble pausing, and stepping back, and thinking what some random 3rd party developer will interpret your API paths as, I recommend printing them out and sharing them with someone that isn’t on your team, and familiar with the resources you work with. Even if your APIs aren’t going to be public, someday you will be gone, and maybe your documentation isn’t up to date, and someone will have to reverse engineer what your API does. There is no reason your API should hide what it does, and not speak for itself, providing an intuitive, plain language description fo the value it possesses.
I look at hundreds of APIs each month. I push myself to understand what an API does in seconds, or minutes. When I spend 10-15 minutes unsuccessfully to understand what an API does, there is a problem with its design. I’m not talking about good API design, I’m just talking about coherent API design. There is no reason you should have an acronym in your API path. I don’t care how short-lived, or internal you view this API. These are often times the APIs that end up sticking around for generations, and becoming part of the technical debt future teams will have to tackle. Don’t be part of the problem in the future. Speak in plain language, and make your API paths speak for themselves. Make them speak to as wide as possible audience as you can. Make them reach outside of your developer circles, and become something any human can copy and paste, and put to work as part of their daily routine.
When you operate your application within the API ecosystem of a large platform, depending on the platform, you might have to worry about the platform operator copying, and emulating what you do. Twitter has long been accused of sharecropping within their ecosystem, and other larger platforms have come out with similar features to what you can find within their API communities. Not all providers take the ideas, it is also very common for API platforms to acquire talent, features, and applications from their ecosystems–something that Twitter has done regularly. Either way, API ecosystems are the R&D, and innovation labs for many platforms, where the latest features get proven.
As the technology playing field has consolidated across three major cloud providers, AWS, Azure, and Google, this R&D and innovation zone, has become more of a cloud kill zone for API providers. Where the cloud giants can see the traction you are getting, and decide whether or not they want to launch a competing solution behind the scenes. Investors are tuning into this new cloud kill zone, and in many cases opting not to invest in startups who operate on a cloud platform, afraid that the cloud giant will just come along and copy a service, and begin directly competing with companies operating within their own ecosystem. Making it a kill zone for API providers, who can easily be assimilated into the AWS, Azure, or Google stack, and left helpless do anything but wither on the vine, and die.
Much like other API ecosystems, AWS, Azure, and Google all have the stats on who is performing across their platforms, and they know which solutions developers are demanding. Factoring in the latest growth trends into their own road maps, and making the calculations around whether they will be investing in their own solutions, or working to partner, and eventually acquire a company operating with this new kill zone. The 1000 lb cloud gorillas carry a lot of weight in regards to whether or not they choose to partner and acquire, or just crush a startup. I’m guessing there are a lot of factors they consider along the way that will contribute to whether or not they play nicely or not. There are no rules to this game, and they really can do whatever they want with as much market share and control over the resources as they all possess. It will be interesting to begin tracking on acquisitions and partnerships across all players to better understand the score.
I wrote last year about how the API space is in the tractor beam of the cloud providers now, and it is something I think will only continue in coming years. It will be hard to deploy, scale, and operate your API without doing it on one of the cloud platforms, or multiple cloud platforms, forcing all API providers to operate within the cloud kill zone. Exposing all new ideas to share their analytics with their platform overlords, and open them up for being copied, or at least hopefully acquired. Which is something that will stunt investment in new APIs, making it harder for them to scale and grow on the business side of things. Any way you look at it, the cloud providers have the upper hand when it comes to cherry picking the best ideas and features, with AWS having a significant advantage in the game with their dominant cloud market position. It will be pretty hard to do APIs in the next decade without AWS, Azure, and Google knowing what you are doing, and having the last vote in whether you are successful or not.
I was profiling the volume of API from the Internet of Things platform Predix this last week. Luckily they have OpenAPI definitions for each of the APIs, something that makes my life a lot easier. As, they have a wealth of APIs available, doing an amazing amount of work when it comes to connecting devices to the Internet–I love their enthusiasm for putting out APIs. My only critical feedback for them after working my way through their API definitions, is they should invest some time to develop an API design guide, and distribute across their teams. The wild variances in definition and design of their APIs made me stumble a number of times while learning about what they do.
While looking through the definitions for the Predix APIs, I found many inconsistent patterns between them, and you could tell that they had different teams (or individuals) working across the suite of APIs. The inconsistencies ranged from the naming, description, and how the meta data was provided for each API, all the way to acronyms used in API paths, and other things that prevented me from understand what an API did all together. While I am stoked they provide OpenAPI definitions for all of their APIs, I still struggled to understand what was possible with many of their APIs. It kind of feels like they need an external editor to review each API definition before it leaves the door, as well as some sort of automated validation using JSON schema, that would work against a common set of API design standards.
I can tell that Predix has an extremely powerful stack of Internet of Things API resources. They have insight, predictive, and event-driven layers, and a wealth of resources for device operators to put to work. They just need another layer of API design polish on their APIs, as well as ensuring their API documentation reflects this design polish, helping bring it all home. If they did, I’m guessing they would see their adoption numbers increase. It can be tough to come into someone’s world and understand the value they bring to the table, even with simple API resources, but with something as robust and complex as what Predix is up to, even an experienced integrator like me is having trouble getting up to speed on what was possible. A little API design and documentation polish would go a long way to reduce the friction for new consumers getting up to speed.
I struggle with making sure some of my writing gets the editing love before it gets out the door. I also struggle with making sure my own API definitions and designs get the love they need before they see the light of day. As a one person show I just do not have the resources it always takes to deliver at the scope I need. So I fully understand the challenge of small startups when it comes to investing in proper API design across their operations–you just don’t always have the time to slow down and invest in a common API design guide, and the training and awareness across teams. I don’t want to shame the Predix team, as I can tell they’ve invested a lot into their APIs. I just want to make sure they understand that a little investment in API design will go a long ways in helping them better achieve their goals as an Internet of Things API provider.
About half of the teams I work with on microservices strategy are beginning to freak out about the number repositories they have, and someone is regularly bringing up the subject of having a mono repo. Which is usually a sign for me that a group is not ready for doing the hard work involved with microservices, but also shows a lack of ability to think, act, and respond to things in a distributed way. It can be a challenge to manage many different repositories, but with a decoupled awareness of the sprawl that can exist, and some adjustments and aggregation to your strategy it can be doable, even for a small team.
The most import part of sticking to multiple repositories is for the sake of the code. Keeping services decoupled in reality, not just name is extremely important. Allowing the code behind each service to have its own repository, and build a pipeline that keeps things more efficient, nimble, and fast. Each service you layer into a mono repo will be one more chunk of time needed when it comes to builds, and understanding what is going on with the codebase. I know there are a growing number of strategies for managing mono repos efficiently, but it is something that will begin to work against your overall decoupling efforts, and you are better off having a distributed strategy in place, because code is only the first place you’ll have to battle centralization, in the name of a more distributed reality.
Github, Gitlab, and Bitbucket all have an API, which makes all of your repositories accessible in a programmatic way. If you are building microservices, and working towards a distributed way of doing things, it seems like you should be using APIs to aggregate and automate your reality. It is pretty easy to setup an organization for each grouping of microservices, and setup a single master or control repository where you can aggregate information, and activity across all repository–using Github Pages (or other static implementation) as a central dashboard, and command center for managing and containing microservice sprawl. Your repository structure should reflect your wider microservices organization strategy, and all the moving parts should be allowed to operated in a distribution fashion, not just the code, but also the conversation, support, and other essential elements.
I’m spending more time learning about Kubernetes, and studying how microservices are being orchestrated. Like other aspects of the API world, I’m going to focus on not just the code, but also the other communications, support, dependencies, security, testing, and critical building blocks of delivering APIs. I feel like many folks Im talking with are getting hung up on the distributed nature of everything else, while trying to distribute and decouple their code base. Microservices are definitely not easy to do, and decoupling isn’t an automatic solution to all our problems. From what I am seeing, it is opening up more problems than it is solving in some of the organizations I am working with, and causing a lot of anxiety about the scope of what teams will have to tackle when trying to find success with microservices across their increasingly distributed organizations.
We are building up to the 9th edition of API Strategy & Practice (APIStrat) happening in Nashville, Tennessee this September 24th through 26th. As part of the build up we are looking for sponsors to help make the event happen, bringing the API community together once again to share stories from the trenches, and discuss healthy practices that are allowing companies, organizations, institutions, and government agencies make an impact when it comes to their API operations.
The 2017 edition of APIStrat in Portland, OR was a huge success, and help complete the transition of APIStrat to be part of the OpenAPI Initiative (OAI). After seven editions, and four years of operation exclusively by 3Scale and API Evangelist, the event has matured and will continue growing under the guidance of the OAI, and the community that has evolved around the OpenAPI specification. Presenting an opportunity for other API providers, and API service providers to get involved by joining as an OAI member and / or sponsoring APIStrat, and joining the conversation that has been going on in the community since early 2013.
You can download the APIStrat conference prospectus from the Linux Foundation / OAI event website, and there is a form to submit to learn more about sponsoring. You can also email [email protected] if you’d like to get plugged in. Feel free to also reach out to me as well, as I’m in charge of trying to drum up sponsors, and expand our base beyond just the OAI membership, and the companies who stepped up last year. Helping API providers and service providers understand what a community event APIStrat is, and help it differentiate from the other API, and tech-focused conferences that are happening.
I’m definitely biased, as I help start and grow the conference, but after running tech events for over a decade, it was important to me that APIStrat grow into a community event about ideas, and less about vendors and product pitches. It is a great opportunity for API providers, and API service and tooling providers to actually rub elbows with developers who are building on top of their APIs, and putting their tools and services to work. The keynotes, sessions, and workshops are always great, but the hallway conversations are always where the magic happens for me. Please step up and help make sure the event continues to grow, and help sponsor APIStrat in Nashville. If you do, I promise to cover your APIs here on the blog, and help tell the story of the impact you are making on the community leading up to the event this September in Tennessee!!
GraphQL folks keep on with the GraphQL vs REST narratives, rather than a REST and / or GraphQL narrative lately with a recent burger meme/narrative. Continuing to demonstrate their narrow view of the landscape, and revealing the short lived power of an adversarial approach to community building. I get why people do this, because they feel they are being clever, and that the click response from the echo chamber re-enforces it, but ultimately it is something that won’t move the conversation forward, but it does get them kudos within their community–which is what many of them live for.
I’ll start with the usual disclaimer. I actually like GraphQL, and prescribe it as part of my API toolbox. However, rather than a REST vs GraphQL approach, I sell it as REST and GraphQL, depending on the developer audience we are trying to reach with our efforts. Whether or not you use GraphQL on your platform is completely based upon knowing your developers, and working with a group that understands the resources before offered–something the GraphQL community continues their failure to see. Also their adversarial marketing tactics has lost me several GraphQL projects in government because it comes off as being a trend, and not something that will be around very long.
With that said, I think this meme tells a great story about GraphQL, and demonstrates the illnesses of not the technology, but the ideology and beliefs of the community. I had a couple of thoughts after seeing the Tweet, and reviewing the replies:
1) I thought it was an anti-GraphQL meme at first. Demonstrating that you can build a horrible burger with some very well known ingredients. Spoofing on the burger emoji drama that has been going on in recent years.. I mean, is the lettuce the plate in the GraphQL burger?
2) Like GraphQL, the food choices demonstrate that GraphQL works well in very controlled environments. Where there are known ingredients, and your clients/customers/developers know the ingredients, and know what they want. Hell yeah GraphQL is a better choice in this environment. The problem is you are selling it as a a better solution than REST in general. I hate to tell you, but most of the business getting done in the world IS NOT FAST FOOD.
3) The meme demonstrates the whole fast food, limited world view of many technologists who work with known ingredients, and think everyone is just like them. This tool works for me, and everyone is just like me, so what I use is cool, and everyone should use the tools and the process that I do. A common perspective out of the white bread (bun?) world of technology.
4) Let’s take this GraphQL meme and begin applying it to an Ethiopian, Greek, or French menu. Let’s take it and apply to a BBQ, catering, or maybe home cooked family gathering. Try applying it when you get a food basket from your local community supported agriculture (CSA), where you have no idea the ingredients that are coming, and you’ll have to adjust based upon the season and whatever is available to you that week. Maybe do the same for a food shelter and pantry, does everyone get it their way?
5) There are some restaurants in New York I’d love to take you to, and have you ask for it your way. I’d love to see you get yelled out of the place when you think you know more than the chef, and you always should have things your way. Really, you know more than someone who has been cooking for years, and your fast food loving, unsophisticated tastes are going to dictate what gets served? Get outta here!!
6) I love API to restaurant menu analogies. I wrote one to support the Oracle v Google copyright case, which the Google lawyer referenced in the latest round. There are many ways you can use restaurants and food to make API comparisons, and educate people about the potential of APIs. I’m sorry though, this one just wasn’t sophisticated enough to really bring home the potential of APIs, and it was more about reflecting this same unsophisticated approach of people marketing and telling stories around GraphQL.
I’ll say it again, and again, and again. I’m not anti-GraphQL. I’m against ya’ll saying it is a replacement for REST. Stop it. It’s dumb. It shows your lack of awareness of the larger API world. It shows you live in tech isolation, where you think everyone wants it your way. Most developers I know do not have a clue as to what they want. They don’t understand the existing schema being used, and need menus, and hand-crafted buffets. Sure, there are development groups who know exactly what they need, and have a full grasp on the schema and resource models being used, but this isn’t EVERYONE!! Stop it. I get GraphQL, but I’m getting tired of coming across new APIs I don’t understand at all, and being expected to just know what I want. I love GraphQL for Github because I KNOW GITHUB. I don’t love GraphQL for the OpenStates API, because I have no clue what the schema and model is for their API–please do the extra work to document your resources, and provide me intelligent, well-crafted paths to get at your valuable data.
Instead of bashing REST, how about thinking more about REST as a starter. Having feedback loops in place to get to your audience. Sure, if it is all internal development, in service of a known group of React developers, go for it–use GraphQL! However, if it is a public API, start with REST, establish feedback loops, and get to know your audience. If enough developers are requesting a query language (GraphQL isn’t the only show in town), and it makes sense in your roadmap, then offer GraphQL alongside REST, but not instead of REST. GraphQL works in a known known, and sometimes a known unknown environment, but not in an unknown unknown environment. The community needs to wake up and realize this. Stop selling it as a replacement to REST, and realize it is just another tool in the API toolbox. Y’all are just hurting your cause, and running some people off with this regular REST v GraphQL storytelling. In the end, you are just showing your lack of knowledge and respect for the web–just like Facebook does.
P.S. Anyone who has their feelings hurt by this post, needs to get out more. Maybe change jobs, move to a new city and industry. You need to see and experience more than you have currently.
It has been a fascinating and eye opening experience sitting at the intersection of tech startups and the web, mobile, and device applications they’ve built over the last decade. In 2010 I was captivated by the power of APIs to deliver resources to developers, and end-users. In 2018, I’m captivated by the power of APIs to mine end-users like they are just a resource, with the assistance of the developer class. A dominant white male class of people who are more than willing to look the other way when exploitation occurs, and make for the perfect “tools” to be exploited by the wealthy investor class.
While I do not have much hope for diversity efforts in tech, or the bro culture waking up, I do have hope for city and state/provincial lawmakers to wake up to the exploitation that is going on. I’ve seen hints of cities waking up to the mining that has been occurring by Facebook and Google over the last decade. The open exploitation and monetization of a city’s and state’s most precious resources–their constituents. While some cities are still swooning over having Amazon set up shop, or Facebook to build a data center, these company’s web, mobile, and device applications have infiltrated their districts been probing, mining, extracting, and shipping value back to offshore corporate headquarters.
You can see this occurring with Google Maps, which has long been a darling of the API community. We were all amazed at the power of this new mapping resource, something us developers could never have built on our own. We all integrated it into our websites, and embedded it into our mobile applications. We could use it to navigate and find where we were going, completely unaware of the power of the application to mine data from our local transit authorities, businesses, as well as track the location of all of us at each moment. Google Maps was the perfect trojan horse to invade our local communities, extract value, only leaving us with a handful of widgets and embeddable apps to keep us hooked, and working for the Google machine–always giving as little back as possible.
Facebook is probably the highest profile example, connecting our families and communities, while it also disrupted our local news, and information channels, as well as take control over our elections. While connecting us all at the local level, we failed to see we were being connected to the Facebook corporate machine, reminiscence of the Matrix movie of the 1990s. Now we are just mindlessly scrolling, clicking, and emotionally responding, where we are simultaneously being mined, tracked, influenced, nudged, and directed. Something that was once done out in the open for many years through a public API program, but is slowly being closed up and done privately behind closed doors, so that a new regulatory show can be performed to demonstrate that Facebook really cares.
I’m spending more time in Europe, having conversations with regulators and business leaders about a more sensible future driven by APIs. Having conversations with city leaders about the value of their data, content, and algorithms. Discussing the value of their constituents personal data, privacy, and security. Talking about the imperialist nature of Facebook, Google, Twitter, Amazon, and Microsoft, and how they invade, conquer, then extract value from our communities. Helping mayors, governors, and other lawmakers realize the value they have before it is gone, and helping them realize that they can take control over their digital resources using APIs, and gain an upper hand in the conversations that are already occurring across the web.
API versioning is almost always one of the top attended discussions at conferences I help organize, and one of the first questions I get in the QA sessions at workshops I conduct. People want to understand the “right way” to version, when it my experience there is rarely ever a “right way” to version your APIs. There are commonly held practices regarding sensible ways to version your APIs, as well as dominant patterns for how you version APIs, but there isn’t any 100% solid answer to the question, despite what many folks might say.
In my experience, the most commonly held approach to properly versioning your APIs (if you are going to), is to put the major and minor version in your header and / or combine it with content-type negotiation via your header. However, even with this knowledge being widely held, the most dominant pattern for versioning your APIs is sticking it in the URL of your API. I know many API providers who put the version in the header, despite many on their team fully being aware that it is something that should be put in the header. So, why is this? Why do people still do it the “wrong way”, even though then know how to do it the “right way”?
I feel like this phenomenon reflects the wider API space, and how upside down many API belief systems are. People put the version in the URL because it is easier for them, and it is easier for their developers to understand. While headers are a native aspect of developing using the web, they are still very foreign and unknown to most developers. While this shows the lack of web literacy that is rampant amongst developers, it also demonstrates why simple web APIs have dominated the landscape–they are easy for a wide segment of developers to understand. An aspect of why this whole API thing has worked that many technologists overlook, and take for granted as they try to push the next trend or solution on the sector.
While conducting workshops, I always teach the more sensible patterns around versioning, but I can’t always sell them as the “right way”. Because I don’t see a “right way”. I see people trying to get a job done, and reach a wide audience. I see people trying to keep things simple for their developers, and taking the path of least resistance. I see a whole lot of web literacy education that needs to occur across the tech sectors and in school. I just don’t see any perfect answer to the API versioning debate. I see a whole lot of interesting and useful patterns, and I see people doing the best they can with what they have. Reflecting why APIs work so well, because they are scrappy, often times simple, and allow people to get business done on the web using low cost, easy to understand approaches to making resources available.
People love to hate in the API space. Ok, I guess its not exclusive to the API space, but it is a significant aspect of the community. I receive a regular amount of people hating on my work, for no reason at all. I also see people doing it to others in the API space on a regular basis. It always makes me sad to see, and have always worked to try to be as nice as I can to counteract the male negativity and competitive tone that often exists. While I feel bad for the people on the receiving end of all of this, I often times feel bad for the people on the giving end of things, as they are often not the most informed and up to speed folks, who seem to enjoy opening their mouth before they understand what is happening.
One thing I notice regularly, is that these same people like to bash on is OpenAPI (fka Swagger). I regularly see people (still) say how bad of an idea it is, and how it has done nothing for the API space. One common thread I see with these folks, which prevents me from saying anything to them, is that it is clear they really don’t have an informed view of what OpenAPI is. Most people spend a few minutes looking it, maybe read a few blog posts, and then establish their opinions about what it is, or what it isn’t. I regularly find people who are using it as part of their work, and don’t actually understand the scope of the specification and tooling, so when someone is being vocal about it and doesn’t use actually it, it is usually pretty clear pretty quickly how uninformed they are about the specification, tooling, and scope of the community.
I’ve been tracking on it since 2011, and I still have trouble finding OpenAPI specifications, and grasping all of the ways it is being used. When you are a sideline pundit, you are most likely seeing about 1-2% of what OpenAPI does–I am a full time pundit in the game and I see about 60%. The first sign that someone isn’t up to speed is they still call it Swagger. The second sign is they often refer to it as documentation. Thirdly, they often refer to code generation with Swagger as a failure. All three of these views date someone’s understanding to about a 2013 level. If someone is forming assumptions, opinions, and making business decisions about OpenAPI, and being public about it, I’d hate to see what the rest of their technology views look like. In the end, I just don’t even feel like picking on them, challenging them on their assumptions, because their regular world is probably already kicking their ass on a regular basis–no assistance is needed.
I do not feel OpenAPI is the magical solution to fix all the challenges the API space, but it does help reduce friction at almost every stop along the API lifecycle. In my experience, 98% of the people who are hating on it do not have a clue what OpenAPI is, or what it does. I used to challenge folks, and try to educate them. Over the years I’ve converted a lot of folks from skeptics to believers, but in 2018, I think I’m done. If someone is openly criticizing it, I’m guessing it is more about their relationship to tech, and their lack of awareness of delivering APIs at scale, and they probably exist in a pretty entrenched position because of their existing view of the landscape–they don’t need me piling on. However, if people aren’t aware of the landscape, and ask questions about how OpenAPI works, I’m always more than happy to help open their eyes to how the API definition is serving almost every stop along the API lifecycle from design to deprecation, and everything in between.
When it comes to providing data, content, and even ML and AI models via APIs, having a public platform will become a competitive advantage. I know that many companies see it as giving away something, especially when your resources and business model are not defensible, but in reality having a publicly available, 24/7 operational, self-service solution will give you an edge over your more proprietary approaches to making resources available on the web. Sure, your competition will be able to often get in there without friction, but so will your customers–how many customers vs. competitors do you have?
I know many companies believe in the power of a sales team to be able to squeeze every last penny out of would be customers, but a sales only approach leaves a significant amount of self-service revenue on the table. Throughout the course of our busy days, many IT decision makers just do not have the time for the phone calls and lunches involved with the traditional sales process. Sure, there are some IT decision makers who fill their schedule with these types of conversations, but there are a growing number who depend on self-service, SaaS approaches to getting business done on a daily basis–look at the growth of Amazon Web Services over the last decade if you need a reference point.
If you think a public API platform involves giving away your intellectual property in 2018, you are severely behind the times on where the sector has been headed for about a decade. Far enough behind that you may not be able to play catch up at the speed in which things are shifting. A public portal, documentation, and other resources does not mean you are giving anything away. Even having a free tier doesn’t mean that you are giving away the farm. Modern API management solutions allow you to generate leads, let developers kick the tires, while also still being able to charge what the market will bear for your data, content, and algorithms. You can also still have a sales force that will swoop in on leads, and close the deals when it makes sense.
Even with a self-service API, and robust documentation, code samples and SDKs, API providers still have to work hard to reduce friction when on boarding–providing OpenAPI definitions, Postman collections, connectors, plugins, and platform specific development kits to make integration quick and painless. If you don’t even have a public self-service presence you are just getting in the way of integration, and a growing number of your customers will just choose to go with your competitors who have opted to get out their way. The companies who don’t have self-service in their DNA won’t be able to compete in the new landscape, making it essential to be able to do business out in the open, in a self-service way, essential to staying competitive in the new API-driven landscape.
My friend James Higginbotham (@launchany) was sharing his frustration with being able to stay in tune with changes to a variety of APIs. Like me, James works to stay in tune with a variety of signals available via platforms like Twitter, Github, and other commonly used services. These platforms don’t always properly signal when things are updated, changed, or advanced, making it difficult to understand the granular changes that occur like likes, votes, edits, and other common events that occur via highly active platforms.
This challenge is why the evolution towards a more event-driven approach to operating an API platform is not just more efficient, it gives users what they need. Using event-driven architectural approaches like Webhooks, and real times streams. This is one of the reasons I’m interested in what Streamdata.io does, beyond them helping support me financially, is that they allow me to focus on the event-driven shift that is occurring with many leading API providers, and needs to be brought to the attention of other platforms. Helping API providers be more efficient in what they are doing, while also meeting the needs of the most demanding customers like James and myself.
It is easy to think Streamdata.io is just about streaming real time data. This is definitely a large aspect of what the SaaS solution does, but the approach to using Server-Sent Events (SSE), with incremental updates using JSON Patch adds another useful dimension when it comes to understanding what has changed. You can proxy an existing HTTP API that returns a JSON response using Streamdata.io, and the first response will look just like any other, but every pushed response after that will be a JSON Patch of just what has changed. Doing the heavy lifting of figuring out what has changed in each API response and only sending you the difference, and allowing you to focus only on what has changed, and not having to rely on timestamps, and other signals within the JSON response to understand what the difference is from the previous API response.
Using Streamdata.io you don’t have to keep polling an API asking if things have changed, you just proxy the API and you get pushed changes via an HTTP stream. You also don’t have to sort through each response and try to understand what changed, you just take the JSON Patch response, and it tells you what has changed. I’m going to create a draft blueprint for James of how to do this, that he can use across a variety of APIs to establish multiple API connections using long running, server-side API streams for a variety of topics. Allowing him to monitor many different APIs, and stay in tune with what changes as efficiently as possible. Once I craft a generic blueprint, I’m going to apply to Twitter and see if I can increase the efficiency of my Twitter monitoring, by turning their REST APIs into real time feeds using Streamdata.io.
One thing I see a lot from API service providers who are selling their services to the API sector, is that once they find success servicing one stop along the API lifecycle, they often want to service other additional stops. I don’t have a problem with API service providers delivering across multiple stops along the API lifecycle, however I do caution of trying to expand across too many stops, and potentially doing any of them poorly, rather than partnering with other more specialized API service providers to help you focus on what you do best.
I’m a big advocate for encouraging API providers to service one to five stops along the API lifecycle well, and then partner for helping deliver the rest of the stops. I know that all your investors are encouraging to take as many pieces of the puzzle as you possibly can, but there is more money in doing a handful of things really well, over doing many things poorly. Try to be an expert in a handful of specialized areas, over being a generalist. Then make sure your platform is as interoperable as possible, while investing in your partner program to attract the best of breed API service providers to your platform.
This balance between focusing on a handful of stops and partnering is why I emphasize and study common approaches to delivering plugins. All platforms should invest in plugin infrastructure, to allow for extending their reach beyond the stops that a platform services. Feature creep, and platform bloat is a real challenge, especially when you have investors whispering in your ear to keep building, and a very vocal, but often long-tail group of users demanding solutions to their unique problems. Plugin and connector architecture is how you help manage this reality, and provide a relief valve for delivering too many features as part of your platform, while also bringing in potential partners who can help extend what your platforms in a way that allows you to keep doing what you do best.
I see a big push going on from many legacy API service providers, as well as some of the next generation of startups bringing services and tooling to the space. I feel like many people desire a single solution to do everything, but then fail to realize that every platform that has attempted this in the past ends up failing, because you can’t be everything to everyone. I want my API service providers to stick to doing a handful of things well, but then acknowledge that I will also be using several other tools to what get I need accomplished on a daily basis. Ideally all of my tools are interoperable with import and export capabilities, as well as a suite of API driven connectors, and plugins that allow me to keep all of my services and tooling working together in concert. For this reality to occur we all have to resist the temptation to lock our customers in, and put down delusions that we can serve all stops along a modern API lifecycle all by ourselves.
I’m fascinated by the dominating power of programming languages. There are many ideological forces at play in the technology sector, but the dogma that exists within each programming language community continues to amaze me. The potential absence of programming language dogma within the world of APIs is one of the reasons I feel it has been successful, but alas, other forms of dogma tends to creep in around specific API approaches and philosophies, making API evangelism and adoption always a challenge.
The absence of programming languages in the API design, management, and testing discussion is why they have been so successful. People in these disciplines have ben focused on the language agnostic aspects of just doing business with APIs. It is also one of the reasons the API deployment conversation still is still so fragmented, with so many ways of getting things done. When it comes to API deployment, everyone likes to bring their programming language beliefs to the table, and let it affect how we actually deliver this API, and in my opinion, why API gateways have the potential to make a comeback, and even excel when it comes to playing the role of API intermediary, proxy, and gateway.
Programming language dogma is why many groups have so much trouble going API first. They see APIs as code, and have trouble transcending the constraints of their development environment. I’ve seen many web or HTTP APIs called Java API, Python APIs, or reflect a specific language style. It is hard for developers to transcend their primary programming language, and learn multiple languages, or think in a language agnostic way. It is not easy for us to think out of our boxes, and consider external views, and empathize with people who operate within other programming or platform dimensions. It is just easier to see the world through our lenses, making the world of APIs either illogical, or something we need to bend to our way of doing things.
I’m in the process of evolving from my PHP and Node.js realm to a Go reality. I’m not abandoning the PHP world because many of my government and institutional clients still operate in this world, and I’m using Node.js for a lot of serverless API stuff I’m doing. However I can’t ignore the Go buzz I keep coming stumbling upon. I also feel like it is time for a paradigm shift, forcing me out of my comfort zone and push me to think in a new language. This is something I like to do every five years, shifting my reality, keeping me on my toes, and forcing me to not get too comfortable. I find that this keeps me humble and thinking across programming languages, which is something that helps me realize the power of APIs, and how they transcend programming languages, and make data, content, algorithms, and other resources more accessible via the web.
I spend a lot of time talking about the SalesForce API, using it as a reference for where the API evolution began 18 years ago, but it has been a long time since I’ve actually worked with the SalesForce API. Getting up and running with any API, especially iconic APIs that we all should be familiar with, is always an enlightening experience for me. Going from zero to understanding what is going on and actually achieving the API call(s) you want, is really what this game is all about.
As part of some work I’m doing with Streamdata.io I needed to be able to add new leads into SalesForce, and I thought it would be a good time for me to get back into the saddle with the SalesForce REST API–so I volunteered to tackle the integration. The SalesForce API wasn’t as easy to get up and running as many simpler APIs I onboard with is, as the API docs isn’t as modern as I’d expect, and what you need is buried behind multiple clicks. Once you find what you are looking for, and click numerous times, you begin to get a feel for what is going on, and the object model in use becomes a little more accessible.
In addition to finding what you need with the SalesForce REST API, you have to make sure you have a handle on the object structure and nuance of SalesForce itself. For this story, I am just working with one object–Leads. I’m using PHP to work with the API, and to begin I wanted to be able to get leads, to be able to see which leads I currently have in the system:
I will add pagination, and other elements in the future. For now, I just wanted to be able to get the latest leads I have in the system to help with with some checks on what is being added. Now that I can check to see what leads are in the system, I wanted to be able to add a lead, with the following script:
I am only displaying some of the default fields available for this example, and you can add other custom fields based upon which values you wish to add. Once I have added my lead, I wanted to be able to update with a PATCH API call:
Now I am able to add, update, and get any leads I’m working with via the SalesForce API. The project gave me a good refresher for what is possible with the SalesForce API. The API is extremely powerful, and something I want to be up to speed on so that I can intelligently respond to questions I get. I wish the SalesForce API team would spend some time modernizing their API portal and documentation, providing a more coherent separation between the different flavors of their API, and provide OpenAPI driven documentation, as well as Postman Collections. It would have saved me hours of working through their API docs, and playing around with different API calls in Postman before I was able to successfully OAuth, and make my first call against the accounts and leads API endpoints.
While I think SalesForce remains a worthwhile API to showcase when I talk about the history of APIs, and the power of providing web APIs, their overall documentation and approach is beginning to fall behind the times. SalesForce possesses many of the building blocks I recommend other API providers operate, and are very advanced in some of their support and training efforts, but their documentation, which is the biggest pain point for developers, leaves a lot to be desired. I’m used to having to jump through hurdles to get up and running APIs, so the friction for me was probably less than a newer API developer would experience. I could see some of the domain instance url, versioning, and available API paths proving to be a significant hurdle if you didn’t understand what was going on. Something that could be significantly minimized with some simpler, more modern API docs, and OpenAPI and Postman Collections available.
This report summarizes Skylight’s evaluation of the VA’s public datasets, which exist within the va.gov web domain, as well as an analysis of what types of data representatives of the Veteran community expressed would be most useful and valuable to Veterans and their supporters if made more digitally accessible and available by the VA. This report also outlines potential resources that can be turned into application programming interfaces (APIs) as part of the VA’s Lighthouse platform initiative, and actions the VA should consider to move forward successfully.
APIs are the next evolution in the web, and shouldn’t be thought of as the latest tech trend or vendor solution. The first phase of the web was about delivering data and content to humans using a browser. The second phase of the web is about delivering that same data and content to other applications and algorithms using APIs.
With that said, the purpose of this landscape analysis is, in effect, to assist the VA in evaluating the data and content that they’ve made available during the first phase of the web. This, in turn, will help set the stage for the VA to make smart investments in phase two of their web presence.
The VA has already signaled they’re committed to investing in the second phase of their web presence with the announcement of the Lighthouse API platform initiative. Our landscape analysis will help ensure that the Lighthouse program is aware of what types of data and content that the VA has already identified as important to serving the Veteran community. This visibility will allow the Lighthouse program to bring these resources into alignment with the development and operation of their API platform.
To help the VA evaluate the landscape that defines their web presence, we employed a “low-hanging-fruit” process that involved identifying the resources that exist across their web properties. That process relied on a spidering script, which we ran for two weeks (and continue to run). To begin the process, we seeded the script by giving it the root URL for the va.gov domain. The script then proceeded to:
Parse every URL on the page and store it in a database;
Count every table on the page, and the number of rows that exist in the table;
Count every form that exists on the page; and
Extract the title from the meta tags for each page.
The script then iterated and repeated this for every URL it found on any web page, working to identify each of the following types of data resources:
HTML table with more than 10 rows
The script ignored any URLs external to the seed domain (va.gov) and many common web objects (for example, images, Word docs, and videos).
As each page was processed, the script tried to identify potential data resources to deliver as an API by parsing several elements from them:
The title of the page a file was published on,
The name of the file itself, and
Occasionally a sample of the data.
We took the list of words extracted from this process, and sorted and grouped them by the number of times the word appeared, helping us understand the overall presence of each potential resource. Sometimes this produced a lot of meaningless words, but we worked to filter those out, leaving only the meaningful data resources.
After running the script for a couple of weeks, we spidered nearly 1/3 of the URLs (out a total of 4M+) targeted for processing. That was enough progress to start painting an interesting picture of the VA’s web presence and existing data resources, as described in the sections that follow.
VA’s web presence
The VA has a sprawling web presence, spanning multiple domains and subdomains. We focused our analysis on everything within the va.gov domain. In the future, we can extend the analysis to other domains, but for now we focused on the core VA web presence.
The VA’s web presence is spread across a mix of domain levels, including top-level domain and subdomain levels — program, state/region, and city. Domains play a role in providing addresses in the browser so that users can find the resources they need, as well as providing similar addressing for applications to find the resources they need via APIs. The VA domain sprawl reflects the growth of the VA’s web presence, and the lack of overall strategy when it comes to providing web address and routing to all VA resources. The current strategy (or lack thereof) represents the need for location- and program-related resource discovery, whether it’s in the browser, or for other applications via APIs.
We assume there are other domains that haven’t been indexed by our spider, as we were only able to index less than 1/3 of the targeted URLs during our two-week timebox. We can continue indexing and updating numbers beyond this period in order to paint an even more complete picture. Ideally, this would extend beyond the core va.gov domain. Domain and subdomains play an important role in determining how APIs will be accessed, and have a downstream impact on the overall API design, affecting both API path and parameter design. This makes domain and subdomains a top-level consideration early on in the VA’s API journey.
After the top-level domains of va.gov and www.va.gov, the most common approach to defining domains is by program, providing the addressing needed for organizing information by relevant programs. We have identified 133 individual program-related domains.
While there aren’t consistent naming conventions used in crafting these subdomains, it does demonstrate the prominence of programs, research, and other related groupings used across the VA web presence for organizing resources.
Beyond program-related domains, state/region level domains are being used to organize data and content for presentation to consumers. Only 22 subdomains are represented currently, but the practice demonstrates the prominence of these locations when it comes to organizing information.
Some states are just paths within the top-level VA domains, while others exist within regional subdomains, with the rest possessing their own subdomain. This demonstrates the importance of states and regions, but also the inconsistency of how domains or paths are used to organize information.
Lastly, you find many city-related subdomains being used to organize data and content, providing another dimension on how resources are being organized, while demonstrating the dominance of specific cities. We have identified 120 individual city-related domains.
Like states, there isn’t a consistent pattern in which cities have their own subdomain, with others existing as a path within state subdomains or top-level domains. The approach to using cities as part of subdomain DNS addressing further demonstrates the importance of location when it comes to the organization of data and content.
As part of the spidering the va.gov domain across the 278 subdomains that exist, over 4M individual URLs were identified, with slightly less than 1/3 of these URLs evaluated for potential data sources to-date. Across these URLs, we took the base path and grouped them by the number of pages and data files that exist.
While there are many other paths in use across the VA websites, these paths reflect the top paths in use to deliver data and content. Providing a look at what the most relevant resources are when it comes to providing web access to data and content, which is something that should be considered when delivering the same data and content to other applications.
After assessing the titles of HTML pages and the names of files, it’s clear there’s no consistent vocabulary in use across VA resources. This, combined with the use of key phrases, acronyms, and singular and plural variances, make it difficult to cleanly identify resources. We opted to use just keywords over phrases and to not expand acronyms as part of the process due to the difficulty in consistently identifying resources.
Even with the difficulties in identifying some resources, we were still able to paint a fairly compelling picture of the resources being exposed as common data formats across VA web properties. That’s because we were able to isolate, group, and identify words that are most commonly associated with resources. From there, we were able to establish some resource lists, which we have organized visually as tag clouds and tag lists.
Data is available across VA websites in a variety of formats. We focused on a handful of easy to identify formats, reflecting the low-hanging-fruit aspect of this landscape analysis. While there’s data locked up in simple text files and zipped packages, we chose to look for the easiest to identify and the easiest to publish data sources. Data that’s published by humans usually take the form of CSV files, spreadsheets, and HTML tables. Data that’s published by systems usually take the form of JSON and XML.
Each of type of format that we targeted provides a different story as to the type of resources being published. Publication implies that those resources carry some level of value and importance to VA stakeholders, and, potentially, to Veterans, their supporters, and other consumers of this information. We worked to harvest all the data available from several formats, but also worked to identify the top resources available from each type.
We discovered 534 CSV files containing a variety of data. By parsing the titles of the web pages these CSV files were linked from, and the names of some of the files, we identified handful of top resource types present across these files.
CSV files tell a particular story because they were most likely published by people working at the VA, who exported the files from spreadsheets and made them available on the website for a reason. This makes them relevant to the VA’s API conversation. You can view a list of CSV resources, as well as a complete list of CSV files on the GitHub repository.
We identified 6,077 spreadsheets containing a variety of data. After parsing these files for semantic meaning, we identified handful of top resource types present across these files.
Similar to CSV files, the presence of spreadsheets tell a very human story. Spreadsheets are the #1 source of data on the web, and reflects the data management and publishing practices across the VA. After evaluating what types of resources are available across these spreadsheets, we have been considering the use of spreadsheets as a data source, as well as a data publishing tool. You can view a list of XLS/XLSX resources, as well as a complete list of XLS/XLSX files on the GitHub repository.
We identified 467 JSON files containing a variety of data. Unlike the CSV and spreadsheet data sources, JSON files likely represent a more modern systems approach to publishing data and a whole another set of data sources, which should be considered when deploying APIs.
JSON reflects the latest evolution of data publishing at the VA. But they are only a small subset of the data being made available across VA web properties. This implies they have only become a recent priority when it comes to publishing data in a format that is consumable by developers and computers. You can view a list of JSON resources, as well as a complete list of JSON files on the GitHub repository.
We found 3,099 XML files containing a variety of data. Like JSON files, XML files represent system-generated publication of data. Unlike JSON, however, XML reflects an older systems approach to data publication. And are likely being generated by legacy systems that’ll be important to interface with over the course of the VA’s API journey.
XML represents a large portion of the data being published across VA web properties. This list of priority resources represents a significant part of the system-based publishing of data occurring at the VA. And provides a large snapshot of the systems that should be evolved as part of the deployment of APIs. You can view a list of XML resources, as well as a complete list of XML files on the GitHub repository.
We identified 8,393 pages that had tables on them with over 10 rows. These tables represent potentially valuable data and should be considered as part of the VA’s API deployment conversations.
While HTML tables tell a story about top resources that VA stakeholders thought website users needed access to, these tables also represent data that was published with potential search engine optimization (SEO) in mind. In other words, someone wanted the data to be indexed by search engines in order to make it more readily accessible. You can view a list of table resources, as well as a complete list of pages containing tables on the GitHub repository.
We identified 9,439 pages with more than one form present, which is usually just a basic search. Similar to HTML tables, these forms provide a window into how the VA is making data available for users to search, explore, and consume in the browser. This, in turn, tells another story of what types of resources are published to VA websites.
HTML forms often times provide a search mechanism for other table, CSV, JSON, XML, and spreadsheet resources, many of which are listed in the sections above. HTML forms tell their own story as to how and why data are being published across VA websites. And offer another source of resources that are being made available and should be considered as part of the VA’s API deployment efforts.
The only external source of data that we analyzed was data.gov, which hosts a number of VA data resources. While somewhat out-of-date, the VA datasets on data.gov tell an important part of the story that should be considered as part of the Lighthouse efforts. There are a lot of lessons to be learned from how data.gov has been used, beyond just understanding what resources have been published there.
The resources published to data.gov reflect the VA’s recent past when it comes to making data resources available and accessible via manual downloads and APIs. We think that the most important lesson that the VA should take away from its experience with data.gov is that the VA should own all the data and API resources and syndicate them as part of other external efforts. That way the VA owns the full scope of the effort, which will ultimately result in the VA being more invested in API operations. You can view a list of data.gov resources, as well as a complete list of data files on the GitHub.
Humanizing the data
To give us a more human perspective on what types of data resources are most valuable to Veterans and their supporters, we facilitated a series of online workshops using Mural. About 50 people total participated across all three workshops, with about 40% reporting as Veterans and 60% non-Veterans. During these workshops, we employed the KJ technique for establishing group priorities. The KJ technique relies on a focus question to drive the results of the workshop. We used the following focus question:
“What types of data, content, and other resources would be most useful to Veterans and their supporters if the VA could make them more available and accessible on the web, mobile devices, and other platforms?”
The following images capture the results of each workshop:
The yellow cards represent all the ideas, in response to the focus question, that everyone brainstormed. As you can see, these yellow cards were organized into like groups. The blue cards represent descriptive labels that participants gave to each group. The black circles with numbers represent the votes that the participants casted when asked which group labels they thought best answered the focus question. We weighted Veteran votes 2x more heavily than votes from non-Veterans.
You may notice things that seem out of place in the final results (for example, yellow cards that look like they belong to another category). This is largely due to the timeboxed nature of the activities. In other words, not everything could be made perfect, but that doesn’t detract from the overall usefulness of the results.
Given the fact that the results were spread across three different workshop sessions, we took the additional step of normalizing the groupings and merging the votes.
Directory of Services/Resources – 34 votes
Mental Health – 21 votes
Personal Healthcare Data – 20 votes
Personalized Self-Service Portal – 20 votes
Benefits – 13 votes
Peer Support Networking – 8 votes
Family Support Networking – 9 votes
Real-Time Status – 8 votes
Patient Experience Data – 7 votes
Military-to-Civilian Transition – 7 votes
Ratings and Calculators – 6 votes
Appointments – 6 votes
Medical Healthcare – 5 votes
Housing Assistance – 5 votes
Public Accountability and Awareness – 4 votes
Service History Data – 4 votes
Veteran Status Verification – 0 votes
Statistical Analysis and Machine Learning – 0 votes
Metadata Support – 0 votes
Documentation – 0 votes
It’s entirely possible that these groupings could be further normalized, or even some of the ideas within the original groups split out into separate groups. Some groupings could even be disregarded as irrelevant (for example, Metadata Support). However, we didn’t want to dilute the results of what the participants came up with. Somewhat surprising is the low number of votes for Medical Health. That may be a result of lacking the right type of participant representation in the workshops, at least for that particular category.
Summary of the landscape analysis
The landscape analysis, which only processed about 1/3 of the URLs targeted for spidering over the course of a two-week period, revealed about 20,216 data files. This work produced a lot of data to wrangle and make sense of. Each of the data formats made available tell their own story about what types of data has been published to VA websites and why. The number of times a resource has been published using a particular data format (CSV, XML, JSON, XLS/XLSX, etc.) serves as a vote for making that resource available and accessible on the web.
Despite the huge amount of information to work with, we believe that our analysis provides valuable insight into some of the most relevant data resources, based on years of publication to VA websites. The top resources identified from all of the URLs, file formats, tables, and forms all point to data resources that should be considered turning into APIs. If these data resources were considered a priority when publishing to the VA’s websites, then there’s a good chance that they should be considered priorities when it comes to publishing via APIs as part of the Lighthouse initiative.
Lighthouse program considerations moving forward
Resources to prioritize
After spending time with all of the data uncovered during the landscape analysis, we began to see patterns emerge from across all the resources being published to the VA’s web properties, as well as those resources identified by people during our facilitated workshops. So based on analysis of the available data, we recommend that the VA Lighthouse program give consideration to prioritizing the following 25 resources:
Healthcare Facilities – Up-to-date information on hospitals, clinics, and other healthcare facilities.
Organizations – Details on any type of organization that services Veterans and their families.
Services – Services being offered by the VA, healthcare facilities, and other organizations.
Programs – Programs being offered by the VA, healthcare facilities, and other organizations.
Resources – Content, video, and other resources providing healthcare, outpatient, and other relevant content.
Schedules – The schedules of healthcare facilities, organizations, services, and programs being offered.
Events – Calendar and details of relevant events that service Veterans around the country.
Benefits – Details of the benefits being offered to Veterans, including elements of the process involved.
Performance – Performance details for the healthcare facilities, organizations, services, and programs.
Insurance – Home, auto, and healthcare insurance information that Veterans can take advantage of.
Loans – Information on home, auto, and other types of loans available to Veterans and their families.
Grants – Grants for education, businesses, projects, and other Veteran-focused efforts.
Education – Educational opportunities and information available to Veterans and their families.
Training – Specific training opportunities available that Veterans can take advantage of.
Jobs – Job postings that Veterans can apply to and use to guide their career.
Human Resources – VA human resources and related information in support of VA employees and Veterans.
Forms – Directory, access, and management of forms and the data that’s stored within them.
Budgets – Budget information on healthcare facilities, organizations, programs, and services.
Statistics – Statistics and data on all aspects of VA operations, and anything that impacts Veterans.
Cemeteries – Details of the cemeteries, and the Veterans who are laid to rest at all locations.
News – News that impacts Veterans from across any source and is relevant to the community.
Press – Press releases from the VA and related organizations and programs.
Research – Information and other resources produced as part of specific Veteran-related research.
Surveys – Centralized organization, access to, and the results of Veteran and program-related surveys.
FOIA – Process and information related to Freedom of Information Act (FOIA) efforts occurring at VA.
These resources represent what was harvested and analyzed as part of our landscape analysis, merging many of the patterns present across individual datasets. They’re organized using a REST-centric approach to turning data into API resources, which allows for data access via HTTP. Many of the keywords identified as part of the landscape analysis have been rolled-up into higher level areas — such as PTSD, mental health, and suicide — would exist across services, programs, and resources.
These suggested resources are derived from about 65% of the top-level resources identified across all the top paths, file formats, tables, and forms. They represent a nice cross-section of resources across all the data formats, but also reflect the general web presence of the VA. Our list also provides a coherent stack of resources that could be developed, deployed, and maintained in support of the central veteran APIs, offering personalized and generalized data experiences that would benefit Veterans and their families.
Centralize focus on the Veteran
From a data perspective, the most important resource above all is the Veteran and their personal data. Therefore, the identity and healthcare record of a Veteran should be at the front and center of any API deployed as part of the Lighthouse API platform initiative. This requires full knowledge and accurate information about a Veteran. In others words, in order for the Lighthouse’s APIs to work well, there must be a robust identity and access management in place, as well as detailed, layered, portable, and usable Veteran profiles.
One thing that became evident during our work is the need for greater personalization of data across almost every resource that we identified. Where there’s value in having general information available (for example, medical facilities), this data becomes exponentially more valuable when it’s personalized, localized, and made more relevant to the Veteran who is browsing, searching, engaging. Therefore, there are two types of engagement models with the resources that we propose: (1) general access without knowledge of the Veteran and (2) personalized knowledge of the Veteran via custom configuration settings that determine the relevancy of the data and content when these are made available via APIs and within applications.
There are existing portal efforts such as vets.gov that are available as part of the VA’s online presence. The personalization efforts occurring there should be reflected across the design and operation of the Lighthouse’s APIs. By designing APIs to operate in generalized or personalized mode, this would empower API developers to act on behalf of a Veteran using OAuth tokens. If a token’s present, each API will act in a more personalized manner and allow for localization based upon a Veteran’s preferences and history of interactions. This personalization layer should act as a bridge between the core healthcare record of a Veteran and the other resources that we outlined above.
Writing, not just reading data
Many of the resources that we identified represent read-only access to data and content. It’s important to note that getting access to data and content is useful; however, a significant portion of the resources that we harvested and gathered through conversations with the community will require the ability to write information via APIs. Forms, surveys, and other feedback loops will need to allow for APIs that not only GET data, but also POST and PUT data as part of their operations. This additional operations will round-off the Lighthouse’s stack of resources, helping to ensure that services provide a two-way street for engaging with the VA community.
In addition to reading, the ability to write data and content will be a deciding factor in whether applications built on top of the APIs will deliver meaningful value to Veterans and their supporters. If information is only being pushed outwards, many applications will be seen as having little to no value to users, developers, and operators of the Lighthouse API platform. To help ensure meaningful value is delivered to everyone involved, all applications should be capable of sharing usage data and feed analytics and support feedback loops between users and the platform operators. With the ability to write data, the APIs will lack meaning and substance, and will contribute to lack of adoption and integration.
Focus on the source of the data
A common misconception in conducting a landscape analysis such as the one we performed is to assume that the data discovered can be published via any APIs that are deployed as part of the next phase of work. That’s rarely the situation, because most of the discovered data is just published snapshots derived from existing data sources. This is certainly the case with the VA. Much of the data we discovered is unusable in its current state due to lack of normalization, duplication, being out of date, and other noise and clutter. Many of the XML and JSON files identified provider a much cleaner option for transforming into web APIs. However, with any resource identified, it’s more desirable to integrate the original source of data than relying on published snapshots.
Even after coming to a consensus on the data resources to transform into APIs, the next phase of work should focus on identifying the data sources for each of the targeted resource areas, and not relying on published data that already exist across websites. While it’s tempting, and sometimes necessary to rely on published data for the source of API data, it increases the chance that an API will eventually become dormant, out-of-date, and cause many of the issues that we’ve seen play out with the existing VA datasets. Our landscape analysis came at the resource prioritization from an external, public perspective. We recommend a subsequent, more internal landscape analysis to identify the data sources for important resource types emerging from this landscaping effort.
Improve domain management
Moving forward, it’d be logical to have a standardized approach to naming subdomains for both web and API properties in support of VA operations. Establishing a common approach to naming city, state, regional, program, research, and other resource areas would provide human- and machine-readable access to these resources. This might be difficult to do for web properties with so much legacy infrastructure, but the API platform provides an opportunity to establish a standardized approach moving forward.
Leverage common data formats
Our landscape analysis revealed a lack of a consistency when it comes to vocabulary, schema, and data formats. Most of the data published is derived from an existing system or represents a human-directed process. There’s a significant amount of fluff and noise surrounding these valuable data and a lack of consistent naming and field types.
Based upon the resources that we’ve identified for your consideration, there are a handful of existing data formats the VA should consider. Some of these are already underway, while others are not currently reflected in the Lighthouse’s efforts, but are used by other government entities to publish data in a consistent manner.
Fast Healthcare Interoperability Requirements (FHIR) – FHIR is already in-motion at the VA, but worth highlighting here. FHIR provides an anchor for why common data schema formats are relevant to other resources beyond Veteran healthcare records.
Open Referral (211) – Open Referral is a common schema and API specification for defining human services, including organizations, locations, and services, along with all the supporting information and metadata that goes with this core set of resources.
Open311 – Open311 is a common data format for reporting problems and issues at the municipal level, but can easily be adopted for establishing feedback loops at any level of government. It provides a common schema for how large volumes of information get submitted via API infrastructure.
Schema.org – A common schema vocabulary that provides object definitions for almost every resource identified throughout this landscape analysis, and the recommended list of resources above.
There are undoubtedly other open data formats that can be leveraged. Common microformats and other RFCs should also be considered, but these can be addressed during the define and design stages of the API development lifecycle, once individual resources have been decided upon. Common formats help ensure resources are interoperable and reusable across VA groups; they also help bring teams together to speak in a common language, using a common dictionary, which will go a long ways to standardizing how data is published and consumed.
Improve analytical information
We had hoped there would have been more analytical information available to help rank the resource data that we identified. We did incorporate the ranking information available from data.gov as an input into our resource prioritization. However, we relied mostly on publication frequency and the overall occurrence of each keyword to help weight relevance. The existence of a word in a path, title, and file name gives it a weight, which can be amplified for every occurrence, providing us with adequate levels of prioritization, grouping, and organization to help us understand each topics importance. If a topic exists frequently across VA web properties, and exists as a sectional grouping, and title of data file, it has importance and relevance.
The lack of analytics, or access to current analytics, across existing VA data sources demonstrates the importance of having a consistent and comprehensive analytics strategy across the VA’s data. There should be download counts for all machine-readable files. And, more importantly, real-time analytics for the consumption of this data via simple web APIs. There should be regional- and program-related ata. We should have personalized data that reflects what’s most important to Veterans. We should understand what’s relevant what isn’t through strategically-designed analytics across web and API operations. The lack of analytics is why we’re working to identify relevant data sources, so those can be made more available and analytics become the default — not an afterthought.
Continue refining the landscape analysis
Our landscape analysis produced a lot of information that was messy and difficult to work with. We can continue to make another pass, which would involve refining indexes, optimizing title and filename parsing, and developing key phrase, plural word, and other dictionaries to make the results much more refined.
There was a lot of data to harvest, process, and make sense of in a two-week sprint. However, we feel that we were able to do a good job of making sense of what was captured. Another sprint could easily be spent sorting through all of the data targeted, separating quality datasets from the more messier ones. Creating a dictionary to translate words and rehydrate acronyms would be very useful to help make further sense of what’s available in CSV, XLS/XLSX, JSON, and XML files. More work could also be done around forms: identifying the types available; defining their search mechanisms; defining what input parameters they allow (whether GET or POST); and unlocking further details on how they store data. Form and table data often have a direct connection to backend databases, which make them more valuable than some of the published data files.
All of the data from the landscape analysis has been published to GitHub, minus the primary index of harvested and processed URLs. Those are too big to publish as JSON to GitHub, but we’ll evaluate how best to provide access to each site index using a solution such as Amazon AMI. We also started experimenting with a secondary spider solution in order to generate an index of the VA website and can publish those indexes as separate GitHub repositories within a single GitHub repository when completed. We feel like these newer indexes could provide a much richer approach to understanding the data and content across the VA’s web properties. And allow for other researchers and analysts to fork and work to make sense of the data that they contain.
Incorporate user research
It’s critical that any further landscape analysis focused on uncovering valuable data resources from across the VA’s web presence is combined with user research activities, such as the series of design workshops that we conducted. Doing so will provide a human perspective on what’s most important to Veterans and their supporters.
We strongly encourage the Lighthouse program to conduct a similar workshop activity to the one that we ran, leveraging the VA’s much stronger outreach capability in order to attract an even larger and more diversified representation of the Veteran community’s data resource needs.
We also recommend that the Lighthouse program consider using the service blueprinting technique as a way to help identify and prioritize specific APIs for deployment. For example, a service blueprint could be created for a specific interaction that Veterans have with the VA, such as trying to find information on healthcare facilities. It’s likely that any service blueprints you want to create could be acquired by chunking the work into multiple micro-purchases. At the very least, we recommend trying to do at least one as an experiment. Once specific APIs are identified, you could then map them against a 2x2 prioritization matrix, based on how high they score against two main criteria: Veteran Experience Impact (y-axis) and Readiness to Execute (x-axis).
Conclusion: this journey is just beginning
The landscape analysis for the VA doesn’t end here. Just like the resulting API effort, the evaluation of the VA’s web presence should be an ongoing process. Work should continue to help identify what datasets are being published to the VA’s web properties, and to incorporate these datasets into API operations or to replace them with API-driven solutions. In the long term, there shouldn’t be any tables, forms, CSV files, JSON files, XML files, or XLS/XLSX files without a direct connection to the API platform. Eventually all data should be derived from a federated, but standardized, set of API platforms that are designed, deployed, and managed consistently as part of the VA Lighthouse effort.
Hopefully the work conducted here provides a base of resources for the Lighthouse program to consider as it moves forward. Ideally, everything uncovered as part of this work eventually becomes an API, or part of a suite of APIs. We understand that this won’t be a reality anytime soon, but we worked diligently to uncover the most valuable resources and to provide a concise list of data resources that could be turned into APIs and used to begin driving web, mobile, and desktop applications that serve Veterans and their families. There’s a wealth of resources available to Veterans across the VA’s websites. The challenge now is how do we ensure these resources deliver value consistently across many platforms? A simple, consistent, and usable API stack is the answer.
Lessons to share
This project is one of the VA’s first experiments using the microconsulting model in support of the Lighthouse initiative. Sharing what went well, what didn’t, and what could we have been done better — all in the name of continuous improvement — is the responsibility of everyone involved in order to make not only the Lighthouse initiative a success, but the microconsulting model as well. With that said, here’s what we have to share:
We thoroughly enjoyed working on this as a micro-project. We felt that the short-timeboxed, tightly-scoped nature of the work focused our efforts on executing only the most essential activities, giving even more meaning to inherently impactful work. As Parkinson’s Law states, “work expands so as to fill the time available for its completion.”
Looking back, our approach to this project involved some known unknowns (and some unknown unknowns) from a technical standpoint. In particular, the question of how well our spidering process could scale to handle the VA’s enormous web footprint. It would have been best to propose conducting an agile “spike” activity as a small micro-project in order to gain risk-reducing knowledge.
For micro-projects under a tight schedule and for which there are external dependencies (for example, scheduling interviews or workshops with external participants), some lead time may be necessary before formally kicking off the project.
Those people who participated in our facilitated workshops expressed extreme gratitude for the opportunity to contribute to the progress of the Lighthouse program. Working in the open and co-creating with the public will not only foster an engaged community of supporters, but will also lead to better quality outcomes.
While our workshop activities were extremely valuable in giving us a human perspective on our landscape analysis, we felt that there could have been even greater representation from the Veteran community. We should have been more proactive about leveraging the VA’s outreach capability to draw in an even more dense and diverse group of Veterans and their supporters.
In my world, OpenAPI is always a primary actor, and the tooling and services that put it to work are always secondary. However, I’d say that 80% of the people I talk with are the opposite, putting OpenAPI tooling in a primary role, and the OpenAPI specification in a secondary role. This is the primary reason that many still see Swagger tooling as the value, and haven’t made the switch to the concept of OpenAPI, or understand the separation between the specification and the tooling.
Another way in which you can see the importance of OpenAPI tooling is the slow migration of OpenAPI 2.0 to 3.0 users. Many folks I’ve talked to about OpenAPI 3.0 tell me that they haven’t made the jump because of the lack of tooling available for the specification. This isn’t always about the external services and tooling that supports OpenAPI 3.0, it is also about the internal tooling that supports it. It demonstrates the importance of tooling when it comes to the evolution, and adoption of OpenAPI. It demonstrates the need for the OAI community to keep investing in the development and evangelism of tooling for the latest version.
I am going to work to invest more time into rounding up OpenAPI tooling, and getting to know the developers behind them, as I prepare APIStrat in Nashville, TN. I’m also going to invest in my own migration to OpenAPI 3.0. The reason I haven’t evolved isn’t because of lack tooling, it is because of a lack of time, and the cognitive load involved with thinking new ways. I fully grasp the differences between 2.0 and 3.0, but I just don’t have intuitive knowledge of 3.0 in the way I do for 2.0. I’ve spent hundreds of hours developing around 2.0, and I just don’t have the time in my schedule to make similar investment in 3.0–soon!
If you need to get up to speed on the latest when it comes to OpenAPI 3.0 tooling I recommend checking out OpenAPI.Tools from Matt Trask (@matthewtrask) and Crashy McCiderface (aka Phil Sturgeon) (@philsturgeon). It is the best source of OpenAPI tooling out there right now. If you are still struggling with the migration from 2.0 to 3.0, or would like to see a specific solution developed on top of OpenAPI 3.0, I’d love to hear from you. I’m working to help shape the evolution of the OpenAPI tooling conversation, as well as tell stories about what tools are available, or should be available, and how they are can be put to work on the ground at companies, organizations, institutions, and government agencies.
After eight years of educating people about sensible API security and management, I’m always amazed at how many people I come across who still think public web APIs are about giving away access to your data, content, and algorithms for free. I regularly come across very smart people who say they’d be doing APIs, but they depend on revenue from selling their data and content, and wouldn’t benefit from just putting it online for everyone to download for free.
I wonder when we stopped thinking the web was not about giving everything away for free? It is something I’m going to have to investigate a little more. For me, it shows how much education we still have ahead of us when it comes to informing people about what APIs are, and how to properly manage them. Which is a problem, when many of the companies I’m talking to are most likely doing APIs to drive internal systems, and public mobile applications. They are either unaware of the APIs that already exist across their organization, or think that because they don’t have a public developer portal showcasing their APIs, that they are much more private and secure than if they were openly offering them to partners and the public.
Web API management has been around for over a decade now. Requiring ALL developers to authenticate when accessing any APIs, and the ability to put APIs into different access tiers, limit that the rate of consumption, while logging and billing for all API consumption isn’t anything new. Amazon has been extremely public about their AWS efforts, and the cloud isn’t a secret. The fact that smart business leaders see all of this and do not see that APIs are driving it all represents a disconnect amongst business leadership. It is something I’m going to be testing out a little bit more to see what levels of knowledge exist across many fortune 1000 companies, helping paint of picture of how they view the API landscape, and help me quantify their API literacy.
Educating business leaders about APIs has been a part of my mission since I started API Evangelist in 2010. It is something that will continue to be a focus of mine. This lack of awareness is why we end up with damaging incidents like the Equifax breach, and the Cambridge Analytica / Facebook scandal. Its how we end up with so many trolls on Twitter, and an out of balance API ecosystems across federal, state, and municipal governments. It is a problem that we need to address in the industry, and work to help educate business leaders around common patterns for securing and managing our API resources. I think this process always begins with education and API literacy, but is a symptom of the disconnect around storytelling about public vs private APIs, when in reality there are just APIs that are secured and managed properly, or not.
It is APIStrat time again! This time it is in Nashville, Tennessee! We are in the early stages of the event, but we are getting close to the deadline of the call for papers. We’ve assembled another rockstar ensemble for this round to help us steer the event, and review talk submissions once the CFP process has closed. I just wanted to take a moment and recognize the folks who are helping out and make sure they get the recognition they deserve.
First up are the six members of the APIStrat steering committee, playing different leadership roles in the conference, making sure everything gets done by September:
- Darrel Miller (@darrel_miller)– Workshop Chair - Microsoft
- Erin McKean (@emckean) – Program Chair - IBM
- Kin Lane (@kinlane) – Marketing Chair - API Evangelist
- Lorinda Brandon (@lindybrandon) – Community Chair - Capital One DevExchange
- Steven Willmott (@njyx) – Chair - Red Hat.
- Tim Burks (@timburks) – Program Vice Chair - Google
Then we have assembled nineteen folks on the program committee who will be reviewing your talk submissions before you can get on stage at APIStrat in Nashville:
- Amy Palamountain (@ammeep) - GitHub
- Ash Hathaway (@Ash_Hathaway) - Pivotal
- Carly Thelander (@CarlyNThelander) - MilkRun
- David Biesack (@davidbiesack) - Apiture
- Gail Frederick (@screaminggeek) - eBay
- George Atala (@gmatala) - Quantero Capital
- Glenn Block (@gblock) - Auth0 https://auth0.com/
- Gregory Koberger (@gkoberger) - ReadMe
- James Higginbotham (@launchany) - LaunchAny
- Jennifer Riggins (@jkriggins) - the eBranding Ninja
- Joey French (@josephtfrench) - Intrinio
- Kyle Dallaire (@kyledallaire) - Capital One Digital
- Melissa Jurkoic (@melissa_jurkoic) - Amadeus Hospitality
- Phil Sturgeon (@philsturgeon) - WeWork
- Sai Vennam (@sai_vennam) - IBM https://www.ibm.com/
- Shelby Switzer (@switzerly) - Healthify
- Skip Hovsmith (@skiphovsmith) - Critical Blue
- Taylor Barnett (@taylor_atx) - Stoplight
- Tessa Mero (@TessaMero) - Cisco
- Yina Arenas (@yina_arenas) - Microsoft Graph
Than you to everyone for helping do the hard work of making sure APIStrat not only continues, but continues to represent the wider API community. Everyone is doing this work because they care about the community, and want to make the event as good as, or better than it has been in the past. This is the 9th edition of APIStrat, spanning New York, San Francisco, Amsterdam, Chicago, Berlin, Austin, Boston, Portland, and now Nashville! It has been a pretty wild ride.
While we have everyone we need for these committees, we still need help in other areas. First, get your talk submitted before the CFP closes next week. Second, we need your financial support, so make sure you consider sponsoring APIStrat, and help make sure Nashville rocks. Beyond that we can use some help spreading the word. We are looking to grow the event beyond the usual 500 threshold, helping expand participation in the event, as well as the OpenAPI Initiative. If you want to help, feel free to ping me anytime, and I’ll see you in Nashville.
I’m kicking off a busy week of travel and talks this week in DC with a discussion about delivering microservices at federal agencies at DevNation Federal on Tuesday, June 5th, 2018. I was invited by Red Hat to come speak about the work I’m doing as API Evangelist across federal agencies. You can find me in the afternoon lineup, sharing my stories title “The Tech, Business, and Politics of APIs In Federal Government”. Focusing on information gathered as part of my research, workshops, and consulting across the public and private sector.
My talk reflects my work to motivate federal agencies to do APIs over the last five years, and help pollinate the ideas and practices I gather from across the private sector, and understand which ones will work in the public sphere. Not everything about doing APIs at startups and in the enterprise translates perfectly to delivering APIs in the federal government, but there are many practices that will help agencies better serve the people. My goal is to open up discussion with government employees and contractors, to help figure out what works and what doesn’t–sharing stories along the way.
Let me know if you are going to be at DevNation Federal let me know. I am happy to make some time to talk, and hear what you are up to with APIs. I depend on these hallway conversations to populate my blog with stories. I’ll be in town around noon, and there until around 5 or so until I head over to the DC API Meetup for my second talk of the day. Thanks to Red Hat for having me out. I enjoy doing talks for Red Hat events, as they tend to reflect more of the audience I’m looking for, with a focus on more open source, and a little less proprietary focus when it comes to delivering government technology. I’ll see you in Washington D.C. on Tuesday!
After I speak at DevNation Federal in Washington DC this Tuesday, I am going to give a similar talk at the DC API API User Group that evening. I love going to the Meetups in DC, partly because my good friend Gray Brooks runs the event, but also because I’ve been working to jumpstart API conversations in Washington DC since 2012 when I held the first DC edition of API Craft. I was on a mission to jumpstart API Craft gatherings around the country that year, and it makes me happy to see the API Meetup culture continuing to thrive in DC, where other places it has died out.
At the DC API Meetup I’ll be giving a variation of my talk that I’m giving earlier that day at DevNation Federal. Talking about the technology, business, and politics of doing APIs, with an emphasis on a consistent and repeatable API lifecycle. I’ll be reworking my regular material in light of current projects I’m working on at the federal level including with the VA, FDIC, HHS, and beyond. Sharing stories about how a microservice approach can help make government services more agile, flexible, and delivered in smaller more bite sized chunks–helping move the IT conversation forward across federal agencies.
If you can’t make it to DevNation Federal, I recommend you head out to the DC API User Group later that evening. I’d love to get a chance to hang out with you and talk about APIs. I’m always impressed with the folks who turn out for the DC API Meetup, consistently providing a fresh opportunity to discuss APIs and the impact they are making across the federal government. Organizer Gray Brooks has his finger on the pulse of what is going on across agencies, way beyond what I am capable of from the outside-in. I look forward to hanging out in DC, and hope you can make it out Tuesday to talk some APIs with me.
I’ve been evaluating API management providers, and this important stop along the API lifecycle in which they serve for eight years now. It is a space that I’m very familiar with, and have enjoyed watching it mature, evolve, and become something that is more standardized, and lately more commoditized. I’ve enjoyed watching the old guard (3Scale, Apigee, and Mashery) be acquired, and API management be baked into the cloud with AWS, Azure, and Google. I’ve also had fun learning about Kong, Tyk, and the next generation API management providers as they grow and evolve, as well as some of the older players like Axway as they work to retool so that they can compete and even lead the charge in the current environment.
I am renewing efforts to study what each of the API management solutions provide, pushing forward my ongoing API management research, understanding what the current capacity of the active providers are, and potentially they are pushing forward the conversation. One of the things I’m extremely interested in learning more about is the connector, plugin, and extensibility opportunities that exist with each solution. Functionality that allows other 3rd party API service providers to inject their valuable services into the management layer of APIs, bringing other stops along the API lifecycle into management layer, allowing API providers to do more than just what their API management solution delivers. Turning the API management layer into much more than just authentication, service plan management, logging, analytics, and billing.
Over the last year I’ve been working with API security provider ElasticBeam to help make sense of what is possible at the API management layer when it comes to securing our APIs. ElasticBeam can analyze the surface area of an API, as well as the DNS, web, API management, web server, and database logs for potential threats, and apply their machine learning models in real time. Without direct access at the API management layer, ElasticBeam is still valuable but cannot respond in real-time to threats, shutting down keys, blocking request, and other threats being leveraged against our API infrastructure. Sure, you can still respond after the fact based upon what ElasticBeam learns from scanning all of your logs, but without being able to connect directly into your API management layer, the effectiveness of their security solution is significantly diminished.
Complimenting, but also contrasting ElasticBeam, I’m also working with Streamdata.io to help understand how they can be injected at the API management layer, adding an event-driven architectural layer to any existing API. The first part of this would involve turning high volume APIs into real time streams using Server-Sent Events (SSE). With future advancements focused on topical streaming, webhooks, and WebSub enhancements to transform simple request and response APIs into event-driven streams of information that only push what has changed to subscribers. Like ElasticBeam, Streamdata.io would benefit being directly baked into the API management layer as a connector or plugin, augmenting the API management layer with a next generation event-driven layer that would compliment what any API management solution brings to the table.
Without an extensible connector or plugin layer at the API management layer you can’t inject additional services like security with ElasticBeam, or event-driven architecture like Streamdata.io. I’m going to be looking for this type of extensibility as I profile the features of all of the active API management providers. I’m looking to understand the core features each API management provider brings to the table, but I’m also looking to understand how modern these API management solutions are when it comes to seamlessly working with other stops along the API lifecycle, and specifically how these other stops can be serviced by other 3rd party providers. Similar to my regular rants about API service providers always having APIs, you are going to hear me rant more about API service providers needing to have connector, plugin, and other extensibility features. API management service providers can put their APIs to work driving this connector and plugin infrastructure, but it should allow for more seamless interaction and benefits for their customers, that are brought to the table by their most trusted partners.
|<< Prev||Next >>|