Conducting An API Landscape Analysis

I am having conversations with different organizations about where to start with APIs, which is pushing me to revisit some of my previous API landscape analysis work, like an evaluation of the existing data and resource presence of the Department of Veterans Affairs (VA). The process is born out of my low hanging fruit work, identifying where the data sets for an organization already exist, and shining a light on how and where an organization should begin with their API investment. It asks the question: if an organization is regularly publishing spreadsheets, CSV, JSON, and XML data to their website, why are these not available as APIs? It is a question that 98% of the enterprise organizations I come across do not have an adequate answer to. They possess no formal strategy for how data is created, published, shared, centralized, distributed, or made available via APIs, which results in most organizations having no idea where their digital assets are, or what their organizational capabilities are.

My low hanging fruit process involves spidering a domain looking for CSV, JSON, and XML files, as well as identifying pages that have tables and forms on them. I’ve done this for several organizations who have paid me, a handful of others where people who work there asked me to do it, as well as for organizations that I find interesting, where I work to identify the API low hanging fruit for my own purposes. I use it to gain a better understanding of the digital capabilities of an organization, but it is also work that got me in trouble after I applied it to the University of Oklahoma, uncovering some spreadsheets that shouldn’t have been made public, and attracting the attention of the institution’s leadership. It all worked out, but for a little while they were considering calling the FBI on me. It all provides an important story about how we manage our digital assets, and about making sure we have a comprehensive strategy for how we publish data and what should be made available internally, with the public, and ideally via a centralized API strategy that is accessible to everyone, not just developers.
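To make the spidering step concrete, here is a minimal sketch of how such a crawl might be written using only the Python standard library. It is an illustration under assumptions, not the actual tool used for this work: the names `PageScanner`, `scan_page`, `spider`, and `DATA_EXTENSIONS`, the list of file extensions, and the page limit are all hypothetical choices made for the example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen

# Assumed set of "data file" extensions; adjust for the domain being analyzed.
DATA_EXTENSIONS = (".csv", ".json", ".xml", ".xls", ".xlsx")


class PageScanner(HTMLParser):
    """Collects links and notes whether a page contains tables or forms."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.has_table = False
        self.has_form = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links and drop any #fragment.
                self.links.append(urldefrag(urljoin(self.base_url, href))[0])
        elif tag == "table":
            self.has_table = True
        elif tag == "form":
            self.has_form = True


def scan_page(html, base_url):
    """Return the links, data files, and table/form flags for one page."""
    scanner = PageScanner(base_url)
    scanner.feed(html)
    data_files = [link for link in scanner.links
                  if urlparse(link).path.lower().endswith(DATA_EXTENSIONS)]
    return {"links": scanner.links, "data_files": data_files,
            "has_table": scanner.has_table, "has_form": scanner.has_form}


def fetch(url):
    """Download a page as text (default fetcher; can be swapped out)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def spider(start_url, fetch_page=fetch, max_pages=50):
    """Breadth-first crawl of one domain, reporting the low hanging fruit."""
    domain = urlparse(start_url).netloc
    queue, seen = deque([start_url]), {start_url}
    report = {"data_files": set(), "tables": [], "forms": []}
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        try:
            result = scan_page(fetch_page(url), url)
        except Exception:
            continue  # skip pages that fail to load or parse
        report["data_files"].update(result["data_files"])
        if result["has_table"]:
            report["tables"].append(url)
        if result["has_form"]:
            report["forms"].append(url)
        for link in result["links"]:
            same_domain = urlparse(link).netloc == domain
            if same_domain and link not in seen and link not in result["data_files"]:
                seen.add(link)
                queue.append(link)
    return report
```

Calling `spider("https://example.org/")` would return the data files found, plus the pages containing tables and forms, for up to `max_pages` pages on that domain. A real crawl would also want to honor robots.txt and pace its requests, which this sketch leaves out for brevity.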

The VA API Landscape Analysis and Roadmapping Project Report is a pretty healthy look at how the API conversation should begin at most organizations. It doesn’t just work to identify the existing data sets available across an organization, it shows the existing information architecture being rolled out intentionally or unintentionally, and it sets the stage for what an API program will look like, including which resources are made available, and the vocabulary we employ to describe our ongoing, perpetual, digital transformations. I played around with the idea of doing landscape analysis as a business, making it a big part of how I operate API Evangelist, but after engaging with a handful of large enterprise organizations around the approach, I realized that most will not be able to process what is being given to them, let alone understand where to begin when it comes to course correcting. Some just see this process as an annoyance, seeing me as being too focused on their shortcomings, and failing to see that I am trying to help, never fully grasping what APIs are all about.

If I were running a large institution, I would welcome someone spidering my website to help map out the API landscape that already exists across my company, organization, institution, or government agency. I would want to identify the spreadsheets that shouldn’t be published as soon as they are published, and help those who published them understand better practices when it comes to sharing their data. I would want to understand how my employees are creating, managing, sharing, and reporting around data. I would want to understand what all of my organizational capabilities are, and make sure I had them available as a single stack of machine readable and human accessible API resources. Sadly, this isn’t the state of things within most large organizations. Most people at lower levels don’t care about the big picture, and many at the higher levels don’t understand the importance of having a formal strategy for how data is managed, or that APIs are not only how you provide access to data, they are also how you get your data schema house in order. APIs are how you make sense of the data sprawl that exists across your organization and standardize how these assets are published and accessed, providing further awareness into how data is being put to work across a company, organization, institution, or government agency.

I will keep investing in my API landscape analysis process. It is a cornerstone of my API research. Mapping out the landscape of existing API providers, as well as the more closeted API providers, is how I do what I do. I will keep using it to help organizations who are ready for an honest investment in their overall API journey understand the digital assets they already possess, and see the value of publishing them as simple, intuitive APIs that are part of a wider organizational API catalog. Ultimately it is up to each organization to want to begin stabilizing their domain landscape, and to understand the relationship between their web presence and their API presence. That means working to get more organized about our overall information architecture, and recognizing that API design is more about the structure of our organizations and what different teams produce each day than it is about APIs. It means understanding that our digital self is just a reflection of our physical self, and that what we publish online tells a story about who we are behind the network router and firewall. You are either someone who is aware of this and works to shape your physical and digital presence in a concerted way, or you just blindly operate, hoping for the best in this new online reality we have created for ourselves.