How Should Teams Be Documenting Their APIs When You Have Both Legacy And New APIs?

I’m continuing my work to help the Department of Veterans Affairs (VA) move forward their API strategy. One area I’m happy to help the federal agency with, is just being available to answer questions, which I also find make for great stories here on the blog–helping other federal agencies also learn along the way. One question I got from the agency recently, is regarding how the teams should be documenting their APIs, taking into consideration that many of them are supporting legacy services like SOAP.

From my vantage point, minimum viable API documentation should always include a machine readable definition, and some autogenerated documentation within a portal at a known location. If it is a SOAP service, WSDL is the format. If it is REST, OpenAPI (fka Swagger) is the format. If its XML RPC, you can bend OpenAPI to work. If it is GraphQL, it should come with its own definitions. All of these machine readable definitions should exist within a known location, and used as the central definition for the documentation user interface. Documentation should not be hand generated anymore with the wealth of open source API documentation available.

Each service should have its own GitHub/BitBucket/GitLab repository with the following:

  • README - Providing a concise title and description for the service, as well as links to all documentation, definitions, and other resources.
  • Definitions - Machine readable API definitions for the APIs underlying schema, and the surface area of the API.
  • Documentation - Autogenerated documentation for the API, driven by its machine readable definition.

Depending on the type of API being deployed and managed, there should be one or more of these definition formats in place:

  • Web Services Description Language (WSDL) - The XML-based interface definition used for describing the functionality offered by the service.
  • OpenAPI - The YAML or JSON based OpenAPI specification format managed by the OpenAPI Initiative as part of the Linux Foundation.
  • JSON Schema - The vocabulary that allows for the annotation and validation of the schema for the service being offered–it is part of OpenAPI specification as well.
  • Postman Collections - JSON based specification format created and maintained by the Postman client and development environment.
  • API Blueprint - The markdown based API specification format created and maintained by the Apiary API design environment, now owned by Oracle.
  • RAML - The YAML based API specification format created and maintained by Mulesoft.

Ideally, OpenAPI / JSON Schema is established as the primary format for defining the contract for each API, but teams should also be able to stick with what they were given (legacy), and run with the tools they’ve already purchased (RAML & API Blueprint), and convert between specifications using API Transformer.

API documentation should be published to it’s GitHub/GitLab/BitBucket repository, and hosted using one of the service static project site solutions with one of the following open source documentation:

  • Swagger UI - Open source API documentation driven by OpenAPI.
  • ReDoc - Open source API documentation driven by OpenAPI.
  • RAML - Open source API documentation driven by RAML.
  • DapperDox - DapperDox is Open-Source, and provides rich, out-of-the-box, rendering of your OpenAPI specifications, seamlessly combined with your GitHub flavoured Markdown documentation, guides and diagrams.
  • wsdldoc - The tool can be used to generate HTML documentation out of WSDL file.

There are other open source solutions available for auto-generating API documentation using the core API’s definition, but these represent some of the commonly used solutions out there today. Depending on the solution being used to deploy or manage an API, there might be built-in, ready to go options for deploying documentation based upon the OpenAPI, WSDL, RAML or other using AWS API Gateway, Mulesoft, or other existing vendor solution already in place to support API operations.

Even with all this effort, a repository, with a machine readable API definition, and autogenerated documentation still doesn’t provide enough of a baseline for API teams to follow. Each API documentation should possess the following within those building blocks:

  • Title and Description - Provide the concise description of what an API does from the README, and make sure it is based into the APIs definition.
  • Base URL - Have the base URL, or variable representation for a base URL present in API definitions.
  • Base Path - Provide any base path that is constant across paths available for any single API.
  • Content Types - List what content types an API accepts and returns as part of its operations.
  • Paths - List all available paths for an API, with summary and descriptions, making sure the entire surface area of an API is documented.
  • Parameters - Provide details on the header, path, and query parameters used for API path being documented.
  • Body - Provide details on the schema for the body of each API path that accepts a body as part of its operations.
  • Responses - Provide HTTP status code and reference to the schema being returned for each path.
  • Examples - Provide example requests and response for each API path being documented.
  • Schema - Document all schema being used as part of requests and responses for all APIs paths being documented.
  • Authentication - Document the authentication method used (ie. Basic Auth, Keys, OAuth, JWT).

If EVERY API possesses its own repository, and README to get going, guiding all API consumers to complete, up to date, and informative documentation that is auto-generated, a significant amount of friction during the on-boarding process can be eliminated. Additionally, friction at the time of hand-off for any service from on team to another, or one vendor to another, will be significantly reduced–with all relevant documentation available within the project’s repository.

API documentation delivered in this way provides a single known location for any human to go when putting an API to work. It also provides a single known location to find a machine readable definition that can be used to on-board using an API client like Postman, PAW, or Insomnia. The API definition provides the contract for the API documentation, but it also provides what is needed across other stops along the API lifecycle, like monitoring, testing, SDK generation, security, and client integration–reducing the friction across many stops along the API journey.

This should provide a baseline for API documentation across teams. No matter how big or small the API, or how new or old the API is. Each API should have API documentation available in a consistent, and usable way. Providing a human and programmatic way for understanding what an API does, that can be use to on-board and maintain integrations with each application. The days of PDF and static API documentation are over, and the baseline for each APIs documentation always involves having a machine readable contract as the core, and managing the documentation as part of the pipeline used to deploy and manage the rest of the API lifecycle.