I am continuing to iterate on what I consider to be a modern API toolbox. API Evangelist research was born out of the SOA and API worlds colliding, and while I have been heavily focused on HTTP APIs over the years, I have regularly acknowledged that a diverse API toolbox is required for success, and I have invested time in understanding just what I mean when I say this. I am working to broaden my own understanding of the technologies in use across the enterprise, and to realistically map out what I mean when I say API landscape. I am still workshopping my new API toolbox definition for 2020, but I wanted to work on some of the narrative around each of the items in it, helping me learn along the way, while also expanding the scope of what I am talking about.
Transmission Control Protocol (TCP)
The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite, and provides reliable, ordered, and error-checked delivery of a stream of bytes between applications running on hosts communicating via an IP network. The Web and APIs both rely on TCP, which is part of the Transport Layer of the TCP/IP suite. SSL/TLS often runs on top of TCP. It is the backbone of our API toolbox, but there are many different ways you can put TCP to work when it comes to the programming interfaces behind the applications we depend on.
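To make the idea of a reliable, ordered stream of bytes a little more concrete, here is a minimal sketch in Python using only the standard library. It starts a throwaway echo server on the local machine and sends it a few bytes over a TCP connection. The function names (`echo_once`, `tcp_round_trip`) and the use of the loopback address are my own assumptions for illustration, and it is only meant for small payloads, not as a pattern for production servers.

```python
import socket
import threading


def echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and echo back everything the peer sends."""
    conn, _addr = server_sock.accept()
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # an empty read means the peer closed its side
                break
            conn.sendall(chunk)


def tcp_round_trip(payload: bytes) -> bytes:
    """Send a small payload to a local echo server and return what comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=echo_once, args=(server_sock,), daemon=True)
        worker.start()

        # create_connection performs the TCP handshake before returning
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)  # tell the server we are done sending
            received = bytearray()
            while True:
                chunk = client.recv(1024)
                if not chunk:
                    break
                received.extend(chunk)

        worker.join(timeout=5)
    return bytes(received)
```

Calling `tcp_round_trip(b"hello")` should hand back the same bytes in the same order, which is the guarantee the protocols in the list below build on.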
It can be tough to separate what is a protocol, and what is a methodology when looking at the API landscape. I’m still working to understand each of these tools in the toolbox, and organize them in a meaningful way—which is why I am writing this post. While all APIs technically rely on TCP, these approaches to communication and information exchange are often implemented directly using TCP.
- Electronic Data Interchange (EDI) - Electronic Data Interchange (EDI) is the electronic interchange of business information using a standardized format; a process which allows one company to send information to another company electronically rather than with paper. Business entities conducting business electronically are called trading partners.
- Advanced Message Queuing Protocol (AMQP) - The Advanced Message Queuing Protocol (AMQP) is an open standard application layer protocol for message-oriented middleware, focusing on message orientation, queuing, routing, reliability and security.
- MQ Telemetry Transport (MQTT) - MQTT is a publish/subscribe, extremely simple and lightweight messaging protocol, designed for constrained devices and low-bandwidth, high-latency or unreliable networks.
- Java Message Service (JMS) - The Java Message Service (JMS) API is a Java message-oriented middleware API for sending messages between two or more clients. JMS is a part of the Java Platform, Enterprise Edition (Java EE).
- Simple Object Access Protocol (SOAP) - SOAP is a message protocol that allows distributed elements of an application to communicate. SOAP can be carried over a variety of lower-level protocols, including the web-related Hypertext Transfer Protocol (HTTP).
- Streaming Text Oriented Messaging Protocol (STOMP) - STOMP provides an interoperable wire format so that STOMP clients can communicate with any STOMP message broker to provide easy and widespread messaging interoperability among many languages, platforms and brokers.
- Websockets - The WebSocket Protocol enables two-way communication between a client running untrusted code in a controlled environment to a remote host that has opted-in to communications from that code.
- Kafka - Apache Kafka is an open-source stream-processing software platform that provides a unified, high-throughput, low-latency platform for handling real-time data feeds. It connects to external systems via Kafka Connect, and provides Kafka Streams for stream processing.
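As a small illustration of what a wire format on top of TCP looks like, here is a rough sketch of building and parsing a STOMP-style frame in Python: a command line, `key:value` header lines, a blank line, the body, and a terminating NULL byte. The function names and the `/queue/orders` destination are hypothetical, and the sketch skips header escaping and `content-length` handling, so treat it as a way to see the shape of a frame rather than a working client.

```python
from typing import Dict, Tuple


def build_frame(command: str, headers: Dict[str, str], body: str = "") -> bytes:
    """Serialize a command, headers, and body into a STOMP-style frame."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    # A blank line separates headers from the body; a NULL byte ends the frame.
    return ("\n".join(lines) + "\n\n" + body).encode("utf-8") + b"\x00"


def parse_frame(raw: bytes) -> Tuple[str, Dict[str, str], str]:
    """Split a single frame back into (command, headers, body)."""
    text = raw.rstrip(b"\x00").decode("utf-8")
    head, _, body = text.partition("\n\n")
    command, *header_lines = head.split("\n")
    headers: Dict[str, str] = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        headers.setdefault(key, value)  # keep the first value if a header repeats
    return command, headers, body


frame = build_frame(
    "SEND",
    {"destination": "/queue/orders", "content-type": "text/plain"},
    "hello",
)
```

Running `parse_frame(frame)` gives back the same command, headers, and body that went in, which is the interoperability a shared wire format is meant to provide.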
I am still defining the individual building blocks of each of these tools, allowing me to better organize them, and articulate how they overlap and differ. Understanding the technical details of each approach, as well as the history and details of where they have been put to use, helps me better understand the API landscape. I have been exposed to Websockets, Kafka, SOAP, and MQTT, but the other areas are new for me, and I have a lot to learn. For now I am just trying to quantify each technology as simply as I can, and work to define the community around them. While HTTP APIs still dominate much of the API conversation, areas like EDI still dominate much of the true commerce landscape, and I need to better understand what is going on if I am going to paint a clearer picture of what the future might hold.
User Datagram Protocol (UDP)
UDP uses a simple connectionless communication model with a minimum of protocol mechanisms. UDP provides checksums for data integrity, and port numbers for addressing different functions at the source and destination of the datagram. It has no handshaking dialogues, and thus exposes the user's program to any unreliability of the underlying network; there is no guarantee of delivery, ordering, or duplicate protection.
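The connectionless model is easy to see in a few lines of Python. In this sketch, which uses only the standard library, a datagram is sent to a local socket without any handshake, and the receiver sets a timeout because nothing in UDP promises the datagram will arrive. The function name `udp_round_trip` and the two-second timeout are assumptions I picked for illustration.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local UDP socket and return what was received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so do not wait forever

        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
            # No connect or handshake: the datagram is simply addressed and sent.
            sender.sendto(payload, receiver.getsockname())

        # recvfrom returns one whole datagram plus the sender's address
        data, _addr = receiver.recvfrom(65535)
    return data
```

On the loopback interface this will almost always succeed, but across a real network the `recvfrom` call could time out, and it would be up to the application to decide whether to retry.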
I am sure that some of the other approaches to delivering APIs also use UDP, but until I get to know them better, or find examples of that, SOAP is the only item I have on my UDP list.
- Simple Object Access Protocol (SOAP) - SOAP is a message protocol that allows distributed elements of an application to communicate. SOAP can be carried over a variety of lower-level protocols, including the web-related Hypertext Transfer Protocol (HTTP).
I will be looking for other examples of APIs delivered via UDP, but I also wanted to have UDP on my list because of what is happening with HTTP/3. I have long felt like APIs were too TCP focused, and would benefit from some of the constraints of UDP when it came to how we design our API infrastructure. I am looking forward to further falling down this rabbit hole with each pass over my diverse API toolbox research.
HTTP 1.1 / 2 / 3
Now we get into the more well-known (showcased and talked about) API toolbox territory with HTTP 1.1 APIs, and what we are seeing evolve with HTTP/2 and now HTTP/3. HTTP 1.1 and HTTP/2 are dependent on TCP, while HTTP/3 runs over QUIC on top of UDP, but the philosophies, methodologies, and other belief systems listed below live primarily in the HTTP layer of this global network we’ve evolved over the last fifty years. This is the portion of the toolbox we have been hyper focused on in the API sector for over a decade, but as the space matures I think we are being forced to realize that there is a bigger world out there, and there are many approaches we should be considering beyond just HTTP and REST.
- Electronic Data Interchange (EDI) - Electronic Data Interchange (EDI) is the electronic interchange of business information using a standardized format; a process which allows one company to send information to another company electronically rather than with paper. Business entities conducting business electronically are called trading partners.
- Simple Object Access Protocol (SOAP) - SOAP is a message protocol that allows distributed elements of an application to communicate. SOAP can be carried over a variety of lower-level protocols, including the web-related Hypertext Transfer Protocol (HTTP).
- Remote Procedure Call (RPC) - RPC is a remote procedure call protocol which uses XML or JSON to encode its calls and HTTP as a transport mechanism to deliver APIs that might be resource based, but usually are more programmatic in nature.
- Representational State Transfer (REST) - Representational state transfer (REST) is a software architectural style that defines a set of constraints to be used for creating Web services. Web services that conform to the REST architectural style, called RESTful Web services, provide interoperability between computer systems on the Internet.
- Hypermedia - Hypermedia, an extension of the term hypertext, is a nonlinear medium of information that includes graphics, audio, video, plain text and hyperlinks. This designation contrasts with the broader term multimedia, which may include non-interactive linear presentations as well as hypermedia.
- Server-Sent Events - A server-sent event is when a web page automatically gets updates from a server. This was also possible before, but the web page would have to ask if any updates were available. With server-sent events, the updates come automatically.
- GraphQL - GraphQL is an open-source data query and manipulation language for APIs, and a runtime for fulfilling queries with existing data. It allows clients to define the structure of the data required, and the same structure of the data is returned from the server, therefore preventing excessively large amounts of data from being returned, but this has implications for how effective web caching of query results can be.
- gRPC - gRPC is a modern open source high performance RPC framework that can run in any environment. It can efficiently connect services in and across data centers with pluggable support for load balancing, tracing, health checking and authentication.
This list, minus EDI, is the usual list of tools I include in my API toolbox. However, due to the growth of Websockets and Kafka for more real time applications, and MQTT and AMQP for Internet of Things and other device centered approaches, I have continued to expand my horizons. While I believe strongly that HTTP 1.1 APIs using REST are where most developers should begin their API journey, I believe that all of these approaches have their use and should be a ready-to-go part of our API toolboxes. I will keep fleshing them out, understanding them, and using them within my storytelling until I am able to confidently explain which tool should be applied in each possible scenario.
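Server-Sent Events are one of the simpler items on this list to show in code, because the stream is just lines of text over an HTTP response. Here is a minimal sketch of a parser in Python that turns those lines into events. The function name `parse_sse` and the sample stream are my own, and the sketch ignores the `id` and `retry` fields, so it is a simplification rather than a complete implementation of the event stream format.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from the lines of a text/event-stream body."""
    event_type = ""
    data_lines: List[str] = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event, but only if it carried data.
            if data_lines:
                yield {"event": event_type or "message", "data": "\n".join(data_lines)}
            event_type, data_lines = "", []
            continue
        if line.startswith(":"):
            continue  # lines starting with a colon are comments (often keep-alives)
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        # other fields such as "id" and "retry" are ignored in this sketch


sample_stream = [
    "event: price",
    'data: {"symbol": "ABC"}',
    "",
    ": keep-alive",
    "data: line one",
    "data: line two",
    "",
]
events = list(parse_sse(sample_stream))
```

The sample produces two events: a `price` event carrying a JSON string, and a default `message` event whose two `data` lines are joined with a newline.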
SMTP/POP
As part of this latest wave of research into what I consider to be the wider API economy, I was reminded of how SMTP is still used by many legacy providers when it comes to EDI and SOAP. While this might be a relic of the past, I am still interested in understanding how email is used as the OG messaging system, and how it can either still be applied or at least considered as we evolve our modern approaches to messaging.
- Electronic Data Interchange (EDI) - Electronic Data Interchange (EDI) is the electronic interchange of business information using a standardized format; a process which allows one company to send information to another company electronically rather than with paper. Business entities conducting business electronically are called trading partners.
- Simple Object Access Protocol (SOAP) - SOAP is a message protocol that allows distributed elements of an application to communicate. SOAP can be carried over a variety of lower-level protocols, including the web-related Hypertext Transfer Protocol (HTTP).
While I prefer a dead simple HTTP API for most implementations, the concept of sending and receiving machine readable information using email is still very intriguing to me. I like email in the same way I like spreadsheets, because both are ubiquitous, and are infrastructure items that exist everywhere and are used by the common folk. While I’m not prescribing that we start delivering APIs via email, I do think there are interesting use cases here already in existence, and we should be learning from this approach to messaging when we think about the future of machine readable messaging.
Data Formats
Next, I wanted to look at the data formats being used as part of each API in operation, and how data is passed, serialized, and made sense of in a machine readable way. I think that data serialization is one of the competitive edges that gRPC has over REST, and I think this can also be seen across Kafka adoption with the use of Apache Avro when it comes to moving data around the enterprise. Here is the data formats portion of my API toolbox, helping us standardize how the bits and bytes are moved around between systems, and used to power applications.
- CSV - A CSV is a comma-separated values file, which allows data to be saved in a tabular format. CSVs look like a garden-variety spreadsheet but with a .csv extension. CSV files can be used with almost any spreadsheet program, such as Microsoft Excel or Google Sheets.
- JSON - JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language Standard ECMA-262 3rd Edition - December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.
- XML - Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The World Wide Web Consortium's XML 1.0 Specification of 1998 and several other related specifications—all of them free open standards—define XML.
- Apache Avro - Apache Avro is a data serialization system. Avro provides rich data structures; a compact, fast, binary data format; a container file to store persistent data; remote procedure call (RPC); and simple integration with dynamic languages. Code generation is not required to read or write data files or to use or implement RPC protocols. It is an optional optimization, only worth implementing for statically typed languages.
- Apache Thrift - The Apache Thrift software framework, for scalable cross-language services development, combines a software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.
- Protocol Buffers - Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages.
While JSON still dominates the discussion, I predict that Protobuf, Avro, and Thrift will continue to gain mindshare because of performance. I also think that we will continue to realize what we threw out in the move from XML to JSON, and we will find that some things are still needed. I also demand that CSV always be a first class citizen when possible, because it makes the APIs we deliver more accessible by the business class of consumers, allowing them to put our APIs to work within the spreadsheets they use to get work done each day.
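To show how the same data looks across the text-based formats on this list, here is a small sketch in Python that serializes one record as CSV, JSON, and XML using only the standard library. The record, the function names, and the `api` root element are made up for illustration. The binary formats (Avro, Thrift, Protocol Buffers) need their own libraries and schema definitions, so I have left them out of this sketch.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET
from typing import Dict

# A hypothetical record describing one entry in the toolbox.
record = {"id": "42", "name": "Example API", "protocol": "HTTP"}


def to_csv(row: Dict[str, str]) -> str:
    """Render the record as a header line plus one data line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(row), lineterminator="\n")
    writer.writeheader()
    writer.writerow(row)
    return buffer.getvalue()


def to_json(row: Dict[str, str]) -> str:
    """Render the record as a JSON object."""
    return json.dumps(row)


def to_xml(row: Dict[str, str], root_tag: str = "api") -> str:
    """Render the record as one XML element with a child element per field."""
    root = ET.Element(root_tag)
    for key, value in row.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")
```

The CSV output can be opened directly in a spreadsheet, which is the accessibility argument I make above, while the JSON and XML outputs carry the field names along with every record.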
Specifications
Last, I wanted to include the specifications for defining our APIs, and the schema for our data formats, providing us with machine readable definitions of the API landscape no matter which solution we go with. Much of the expansion of the horizon when it comes to my API toolbox has been because of the work Fran is doing on AsyncAPI, but also with the maturing of OpenAPI, and the increased adoption of Postman collections when it comes to not just defining the surface area of your API, but also how it will be used at runtime. Here are the specifications I focus on when it comes to my API toolbox.
- OpenAPI - The OpenAPI Specification, originally known as the Swagger Specification, is a specification for machine-readable interface files for describing, producing, consuming, and visualizing RESTful web services.
- AsyncAPI - AsyncAPI is an open source initiative that seeks to improve the current state of Event-Driven Architectures (EDA). Its long-term goal is to make working with EDAs as easy as it is to work with REST APIs. That goes from documentation to code generation, from discovery to event management.
- Postman Collection - Postman Collections are executable API descriptions. A Postman Collection lets you group individual requests together and organize them into folders, so that you don't have to search through your history repeatedly. Collections can then be used to mock, document, and test APIs as part of the API life cycle.
- JSON Schema - JSON Schema is a vocabulary that allows you to annotate and validate JSON documents. It describes your existing data formats, provides clear human- and machine-readable documentation, and validates data, which is useful for automated testing and for ensuring the quality of client-submitted data.
These specification formats help describe the surface area of API infrastructure in a machine readable way, allowing the resulting definitions to be shared and used across teams, helping provide a common truth around what an API does and doesn’t do. They can also then be used across the API life cycle to generate mock servers, publish documentation, generate tests, client and server side code, and many other aspects of operating an API. While these specifications haven’t expanded to include every approach listed in this toolbox, that is part of my work here, to help identify where the shadows exist when it comes to adequately mapping out the API landscape.
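As a rough sketch of what "machine readable" buys us, here is a tiny OpenAPI 3.0 style definition written as a Python dictionary, along with a helper that walks it to list the operations it describes. The API itself (a `/tools` path and a `Tool` schema) is hypothetical, and `list_operations` is a name I made up. Real tooling does far more than this, but the idea of generating docs, mocks, or tests starts with reading a definition like this one.

```python
import json
from typing import Any, Dict, List

# A hypothetical, minimal API definition following the OpenAPI 3.0 structure.
openapi_doc: Dict[str, Any] = {
    "openapi": "3.0.3",
    "info": {"title": "Toolbox API (hypothetical)", "version": "0.1.0"},
    "paths": {
        "/tools": {
            "get": {
                "summary": "List tools",
                "responses": {
                    "200": {
                        "description": "A list of tools",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "array",
                                    "items": {"$ref": "#/components/schemas/Tool"},
                                }
                            }
                        },
                    }
                },
            }
        }
    },
    "components": {
        "schemas": {
            "Tool": {
                "type": "object",
                "required": ["name"],
                "properties": {
                    "name": {"type": "string"},
                    "transport": {"type": "string"},
                },
            }
        }
    },
}

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(doc: Dict[str, Any]) -> List[str]:
    """Return a sorted 'METHOD /path' entry for every operation in the definition."""
    return sorted(
        f"{method.upper()} {path}"
        for path, item in doc["paths"].items()
        for method in item
        if method in HTTP_METHODS
    )


# The same definition can be shared between teams as plain JSON.
openapi_json = json.dumps(openapi_doc, indent=2)
```

Here `list_operations(openapi_doc)` returns a single entry, `GET /tools`, and the JSON Schema style `Tool` definition under `components` is what a validator or code generator would pick up next.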
Expanding My API Toolbox For The Next Decade
This list represents my API toolbox going into the next decade. I will be further fleshing it out and adding more of the life cycle tooling like mocks, docs, testing, security, and other essential areas. So the next iteration of this toolbox definition will have more detail about each of the TCP and HTTP approaches, as well as the data formats and specifications, but it will also begin to list out the specific types of tooling that are being used to support each approach. I am looking to understand the maturity of standardized tooling across the HTTP and TCP universe, while also looking to understand where the new opportunities are based upon movements in single areas, which might not be known and applied across all approaches to delivering APIs.
One of the most important aspects of doing this API toolbox revamp goes beyond any particular technical approach and speaks to the wider API economy. I know many of us in the API space think we are the center of the world when it comes to the real world supply chain, and wield terms like the API economy to define the impact we make. However, once you start actually taking an honest look at the number of SOAP APIs still in use, and quantifying the scope of EDI, you begin to realize that we are still just a toddler when it comes to the global economy. Sure, one can argue we are the future, and we are well on our way to redefining the supply chains of the world, but once you size up our world against what is already in place, we have a lot more work ahead of us to not just develop what’s new, but to convert what already exists to actually operate as part of the API economy.