Thinking About Why We Rate Limit Our APIs

I am helping a client think through their API management solution at the moment, so I’m working through all the moving parts of how, and why of API management solutions. The API management landscape has shifted since the last time I helped a small company navigate the process of getting up and running, so I wanted to work through each aspect and think critically before I make any recommendations. My client has a content API, which isn’t very complex, but possesses some pretty valuable data they’ve aggregated, curated, and are looking to make available via a simple web API. It is pretty clear that all developers will need a key to be access the API, but I wanted to pause for a moment and think more about API rate limiting.

Why do we rate limit? The primary reason is to help manage the compute resources available for all API consumers. You don’t want any single user hitting the server too hard, and taking things down for everyone else. I’d say after that, the next major reason is to enforce API access tiers, and ensure API consumers are only consuming what they should be. Which both seem like pretty dated concepts, that might need re-evaluation in general, but also in the context of this particular project. There is no free access to this API. I believe there will be a public account for test driving (making very limited # of calls), and some that drive their embeddable strategy, but for access to the majority of content, developers will have to register for a key, and provide a credit card to pay for their consumption. Which leaves me with the question, should we be rate limiting at all?

If users are paying for whatever they consume, and there is a credit card on file, do we want to rate limit? Why are we so worried about server capacity in a cloud world? It seems like rate limiting is a legacy constraint, that has continue to live on unquestioned, and even propped up by accounting and business decisions over simple technical ones. API access tiers with varying rate limits are sometime imposed as part of identity and access control, limiting what new users have access to, but often times they are used to corral and route users into specific, and measurable account plans, that help startups predict and articulate revenue to investors. I know many of my friends disagree with my thoughts on this, but I feel this accounting decision behind rate limiting are hurting the bottom line, more than they are helping. If you are focused on your API being the product it is hurting it, if you are focused on your API consumers being your product, then you are helping it.

My client in question is looking to build an actual business that sells a product to customers, without an exit strategy, so I want to do my best to help them understand how they can reduce technical and business complexity, while maximizing revenue around the API services they are offering. If we have the API properly resourced with scalable compute, load-balancing, monitoring, checks and balances. Then we also have a verified credit card on file for each API key holder. Why do we want to rate limit? It seems like it is an unnecessary complexity for API consumers to have to wrestle with. Let’s just allow them to register, make API calls, measure, and bill accordingly. Amazon provides a clear precedent for how this works, and from my experience I tend to spend more on my AWS bill then I do with services I use which keep me in tiered access plans. I’m not saying tiered access plans don’t have their place, I’m saying we should be questioning their value each time we are constructing them, and not just assuming they should be done my default.

A by-product of noticing how the API management landscape recently is helping me reassess each of the common building blocks of API management, and think more critically about the how and why behind their existence. There is a significant difference between rate limiting and metering API calls, and I don’t think we always have to do both. We still need the ability to turn off keys, and block specific user agents and IP addresses, but in some cases I think rate limiting shouldn’t be part of the API management operations. We have the compute, storage, and databases resources at our disposal to scale as we need to meet demand, and we have the credit card verified, and on file to bill against, let’s just get out of API consumers way. In the case of this particular project I’m working, I think this will be my recommendation. Focusing on reducing the amount of API management overhead, and simplifying the load for API consumers along the way.