API Caching, Part II
In Part I, we covered the basics of using Fastly to accelerate a comments API. Using Instant Purge™, we hooked into model callbacks to ensure that the appropriate content was purged from the cache whenever data changed. In this article, we’ll build upon the original approach and use one of Fastly’s more advanced features: cache control.
Cache Control Headers
Recall that in our original example, we made an assumption about what types of requests would and would not be cached. Again, the basic rule was:
GET requests are cached. PUT, POST, and DELETE requests are not.
From the perspective of an ideal REST API, this rule will suffice. Unfortunately, most APIs are far from ideal, not to mention that certain requests, regardless of the particular method, may need special caching rules. Enter the Cache-Control header (stage left). This header is used to instruct user agents (e.g., web browsers) how to handle the caching of server responses.
Cache-Control, as defined by the HTTP Specification (RFC 2616), has many options to allow for the appropriate handling of cached data. Instead of reciting some elephantine doctrine, let’s just take a look at some examples:
Cache-Control: no-cache
Cache-Control: private
Cache-Control: max-age=86400
In the first example, we set the Cache-Control header to no-cache, which tells the user agent that it must not reuse a stored copy of the response without first revalidating it with the server (to forbid storing the response at all, the directive is no-store). The second, private, dictates that the information is specific to a single user and should not be stored by shared caches on behalf of other users. Finally, the third, max-age=86400, tells the agent that the response can be cached and stays fresh for 86,400 seconds (one day); because it is not marked private, shared caches may store it as well.
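To make the three examples above concrete, here is a minimal Python sketch that splits a Cache-Control value into its directives and checks whether it grants an explicit freshness lifetime. The function names parse_cache_control and has_explicit_freshness are hypothetical, and the parser is deliberately simplified: it assumes no quoted arguments that contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into a {directive: argument} mapping.

    Simplification: splits on every comma, so quoted arguments that
    themselves contain commas are not handled.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        name, sep, arg = part.strip().partition("=")
        name = name.strip().lower()
        if not name:
            continue
        # Directives without "=" (no-cache, private) carry no argument.
        directives[name] = arg.strip().strip('"') if sep else None
    return directives


def has_explicit_freshness(value: str) -> bool:
    """True only if the header lets a cache reuse the response without
    revalidation for a stated, positive number of seconds."""
    directives = parse_cache_control(value)
    if "no-cache" in directives or "no-store" in directives:
        return False
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and int(max_age) > 0


for header in ("no-cache", "private", "max-age=86400"):
    print(header, parse_cache_control(header), has_explicit_freshness(header))
```

Only the third header grants an explicit freshness lifetime; the first forces revalidation, and the second restricts who may store the response without saying for how long.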
Fastly, by default, will respect the Cache-Control headers if provided, but there is another, more proxy-specific header that can be used: Surrogate-Control. Surrogate-Control headers work in much the same way as Cache-Control headers, but are used to give directions specifically to reverse proxy caches like Fastly.
Example: Using Control Headers in Tandem
Casting aside the theoretical for a moment, let’s look at a concrete way that these headers might be used in our original example, the comments API. Recall the API endpoint that provided a list of comments for a given article:
GET /article/:article_id/comments
Whenever a user submits a comment for a given article, the response from this endpoint will be purged from the Fastly cache by the comment model. It is fair to say that users comment in a fairly non-deterministic way, and because of this, it’s pretty hard for anyone to predict when the content will change. Thus, we’d like to ensure the following:
If the content doesn’t change, it should stay cached in Fastly for a reasonable amount of time.
If the content does change, it should not be cached by the browser longer than it needs to be.
In other words, we want to ensure that API responses will reach clients in a timely manner, but we also want to ensure that clients always have the most up-to-date information. The first constraint can be solved by using the Surrogate-Control header and the second by the Cache-Control header, like so:
Surrogate-Control: max-age=86400
Cache-Control: no-cache
When the API returns these headers, the Fastly cache will know it is allowed to cache the content for up to one day, and the browser will know not to reuse a stored copy without first checking back with the source of truth (in this case, the Fastly cache).
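As a rough illustration of returning both headers from the comment list endpoint, here is a minimal sketch written as a plain Python WSGI application. The framework choice, the name comments_app, and the placeholder JSON body are assumptions made for this example and are not taken from the original API.

```python
ONE_DAY = 86400  # seconds, matching the max-age used above


def comments_app(environ, start_response):
    """Hypothetical handler for GET /article/:article_id/comments."""
    body = b'{"comments": []}'  # placeholder payload, not real data
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Addressed to the reverse proxy: keep the response for up to a day.
        ("Surrogate-Control", "max-age=%d" % ONE_DAY),
        # Addressed to the browser: revalidate before reusing a stored copy.
        ("Cache-Control", "no-cache"),
    ]
    start_response("200 OK", headers)
    return [body]
```

Any WSGI server can host this callable; the point is only that the two headers travel together on the same response, each addressed to a different cache.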
Piecewise Integration
Many of the APIs that could benefit the most from Fastly are mature, large, and relatively complex software systems. Attempting a wholesale migration of such an API is daunting and can place heavy burdens on even the most seasoned of engineering teams. Luckily, there is an easy-to-follow and effective strategy for integrating such APIs with Fastly.
A piecewise integration strategy breaks the task of migrating the entire API down to a series of single-endpoint migrations while simultaneously ensuring the validity of the API as a whole. By following this method, even the largest APIs can be effectively accelerated via Fastly and the engineering tasks can be made more manageable.
The process itself can be expressed in three steps.
Step 1: Prepare the API
In order to ensure that the API behaves as we need it to during the piecewise migration, we must have every endpoint in our API return a specific control header:
Cache-Control: private
The above control header tells Fastly that a request to any endpoint on the API will bypass the cache and be sent directly to origin. This will allow us to serve the API via Fastly and have it work exactly as expected.
Note: Modern web frameworks often allow for blanket rules to be set in place and be overridden by specific endpoints (for example, via the use of middlewares). Depending on how the API has been implemented, this step might be as simple as adding a single line of code.
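One way such a blanket rule might look is a small piece of WSGI middleware, sketched below in Python. The class name DefaultCacheControl is hypothetical, and the sketch assumes that an endpoint overrides the default simply by setting its own Cache-Control header.

```python
class DefaultCacheControl:
    """WSGI middleware that adds a default Cache-Control header to any
    response whose endpoint did not set one itself."""

    def __init__(self, app, default="private"):
        self.app = app
        self.default = default

    def __call__(self, environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            names = {name.lower() for name, _ in headers}
            if "cache-control" not in names:
                # Endpoint has not been migrated yet: apply the blanket rule.
                headers = list(headers) + [("Cache-Control", self.default)]
            return start_response(status, headers, exc_info)

        return self.app(environ, patched_start_response)
```

Wrapping the existing application once, as in DefaultCacheControl(app), leaves every endpoint uncached until it opts in by sending its own header.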
Step 2: Serve Traffic via Fastly
For the next step, we configure a Fastly service to serve the API’s traffic. Once the switch has been made, there will be an immediate speed improvement. This happens because Fastly cache servers keep long-lasting connections to the API’s origin servers, which reduces the latency overhead of establishing multiple TCP connections.
Step 3: Systematically Migrate Individual Endpoints
Finally, we modify the API by implementing Instant Purge™ caching for each cacheable endpoint, one endpoint at a time. The order in which this is done depends on the API, but by targeting the slowest endpoints first, we can achieve dramatic improvements for endpoints that need them the most. Because each endpoint can be done independently, the engineering process becomes more fluid and easier to manage.
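The endpoint-by-endpoint migration can be pictured as a small policy table that grows over time. The Python sketch below is an assumption-laden illustration: the MIGRATED table, the route pattern, and the function cache_headers_for are hypothetical names, and only the comment list endpoint from earlier has been moved over.

```python
import re

# Blanket rule from Step 1: anything not listed below bypasses the cache.
DEFAULT_HEADERS = {"Cache-Control": "private"}

# Endpoints migrated so far; add one entry per endpoint as work proceeds.
MIGRATED = [
    (
        re.compile(r"^/article/[^/]+/comments$"),
        {"Surrogate-Control": "max-age=86400", "Cache-Control": "no-cache"},
    ),
]


def cache_headers_for(method, path):
    """Return the control headers to send for a given request."""
    # Only GET responses are candidates for caching.
    if method == "GET":
        for pattern, headers in MIGRATED:
            if pattern.match(path):
                return dict(headers)
    return dict(DEFAULT_HEADERS)
```

Because unmigrated routes fall through to the private default, each new table entry changes the behavior of exactly one endpoint and leaves the rest of the API as it was.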
But wait, there’s more…
Cache control is an important part of API caching. By using the control headers correctly, an API can leverage Fastly to get great performance all while ensuring that the data is correct and up-to-date.
But there’s one final piece of the puzzle that brings it all together: purging related items via surrogate keys. Keep an eye out for our third and final API caching article, where we dive into key-based purging.
Read more: Part III