Origin Offload: A measure of CDN efficiency for reducing egress cost

Monique Barbanson

Senior Engineering Manager, Customer Usage Pipeline Systems, Fastly

Peter Teichman

Senior Principal Software Engineer, Fastly

Brad Benvenuti

Director, Engineering - Observability Products, Fastly

Hossein Lotfi

SVP of Engineering, Fastly

July 01, 2024

CDN & Delivery Performance

The world of Content Delivery Networks (CDNs) has long been obsessed with cache hit ratio (CHR), but there are two big misunderstandings that need to be cleared up. The first is that many people underestimate how big an impact seemingly “small” CDN caching improvements can have at their origin. The second is that CHR isn’t actually the best way to measure total offload to a CDN, which is why Fastly is excited to introduce a better measure called Origin Offload. It focuses on origin server efficiency rather than just the number of requests being made.

Origin Offload measures the ratio of bytes served to end users that were cached inside the CDN (not fetched from the origin), over total bytes served to end users for the service. An Origin Offload of 100% means all bytes were served from the CDN.
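As a rough illustration of that definition, here is a minimal sketch of the ratio. The function and parameter names are hypothetical, and the article does not say which counters Fastly uses internally, so treat this as the arithmetic only, assuming you know the bytes delivered to end users and the bytes fetched from the origin for the same period.

```python
GIB = 1024 ** 3


def origin_offload(edge_response_bytes: int, origin_fetch_bytes: int) -> float:
    """Share of bytes delivered to end users that did not come from the origin.

    Both parameter names are hypothetical, chosen for this illustration.
    """
    if edge_response_bytes <= 0:
        raise ValueError("edge_response_bytes must be positive")
    cached_bytes = edge_response_bytes - origin_fetch_bytes
    return cached_bytes / edge_response_bytes


# 1 TiB delivered to end users, of which 50 GiB had to be fetched from the origin.
print(f"{origin_offload(1024 * GIB, 50 * GIB):.1%}")  # 95.1%
```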

A common misunderstanding of origin offload

If you like classic mathematical puzzles, one way to understand this is through the potato paradox. Applied to CDNs, CHR, and Origin Offload, the point is that many people get distracted by relatively high CHR percentages and think they’re doing a good job. They don’t realize that they’re missing massive opportunities for cost savings through egress reduction, and other savings through overall traffic reduction.

The cache hit ratio is the percentage of requests a CDN can serve from its cache rather than fetching them from the origin server. Even for those building CDNs, the CHR can be confusing. If your CHR increases from 90% to 95%, it’s not merely a 5% improvement; it actually halves your origin load.

To understand why a small increase in CHR can significantly reduce origin load, think about the miss rate. When the CHR improves from 90% to 95%, the miss rate drops from 10% to 5%. This means the miss rate is now halved. Since the requests that miss the cache must be fetched from the origin, halving the miss rate effectively halves the load on the origin.
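The miss-rate reasoning above can be checked with a few lines of arithmetic. This is a small sketch with a hypothetical helper name. It assumes origin load is proportional to the number of cache misses, which is the same simplification the paragraph makes.

```python
def origin_load_reduction(chr_before: float, chr_after: float) -> float:
    """Fractional drop in origin requests when CHR moves from chr_before to chr_after."""
    miss_before = 1.0 - chr_before
    miss_after = 1.0 - chr_after
    if miss_before <= 0:
        raise ValueError("chr_before must be below 1.0")
    return 1.0 - miss_after / miss_before


for before, after in [(0.90, 0.95), (0.95, 0.99), (0.50, 0.55)]:
    reduction = origin_load_reduction(before, after)
    print(f"{before:.0%} -> {after:.0%}: origin requests drop by {reduction:.0%}")
```

The same five-point gain is worth far more at the high end: going from 50% to 55% removes only a tenth of the origin requests, while going from 90% to 95% removes half of them.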

Understanding the limitations of Cache Hit Ratio

Request-Based Calculation: CHR measures requests, not data size. If your objects vary greatly in size, CHR won’t accurately represent the origin load. For instance, a single large file and many small files might have the same CHR but very different impacts on the origin.
CDN Internal States: CDNs have complex internal mechanisms that can skew CHR, e.g. shielding, restarts inside VCL logic, image optimization, and segmented caching. Take shielding as an example: a feature that reduces origin traffic by caching content at an intermediate layer can make CHR misleading. If a request misses at the edge but is served from the shield cache, the classical CHR calculation shows a 50% CHR (shield hit / (edge miss + shield hit)). In reality, the request was fully served by the CDN and was not fetched from the origin.
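Both limitations can be reproduced with toy numbers. The sketch below is illustrative only: the `Request` record, the helper names, and the traffic mix are all made up for this example, and the byte-based ratio is a simplified stand-in for Origin Offload rather than Fastly's actual computation.

```python
from dataclasses import dataclass


@dataclass
class Request:
    size_bytes: int          # bytes delivered to the end user
    served_from_cache: bool  # True if no origin fetch was needed


def cache_hit_ratio(requests: list[Request]) -> float:
    """Request-based: every request counts the same, whatever its size."""
    hits = sum(1 for r in requests if r.served_from_cache)
    return hits / len(requests)


def byte_offload(requests: list[Request]) -> float:
    """Byte-based: each request is weighted by the bytes it delivered."""
    total = sum(r.size_bytes for r in requests)
    cached = sum(r.size_bytes for r in requests if r.served_from_cache)
    return cached / total


def classical_chr(hits: int, misses: int) -> float:
    """CHR computed from raw hit and miss counters summed across cache layers."""
    return hits / (hits + misses)


MIB = 1024 ** 2

# Limitation 1: 99 small cached objects (10 KiB each) and one 1 GiB download that misses.
traffic = [Request(10 * 1024, True)] * 99 + [Request(1024 * MIB, False)]
print(f"CHR: {cache_hit_ratio(traffic):.1%}")        # 99.0%
print(f"Byte offload: {byte_offload(traffic):.2%}")  # 0.09%

# Limitation 2: one end-user request that misses at the edge and hits at the shield.
# The counters record one miss and one hit, yet nothing was fetched from the origin.
print(f"Classical CHR: {classical_chr(hits=1, misses=1):.0%}")               # 50%
print(f"Byte offload: {byte_offload([Request(5 * MIB, True)]):.0%}")  # 100%
```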

Here’s a graph that shows a customer disabling shielding (as part of an early experiment during onboarding). The graphs show that shielding was in fact significantly reducing their origin traffic (from just above 1.6GiB/s with shielding to 20+GiB/s steady state without shielding), yet the classical CHR did not reflect this improvement accurately:

Edge Traffic:

As you can see in the example above, with shielding, CHR peaks in the low 90% range, while origin load is capped well below 5 GiB/s. Without shielding, after an initial impact to CHR and origin load, CHR recovers to the low 90% range while the origin load stabilizes in the low 20 GiB/s range, i.e., 4x the origin load with shielding.

This case study shows that CHR doesn’t adequately capture the state of origin load when shielding is at play.

Introducing Origin Offload

For many customers, egress traffic (the amount of data transferred out of their origin infrastructure) is a primary cost driver. Egress is particularly a huge cost element for video and audio streaming customers as well as large download providers. To provide our customers with a better view of this, we introduced a new metric: Origin Offload.

Here’s a graph showing the same customer, now with the Origin Offload metric, clearly showing the impact of disabling shielding.

The new origin_offload metric is available from the Fastly Historical API as well as the Real Time API. The endpoints you are familiar with (see links for sample code) will now include the origin_offload value along with existing metrics in their response.
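Once you have a stats response in hand, pulling the metric out is straightforward. The sketch below does not call the API. It parses a made-up sample payload, and the payload shape is an assumption: a top-level `data` list of per-period records, each with a `start_time` in epoch seconds and an `origin_offload` ratio between 0 and 1. Check the API reference for the exact shape returned by the endpoint you use.

```python
import json
from datetime import datetime, timezone

# Made-up sample payload; the shape is assumed, not taken from the API reference.
SAMPLE_RESPONSE = """
{
  "status": "success",
  "data": [
    {"start_time": 1719792000, "requests": 120000, "origin_offload": 0.97},
    {"start_time": 1719878400, "requests": 118500, "origin_offload": 0.95}
  ]
}
"""


def origin_offload_series(payload: dict) -> list[tuple[datetime, float]]:
    """Return (period start, origin_offload) pairs, skipping records without the field."""
    series = []
    for record in payload.get("data", []):
        value = record.get("origin_offload")
        if value is None:
            continue
        start = datetime.fromtimestamp(record["start_time"], tz=timezone.utc)
        series.append((start, float(value)))
    return series


payload = json.loads(SAMPLE_RESPONSE)
for start, value in origin_offload_series(payload):
    print(f"{start:%Y-%m-%d} origin_offload={value:.1%}")
```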

You can also find the graph in the Fastly UI:

When thinking about the capacity of your origin infrastructure, requests and egress bytes are the two key components of capacity planning. The cache hit ratio is vital to the performance of your origin as it directly influences the efficiency of your content delivery and your system’s load. A higher cache hit ratio indicates that more requests are being served from our edge caches rather than your origin server, significantly reducing latency and improving your origin’s response times. On the other hand, Origin Offload is critical to your origin server's efficiency as it reduces the volume of traffic that your origin server must process and the network capacity required to deliver that traffic to our CDN.

By leveraging Fastly edge caches to serve static and frequently accessed content, you can lower your origin egress cost and free up request processing capacity on your origin for handling dynamic content.

This model of load distribution keeps your origin server from degrading your users’ experience and increases overall system resilience and reliability. As an added bonus, a high origin offload reduces the infrastructure and operational costs associated with scaling your origin to handle peak loads. A high origin offload is essential for maintaining efficient resource utilization and ensuring an uninterrupted experience for your end users. Furthermore, a high cache hit ratio reduces operational stress on your origin infrastructure, minimizing the opportunities for bottlenecks and server outages.

How to improve your origin offload and CHR

Cache hit ratio is crucial, but it should be supplemented with metrics like Origin Offload to fully understand CDN performance and its impact on origin load, so that you can make informed decisions and reduce costs. Fastly’s edge cloud platform offers more than just a standard CDN: a modern network with powerful, strategically placed, software-defined POPs. If you want to improve performance and speed while cutting costs, let’s talk!
