Service chaining

When one Fastly service is configured to be the backend for another (different) Fastly service, this setup is known as service chaining. This is conceptually similar to shielding except that instead of being processed by the same service configuration in multiple POPs, requests are processed by multiple service configurations in the same POP.

[Figure: Service chaining illustration]

Scenarios involving service chaining are complex, especially when combined with clustering and shielding. Simpler solutions are often appropriate, but service chaining remains a useful tool in a variety of cases:

  • Routing: Where you want to choose which Fastly service handles a request based on something other than the domain, a service chain lets the first service select the correct service based on additional request characteristics (e.g., URL path prefix or cookies).
  • Segmenting developer access: Where different elements of your Fastly configuration need to be accessible to different groups of engineers, a service chain allows each service to have different permissions.
  • Cache sharing: Each Fastly service has a separate cache. Service chaining may be useful if you want requests that are handled by different services to have access to the same cache.
  • ESI generation: Where Edge Side Includes tags are used in a response generated at the edge using synthetic, the ESI tags are not parsed and processed within that service. Placing another service downstream allows the ESI directives in edge-generated content to be processed at the edge.
  • Compute/VCL handoff: Customers with existing VCL services that are migrating to the Compute platform, or who want to use features of both platforms, may choose to put one type of service in a chain with the other type.
  • Internal redirection: If a request is received on a domain belonging to service A, but should be redirected to a domain belonging to service B, chaining to service B is an alternative to instructing the client to redirect their request.
  • Unintentional: Some third-party services that are intended to be deployed as part of your application architecture are themselves Fastly customers. If you are also a Fastly customer, you might unknowingly end up chaining your Fastly service to another Fastly service in a different customer account. This is unlikely to be a problem unless it triggers loop detection.

Where services are chained, the service that initially handles a request from an end user is the first service. When this service passes traffic to another service, that's the second service within the service chain. The second service will normally then pass the request to a customer origin, but may also (rarely) chain to a third service.

Enabling service chaining

Fastly selects the service that will process a request by using the Host header. The service that should receive public traffic (the 'first' service) and the service that should receive requests from the first one (the 'second' service) need to be assigned different, publicly resolvable domains. For example, the first service may be handling the public domain www.example.com and chain to a service attached to chained-service.example.com.

To make one service the backend for another, follow these steps:

  1. Configure your DNS settings to route traffic on both domains to Fastly.
  2. In the first service:
    • Add a domain using the hostname you want to use for public traffic (e.g., www.example.com).
    • Add a backend using the hostname of the second service (e.g., chained-service.example.com).
    • Configure the backend with a host header override and set the TLS certificate hostnames, all of which should be set to the same hostname you used to define the backend (e.g., chained-service.example.com). Learn more in the backends integration guide.
  3. In the second service:
    • Add a domain using the hostname that you have assigned (e.g., chained-service.example.com).
    • Add a backend using the hostname of your own origin infrastructure (e.g., aws-elb-2.example.com).
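
If the first service's backend is declared in VCL rather than through the web interface or API, step 2 might look roughly like the following. This is a minimal sketch rather than a definitive configuration: the backend name F_chained_service is hypothetical, and the hostname is the example one used in the steps above.

```vcl
# First service: backend pointing at the second (chained) service.
# F_chained_service is a hypothetical name chosen for this sketch.
backend F_chained_service {
  .host = "chained-service.example.com";
  .port = "443";
  .ssl = true;
  # TLS certificate and SNI hostnames match the backend hostname
  .ssl_cert_hostname = "chained-service.example.com";
  .ssl_sni_hostname = "chained-service.example.com";
  # Host header override, so that Fastly selects the second service
  .always_use_host_header = true;
  .host_header = "chained-service.example.com";
}
```

All three hostname settings carry the same value as .host, which is what the steps above ask for.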

A single service can be chained to more than one other service. Likewise, a single service can act as the backend in more than one service chain. In fact, most service chaining scenarios involve a many-to-one relationship between services.
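
As an illustration of one service chained to more than one other service, the first service could choose between two chained services based on the URL path prefix. This is a sketch under assumptions: F_blog_service and F_shop_service are hypothetical backends, each assumed to be configured with the host header override described in the steps above, and the path prefixes are invented for the example.

```vcl
sub vcl_recv {
#FASTLY recv

  # Route by URL path prefix (hypothetical prefixes and backend names).
  # Requests matching neither prefix keep the service's default backend.
  if (req.url.path ~ "^/blog/") {
    set req.backend = F_blog_service;
  } else if (req.url.path ~ "^/shop/") {
    set req.backend = F_shop_service;
  }
}
```

The selection is placed after the #FASTLY recv macro so that it is not overwritten by the backend assignment Fastly generates there.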

Service chaining and shielding

When a request is made from the first service to the second one, Fastly routes the request internally within the cache server, so there is no network latency (unless local route bypass is enabled). This means that the second service executes in the same physical location as the first service. While service chaining lets you compose distinct packages of edge logic, it does not consolidate requests into a single POP. Focusing all requests onto a single POP requires shielding alongside service chaining.

The routing of requests when using both chaining and shielding is complex and multiple strategies are possible.

The most straightforward way to combine shielding with service chaining is to enable shielding within the second service's configuration. This creates a chain-first strategy, in which the request will move from the first to the second service within the edge POP, and then (if the shield location is different from the edge POP for that request) the request will be transferred to the shield POP and processed by the second service again.

Alternatively, enabling shielding on the first service instead of the second one creates a shield-first strategy, in which the request is processed by the first service in the edge POP and then transferred to the shield POP, where it is processed again by the first service and then by the second service.

These strategies can be visualized like this:

[Figure: Shielding configurations]

The effects of choosing one of these strategies over the other include:

  • Cache performance: A chain-first strategy may increase the probability of a cache hit at the first Fastly POP, if the two services are configured to cache different resources.
  • Cost: In most cases Fastly services are billed based on Fastly egress, so the two strategies may result in different billing implications for your Fastly services.
  • Fragmentation: A shield-first strategy results in fewer copies of cached objects since the second service runs in only one POP.

Chaining with shielding in Fastly Compute

Currently, the Compute platform does not support shielding, making service chaining a useful mechanism for adding shielding to Compute services. Because the shielding is performed in the VCL service, the Compute service must be the second service for a shield-first strategy and the first service for a chain-first strategy.

Preventing direct access to chained services

While the first service is intended to receive direct traffic from end user clients, there is (by default) nothing to stop end users making requests directly to the second service as well (if they know the hostname you have assigned as the second service's domain). However, you may want the second service to only accept traffic from the first service.

A simple but relatively insecure way to do this is to set a shared secret in a custom header in the first service and check that it is present in the second one:

First service, in vcl_recv:

  set req.http.Edge-Auth = "some-pre-shared-secret-string";

Second service, in vcl_recv:

  if (req.http.Edge-Auth != "some-pre-shared-secret-string") {
    error 403;
  }

While this technique will prevent clients from inadvertently accessing the second service, it is possible for the client to intentionally set the necessary header in order to masquerade as a request forwarded from your first Fastly service. Secrets that are constants can also be easily leaked if a request is ever forwarded to the wrong host.

For a secure solution, construct a one-time, time-limited signature in the first service, and verify it in the second service.

First service, in vcl_miss (and likewise in vcl_pass):

  declare local var.edge_auth_secret_id STRING;
  declare local var.edge_auth_secret STRING;
  # Consider using an edge dictionary for these
  set var.edge_auth_secret_id = table.lookup(config, "edge_auth_secret_id");
  set var.edge_auth_secret = table.lookup(config, var.edge_auth_secret_id);
  # Should be called in both vcl_miss and vcl_pass subroutines
  if (!bereq.http.Edge-Auth) {
    declare local var.data STRING;
    set var.data = strftime({"%s"}, now) + "," + server.datacenter;
    set bereq.http.Edge-Auth = var.edge_auth_secret_id + "," + var.data + "," + digest.hmac_sha256(var.edge_auth_secret, var.data);
  }

Second service, in vcl_recv:

  declare local var.edge_auth_secret STRING;
  if (req.http.Edge-Auth ~ "^([0-9]+),(([0-9]+),[^,]+),(0x[0-9a-f]{64})$") {
    set var.edge_auth_secret = table.lookup(config, re.group.1);
    if (digest.secure_is_equal(digest.hmac_sha256(var.edge_auth_secret, re.group.2), re.group.4)) {
      declare local var.time TIME;
      set var.time = std.time(re.group.3, std.integer2time(-1));
      # Verify the timestamp is not off by more than 2 seconds
      if (!(time.is_after(var.time, time.sub(now, 2s)) && time.is_after(time.add(now, 2s), var.time))) {
        error 403; # Expired
      }
    } else {
      error 403; # Incorrect sig
    }
  } else {
    error 403; # Invalid or missing auth
  }

It's a good idea to store the secret outside your VCL, in an edge dictionary. In the example above, the code assumes the existence of a dictionary called config, with an item called edge_auth_secret_id, which contains the key for another config item that contains the string you want to use as the HMAC secret to construct and verify the signature. When you need to rotate keys, add a second secret to the dictionary, and then change the value of edge_auth_secret_id to target the new secret.
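
For illustration, the dictionary might hold contents like these partway through a key rotation. The secret values are placeholders, and an edge dictionary would typically be managed through the API or web interface rather than declared inline; the inline table is only a way to show the shape of the data. The verification regex in the second service accepts only numeric secret IDs, so the keys for the secret items are numbers here.

```vcl
# "edge_auth_secret_id" names the secret currently used for signing.
# "1" is the previous secret, kept while the second service may still
# receive signatures made with it; "2" is the newly added secret.
# All values are placeholders for this sketch.
table config {
  "edge_auth_secret_id": "2",
  "1": "placeholder-old-secret",
  "2": "placeholder-new-secret"
}
```

Each service has its own dictionary, so both services need the secret items. Only the first service reads edge_auth_secret_id, because the second service takes the ID from the Edge-Auth header.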

Advanced chaining

Bypassing local routing

By default, Fastly cache servers will handle any request made by a Fastly service to a backend that is also a Fastly service by internally routing within the same machine, except for shielding requests (which target a specific POP).

This is normally a good approach, because it means the handoff from one service to another incurs no latency. However, in some situations, it may be preferable to resolve the second service's domain publicly and route to the resulting cache server. For example, hot spots may arise where clustering in the first service focuses requests on one server. While this is normal clustering behavior that increases cache efficiency, requests passed from the first service to a second one will end up distributed across the available servers in the POP based on the distribution of objects in the cache, rather than using our normal load balancing strategy. This is rarely a problem, but if it degrades the performance of your service, consider bypassing local routing.

To enable local route bypass, set .bypass_local_route_table = true in the backend declaration in VCL. For example:

backend F_example_com {
  .bypass_local_route_table = true;              # <-- Local routing bypass
  .always_use_host_header = true;                # <-- Override host
  .host_header = "chained-service.example.com";  # <-- Override host
  .host = "chained-service.example.com";
  .port = "443";
  # ... other normal backend properties ...
}

IMPORTANT: Local route bypass is a protected feature which must be explicitly allowed on your service by a Fastly employee before the route bypass setting will take effect. Contact Fastly support to make a request.

The bypass_local_route_table option is not available in the web interface or API, so backends that require this feature must be defined in VCL. Since the backend will be defined in VCL, take care to ensure that it also has the always_use_host_header and host_header options set, which implement the host header override required for service chaining and would otherwise be set using the API or web interface as part of a standard service chaining setup.

NOTE: Local route bypass was previously required when chaining a Compute service with a VCL service (in either direction), but this is no longer the case.

Fastly Compute to VCL chaining

Compute services currently offer a fetch API that performs a backend fetch through the Fastly edge cache, and stores cacheable responses. There is no way to adjust cache rules for objects received from a backend before they are inserted into the cache within a Compute service. As a result, if you need to process received objects before caching them, or to set custom cache TTLs, a solution is to place a VCL service in a chain with a Compute one.

In this scenario it is usually advisable to configure the Compute service to pass all backend fetches, ignoring the cache within the Compute service in order to delegate caching concerns to the VCL service.

Setting the Compute request to "pass" will normally provide the desired behavior, but in edge cases it may be necessary to configure the Compute service to skip the cache layer entirely. This behavior can only be enabled with a flag by a Fastly employee, and can be requested by contacting Fastly support.

Chaining more than two services

In general, there should rarely be any reason to chain more than two Fastly services together. Currently, the only use case for which we encourage this pattern is a VCL to Compute to VCL "sandwich", in order to make use of features of both platforms, with VCL on both sides of the Compute logic.

Purging caches

When services are chained, a request to origin might result in objects being saved into caches belonging to multiple services. One way to avoid the challenges created by this is to ensure that only one of the services participating in the chain uses the cache.

However, if multiple services within the chain are using cache, then each service must be purged individually in order to entirely remove the object from the edge cache, and they must be purged in downstream order, i.e. from the origin to the client. This is because if the first service in a chain purges an object, and then receives a request for that resource, it is possible that it will pull a cached version of the object from the second service in the chain, and write it to its own cache as a fresh entry.

If any of the services in the chain has a unique way of calculating the cache key (vcl_hash in VCL services) that is not present in the other services in the chain, or if any service in the chain manipulates the request in a way that would change its cache key before passing it on to the next service, then the object to be purged may not be labelled in the same way in each cache. Consider using purge all or surrogate keys to make it easier to refer to objects in a consistent way regardless of which service they are being purged from.

Troubleshooting

Service chaining can be a complex solution. Here are some of the more common problems that can arise:

Loop detection

When Fastly receives a request that we believe is going around in a loop, we will terminate processing of the request and return a 503 (service unavailable) error with the response reason text "Loop detected". If you receive this error, follow these steps to resolve the problem:

  1. Ensure that the first service is overriding the Host header (usually by configuring a host header override on the backend definition). If the first service does not change the Host header before forwarding the request to the second service, then the request will be processed by the first service again, because both services' domains resolve to Fastly.

  2. Check that the number of unique services and the number of Fastly hops in the service chain are each below the limits enforced by Fastly:

    • Max hops: 20
    • Unique services: 6

    HINT: A hop is counted when a cache server starts to process a request for a service, so for example, a single VCL service with clustering and shielding enabled may experience up to 4 hops before the request is forwarded to origin.
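
The Host override in step 1 is usually configured on the backend definition. If it has to be done in VCL instead, one possible sketch sets the header on the backend request in both vcl_miss and vcl_pass. It assumes the example hostname chained-service.example.com, and it skips shield-bound requests on the assumption that those should keep their original Host header so they reach the same service in the shield POP.

```vcl
sub vcl_miss {
#FASTLY miss
  # Make the forwarded request select the second service,
  # but leave requests to a shield POP untouched.
  if (!req.backend.is_shield) {
    set bereq.http.Host = "chained-service.example.com";
  }
}

sub vcl_pass {
#FASTLY pass
  if (!req.backend.is_shield) {
    set bereq.http.Host = "chained-service.example.com";
  }
}
```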

Transiting more than two POPs

Shielding is intended to consolidate requests into a POP close to your infrastructure, and often hugely improves performance, but there is typically no benefit to passing a request through more than two Fastly POPs. If a response shows three or more POP identifiers in the X-Served-By or Fastly-Debug-Path headers, and you are using service chaining, then it may be that the chained services are both configured to shield, but in different locations.

In general, enable shielding on only one of the services in a service chain. If you want to enable shielding on all of them (perhaps because the second service also receives direct end-user traffic) then make sure that all the shielding configurations are set to the same shield POP.

Cached content appearing despite purging

In chained services, cached content can be hard to remove, because it is saved into multiple independent caches. If you still see cached content and no origin traffic even after purging all the services in a service chain, ensure that the services are purged in the correct order. See the "Purging caches" section for more details.