Server-sent events with Fastly

Andrew Betts

Principal Developer Advocate, Fastly

If you're looking to implement this today, check out Fastly Fanout! Fanout makes it easy to build and scale real-time, streaming APIs. Want to dig a little deeper? Find out how our customer, Dansons, uses Fastly’s Compute and Fanout to delight grilling enthusiasts with blazing-fast app response speeds.

Originally published September 12, 2017

Server-sent events allow web servers to push real-time event notifications to the browser on a long-lived HTTP response. It's a beautiful part of the web platform that’s severely underused by websites. Whether it’s flight departures, stock prices, or news alerts, with Fastly you can efficiently broadcast events with low latency to thousands or even millions of users simultaneously.

Here’s a simulation I put together of an airport departure board using real-time events:

So that’s pretty cool, and thanks to server-sent events (SSE), very simple. Even in the early days of the web, people saw the value of creating “live” content; server-sent events evolved out of this need and replaced some inventive, but horrifically convoluted solutions.

Things were pretty bad

In 2006, I was running a consulting business that was doing a project for the Financial Times, and we needed to push real-time, low-latency data to the browser for a live chat feature called Markets Live (which still runs every weekday at 11 AM in London). The chat is between two journalists commentating on the markets and needs to be broadcast to everyone watching on the web. At the time there was no mechanism on the web that would do this, and sites would often use “polling,” i.e., issuing an XMLHttpRequest every second to see if there was new content.

Now this is fine, if you're the sort of person who likes orchestrating a denial of service attack against yourself, but I needed something a bit more efficient that would also scale to potentially tens of thousands of concurrent viewers, without prompting 10,000 requests per second to hit my server.

Enter a sackful of hacks that came to be known as comet, a term coined by Alex Russell. Essentially the idea was to dribble an HTTP response out to the browser very slowly, making it look like a file download but actually sending a stream of events. This relied on the browser's ability to render an HTTP response progressively (as it is being downloaded). Even at the time, all browsers supported this, but with a now-hilarious smorgasbord of quirks.

In Firefox, you could use an XHR's interactive event, which triggered each time more data was received (well done Mozilla!). For Opera and Safari, the body of an XHR was not accessible until the response was done, so we had to load the content in an IFRAME and send chunks of something akin to <script>top.receiveData({...});</script>. And then there was Internet Explorer.  

This was the era in which Microsoft thought it was a good idea to emit clicking noises when navigating (which included IFRAMEs) and to keep spinning a huge IE logo in the top right of the browser window until the page had “finished loading.” The solution was an ActiveX control that created an in-memory page, in which you could then in turn load an endless IFRAME. I’m not even kidding about this: it is quite possibly the second-ugliest hack I’ve ever implemented.

The point is, we went to a lot of effort to do something that is easy today, because all this hacking precipitated the development of the server-sent events API.

OK, so now things are mostly great

The principle of using a text-based HTTP response as a streaming transport is actually very simple and fairly easy for most people to set up on the server side. What we needed was a standard way for browsers to fire a JavaScript event every time a chunk of content is received on a slowly-loading response. This is the API that server-sent events provides:

(new EventSource("/stream")).addEventListener("newArticle", e => {
  const data = JSON.parse(e.data);
  console.log("A new article was just published!", data);
});

The server responds to /stream requests (the URL in this example) with a feed of double-newline separated events:

id: 1
event: newArticle
data: {"foo": 42}

id: 2
event: newArticle
data: {"foo":123, "bar":"hello"}
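To make the wire format concrete, here is a minimal sketch of how a Node.js origin might serialize events like the ones above. The names `formatEvent` and `handleStream` and the port number are made up for illustration; they are not part of any library, and a real server would also need to handle disconnects and errors.

```javascript
// Hypothetical helper: serialize one event into the text/event-stream format.
// `data` is expected to be a string (for example, the output of JSON.stringify).
function formatEvent({ id, event, data }) {
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  // A payload that contains newlines must be split across several data: lines.
  for (const line of String(data).split("\n")) {
    lines.push(`data: ${line}`);
  }
  // The trailing blank line is what terminates the event.
  return lines.join("\n") + "\n\n";
}

// Sketch of a request handler for Node's built-in http module.
// It is only defined here, not started.
function handleStream(req, res) {
  res.writeHead(200, { "Content-Type": "text/event-stream" });
  res.write(formatEvent({ id: 1, event: "newArticle", data: JSON.stringify({ foo: 42 }) }));
}

// Usage (assumed port):
// require("node:http").createServer(handleStream).listen(3000);
```

Keeping the serialization in a pure function makes it easy to check the output separately from the HTTP plumbing.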

Pretty simple, right? As fabulous as server-sent event streams are, they do suffer from a few drawbacks:

  1. Your server needs to hold open a lot of idle TCP connections, which can require some custom optimizations to server or network hardware configuration to scale properly.

  2. Your application architecture needs to be able to generate content and serve it to multiple waiting connections, which can be challenging for request-isolated backend technologies like PHP.

We can solve both these issues by fronting your SSE stream with Fastly.

Supercharging SSE with Fastly

There are a number of Fastly features that make SSE work particularly well through our network:

  • Request collapsing allows a second request for the same URL that we’re already fetching from origin to join on to the same origin request, avoiding a second origin fetch for the URL (as long as the object is still 'fresh', see below!). For normal requests, this is intended to avoid the cache stampede problem, but for streams, it also acts to “fan out” one stream to multiple users.

  • Streaming miss allows the start of an origin response to be sent to waiting clients (maybe more than one, thanks to request collapsing) before we’ve received the entire content from the origin server. Since SSE streams send chunks of data at unpredictable intervals, it’s important that the browser receives each chunk as soon as it comes out of the origin server.

  • Shielding aggregates requests from all our edge POPs in one nominated POP that is physically close to your origin server, allowing request collapsing to happen at all the edge POPs and also at the shield POP, so that no matter how many clients you are streaming to, your origin should see only one request at a time.

  • HTTP/2 allows responses to be multiplexed on the same connection. Without this, an SSE stream over HTTP/1.1 would use up a TCP connection, reducing the amount of concurrency available to load other assets from the same origin domain. This used to be 4 connections per origin, and more recently it is commonly 8 or 16, but tying one of them up permanently is bad news. HTTP/2 solves that.
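As a rough mental model of request collapsing (and only that; this is not how Fastly implements it), the idea can be sketched as a map of in-flight fetches, so that concurrent requests for the same URL share a single origin fetch. The `makeCollapser` and `fetchOrigin` names are hypothetical.

```javascript
// Conceptual model only: concurrent requests for one URL share one origin fetch.
// `fetchOrigin` is a hypothetical function returning a promise of the response.
function makeCollapser(fetchOrigin) {
  const inFlight = new Map();
  return function get(url) {
    if (!inFlight.has(url)) {
      // Forget the entry once the fetch settles, so later requests fetch again.
      const pending = Promise.resolve(fetchOrigin(url)).finally(() => {
        inFlight.delete(url);
      });
      inFlight.set(url, pending);
    }
    return inFlight.get(url);
  };
}

// Usage: two overlapping requests for "/stream" trigger one origin fetch.
// const get = makeCollapser((url) => fetch("https://origin.example" + url));
// const [a, b] = await Promise.all([get("/stream"), get("/stream")]);
```

In this toy version a request can only join while the fetch is still in progress, which is a simplification of the freshness rule for real collapsed requests.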

We do request collapsing automatically (unless you turn it off), but for requests to be collapsed, the origin response must be cacheable and still 'fresh' at the time of the new request. You don’t actually want us to cache the event stream after it ends; if we did, future requests to join the stream would just get an instant response containing a batch of events that happened over some earlier period in time. But you do want us to buffer the response as well as streaming it out, so that a cache record exists for new clients to join onto. That means the time to live (TTL) of the stream response must match the duration you intend to stream for. Say your server is configured to serve streams in 30-second segments (the browser reconnects after each segment ends): the response TTL of the stream should be exactly 30 seconds (or 29, if you want to cover the possibility of clock mis-syncs):

Cache-Control: public, max-age=29

It is very important that your origin server cuts off streams after this time has elapsed.  If you serve an endless stream to Fastly, we will hold those connections forever and you will exhaust the maximum number of concurrent connections to origin.  Don't ever serve endless downloads through Fastly.
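Here is a minimal sketch of an origin handler that follows both rules: it sets a TTL one second shorter than the segment and always ends the response when the segment is over. It assumes Node's built-in `http` module and a 30-second segment; `segmentHeaders` and `handleSegment` are hypothetical names, and the code that actually writes events during the segment is left out.

```javascript
const SEGMENT_SECONDS = 30; // assumed segment length, matching the example above

// Headers for one stream segment: the TTL is one second shorter than the segment.
function segmentHeaders(segmentSeconds = SEGMENT_SECONDS) {
  return {
    "Content-Type": "text/event-stream",
    "Cache-Control": `public, max-age=${segmentSeconds - 1}`,
  };
}

// Sketch of a handler that never serves an endless stream.
// Events would be written with res.write() while the segment is open.
function handleSegment(req, res, segmentSeconds = SEGMENT_SECONDS) {
  res.writeHead(200, segmentHeaders(segmentSeconds));
  // Cut the stream off when the segment has elapsed.
  const timer = setTimeout(() => res.end(), segmentSeconds * 1000);
  // If the connection closes early, cancel the pending cut-off.
  res.on("close", () => clearTimeout(timer));
  return timer;
}

// Usage (assumed port):
// require("node:http").createServer(handleSegment).listen(3000);
```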

Next you’ll need to enable streaming miss in VCL — add the following VCL snippets to your service:

Finally, shielding is simply a matter of choosing a shield node when you define your origin server, and H2 is enabled by default as long as you’re serving over HTTPS/TLS.

With all these features working together, you can emit one single event stream from origin and have that stream delivered in real-time to potentially thousands or millions of users.

In the process of making my flight demo, I also made a Node.js module to manage server-sent events using a pub/sub architecture. Find it as sse-pubsub on npm.
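To show the general pub/sub idea, here is a hand-rolled sketch in which one producer publishes to every waiting response. This is not the sse-pubsub API; the `Channel` class and its methods are invented for illustration, and the only thing assumed about `res` is that it behaves like Node's `http.ServerResponse` (a `write` method and a "close" event).

```javascript
// Hypothetical in-process channel: one producer, many waiting responses.
class Channel {
  constructor() {
    this.clients = new Set();
  }

  // Register a response object and forget it when its connection closes.
  subscribe(res) {
    this.clients.add(res);
    res.on("close", () => this.clients.delete(res));
  }

  // Write the same chunk to every subscriber; returns how many received it.
  publish(chunk) {
    for (const res of this.clients) {
      res.write(chunk);
    }
    return this.clients.size;
  }
}

// Usage inside an http handler (sketch):
// const channel = new Channel();
// function handler(req, res) {
//   res.writeHead(200, { "Content-Type": "text/event-stream" });
//   channel.subscribe(res);
// }
// channel.publish('event: newArticle\ndata: {"foo": 42}\n\n');
```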

Doesn’t it use a lot of power?

Yes, SSE connections keep a mobile device’s radio powered up all the time. You should avoid connecting an SSE stream on a device that has a low battery, or possibly avoid using SSE at all unless the device is plugged in. You can check battery level using the battery status API, though at time of writing, only in Chrome.
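A minimal sketch of that check might look like the following. The 20% threshold and the `shouldStream` and `connectIfPowered` names are assumptions for illustration; `navigator.getBattery()` is the Battery Status API entry point, and the code falls back to connecting when the API is not available.

```javascript
// Hypothetical policy: stream only when charging or above a chosen battery level.
function shouldStream({ charging, level }, minLevel = 0.2) {
  return charging || level >= minLevel;
}

// Browser-only sketch: open the stream if the battery state allows it.
async function connectIfPowered(url) {
  if (typeof navigator !== "undefined" && "getBattery" in navigator) {
    const battery = await navigator.getBattery();
    if (!shouldStream(battery)) return null;
  }
  // Without the API we cannot tell, so this sketch connects anyway.
  return new EventSource(url);
}

// Usage in a page:
// connectIfPowered("/stream").then((source) => {
//   if (source) source.addEventListener("newArticle", (e) => console.log(e.data));
// });
```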

Let's build together

Did you give this build a try? If you’re experimenting with this, show off what you’re building to our dev community, Fastly Connect. And if you're looking to get started with Fastly, we have great news: free developer accounts are here! Instantly get started and take advantage of the most developer-friendly edge platform in the world to build the future of the internet.