Tackling Real-Time Ad Personalization for Live Streaming at the Edge

Content Marketing Manager

May 30, 2025

With rising operational costs and fierce competition, increasing ad revenue has become a lifeline for many streaming media companies. Offering highly targeted and personalized ads is a key selling point for advertisers. Recently, a major North American broadcaster approached us with a complex problem: they wanted to overlay customized ads over existing generic ones in their live streams. This is how we solved it.

Understanding the Challenge

What made this project especially challenging was that it involved live streams, where latency is critical. In addition, three different systems needed to communicate with one another without degrading performance.

"We not only had to communicate with [the broadcaster's] systems, but we also had to communicate with their third-party provider for the ad inventory. So there were a lot of places where things could go wrong," Robert Labonte, Principal Solutions Architect on our team, said.

The solution had to parse SCTE-35 markers to identify ad insertion points, interface with third-party ad inventory servers, and manipulate both HLS and DASH streaming manifests in real time, all at the edge and with minimal impact on latency.
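To make the first of those requirements concrete, here is a minimal Python sketch of locating an ad break in an HLS media playlist. It assumes the SCTE-35 splice points have already been surfaced as #EXT-X-CUE-OUT and #EXT-X-CUE-IN playlist tags, which is a common convention; the broadcaster's actual signaling is not described here, and the AdBreak type and find_ad_breaks function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AdBreak:
    start_line: int            # index of the #EXT-X-CUE-OUT line
    end_line: Optional[int]    # index of the #EXT-X-CUE-IN line, None if the break is still open
    duration: Optional[float]  # advertised break length in seconds, if the tag carries one


def _parse_duration(tag_line: str) -> Optional[float]:
    """Accept both '#EXT-X-CUE-OUT:30' and '#EXT-X-CUE-OUT:DURATION=30'."""
    _, _, value = tag_line.partition(":")
    value = value.strip()
    if value.upper().startswith("DURATION="):
        value = value[len("DURATION="):]
    try:
        return float(value)
    except ValueError:
        return None


def find_ad_breaks(playlist: str) -> List[AdBreak]:
    """Scan a media playlist and return every cue-out/cue-in window found."""
    breaks: List[AdBreak] = []
    current: Optional[AdBreak] = None
    for index, raw in enumerate(playlist.splitlines()):
        line = raw.strip()
        if line.startswith("#EXT-X-CUE-OUT-CONT"):
            continue  # continuation marker inside a break, not a new break
        if line.startswith("#EXT-X-CUE-OUT"):
            current = AdBreak(index, None, _parse_duration(line))
        elif line.startswith("#EXT-X-CUE-IN") and current is not None:
            current.end_line = index
            breaks.append(current)
            current = None
    if current is not None:
        breaks.append(current)  # break has started but its end is not in the window yet
    return breaks
```

In a live playlist the last break is often still open, which is why end_line is optional in this sketch.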

Building a Custom Solution at the Edge

Building custom solutions at the edge presents unique technical challenges. Edge computing environments operate under stricter constraints than traditional cloud infrastructure. Memory is limited, latency budgets are measured in milliseconds, and there's virtually no room for retries or buffering. When the broadcaster approached us, we didn’t have an off-the-shelf solution. At the time, this was a relatively untested use case for Fastly Compute.

“There was a strong likelihood we could do this within the latency budget. So we started to run some experiments,” Robert said.

We began with a proof of concept. Our Principal Solutions Architect spent several weeks experimenting and proving it could work on Fastly’s Compute platform.

Fastly’s ability to examine and manipulate response bodies, not just headers, was key to the success of the project.

"As far as I know, we're the only CDN that lets you manipulate the response body," Robert said.

We could parse streaming manifests in real time, make asynchronous calls to third-party systems (like Freewheel), rewrite manifests with new ad segments on the fly, and use our KV store to maintain state across multiple requests for manifests.
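As a rough illustration of the manifest-rewriting step, the Python sketch below swaps the generic ad segments inside a cue-out/cue-in window for personalized ones. The cue tag names, the overlay_ads function, and the ads.example URLs are assumptions made for the example rather than details of the project, and a production version would also have to account for segment durations and discontinuity tags.

```python
from typing import List


def overlay_ads(playlist: str, ad_uris: List[str]) -> str:
    """Replace segment URIs inside an ad break with personalized ad URIs.

    Assumes breaks are delimited by #EXT-X-CUE-OUT / #EXT-X-CUE-IN tags.
    If there are fewer personalized URIs than segments in the break, the
    remaining generic segments are left untouched as a fallback.
    """
    out: List[str] = []
    in_break = False
    next_ad = 0
    for line in playlist.splitlines():
        stripped = line.strip()
        is_cue_out = (
            stripped.startswith("#EXT-X-CUE-OUT")
            and not stripped.startswith("#EXT-X-CUE-OUT-CONT")
        )
        if is_cue_out:
            in_break = True
            next_ad = 0  # start counting ad segments for this break
        elif stripped.startswith("#EXT-X-CUE-IN"):
            in_break = False
        elif in_break and stripped and not stripped.startswith("#"):
            # A non-tag, non-empty line is a segment URI.
            if next_ad < len(ad_uris):
                line = ad_uris[next_ad]
            next_ad += 1
        out.append(line)
    return "\n".join(out) + "\n"
```

Note that this sketch restarts its count at each cue-out tag, so it only works while the start of the break is still inside the playlist window. Once that tag has rolled off a live playlist, the position within the break has to come from stored state.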

Solving Technical Complexity

Not every ad break has the same number of segments. In HLS, a new media segment is added every few seconds (typically every 1 to 6 seconds, depending on configuration), and each one needs to stay consistent with the segments served before it to preserve a seamless viewing experience.

Robert explained, “Sometimes we might have 14 segments, sometimes we’d have 16… it wasn’t always clean.” These variations caused inconsistencies in the media timeline that could lead to playback errors unless the HLS manifest's discontinuity counter was updated correctly.

“We needed to remember the ad decision because two seconds later, that same client goes and requests the next segment and we need to continue to serve the same ad,” Robert said. The system couldn’t afford to forget what ad it was serving.
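The idea of remembering an ad decision across manifest requests can be sketched in a few lines of Python. Here a plain dictionary stands in for the KV store, and the key layout (session ID plus break ID), the ad_decision_for function, and the decide callback are all hypothetical; a real edge KV store is a remote service, so expiry and consistency would need handling that this sketch ignores.

```python
from typing import Callable, Dict, List


def ad_decision_for(
    store: Dict[str, List[str]],
    session_id: str,
    break_id: str,
    decide: Callable[[], List[str]],
) -> List[str]:
    """Return the ad segments chosen for this viewer and break.

    The ad server (represented by `decide`) is only consulted the first
    time a given viewer hits a given break; later manifest requests reuse
    the stored decision so the same ad keeps playing.
    """
    key = f"{session_id}:{break_id}"
    if key not in store:
        store[key] = decide()
    return store[key]
```

Keying on both the viewer and the break keeps decisions personalized while still letting each new break trigger a fresh ad request.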

Originally, we hoped to manage it with relative tracking or deltas. Unfortunately, live streaming playlists are constantly shifting. Old segments drop off, and assumptions break quickly. So instead, we built a custom state-tracking mechanism using the CDN cache and KV store to increment and persist discontinuity counters manually.
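For readers unfamiliar with the bookkeeping involved: in HLS, when a discontinuity tag slides out of a live playlist, the EXT-X-DISCONTINUITY-SEQUENCE value has to go up by one so players keep their timelines aligned. The Python sketch below shows one way to count that explicitly. It is an illustration under assumptions, not the mechanism the team built: the DiscontinuityState class and its methods are hypothetical, the state lives in memory rather than in a cache or KV store, and the origin playlist's own counter is assumed to start at zero.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class DiscontinuityState:
    """Per-stream record of where discontinuities were inserted."""

    # Media sequence numbers of segments that have a discontinuity tag
    # in front of them and are still inside the playlist window.
    pending: Set[int] = field(default_factory=set)
    # Number of discontinuity tags that have already left the window.
    rolled_off: int = 0

    def mark(self, media_sequence: int) -> None:
        """Record that a discontinuity precedes this segment."""
        self.pending.add(media_sequence)

    def sequence_for(self, first_media_sequence: int) -> int:
        """Value to write as EXT-X-DISCONTINUITY-SEQUENCE for a playlist
        whose first segment has the given media sequence number."""
        expired = {m for m in self.pending if m < first_media_sequence}
        self.rolled_off += len(expired)
        self.pending -= expired
        return self.rolled_off
```

Because the count is tied to absolute media sequence numbers rather than to differences between successive playlists, asking twice for the same window returns the same value.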

“It was a big headache,” Robert admitted. “We really were hoping for a solution where we could statelessly handle the HLS manifests, but it just didn’t work out… so at the end of the day, we just ended up having to count it.”

Fastly Compute's support for parallel asynchronous calls kept this architecture responsive. Rather than chaining network requests one after the other, we could query the KV store, the config store, and even third-party services simultaneously, then wait for all of the responses together. This dramatically reduced the added latency.
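The general pattern of issuing lookups concurrently rather than in a chain can be sketched with Python's asyncio. This is a generic illustration, not Fastly Compute code: the lookup function simulates each backend with a sleep, and the names and delays are made up for the example.

```python
import asyncio
from typing import Dict, List, Tuple


async def lookup(name: str, delay: float, log: List[Tuple[str, str]]) -> str:
    """Simulated backend call; `delay` stands in for network time."""
    log.append(("start", name))
    await asyncio.sleep(delay)
    log.append(("end", name))
    return f"{name}:ok"


async def gather_inputs(log: List[Tuple[str, str]]) -> Dict[str, str]:
    # All three calls are started before any of them is awaited to
    # completion, so the total wait is roughly the slowest call rather
    # than the sum of all three.
    kv, config, ads = await asyncio.gather(
        lookup("kv_store", 0.02, log),
        lookup("config_store", 0.01, log),
        lookup("ad_server", 0.03, log),
    )
    return {"kv": kv, "config": config, "ads": ads}


def run_example() -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    log: List[Tuple[str, str]] = []
    result = asyncio.run(gather_inputs(log))
    return result, log
```

The log makes the overlap visible: all three "start" entries are recorded before the first "end" entry.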

The Broader Capabilities This Unlocks

The ability to manipulate response bodies at the edge makes the Fastly platform great for customized streaming media solutions.

“We have FFmpeg running on edge. We're able to do manipulations dynamically and on the fly,” Robert noted.

“We have another customer who has to support very old video players that don't support modern streaming technologies like HLS and DASH. So they had a completely separate workflow to create, store and deliver MP4s. We came up with a solution for them that allowed them to get rid of their whole MP4 workflow, just publish HLS, and then when a request actually comes in from one of these old devices, on-the-fly, we'll generate the MP4 from their HLS stream to dynamically deliver to those clients. We store the dynamically generated MP4s in the CDN cache, so subsequent requests for the MP4 are delivered via cache.”

Advice for Media Companies Exploring Similar Solutions

For other media companies facing similar challenges, Robert advises starting small to build confidence. "There's a lot of hesitancy to adopt new technology, especially in streaming media. My honest advice is to just give it a try. Start with something small like streaming manifest manipulation."

And there aren’t many projects in streaming media that we can’t help you execute. As Robert put it, “If you can dream it, we most likely can do it.”

So don’t hesitate to get in touch with our streaming media experts to discuss the project that fits your needs.