BoringSSL to make TLS more secure
In 2023, Fastly made some big investments in TLS security. Today we’ll explain our migration away from OpenSSL, and in a future article, we’ll discuss our implementation of Neverbleed to isolate private key operations.
OpenSSL has a long history of high-severity vulnerabilities, including the notorious Heartbleed bug. In addition to the risk of exploitation, there is a significant operational cost incurred to rapidly test and deploy patches whenever a new vulnerability is announced. Our primary goal in replacing OpenSSL with BoringSSL was to reduce the frequency and impact of CVEs and improve the security of our TLS termination system for our customers.
BoringSSL is a fork of OpenSSL that was created and is maintained by Google. It is widely considered to be fundamentally more secure than OpenSSL because it is less complex. OpenSSL remains the Swiss Army knife of SSL libraries, and a lot of great work has been done over the years to improve it, but we are convinced that BoringSSL provides better protection for our customers.
The Journey
Our work began about a year ago with the ambitious idea of replacing OpenSSL on our edge for all incoming connections. We considered a few alternatives but stuck with our original vision of migrating to BoringSSL to gain the following benefits:
Smaller, more modern code base
Safer API
“BoringSSL is an OpenSSL derivative and is mostly source-compatible,” making our migration less challenging
Extensive fuzzing
Used by big players and maintained by Google
Similar performance to OpenSSL
In summary, the consensus was that BoringSSL offers a more focused code base, free of much of OpenSSL’s legacy code, which makes it intrinsically more secure.
Lines of code comparison between OpenSSL and BoringSSL:
As mentioned above, ensuring BoringSSL provided the same level of performance was extremely important in our evaluation. We used a number of different strategies to compare performance between the libraries.
Before we even started development, we ran performance benchmarks, especially around crypto operations.
Graph comparing key exchange (ECDH operation/s) across different TLS libraries:
Then, as soon as we had a BoringSSL binary ready for testing, we began running “canaries”, meaning that we deployed the new code to a few production servers. That allowed us to compare performance with real traffic patterns that are not easily simulated. Finally, we ran “overload tests”, which put more live traffic on specific Fastly servers: the one running the test binary, plus a couple of others as controls.
There Is No Free Lunch
From the start, we expected the development effort to be high. We also had to assume that such a big change in our network would demand additional operational and customer support work. After all, the intent was to make this change without impacting customers.
We also knew what we’d be missing with the switch:
No versioning: unlike the OpenSSL project, the BoringSSL maintainers don’t make periodic releases. BoringSSL therefore demands constant upkeep, as we have to track upstream changes and determine whether they are pertinent.
Tooling and configuration scripts
OCSP stapling
Support for certain weak ciphers (e.g. CBC modes)
Session Resumption Troubles
Midway through (or at least what we thought was “midway” at the time), we faced our first challenge: OpenSSL and BoringSSL sessions aren’t compatible. Sessions that originated in an H2O instance using BoringSSL could not be used by a different H2O instance using OpenSSL and vice versa, preventing both ticket-based and cache-based session resumption across libraries.
Fastly’s load balancer architecture makes it likely that connections originating from the same client end up reaching different destination servers. So, cross-library resumption attempts would be abundant when rolling out the change across thousands of servers.
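To get a feel for how common these attempts could be, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that the original connection and the resuming connection are each routed to a server uniformly at random and independently; the function name and that routing model are ours, not a description of Fastly's load balancer.

```python
def cross_library_resumption_rate(migrated_fraction: float) -> float:
    """Estimate the share of resumption attempts that cross TLS libraries.

    Assumption (hypothetical): the server that issued the session and the
    server that receives the resumption attempt are picked independently
    and uniformly at random from the fleet.
    """
    if not 0.0 <= migrated_fraction <= 1.0:
        raise ValueError("migrated_fraction must be between 0 and 1")
    p = migrated_fraction
    # Issued by BoringSSL and resumed on OpenSSL, plus the reverse case.
    return p * (1.0 - p) + (1.0 - p) * p


if __name__ == "__main__":
    for pct in (0, 10, 50, 90, 100):
        rate = cross_library_resumption_rate(pct / 100)
        print(f"{pct:3d}% migrated: {rate:.0%} of resumptions cross libraries")
```

Under this simplified model the mismatch rate is zero before and after the rollout and peaks when half the fleet has been migrated, which is why the problem is confined to the rollout window.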
Because these cross-library resumption attempts would only take place during the rollout phase, we avoided the issue by segmenting the TLS session caches and tolerating the momentary, partial loss of TLS session resumptions.
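One way to picture cache segmentation is a cache whose keys include the library that created the session. The sketch below is a minimal illustration under that assumption; the class, its methods, and the library tags are hypothetical and do not reflect H2O's actual session cache.

```python
from typing import Dict, Optional, Tuple


class SegmentedSessionCache:
    """Session cache partitioned by the TLS library that created each entry."""

    def __init__(self) -> None:
        # Key: (library tag, session ID). Value: the serialized session,
        # in whatever format that library produced.
        self._entries: Dict[Tuple[str, bytes], bytes] = {}

    def store(self, library: str, session_id: bytes, session: bytes) -> None:
        self._entries[(library, session_id)] = session

    def lookup(self, library: str, session_id: bytes) -> Optional[bytes]:
        # A miss is harmless: the client falls back to a full handshake.
        return self._entries.get((library, session_id))


if __name__ == "__main__":
    cache = SegmentedSessionCache()
    cache.store("openssl", b"\x01\x02", b"openssl-serialized-session")
    print(cache.lookup("openssl", b"\x01\x02"))    # found by the same library
    print(cache.lookup("boringssl", b"\x01\x02"))  # None, so a full handshake
```

The point of the design is that a server never tries to parse a session produced by the other library; the cost is a full handshake for the affected connections while both libraries coexist.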
Cipher Troubles
Another way in which BoringSSL is more secure is that support for some older cipher suites has been removed. Switching to BoringSSL caused all weak CBC ciphers to be silently dropped. The number of clients that rely on these ciphers is extremely small, but at the scale of many of Fastly’s customers, even a small percentage of traffic can have a real impact. During testing, we discovered one cipher in particular that was used often enough to block our migration. We implemented a simulated cipher-matching feature that counted handshakes “as if BoringSSL were being used” so that we could assess the actual loss of backward compatibility.
This issue was recognized by the BoringSSL maintainers, who subsequently reintroduced a single weak CBC cipher, “ECDHE-RSA-AES128-SHA256”. Once we had collected and analyzed the data and retested, we were able to move forward.
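The idea behind simulated cipher matching can be sketched as follows: for each client offer, run cipher selection twice, once against the cipher list in use today and once against the candidate list, and count the handshakes that succeed today but would fail after the switch. The cipher lists below are abbreviated and hypothetical, chosen only for illustration, and the function names are ours rather than anything from H2O or BoringSSL.

```python
from collections import Counter
from typing import Iterable, Optional, Sequence

# Hypothetical, abbreviated server preference lists (OpenSSL-style names).
CURRENT_SERVER_CIPHERS = [
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-RSA-AES128-SHA256",  # CBC mode
    "AES128-SHA256",            # CBC mode
]
CANDIDATE_SERVER_CIPHERS = [
    "ECDHE-RSA-AES128-GCM-SHA256",
]


def select_cipher(server_prefs: Sequence[str],
                  client_offer: Sequence[str]) -> Optional[str]:
    """Pick the first server-preferred cipher that the client also offers."""
    offered = set(client_offer)
    for cipher in server_prefs:
        if cipher in offered:
            return cipher
    return None


def simulate(client_offers: Iterable[Sequence[str]]) -> Counter:
    """Count handshakes as if the candidate cipher list were in use."""
    stats: Counter = Counter()
    for offer in client_offers:
        current = select_cipher(CURRENT_SERVER_CIPHERS, offer)
        candidate = select_cipher(CANDIDATE_SERVER_CIPHERS, offer)
        if current is None:
            stats["fails_today"] += 1
        elif candidate is None:
            # Attribute the breakage to the cipher the client uses today.
            stats[f"would_break:{current}"] += 1
        else:
            stats["compatible"] += 1
    return stats


if __name__ == "__main__":
    offers = [
        ["ECDHE-RSA-AES128-GCM-SHA256", "ECDHE-RSA-AES128-SHA256"],
        ["ECDHE-RSA-AES128-SHA256"],
        ["AES128-SHA256"],
    ]
    for key, count in sorted(simulate(offers).items()):
        print(key, count)
```

Grouping the would-break counts by the cipher negotiated today is what lets a single frequently used cipher stand out from the long tail.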
As shown below, with that one weak CBC cipher back, the incompatible cipher count becomes negligible. Even that number is likely inflated by bots and scanners probing our TLS configurations.
OCSP
Many clients have stopped using the Online Certificate Status Protocol (OCSP) to check for revoked certificates. In Chrome, Google relies on CRLSets, so the OCSP verification function was removed from BoringSSL. There are still lots of clients that do perform OCSP checking, and by default, it is a blocking network call that harms performance. OCSP Stapling is an improvement that allows a server to deliver the OCSP response in the handshake and eliminate the additional network call. For Fastly, losing this performance feature was unacceptable, so we added the missing function to BoringSSL.
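To make the stapling idea concrete, here is a minimal sketch of a server-side staple cache: the handshake path only reads a cached response, while a separate refresh step (for example, a background task) talks to the OCSP responder. Everything here is hypothetical and for illustration only, including the class names, the `fetch` callback, and the refresh margin; it is not the function we added to BoringSSL.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class StapledResponse:
    der: bytes          # encoded OCSP response to send in the handshake
    next_update: float  # Unix time after which the response is stale


class StapleCache:
    """Keeps OCSP responses fresh so the handshake path never does network I/O."""

    def __init__(self,
                 fetch: Callable[[str], StapledResponse],
                 refresh_margin: float = 3600.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch                    # hypothetical responder client
        self._refresh_margin = refresh_margin  # refresh this long before expiry
        self._clock = clock
        self._entries: Dict[str, StapledResponse] = {}

    def get(self, cert_id: str) -> Optional[bytes]:
        """Handshake path: return a fresh response to staple, or None."""
        entry = self._entries.get(cert_id)
        if entry is None or entry.next_update <= self._clock():
            return None
        return entry.der

    def refresh(self, cert_id: str) -> None:
        """Background path: fetch a new response if missing or close to expiry."""
        entry = self._entries.get(cert_id)
        remaining = None if entry is None else entry.next_update - self._clock()
        if remaining is None or remaining <= self._refresh_margin:
            self._entries[cert_id] = self._fetch(cert_id)
```

With this split, a client that wants certificate status gets it inside the handshake, and a slow or unreachable responder only delays the background refresh rather than the connection.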
Conclusion
An early idea we had that worked really well was to preemptively convert features that relied on OpenSSL-specific API functions to be as TLS library agnostic as possible.
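As a rough illustration of what "library agnostic" can mean in practice, the sketch below writes a feature once against a small interface, so that swapping the TLS library means swapping the backend rather than rewriting the feature. The interface, the stand-in backend, and the cipher lists are all hypothetical; this is not H2O's actual abstraction layer.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class TLSBackend(ABC):
    """Hypothetical library-neutral interface that features depend on."""

    @abstractmethod
    def library_name(self) -> str: ...

    @abstractmethod
    def negotiate_cipher(self, client_offer: List[str]) -> Optional[str]: ...


class ListBackend(TLSBackend):
    """Stand-in backend driven by a static cipher preference list."""

    def __init__(self, name: str, ciphers: List[str]) -> None:
        self._name = name
        self._ciphers = ciphers

    def library_name(self) -> str:
        return self._name

    def negotiate_cipher(self, client_offer: List[str]) -> Optional[str]:
        offered = set(client_offer)
        return next((c for c in self._ciphers if c in offered), None)


def describe_handshake(backend: TLSBackend, client_offer: List[str]) -> str:
    """A feature written once, whichever library sits behind the interface."""
    cipher = backend.negotiate_cipher(client_offer)
    outcome = cipher if cipher is not None else "no shared cipher"
    return f"{backend.library_name()}: {outcome}"


if __name__ == "__main__":
    offer = ["ECDHE-RSA-AES128-SHA256"]
    old = ListBackend("openssl", ["ECDHE-RSA-AES128-SHA256"])
    new = ListBackend("boringssl", ["ECDHE-RSA-AES128-GCM-SHA256"])
    print(describe_handshake(old, offer))
    print(describe_handshake(new, offer))
```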
While the additional upkeep cost was expected, we now face the ongoing work of analyzing all recent upstream BoringSSL changes to assess whether we need to, or should, update. For example, we’ll be watching to avoid the kind of silent loss of backward compatibility seen with the CBC ciphers mentioned earlier.
Even though BoringSSL brings with it some extra challenges, overall we are pleased with our decision to invest in better TLS security.