Much like a previous talk of mine at Chaos Computer Congress, this blog post is a direct write-up of a talk. If you prefer to consume this kind of content in video form, you can watch the video here:
When you connect to a TLS server you will generally get a certificate *chain* back (added emphasis on the chain part of that). The server sends a set of X.509 certificates: at one end is the certificate for the domain you are connecting to, and (most of the time) at the other end is an intermediate that chains up to a root CA installed on your computer.
However, it doesn't always happen this way: some TLS servers return only the leaf certificate for the domain itself. This shouldn't work in client browsers, as the TLS certificate verification code has no way to draw a chain of trust between that certificate and the root CA it descends from. And yet, in a lot of browsers, it does work.
This tends to happen because ACME clients and similar certificate generation utilities output three separate files: the private key, the certificate file (which contains just the leaf certificate for your domain), and the "full chain" file, which gives the server the full chain of certificates it needs to send to the client so the client can verify it.
fullchain.pem # <- You want this
cert.pem # <- You likely don't want this
private.key # <- You want this
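For instance, in an nginx configuration the server should be pointed at the full chain file rather than the bare leaf (a hedged sketch; the paths follow the conventional certbot layout and `example.com` is a placeholder):

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    # Point nginx at the full chain, not cert.pem (the bare leaf):
    ssl_certificate     /etc/letsencrypt/live/example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/example.com/privkey.pem;
}
```

The `ssl_certificate` directive expects the leaf followed by its intermediates in one file, which is exactly what the "full chain" output provides.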
The weird thing, however, is that even if a TLS server has been misconfigured with the wrong file, the certificate chain often validates anyway. This is because browsers have deployed mitigations for this problem that allow servers with incomplete certificate chains to validate. While none of these introduce explicit security issues, they do give off a very "un-pure" feeling.
As far as I understand, there are two methodologies for doing this.
The Firefox solution is to ship a large set of intermediate CA certificates with the browser, which it checks whenever it encounters an incomplete TLS certificate chain. This requires the set of intermediates to be updated on a regular basis, but that is fine because people already update their browsers regularly.
Google Chrome (and its descendants), however, is in my opinion much worse: Chrome will try to match intermediate certificates against those it has seen since the browser was started. This means a cold start of Chrome does not behave the same way as a Chrome that has been running for 4 hours.
Personally I feel like that's a pretty insane piece of behaviour. When something as critical as TLS validation varies based on how long a user has been running an application, things like debugging become extremely difficult. For example, if a user turns on their computer in the morning and visits your website (which has an incomplete chain), there's a chance their browser will raise a certificate validation error. If that user then sends a ticket to the website's support, the admin administering the website would not be able to reproduce the problem, as their Chrome would already have had a chance to see that intermediate certificate.
I was interested in learning how often this kind of misconfiguration actually happens in the wild. Using a Go library that mimics the Firefox behaviour, we can run a test against the Tranco 1 million list: to detect this misconfiguration, we attempt a TLS connection to every domain in the Tranco 1M both with and without the mitigation enabled, and assume that any domain that suddenly becomes connectable is serving a short certificate chain.
The data that comes out of this shows that once you get past the top 10,000 ranked domains, around 0.8% of domains present misconfigured TLS certificate chains to clients. Some notable examples of misconfigured certificate chains in the wild are playstation.com (which sends its leaf certificate twice), bt.com, and a large number of government websites, not just limited to the United States government but other national governments too.
I look at this particular mitigation and wonder whether we really needed it when the error rate is only around 0.8%. I feel like we opened the X.509 Pandora's box for no good reason. But at least knowing about it means you can keep your eyes open for this kind of misconfiguration. (There is also a second, highly unrecommended trick where you can leverage this behaviour to block bots that actually validate TLS chains.)
If you are interested in the full list of broken domains at the time of scanning, you can find the full set of websites I detected here:
It is worth pointing out that I did not mention anything about AIA (Authority Information Access) in this talk. This is a similar type of mitigation: some certificates embed an HTTP URL that lets the client download the missing parts of the certificate chain during validation when a misconfigured server fails to present them. This is also mildly crazy, as it requires your browser to make additional network connections at TLS verification time just to validate the chain. (We have been here before with things like OCSP, and we moved to OCSP Stapling precisely so that browsers did not have to make follow-up requests, for both privacy and reliability reasons.)
Until next time!