Sep 9 2019

The year of RPKI on the control plane

This post is a textual version of the talk I gave at NLNOG 2019. You can watch the recording of the talk on YouTube below if that’s your preferred medium:

Last year I gave a similar talk at NLNOG 2018 about a novel way to measure RPKI adoption using the data plane (i.e. testing whether hosts are actually reachable by sending them data). You can read or watch that talk in a previous post. If you are unfamiliar with how RPKI works, that post also explains what it is, and may be useful if you feel out of the loop.

Previously on RPKI

Last year I finished with two critical points: one encouraging people to actually sign their prefixes, and one warning those deploying inbound route validation to watch out for the ARIN TAL issue.

I also presented this graph:

Graph showing invalid RPKI prefix share in announced prefixes

and mentioned that RPKI Invalid filtering should be done with care: while the number of invalid prefixes is small, to large networks they could still mean a lot of traffic.

This year we know more, and now we can safely say:

It does not matter

NTT Communications did a bunch of traffic analysis and found that the amount of traffic to these invalid prefixes is incredibly small, and thus not worth losing too much sleep over. In the worst case, they are real BGP hijacks in progress that networks are accepting and routing traffic for!

Job Snijders showing traffic levels of prefixes grouped by their prefix status. Watch this talk on YouTube

There is also this graph, which shows that when you take into consideration the total address space covered by these invalid prefixes (and thus, loosely, how much traffic they could generate), the numbers are even smaller!

RPKI status share in proportion to address space

While giving the talk last year at another event, I also got a comment afterwards that dismissed RPKI due to its lack of Path Validation, on the view that RPKI is trivial to circumvent because of it.

No path validation, and why RPKI does not have to be perfect

For those new to the term, Path Validation means that every ASN on the path is validated to be legitimate.

Showing the good path of RPKI route acceptance

In an ideal situation, your router will see a set of networks on the AS path, with the originating ASN at the end announcing some prefixes. Ideally they will be ROA signed, so that they are validated as having been authorised by the RIR/owner to be announced by AS99.

An example situation where RPKI would block a hijack

Since those prefixes are ROA signed, when they are announced by an unauthorised originating ASN, they will be rejected upon import.
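The accept/reject decision described here can be sketched in a few lines. This is only an illustration in the style of RFC 6811 origin validation, not the code any particular router runs: the `ROA` tuple, the `validate_origin` function, the documentation prefixes and the hijacking AS666 are all hypothetical names chosen for the example; only AS99 comes from the diagrams.

```python
from ipaddress import ip_network
from typing import NamedTuple, Sequence


class ROA(NamedTuple):
    prefix: str      # prefix the holder has signed
    max_length: int  # longest prefix length allowed under that prefix
    asn: int         # ASN authorised to originate it


def validate_origin(route_prefix: str, origin_asn: int, roas: Sequence[ROA]) -> str:
    """Classify a route as 'valid', 'invalid' or 'not-found'."""
    route = ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ip_network(roa.prefix)
        # A ROA only covers routes that are equal to or more specific than it.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if roa.asn == origin_asn and route.prefixlen <= roa.max_length:
            return "valid"
    # Covered by at least one ROA but matching none of them means invalid.
    return "invalid" if covered else "not-found"


# Hypothetical ROA: a documentation prefix signed for AS99.
roas = [ROA("192.0.2.0/24", 24, 99)]

print(validate_origin("192.0.2.0/24", 99, roas))      # authorised origin
print(validate_origin("192.0.2.0/24", 666, roas))     # unauthorised origin
print(validate_origin("198.51.100.0/24", 666, roas))  # no covering ROA at all
```

A network that filters would reject the second route on import and keep the other two, since a route with no covering ROA is simply unknown rather than invalid.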

a hijack trying to compete with a legit announcement based on as path length

However, the attacker can get around this by simply pretending that the authorised ASN is their downstream, keeping the authorised ASN as the origin at the end of the AS path. RPKI does nothing to protect against this exact attack, but there are things worth noting:

Since almost all routing policies end up being, to some degree, “shortest AS path first”, the hijacking ASN will need to compete on AS path length, meaning the legit announcement already has the upper hand.
If you set the maximum prefix length (maxLength) on your RPKI ROAs to match what you actually announce, the attacker will not be able to “hole punch” (that is, announce a /24 out of someone’s /20), since the ROA will also forbid that more specific route from being accepted.
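Both points can be seen in a toy example. This is a sketch under stated assumptions: the 10.0.0.0/20 ROA, the attacker AS666 and the transit AS64500 are made up for illustration, and route selection is reduced to "shortest AS path wins", which real policies only approximate.

```python
from ipaddress import ip_network

LEGIT_ASN = 99      # authorised origin, as in the diagrams
ATTACKER_ASN = 666  # hypothetical hijacker
TRANSIT_ASN = 64500 # hypothetical network both routes are learned through

# Hypothetical ROA: 10.0.0.0/20, maxLength 20, origin AS99.
ROA_PREFIX = ip_network("10.0.0.0/20")
ROA_MAX_LENGTH = 20
ROA_ASN = LEGIT_ASN


def matches_roa(prefix: str, as_path: list) -> bool:
    """True if the route's origin (last ASN on the path) and length match the ROA."""
    net = ip_network(prefix)
    return (
        net.subnet_of(ROA_PREFIX)
        and net.prefixlen <= ROA_MAX_LENGTH
        and as_path[-1] == ROA_ASN
    )


legit = ("10.0.0.0/20", [TRANSIT_ASN, LEGIT_ASN])
# The attacker keeps AS99 as the apparent origin, so the origin check passes...
forged = ("10.0.0.0/20", [TRANSIT_ASN, ATTACKER_ASN, LEGIT_ASN])
# ...but a more specific /24 exceeds maxLength and is rejected.
hole_punch = ("10.0.0.0/24", [TRANSIT_ASN, ATTACKER_ASN, LEGIT_ASN])

for name, route in [("legit", legit), ("forged", forged), ("hole punch", hole_punch)]:
    print(name, matches_roa(*route))

# With equal prefix lengths the forged path is one hop longer, so it loses.
best = min([legit, forged], key=lambda route: len(route[1]))
print("preferred path:", best[1])
```

The forged route survives origin validation, but it always carries at least one extra ASN, and the only way to beat that (a more specific prefix) is exactly what maxLength shuts down.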

The critical thing to keep in mind about RPKI is that it’s not perfect. It can’t be perfect. The last time we tried to design a perfect protocol, we ended up with IPv6.

So it’s 2019, how is adoption?

In the past year we have seen a doubling of ROA signed prefixes. This is great, and shows that people are doing the groundwork to ensure that their prefixes can be protected by RPKI.

Comparison of signed prefix volume between 2018 and 2019

However this does not solve the whole problem space. Just because you are signing your prefixes to ensure they can only be originated by your network does not mean that anyone is actually verifying this when they import your routes.

As with last year, we can check this using deliberately invalid prefixes:

Showing invalid prefixes that we are using for testing

You will notice that this year we have significantly fewer prefixes. We settled on RIPE and ARIN prefixes only: the APNIC, JPNIC, LACNIC and AFRINIC prefixes gave very similar results to RIPE, and ARIN is the only one worth tracking separately, since it is likely that not all networks validate ARIN prefixes due to the TAL agreement.

The ICMP 9000

With a program I wrote (which I nicknamed the ICMP 9000) we can ping every IP on the internet, and then ping the IPs that do respond again, this time using source addresses in our invalid RPKI prefixes.

Prefix buckets

This results in three buckets: those who responded to a ping from the control prefix, and those who responded to a ping from an invalid prefix (one bucket each for RIPE and ARIN). This works because our ping should always be able to reach the end network, but if that network filters RPKI invalids, its reply has no route back to the invalid prefix, since that route should have been filtered and discarded on the router.
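The bucket logic can be expressed with plain sets. This is my own sketch of the reasoning, not the ICMP 9000 itself: the `classify` function, the verdict strings and the sample hosts (documentation addresses) are hypothetical, and a real measurement would also have to account for packet loss and rate limiting before trusting a missing reply.

```python
def classify(control: set, ripe_invalid: set, arin_invalid: set) -> dict:
    """Map each host that answered the control ping to what its replies suggest."""
    verdicts = {}
    for host in sorted(control):
        replied_ripe = host in ripe_invalid
        replied_arin = host in arin_invalid
        if not replied_ripe and not replied_arin:
            verdicts[host] = "filters invalids (including ARIN)"
        elif not replied_ripe and replied_arin:
            verdicts[host] = "filters invalids (without the ARIN TAL)"
        elif replied_ripe and replied_arin:
            verdicts[host] = "no filtering observed"
        else:
            verdicts[host] = "inconclusive"
    return verdicts


# Hypothetical responders for each of the three pings.
control = {"192.0.2.1", "192.0.2.2", "192.0.2.3"}
ripe_invalid = {"192.0.2.3"}
arin_invalid = {"192.0.2.2", "192.0.2.3"}

for host, verdict in classify(control, ripe_invalid, arin_invalid).items():
    print(host, "->", verdict)
```

Only hosts in the control bucket are considered, because a host that never answers at all tells us nothing about its upstream's filtering.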

If we look into the amount of IPs in each bucket:

Showing a jump in numbers from 2018 to 2019

The general counts of responding hosts have gone up this year, and I cannot easily explain why. However, the gap between the ARIN and RIPE invalid buckets has sadly grown.

One thing to point out is that this method is not perfect. For example, my mobile service provider in the UK (EE) has started to filter RPKI invalids. This is huge, but it is not detected by this system, because mobile IP access is normally behind heavy firewalls and/or CGNAT, meaning pings to these IPs are almost always filtered.

A screenshot from my phone showing that EE is doing RPKI

You can find the tester used above here

This year it’s great to say that the RPKI validating universe has expanded enormously:

616 Filtering ASNs

Up from 50 last year

6 /8s of IP space protected from RPKI Invalids

Last year it was 2 /11s
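To put those figures in proportion, here is the arithmetic, assuming the /8 and /11 figures refer to IPv4 address space (the helper name `addresses` is just for this example):

```python
def addresses(prefix_len: int, count: int = 1) -> int:
    """Number of IPv4 addresses in `count` blocks of the given prefix length."""
    return count * 2 ** (32 - prefix_len)


space_2019 = addresses(8, 6)   # six /8s
space_2018 = addresses(11, 2)  # two /11s

print(f"2019: {space_2019:,} addresses")
print(f"2018: {space_2018:,} addresses")
print(f"address space growth: {space_2019 // space_2018}x")
print(f"filtering ASN growth: {616 / 50:.2f}x")
```

That works out to roughly 100.7 million addresses against 4.2 million, a 24-fold increase in protected space, alongside a roughly 12-fold increase in filtering ASNs.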

This huge growth has been due to major internet bandwidth providers starting their own filtering.

A slide showing AT&T, Seacom, Workonline, SURFnet and Merit

This is an interesting move! AT&T, being one of the biggest networks here to start filtering, has made a huge impact on the US’s adoption of RPKI. In addition, Seacom and WorkOnline turned up RPKI validation at roughly the same time:

(Full mailing list thread)

Hello all. In November 2018 during the ZAPF (South African Peering Forum) meeting in Cape Town, 3 major African ISP’s announced that they would enable RPKI-based ROV (Route Origin Validation), including dropping Invalid routes as part of efforts to improve Internet routing security, on the 1st April, 2019. On the 1st of April, Workonline Communications (AS37271) enabled ROV and began dropping Invalid routes. This applies to all eBGP sessions, both IPv4 and IPv6. On the 5th of April, SEACOM (AS37100) enabled ROV and began dropping Invalid routes. This applies to eBGP sessions with public peers, private peers and transit providers, both for IPv4 and IPv6. eBGP sessions toward downstream customers will follow in 3 months time.

It’s also worth pointing out that when they did so, they decided against including the ARIN TAL (think of it as a root CA) due to legal concerns:

Please note that for the legal reasons previously discussed in various fora, neither Workonline nor SEACOM are utilising the ARIN TAL. As a result, any routes covered only by a ROA issued under the ARIN TAL will fall back to a status of Not Found. Unfortunately, this means that ARIN members will not see any improved routing security for their prefixes on our networks until this is resolved. We will each re-evaluate this decision if and when ARIN’s policy changes. We are hopeful that this will happen sooner rather than later.

Looking at network adoption by country:

a pie chart showing US being the biggest, then NL, ZA, and DE

We can see that the US has overtaken the Netherlands by a large margin, mostly because the large number of single-homed networks using AT&T as an upstream now have filtering by proxy, thanks to AT&T’s filtering. We can also observe that South Africa’s adoption has expanded due to Seacom and WorkOnline’s combined adoption efforts.

another piechart showing the majority being unknown, then ATT

If we look at adoption on an IP prefix basis, AT&T has by far made the biggest impact, which shows the scale of AT&T’s network influence.

graphs for decision makers, followed by the haha business guy

So, using these two data points, we can assume that 2020 will see huge exponential growth, and that by 2021 we will have more RPKI validating ASNs than there are currently ASNs:

an exponentially growing graph, with 2021 being the year it overflows

The future is looking solid for RPKI. Let’s all pray that the long tail of networks is not too long.

As always the data that I have been working on is public, and can be found here:

Link to google sheets

If you have questions (that might not have been asked in the NLNOG recording) then feel free to email me on ben+nlnog@benjojo.co.uk. If you want to stay up to date with the blog, you can use the RSS feed or follow me on Twitter.

Until next time!
