Sep 26 2024

Flexing the Windows RRAS BGP implementation

An abandoned back alley, overlaid with a Windows window saying "Starting RRAS"

At this point I am a bit of a BGP implementation connoisseur thanks to my day job. However, there is one implementation that has been in the back of my mind (in the information hazard kind of way) for quite some time now: the Microsoft Windows RRAS BGP implementation.

RRAS (or by its longer name, “Routing and Remote Access”) has existed in Windows NT releases for a very long time, allowing Windows to speak RIPv2 (or OSPFv2) as the only official way to do dynamic network routing on Windows. However, in Windows Server 2016, RRAS shipped with BGP4 support!

This means that Windows can now speak the load-bearing dynamic routing protocol of the internet, which allows us to do some very fun experiments in pushing it to its limits! All of the documentation I have found on the internet for RRAS’s BGP support only mentions IPv4, however I intend to operate this with only IPv6 for extra “fun”!

Ingredients for Windows BGP Support

To get started I obviously had to install Windows. I have been out of the Windows game for quite a while, so I grabbed an ISO for Windows Server 2022 and installed it in evaluation mode, and I was quite surprised to find that the first login screen looked like this:

A simple black CMD window saying Press Ctrl Alt Del to login

It seems that Windows and its default/recommended installer configuration now push you towards a command line environment. That is not entirely surprising based on where the industry is going, however I didn’t particularly want to relearn everything I can remember about operating Windows Server from 10+ years ago, so I opted to reinstall Windows with “desktop experience” to make it a little bit more familiar to me.

Once that was done, we need to manually install the RRAS functionality as a feature in Windows:

A Windows install setup wizard for "Before you Begin"

The wizard gives you quite a lot of guidance on how to do this, and gives you an overview of what you’re about to install:

An installation screen describing what RRAS is

Perhaps most interesting, “Routing and Remote Access” also requires you to install the Windows web server, IIS. I don’t know why, and I’m not quite sure I want to know why.

A final list of install dependencies for RRAS, which includes IIS

Getting BGP Running

Once we have RRAS installed, we need to set it up. From this point onwards we are mostly doing stuff inside PowerShell (I’m pretty sure everything in this post can be done in PowerShell, including the steps above, but I’m not quite sure how, so for those I did what I already remembered to save myself time).

First we set up RRAS in routing-only mode. RRAS has a lot of different functions, and it seems that Microsoft intends for you to use RRAS for advanced VPN trickery, as that is what most of their documentation is focused on.

PS> Install-RemoteAccess -VpnType RoutingOnly

After that we will configure our basic “BGP router”. I have picked a publicly visible ASN here because my original intention was for it to be routed into the internet via a friend’s 16 bit AS number that I am borrowing. However, getting that to work would have been a lot of extra configuration work on my production network that I didn’t want to do, so this is more of a placeholder number. Windows does not support 32 bit AS numbers. This is already a big problem for modern day internet BGP, but I’m under the impression that this is not designed for internet scale BGP!

PS> Add-BgpRouter -BgpIdentifier 1.1.1.1 -LocalASN 39335
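To make the 16 bit limitation concrete, here is a minimal Python sketch of the size check. The function names are my own, and the 4200000000 value is only an illustrative ASN from the private 32 bit range, not something used in this setup. AS_TRANS (23456) is the placeholder that RFC 6793 defines for talking to 2-byte-only speakers; I have not tested how RRAS itself reacts to a 4-byte peer, so treat this as background rather than a description of its behaviour.

```python
# Sketch: does an AS number fit in the original 2-byte (16 bit) ASN field?
MAX_16BIT_ASN = 0xFFFF  # 65535, the largest value a 2-byte field can hold
AS_TRANS = 23456        # RFC 6793 placeholder shown to 2-byte-only speakers


def fits_16bit(asn: int) -> bool:
    """Return True if the ASN can be carried in a 2-byte AS field."""
    return 0 <= asn <= MAX_16BIT_ASN


def asn_seen_by_16bit_speaker(asn: int) -> int:
    """ASN that a 2-byte-only speaker would be shown in place of the real one."""
    return asn if fits_16bit(asn) else AS_TRANS


if __name__ == "__main__":
    # 39335 and 64496 are the ASNs used in this post; 4200000000 is an
    # illustrative 32 bit private ASN (an assumption, not from the post).
    for asn in (39335, 64496, 4200000000):
        print(asn, fits_16bit(asn), asn_seen_by_16bit_speaker(asn))
```

Both ASNs used in this post pass the check, which is why the placeholder number works fine here.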

Now we can add our first BGP peer!

PS> Add-BgpPeer -PeerIPAddress 2a0c:2f07:4663:628::1 -LocalIPAddress 2a0c:2f07:4663:628::2 -PeerASN 64496
cmdlet Add-BgpPeer at command pipeline position 1
Supply values for the following parameters:
Name: bird2
WARNING: IPv6 routing is not enabled. This peering will not be able to learn/advertise any routes.

This adds without a fight, and afterwards we can check our session status by running:

PS> Get-BgpPeer
PeerName LocalIPAddress PeerIPAddress PeerASN OperationMode ConnectivityStatus 
-------- -------------- ------------- ------- ------------- ------------------
bird2 2a0c:2f07:4663:628::2 2a0c:2f07:4663:628::1 64496 Mixed Connecting

We can then add the prefix we intend to originate to our peer(s):

PS> Add-BgpCustomRoute -Network 2a0c:2f07:4663:629::/64
WARNING: IPv6 routing is not enabled, added IPv6 routes will not be advertised to peers.

The observant among us would have noticed a warning when we added that peer.

WARNING: IPv6 routing is not enabled. This peering will not be able to learn/advertise any routes.

I cannot overstate how much this drove me crazy. I cannot tell if this is the continuous degradation of Google search or whether I am the only user on the planet who has used this with IPv6. If you get this warning, you are supposed to enable IPv6 routing on the BgpRouter itself (not in any of the other places where you can enable RRAS IPv6 routing). You can do that with:

PS> Set-BgpRouter -BgpIdentifier 1.1.1.1 -IPv6Routing Enabled
Confirm One or more BGP peering sessions are active.
Applying these changes might require restart or soft-reset of those sessions.
Do you want to continue? [Y] Yes [N] No [S] Suspend [?] Help (default is "Y"): y

WARNING: LocalIPv6Address is not configured, IPv6 routes learnt from peers peering over IPv4 address or link-local IPv6 address will not be advertised to eBGP peers.

PS> Set-BgpRouter -BgpIdentifier 1.1.1.1 -IPv6Routing Enabled -LocalIPv6Address 2a0c:2f07:4663:628::2
Confirm One or more BGP peering sessions are active.
Applying these changes might require restart or soft-reset of those sessions.
Do you want to continue? [Y] Yes [N] No [S] Suspend [?] Help (default is "Y"): y

Now that we’ve managed to make it work, we can configure the peer side, and within a few moments we can see routes!

$ sudo birdc s ro table windows6 all
BIRD 2.0.10 ready.
Table windows6:
2a0c:2f07:4663:629::/64 unicast [windows 18:13:21.001] * (100) [AS39335i]
        via 2a0c:2f07:4663:628::2 on lxcbr0
        Type: BGP univ
        BGP.origin: IGP
        BGP.as_path: 39335
        BGP.next_hop: 2a0c:2f07:4663:628::2
        BGP.local_pref: 100

Amazing! I then made a loopback device (by installing an instance of the loopback driver in the hardware manager) and assigned it an IPv6 address from that prefix. We should also at this point allow ICMP through the firewall, as it is blocked by default (grr) and is very useful for debugging:

Windows Firewall, with the ICMP options set

After that, from the peer we can see one hop to the Windows VM, and then a second hop inside it to the loopback!

                               My traceroute  [v0.93]
metropolis (2a0c:2f07:4663:628::1)                           2024-09-14T20:27:57+0100
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                             Packets               Pings
 Host                                      Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 2a0c:2f07:4663:628:e750:9d6f:29ba:7b3   0.2%  7139    0.3   0.2   0.2   2.0   0.2
 2. 2a0c:2f07:4663:629::1                   0.0%  7139    0.6   0.3   0.3   9.3   0.2

We have Windows RRAS speaking IPv6 BGP!

But can we scale it more?

So imagine a hypothetical (and suboptimal for a number of reasons) scenario: we have two uplinks that speak BGP to us, and we are using Windows Server 2022 as our edge router to direct traffic for a client network behind it:

A GIF drawing of the setup as described above

I’m obviously not going to connect my AS5511 and AS1299 ports to a Windows VM, so I am going to use MRT snapshots of their routing tables from bgp.tools, using an internal (and unreleased) tool to turn those MRT files back into working BGP sessions for testing.

At the time of writing the IPv6 table is 200,000 routes large. Ideally Windows can handle this while still directing traffic based on the shortest routes between the two carriers.

After setting up two remote “routers” to emulate 2 carrier links and peering them with Windows, I noticed that Windows was consuming a pretty extreme amount of memory, and 4 to 6 entire CPU cores:
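As a rough illustration of what “shortest routes” means here, below is a small Python sketch that picks between two carriers by AS path length only. This is a deliberate simplification: real BGP best path selection has more steps (local preference, origin, MED and so on), and this is not a description of how RRAS decides internally. The `Route` class, the function name and the AS paths (built with documentation ASNs 64500 and 64501) are all made up for the example; only the prefix and next hops are taken from this setup.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    as_path: Tuple[int, ...]


def pick_shortest(routes: Iterable[Route]) -> Route:
    """Pick the route with the fewest ASNs in its path; ties keep the first seen."""
    return min(routes, key=lambda r: len(r.as_path))


if __name__ == "__main__":
    # Hypothetical AS paths for one prefix learned from both carriers.
    candidates = [
        Route("2a13:ebc0::/29", "2a0c:2f07:4663:628::5511", (5511, 64501, 64500)),
        Route("2a13:ebc0::/29", "2a0c:2f07:4663:628::1299", (1299, 64500)),
    ]
    best = pick_shortest(candidates)
    print(best.next_hop)  # the two-hop path via the ::1299 next hop
```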

A windows task manager showing svchost using a lot of RAM and CPU

Checking what service the process matched with, and yep, it’s RRAS.

A Windows Task Manager showing the same PID above being linked to RRAS services

A quick check of the routing table and:

PS> Get-NetRoute | more

ifIndex DestinationPrefix      NextHop          RouteMetric ifMetric Po
------- -----------------      -------          ----------- -------- --
[..]
14      2a13:f580:2::/48       2a0c:2f07:4663:628::1299   0 15       Ac
14      2a13:ef80:3::/48       2a0c:2f07:4663:628::5511   0 15       Ac
14      2a13:ef80::/48         2a0c:2f07:4663:628::1299   0 15       Ac
14      2a13:ebc0::/29         2a0c:2f07:4663:628::1299   0 15       Ac
14      2a13:ebc0::/29         2a0c:2f07:4663:628::5511   0 15       Ac
14      2a13:e3c1::/32         2a0c:2f07:4663:628::5511   0 15       Ac
14      2a13:e2c3:21c::/48     2a0c:2f07:4663:628::1299   0 15       Ac
14      2a13:d900::/29         2a0c:2f07:4663:628::5511   0 15       Ac
14      2a13:d880::/29         2a0c:2f07:4663:628::1299   0 15       Ac
14      2a13:d040:1::/48       2a0c:2f07:4663:628::5511   0 15       Ac
14      2a13:cd40::/30         2a0c:2f07:4663:628::1299   0 15       Ac
14      2a13:ccc3::/32         2a0c:2f07:4663:628::1299   0 15       Ac
14      2a13:cc87:ff80::/41    2a0c:2f07:4663:628::5511   0 15       Ac
14      2a13:cc82::/48         2a0c:2f07:4663:628::5511   0 15       Ac
14      2a13:c900:55::/48      2a0c:2f07:4663:628::5511   0 15       Ac

Success… Sort of?

After making some tea and coming back after 10 mins, the number of routes actually visible in ROUTE.exe or Get-NetRoute was only around 5500 (suggesting an average route insert time of at least 100ms)! It does however appear to be selecting the routes correctly.

Unfortunately RRAS is also completely unresponsive to outside input during this, meaning we cannot easily query what it thinks about some routes. Some sessions even stopped getting KEEPALIVEs from Windows, suggesting that in a real world setup they would just trigger the BGP HoldTimer Expiry mechanism.
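The back-of-the-envelope maths for that insert time, as a small Python sketch. The function names are my own, and the full-table estimate assumes the insert rate stays constant and that all 200,000 routes need installing, neither of which I measured.

```python
def per_route_seconds(elapsed_seconds: float, routes_installed: int) -> float:
    """Average time spent installing each route."""
    return elapsed_seconds / routes_installed


def estimated_hours(total_routes: int, seconds_per_route: float) -> float:
    """Time to install total_routes at a constant per-route cost."""
    return total_routes * seconds_per_route / 3600


if __name__ == "__main__":
    per_route = per_route_seconds(10 * 60, 5500)  # 10 minutes, ~5500 routes
    print(f"{per_route * 1000:.0f} ms per route")                           # 109 ms per route
    print(f"{estimated_hours(200_000, per_route):.1f} hours for the table")  # 6.1 hours for the table
```

So at that pace, a full IPv6 table would take roughly six hours to land in the Windows routing table.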
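For anyone unfamiliar with the hold timer, here is a minimal Python sketch of the check a peer would be running against Windows. The 90 second hold time and 30 second keepalive interval are the values suggested in RFC 4271, used here purely as an assumption; I did not check what RRAS or my test tool actually negotiated. The function name is my own.

```python
SUGGESTED_HOLD_TIME = 90.0                       # seconds, suggested in RFC 4271
SUGGESTED_KEEPALIVE = SUGGESTED_HOLD_TIME / 3    # one third of the hold time


def hold_timer_expired(last_message_at: float, now: float,
                       hold_time: float = SUGGESTED_HOLD_TIME) -> bool:
    """True once no KEEPALIVE or UPDATE has arrived within hold_time seconds."""
    if hold_time == 0:
        return False  # a negotiated hold time of 0 disables the timer
    return now - last_message_at >= hold_time


if __name__ == "__main__":
    # Last message from the silent peer was received at t=0.
    print(hold_timer_expired(0.0, 60.0))  # False: two keepalives missed, still up
    print(hold_timer_expired(0.0, 90.0))  # True: the session would be torn down
```

With those assumed timers, a router that goes quiet for a minute and a half while it churns through route installs loses the session, and the routes learned over it go with it.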

Conclusions

Given what we have seen, you would not want to use Windows RRAS’s BGP stack for anything beyond a dozen routes. But as far as I can see in the (sparse) documentation for RRAS, that is what it is designed for: basically dynamic failover for gateways. For this it is mostly fine. The BGP client is extremely barebones (to the point where it does not support 32 bit ASNs) but does work for basic use cases!


If you want to stay up to date with the blog you can use the RSS feed or you can follow me on Fediverse @benjojo@benjojo.co.uk!

Until next time!