Race condition between L3VNI creation and SVI creation #17342

Open
2 tasks done
Tuetuopay opened this issue Nov 5, 2024 · 0 comments
Labels
triage Needs further investigation

Description

Hi,

I see inconsistencies on FRR startup related to import route-targets on an EVPN VRF. Sometimes the route-target import xxx:yyy statement is ignored (not shown in a show run), or it does not replace the automatic *:<vni> one. In both cases, all routes matching the assigned part of the RT are imported, which breaks the isolation provided by the admin part of the RT.

This happens when a VXLAN SVI (vxlan + bridge + vrf) is added while FRR is starting up and loading its configuration, e.g. in a lab with containerlab.

The config is as follows:

vrf vrf-1777
  vni 1777
exit-vrf
!
router bgp 12876
  bgp router-id 172.20.20.11
  no bgp default ipv4-unicast
  !
  neighbor 172.20.21.1 remote-as internal
  neighbor 172.20.21.1 update-source 172.20.20.11
  neighbor 172.20.21.1 description rr01
  !
  address-family l2vpn evpn
    neighbor 172.20.21.1 activate
    advertise-all-vni
  exit-address-family
!
router bgp 12876 vrf vrf-1777
  bgp router-id 172.20.20.11
  !
  neighbor 169.254.0.1 remote-as external
  neighbor 169.254.0.1 local-as 12876 no-prepend replace-as
  neighbor 169.254.0.1 update-source 169.254.0.0
  !
  address-family ipv4 unicast
    neighbor 169.254.0.1 activate
  exit-address-family
  !
  address-family l2vpn evpn
    advertise ipv4 unicast
    !
    route-target export 64699:1777
    route-target import 12876:1777
  exit-address-family

This is brought up by the following containerlab snippet (the lab actually has 6 such nodes, all identical since the lab is templatized):

name: evpn
mgmt:
  network: fixedips
  ipv4-subnet: 172.20.20.0/22
topology:
  nodes:
    sw01:
      kind: linux
      image: quay.io/frrouting/frr:10.1.1
      binds:
        - frr-daemons:/etc/frr/daemons
        - sw01.conf:/etc/frr/frr.conf
      mgmt-ipv4: 172.20.20.11
      exec:
        - ip link add vrf-1777 up type vrf table 1777
        - ip link add br-1777 up master vrf-1777 type bridge
        - ip link add vxlan-1777 up master br-1777 type vxlan id 1777 dstport 4789 local 172.20.20.11 nolearning
        - ip link set eth1 up master vrf-1777
        - ip addr add 169.254.0.0/31 dev eth1

  • the RR at 172.20.21.1 announces EVPN type-5 routes with the 12876:1777 RT (192.168.2.2/32 via 172.20.23.243)
  • the eBGP sessions at 169.254.0.1 all announce the 192.168.1.1/32 prefix, which is re-announced as an EVPN type-5 route with the 64699:1777 RT

Quite often (in the 10% range) the bring-up fails on the route-target import 12876:1777 statement. Either it is missing entirely, or it is ignored and the default 0:1777 RT is used instead, acting as a wildcard.

show run properly shows it in the conf:

# sh ru
-- snip --
router bgp 12876 vrf vrf-1777
-- snip --
 address-family l2vpn evpn
  advertise ipv4 unicast
  route-target import 12876:1777
  route-target export 64699:1777
 exit-address-family
exit

However, with the route-target import 12876:1777 line we expect only 192.168.2.2/32 to be imported, but it is not the case:

# sh ip route vrf vrf-1777
-- snip --
VRF vrf-1777:
C>* 169.254.0.0/31 is directly connected, eth1, 00:15:05
L>* 169.254.0.0/32 is directly connected, eth1, 00:15:05
B>* 192.168.1.1/32 [200/200] via 172.20.20.11, br-1777 onlink, weight 1, 00:15:00
  *                          via 172.20.20.12, br-1777 onlink, weight 1, 00:15:00
B>* 192.168.2.2/32 [200/100] via 172.20.23.243, br-1777 onlink, weight 1, 00:15:00

The expected output is:

sw01# sh ip route vrf vrf-1777
-- snip --
VRF vrf-vpc-1777:
C>* 169.254.0.0/31 is directly connected, eth1, 00:39:56
L>* 169.254.0.0/32 is directly connected, eth1, 00:39:56
B>* 192.168.1.1/32 [20/0] via 169.254.0.1, eth1, weight 1, 00:39:51
B>* 192.168.2.2/32 [200/200] via 172.20.23.243, br-1777 onlink, weight 1, 00:39:50

Inspecting bgp yields some... weird results:

# sh bgp l2vpn evpn vrf-import-rt
Route-target: 12876:1777
List of VRFs importing routes with this route-target:
  vrf-1777
Route-target: 0:1777
List of VRFs importing routes with this route-target:
  vrf-1777

0:1777 is the default RT created for auto-rt, which is supposed to be removed by the manual import RT. Correct FRR instances show the following:

# sh bgp l2vpn evpn vrf-import-rt
Route-target: 12876:1777
List of VRFs importing routes with this route-target:
  vrf-1777
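
To make the effect of the leftover 0:1777 entry concrete, here is a minimal sketch of the import decision as this report describes it. It is a simplified model and not FRR source code: the function name rt_imported is hypothetical, and treating an import RT whose first part is 0 as "match any admin part" is an assumption taken from the behavior observed above.

```shell
# Simplified model (an assumption, not FRR code): a route is imported into
# the VRF when its RT matches one of the VRF's import RTs. An import RT of
# the form 0:<vni> is modelled as a wildcard on the admin part.
# Usage: rt_imported <route-rt> <import-rt>...
rt_imported() {
    route_rt="$1"
    shift
    for import_rt in "$@"; do
        if [ "${import_rt%%:*}" = "0" ]; then
            # Wildcard entry: compare only the part after the colon.
            if [ "${route_rt#*:}" = "${import_rt#*:}" ]; then
                return 0
            fi
        elif [ "$route_rt" = "$import_rt" ]; then
            # Manually configured entry: both parts must match.
            return 0
        fi
    done
    return 1
}
```

Under this model, `rt_imported 64699:1777 12876:1777` fails while `rt_imported 64699:1777 12876:1777 0:1777` succeeds, which would explain why 192.168.1.1/32 shows up as an EVPN route on the faulty instances.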

After some discussion with Trey on Slack, we tried to bring the SVI down and back up on the kernel side:

ip link set vrf-1777 down
ip link set br-1777 down
ip link set vxlan-1777 down
ip link set vrf-1777 up
ip link set br-1777 up
ip link set vxlan-1777 up

This fixed the routing table shown by show ip route vrf vrf-1777, but sh bgp l2vpn evpn vrf-import-rt still shows the incorrect output. In other words, the routing table now matches my intent, but it no longer matches what vrf-import-rt reports. bgpd is now even more inconsistent.

Further discussion led to breaking it even more. We tried to unconfigure and reconfigure the import RT, which corrupted bgpd's state further:

sw01(config)# router bgp 12876 vrf vrf-1777
sw01(config-bgp)# address-family l2vpn evpn
sw01(config-router-af)# no route-target import 12876:1777
sw01(config-router-af)# do sh bgp l2vpn evpn vrf-import-rt
Route-target: 0:1777
List of VRFs importing routes with this route-target:
  vrf-1777
  vrf-1777
sw01(config-router-af)# route-target import 12876:1777
% RT specified already configured for this VRF: 12876:1777
sw01(config-router-af)# do sh bgp l2vpn evpn vrf-import-rt
Route-target: 0:1777
List of VRFs importing routes with this route-target:
  vrf-1777
  vrf-1777

(the double vrf-1777 outputs are not copy/paste artifacts, those were the actual vtysh outputs).

So in this state:

  • the same vrf is listed twice
  • 12876:1777 is no longer listed, yet bgpd still rejects it as already configured
  • the routing table was "correct" for 12876:1777 but incorrect for 0:1777 (which is what bgpd tells us it filters on)
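
The duplicated entry is easy to miss by eye, so here is a small sketch that flags it. The function name duplicate_vrf_entries is hypothetical, and the parsing assumes the output layout shown in this report (a "Route-target:" line, a "List of VRFs" line, then one VRF name per line).

```shell
# Reads `show bgp l2vpn evpn vrf-import-rt` output on stdin and prints
# "<rt> <vrf>" once for every VRF listed more than once under the same RT.
# Exit status: 0 when no duplicate is found, 1 otherwise.
duplicate_vrf_entries() {
    awk '
        /^Route-target: / { rt = $2; next }
        /^List of VRFs/   { next }
        NF == 1 {
            key = rt " " $1
            # Report a pair only the second time it is seen.
            if (seen[key]++ == 1) { print key; dup = 1 }
        }
        END { exit (dup ? 1 : 0) }
    '
}
```

It can be fed directly from vtysh, e.g. `vtysh -c 'show bgp l2vpn evpn vrf-import-rt' | duplicate_vrf_entries`.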

This is very likely linked to a race between netlink and vtysh, because when I change the containerlab exec section from

      exec:
        - ip link add vrf-1777 up type vrf table 1777
        - ip link add br-1777 up master vrf-1777 type bridge
        - ip link add vxlan-1777 up master br-1777 type vxlan id 1777 dstport 4789 local 172.20.20.11 nolearning

to

      exec:
        - ip link add vrf-1777 type vrf table 1777
        - ip link add br-1777 master vrf-1777 type bridge
        - ip link add vxlan-1777 master br-1777 type vxlan id 1777 dstport 4789 local 172.20.20.11 nolearning
        - ip link set vrf-1777 up
        - ip link set br-1777 up
        - ip link set vxlan-1777 up

the issue could not be reproduced in 20+ restarts of the lab (which has 6 nodes susceptible to the bug), whereas it happens every two to three restarts on average with the original commands.
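
For a templatized lab, the "create first, then bring up" ordering above can be generated per VNI. This is only a sketch: the function name svi_commands is hypothetical, and it assumes the naming used in this report (vrf-<vni>, br-<vni>, vxlan-<vni>) with the VRF table ID equal to the VNI. It only prints the commands and does not run them.

```shell
# Prints the command sequence for one L3VNI SVI: every device is created
# administratively down first, and only then brought up.
# Usage: svi_commands <vni> <local-vtep-ip>
svi_commands() {
    vni="$1"
    local_ip="$2"
    # Step 1: create the devices without the "up" keyword.
    echo "ip link add vrf-$vni type vrf table $vni"
    echo "ip link add br-$vni master vrf-$vni type bridge"
    echo "ip link add vxlan-$vni master br-$vni type vxlan id $vni dstport 4789 local $local_ip nolearning"
    # Step 2: bring them up in the same order.
    for dev in "vrf-$vni" "br-$vni" "vxlan-$vni"; do
        echo "ip link set $dev up"
    done
}
```

Running `svi_commands 1777 172.20.20.11` prints the six commands of the working exec section; the output can be pasted into the lab file or piped to `sh` as root.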

I did not dig into the code as I don't have time for this right now, but I hope to.

Thanks!

Version

sw01# show version
FRRouting 10.1.1_git (sw01) on Linux(6.11.5-arch1-1).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--prefix=/usr' '--sysconfdir=/etc' '--localstatedir=/var' '--sbindir=/usr/lib/frr' '--libdir=/usr/lib' '--enable-rpki' '--enable-vtysh' '--enable-multipath=64' '--enable-vty-group=frrvty' '--enable-user=frr' '--enable-group=frr' '--enable-pcre2posix' '--enable-scripting' 'CC=gcc' 'CXX=g++'

How to reproduce

Start FRR with the above config repeatedly, creating the SVI while FRR loads its configuration file. The eBGP peer can be skipped as it does not matter for the issue.

I am using containerlab for convenience, and it may have just the right timing to trigger the issue. A lab with 6 nodes triggers the issue every two to three restarts of the lab (containerlab deploy --reconfigure).
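
As a rough sanity check, the two rates quoted in this report are consistent with each other. The sketch below assumes that the "10% range" figure applies per node and per bring-up, and that nodes fail independently; both are assumptions, not measurements. The function name lab_failure_probability is hypothetical.

```shell
# Probability that at least one of n nodes hits the bug in a single lab
# restart, assuming each node fails independently with probability p.
# Usage: lab_failure_probability <p> <n>
lab_failure_probability() {
    LC_ALL=C awk -v p="$1" -v n="$2" 'BEGIN { printf "%.2f\n", 1 - (1 - p) ^ n }'
}

lab_failure_probability 0.10 6   # prints 0.47
```

With roughly a 47% chance per restart, the first failure is expected after about two restarts, in line with "every two to three restarts".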

Docker image: quay.io/frrouting/frr:10.1.1
Docker version: 27.3.1, build ce1223035a
Kernel version: 6.11.5-arch1-1
Containerlab version: version: 0.56.0, commit: b593b206

As this is a timing issue, the hardware used is important. I'm running this on a Ryzen 4650U with an NVMe drive.

Expected behavior

I expect my import rt to be the only one used:

sw02# sh bgp l2vpn evpn vrf-import-rt
Route-target: 12876:1777
List of VRFs importing routes with this route-target:
  vrf-1777

Actual behavior

My RT and the default "catch-all" RT are present:

sw01# sh bgp l2vpn evpn vrf-import-rt
Route-target: 12876:1777
List of VRFs importing routes with this route-target:
  vrf-1777
Route-target: 0:1777
List of VRFs importing routes with this route-target:
  vrf-1777

With no way of clearing the 0:1777 RT.
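
When restarting the lab many times, the difference between the expected and actual outputs can be checked automatically. A minimal sketch, assuming the output layout shown above; the function name unexpected_import_rts is hypothetical.

```shell
# Reads `show bgp l2vpn evpn vrf-import-rt` output on stdin and prints every
# import RT that differs from the expected one.
# Exit status: 0 when only the expected RT is present, 1 otherwise.
# Usage: unexpected_import_rts <expected-rt>
unexpected_import_rts() {
    awk -v want="$1" '
        /^Route-target: / {
            if ($2 != want) { print $2; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

For example, `vtysh -c 'show bgp l2vpn evpn vrf-import-rt' | unexpected_import_rts 12876:1777` prints 0:1777 and returns a non-zero status on an affected node.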

Additional context

This is not the first time I've had consistency issues with FRR when bringing up the parts of an SVI as I create them (ip link add ... up ...). Creating them first and bringing them up afterwards is much more robust.

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@Tuetuopay Tuetuopay added the triage Needs further investigation label Nov 5, 2024