Route with HA routing peers group broken with relay functionality #1408

Open
opened 2025-11-20 05:29:47 -05:00 by saavagebueno · 0 comments
Originally created by @saule1508 on GitHub (Nov 8, 2024).

Describe the problem

When I use relay, the routes on the client continuously flip between the routing peers, and routing does not work. Without relay it works. In the log I see messages like these for each route:

2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
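
To put a number on the flapping, the log lines above can be counted with a small script. This is a minimal sketch, not part of NetBird: the regular expression is an assumption derived only from the four lines pasted above, and `count_route_switches` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Pattern derived from the "New chosen route" lines pasted in this issue.
# It is an assumption about the log format, not taken from NetBird's source.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) INFO \S+ New chosen route is (?P<route>\S+) "
    r"with peer (?P<peer>\S+) with score (?P<score>[0-9.]+) "
    r"for network \[(?P<network>[^\]]+)\]"
)


def count_route_switches(lines):
    """Return {network: number of times the chosen peer changed}."""
    last_peer = {}
    switches = defaultdict(int)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore unrelated log lines
        network, peer = m["network"], m["peer"]
        if network in last_peer and last_peer[network] != peer:
            switches[network] += 1
        last_peer[network] = peer
    return dict(switches)


# The two alternating log lines from this issue, rebuilt from shared parts.
PREFIX = ("2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: "
          "New chosen route is ")
SUFFIX = " with score 0.000000 for network [10.XXX.167.0/24]"
LINE_A = (PREFIX + "csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer "
          "lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA=" + SUFFIX)
LINE_B = (PREFIX + "csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer "
          "3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA=" + SUFFIX)
SAMPLE = [LINE_A, LINE_B, LINE_A, LINE_B]

print(count_route_switches(SAMPLE))
```

On the four sample lines this reports three switches for 10.XXX.167.0/24, all within the same second. To run it against a real log, pass the open file object instead of `SAMPLE`.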

To Reproduce

  1. Install NetBird 0.30.3 on the server and enable the relay functionality.
  2. Install NetBird 0.30.3 on two routing peers (RHEL 9).
  3. Install NetBird 0.31.0 on the client peer (Fedora 41 or Rocky Linux 9); the issue also occurs with 0.29.1.
  4. In client.log on the client, the routes continuously flap between the two routing peers.

Because this looks similar to the issue linked here, I also tried client 0.29.1, with the same result. https://github.com/netbirdio/netbird/issues/2575

When I don't use relay it works: the routes have a score (and a latency) and they do not flap.

Expected behavior

The routes should stay available on the client instead of being removed and re-added all the time.

Are you using NetBird Cloud?

self-hosted

NetBird version

0.30.3 on the server and 0.31.0 on the client

NetBird status -dA output:

The output shows only two routes for the peer, but there should be many more.

```text
[root@testpeerclient ~]# netbird status -dA
Peers detail:
nbrpeer0101a3.anon-qkAva.domain:
NetBird IP: 100.74.183.10
Public key: 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://netbird.offnet.sandbox.apac.anon-1jqeA.domain:443/relay
Last connection update: 4 seconds ago
Last WireGuard handshake: 4 seconds ago
Transfer status (received/sent) 92 B/180 B
Quantum resistance: false
Routes: -
Latency: 0s

nbrpeer0101a1.anon-qkAva.domain:
NetBird IP: 100.74.199.179
Public key: lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA=
Status: Connected
-- detail --
Connection type: Relayed
ICE candidate (Local/Remote): -/-
ICE candidate endpoints (Local/Remote): -/-
Relay server address: rels://netbird.XXXX.anon-1jqeA.domain:443/relay
Last connection update: 5 seconds ago
Last WireGuard handshake: 5 seconds ago
Transfer status (received/sent) 92 B/276 B
Quantum resistance: false
Routes: 10.146.253.0/24, 10.146.39.0/24
Latency: 0s

OS: linux/amd64
Daemon version: 0.31.0
CLI version: 0.31.0
Management: Connected to https://netbird.XXXX.anon-1jqeA.domain:443
Signal: Connected to https://netbird.XXXX.anon-1jqeA.domain:443
Relays:
[stun:stun.l.anon-ndGdW.domain:19302] is Available
[turns:coturn.XXXX.apac.anon-1jqeA.domain:443?transport=tcp] is Available
[rels://netbird.XXXX.anon-1jqeA.domain:443/relay] is Available
Nameservers:
[10.XXX.35.147:53] for [.] is Available
[10.XXX.35.84:53] for [.] is Available
[10.XXX.34.214:53] for [.] is Unavailable, reason: 1 error occurred:
* read udp 10.152.0.6:49063->10.XXX.34.214:53: i/o timeout
FQDN: testpeerclient.anon-qkAva.domain
NetBird IP: 100.74.140.188/16
Interface type: Kernel
Quantum resistance: false
Routes: -
Peers count: 2/2 Connected
```
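
For anyone comparing several clients, the per-peer fields of the status output above can be pulled out with a short script. This is a rough sketch under stated assumptions: the parsing rules (a peer block starts at a line ending in `:` with no value, and the peer section ends at the `OS:` line) are inferred only from the text pasted in this issue, and `parse_peers` and `relayed_zero_latency` are hypothetical helper names, not NetBird APIs.

```python
def parse_peers(status_text):
    """Parse the 'Peers detail:' section into {peer_name: {field: value}}.

    Assumes the layout of the output pasted in this issue; other NetBird
    versions may format the text differently.
    """
    peers = {}
    current = None
    in_peers = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line == "Peers detail:":
            in_peers = True
            continue
        if not in_peers or not line:
            continue
        if line.startswith("OS:"):
            break  # start of the daemon summary, peer section is over
        key, sep, value = line.partition(":")
        if not sep:
            continue  # lines without a colon, e.g. "-- detail --"
        value = value.strip()
        if not value:
            current = peers.setdefault(key, {})  # "<fqdn>:" opens a peer block
        elif current is not None:
            current[key.strip()] = value
    return peers


def relayed_zero_latency(peers):
    """Names of peers that are relayed and report a latency of 0s."""
    return [
        name
        for name, fields in peers.items()
        if fields.get("Connection type") == "Relayed"
        and fields.get("Latency") == "0s"
    ]


# Trimmed excerpt of the status output from this issue.
SAMPLE_STATUS = """\
Peers detail:
nbrpeer0101a3.anon-qkAva.domain:
NetBird IP: 100.74.183.10
Status: Connected
-- detail --
Connection type: Relayed
Routes: -
Latency: 0s

nbrpeer0101a1.anon-qkAva.domain:
NetBird IP: 100.74.199.179
Status: Connected
-- detail --
Connection type: Relayed
Routes: 10.146.253.0/24, 10.146.39.0/24
Latency: 0s

OS: linux/amd64
Daemon version: 0.31.0
"""

peers = parse_peers(SAMPLE_STATUS)
print(relayed_zero_latency(peers))
```

On the excerpt, both routing peers are flagged as relayed with a latency of 0s, which matches the 0.000000 score seen in the route selection log lines.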
Do you face any (non-mobile) client issues?

Yes, the client runs Linux.
I can send the debug log if useful; in client.log, this is what I see for one of the routes:

```
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8o1b0kn140rrppog with peer lDMoPxKvgzTA7uBP7+smggRoqhseYoSK0LKbq4gTQyA= with score 0.000000 for network [10.XXX.167.0/24]
2024-11-08T22:26:17Z INFO client/internal/routemanager/client.go:171: New chosen route is csc820pb0kn6h96umhvg:cqor8r1b0kn140rrppp0 with peer 3c185SdFwbQW0VvZkyEUZCF8D2QKg+inBnqP3T2ObSA= with score 0.000000 for network [10.XXX.167.0/24]
```

saavagebueno added the triage-needed label 2025-11-20 05:29:47 -05:00
Reference: SVI/netbird#1408