in the last weeks netbird randomly lost connection and not able to recover #1625

Open
opened 2025-11-20 06:03:40 -05:00 by saavagebueno · 29 comments

Originally created by @lfarkas on GitHub (Feb 14, 2025).

Originally assigned to: @pappz on GitHub.

Since v0.36.5 I am no longer able to connect to other peers. Sometimes restarting netbird solves the problem, sometimes not.
netbird status -d
shows the peers as connected, but not even a ping to their 100.76.x.x IP addresses works.
Here is part of the log:

2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-1.relay.netbird.io:443] relay/client/client.go:214: open connection to peer: sha-Dn9xgXi3A/4FEe90jhVUP/dkvMcxA59y/e7x0g3oZO4=
2025-02-14T19:48:03+01:00 INFO client/iface/wgproxy/ebpf/proxy.go:102: turn conn added to wg proxy store: rels://streamline-de-fra1-1.relay.netbird.io:443, endpoint port: :3
2025-02-14T19:48:03+01:00 INFO [peer: +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI=] client/internal/peer/conn.go:447: created new wgProxy for relay connection: 127.0.0.1:3
2025-02-14T19:48:03+01:00 INFO [peer: +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI=] client/internal/peer/wg_watcher.go:87: WireGuard watcher started
2025-02-14T19:48:03+01:00 INFO [peer: f+tmDAAoOYRUT/WAoJl0PsqalR4zJvt7ljkxZboO9iE=] client/internal/peer/conn.go:476: start to communicate with peer via relay
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/client.go:164: create new relay connection: local peerID: gsrpCbJwc8lkmNV783rxIHpyj+zZIhy/rFj5HsfVuBY=, local peer hashedID: sha-99JRJjv0PJBbfBPJzmU0KgWX+n3VVc6ezC48fcixQBE=
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/client.go:170: connecting to relay server
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/dialer/race_dialer.go:64: dialing Relay server via quic
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/dialer/race_dialer.go:64: dialing Relay server via WS
2025-02-14T19:48:03+01:00 INFO [peer: FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=] client/internal/peer/conn.go:476: start to communicate with peer via relay
2025-02-14T19:48:03+01:00 INFO client/internal/routemanager/client.go:210: New chosen route is co1co8bl0ubs739dfm90 with peer FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY= with score 19990.001000 for network [192.168.0.0/16]
2025-02-14T19:48:03+01:00 INFO [peer: +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI=] client/internal/peer/conn.go:476: start to communicate with peer via relay
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/dialer/race_dialer.go:89: successfully dialed via: WS
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/dialer/race_dialer.go:75: connection attempt aborted via: quic
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/client.go:186: relay connection established
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-0.relay.netbird.io:443] relay/client/client.go:214: open connection to peer: sha-d6bmxNpKji4X2AM4Syi/oXpY9FJ6J27RG3gTY9ONhdE=
2025-02-14T19:48:03+01:00 INFO client/iface/wgproxy/ebpf/proxy.go:102: turn conn added to wg proxy store: rels://streamline-de-fra1-0.relay.netbird.io:443, endpoint port: :4
2025-02-14T19:48:03+01:00 INFO [peer: hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] client/internal/peer/conn.go:447: created new wgProxy for relay connection: 127.0.0.1:4
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-1.relay.netbird.io:443] relay/client/client.go:214: open connection to peer: sha-Asv8+qhh3HsYQgPXy3cIzGzTjlTvEIoTND3nPoVZDgw=
2025-02-14T19:48:03+01:00 INFO client/iface/wgproxy/ebpf/proxy.go:102: turn conn added to wg proxy store: rels://streamline-de-fra1-1.relay.netbird.io:443, endpoint port: :5
2025-02-14T19:48:03+01:00 INFO [peer: RtObgAe/KslyFa/t0a/iGwy7HohRzO8xhNNUPIR1ri8=] client/internal/peer/conn.go:447: created new wgProxy for relay connection: 127.0.0.1:5
2025-02-14T19:48:03+01:00 INFO [peer: hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] client/internal/peer/wg_watcher.go:87: WireGuard watcher started
2025-02-14T19:48:03+01:00 INFO [peer: RtObgAe/KslyFa/t0a/iGwy7HohRzO8xhNNUPIR1ri8=] client/internal/peer/wg_watcher.go:87: WireGuard watcher started
2025-02-14T19:48:03+01:00 INFO [relay: rels://streamline-de-fra1-1.relay.netbird.io:443] relay/client/client.go:214: open connection to peer: sha-ULsX413ckuLILuPUeQ8liU9B86RCBgkvFP0SdhMWbUw=
2025-02-14T19:48:03+01:00 INFO client/iface/wgproxy/ebpf/proxy.go:102: turn conn added to wg proxy store: rels://streamline-de-fra1-1.relay.netbird.io:443, endpoint port: :6
2025-02-14T19:48:03+01:00 INFO [peer: 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE=] client/internal/peer/conn.go:447: created new wgProxy for relay connection: 127.0.0.1:6
2025-02-14T19:48:03+01:00 INFO [peer: 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE=] client/internal/peer/wg_watcher.go:87: WireGuard watcher started
2025-02-14T19:48:03+01:00 INFO [peer: RtObgAe/KslyFa/t0a/iGwy7HohRzO8xhNNUPIR1ri8=] client/internal/peer/conn.go:476: start to communicate with peer via relay
2025-02-14T19:48:03+01:00 INFO client/internal/routemanager/client.go:210: New chosen route is co1dqj3l0ubs739dfnsg with peer hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0= with score 49990.001000 for network [192.168.0.0/16]
2025-02-14T19:48:03+01:00 INFO [peer: hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] client/internal/peer/conn.go:476: start to communicate with peer via relay
2025-02-14T19:48:03+01:00 INFO [peer: 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE=] client/internal/peer/conn.go:476: start to communicate with peer via relay
2025-02-14T19:48:03+01:00 INFO client/internal/routemanager/client.go:210: New chosen route is co1kv3bl0ubs739dg130 with peer 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE= with score 0.001000 for network [10.20.0.0/24]
2025-02-14T19:48:03+01:00 INFO client/internal/routemanager/client.go:210: New chosen route is co1kuj3l0ubs739dg11g with peer 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE= with score 0.001000 for network [10.30.0.0/24]
2025-02-14T19:48:03+01:00 INFO [peer: +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI=] client/internal/peer/conn.go:328: set ICE to active connection
2025-02-14T19:48:03+01:00 INFO [peer: +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI=] client/internal/peer/wg_watcher.go:111: WireGuard watcher stopped
2025-02-14T19:48:03+01:00 INFO [peer: f+tmDAAoOYRUT/WAoJl0PsqalR4zJvt7ljkxZboO9iE=] client/internal/peer/conn.go:328: set ICE to active connection
2025-02-14T19:48:03+01:00 INFO [peer: f+tmDAAoOYRUT/WAoJl0PsqalR4zJvt7ljkxZboO9iE=] client/internal/peer/wg_watcher.go:111: WireGuard watcher stopped
2025-02-14T19:48:03+01:00 INFO [peer: FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=] client/internal/peer/conn.go:328: set ICE to active connection
2025-02-14T19:48:03+01:00 INFO [peer: FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=] client/internal/peer/wg_watcher.go:111: WireGuard watcher stopped
2025-02-14T19:48:04+01:00 INFO [peer: RtObgAe/KslyFa/t0a/iGwy7HohRzO8xhNNUPIR1ri8=] client/internal/peer/conn.go:328: set ICE to active connection
2025-02-14T19:48:04+01:00 INFO [peer: RtObgAe/KslyFa/t0a/iGwy7HohRzO8xhNNUPIR1ri8=] client/internal/peer/wg_watcher.go:111: WireGuard watcher stopped
2025-02-14T19:48:04+01:00 INFO [peer: 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE=] client/internal/peer/conn.go:328: set ICE to active connection
2025-02-14T19:48:04+01:00 INFO [peer: 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE=] client/internal/peer/wg_watcher.go:111: WireGuard watcher stopped
2025-02-14T19:48:05+01:00 INFO [peer: hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] client/internal/peer/conn.go:328: set ICE to active connection
2025-02-14T19:48:05+01:00 INFO [peer: hCDjKQBW9TBwsZigTRXxvVzpAYE+ZqDHBol4sOSUMl0=] client/internal/peer/wg_watcher.go:111: WireGuard watcher stopped
2025-02-14T19:48:05+01:00 INFO [peer: f+tmDAAoOYRUT/WAoJl0PsqalR4zJvt7ljkxZboO9iE=] client/internal/peer/guard/guard.go:84: start reconnect loop...
2025-02-14T19:48:05+01:00 INFO [peer: +i/q6dNa3AeF/iNJMH9+CbnsTLmFPfN+/K0KUPJI5wI=] client/internal/peer/guard/guard.go:84: start reconnect loop...
2025-02-14T19:48:05+01:00 INFO [peer: FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=] client/internal/peer/guard/guard.go:84: start reconnect loop...
2025-02-14T19:48:06+01:00 INFO [peer: Yg/JDeFsAfMnue9KOTNm77L0AlG1g3Y6pYIm3KhUxyw=] client/internal/peer/guard/guard.go:84: start reconnect loop...
2025-02-14T19:48:06+01:00 INFO [peer: 1u25Mrocd2aMv88fUgRnKmM1caynzX+bGTzThCZ3CnE=] client/internal/peer/guard/guard.go:84: start reconnect loop...
2025-02-14T19:48:06+01:00 INFO [peer: RtObgAe/KslyFa/t0a/iGwy7HohRzO8xhNNUPIR1ri8=] client/internal/peer/guard/guard.go:84: start reconnect loop...
2025-02-14T19:48:06+01:00 INFO [peer: Kc8hGcw4uOpvTwgvTste9cdhtPpmMLsZDeOYSITNGnk=] client/internal/peer/guard/guard.go:84: start reconnect loop...
2025-02-14T19:53:02+01:00 INFO client/internal/peer/guard/sr_watcher.go:94: network changes detected by ICE agent
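
The log above interleaves events for several peers, which makes it hard to see that each peer goes from "set ICE to active connection" to "start reconnect loop..." within a couple of seconds. A minimal sketch, assuming the log line layout shown in this thread (timestamp first, `[peer: <key>]` tag, `file.go:line:` before the message), regroups the lines per peer. The function name `peer_timeline` is hypothetical and not part of NetBird.

```shell
#!/bin/sh
# peer_timeline: read NetBird client log lines on stdin and print
# "<peer key> TAB <HH:MM:SS> TAB <message>", grouped by peer in order of
# first appearance. Lines without a [peer: ...] tag are skipped.
# Assumption: the log layout matches the excerpt in this thread.
peer_timeline() {
    awk '
        match($0, /\[peer: [^]]+\]/) {
            # strip the leading "[peer: " (7 chars) and the trailing "]"
            peer = substr($0, RSTART + 7, RLENGTH - 8)
            msg = $0
            # drop everything up to and including "file.go:NNN: "
            sub(/^.*\.go:[0-9]+: /, "", msg)
            # characters 12..19 of the ISO timestamp are HH:MM:SS
            line = peer "\t" substr($1, 12, 8) "\t" msg "\n"
            if (!(peer in lines)) order[++n] = peer
            lines[peer] = lines[peer] line
        }
        END { for (i = 1; i <= n; i++) printf "%s", lines[order[i]] }
    '
}
```

It could be used as `peer_timeline < client.log`, where `client.log` stands for wherever the client log was saved. Peers that only ever show "start reconnect loop..." never reached an active connection in the captured window.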
saavagebueno added the triage-needed, self-hosting labels 2025-11-20 06:03:40 -05:00

@lfarkas commented on GitHub (Feb 22, 2025):

The peer is connected but can't be pinged:

# ping fox
PING fox.netbird.cloud (100.76.171.201) 56(84) bytes of data.
^C
--- fox.netbird.cloud ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3109ms

from netbird status -d:

 fox.netbird.cloud:
  NetBird IP: 100.76.171.201
  Public key: FfiyZKMquYILabBxOquw/jXEuTjhBq6tUvBEPdV3ckY=
  Status: Connected
  -- detail --
  Connection type: P2P
  ICE candidate (Local/Remote): host/srflx
  ICE candidate endpoints (Local/Remote): 10.5.5.217:51820/185.199.30.141:14255
  Relay server address: rels://streamline-de-fra1-2.relay.netbird.io:443
  Last connection update: 18 minutes, 19 seconds ago
  Last WireGuard handshake: -
  Transfer status (received/sent) 12.1 KiB/20.3 KiB
  Quantum resistance: true
  Routes: -
  Networks: -
  Latency: 8.003627ms

OS: linux/amd64
Daemon version: 0.36.7
CLI version: 0.36.7
Management: Connected to https://api.netbird.io:443
Signal: Connected to https://signal.netbird.io:443
Relays: 
  [stun:stun.netbird.io:5555] is Available
  [turns:turn.netbird.io:443?transport=tcp] is Available
  [rels://streamline-de-fra1-2.relay.netbird.io:443] is Available
Nameservers: 
  [192.168.208.1:53] for [int.vidux.hu] is Unavailable, reason: 1 error occurred:
	* read udp 10.5.5.217:50996->192.168.208.1:53: i/o timeout
  [10.30.0.1:53] for [szeged.vidux.hu] is Available
FQDN: dell.netbird.cloud
NetBird IP: 100.76.111.32/16
Interface type: Kernel
Quantum resistance: true (permissive)
Routes: -
Networks: -
Peers count: 5/8 Connected
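
The status output above shows the symptom in compact form: the peer reports `Status: Connected` while `Last WireGuard handshake` is `-`. A minimal sketch, assuming the field labels and indentation of the `netbird status -d` output pasted in this thread (other client versions may format it differently), lists peers in that state. The function name `stale_peers` is hypothetical.

```shell
#!/bin/sh
# stale_peers: read `netbird status -d` text on stdin and print the names of
# peers that report "Status: Connected" but show no WireGuard handshake ("-").
# Assumption: peer blocks start with an indented "name:" line, as in this thread.
stale_peers() {
    awk '
        # start of a peer block, e.g. " fox.netbird.cloud:"
        /^ +[^ :]+: *$/ { name = $1; sub(/:$/, "", name); connected = 0; next }
        /^ +Status: Connected *$/ { connected = 1; next }
        /^ +Last WireGuard handshake: - *$/ {
            if (connected && name != "") print name
        }
    '
}
```

A possible invocation is `netbird status -d | stale_peers`. A freshly connected peer may briefly show no handshake as well, so a hit is a hint to retest with ping rather than proof of the bug.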

@mlsmaycon commented on GitHub (Feb 22, 2025):

@lfarkas, can you please run the following command while repeating the ping test?

netbird debug for 5m -S

Then please share the generated bundle file.


@lfarkas commented on GitHub (Feb 22, 2025):

[netbird.debug.201054929.zip](https://github.com/user-attachments/files/18921942/netbird.debug.201054929.zip)

@lfarkas commented on GitHub (Feb 22, 2025):

To be honest, this is a serious problem for us. In the last few months it has regularly happened that we could not access the work network from home, and someone had to restart the NetBird service in the internal network. Sometimes even that does not bring the connection back.


@nickz-LR commented on GitHub (Feb 26, 2025):

Hey just wanted to chime in that I'm having the same issue when deploying via Kubernetes, really keen on a fix for this.


@Kamaradeivanov commented on GitHub (Mar 13, 2025):

I'm experiencing the same issue. I deployed a self-hosted instance using the Helm chart from [totmicro/helms](https://github.com/totmicro/helms). There might be a problem with the relay server configuration, as my peers seem to disconnect after a while (they appear to lose connection with the relay server):

Relays: 
  [rels://vpn.my.domain:443/relay] is Unavailable, reason: relay connection is not established

On the relay server I see many of the following errors:

ERRO relay/server/relay.go:121: failed to handshake: validate sha-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx (x.x.x.x:yyyy): expired token

@lfarkas commented on GitHub (Mar 14, 2025):

We don't use a self-hosted version but the cloud version, and it still happens and is very annoying. When I can't access the remote site, there is no other way to restart netbird than to go to the site and reboot the machine or restart the netbird service.

Is there any progress on this?


@ugurtam commented on GitHub (Mar 14, 2025):

> We don't use a self-hosted version but the cloud version, and it still happens and is very annoying. When I can't access the remote site, there is no other way to restart netbird than to go to the site and reboot the machine or restart the netbird service.
>
> Is there any progress on this?

Try this: disable and then re-enable the policy. Please share the result here.

@lfarkas commented on GitHub (Mar 14, 2025):

What policy, and where and how?
Anyway, I sent you the whole debug log 3 weeks ago (see above). Did you look into that?


@lfarkas commented on GitHub (Mar 14, 2025):

I've got 7 connected peers; of those I can ping 5 and can't ping 2. There is only one policy in https://app.netbird.io/access-control, the default. If I disable and re-enable it, 2 peers still can't be pinged, but not the same 2 :-)


@lfarkas commented on GitHub (Mar 14, 2025):

And today I updated all clients to 0.38.0.


@lfarkas commented on GitHub (Mar 14, 2025):

After playing a bit with disabling and enabling the policy, I am sometimes able to access this critical peer (which is always online in the peer list) for a few minutes or seconds, but never longer than 5 minutes, and after that it no longer works. After another disable/enable it works again for a few minutes, but with this click I disconnect my whole NetBird network...


@mlsmaycon commented on GitHub (Mar 14, 2025):

@lfarkas, we will prepare a debugging version for you to try tomorrow, as it seems like the fixes from recent versions are not helping your case.


@lfarkas commented on GitHub (Mar 14, 2025):

x86_64 rpm please


@mlsmaycon commented on GitHub (Mar 16, 2025):

@lfarkas, you can download the packages from the link:

https://github.com/netbirdio/netbird/actions/runs/13881767994/artifacts/2759669965

This file contains the build artifacts for the PR: https://github.com/netbirdio/netbird/pull/3517. You will find the rpm installer there, too.

In case of an issue, please make sure that the agent has been running for at least 10 minutes, then generate a bundle with logs for analysis with the command:

netbird debug bundle -S

Also, please share which peers the node can't connect to.


@lfarkas commented on GitHub (Mar 16, 2025):

So I have to install it on one client and not all? And the other clients can stay on the normal 0.38 version?


@mlsmaycon commented on GitHub (Mar 16, 2025):

If you can install it on all affected clients, that will increase our chances of getting helpful logs.


@mlsmaycon commented on GitHub (Mar 16, 2025):

@lfarkas, the last build had the potential to cause a panic. You can use this one instead: https://github.com/netbirdio/netbird/actions/runs/13884417374/artifacts/2760236240


@lfarkas commented on GitHub (Mar 16, 2025):

Both of these contain the same commit ID: netbird_0.38.1-SNAPSHOT-9c4fdec9_linux_amd64.rpm
Anyway, before I installed it I couldn't ping 100.76.121.209 (whose status is Connected).
After I installed it, ping started to work; after about 5 minutes it stopped working again.
After this I stopped the normal systemd service, and while I was looking into which command to start, ping in the other window started working again, and it turned out something had started the netbird service!? I stopped it again with:
systemctl stop netbird.service
and about a minute later ping worked again and netbird was running again!? Why can't I stop it?
After a
systemctl disable --now netbird.service
it still starts itself in about a minute. Is there any way I can stop it???
Anyway, if I'm fast enough:
root@wolf:~# systemctl stop netbird.service ;netbird debug bundle -S
Job for netbird.service canceled.
/tmp/netbird.debug.1526428740.zip
I hope I can run it in test mode.
My local NetBird IP is: 100.76.24.179

The remote client's:
NetBird IP: 100.76.121.209
Public key: f+tmDAAoOYRUT/WAoJl0PsqalR4zJvt7ljkxZboO9iE=

And of course when I start it in this mode ping works, but after 183 packets it stops working again.
Here is the debug output (I installed this rpm only on my local client; if you need it on the remote client too, let me know).
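
To pin down when the tunnel stops passing traffic ("after 183 packets" above), the ping output itself can be summarised. A minimal sketch, assuming Linux iputils `ping` output where each reply line contains `bytes from` and `icmp_seq=N`; the function name `last_reply` is hypothetical.

```shell
#!/bin/sh
# last_reply: read ping output on stdin and report how many echo replies
# arrived and the icmp_seq of the last one.
# Assumption: reply lines look like
#   64 bytes from 100.76.121.209: icmp_seq=183 ttl=64 time=8.00 ms
last_reply() {
    awk '
        /bytes from/ && match($0, /icmp_seq=[0-9]+/) {
            # "icmp_seq=" is 9 characters long; keep only the digits
            seq = substr($0, RSTART + 9, RLENGTH - 9)
            replies++
        }
        END {
            printf "replies=%d last_seq=%s\n", replies, (replies ? seq : "none")
        }
    '
}
```

For example, save the output with `ping 100.76.121.209 | tee ping.log`, stop it once replies cease, and run `last_reply < ping.log`. With one packet per second, the last sequence number approximates how many seconds the connection stayed usable, which can then be matched against timestamps in the debug bundle.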


@lfarkas commented on GitHub (Mar 16, 2025):

[netbird.debug.1526428740.zip](https://github.com/user-attachments/files/19273532/netbird.debug.1526428740.zip)

@lfarkas commented on GitHub (Mar 16, 2025):

but i don't know whether it's valid output or not, since this command returns immediately:

root@wolf:~# systemctl stop netbird.service ;netbird debug bundle -S
Job for netbird.service canceled.
/tmp/netbird.debug.1526428740.zip

@mlsmaycon commented on GitHub (Mar 16, 2025):

@lfarkas, sorry, I didn't get why you tried to stop the agent. The agent should be running and failing when getting the bundle.


@lfarkas commented on GitHub (Mar 16, 2025):

ok but the agent is ALWAYS running since it's not possible to turn it off. imho it's a problem.

here is another dump (taken while ping is not working; i'm sure that if i restart the service it will work again for a few minutes):
[netbird.debug.3319656448.zip](https://github.com/user-attachments/files/19273996/netbird.debug.3319656448.zip)

is there anything i can do?

@pappz commented on GitHub (Mar 16, 2025):

@lfarkas Szia!
Can we schedule a call to go through some details?


@mlsmaycon commented on GitHub (Apr 17, 2025):

@lfarkas can you confirm whether the issue persists with the latest version and Rosenpass?


@lfarkas commented on GitHub (Apr 17, 2025):

to be honest i'd rather not test it, at least not before easter. if i reconfigure my vpn settings to use rosenpass and it still doesn't work, i'll no longer be able to access my office network (which has happened before) and there is no way to recover from this state...
maybe after easter...


@baldy2811 commented on GitHub (May 1, 2025):

Hi there,

I am having exactly the same issue. After a reinstall it works for a couple of minutes and afterwards it stops working.
After enabling/disabling the policy I get this message:

client/internal/peer/handshaker.go:79: wait for remote offer confirmation on both servers.

I am running the current version, 0.43.1, on Debian.


@baldy2811 commented on GitHub (May 1, 2025):

Hi,

please forget what i said.

Chain fail2ban-SIP (1 references)
target     prot opt source               destination
REJECT     all  --  100.114.165.225      anywhere             reject-with icmp-port-unreachable
REJECT     all  --  100.114.188.68       anywhere             reject-with icmp-port-unreachable
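
The listing above appears to show fail2ban rejecting traffic from overlay addresses, which would explain the lost connectivity in this case. As a minimal sketch for spotting the same situation elsewhere, the snippet below scans `iptables -L` style text for REJECT/DROP rules whose source falls inside the overlay range. The function name is hypothetical, and the range 100.64.0.0/10 is an assumption based on the 100.76.x.x and 100.114.x.x addresses seen in this thread.

```python
import ipaddress

# Assumption: peers get addresses from the CGNAT block 100.64.0.0/10,
# which covers the 100.76.x.x and 100.114.x.x addresses in this thread.
OVERLAY_NET = ipaddress.ip_network("100.64.0.0/10")


def banned_overlay_ips(iptables_listing):
    """Return sources of REJECT/DROP rules that fall inside OVERLAY_NET."""
    hits = []
    for line in iptables_listing.splitlines():
        fields = line.split()
        # Rule rows look like: target prot opt source destination ...
        if len(fields) < 5 or fields[0] not in ("REJECT", "DROP"):
            continue
        try:
            source = ipaddress.ip_network(fields[3], strict=False)
        except ValueError:
            continue  # e.g. "anywhere" or a resolved hostname
        if source.version == OVERLAY_NET.version and source.subnet_of(OVERLAY_NET):
            hits.append(fields[3])
    return hits


# The listing quoted in the comment above.
sample = """\
Chain fail2ban-SIP (1 references)
target     prot opt source               destination
REJECT     all  --  100.114.165.225      anywhere             reject-with icmp-port-unreachable
REJECT     all  --  100.114.188.68       anywhere             reject-with icmp-port-unreachable
"""
print(banned_overlay_ips(sample))
```

Running `iptables -L -n` rather than plain `-L` keeps addresses numeric, so the source column parses reliably.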


@Markovich01 commented on GitHub (Jun 3, 2025):

Hi all,
I just stumbled across this issue and wondered if I would be able to help out future people as I also had similar issues. I wrote about this in my [comment](https://github.com/netbirdio/netbird/issues/3852#issuecomment-2917419716) on #3852.

We were having random disconnects after we enabled Rosenpass, so to test this theory I disabled Rosenpass across all peers and set a pre-shared key instead. The random disconnects completely stopped after this.

Therefore I would suggest disabling quantum resistance in the hope that doing so will enable your peers to remain connected.

Reference: SVI/netbird#1625