all clients lose connection to management server #1943

Open
opened 2025-11-20 06:09:53 -05:00 by saavagebueno · 0 comments

Originally created by @scroguard on GitHub (Jun 7, 2025).

As of yesterday afternoon, all of my clients have lost connectivity to the management server. Every client shows the following error when attempting to connect:

2025-06-07T22:08:24Z WARN client/cmd/root.go:257: retrying Login to the Management service in 1.090533137s due to error rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
2025-06-07T22:08:25Z WARN client/cmd/root.go:257: retrying Login to the Management service in 1.56593389s due to error rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
2025-06-07T22:08:27Z WARN client/cmd/root.go:257: retrying Login to the Management service in 1.68838013s due to error rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
2025-06-07T22:08:29Z WARN client/cmd/root.go:257: retrying Login to the Management service in 1.777346067s due to error rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
2025-06-07T22:08:30Z WARN client/cmd/root.go:257: retrying Login to the Management service in 4.633737091s due to error rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
2025-06-07T22:08:35Z WARN client/cmd/root.go:257: retrying Login to the Management service in 10.758309078s due to error rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
2025-06-07T22:08:46Z WARN client/cmd/root.go:257: retrying Login to the Management service in 7.401033378s due to error rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
Error: login backoff cycle failed: rpc error: code = Unknown desc = getting device authorization flow info failed with error: failed while getting Management Service public key
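To check whether every retry fails for the same reason, the log lines above can be summarized with a short script. This is a minimal sketch, not part of NetBird: the function name `summarize_retries` and the regular expression are assumptions based only on the line format quoted in this report, and it only handles delays printed as plain seconds (for example `1.09s`).

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt in this report:
# "<timestamp> WARN <file:line:> retrying Login to the Management service in <N>s due to error <message>"
RETRY_LINE = re.compile(
    r"^(?P<ts>\S+) WARN \S+ retrying Login to the Management service "
    r"in (?P<delay>[0-9.]+)s due to error (?P<error>.+)$"
)


def summarize_retries(log_text):
    """Return (retry delays in seconds, Counter of distinct error messages)."""
    delays = []
    errors = Counter()
    for line in log_text.splitlines():
        match = RETRY_LINE.match(line.strip())
        if match is None:
            # Skips the final "Error:" line and anything that is not a retry line.
            continue
        delays.append(float(match.group("delay")))
        errors[match.group("error")] += 1
    return delays, errors
```

If the returned counter holds a single key, all retries hit the same failure (here, fetching the Management Service public key), which points at the server side rather than at individual clients.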

This is a self-hosted install that was set up with the 5-minute quickstart install guide. If I roll the management server VM back to a backup I have, every client comes back online for about an hour and then they all lose connection again.

To Reproduce

Steps to reproduce the behavior:

  1. Unknown. I cannot currently determine a cause for the issue.

Expected behavior
Clients should stay connected.

Are you using NetBird Cloud? - No. Self-hosted, set up using the quickstart guide.

NetBird version

The server is on the latest release, as are most clients. The remaining clients are a mixture of 0.43.2, 0.38.0, 0.39.2 and 0.37.1.

Is any other VPN software installed?

no

Debug output

To help us resolve the problem, please attach the following anonymized status output

netbird status -dA - this doesn't work, because netbird up will not connect.

Create and upload a debug bundle, and share the returned file key:

netbird debug for 1m -AS -U - not possible, because the current status is LoginFailed.

Have you tried these troubleshooting steps?

No firewall changes were made.
Tried restarting the NetBird management services.
Tried rebooting the NetBird management VM.
Tried rolling back to a known working backup; clients stayed online for only about an hour and then lost connection again.

saavagebueno added the triage-needed label 2025-11-20 06:09:53 -05:00
Reference: SVI/netbird#1943