Config Option to Tell Netbird to Exit When Tunnel Becomes Invalid #492

Open
opened 2025-11-20 05:12:18 -05:00 by saavagebueno · 0 comments
Owner

Originally created by @othocaes on GitHub (Oct 30, 2023).

Is your feature request related to a problem? Please describe.
We are running netbird containers in an HA setup to spin up tunnels on demand. The flow is like this:

  • serverless function is called which wants to connect to a backend system only available through some remote peer
    • it connects to a docker engine node through our keepalived virtual IP
      • it checks if there is a docker container associated with the remote peer already running
        • if not, call another function to create a new netbird container with a tunnel open to the remote peer
      • return IP address of netbird container with the tunnel to the remote peer
    • using collected information, connect to the remote peer using the netbird container associated with that peer
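The lookup-or-create step in the flow above can be sketched roughly as follows. This is a minimal illustration, not our actual implementation: `TunnelContainer`, `get_or_create_tunnel`, `find_running`, and `create` are hypothetical names, and the Docker and netbird calls are left as injected callables because the details of those calls are not part of this request.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TunnelContainer:
    """Hypothetical record for a netbird container (names are assumptions)."""
    peer: str          # remote peer this container holds a tunnel to
    ip_address: str    # address the function connects through


def get_or_create_tunnel(
    peer: str,
    find_running: Callable[[str], Optional[TunnelContainer]],
    create: Callable[[str], TunnelContainer],
) -> str:
    """Return the IP of a container tunnelled to `peer`, creating one if needed.

    `find_running` stands in for the query against the docker engine node;
    `create` stands in for the function that starts a new netbird container.
    """
    container = find_running(peer)
    if container is None:
        # No running container for this peer: start one and keep it online.
        container = create(peer)
    return container.ip_address
```

Note that this only works cleanly if a container with an invalid tunnel is no longer running, which is the behavior requested below.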

The idea is that each time the function is called, it checks whether a container with a valid tunnel already exists. If one does, the function uses it; if not, it creates a new one and leaves it online as long as it remains valid.

By keeping the containers alive as long as they are valid, we are trying to avoid creating a large number of peers, which would happen if we generated a new peer for each function call.

During our testing, we found that the netbird process inside a docker container does report that the peer network is invalid when we revoke a key, but the process does not exit. We didn't see an option for this in the netbird configuration.

Describe the solution you'd like
We would like the option for the netbird process to exit if the key it's using is revoked, or the tunnel otherwise becomes invalid. This would cause any container with an invalid tunnel to automatically die, making management of the system described above very simple.

Describe alternatives you've considered
We've considered writing a docker container which wraps the netbird process in a command that monitors the tunnel status and exits (thus killing the container) when the tunnel becomes invalid. This solution is OK, but it is something of an anti-pattern with docker and requires extra maintenance: in our view it's best to use the docker process status functionality directly.
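For reference, the wrapper alternative would look something like the sketch below. It is an assumption-laden illustration: `supervise`, `child_cmd`, and `tunnel_is_valid` are hypothetical names, and how the tunnel's validity is actually detected (for example by inspecting netbird's status output or logs) is deliberately left to the caller, since netbird does not currently expose it as an exit condition.

```python
import subprocess
import sys
import time
from typing import Callable, Sequence


def supervise(
    child_cmd: Sequence[str],
    tunnel_is_valid: Callable[[], bool],
    poll_seconds: float = 5.0,
) -> int:
    """Run `child_cmd`; stop it and return 1 once the tunnel is invalid.

    If the child exits on its own, its exit code is returned instead.
    `tunnel_is_valid` is a caller-supplied check (an assumption here).
    """
    child = subprocess.Popen(child_cmd)
    try:
        while True:
            code = child.poll()
            if code is not None:
                return code            # child exited by itself
            if not tunnel_is_valid():
                return 1               # cleanup happens in `finally`
            time.sleep(poll_seconds)
    finally:
        if child.poll() is None:
            child.terminate()          # ask the child to stop
            try:
                child.wait(timeout=10)
            except subprocess.TimeoutExpired:
                child.kill()           # force it if it does not comply
                child.wait()
```

A container entrypoint would pass the result to `sys.exit(...)` so that docker sees the container as stopped. This is exactly the extra moving part we would prefer not to maintain.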

We also considered creating a docker container based on the netbird container that runs netbird in the background and changes the entrypoint to be the command we run to connect to the remote host behind the tunnel. This solution would generate a new peer for each function call, which we are trying to avoid.

We also considered writing a validation step when the function checks for an existing container, to see if the tunnel is valid. This has the same problem as the first alternative: it requires extra maintenance and is something of an anti-pattern with docker.

Additional context
Hopefully this is pretty clear. We have netbird peers deployed in many DCs, and we want to send commands through them using serverless functions. The foundational issue is that these DCs may have conflicting RFC 1918 networks, so we need to spin up isolated, point-to-point peered networks between these peers rather than leaving all the connections open in one place, where the conflicts would cause routing issues. Essentially, we're trying to create an HA outbound operator that can connect on demand to any netbird peer within the domain. It would leave those tunnels open and reuse them where possible, while recovering automatically from key migrations or revoke-and-regenerate events by relying on the containers shutting themselves down after those events.

saavagebueno added the feature-request label 2025-11-20 05:12:18 -05:00
Reference: SVI/netbird#492