* Ensure route settlement on iOS before handling DNS responses to prevent bypassing the tunnel.
* add more logs
* rollback debug changes
* rollback changes
* [client] Improve logging and add comments for iOS route settlement logic
- Switch iOS route settlement log level from Debug to Trace for finer control.
- Add clarifying comments for `waitForRouteSettlement` on non-iOS platforms.
---------
Co-authored-by: mlsmaycon <mlsmaycon@gmail.com>
* [client] Batch macOS DNS domains across multiple scutil keys to avoid truncation
scutil has undocumented limits: 99-element cap on d.add arrays and ~2048
byte value buffer for SupplementalMatchDomains. Users with 60+ domains
hit silent domain loss. This applies the same batching approach used on
Windows (nrptMaxDomainsPerRule=50), splitting domains into indexed
resolver keys (NetBird-Match-0, NetBird-Match-1, etc.) with 50-element
and 1500-byte limits per key.
* check for all keys on getRemovableKeysWithDefaults
* use multi error
* Refactor WG endpoint setup with role-based proxy activation
For relay connections, the controller (initiator) now activates the
wgProxy before configuring the WG endpoint, while the non-controller
(responder) configures the endpoint first with a delayed update, then
activates the proxy after. This prevents the responder from sending
traffic through the proxy before WireGuard is ready to receive it,
avoiding handshake congestion when both sides try to initiate
simultaneously.
For ICE connections, pass hasRelayBackup as the setEndpointNow flag
so the responder sets the endpoint immediately when a relay fallback
exists (avoiding the delayed update path since relay is already
available as backup).
On ICE disconnect with relay fallback, remove the duplicate
wgProxyRelay.Work() calls — the relay proxy is already active from
initial setup, so re-activating it is unnecessary.
In EndpointUpdater, split ConfigureWGEndpoint into explicit
configureAsInitiator and configureAsResponder paths, and add the
setEndpointNow parameter to let the caller control whether the
responder applies the endpoint immediately or defers it. Add unused
SwitchWGEndpoint and RemoveEndpointAddress methods. Remove the
wgConfigWorkaround sleep from the relay setup path.
* Fix redundant wgProxyRelay.Work() call during relay fallback setup
* Simplify WireGuard endpoint configuration by removing unused parameters and redundant logic
Set ContinueOnConnectorFailure: true in the embedded Dex config so that the Management server starts successfully even when an external IdP connector is unreachable at boot time.
* Fix WebSocket support by implementing Hijacker interface
Add responsewriter.PassthroughWriter to preserve optional HTTP interfaces
(Hijacker, Flusher, Pusher) when wrapping http.ResponseWriter in middleware.
Without this delegation:
- WebSocket connections fail (can't hijack the connection)
- Streaming breaks (can't flush buffers)
- HTTP/2 push doesn't work
* Add HijackTracker to manage hijacked connections during graceful shutdown
* Refactor HijackTracker to use middleware for tracking hijacked connections
* Refactor server handler chain setup for improved readability and maintainability
When an ICE connection disconnects and falls back to relay, reset the
WireGuard endpoint and handshake watcher if the remote peer's ICE session
has changed. This ensures the controller re-establishes a fresh WireGuard
handshake rather than waiting on a stale endpoint from the previous session.
* Optimize Windows DNS performance with domain batching and batch mode
Implement two-layer optimization to reduce Windows NRPT registry operations:
1. Domain Batching (host_windows.go):
- Batch domains per NRPT
- Reduces NRPT rules by ~97% (e.g., 184 domains: 184 rules → 4 rules)
- Modified addDNSMatchPolicy() to create batched NRPT entries
- Added comprehensive tests in host_windows_test.go
2. Batch Mode (server.go):
- Added BeginBatch/EndBatch methods to defer DNS updates
- Modified RegisterHandler/DeregisterHandler to skip applyHostConfig in batch mode
- Protected all applyHostConfig() calls with batch mode checks
- Updated route manager to wrap route operations with batch calls
* Update tests
* Fix log line
* Fix NRPT rule index to ensure cleanup covers partially created rules
* Ensure NRPT entry count updates even on errors to improve cleanup reliability
* Switch DNS batch mode logging from Info to Debug level
* Fix batch mode to not suppress critical DNS config updates
Batch mode should only defer applyHostConfig() for RegisterHandler/
DeregisterHandler operations. Management updates and upstream nameserver
failures (deactivate/reactivate callbacks) need immediate DNS config
updates regardless of batch mode to ensure timely failover.
Without this fix, if a nameserver goes down during a route update,
the system DNS config won't be updated until EndBatch(), potentially
delaying failover by several seconds.
Or if you prefer a shorter version:
Fix batch mode to allow immediate DNS updates for critical paths
Batch mode now only affects RegisterHandler/DeregisterHandler.
Management updates and nameserver failures always trigger immediate
DNS config updates to ensure timely failover.
* Add DNS batch cancellation to rollback partial changes on errors
Introduces CancelBatch() method to the DNS server interface to handle error
scenarios during batch operations. When route updates fail partway through, the DNS
server can now discard accumulated changes instead of applying partial state. This
prevents leaving the DNS configuration in an inconsistent state when route manager
operations encounter errors.
The changes add error-aware batch handling to prevent partial DNS configuration
updates when route operations fail, which improves system reliability.
Bug Fixes
Network and DNS updates now defer service and reverse-proxy reloads until after account updates complete, preventing inconsistent proxy state and race conditions.
Chores
Removed automatic peer/broadcast updates immediately following bulk service reloads.
Tests
Added a test ensuring network-range changes complete without deadlock.
* Fix race condition and ensure correct message ordering in
connection establishment
Reorder operations in OpenConn to register the connection before
waiting for peer availability. This ensures:
- Connection is ready to receive messages before peer subscription
completes
- Transport messages and onconnected events maintain proper ordering
- No messages are lost during the connection establishment window
- Concurrent OpenConn calls cannot create duplicate connections
If peer availability check fails, the pre-registered connection is
properly cleaned up.
* Handle service shutdown during relay connection initialization
Ensure relay connections are properly cleaned up when the service is not running by verifying `serviceIsRunning` and removing stale entries from `c.conns` to prevent unintended behaviors.
* Refactor relay client Conn/connContainer ownership and decouple Conn from Client
Conn previously held a direct *Client pointer and called client methods
(writeTo, closeConn, LocalAddr) directly, creating a tight bidirectional
coupling. The message channel was also created externally in OpenConn and
shared between Conn and connContainer with unclear ownership.
Now connContainer fully owns the lifecycle of both the channel and the
Conn it wraps:
- connContainer creates the channel (sized by connChannelSize const)
and the Conn internally via newConnContainer
- connContainer feeds messages into the channel (writeMsg), closes and
drains it on shutdown (close)
- Conn reads from the channel (Read) but never closes it
Conn is decoupled from *Client by replacing the *Client field with
three function closures (writeFn, closeFn, localAddrFn) that are wired
by newConnContainer at construction time. Write, Close, and LocalAddr
delegate to these closures. This removes the direct dependency while
keeping the identity-check logic: writeTo and closeConn now compare
connContainer pointers instead of Conn pointers to verify the caller
is the current active connection for that peer.
* Unified NetBird combined server (Management, Signal, Relay, STUN) as a single executable with richer YAML configuration, validation, and defaults.
* Official Dockerfile/image for single-container deployment.
* Optional in-process profiling endpoint for diagnostics.
* Multiplexing to route HTTP/gRPC/WebSocket traffic via one port; runtime hooks to inject custom handlers.
* **Chores**
* Updated deployment scripts, compose files, and reverse-proxy templates to target the combined server; added example configs and getting-started updates.
* Fix race condition and ensure correct message ordering in
connection establishment
Reorder operations in OpenConn to register the connection before
waiting for peer availability. This ensures:
- Connection is ready to receive messages before peer subscription
completes
- Transport messages and onconnected events maintain proper ordering
- No messages are lost during the connection establishment window
- Concurrent OpenConn calls cannot create duplicate connections
If peer availability check fails, the pre-registered connection is
properly cleaned up.
* Handle service shutdown during relay connection initialization
Ensure relay connections are properly cleaned up when the service is not running by verifying `serviceIsRunning` and removing stale entries from `c.conns` to prevent unintended behaviors.
* Add gRPC update debouncing mechanism
Implements backpressure handling for peer network map updates to
efficiently handle rapid changes. First update is sent immediately,
subsequent rapid updates are coalesced, ensuring only the latest
update is sent after a 1-second quiet period.
* Enhance unit test to verify peer count synchronization with debouncing and timeout handling
* Debounce based on type
* Refactor test to validate timer restart after pending update dispatch
* Simplify timer reset for Go 1.23+ automatic channel draining
Remove manual channel drain in resetTimer() since Go 1.23+ automatically
drains the timer channel when Stop() returns false, making the
select-case pattern unnecessary.
- Add WireguardPort option to embed.Options for custom port configuration
- Fix KernelInterface detection to account for netstack mode
- Skip SSH config updates when running in netstack mode
- Skip interface removal wait when running in netstack mode
- Use BindListener for netstack to avoid port conflicts on same host