DNS Name Resolution of internal hostnames works only a few hours #937

Closed
opened 2025-11-20 05:20:14 -05:00 by saavagebueno · 7 comments
Owner

Originally created by @jogrie on GitHub (May 28, 2024).

Describe the problem

Hi there,

I want to use NetBird to connect my homelab to my cloud server.

I am using internal domain names and have problems resolving them.

The setup looks like this:

cloud server -> netbird connection -> home server -> netbird route -> dns server

On the home server I have ntfy.sh running so I can send myself notifications, e.g. when a backup is done.

My ACLs look like this:

cloud server <-> home server - allow ping
cloud server -> home server - allow http / https

My network routes look like this:

name: dns
dst: 192.168.178.105/32
router: home server

name: reverse proxy
dst: 192.168.178.100/32 (home server IP)
router: home server

My DNS config for the cloud server:
192.168.178.105
All Domains

Both servers run Debian 12.
At the moment the connection is only used once a night to send a notification for a finished backup.

Right after connecting the cloud server to the home server, DNS resolution works fine.
I can ping the home server using the domain name ntfy.home.example.com, and

curl -d "Test" ntfy.home.example.com/test
works fine.

Now the problem:

After a few hours or days, DNS resolution stops working.

If I ping the IP 192.168.178.100, it works fine.
If I ping the hostname, I get "unknown hostname".

It seems like DNS resolution is gone.

netbird status -d still reports the nameserver as Available (see below).
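
The symptom described above (IP reachable, hostname not) can be tracked over time with a small probe, so the moment resolution breaks is recorded. The following is only a sketch: the function names `classify_probe` and `probe_once` are made up for illustration, and it assumes a Linux host with `getent` and the iputils `ping` (where `-W` is a timeout in seconds).

```shell
# Hypothetical helper: turn two probe results into one verdict.
# $1 = exit status of the name lookup, $2 = exit status of the IP ping.
classify_probe() {
  if [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
    echo "ok"
  elif [ "$2" -eq 0 ]; then
    # IP answers but the name does not resolve: the case from this report.
    echo "dns-only-failure"
  elif [ "$1" -eq 0 ]; then
    echo "ip-unreachable"
  else
    echo "both-failed"
  fi
}

# Run one probe and print a timestamped line, e.g. from cron.
# $1 = hostname to resolve, $2 = IP address to ping.
probe_once() {
  if getent hosts "$1" >/dev/null 2>&1; then dns_rc=0; else dns_rc=1; fi
  if ping -c 1 -W 2 "$2" >/dev/null 2>&1; then ip_rc=0; else ip_rc=1; fi
  echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) $1 $(classify_probe "$dns_rc" "$ip_rc")"
}
```

With the values from this report, a call would look like `probe_once ntfy.home.example.com 192.168.178.100 >> probe.log`. The first line that reads `dns-only-failure` narrows down when resolution stopped.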

To Reproduce

Steps to reproduce the behavior:

see above

Expected behavior

I expected DNS resolution to keep working for more than a few hours.

Are you using NetBird Cloud?

Yes

NetBird version

0.27.3 on both sides

NetBird status -d output:

netbird status -d
Peers detail:
 homeserver.netbird.cloud:
  NetBird IP: 100.74.168.78
  Public key: 9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY=
  Status: Connected
  -- detail --
  Connection type: P2P
  Direct: true
  ICE candidate (Local/Remote): host/prflx
  ICE candidate endpoints (Local/Remote): XXX.XXX.XXX.XXX:51820/XXX.XXX.XXX.XXX:51820
  Last connection update: 2024-05-26 21:26:03
  Last WireGuard handshake: 2024-05-28 21:12:06
  Transfer status (received/sent) 366.8 KiB/247.9 KiB
  Quantum resistance: false
  Routes: 192.168.178.100/32, 192.168.178.105/32
  Latency: 11.045915ms

Daemon version: 0.27.3
CLI version: 0.27.3
Management: Connected to https://api.netbird.io:443
Signal: Connected to https://signal.netbird.io:443
Relays: 
  [stun:stun.netbird.io:5555] is Available
  [turns:turn.netbird.io:443?transport=tcp] is Available
Nameservers: 
  [192.168.178.105:53] for [.] is Available
FQDN: cloudserver.netbird.cloud
NetBird IP: 100.74.138.188/16
Interface type: Kernel
Quantum resistance: false
Routes: -
Peers count: 1/1 Connected
saavagebueno added the bug, client, dns, cloud labels 2025-11-20 05:20:14 -05:00

@bcmmbaga commented on GitHub (May 29, 2024):

Hello @jogrie, could you update to version 0.27.10, test it again, and check whether you still encounter the same issue?

@jogrie commented on GitHub (May 30, 2024):

Hi,
I just updated to version 0.27.10.
I will give you an update in a few days if the error occurs again.

@jogrie commented on GitHub (May 31, 2024):

Hi,
yesterday I updated to version 0.27.10.

At first DNS resolution worked fine, but at night there was no notification.

Same error as described above.
I can ping the machine by its internal IP, but name resolution is not working.

Here is the client.log from the cloud server:

2024-05-30T12:01:25+02:00 WARN [error: read udp 100.74.138.188:54773->192.168.247.105:53: i/o timeout, upstream: 192.168.247.105:53] client/internal/dns/upstream.go:101: got an error while connecting to upstream
2024-05-30T12:01:25+02:00 ERRO client/internal/dns/upstream.go:133: all queries to the upstream nameservers failed with timeout
2024-05-30T12:01:30+02:00 WARN [error: read udp 100.74.138.188:34181->192.168.247.105:53: i/o timeout, upstream: 192.168.247.105:53] client/internal/dns/upstream.go:101: got an error while connecting to upstream
2024-05-30T12:01:30+02:00 ERRO client/internal/dns/upstream.go:133: all queries to the upstream nameservers failed with timeout
2024-05-30T15:37:29+02:00 WARN management/client/grpc.go:162: disconnected from the Management service but will retry silently. Reason: rpc error: code = Internal desc = stream terminated by RST_STREAM with error code: INTERNAL_ERROR
2024-05-30T15:37:30+02:00 INFO management/client/grpc.go:147: connected to the Management Service stream
2024-05-30T15:37:30+02:00 INFO client/internal/acl/manager.go:52: ACL rules processed in: 4.510397ms, total rules count: 7
2024-05-30T15:44:20+02:00 WARN [error: read udp 100.74.138.188:44916->192.168.247.105:53: i/o timeout, upstream: 192.168.247.105:53] client/internal/dns/upstream.go:101: got an error while connecting to upstream
2024-05-30T15:44:20+02:00 ERRO client/internal/dns/upstream.go:133: all queries to the upstream nameservers failed with timeout
2024-05-30T15:44:25+02:00 WARN [error: read udp 100.74.138.188:33462->192.168.247.105:53: i/o timeout, upstream: 192.168.247.105:53] client/internal/dns/upstream.go:101: got an error while connecting to upstream
2024-05-30T15:44:25+02:00 ERRO client/internal/dns/upstream.go:133: all queries to the upstream nameservers failed with timeout
2024-05-30T17:33:09+02:00 WARN signal/client/grpc.go:171: disconnected from the Signal service but will retry silently. Reason: rpc error: code = Unavailable desc = closing transport due to: connection error: desc = "error reading from server: EOF", received prior goaway: code: NO_ERROR, debug data: "server_shutting_down"
2024-05-30T17:33:10+02:00 INFO signal/client/grpc.go:158: connected to the Signal Service stream
2024-05-30T18:10:07+02:00 WARN [upstream: 192.168.247.105:53, error: read udp 100.74.138.188:57893->192.168.247.105:53: i/o timeout] client/internal/dns/upstream.go:101: got an error while connecting to upstream
2024-05-30T18:10:07+02:00 ERRO client/internal/dns/upstream.go:133: all queries to the upstream nameservers failed with timeout
2024-05-30T18:10:12+02:00 WARN [error: read udp 100.74.138.188:50975->192.168.247.105:53: i/o timeout, upstream: 192.168.247.105:53] client/internal/dns/upstream.go:101: got an error while connecting to upstream
2024-05-30T18:10:12+02:00 ERRO client/internal/dns/upstream.go:133: all queries to the upstream nameservers failed with timeout
2024-05-30T18:10:17+02:00 WARN [upstream: 192.168.247.105:53, error: read udp 100.74.138.188:37623->192.168.247.105:53: i/o timeout] client/internal/dns/upstream.go:101: got an error while connecting to upstream
2024-05-30T18:10:17+02:00 ERRO client/internal/dns/upstream.go:133: all queries to the upstream nameservers failed with timeout
2024-05-30T23:00:55+02:00 WARN signal/client/grpc.go:171: disconnected from the Signal service but will retry silently. Reason: rpc error: code = Internal desc = server closed the stream without sending trailers
2024-05-30T23:00:59+02:00 INFO signal/client/grpc.go:158: connected to the Signal Service stream
2024-05-31T03:09:46+02:00 WARN client/internal/routemanager/client.go:154: the network 192.168.247.105/32 has not been assigned a routing peer as no peers from the list [9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY=] are currently connected
2024-05-31T03:09:46+02:00 WARN client/internal/routemanager/client.go:154: the network 192.168.247.100/32 has not been assigned a routing peer as no peers from the list [9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY=] are currently connected
2024-05-31T03:09:48+02:00 ERRO client/internal/peer/conn.go:630: failed signaling candidate to the remote peer 9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY= no connection to signal
2024-05-31T03:09:48+02:00 ERRO client/internal/peer/conn.go:630: failed signaling candidate to the remote peer 9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY= no connection to signal
2024-05-31T03:09:49+02:00 INFO client/internal/peer/conn.go:388: connected to peer 9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY=, endpoint address: 87.186.125.102:51820
2024-05-31T03:09:49+02:00 INFO client/internal/routemanager/client.go:165: new chosen route is cn0aq4bl0ubs73elq6s0:cmpojjqfic3c73eq9mtg with peer 9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY= with score 2.988673 for network 192.168.247.105/32
2024-05-31T03:09:49+02:00 INFO client/internal/routemanager/client.go:165: new chosen route is cmpp0cqfic3c73eq9ni0:cmpojjqfic3c73eq9mtg with peer 9d/Y7q0AmiJ70OdL7CRGuiP5NkOk3SFOevkDTHqdjyY= with score 2.988673 for network 192.168.247.100/32
2024-05-31T03:37:30+02:00 WARN management/client/grpc.go:162: disconnected from the Management service but will retry silently. Reason: rpc error: code = Internal desc = server closed the stream without sending trailers
2024-05-31T03:37:31+02:00 INFO management/client/grpc.go:147: connected to the Management Service stream
2024-05-31T03:37:31+02:00 INFO client/internal/acl/manager.go:52: ACL rules processed in: 9.845559ms, total rules count: 7
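
To see when the upstream DNS timeouts cluster, a log like the one above can be summarised per hour. This is a rough sketch, not a NetBird tool: `count_dns_timeouts` is a made-up name, and it only relies on the timestamp being the first field and on the exact error text shown in the log.

```shell
# Count "all queries ... failed with timeout" lines per hour.
# $1 = path to a client.log-style file.
# Prints one "YYYY-MM-DDTHH count" line per hour, sorted by time.
count_dns_timeouts() {
  awk '/all queries to the upstream nameservers failed with timeout/ {
         hour = substr($1, 1, 13)   # e.g. 2024-05-30T12
         count[hour]++
       }
       END { for (h in count) print h, count[h] }' "$1" | sort
}
```

Run against the excerpt above, this would report 2 failures in the 12:00 hour, 2 in the 15:00 hour and 3 in the 18:00 hour of 2024-05-30. The log path is an assumption to adjust, for example `count_dns_timeouts /var/log/netbird/client.log`.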

@arthur-trt commented on GitHub (Jun 4, 2024):

I had this problem on Linux because of NetworkManager, which rewrites /etc/resolv.conf multiple times per day.
You could try disabling it: https://askubuntu.com/a/1140591
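
Whether something keeps rewriting /etc/resolv.conf can be checked without disabling anything, by recording each change to the nameserver list. This is a minimal sketch under assumptions: the function names and the state-file approach are invented for illustration, and it only looks at `nameserver` lines.

```shell
# Print the nameserver addresses from a resolv.conf-style file, one per line.
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "$1"
}

# Print a timestamped line only when the nameserver list differs from
# the one stored in the state file, then update the state file.
# $1 = resolv.conf path, $2 = state file path (hypothetical location).
record_if_changed() {
  current="$(list_nameservers "$1" | tr '\n' ' ')"
  previous=""
  if [ -f "$2" ]; then previous="$(cat "$2")"; fi
  if [ "$current" != "$previous" ]; then
    printf '%s' "$current" > "$2"
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) nameservers now: $current"
  fi
}
```

Called every few minutes, e.g. `record_if_changed /etc/resolv.conf /var/tmp/resolv.state >> resolv-changes.log`, it leaves a timeline that can be compared with the time the hostname stops resolving.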

@jogrie commented on GitHub (Jun 5, 2024):

Hi Arthur,

Thanks for your suggestion.
But it seems that NetworkManager is not active:

systemctl status NetworkManager
Unit NetworkManager.service could not be found

@nazarewk commented on GitHub (Apr 23, 2025):

@jogrie were you able to resolve your issue?

I suspect an operating-system or LAN router configuration issue independent of NetBird, but just in case, did you try upgrading to the latest NetBird version to check if it helps? I have observed issues like this myself when my DHCP/DDNS was not automatically renewing leases.

@jogrie commented on GitHub (May 2, 2025):

@nazarewk I upgraded a few times and it didn't help.

But switching to NixOS fixed my problem.

Reference: SVI/netbird#937