Kubernetes, CoreDNS: Problem with DNS resolving of custom cluster domain name #964

Open
opened 2025-11-20 05:20:44 -05:00 by saavagebueno · 3 comments
Owner

Originally created by @vaceslav on GitHub (Jun 10, 2024).

Description:
I'm experiencing an issue with NetBird cloud when trying to access custom domain names for Kubernetes services.

Setup:

  • Kubernetes cluster based on k3s with CoreDNS.
  • Internal domain name: cluster.local.
  • Services are accessible under: service-name.namespace.svc.cluster.local.
  • DNS server IP: 10.43.0.10.

For testing, I deployed an NGINX service, accessible at: nginx-service.default.svc.cluster.local.

A NetBird peer is deployed inside the Kubernetes cluster, and the connection is established.

NetBird Configuration:

  • DNS Nameserver: 10.43.0.10
  • Match Domains: svc.cluster.local

With this setup, I can access the NGINX service from my local computer without issues.

Problem:
I have multiple clusters and want to access all of them via NetBird. To tell them apart by name, I added a rewrite rule (https://coredns.io/plugins/rewrite/#name-field-rewrites) in CoreDNS:

CoreDNS Config
.:53 {
    errors
    health
    ready
    rewrite name substring dev.compute.local svc.cluster.local
    kubernetes cluster.local in-addr.arpa ip6.arpa {
        pods insecure
        fallthrough in-addr.arpa ip6.arpa
    }
    hosts /etc/coredns/NodeHosts {
        ttl 60
        reload 15s
        fallthrough
    }
    prometheus :9153
    forward . /etc/resolv.conf
    cache 30
    loop
    reload
    loadbalance
    import /etc/coredns/custom/*.override
}
import /etc/coredns/custom/*.server

This makes the service available under both:

  • nginx-service.default.svc.cluster.local
  • nginx-service.default.dev.compute.local
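
As a rough illustration of what the rewrite rule does to an incoming query name, here is a minimal Python sketch. It models `rewrite name substring` as a plain string replacement, which is a simplification. The function name `rewrite_substring` is made up for this example and is not part of CoreDNS.

```python
def rewrite_substring(qname: str, old: str, new: str) -> str:
    """Simplified model of a substring name rewrite.

    Replaces every occurrence of `old` in the query name with `new`.
    This only mimics the effect on the name; it is not CoreDNS code.
    """
    return qname.replace(old, new)


if __name__ == "__main__":
    # The custom name from this report is mapped onto the cluster-internal name.
    print(
        rewrite_substring(
            "nginx-service.default.dev.compute.local",
            "dev.compute.local",
            "svc.cluster.local",
        )
    )
```

Names that do not contain `dev.compute.local` pass through unchanged, which is why the original `svc.cluster.local` name keeps working.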

I updated the DNS configuration in the NetBird admin UI to include dev.compute.local as a second match domain.

Issue:
The domain nginx-service.default.dev.compute.local is not reachable from my local computer.

Environment:

  • OS: macOS
  • WireGuard client installed, but disconnected and quit.

Commands Output:

scutil --dns
DNS configuration

resolver #1
  search domain[0] : netbird.cloud
  nameserver[0] : 172.18.0.1
  if_index : 14 (en0)
  flags    : Request A records
  reach    : 0x00020002 (Reachable,Directly Reachable Address)

resolver #2
  domain   : netbird.cloud
  nameserver[0] : 100.121.255.254
  port     : 53
  flags    : Supplemental, Request A records
  reach    : 0x00000002 (Reachable)
  order    : 101600

resolver #3
  domain   : svc.cluster.local
  nameserver[0] : 100.121.255.254
  port     : 53
  flags    : Supplemental, Request A records
  reach    : 0x00000002 (Reachable)
  order    : 102401

resolver #4
  domain   : dev.compute.local
  nameserver[0] : 100.121.255.254
  port     : 53
  flags    : Supplemental, Request A records
  reach    : 0x00000002 (Reachable)
  order    : 102400

resolver #5
  domain   : local
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300000

resolver #6
  domain   : 254.169.in-addr.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300200

resolver #7
  domain   : 8.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300400

resolver #8
  domain   : 9.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300600

resolver #9
  domain   : a.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 300800

resolver #10
  domain   : b.e.f.ip6.arpa
  options  : mdns
  timeout  : 5
  flags    : Request A records
  reach    : 0x00000000 (Not Reachable)
  order    : 301000

DNS configuration (for scoped queries)

resolver #1
  nameserver[0] : 172.18.0.1
  if_index : 14 (en0)
  flags    : Scoped, Request A records
  reach    : 0x00020002 (Reachable,Directly Reachable Address)
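
To read the resolver list above, it can help to model which supplemental resolver a given name would be routed to. The sketch below assumes that macOS picks the resolver whose `domain` is the longest suffix match for the query name; this is a simplification that ignores the `order` field. The `RESOLVERS` table is copied from the output above, and `pick_resolver` is a hypothetical helper, not a macOS API.

```python
# (domain, nameserver) pairs taken from the supplemental resolvers listed above.
# "mdns" stands in for the multicast DNS resolver, which has no nameserver entry.
RESOLVERS = [
    ("netbird.cloud", "100.121.255.254"),
    ("svc.cluster.local", "100.121.255.254"),
    ("dev.compute.local", "100.121.255.254"),
    ("local", "mdns"),
]


def pick_resolver(qname, resolvers=RESOLVERS):
    """Return the (domain, nameserver) pair with the longest matching suffix, or None."""
    labels = qname.rstrip(".").lower().split(".")
    best = None
    for domain, server in resolvers:
        domain_labels = domain.lower().split(".")
        # A resolver matches when its domain labels are the tail of the query labels.
        if labels[-len(domain_labels):] == domain_labels:
            if best is None or len(domain_labels) > len(best[0].split(".")):
                best = (domain, server)
    return best


if __name__ == "__main__":
    print(pick_resolver("nginx-service.default.dev.compute.local"))
    print(pick_resolver("nginx-service.default.svc.cluster.local"))
```

Under that assumption, both names are routed to 100.121.255.254 rather than to the `local` mDNS resolver, so resolver selection alone does not explain the difference between them.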

dscacheutil -q host -a name nginx-service.default.svc.cluster.local
name: nginx-service.default.svc.cluster.local
ip_address: 10.43.140.140

dscacheutil -q host -a name nginx-service.default.dev.compute.local

dig @10.43.0.10 nginx-service.default.svc.cluster.local

; <<>> DiG 9.10.6 <<>> @10.43.0.10 nginx-service.default.svc.cluster.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46857
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nginx-service.default.svc.cluster.local. IN A

;; ANSWER SECTION:
nginx-service.default.svc.cluster.local. 5 IN A	10.43.140.140

;; Query time: 331 msec
;; SERVER: 10.43.0.10#53(10.43.0.10)
;; WHEN: Mon Jun 10 18:29:51 CEST 2024
;; MSG SIZE  rcvd: 123

dig @10.43.0.10 nginx-service.default.dev.compute.local

; <<>> DiG 9.10.6 <<>> @10.43.0.10 nginx-service.default.dev.compute.local
; (1 server found)
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 56850
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nginx-service.default.dev.compute.local. IN A

;; ANSWER SECTION:
nginx-service.default.svc.cluster.local. 5 IN A	10.43.140.140

;; Query time: 1514 msec
;; SERVER: 10.43.0.10#53(10.43.0.10)
;; WHEN: Mon Jun 10 18:31:20 CEST 2024
;; MSG SIZE  rcvd: 123

So you can see that a direct query to the nameserver returns the correct address for both names.
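
One detail in the second dig output may be worth checking: the question asks for the `dev.compute.local` name, but the answer record is owned by the `svc.cluster.local` name. Whether this mismatch is what causes the failure on the client is an assumption and is not confirmed anywhere in this thread. The CoreDNS rewrite plugin documentation has a section on response rewrites that covers rewriting answer names. The Python sketch below only extracts the two names from dig output so they can be compared; `question_and_answer_names` and `SAMPLE` are illustrative names.

```python
# Excerpt of the second dig output from this report (whitespace normalised).
SAMPLE = """\
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;nginx-service.default.dev.compute.local. IN A

;; ANSWER SECTION:
nginx-service.default.svc.cluster.local. 5 IN A 10.43.140.140

;; Query time: 1514 msec
"""


def question_and_answer_names(dig_output):
    """Return (question_name, [answer_owner_names]) parsed from dig's text output."""
    question = None
    answers = []
    section = None
    for raw in dig_output.splitlines():
        line = raw.strip()
        if line.startswith(";; QUESTION SECTION"):
            section = "question"
        elif line.startswith(";; ANSWER SECTION"):
            section = "answer"
        elif not line or line.startswith(";;"):
            # A blank line or any other ';;' header ends the current section.
            section = None
        elif section == "question" and line.startswith(";"):
            # Question lines are prefixed with a single ';'.
            question = line[1:].split()[0]
        elif section == "answer":
            # The first field of an answer record is its owner name.
            answers.append(line.split()[0])
    return question, answers


if __name__ == "__main__":
    question, answers = question_and_answer_names(SAMPLE)
    print("question:", question)
    print("answers: ", answers)
    print("owner name matches question:", question in answers)
```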

Expected Behavior:
The address nginx-service.default.dev.compute.local should be accessible from my local computer.

saavagebueno added the client, waiting-feedback, dns labels 2025-11-20 05:20:44 -05:00
Author
Owner

@vaceslav commented on GitHub (Jun 11, 2024):

UPDATE
It looks like a NetBird + macOS issue. On Windows everything works:
the domain nginx-service.default.dev.compute.local resolves and ping works.

Any suggestions?

Author
Owner

@nazarewk commented on GitHub (Apr 23, 2025):

@vaceslav is it still an issue with the latest NetBird version? We have fixed many DNS issues since then.

Author
Owner

@vaceslav commented on GitHub (Apr 23, 2025):

Hi @nazarewk, unfortunately I cannot test it anymore. Because it didn't work, I switched to another tool.


Reference: SVI/netbird#964