Routing is not working properly when the Access Control groups have a group of Resources as destination #2145

Open
opened 2025-11-20 07:04:44 -05:00 by saavagebueno · 4 comments

Originally created by @janosmiko on GitHub (Aug 1, 2025).

Describe the problem, steps to reproduce

My network setup looks like this:

I have one network with two custom routes:

  • 10.1.0.0/16 through bastion-prod (prod vpc) - metric: 10
  • 10.2.0.0/16 through bastion-dev (dev vpc) - metric: 10

I also have two custom resources in the network (domain-based resources):

  • postgresql-prod.blabla.us-east-1.rds.amazonaws.com (Group: Endpoints)
  • postgresql-dev.blabla.us-east-1.rds.amazonaws.com (Group: Endpoints)

Without Access Control groups in the Network Routes I see the following routing:

$ sudo tcptraceroute postgresql-prod.blabla.us-east-1.rds.amazonaws.com 5432
Selected device utun100, address 100.108.197.2, port 51025 for outgoing packets
Tracing the path to postgresql-prod.blabla.us-east-1.rds.amazonaws.com (10.1.162.226) on TCP port 5432 (postgresql), 30 hops max
 1  bastion-prod.netbird.internal (100.108.114.218)  147.007 ms  146.808 ms  147.338 ms
 2  10.1.162.226 [open]  146.176 ms  149.223 ms  150.058 ms

$ sudo tcptraceroute postgresql-dev.blabla.us-east-1.rds.amazonaws.com 5432
Password:
Selected device utun100, address 100.108.197.2, port 50684 for outgoing packets
Tracing the path to postgresql-dev.blabla.us-east-1.rds.amazonaws.com (10.2.160.41) on TCP port 5432 (postgresql), 30 hops max
 1  bastion-dev.netbird.internal (100.108.251.207)  147.152 ms  144.849 ms  145.755 ms
 2  10.2.160.41 [open]  147.141 ms  149.986 ms  146.859 ms
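
As a quick sanity check of the mapping shown in these traces, the resolved addresses can be matched against the two routed CIDRs. This is a minimal sketch using Python's standard `ipaddress` module; the `ROUTES` table and the `expected_bastion` helper are illustrative names and not part of NetBird.

```python
import ipaddress
from typing import Optional

# Routed CIDRs and their routing peers, copied from the setup described above.
ROUTES = {
    "10.1.0.0/16": "bastion-prod",
    "10.2.0.0/16": "bastion-dev",
}


def expected_bastion(ip: str) -> Optional[str]:
    """Return the routing peer whose CIDR contains ip, or None if no route matches."""
    addr = ipaddress.ip_address(ip)
    for cidr, peer in ROUTES.items():
        if addr in ipaddress.ip_network(cidr):
            return peer
    return None


# Addresses taken from the tcptraceroute output above.
print(expected_bastion("10.1.162.226"))  # prod RDS address
print(expected_bastion("10.2.160.41"))   # dev RDS address
```

With the addresses from the traces, the prod address falls into the `bastion-prod` route and the dev address into the `bastion-dev` route, which matches the first hops seen when no Access Control groups are set.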

But after I add the Endpoints group to the Access Control groups on the Network Routes, the client tries to route the prod PostgreSQL traffic through the dev bastion, and the connection fails as follows:

$ sudo tcptraceroute postgresql-prod.blabla.us-east-1.rds.amazonaws.com 5432
Selected device utun100, address 100.108.197.2, port 51517 for outgoing packets
Tracing the path to postgresql-prod.blabla.us-east-1.rds.amazonaws.com (10.1.162.226) on TCP port 5432 (postgresql), 30 hops max
 1  bastion-dev.netbird.internal (100.108.251.207)  145.540 ms  146.024 ms  145.709 ms

Relevant policies:

  • Users to Bastions: My Users Group <-> Bastions Group (allowing all traffic) <- This one works (if no access control is set)
  • Users to Endpoints: My Users Group <-> Endpoints Group (allowing all traffic) <- FAILS
  • Users to Both: My Users Group <-> Bastions Group, Endpoints Group (allowing all traffic) <- FAILS

Steps to reproduce:

  • When I add the Endpoints Group to the Access Control groups of the Network Routes and try to use the second or third policy, traffic is no longer distributed between the two bastions: the prod resource is routed through the dev bastion.

Expected behavior

Routing should be the same as when no Access Control groups are set on the Network Routes.

Are you using NetBird Cloud?

Self-host NetBird

NetBird version

netbird management/signal/relay: v0.52.2
bastion clients: v0.52.2
macos desktop client: v0.52.2

Is any other VPN software installed?

Nope

Have you tried these troubleshooting steps?

  • [x] Reviewed client troubleshooting (https://docs.netbird.io/how-to/troubleshooting-client) (if applicable)
  • [x] Checked for newer NetBird versions
  • [x] Searched for similar issues on GitHub (including closed ones)
  • [x] Restarted the NetBird client
  • [x] Disabled other VPN software
  • [x] Checked firewall settings
saavagebueno added the triage-needed, config-issue, self-hosting labels 2025-11-20 07:04:44 -05:00

@janosmiko commented on GitHub (Aug 1, 2025):

Two additional interesting facts:

  • If I modify the metric for the bastions, the client uses the one with the lowest priority (and, naturally, the other resource then does not work).
  • It doesn't work even if I create dedicated Endpoint groups, or even if I grant access to the resources directly in the policies.

@nazarewk commented on GitHub (Aug 1, 2025):

Sounds like you're putting both domain-based Resources into a single Network with a shared set of Routing Peers.

I would advise you to create 2 Networks:

  1. Network 1: for 10.1.0.0/16 & postgresql-prod.blabla.us-east-1.rds.amazonaws.com - give it group1 (unless you don't want to)
  2. Network 2: for 10.2.0.0/16 & postgresql-dev.blabla.us-east-1.rds.amazonaws.com - give it group2
  3. create access policy from client to group1 and another to group2
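
The layout suggested in this list can be sketched as plain data to make the intended separation explicit. This is only an illustrative model, not NetBird's API or configuration format: the `Network` class, the `POLICIES` list and the `routing_peer_for` helper are hypothetical names, and assigning `bastion-prod` and `bastion-dev` to the two Networks is an assumption taken from the routes in the original report.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Network:
    name: str
    routing_peer: str      # assumption: one routing peer per Network
    resources: List[str]   # subnet and domain resources in this Network
    group: str             # group assigned to this Network's resources


NETWORKS = [
    Network("Network 1", "bastion-prod",
            ["10.1.0.0/16", "postgresql-prod.blabla.us-east-1.rds.amazonaws.com"],
            "group1"),
    Network("Network 2", "bastion-dev",
            ["10.2.0.0/16", "postgresql-dev.blabla.us-east-1.rds.amazonaws.com"],
            "group2"),
]

# One access policy from the client group to each resource group.
POLICIES = [("My Users Group", "group1"), ("My Users Group", "group2")]


def routing_peer_for(resource: str, source_group: str) -> Optional[str]:
    """Return the routing peer for a resource the source group may reach."""
    allowed = {dst for src, dst in POLICIES if src == source_group}
    for net in NETWORKS:
        if resource in net.resources and net.group in allowed:
            return net.routing_peer
    return None


print(routing_peer_for("postgresql-prod.blabla.us-east-1.rds.amazonaws.com",
                       "My Users Group"))
```

Because each resource lives in exactly one Network in this model, there is no shared pool of Routing Peers to choose from, so the prod resource can only resolve to the prod bastion.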

I'm not sure what kind of metric we are talking about, but the one in NetBird is not reflected in anything outside NetBird (such as the operating system route metric); it's only used for Routing Peer selection priority.
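
To illustrate why a shared set of Routing Peers can send both resources through one bastion, here is a toy model of metric-based peer selection. It is a sketch under stated assumptions and not NetBird's actual selection code: `select_routing_peer` is a hypothetical helper, a lower metric is assumed to mean higher priority, and the tie-break by peer name is arbitrary and chosen only to make the example deterministic.

```python
from typing import Dict


def select_routing_peer(peer_metrics: Dict[str, int]) -> str:
    """Pick one peer from a shared set.

    Assumption: the lowest metric wins. Ties are broken by peer name,
    which is an arbitrary choice for this sketch.
    """
    # sorted() fixes the iteration order; min() keeps the first peer
    # with the smallest metric.
    return min(sorted(peer_metrics), key=lambda peer: peer_metrics[peer])


# Both bastions in one shared set with equal metrics, as in the report.
shared = {"bastion-prod": 10, "bastion-dev": 10}
chosen = select_routing_peer(shared)

# The destination is not an input to the selection, so every resource
# in the Network ends up on the same peer in this model.
for resource in ("postgresql-prod", "postgresql-dev"):
    print(resource, "->", chosen)
```

In this model, changing a metric only changes which single peer is picked for all resources; it never splits traffic per destination, which is consistent with the observation that adjusting the metric makes one resource work and the other fail.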


@janosmiko commented on GitHub (Aug 1, 2025):

Thanks @nazarewk,

Let me try this out. You are right: the metric I mentioned is the routing priority.


@janosmiko commented on GitHub (Aug 1, 2025):

It looks like it works when I configure the whole 10.1.0.0/16 and 10.2.0.0/16 subnets as resources instead of domain-name-based resources. Thank you.

Reference: SVI/netbird#2145