API slows down after an average of 200 peers #1237

Closed
opened 2025-11-20 05:26:43 -05:00 by saavagebueno · 13 comments

Originally created by @ismail0234 on GitHub (Sep 12, 2024).

Describe the problem

The API slows down once about 200 peers are connected to the system. Beyond 500 peers it slows down considerably: each request takes more than 1-2 seconds.

In my test measurements, these were the API response times for different numbers of connected peers:

20 Peers: 200-300 ms
100 Peers: 300-600 ms
200 Peers: 500-1000 ms
500 Peers: 1500-3000 ms
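
As a rough reading of these numbers, the sketch below takes the midpoint of each reported range and computes how many milliseconds each additional peer adds between consecutive measurements. The use of midpoints is an assumption made for illustration; the ranges themselves are the only data reported.

```python
# Peer counts and the reported latency ranges in milliseconds.
measurements = [(20, 200, 300), (100, 300, 600), (200, 500, 1000), (500, 1500, 3000)]


def marginal_cost_per_peer(rows):
    """Return ms per added peer between consecutive rows, using range midpoints."""
    mids = [(peers, (low + high) / 2) for peers, low, high in rows]
    return [
        (m2 - m1) / (p2 - p1)
        for (p1, m1), (p2, m2) in zip(mids, mids[1:])
    ]


print(marginal_cost_per_peer(measurements))  # prints [2.5, 3.0, 5.0]
```

Under that assumption the per-peer cost rises from 2.5 ms to 5 ms, which suggests the slowdown is somewhat worse than linear in the number of peers.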

The JWT group sync and user group propagation features are disabled; I do not use them. As a workaround, I currently remove all disconnected peers from the network every hour to keep the API fast, which avoids the problem for now. As the number of peers in the network grows, API calls become very slow and some calls fail with an HTTP status of 0.

The API calls I use are as follows:

  1. “List all Peers”
  2. “List all Groups”

Since NetBird does not expose anything on the client side that identifies the ID of the user for API calls, I have to match on the NetBird IP. This means going through all peers to find the one with the matching IP.
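
A minimal sketch of that client-side matching, for illustration only. It assumes the "List all Peers" call is `GET /api/peers` with an `Authorization: Token ...` header and that each returned peer object carries `id` and `ip` fields; `fetch_peers` and `find_peer_by_ip` are hypothetical helper names, not part of NetBird.

```python
import json
import urllib.request


def fetch_peers(base_url, token):
    """Fetch the full peer list (endpoint path and auth header are assumptions)."""
    req = urllib.request.Request(
        f"{base_url}/api/peers",
        headers={"Accept": "application/json", "Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def find_peer_by_ip(peers, ip):
    """Return the first peer whose "ip" field equals ip, or None if none matches."""
    for peer in peers:
        if peer.get("ip") == ip:
            return peer
    return None
```

Every lookup still downloads the whole list, so the cost of finding one peer grows with the size of the network. That is the overhead a server-side filter would remove.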

To Reproduce

  1. Connect 500 peers to the system.
  2. List all peers through the API.

Expected behavior

  1. Faster retrieval of data from the API, or an option to filter the data instead of returning all of it.
  2. Instead of fetching all peers, it should be possible to look up a single peer with a method like “getPeerByIp” or “getPeerById”, so the desired peer can be found directly rather than by searching through all peers.

Are you using NetBird Cloud?

Self-Hosted

NetBird version

0.28.9

saavagebueno added the management-service and performance labels 2025-11-20 05:26:44 -05:00

@ismail0234 commented on GitHub (Sep 12, 2024):

@mlsmaycon https://github.com/netbirdio/netbird/issues/2566#issuecomment-2345688007

Isn't the solution you describe there about the dashboard? The problem I am reporting is with the REST API, so will that solution help?


@mlsmaycon commented on GitHub (Sep 12, 2024):

It would help eliminate the possibility of an issue with the IDP manager. The time you are seeing is unexpected for such small setups.


@ismail0234 commented on GitHub (Sep 12, 2024):

> It would help eliminate the possibility of an issue with the IDP manager. The time you are seeing is unexpected for such small setups.

I think I did what you said correctly... Now the user appears only as an ID.

![image](https://github.com/user-attachments/assets/1f27e1a8-644b-466f-b39b-0c64870e47cd)

@mlsmaycon commented on GitHub (Sep 12, 2024):

That seems right. How is the API performance?

BTW, just to confirm, you are running the management with the 0.29.1 version, right?


@ismail0234 commented on GitHub (Sep 12, 2024):

> That seems right. How is the API performance?
>
> BTW, just to confirm, you are running the management with the 0.29.1 version, right?

There are 20 peers in the network now. It will take a couple of days to reach 500. In the meantime, I will not remove the connected peers. We will know the performance after a few days. And yes, it is the latest version.

20 peers: 150-200 ms

@ismail0234 commented on GitHub (Sep 12, 2024):

![image](https://github.com/user-attachments/assets/77eef671-ac5c-407f-8677-9df391d4efc5)

@mlsmaycon commented on GitHub (Sep 13, 2024):

Down to 2/3 of what you had before. If you want to validate with some temporary peers, you can do the following:

  1. Create a reusable setup key with ephemeral peers enabled. Assign it a group that is not part of any ACL.
  2. Run the following command from a Unix-like system with NetBird installed:

```shell
# Log in 100 temporary test peers, each with its own config file and hostname.
for i in $(seq 1 100);
do
    echo "logging in peer test-peer-$i"
    netbird login --log-file console --config /tmp/peer-$i.json --hostname test-peer-$i --management-url https://<your management URL>
done
```

@ismail0234 commented on GitHub (Sep 13, 2024):

@mlsmaycon

With 500 peers I now get response times between 900 ms and 1200 ms, which is better than before, but I still think it is too high.

![image](https://github.com/user-attachments/assets/1874630a-1255-4f39-9206-31b1b4002413)

@ismail0234 commented on GitHub (Sep 13, 2024):

When I send 2-3 requests to the API at the same time, the total response time can exceed 3-4 seconds. If the number of peers grows much further, it will slow down even more.
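
One way to measure this kind of concurrent slowdown reproducibly is sketched below. `time_concurrent` is a hypothetical helper; `call` stands for any zero-argument function that performs one API request, for example a wrapper around the peer listing call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_concurrent(call, n):
    """Run call() n times in parallel; return (per-call seconds, total seconds)."""

    def timed():
        start = time.perf_counter()
        call()
        return time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(timed) for _ in range(n)]
        per_call = [f.result() for f in futures]
    return per_call, time.perf_counter() - start
```

If the per-call times with `n=3` are close to three times the single-request time, the requests are probably being served one after another rather than in parallel.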


@mlsmaycon commented on GitHub (Sep 13, 2024):

Can you run your server with trace logs?

You should see logs containing the following texts:

```
released write lock for
released read lock for
```

Can you share them with us?
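
To pull only those lines out of a large trace log, a small filter like the following may help. This is a sketch: `lock_release_lines` is a hypothetical helper, and it matches only the two phrases quoted above, without assuming anything about the rest of the log format.

```python
import re

# Matches the two phrases requested in this thread.
LOCK_RELEASE = re.compile(r"released (write|read) lock for")


def lock_release_lines(lines):
    """Return the lines that mention a released read or write lock."""
    return [line.rstrip("\n") for line in lines if LOCK_RELEASE.search(line)]
```

For example, after saving the management service output to a file, `lock_release_lines(open("management.log"))` would return just the matching lines, where `management.log` is an assumed file name.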

@ismail0234 commented on GitHub (Sep 13, 2024):

> Can you run your server with trace logs?
>
> You should see logs containing the following texts:
>
> ```
> released write lock for
> released read lock for
> ```
>
> Can you share them with us?

Do I need to do this on the Docker side or the client side? If it is on the Docker side, is it enough to add it to the dashboard.env file?

@mlsmaycon commented on GitHub (Sep 13, 2024):

This is on the docker side running the NetBird management service.

You can update the docker-compose.yml: in the management service block, change `"--log-level", "info",` to `"--log-level", "trace",`, then run:

```
docker compose up -d management
```

@ismail0234 commented on GitHub (Sep 13, 2024):

> This is on the docker side running the NetBird management service.
>
> You can update the docker-compose.yml: in the management service block, change `"--log-level", "info",` to `"--log-level", "trace",`, then run:
>
> ```
> docker compose up -d management
> ```

@mlsmaycon I downloaded the logs, but do they contain any private data? Is it OK to share them here?
Reference: SVI/netbird#1237