kworker/0:0-wg-c pegs CPU at 100% utilisation on exit node each time a resource makes a web request through a client. #1514

Open
opened 2025-11-20 05:32:01 -05:00 by saavagebueno · 0 comments

Originally created by @rihards-simanovics on GitHub (Dec 26, 2024).

Hi Netbird team, I'm sorry, but I can't provide any logs this time as it takes too long to censor them and prep for debugging. Instead, I'm happy to email the full system dump from the exit node so you can review the information.

The long and short of the issue is that since version 0.30.0, all exit nodes appear to have the same problem: a kworker/0:0-wg-c thread (where 0:0 could be any core or thread on the 2-core processor) generates spikes of, or sometimes prolonged periods of, 99% CPU utilisation.

This doesn't seem to be an issue on version 0.29.4 and below. Below are a bunch of btop++ screenshots of the issue. The problem appears to manifest regardless of which client version the resource uses. On the exit node, you can also see occasional spikes in memory utilisation that look like memory leaks. They appear to get detected and the process gets killed, which is why the memory is released immediately. One such leak killed my btop++ session, which generated a crash dump.
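For anyone who would rather compare numbers than screenshots, here is a minimal sketch that samples per-thread CPU time for the kworker threads from `/proc` on a Linux exit node. It assumes that the truncated `-wg-c` suffix comes from a WireGuard workqueue, which is why it matches on `wg-`; that filter, the one-second interval, and the function names are illustrative assumptions, not anything taken from NetBird.

```python
import os
import time
from pathlib import Path


def parse_stat(line):
    """Return (comm, utime + stime in clock ticks) from a /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split on the first "(" and the last ")".
    lpar = line.index("(")
    rpar = line.rindex(")")
    comm = line[lpar + 1:rpar]
    fields = line[rpar + 2:].split()
    # fields[0] is stat field 3 (state), so utime (field 14) and
    # stime (field 15) sit at indexes 11 and 12.
    return comm, int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, interval_s, hz):
    """Convert a clock-tick delta into percent of one core over the interval."""
    return 100.0 * (ticks_after - ticks_before) / (hz * interval_s)


def snapshot(match="wg-"):
    """Map pid -> (comm, ticks) for kworker threads whose name contains `match`.

    The "wg-" default is an assumption about how the workqueue is named.
    """
    result = {}
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue
        try:
            comm, ticks = parse_stat((entry / "stat").read_text())
        except (OSError, ValueError, IndexError):
            continue  # thread exited or the line was not parseable
        if comm.startswith("kworker/") and match in comm:
            result[int(entry.name)] = (comm, ticks)
    return result


if __name__ == "__main__":
    hz = os.sysconf("SC_CLK_TCK")
    interval = 1.0
    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    for pid, (comm, ticks) in sorted(after.items()):
        if pid in before:
            pct = cpu_percent(before[pid][1], ticks, interval, hz)
            print(f"{pid:>7} {comm:<32} {pct:5.1f}%")
```

Running it once on 0.29.4 and once on 0.30.0 or later while a resource makes web requests should give a rough side-by-side figure. A kworker's displayed name follows whichever work item it is handling, so a thread can drop out of the filter between the two samples.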

Screenshots below fold

Exit Node btop++ screenshots v0.30.0 and up

Screenshot 2024-12-18 100003
Screenshot 2024-12-18 100011
Screenshot 2024-12-18 100155
Screenshot 2024-12-18 101422
Screenshot 2024-12-18 101440
Screenshot 2024-12-18 101503
Screenshot 2024-12-18 101534
Screenshot 2024-12-18 101600
Screenshot 2024-12-18 101611
Screenshot 2024-12-26 042904
Screenshot 2024-12-26 042921
Screenshot 2024-12-26 042932
Screenshot 2024-12-26 042957
Screenshot 2024-12-26 043210
Screenshot 2024-12-26 043219

Exit Node btop++ screenshots v0.29.4 and below

Screenshot 2024-12-18 103647
Screenshot 2024-12-18 101943
Screenshot 2024-12-18 102917
Screenshot 2024-12-18 103022
Screenshot 2024-12-18 103207
Screenshot 2024-12-26 043641
Screenshot 2024-12-26 043552
Screenshot 2024-12-26 043627

saavagebueno added the triage-needed label 2025-11-20 05:32:01 -05:00
Reference: SVI/netbird#1514