1530 Commits

Author SHA1 Message Date
CanbiZ (MickLesk)
d4e20816c7 core: APT/APK Mirror Fallback for CDN Failures (#13316) 2026-03-26 16:53:04 +01:00
CanbiZ (MickLesk)
fbe5b57c76 core/tools: replace generic return 1 exit_codes with more specific exit_codes (#13311) 2026-03-26 16:07:38 +01:00
CanbiZ (MickLesk)
42fbf1afc5 core: use /usr/bin/install to prevent function shadowing (#13299) 2026-03-26 10:11:47 +01:00
CanbiZ (MickLesk)
b9a39db667 fix(tools.func): pin npm to 11.11.0 to work around Node.js 22.22.2 regression (#13296)
Node.js 22.22.2 ships with a broken npm self-upgrade path where 'npm install -g npm@latest' fails with MODULE_NOT_FOUND for promise-retry. Pin to npm@11.11.0 as a known-good version until the upstream issue is resolved. Ref: nodejs/node#62425, npm/cli#9151
2026-03-26 10:04:00 +01:00
CanbiZ (MickLesk)
4eecca8aea fix(tools.func): use absolute path for install in setup_uv
Using bare 'install' command gets shadowed when scripts define their own install() function, causing setup_uv to hang. Use /usr/bin/install instead.
2026-03-26 09:54:17 +01:00
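The shadowing problem described in the commit above can be reproduced in a few lines. This is a minimal sketch, not the actual setup_uv code: the `install()` function body and the temporary file names are made up for illustration.

```shell
# Hypothetical script-level function that shadows the coreutils 'install' binary.
install() {
  echo "custom install step"
}

# A bare 'install' resolves to the shell function, not the binary.
shadowed_result="$(install)"
echo "bare call ran: $shadowed_result"

# The absolute path bypasses function lookup and runs the real binary.
tmp_dir="$(mktemp -d)"
printf 'demo\n' >"$tmp_dir/src"
/usr/bin/install -m 755 "$tmp_dir/src" "$tmp_dir/dst"
ls -l "$tmp_dir/dst"
```

Shell functions take precedence over binaries found through PATH, which is why an absolute path is a robust choice inside shared library code.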
CanbiZ (MickLesk)
97bf744e96 fix typo (org instead of com) 2026-03-25 17:48:03 +01:00
CanbiZ (MickLesk)
53bc492fdb Make shell command substitutions safe with || true (#13279)
Add defensive fallbacks (|| true) to multiple command substitutions to prevent non-zero exits when commands produce no output or are unavailable. Changes touch misc/api.func, misc/build.func and misc/tools.func and cover places like lspci, /proc/cpuinfo parsing, /etc/os-release reads, hostname -I usage, grep reads from vars files and maps, pct config parsing, storage/template lookups, tool version detection, NVIDIA driver version extraction, and MeiliSearch config parsing. These edits do not change functional behavior aside from ensuring the scripts continue running (variables will be empty) instead of failing in stricter shells or when commands return non-zero status.
2026-03-25 13:06:21 +01:00
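The effect of the `|| true` fallback described above can be sketched as follows. The file contents and variable names are illustrative assumptions, not taken from misc/build.func.

```shell
set -eu

# Stand-in for hardware listing output that contains no GPU entry.
hw_list="$(mktemp)"
printf 'Host bridge\nEthernet controller\n' >"$hw_list"

# grep exits 1 when nothing matches. Without '|| true' this assignment
# would abort the script under 'set -e'; with it, the variable is just empty.
gpu_line="$(grep -i 'vga' "$hw_list" || true)"

echo "GPU: ${gpu_line:-none detected}"
```

The script keeps running with an empty variable, matching the behavior the commit message describes.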
CanbiZ (MickLesk)
caf03fe274 chore: replace helper-scripts.com with community-scripts.com (#13244) 2026-03-24 20:50:40 +01:00
CanbiZ (MickLesk)
1c3c223e51 Turnkey: modernize turnkey.sh with shared libraries (#13242)
* refactor(turnkey): modernize turnkey.sh with shared libraries and telemetry

- Source core.func, error_handler.func, api.func instead of custom error/msg functions
- Replace custom error_exit/warn/info/msg with msg_info/msg_ok/msg_error/msg_warn
- Upgrade validate_container_id to cluster-aware (pvesh + all-node config check)
- Add diagnostics_check() and telemetry (post_to_api / post_update_to_api)
- Add pve_check, shell_check, root_check for environment validation
- Use proper EXIT trap for cleanup (destroy container on error, restart monitor)
- Improve quoting throughout (PCT_OPTIONS as array, quoted variables)
- Secure credentials file with chmod 600
- Use exit_script for user cancellations (consistent with other scripts)

* fix(turnkey): replace diagnostics_check with inline config read

diagnostics_check() is defined in build.func which is not sourced.
Read the diagnostics config file directly instead — respects existing
user preference without prompting (turnkey has no settings menu).

* bump hardcoded names to dynamic list

* Preserve telemetry type and report failures

Respect a pre-set TELEMETRY_TYPE in misc/api.func and use it in the API payload instead of the hardcoded "lxc". In turnkey/turnkey.sh, set TELEMETRY_TYPE="turnkey" for turnkey installs and enhance turnkey_cleanup() to report failed installs to telemetry (calls post_update_to_api "failed" with the exit code when POST_TO_API_DONE is true and POST_UPDATE_DONE is not), then destroy the failed container. These changes ensure correct telemetry type propagation and that failed turnkey deployments are reported.

---------

Co-authored-by: Slaviša Arežina <58952836+tremor021@users.noreply.github.com>
2026-03-24 15:54:01 +01:00
CanbiZ (MickLesk)
86c658909a Classify exit-1 errors & guard telemetry
Analyze logs for generic exit code 1 and export an ERROR_CATEGORY_OVERRIDE so telemetry receives a more accurate error category (apt, oom, network, storage, dependency). Preserve any existing TELEMETRY_TYPE when posting updates. Add defense-in-depth by disabling strict error traps before running grep/sed log analysis to avoid spurious error_handler invocations. Mark successful installs with INSTALL_COMPLETE and update the error handler to only report a successful "done" telemetry state when INSTALL_COMPLETE is explicitly set, preventing false-positive success reports from early zero-exit exits.
2026-03-24 09:57:43 +01:00
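A log-based classification of a generic exit code 1, as described in the commit above, might look roughly like this. The function name `classify_log` and every grep pattern are assumptions chosen for illustration; the real patterns are not shown in the commit message.

```shell
# Hypothetical classifier: prints one category for a given log file.
# grep is used only as an 'if' condition, so a non-match never trips 'set -e'.
classify_log() {
  log_file="$1"
  if grep -qiE 'out of memory|oom-kill' "$log_file"; then
    echo "oom"
  elif grep -qi 'no space left on device' "$log_file"; then
    echo "storage"
  elif grep -qiE 'could not resolve|network is unreachable|connection timed out' "$log_file"; then
    echo "network"
  elif grep -qE '^E: |dpkg: error' "$log_file"; then
    echo "apt"
  elif grep -qi 'command not found' "$log_file"; then
    echo "dependency"
  else
    echo "unknown"
  fi
}

# Example: a log with an apt failure line.
sample_log="$(mktemp)"
printf 'Reading package lists...\nE: Unable to locate package foo\n' >"$sample_log"
ERROR_CATEGORY_OVERRIDE="$(classify_log "$sample_log")"
export ERROR_CATEGORY_OVERRIDE
echo "category: $ERROR_CATEGORY_OVERRIDE"
```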
CanbiZ (MickLesk)
c8606e9fcc core: harden shell scripts against injection and insecure permissions (#13239) 2026-03-23 22:22:23 +01:00
CanbiZ (MickLesk)
a2616ee258 Improve network connectivity and DNS checks (#13222) 2026-03-23 20:43:48 +01:00
CanbiZ (MickLesk)
f29606ae87 fix(build): allow /31 and /32 CIDR with out-of-subnet gateway (#13231) 2026-03-23 20:41:55 +01:00
CanbiZ (MickLesk)
791981ba68 qf msg_warn lxc stack update 2026-03-23 14:44:24 +01:00
CanbiZ (MickLesk)
2922ecdcbb core: guard against empty IPv6 address in static mode (#13195) 2026-03-22 22:48:19 +01:00
CanbiZ (MickLesk)
55324dbb98 core: add missing -searchdomain/-nameserver prefix in base_settings (#13166) 2026-03-21 21:13:15 +01:00
CanbiZ (MickLesk)
8c35c68c9c tools.func: display pin reason in release-check messages (#13095)
* Display pin reason in release-check messages

Add an optional pin_reason parameter to check_for_gh_release and check_for_codeberg_release and update the no-update messaging to show the provided reason. If no reason is supplied, show a default message indicating the update is temporarily held back due to issues with newer releases. This improves user feedback when versions are intentionally pinned.

* Add informational args to release checks

Pass extra informational strings to check_for_gh_release calls to surface release-specific notes. Updated ct/immich.sh (notes for Immich and VectorChord releases), ct/opencloud.sh (note for OpenCloud), and ct/plant-it.sh (note about web frontend presence). These messages clarify testing/compatibility expectations when checking releases.
2026-03-19 22:18:27 +01:00
CanbiZ (MickLesk)
245433f535 tools.func: use dpkg-query for reliable JDK version detection (#13101) 2026-03-19 21:55:58 +01:00
CanbiZ (MickLesk)
ba01175bc6 core: reorder hwaccel setup and adjust GPU group usermod (#13072)
* fix(tdarr): use curl_with_retry and verify binaries before enabling service

Tdarr_Updater downloads the actual server/node binaries from tdarr.io at
runtime. If tdarr.io is blocked by local DNS (e.g. OPNsense OISD blocklists),
the updater exits silently with code 0, leaving no binaries on disk. The
subsequent systemctl enable then fails with 'Operation not permitted' (exit 1)
because the ExecStart paths don't exist.

Changes:
- Replace bare curl with curl_with_retry for versions.json and Tdarr_Updater.zip
  downloads to gain retry logic, DNS pre-check and exponential backoff
- Add msg_info before Tdarr_Updater run so users see this step in the log
- Check that Tdarr_Server and Tdarr_Node binaries exist after the updater
  runs; fail immediately with a clear message pointing to tdarr.io connectivity
  instead of letting systemctl fail with a confusing 'Operation not permitted'

Fixes: #13030

* Improve Tdarr installer error handling

Refine post-update validation and failure behavior in tdarr-install.sh: remove a redundant status message, simplify the updater check to only require the Tdarr_Server binary, and replace the previous fatal path with msg_error plus an explicit exit 250. This makes failures (for example when tdarr.io is blocked by local DNS) clearer and avoids false negatives from the Tdarr_Node existence check.

* Use curl_with_retry and handle updater failure

Replace direct curl calls with curl_with_retry for fetching versions.json and downloading Tdarr_Updater.zip to improve network reliability. Add a post-update check that verifies /opt/tdarr/Tdarr_Server/Tdarr_Server exists; if missing, log an error suggesting possible DNS blocking and exit with code 250. Minor cleanup of updater artifacts remains unchanged.

* Reorder hwaccel setup and adjust GPU group usermod

Move setup_hwaccel invocations in emby, jellyfin, ollama, and plex installers to occur after package installation/configuration so GPU drivers/repos are present before enabling hardware acceleration. Update _setup_gpu_permissions to call usermod directly (remove $STD wrapper) when adding service users to render/video groups. Includes minor whitespace/ordering cleanups in the installer scripts.
2026-03-19 06:55:56 +01:00
CanbiZ (MickLesk)
e20fed1a2d tools.func Implement pg_cron setup for setup_postgresql (#13053)
* tools.func Implement PostgreSQL setup and upgrade function

Added setup_postgresql function to install or upgrade PostgreSQL, including optional modules and backup restoration.

* correct diff

* Update tools.func

* Update tools.func

* Update tools.func

* Update tools.func
2026-03-18 16:45:11 +01:00
CanbiZ (MickLesk)
bd91c4d07f tools: Centralize GPU group setup via setup_hwaccel (#13044)
Improve hardware-acceleration setup to centralize service user group management. Install scripts (emby, plex, ollama, channels) now pass a service user to setup_hwaccel (or no user for channels) and have had inline /etc/group sed/usermod tweaks removed. misc/tools.func updated: setup_hwaccel accepts an optional service_user and forwards it to _setup_gpu_permissions, which now adds the service user to render and video groups if provided. This consolidates GPU permission changes in one place and removes duplicated per-service group edits.
2026-03-18 11:56:32 +01:00
CanbiZ (MickLesk)
0f4bfc0b5a add missing func? 2026-03-18 11:15:02 +01:00
CanbiZ (MickLesk)
48afb6c017 tools.func: Implement check_for_gh_tag function (#12998)
* tools.func Implement check_for_gh_tag function

Adds a function to check for new GitHub tags for repositories without releases. (needed for termix / guacd-server)

* Update documentation for check_for_gh_tag function
2026-03-18 08:00:33 +01:00
CanbiZ (MickLesk)
7c62147a00 tools.func: Implement fetch_and_deploy_gh_tag function (#13000)
* tools.func: Implement fetch_and_deploy_gh_tag function

Adds function to fetch and deploy GitHub tag-based source tarballs.

* Refactor fetch_and_deploy_gh_tag function and comments

Updated the function to fetch and deploy GitHub tags, enhancing its description and usage instructions.

* cleanup
2026-03-18 07:44:13 +01:00
Slaviša Arežina
9df9a2831e Update (#13008) 2026-03-17 21:19:36 +01:00
CanbiZ (MickLesk)
6747f0c340 fix broken rocm setup 2026-03-17 08:31:42 +01:00
CanbiZ (MickLesk)
5ee3ad2702 fix(hwaccel): remove ROCm install from AMD APU setup (#12958)
AMD APUs (Radeon 780M/760M/740M and similar integrated graphics) do not
benefit from the full ROCm compute stack in LXC containers. ROCm is a
multi-GB GPGPU framework primarily designed for discrete AMD GPUs and
ML/AI workloads, not for video transcoding with integrated graphics.

For APUs the Mesa VA-API drivers (mesa-va-drivers, mesa-opencl-icd) and
firmware (firmware-amd-graphics) provide all the hardware acceleration
needed for media tasks. Installing ROCm on top adds ~4GB of packages
that frequently fail or time out for this class of hardware.

Discrete AMD GPUs (GPU_TYPE=AMD) are unaffected and still receive ROCm.
2026-03-16 11:36:38 +01:00
CanbiZ (MickLesk)
fd9039e849 fix: unify RELEASE variable for check_for_gh_release and fetch_and_deploy_gh_release (#12917) 2026-03-15 20:08:15 +01:00
CanbiZ (MickLesk)
7ba3e9fe5e core: retry downloads with exponential backoff (#12896) 2026-03-15 10:00:37 +01:00
CanbiZ (MickLesk)
005260df87 fix(hwaccel): don't abort on AMD repo apt update failure (#12890)
When repo.radeon.com has broken metadata, apt update fails with
exit code 100 and kills the entire install. Make it non-fatal so
the script can continue with cached packages or skip ROCm gracefully.

Fixes #12879
2026-03-15 00:05:31 +01:00
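The non-fatal handling described above can be sketched as below. `fake_apt_update` is a hypothetical stand-in that simulates `apt update` failing with exit code 100, and `rocm_available` is an illustrative flag, not a variable from tools.func.

```shell
set -e

# Hypothetical stand-in for 'apt update' against a repo with broken metadata.
fake_apt_update() {
  return 100
}

rocm_available=true

# Running the command as an 'if' condition keeps 'set -e' from killing the script.
if ! fake_apt_update; then
  echo "warning: AMD repo update failed, skipping ROCm" >&2
  rocm_available=false
fi

echo "continuing install (rocm_available=$rocm_available)"
```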
CanbiZ (MickLesk)
3601388abe core: add mode=generated for unattended frontend installs (#12807) 2026-03-12 12:21:28 +01:00
CanbiZ (MickLesk)
dd3b381813 core: validate storage availability when loading defaults (#12794) 2026-03-12 09:17:18 +01:00
CanbiZ (MickLesk)
cc95ef2987 tools.func: support older NVIDIA driver versions with 2 segments (xxx.xxx) (#12796) 2026-03-12 09:07:49 +01:00
CanbiZ (MickLesk)
38c9421493 tools.func: correct PATH escaping in ROCm profile script (#12793) 2026-03-12 09:07:33 +01:00
CanbiZ (MickLesk)
e7f551dab6 fix(hwaccel): install ROCm runtime only, reduce disk resize to +4GB
The full 'rocm' meta-package includes 15GB+ of dev tools (compilers,
debuggers, dev headers) which are unnecessary in LXC containers.
Install only runtime packages: rocm-opencl-runtime, rocm-hip-runtime,
rocm-smi-lib. Reduce disk resize from +8GB to +4GB accordingly.
2026-03-09 11:14:26 +01:00
CanbiZ (MickLesk)
d8b2a37228 fix(build): auto-resize disk +8GB when AMD GPU detected for ROCm 2026-03-09 10:18:23 +01:00
CanbiZ (MickLesk)
b20bf9c658 tools: add Alpine (apk) support to ensure_dependencies and is_package_installed (#12703) 2026-03-09 10:03:04 +01:00
CanbiZ (MickLesk)
8c5e340ad0 fix(hwaccel): use amdgpu/latest/ubuntu instead of versioned URL 2026-03-09 09:57:28 +01:00
CanbiZ (MickLesk)
2afc25d51f tools.func: extend hwaccel with ROCm (#12707) 2026-03-09 09:37:26 +01:00
Slaviša Arežina
8be52ab1ad Fixes (#12675) 2026-03-08 13:25:24 +01:00
CanbiZ (MickLesk)
148f0121df fix: add interactive GitHub PAT prompt on rate limit / auth failure (#12652)
When a GitHub API call fails with HTTP 401 (invalid token) or HTTP 403
(rate limit exceeded), the user is now prompted interactively to enter a
GitHub Personal Access Token (PAT). The token is validated (no empty
input, no whitespace) before being set and the API call is retried.

This applies to both github_api_call() and fetch_and_deploy_gh_release().

Closes #12615
2026-03-07 22:14:18 +01:00
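The token validation rule mentioned above (no empty input, no whitespace) could be expressed as a small helper. The name `is_valid_token` is hypothetical, and this sketch covers only the validation step, not the prompt or the retry.

```shell
# Hypothetical helper: succeeds only for a non-empty token without whitespace.
is_valid_token() {
  token="$1"
  [ -n "$token" ] || return 1
  case "$token" in
    *[[:space:]]*) return 1 ;;
  esac
  return 0
}

# Example usage with a placeholder value (not a real token).
if is_valid_token "ghp_exampletoken"; then
  echo "token accepted"
fi
```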
CanbiZ (MickLesk)
eb848fd70f databasus: make fetch_and_deploy_from_url accept empty dir
Pass an explicit empty second argument when downloading MongoDB tools and update fetch_and_deploy_from_url to default the directory parameter to an empty string. This avoids unset/ambiguous directory handling at the call site and ensures the function has a defined directory variable even when none is provided.
2026-03-06 14:11:57 +01:00
CanbiZ (MickLesk)
0f1a06ca32 core: add var_os / var_version to whitelist for app.vars (#12576)
Include var_os and var_version in VAR_WHITELIST across load_vars_file, default_var_settings, and the global whitelist declaration. Also emit sanitized var_os and var_version in _build_current_app_vars_tmp so they are included in generated app variable output and can be recognized/persisted by the build tooling.
2026-03-05 15:28:07 +01:00
CanbiZ (MickLesk)
87e14ba12f fix(core): remove duplicate -nameserver/-searchdomain prefix in pct create
The NS and SD variables already contain the -nameserver= and
-searchdomain= prefixes (set in advanced_settings). PR #12521
incorrectly added a second prefix when building PCT_OPTIONS_STRING,
resulting in '-nameserver -nameserver=8.8.8.8' which pct rejects.

Also fixes the misleading comment ('Add storage' -> 'Add searchdomain').

Fixes #12572
2026-03-05 08:53:48 +01:00
CanbiZ (MickLesk)
783ba03e92 fix(postgresql): fall back to distro packages instead of bookworm-pgdg on Trixie (#12524) (#12542) 2026-03-04 08:39:33 +01:00
Tom
199483be82 fix: whitelist var_searchdomain and fix the handling of var_ns and var_searchdomain in build.func (#12521) 2026-03-04 07:31:11 +01:00
CanbiZ (MickLesk)
380aa4bc0f feat(recovery): add ENOSPC disk-full detection with auto-retry using doubled disk size (#12511) 2026-03-03 15:33:19 +01:00
CanbiZ (MickLesk)
8b62b8f3c5 fix(api): rewrite json_escape to use awk for reliable JSON escaping 2026-03-02 16:25:22 +01:00
CanbiZ (MickLesk)
cd38bc3a65 fix: strip G suffix from DISK_SIZE in post_update_to_api for VMs
VMs set DISK_SIZE=32G (with G suffix), but post_update_to_api used
the value directly in JSON, producing 'disk_size: 32G', which is
invalid JSON. The server rejected these with 'invalid character G'.

Now strips the G suffix and validates numeric-only before embedding.
2026-03-02 15:58:34 +01:00
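Stripping the suffix and validating the result, as the commit above describes, can be sketched like this. The function name `normalize_disk_size` is an assumption; the actual code in post_update_to_api may be structured differently.

```shell
# Hypothetical helper: strips a trailing 'G' and prints the value only
# if what remains is purely numeric; returns 1 otherwise.
normalize_disk_size() {
  size="${1%G}"
  case "$size" in
    '' | *[!0-9]*) return 1 ;;
  esac
  printf '%s\n' "$size"
}

disk_size_json="$(normalize_disk_size "32G")"
printf '{"disk_size": %s}\n' "$disk_size_json"
```

Because the value is embedded unquoted, the numeric check is what keeps the resulting JSON valid.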
CanbiZ (MickLesk)
46d25645c2 fix: add retry to initial installing POST (post_to_api / post_to_api_vm)
The initial 'installing' record MUST exist for all subsequent status
updates to succeed. Previously this was fire-and-forget with no retry,
so timeouts/503s silently dropped ~50% of installations.

Both post_to_api (LXC) and post_to_api_vm now retry up to 3 times
with 1s delay between attempts. Also captures HTTP response code to
detect failures instead of using curl -f (silent fail).
2026-03-02 15:43:29 +01:00
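The retry behavior described above (up to 3 attempts, 1s delay, HTTP code captured instead of relying on curl -f) could be sketched as follows. The function names, the `RETRY_DELAY` variable, and the URL/payload arguments are hypothetical; only the curl options shown are standard.

```shell
# Hypothetical sender: performs one POST and prints the HTTP status code.
# curl prints 000 when no response was received.
send_once() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -X POST -H 'Content-Type: application/json' \
    -d "$2" "$1" || true
}

# Retries up to 3 times; succeeds on the first 2xx response.
post_with_retry() {
  url="$1"
  payload="$2"
  attempt=1
  while [ "$attempt" -le 3 ]; do
    http_code="$(send_once "$url" "$payload")"
    case "$http_code" in
      2??) return 0 ;;
    esac
    if [ "$attempt" -lt 3 ]; then
      sleep "${RETRY_DELAY:-1}"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

A caller would invoke `post_with_retry "$url" "$json"` and treat a non-zero return as a failed initial record.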