Node.js 22.22.2 ships with a broken npm self-upgrade path where 'npm install -g npm@latest' fails with MODULE_NOT_FOUND for promise-retry. Pin to npm@11.11.0 as a known-good version until the upstream issue is resolved. Ref: nodejs/node#62425, npm/cli#9151
Add defensive fallbacks (|| true) to multiple command substitutions to prevent non-zero exits when commands produce no output or are unavailable. Changes touch misc/api.func, misc/build.func and misc/tools.func and cover places like lspci, /proc/cpuinfo parsing, /etc/os-release reads, hostname -I usage, grep reads from vars files and maps, pct config parsing, storage/template lookups, tool version detection, NVIDIA driver version extraction, and MeiliSearch config parsing. These edits do not change functional behavior aside from ensuring the scripts continue running (variables will be empty) instead of failing in stricter shells or when commands return non-zero status.
* refactor(turnkey): modernize turnkey.sh with shared libraries and telemetry
- Source core.func, error_handler.func, api.func instead of custom error/msg functions
- Replace custom error_exit/warn/info/msg with msg_info/msg_ok/msg_error/msg_warn
- Upgrade validate_container_id to cluster-aware (pvesh + all-node config check)
- Add diagnostics_check() and telemetry (post_to_api / post_update_to_api)
- Add pve_check, shell_check, root_check for environment validation
- Use proper EXIT trap for cleanup (destroy container on error, restart monitor)
- Improve quoting throughout (PCT_OPTIONS as array, quoted variables)
- Secure credentials file with chmod 600
- Use exit_script for user cancellations (consistent with other scripts)
* fix(turnkey): replace diagnostics_check with inline config read
diagnostics_check() is defined in build.func which is not sourced.
Read the diagnostics config file directly instead — respects existing
user preference without prompting (turnkey has no settings menu).
* bump hardcoded names to dynamic list
* Preserve telemetry type and report failures
Respect a pre-set TELEMETRY_TYPE in misc/api.func and use it in the API payload instead of the hardcoded "lxc". In turnkey/turnkey.sh, set TELEMETRY_TYPE="turnkey" for turnkey installs and enhance turnkey_cleanup() to report failed installs to telemetry (calls post_update_to_api "failed" with the exit code when POST_TO_API_DONE is true and POST_UPDATE_DONE is not), then destroy the failed container. These changes ensure correct telemetry type propagation and that failed turnkey deployments are reported.
---------
Co-authored-by: Slaviša Arežina <58952836+tremor021@users.noreply.github.com>
Analyze logs for generic exit code 1 and export an ERROR_CATEGORY_OVERRIDE so telemetry receives a more accurate error category (apt, oom, network, storage, dependency). Preserve any existing TELEMETRY_TYPE when posting updates. Add defense-in-depth by disabling strict error traps before running grep/sed log analysis to avoid spurious error_handler invocations. Mark successful installs with INSTALL_COMPLETE and update the error handler to only report a successful "done" telemetry state when INSTALL_COMPLETE is explicitly set, preventing false-positive success reports from early zero-exit exits.
* Display pin reason in release-check messages
Add an optional pin_reason parameter to check_for_gh_release and check_for_codeberg_release and update the no-update messaging to show the provided reason. If no reason is supplied, show a default message indicating the update is temporarily held back due to issues with newer releases. This improves user feedback when versions are intentionally pinned.
* Add informational args to release checks
Pass extra informational strings to check_for_gh_release calls to surface release-specific notes. Updated ct/immich.sh (notes for Immich and VectorChord releases), ct/opencloud.sh (note for OpenCloud), and ct/plant-it.sh (note about web frontend presence). These messages clarify testing/compatibility expectations when checking/releases.
* fix(tdarr): use curl_with_retry and verify binaries before enabling service
Tdarr_Updater downloads the actual server/node binaries from tdarr.io at
runtime. If tdarr.io is blocked by local DNS (e.g. OPNsense OISD blocklists),
the updater exits silently with code 0, leaving no binaries on disk. The
subsequent systemctl enable then fails with 'Operation not permitted' (exit 1)
because the ExecStart paths don't exist.
Changes:
- Replace bare curl with curl_with_retry for versions.json and Tdarr_Updater.zip
downloads to gain retry logic, DNS pre-check and exponential backoff
- Add msg_info before Tdarr_Updater run so users see this step in the log
- Check that Tdarr_Server and Tdarr_Node binaries exist after the updater
runs; fail immediately with a clear message pointing to tdarr.io connectivity
instead of letting systemctl fail with a confusing 'Operation not permitted'
Fixes: #13030
* Improve Tdarr installer error handling
Refine post-update validation and failure behavior in tdarr-install.sh: remove a redundant status message, simplify the updater check to only require the Tdarr_Server binary, and replace the previous fatal path with msg_error plus an explicit exit 250. This makes failures (for example when tdarr.io is blocked by local DNS) clearer and avoids false negatives from the Tdarr_Node existence check.
* Use curl_with_retry and handle updater failure
Replace direct curl calls with curl_with_retry for fetching versions.json and downloading Tdarr_Updater.zip to improve network reliability. Add a post-update check that verifies /opt/tdarr/Tdarr_Server/Tdarr_Server exists; if missing, log an error suggesting possible DNS blocking and exit with code 250. Minor cleanup of updater artifacts remains unchanged.
* Reorder hwaccel setup and adjust GPU group usermod
Move setup_hwaccel invocations in emby, jellyfin, ollama, and plex installers to occur after package installation/configuration so GPU drivers/repos are present before enabling hardware acceleration. Update _setup_gpu_permissions to call usermod directly (remove $STD wrapper) when adding service users to render/video groups. Includes minor whitespace/ordering cleanups in the installer scripts.
improve hardware-acceleration setup to centralize service user group management. Install scripts (emby, plex, ollama, channels) now pass a service user to setup_hwaccel (or no user for channels) and have had inline /etc/group sed/usermod tweaks removed. misc/tools.func updated: setup_hwaccel accepts an optional service_user and forwards it to _setup_gpu_permissions, which now adds the service user to render and video groups if provided. This consolidates GPU permission changes in one place and removes duplicated per-service group edits.
* tools.func Implement check_for_gh_tag function
Adds a function to check for new GitHub tags for repositories without releases. (needed for termix / guacd-server)
* Update documentation for check_for_gh_tag function
* tools.func: Implement fetch_and_deploy_gh_tag function
Adds function to fetch and deploy GitHub tag-based source tarballs.
* Refactor fetch_and_deploy_gh_tag function and comments
Updated the function to fetch and deploy GitHub tags, enhancing its description and usage instructions.
* cleanuo
AMD APUs (Radeon 780M/760M/740M and similar integrated graphics) do not
benefit from the full ROCm compute stack in LXC containers. ROCm is a
multi-GB GPGPU framework primarily designed for discrete AMD GPUs and
ML/AI workloads, not for video transcoding with integrated graphics.
For APUs the Mesa VA-API drivers (mesa-va-drivers, mesa-opencl-icd) and
firmware (firmware-amd-graphics) provide all the hardware acceleration
needed for media tasks. Installing ROCm on top adds ~4GB of packages
that frequently fail or time out for this class of hardware.
Discrete AMD GPUs (GPU_TYPE=AMD) are unaffected and still receive ROCm.
When repo.radeon.com has broken metadata, apt update fails with
exit code 100 and kills the entire install. Make it non-fatal so
the script can continue with cached packages or skip ROCm gracefully.
Fixes#12879
The full 'rocm' meta-package includes 15GB+ of dev tools (compilers,
debuggers, dev headers) which are unnecessary in LXC containers.
Install only runtime packages: rocm-opencl-runtime, rocm-hip-runtime,
rocm-smi-lib. Reduce disk resize from +8GB to +4GB accordingly.
When a GitHub API call fails with HTTP 401 (invalid token) or HTTP 403
(rate limit exceeded), the user is now prompted interactively to enter a
GitHub Personal Access Token (PAT). The token is validated (no empty
input, no whitespace) before being set and the API call is retried.
This applies to both github_api_call() and fetch_and_deploy_gh_release().
Closes#12615
Pass an explicit empty second argument when downloading MongoDB tools and update fetch_and_deploy_from_url to default the directory parameter to an empty string. This avoids unset/ambiguous directory handling at the call site and ensures the function has a defined directory variable even when none is provided.
Include var_os and var_version in VAR_WHITELIST across load_vars_file, default_var_settings, and the global whitelist declaration. Also emit sanitized var_os and var_version in _build_current_app_vars_tmp so they are included in generated app variable output and can be recognized/persisted by the build tooling.
The NS and SD variables already contain the -nameserver= and
-searchdomain= prefixes (set in advanced_settings). PR #12521
incorrectly added a second prefix when building PCT_OPTIONS_STRING,
resulting in '-nameserver -nameserver=8.8.8.8' which pct rejects.
Also fixes the misleading comment ('Add storage' -> 'Add searchdomain').
Fixes#12572
VMs set DISK_SIZE=32G (with G suffix), but post_update_to_api used
\ directly in JSON, producing 'disk_size: 32G' which is
invalid JSON. The server rejected these with 'invalid character G'.
Now strips the G suffix and validates numeric-only before embedding.
The initial 'installing' record MUST exist for all subsequent status
updates to succeed. Previously this was fire-and-forget with no retry,
so timeouts/503s silently dropped ~50% of installations.
Both post_to_api (LXC) and post_to_api_vm now retry up to 3 times
with 1s delay between attempts. Also captures HTTP response code to
detect failures instead of using curl -f (silent fail).