Fixes #425
When polling multiple controllers, if one controller was down or
unreachable, unpoller would stop collecting data from ALL controllers.
This caused complete data loss across all sites when just one controller
was down.
Root Cause:
Both Metrics() and Events() methods would immediately return an error
when any controller failed, skipping all remaining controllers in the
loop.
Changes:
- Log errors from failed controllers but continue to next controller
- Track collection errors separately from successful data collection
- Only return error if ALL controllers failed and no data was collected
- Return success if at least one controller provided data
This allows unpoller to continue monitoring healthy controllers even
when some are temporarily unreachable due to network issues, timeouts,
or maintenance.
Example behavior:
- Controller 1: Down (timeout) - logs error, continues
- Controller 2: Up - collects data successfully
- Controller 3: Up - collects data successfully
- Result: Returns data from controllers 2 and 3
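The loop behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual unpoller code: the `controller` type, the `poll` function, and `collectAll` are hypothetical names, and using `errors.Join` to combine failures is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// controller is a hypothetical stand-in for a configured UniFi controller.
type controller struct {
	URL  string
	Down bool
}

// poll is a hypothetical per-controller fetch; it fails when the controller is down.
func poll(c controller) ([]string, error) {
	if c.Down {
		return nil, fmt.Errorf("controller %s: timeout", c.URL)
	}
	return []string{"metrics from " + c.URL}, nil
}

// collectAll polls every controller, logs failures, and moves on to the next one.
// It returns an error only when at least one controller failed and no data was collected.
func collectAll(controllers []controller) ([]string, error) {
	var (
		data []string
		errs []error
	)
	for _, c := range controllers {
		result, err := poll(c)
		if err != nil {
			log.Printf("skipping controller: %v", err)
			errs = append(errs, err)
			continue
		}
		data = append(data, result...)
	}
	if len(data) == 0 && len(errs) > 0 {
		return nil, errors.Join(errs...)
	}
	return data, nil
}

func main() {
	data, err := collectAll([]controller{
		{URL: "controller-1", Down: true},
		{URL: "controller-2"},
		{URL: "controller-3"},
	})
	fmt.Println(len(data), err)
}
```

Collecting the errors separately from the data keeps the failure details available for logging without letting one failure abort the whole poll.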
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixes #904
When a poll fails (typically with 401 Unauthorized after ~2 hour token
expiration), the code would re-authenticate but then return the original
poll error without retrying. This caused a one-minute data gap every
2 hours.
Changes:
- After successful re-authentication, retry the poll operation
- Add 500ms delay before retry to allow controller to process new auth
- Rename error variable to avoid shadowing during re-auth attempt
This ensures that transient authentication failures during the re-auth
window don't cause data gaps.
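A rough sketch of the retry flow, assuming a hypothetical `client` type with `login` and `poll` methods (these names are illustrative and not taken from the unpoller source). The 500ms delay from the change list is passed in as a parameter here so the function is easy to exercise.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// client is a hypothetical controller client whose session can expire.
type client struct {
	authed bool
	logins int
}

var errUnauthorized = errors.New("401 Unauthorized")

// login simulates a successful re-authentication.
func (c *client) login() error {
	c.authed = true
	c.logins++
	return nil
}

// poll fails with 401 until the client has authenticated.
func (c *client) poll() (string, error) {
	if !c.authed {
		return "", errUnauthorized
	}
	return "metrics", nil
}

// pollWithReauth polls once; on failure it re-authenticates, waits, and retries once.
func pollWithReauth(c *client, delay time.Duration) (string, error) {
	data, pollErr := c.poll()
	if pollErr == nil {
		return data, nil
	}
	// A distinct name (authErr) keeps the original poll error from being shadowed.
	if authErr := c.login(); authErr != nil {
		return "", fmt.Errorf("re-auth failed: %w (poll error: %v)", authErr, pollErr)
	}
	time.Sleep(delay) // give the controller time to accept the new session
	return c.poll()
}

func main() {
	c := &client{}
	data, err := pollWithReauth(c, 500*time.Millisecond)
	fmt.Println(data, err, c.logins)
}
```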
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Ports providing PoE power are no longer considered "dead" even when
disabled or down. This allows users to collect PoE metrics from ports
that are disabled for security reasons but still providing power.
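One way to express this rule, as a sketch only: the `port` struct and its `PoEPower` field are hypothetical simplifications, and the real check in unpoller may use different fields.

```go
package main

import "fmt"

// port is a hypothetical, simplified view of a switch port.
type port struct {
	Up       bool
	Enabled  bool
	PoEPower float64 // watts currently supplied; the field name is an assumption
}

// isDeadPort reports whether a port should be skipped when exporting metrics.
// A port that is supplying PoE power is never dead, even if disabled or down.
func isDeadPort(p port) bool {
	if p.PoEPower > 0 {
		return false
	}
	return !p.Up || !p.Enabled
}

func main() {
	fmt.Println(isDeadPort(port{Enabled: false, PoEPower: 4.2}))
	fmt.Println(isDeadPort(port{}))
}
```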
Fixes #910
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Adds log_unknown_types config option (default: false) to control logging
of unknown UniFi device types. When disabled (default), unknown devices
are silently ignored to reduce log volume. When enabled, they are logged
as DEBUG messages instead of ERROR. Addresses issue #912.
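The option's effect might look like the sketch below. The `config` struct, the `unknownTypeMessage` helper, and the message text are hypothetical; only the option name and the DEBUG level come from the description above.

```go
package main

import "log"

// config holds the hypothetical Go counterpart of log_unknown_types (default: false).
type config struct {
	LogUnknownTypes bool
}

// unknownTypeMessage returns the DEBUG line for an unknown device type and
// whether it should be logged at all. With the default config nothing is logged.
func unknownTypeMessage(cfg config, deviceType string) (string, bool) {
	if !cfg.LogUnknownTypes {
		return "", false
	}
	return "[DEBUG] unknown UniFi device type: " + deviceType, true
}

func main() {
	for _, cfg := range []config{{}, {LogUnknownTypes: true}} {
		if msg, ok := unknownTypeMessage(cfg, "xyz"); ok {
			log.Println(msg)
		}
	}
}
```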
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Update go.mod to use unifi library v5.6.0 (includes remote API support)
- Remove temporary replace directive now that v5.6.0 is published
- Fix empty-block linter errors in input.go by removing empty if blocks
Remove empty if blocks by inverting conditions:
- Line 289: Invert Remote check for URL default
- Line 303: Invert APIKey check in Remote mode
- Line 401: Invert Remote check for URL default in setControllerDefaults
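As an illustration of the inversion, the sketch below assumes the flagged code had the shape `if c.Remote { /* empty */ } else if c.URL == "" { ... }`. The `controller` struct and the `defaultURL` value are hypothetical and are not copied from input.go.

```go
package main

import "fmt"

// defaultURL is a hypothetical default used only for this example.
const defaultURL = "https://127.0.0.1:8443"

// controller is a simplified, hypothetical controller config.
type controller struct {
	Remote bool
	URL    string
}

// setURLDefault fills in the local default URL for non-remote controllers only.
// Inverting the Remote check removes the need for an empty if branch.
func setURLDefault(c *controller) {
	if !c.Remote && c.URL == "" {
		c.URL = defaultURL
	}
}

func main() {
	local := &controller{}
	remote := &controller{Remote: true}
	setURLDefault(local)
	setURLDefault(remote)
	fmt.Printf("local=%q remote=%q\n", local.URL, remote.URL)
}
```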
- Apply site name override to DPI clients (ClientsDPI) in augmentMetrics
- Apply site name override to client anomalies when collecting events
- Apply site name override to sites (both Name and SiteName fields) when adding to metrics
- Apply site name override to DPI sites, speed tests, and country traffic
- Move applySiteNameOverride call to end of augmentMetrics to ensure all metrics are processed
- This ensures all Prometheus metrics use console names instead of 'Default (default)' for Cloud Gateways
- Add isDefaultSiteName helper to match any site name containing 'default' (case-insensitive)
- Handles variations like 'Default', 'default', 'Default (default)', etc.
- Ensures site_name in metrics shows console names instead of generic 'Default' values
- Makes metrics more compatible with existing dashboards that expect meaningful site names
- Also checks SiteName field on sites in addition to Name field
- Keep actual site name 'default' for API calls to prevent 404 errors
- Apply site name override only in metrics for display purposes
- Fixes issue where console names were used in API paths causing 404s
- Site name override now correctly applied to devices, clients, sites, and rogue APs in metrics only
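A minimal sketch of the site name handling described in the bullets above. `isDefaultSiteName` is named in the change list, but its body here is an assumption, and `displaySiteName` is a hypothetical helper showing the override applied to the metric label only while the API site name stays untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// isDefaultSiteName reports whether a site name contains "default",
// ignoring case, so "Default", "default" and "Default (default)" all match.
func isDefaultSiteName(name string) bool {
	return strings.Contains(strings.ToLower(name), "default")
}

// displaySiteName returns the name to use in metrics. The console name
// replaces generic default names; all other names are kept as they are.
func displaySiteName(siteName, consoleName string) string {
	if consoleName != "" && isDefaultSiteName(siteName) {
		return consoleName
	}
	return siteName
}

func main() {
	apiSite := "default" // still used in API paths to avoid 404 errors
	label := displaySiteName("Default (default)", "Office UDM")
	fmt.Printf("api path site=%q metric site_name=%q\n", apiSite, label)
}
```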
- Add remote API mode with automatic controller discovery
- Discover consoles via /v1/hosts endpoint
- Auto-discover sites for each console via integration API
- Use console name from hosts response as site name override for Cloud Gateways
- Support both config-level and per-controller remote mode
- Add example configs for YAML, JSON, and TOML formats
- Remote API uses api.ui.com with X-API-Key authentication
- Automatically discovers all consoles when remote=true and remote_api_key is set
This enables monitoring multiple UniFi Cloud Gateways through a single
API key without requiring direct network access to each controller.
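For illustration, a discovery request to the hosts endpoint could be built as below. This sketch only constructs the request and sends nothing. The `newHostsRequest` helper, the `https` scheme, and the `Accept` header are assumptions; the host, path, and `X-API-Key` header come from the description above.

```go
package main

import (
	"fmt"
	"net/http"
)

// hostsURL combines the host and path named in the commit message.
const hostsURL = "https://api.ui.com/v1/hosts"

// newHostsRequest builds (but does not send) a console discovery request
// authenticated with the X-API-Key header.
func newHostsRequest(apiKey string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, hostsURL, nil)
	if err != nil {
		return nil, fmt.Errorf("building hosts request: %w", err)
	}
	req.Header.Set("X-API-Key", apiKey)
	req.Header.Set("Accept", "application/json")
	return req, nil
}

func main() {
	req, err := newHostsRequest("example-key")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
}
```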
Bumps the all group with 1 update: [golang.org/x/crypto](https://github.com/golang/crypto).
Updates `golang.org/x/crypto` from 0.46.0 to 0.47.0
- [Commits](https://github.com/golang/crypto/compare/v0.46.0...v0.47.0)
---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-version: 0.47.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: all
...
Signed-off-by: dependabot[bot] <support@github.com>
The UniFi controller HTTP client was created without a timeout, causing
unpoller to hang indefinitely when the controller becomes unresponsive.
This resulted in random stops where polling would cease until the
container was restarted.
Changes:
- Add Timeout field to Controller struct (cnfg.Duration)
- Set default timeout of 60 seconds
- Pass timeout to unifi.Config when creating the client
- Log timeout value on startup for visibility
The timeout can be configured via:
- Config file: timeout = "60s"
- Environment: UP_UNIFI_DEFAULT_TIMEOUT=60s
Fixes issue where container would hang overnight:
2025/12/22 22:29:27 - Requesting https://unifi/.../stat/sta
[~2 hour gap - request hung indefinitely]
2025/12/23 00:17:57 - Unmarshalling Device Type: udm...
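The core of the fix is that a Go `http.Client` with a zero `Timeout` waits indefinitely. A minimal sketch of applying the 60 second default follows; `newHTTPClient` is a hypothetical helper, and the real change passes the value through `unifi.Config` rather than building the client directly.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// defaultTimeout mirrors the 60 second default described above.
const defaultTimeout = 60 * time.Second

// newHTTPClient returns a client that gives up after the configured timeout.
// A zero or negative value falls back to the default instead of waiting forever.
func newHTTPClient(timeout time.Duration) *http.Client {
	if timeout <= 0 {
		timeout = defaultTimeout
	}
	return &http.Client{Timeout: timeout}
}

func main() {
	fmt.Println("unset:", newHTTPClient(0).Timeout)
	fmt.Println("custom:", newHTTPClient(15*time.Second).Timeout)
}
```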
- Add SaveProtectLogs config option to enable Protect log collection
- Add ProtectThumbnails config option to fetch event thumbnails
- Add collectProtectLogs function with 24h default fetch window
- Add ProtectLogEvent for Loki reporting with separate thumbnail log lines
- Add PII redaction for Protect log entries
- Filter thumbnail fetching to camera events only (motion, smartDetect*, etc.)
- Update log output to show Protect logs status
Add new save_syslog config option to collect events from the v2 UniFi
system-log API (/v2/api/site/{site}/system-log/all).
Changes:
- Add SaveSyslog field to Controller struct
- Add collectSyslog() function using v2 API
- Keep collectEvents() using v1 API for backwards compatibility
- Add RedactIPPII() helper for PII redaction
- Update lokiunifi to log raw JSON (parseable with Loki | json)
- Reduce indexed labels to low-cardinality fields only
- Add SystemLogEntry handler in lokiunifi report
Config: save_syslog (v2 API) vs save_events (v1 API)
Env: UP_UNIFI_DEFAULT_SAVE_SYSLOG=true
Track the number of bytes written per request for both InfluxDB and Prometheus outputs.
InfluxDB:
- Added bytesT counter constant
- Implemented calculateMetricBytes() to estimate line protocol size
- Updated batchV1() and batchV2() to count bytes per point
- Updated log output to display bytes written
Prometheus:
- Added Bytes field to Report struct
- Updated export() to calculate approximate metric byte size
- Updated log output to display bytes written
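To show what estimating line protocol size can mean, here is a rough sketch. It is not the `calculateMetricBytes()` implementation from the commit: `estimateLineBytes` is a hypothetical function that ignores escaping and type suffixes and assumes a 19-digit nanosecond timestamp.

```go
package main

import "fmt"

// timestampLen assumes a 19-digit nanosecond Unix timestamp.
const timestampLen = 19

// estimateLineBytes approximates the size of one line protocol record:
// measurement,tag=value field=value timestamp\n
func estimateLineBytes(measurement string, tags map[string]string, fields map[string]any) int {
	n := len(measurement)
	for k, v := range tags {
		n += 1 + len(k) + 1 + len(v) // ",key=value"
	}
	n++ // space between tag set and field set
	first := true
	for k, v := range fields {
		if !first {
			n++ // comma between fields
		}
		first = false
		n += len(k) + 1 + len(fmt.Sprint(v)) // "key=value"
	}
	n += 1 + timestampLen // space plus timestamp
	n++                   // trailing newline
	return n
}

func main() {
	size := estimateLineBytes(
		"clients",
		map[string]string{"site": "home"},
		map[string]any{"rx": 10},
	)
	fmt.Println("estimated bytes:", size)
}
```

For the call in `main`, the estimate corresponds to the length of `clients,site=home rx=10 1700000000000000000` plus a newline.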
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Speed tests were not being reported correctly for multi-WAN setups
because the device-level speedtest-status field was returning zeros.
The data has moved to a new aggregated dashboard API endpoint.
Changes:
- Add GetSpeedTests() and GetSiteSpeedTests() methods to fetch from
  /v2/api/site/{site}/aggregated-dashboard endpoint
- Create SpeedTestResult data structures to capture per-WAN metrics
- Update Prometheus exporter with new speedtest_* metrics per interface
- Update InfluxDB exporter to write speedtest measurements per WAN
- Update Datadog exporter with unifi.speedtest.* metrics per WAN
- Update metrics collection to include speed test data for all sites
Metrics now include labels/tags for:
- wan_interface: Physical interface (eth8, eth9, etc.)
- wan_group: Logical WAN name (WAN, WAN2, etc.)
- site_name: Site identifier
- source: Controller URL
Gracefully handles older controllers without the new API endpoint.
Fixes #841
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This change expands the metrics exported for UBB devices to InfluxDB
and Datadog so that both match the coverage already added to the
Prometheus output.
Changes to InfluxDB (pkg/influxunifi/ubb.go):
- Added batchUBBstats() to export comprehensive statistics separated
  by radio (total, wifi0, terra2, user-wifi0, user-terra2)
- Added VAP table export via processVAPTable()
- Added Radio table export via processRadTable()
- Added P2P stats (rx_rate, tx_rate, throughput)
- Added link quality metrics (link_quality, link_quality_current,
  link_capacity)
- Comprehensive stats exported to new "ubb_stats" table with full
  breakdown of traffic per radio
Changes to Datadog (pkg/datadogunifi/ubb.go):
- Added batchUBBstats() to export comprehensive statistics separated
  by radio (total, wifi0, terra2, user-wifi0, user-terra2)
- Added VAP table export via processVAPTable()
- Added Radio table export via processRadTable()
- Added P2P stats (rx_rate, tx_rate, throughput)
- Added link quality metrics (link_quality, link_quality_current,
  link_capacity)
- Comprehensive stats exported with namespace "ubb.stats"
All implementations now fully support:
- 5GHz radio (wifi0) metrics
- 60GHz radio (terra2/ad) metrics, covering 802.11ad
- Per-radio RX/TX packets, bytes, errors, dropped, retries
- User-specific metrics for each radio
- Interface-specific metrics (ath0 for 5GHz, wlan0 for 60GHz)
- Point-to-point link statistics and quality metrics
Fixes: #409
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>