
Alerts

Skipper tracks important cluster events (resource changes, OOM kills, stuck pods) and surfaces them through an in-console alerting system with optional Slack integration.

The bell icon

The bell icon in the sidebar shows a badge with the number of unread alerts. Click it to open the alerts panel, which lists the 50 most recent alerts, newest first.

Each alert shows:

  • Severity: critical (red), warning (yellow), or info (green)
  • App and namespace: which workload triggered the alert
  • Action: what happened (memory doubled, CPU increased, stuck pod deleted)
  • Reason: why the action was taken (OOM kill detected, usage at 92%)
  • Timestamp: when the event occurred
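
As a rough illustration, the fields above could be modeled as a small record. This is only a sketch: the Alert and Severity names, the field names, and the sample values are assumptions made for illustration, not Skipper's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Severity(Enum):
    """Severity levels as listed in the alerts panel."""
    CRITICAL = "critical"  # shown in red
    WARNING = "warning"    # shown in yellow
    INFO = "info"          # shown in green


@dataclass(frozen=True)
class Alert:
    """Hypothetical in-memory shape of one alert (field names are assumed)."""
    severity: Severity
    app: str
    namespace: str
    action: str
    reason: str
    timestamp: datetime


# Sample values are invented for the example.
example = Alert(
    severity=Severity.CRITICAL,
    app="checkout",
    namespace="shop",
    action="memory doubled",
    reason="OOM kill detected",
    timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```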

What triggers alerts

Alerts are generated by the resource controller when running in auto mode. The following events create alerts:

Resource adjustments

When CPU or memory usage stays above 80% or below 20% for 3 consecutive checks (checks run every 60 seconds), the controller adjusts resources and creates an alert. The alert records the old and new values so you can see exactly what changed.
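
The consecutive-check rule can be sketched as a small streak counter. This is a minimal illustration under stated assumptions, not the controller's actual code: the UsageTracker class and its method names are hypothetical, usage is expressed as a fraction between 0 and 1, and a reading inside the 20% to 80% band is assumed to reset both streaks.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_THRESHOLD = 0.80   # usage above this counts toward scaling up
LOW_THRESHOLD = 0.20    # usage below this counts toward scaling down
REQUIRED_CHECKS = 3     # consecutive checks needed before acting


@dataclass
class UsageTracker:
    """Hypothetical per-workload tracker, fed one reading per check."""
    high_streak: int = 0
    low_streak: int = 0

    def observe(self, usage: float) -> Optional[str]:
        """Record one reading; return 'increase', 'decrease', or None."""
        if usage > HIGH_THRESHOLD:
            self.high_streak += 1
            self.low_streak = 0
        elif usage < LOW_THRESHOLD:
            self.low_streak += 1
            self.high_streak = 0
        else:
            # Assumption: a normal reading breaks any streak in progress.
            self.high_streak = 0
            self.low_streak = 0

        if self.high_streak >= REQUIRED_CHECKS:
            self.high_streak = 0
            return "increase"
        if self.low_streak >= REQUIRED_CHECKS:
            self.low_streak = 0
            return "decrease"
        return None
```

Feeding the tracker three high readings in a row yields "increase" on the third. A single normal reading in between restarts the count.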

OOM kills

When a pod is terminated due to an out-of-memory condition, the controller immediately doubles the memory limit and creates a critical alert. Unlike resource adjustments, OOM recovery does not wait for multiple consecutive checks; it acts on the first detection.
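
A minimal sketch of the OOM response, assuming the memory limit is tracked in MiB. The handle_oom function and the keys of the returned dictionary are hypothetical names chosen for the example.

```python
def handle_oom(app: str, memory_limit_mib: int) -> dict:
    """Double the memory limit on the first OOM detection.

    Returns a dictionary describing the critical alert, including the
    old and new limits. Key names are assumptions for illustration.
    """
    new_limit_mib = memory_limit_mib * 2
    return {
        "severity": "critical",
        "app": app,
        "action": "memory doubled",
        "reason": "OOM kill detected",
        "old_memory_mib": memory_limit_mib,
        "new_memory_mib": new_limit_mib,
    }
```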

Stuck pods

If a pod remains in ContainerCreating state for more than 5 minutes, the controller deletes it (allowing Kubernetes to recreate it) and creates a warning alert.
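
The stuck-pod check amounts to a simple time comparison. The sketch below assumes the controller knows when the pod entered its current state; the is_stuck function and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone

STUCK_TIMEOUT = timedelta(minutes=5)


def is_stuck(state: str, state_since: datetime, now: datetime) -> bool:
    """True if the pod has sat in ContainerCreating for more than 5 minutes."""
    return state == "ContainerCreating" and (now - state_since) > STUCK_TIMEOUT


# Example: a pod that entered ContainerCreating six minutes ago.
now = datetime(2024, 1, 1, 12, 6, tzinfo=timezone.utc)
started = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
stuck = is_stuck("ContainerCreating", started, now)
```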

Node resource pressure

When total memory usage across all pods exceeds 80% of the node's allocatable memory, the controller generates a warning alert. At 90% or higher, the alert is marked critical. The alert lists the workloads using the most memory and flags those that have grown significantly in the last 10 minutes.
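
The severity thresholds can be expressed as a short classification function. This is a sketch under assumptions: the function names are hypothetical, memory is given as integer byte (or MiB) counts, and the number of top consumers shown is an arbitrary choice. How Skipper decides that growth is "significant" is not specified here, so that part is left out.

```python
from typing import Dict, List, Optional


def node_pressure_severity(used: int, allocatable: int) -> Optional[str]:
    """Classify node memory pressure; integer math avoids float rounding."""
    if used * 100 >= allocatable * 90:
        return "critical"   # 90% or higher
    if used * 100 > allocatable * 80:
        return "warning"    # above 80%
    return None             # no alert


def top_consumers(usage_by_workload: Dict[str, int], n: int = 3) -> List[str]:
    """Return the n workloads using the most memory, largest first."""
    ranked = sorted(usage_by_workload.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]
```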

Default profile application

When a new app has no resource requests configured, the controller applies profile defaults and creates an informational alert recording which profile was used.
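
A sketch of the default-profile step. The profile name and the CPU and memory values below are invented placeholders, and apply_defaults is a hypothetical helper; the real profiles and their values are defined by Skipper's configuration.

```python
from typing import Dict, Optional, Tuple

# Hypothetical profile table; names and values are placeholders.
PROFILES: Dict[str, Dict[str, str]] = {
    "web": {"cpu": "100m", "memory": "128Mi"},
}


def apply_defaults(
    requests: Optional[Dict[str, str]], profile: str
) -> Tuple[Dict[str, str], Optional[dict]]:
    """Return (effective requests, alert or None).

    Existing requests are left untouched and produce no alert. Missing
    requests are filled from the profile and produce an info alert.
    """
    if requests:
        return requests, None
    defaults = dict(PROFILES[profile])
    alert = {
        "severity": "info",
        "action": "default profile applied",
        "profile": profile,
    }
    return defaults, alert
```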

Slack integration

Forward alerts to a Slack channel. See Configuration for setup.

Dismissing alerts

Click Dismiss in the alerts panel to mark all current alerts as read. The unread count on the bell icon resets to zero. Dismiss is per-user, so each team member has their own read/unread state.

Dismissed alerts are not deleted. They remain visible in the alerts panel but no longer contribute to the unread count. New alerts that arrive after dismissal will increment the badge again.
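
One way to picture per-user dismissal is a marker per user that records the newest alert seen at dismissal time. The sketch below is an assumption about how such state could be kept, not a description of Skipper's internals: the ReadState class is hypothetical, and alerts are assumed to carry increasing integer sequence numbers.

```python
from typing import Dict, List


class ReadState:
    """Hypothetical per-user read markers; alerts themselves are never deleted."""

    def __init__(self) -> None:
        # Maps user -> highest alert sequence number dismissed so far.
        self._dismissed_up_to: Dict[str, int] = {}

    def dismiss(self, user: str, alert_ids: List[int]) -> None:
        """Mark all current alerts as read for this user only."""
        if alert_ids:
            self._dismissed_up_to[user] = max(alert_ids)

    def unread_count(self, user: str, alert_ids: List[int]) -> int:
        """Count alerts newer than the user's last dismissal."""
        marker = self._dismissed_up_to.get(user, -1)
        return sum(1 for alert_id in alert_ids if alert_id > marker)
```

With this shape, one user dismissing alerts leaves every other user's badge unchanged, and alerts that arrive later count as unread again.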

Storage

Alerts are stored in a Kubernetes ConfigMap named skipper-alerts in the skipper-system namespace. The system retains only the 50 most recent alerts and automatically discards older ones. No external database is required.
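
The retention rule behaves like a bounded buffer. The sketch below shows the idea with Python's collections.deque and a JSON payload of the kind that could be placed in a ConfigMap data field. The AlertStore class, its method names, and the JSON layout are assumptions for illustration, not Skipper's actual storage format.

```python
import json
from collections import deque
from typing import Deque, List

MAX_ALERTS = 50  # retention limit described above


class AlertStore:
    """Hypothetical bounded store that keeps only the newest alerts."""

    def __init__(self) -> None:
        # A deque with maxlen drops the oldest entry when a new one is appended.
        self._alerts: Deque[dict] = deque(maxlen=MAX_ALERTS)

    def record(self, alert: dict) -> None:
        self._alerts.append(alert)

    def newest_first(self) -> List[dict]:
        return list(reversed(self._alerts))

    def to_configmap_value(self) -> str:
        """Serialize to a JSON string suitable for a ConfigMap data entry."""
        return json.dumps(self.newest_first())
```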

Released under the Apache 2.0 License.