Observability

Kipper includes a built-in observability stack for production monitoring: Loki for logs, Prometheus for metrics, and Grafana for dashboards.

All three are installed automatically during kip install and configured to work together out of the box.

Accessing Grafana

Grafana is available at https://grafana-<your-domain>, for example:

https://grafana-46-225-91-12.kipper.run

Default credentials:

  • Username: admin
  • Password: skipper

Change the password after first login.

What's included

Loki: Log aggregation

Loki collects logs from all pods across all namespaces. Unlike streaming logs from a single pod (kip app logs), Loki gives you:

  • Persistent logs: survive pod restarts and crashes
  • Searchable: filter by app, namespace, time range, or text content
  • Multi-pod: see logs from all replicas of an app in one view

In Grafana, go to Explore → select Loki as the data source → query with LogQL:

{namespace="yourr-name-test", app="domain-service"}

Filter for errors:

{namespace="yourr-name-test", app="domain-service"} |= "ERROR"
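
The same selector can also feed a LogQL metric query, which is useful for graphing error volume instead of reading individual lines. The sketch below only prints the query text so it can be pasted into Explore. The helper name logql_error_count and the 5-minute window are illustrative assumptions, not part of kip.

```shell
# Hypothetical helper (not part of kip): prints a LogQL metric query that
# counts log lines containing a given text in 5-minute windows.
# Arguments: namespace, app, text to match (defaults to ERROR).
logql_error_count() {
  printf 'sum(count_over_time({namespace="%s", app="%s"} |= "%s" [5m]))\n' \
    "$1" "$2" "${3:-ERROR}"
}

# Print the query for the app used in the examples above.
logql_error_count "yourr-name-test" "domain-service"
```

Passing a third argument such as WARN changes the text being matched.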

Prometheus: Metrics

Prometheus collects CPU, memory, network, and request metrics from all pods and nodes. It is pre-configured with:

  • Node exporter: CPU, memory, disk, network per node
  • kube-state-metrics: pod status, deployment health, replica counts
  • Pod metrics: CPU and memory usage per container

In Grafana, go to Explore → select Prometheus as the data source → query with PromQL:

container_memory_usage_bytes{namespace="yourr-name-test"}
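
Raw container series are usually easier to read once aggregated. As a rough sketch, the helper below prints a PromQL query that sums memory per pod. The function name is hypothetical, and the container!="" matcher assumes the usual cAdvisor behaviour of also exporting pod-level series with an empty container label, which would otherwise be counted twice.

```shell
# Hypothetical helper (not part of kip): prints a PromQL query that sums
# container memory usage per pod in one namespace.
# Assumption: container!="" skips pod-level aggregate series.
promql_memory_by_pod() {
  printf 'sum by (pod) (container_memory_usage_bytes{namespace="%s", container!=""})\n' "$1"
}

# Print the query for the namespace used in the example above.
promql_memory_by_pod "yourr-name-test"
```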

Grafana: Dashboards

Grafana comes with pre-built dashboards for cluster monitoring. Access them from the sidebar → Dashboards.

Useful built-in dashboards:

  • Kubernetes / Compute Resources / Namespace: CPU and memory per namespace
  • Kubernetes / Compute Resources / Pod: CPU and memory per pod
  • Node Exporter Full: detailed node health

AI log analysis

The log viewers in the web console (for apps, functions, and jobs) include an Analyse button. Click it to send the currently visible logs to the configured AI provider for analysis.

The AI scans the log output for errors, warnings, stack traces, and unusual patterns. It returns a summary of what happened, highlights the most likely root cause, and suggests next steps. This is especially useful when debugging unfamiliar stack traces or sifting through high-volume log output where the signal is buried in noise.

AI log analysis works with both live streaming logs and Loki history queries. The analysis uses whatever logs are currently displayed. Use the time range and search filters to narrow the context before clicking Analyse.

AI log analysis requires an AI provider to be configured on the Settings page. See Configuration: AI provider settings for setup.

Disabling monitoring

On smaller servers (8-12 GB RAM), the monitoring stack can be disabled to free approximately 1-2 GB of memory for your applications. Live log streaming (kip app logs and the Console log viewer) continues to work; only persistent log storage and metrics collection are affected.

Disable

kip monitoring disable
  Disabling monitoring stack...
    ✔  Loki scaled down
    ✔  Promtail scaled down
    ✔  Prometheus scaled down
    ✔  Grafana scaled down

  Monitoring disabled. This frees ~1-2 GB of memory.
  Run 'kip monitoring enable' to turn it back on.

Re-enable

kip monitoring enable
  Enabling monitoring stack...
    ✔  Loki scaled up
    ✔  Promtail scaled up
    ✔  Prometheus scaled up
    ✔  Grafana scaled up

  Monitoring enabled. Components may take a minute to become ready.
  Run 'kip monitoring status' to check progress.

Check status

kip monitoring status
  Monitoring stack:

    ✔  Loki           running (1/1)
    ✔  Promtail       running (1/1)
    ✔  Prometheus     running (1/1)
    ✔  Grafana        running (1/1)

When monitoring is disabled, kip status shows the components as "disabled" rather than unhealthy.
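
Because components may take a minute to become ready after kip monitoring enable, a script can poll the status output. This is a minimal sketch, assuming the output keeps the format shown above, with one line per component containing the word running. monitoring_ready is a hypothetical helper, and how a component that is not yet ready is reported is not documented here.

```shell
# Hypothetical helper (not part of kip): reads `kip monitoring status`
# output on stdin and succeeds only if all four components are listed
# and every one of those lines contains "running".
monitoring_ready() {
  awk '
    /Loki|Promtail|Prometheus|Grafana/ {
      seen++
      if ($0 ~ /running/) ok++
    }
    END { exit !(seen == 4 && ok == 4) }
  '
}

# Example (requires kip on the PATH):
#   until kip monitoring status | monitoring_ready; do sleep 5; done
```

The check matches on component names rather than on the checkmark, so it does not depend on how the terminal renders the symbol.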

Resource usage

The observability stack is configured for single-node clusters:

Component            Memory request   Memory limit
Prometheus           256 MB           512 MB
Loki                 128 MB           512 MB
Grafana              64 MB            128 MB
Promtail             32 MB            128 MB
kube-state-metrics   32 MB            64 MB
node-exporter        32 MB            64 MB

Total: approximately 550 MB requested at idle. Prometheus memory grows with the number of active time series.
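
The total can be checked by summing the table. The sketch below copies the figures into a small list and adds up both columns; the variable and function names are illustrative only.

```shell
# Memory requests and limits in MB, copied from the table above.
# Columns: component, request, limit.
resources='Prometheus 256 512
Loki 128 512
Grafana 64 128
Promtail 32 128
kube-state-metrics 32 64
node-exporter 32 64'

# Sum the request and limit columns read from stdin.
sum_memory() {
  awk '{ req += $2; lim += $3 }
       END { printf "requested: %d MB, limits: %d MB\n", req, lim }'
}

printf '%s\n' "$resources" | sum_memory
# prints: requested: 544 MB, limits: 1408 MB
```

The request total (544 MB) is what the Kubernetes scheduler reserves for the stack. The limit total is a ceiling that is reached only if every component hits its limit at the same time.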

Data retention

  • Metrics (Prometheus): 3 days
  • Logs (Loki): 3 days

For longer retention, update the Helm values via the k3s HelmChart resource in kube-system.

Architecture

All components run in the monitoring namespace and are managed by Helm charts via k3s.

Released under the Apache 2.0 License.