Observability

Skipper includes a built-in observability stack for production monitoring: Loki for logs, Prometheus for metrics, and Grafana for dashboards.

All three are installed automatically during kip install and configured to work together out of the box.

Accessing Grafana

Grafana is available at https://grafana-<your-domain>, for example:

https://grafana-46-225-91-12.kipper.run

Default credentials:

Username: admin
Password: skipper

Change the password after first login.

What's included

Loki: Log aggregation

Loki collects logs from all pods across all namespaces. Unlike streaming logs from a single pod (kip app logs), Loki gives you:

Persistent logs: survive pod restarts and crashes
Searchable: filter by app, namespace, time range, or text content
Multi-pod: see logs from all replicas of an app in one view

In Grafana, go to Explore → select Loki as the data source → query with LogQL:

{namespace="yourr-name-test", app="domain-service"}

Filter for errors:

{namespace="yourr-name-test", app="domain-service"} |= "ERROR"
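The same queries can also be run outside Grafana against Loki's HTTP API. The sketch below is a hedged example, not part of the kip CLI: logql_query and loki_query_range are hypothetical helper names, and LOKI_URL is an assumption. It defaults to http://localhost:3100 (Loki's default HTTP port), which presumes you have made the Loki service reachable locally, for example with a port-forward.

```shell
#!/usr/bin/env bash

# Build a LogQL query: a stream selector plus an optional line filter.
# Usage: logql_query NAMESPACE APP [TEXT]
logql_query() {
  local namespace="$1" app="$2" text="${3:-}"
  local query="{namespace=\"${namespace}\", app=\"${app}\"}"
  if [ -n "$text" ]; then
    # |= keeps only the lines that contain the given text
    query="${query} |= \"${text}\""
  fi
  printf '%s\n' "$query"
}

# Send the query to Loki's query_range endpoint (the response is JSON).
# LOKI_URL is an assumption; point it at wherever Loki is reachable.
loki_query_range() {
  curl -sG "${LOKI_URL:-http://localhost:3100}/loki/api/v1/query_range" \
    --data-urlencode "query=$(logql_query "$@")" \
    --data-urlencode "limit=100"
}
```

Calling logql_query yourr-name-test domain-service ERROR prints the error filter shown above; loki_query_range takes the same arguments and fetches up to 100 matching lines.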

Prometheus: Metrics

Prometheus collects CPU, memory, network, and request metrics from all pods and nodes. It comes pre-configured with:

Node exporter: CPU, memory, disk, network per node
kube-state-metrics: pod status, deployment health, replica counts
Pod metrics: CPU and memory usage per container

In Grafana, go to Explore → select Prometheus as the data source → query with PromQL:

container_memory_usage_bytes{namespace="yourr-name-test"}
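The raw metric returns one series per container, so a per-pod total is often easier to read. The following is a minimal sketch under stated assumptions: promql_memory_by_pod and prometheus_query are hypothetical helper names, PROMETHEUS_URL is an assumption (defaulting to Prometheus's standard port 9090 on localhost, for example behind a port-forward), and the query assumes the usual pod and container labels on container metrics.

```shell
#!/usr/bin/env bash

# Build a PromQL query that sums container memory per pod in a namespace.
# container!="" skips the pod-level series that is typically reported
# alongside the per-container ones, so memory is not counted twice.
promql_memory_by_pod() {
  local namespace="$1"
  printf 'sum by (pod) (container_memory_usage_bytes{namespace="%s", container!=""})\n' "$namespace"
}

# Run the query as an instant query against the Prometheus HTTP API.
# PROMETHEUS_URL is an assumption; adjust it for your setup.
prometheus_query() {
  curl -sG "${PROMETHEUS_URL:-http://localhost:9090}/api/v1/query" \
    --data-urlencode "query=$(promql_memory_by_pod "$1")"
}
```

The string printed by promql_memory_by_pod can also be pasted directly into the Explore view in Grafana.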

Grafana: Dashboards

Grafana comes with pre-built dashboards for cluster monitoring. Access them from the sidebar → Dashboards.

Useful built-in dashboards:

Kubernetes / Compute Resources / Namespace: CPU and memory per namespace
Kubernetes / Compute Resources / Pod: CPU and memory per pod
Node Exporter Full: detailed node health

AI log analysis

The log viewers in the web console (for apps, functions, and jobs) include an Analyse button. Click it to send the currently visible logs to the configured AI provider for analysis.

The AI scans the log output for errors, warnings, stack traces, and unusual patterns. It returns a summary of what happened, highlights the most likely root cause, and suggests next steps. This is especially useful when debugging unfamiliar stack traces or sifting through high-volume log output where the signal is buried in noise.

AI log analysis works with both live streaming logs and Loki history queries. The analysis uses whatever logs are currently displayed. Use the time range and search filters to narrow the context before clicking Analyse.

This feature requires an AI provider to be configured on the Settings page. See Configuration: AI provider settings for setup.

Disabling monitoring

On smaller servers (8-12 GB RAM), you can disable the monitoring stack to free approximately 1-2 GB of memory for your applications. Live log streaming, via kip app logs and the Console log viewer, continues to work. Only persistent log storage and metrics collection are affected.

Disable

kip monitoring disable

  Disabling monitoring stack...
    ✔  Loki scaled down
    ✔  Promtail scaled down
    ✔  Prometheus scaled down
    ✔  Grafana scaled down

  Monitoring disabled. This frees ~1-2 GB of memory.
  Run 'kip monitoring enable' to turn it back on.

Re-enable

kip monitoring enable

  Enabling monitoring stack...
    ✔  Loki scaled up
    ✔  Promtail scaled up
    ✔  Prometheus scaled up
    ✔  Grafana scaled up

  Monitoring enabled. Components may take a minute to become ready.
  Run 'kip monitoring status' to check progress.
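If you script the re-enable step, you may want to block until the components are ready instead of polling by hand. This is a hedged sketch using plain kubectl rather than a kip feature: wait_for_monitoring is a hypothetical helper name, it assumes you have kubectl access to the cluster, and it relies on the components running in the monitoring namespace.

```shell
#!/usr/bin/env bash

# Block until every pod in the monitoring namespace reports Ready,
# or give up after the timeout (first argument, default 120s).
wait_for_monitoring() {
  kubectl wait --for=condition=Ready pods --all -n monitoring --timeout="${1:-120s}"
}
```

Run immediately after enabling, kubectl may report that no resources match because the pods have not been created yet; retrying after a few seconds is usually enough.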

Check status

kip monitoring status

  Monitoring stack:

    ✔  Loki           running (1/1)
    ✔  Promtail       running (1/1)
    ✔  Prometheus     running (1/1)
    ✔  Grafana        running (1/1)

When monitoring is disabled, kip status shows the components as "disabled" rather than unhealthy.

Resource usage

The observability stack is configured for single-node clusters:

Component	Memory request	Memory limit
Prometheus	256 MB	512 MB
Loki	128 MB	512 MB
Grafana	64 MB	128 MB
Promtail	32 MB	128 MB
kube-state-metrics	32 MB	64 MB
node-exporter	32 MB	64 MB

Total: approximately 550 MB of memory requested (the limits add up to about 1.4 GB). Prometheus memory grows with the number of active time series.
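The totals can be checked by summing the table columns. The snippet below is only an arithmetic illustration; memory_totals is a hypothetical helper name, and the figures are copied from the table above.

```shell
#!/usr/bin/env bash

# Sum the memory requests (column 2) and limits (column 3), in MB.
memory_totals() {
  awk '{ requests += $2; limits += $3 }
       END { printf "requests=%d MB, limits=%d MB\n", requests, limits }' <<'EOF'
Prometheus 256 512
Loki 128 512
Grafana 64 128
Promtail 32 128
kube-state-metrics 32 64
node-exporter 32 64
EOF
}

memory_totals   # prints: requests=544 MB, limits=1408 MB
```

The limits total, roughly 1.4 GB, is consistent with the 1-2 GB that disabling the stack is said to free.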

Data retention

Metrics (Prometheus): 3 days
Logs (Loki): 3 days

For longer retention, update the Helm values via the k3s HelmChart resource in kube-system.
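As a starting point, the HelmChart resources can be inspected with kubectl. This is a sketch under assumptions: list_helmcharts and edit_helmchart are hypothetical helper names, and the chart names created by kip install are not stated here, so list them first. In k3s, a HelmChart resource carries its chart values under spec.valuesContent; the exact retention keys depend on the chart in use and are not shown.

```shell
#!/usr/bin/env bash

# List the HelmChart resources managed by the k3s Helm controller.
list_helmcharts() {
  kubectl get helmcharts.helm.cattle.io -n kube-system
}

# Open one HelmChart for editing; chart values live under spec.valuesContent.
# The chart name is an input: pass one of the names printed by list_helmcharts.
edit_helmchart() {
  kubectl edit helmcharts.helm.cattle.io "$1" -n kube-system
}
```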

Architecture

All components run in the monitoring namespace and are managed by Helm charts via k3s.
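To look at the components directly, you can query the monitoring namespace with kubectl. This is a minimal sketch assuming kubectl access to the cluster; monitoring_pods and monitoring_memory are hypothetical helper names, and kubectl top only works if a metrics server is available in the cluster.

```shell
#!/usr/bin/env bash

# Show the monitoring pods and the nodes they run on.
monitoring_pods() {
  kubectl get pods -n monitoring -o wide
}

# Show current memory use per pod, highest first (needs a metrics server).
monitoring_memory() {
  kubectl top pods -n monitoring --sort-by=memory
}
```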
