Observability
Skipper includes a built-in observability stack for production monitoring: Loki for logs, Prometheus for metrics, and Grafana for dashboards.
All three are installed automatically during kip install and configured to work together out of the box.
Accessing Grafana
Grafana is available at https://grafana-<your-domain>, for example:

https://grafana-46-225-91-12.kipper.run

Default credentials:
- Username: admin
- Password: skipper
Change the password after first login.
What's included
Loki: Log aggregation
Loki collects logs from all pods across all namespaces. Unlike streaming logs from a single pod (kip app logs), Loki gives you:
- Persistent logs: survive pod restarts and crashes
- Searchable: filter by app, namespace, time range, or text content
- Multi-pod: see logs from all replicas of an app in one view
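Besides the Grafana UI, Loki exposes an HTTP API, so the same logs can be queried programmatically via its query_range endpoint. A minimal sketch that builds such a request URL with the Python standard library; the in-cluster address (loki.monitoring:3100) and the namespace are assumptions for illustration:

```python
from urllib.parse import urlencode

def build_loki_range_query(base_url: str, logql: str,
                           start_ns: int, end_ns: int,
                           limit: int = 100) -> str:
    """Build a URL for Loki's /loki/api/v1/query_range endpoint."""
    params = urlencode({
        "query": logql,
        "start": start_ns,  # nanosecond Unix timestamps
        "end": end_ns,
        "limit": limit,
    })
    return f"{base_url}/loki/api/v1/query_range?{params}"

# Hypothetical in-cluster Loki address; adjust for your setup.
url = build_loki_range_query(
    "http://loki.monitoring:3100",
    '{namespace="your-name-test", app="domain-service"} |= "ERROR"',
    start_ns=1_700_000_000_000_000_000,
    end_ns=1_700_000_060_000_000_000,
)
print(url)
```

Fetching this URL (for example with curl from inside the cluster) returns matching log lines as JSON.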
In Grafana, go to Explore → select Loki as the data source → query with LogQL:
{namespace="your-name-test", app="domain-service"}

Filter for errors:

{namespace="your-name-test", app="domain-service"} |= "ERROR"

Prometheus: Metrics
Prometheus collects CPU, memory, network, and request metrics from all pods and nodes. Pre-configured with:
- Node exporter: CPU, memory, disk, network per node
- kube-state-metrics: pod status, deployment health, replica counts
- Pod metrics: CPU and memory usage per container
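Prometheus also serves these metrics over its HTTP API, returning JSON vectors. A sketch that extracts per-pod memory from an instant-query response; the payload below is made up, but follows the documented shape of Prometheus's /api/v1/query response:

```python
import json

# Illustrative payload matching the shape of a Prometheus
# /api/v1/query response; the metric values are invented.
sample = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {"namespace": "your-name-test", "pod": "domain-service-abc"},
        "value": [1700000000, "134217728"]
      }
    ]
  }
}
""")

def memory_by_pod(payload: dict) -> dict:
    """Map pod name -> memory usage in MiB from an instant-query vector."""
    out = {}
    for series in payload["data"]["result"]:
        pod = series["metric"].get("pod", "<unknown>")
        _, raw = series["value"]  # [unix_ts, "value-as-string"]
        out[pod] = float(raw) / (1024 ** 2)
    return out

print(memory_by_pod(sample))  # -> {'domain-service-abc': 128.0}
```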
In Grafana, go to Explore → select Prometheus as the data source → query with PromQL:
container_memory_usage_bytes{namespace="your-name-test"}

Grafana: Dashboards
Grafana comes with pre-built dashboards for cluster monitoring. Access them from the sidebar → Dashboards.
Useful built-in dashboards:
- Kubernetes / Compute Resources / Namespace: CPU and memory per namespace
- Kubernetes / Compute Resources / Pod: CPU and memory per pod
- Node Exporter Full: detailed node health
AI log analysis
The log viewers in the web console (for apps, functions, and jobs) include an Analyse button. Click it to send the currently visible logs to the configured AI provider for analysis.
The AI scans the log output for errors, warnings, stack traces, and unusual patterns. It returns a summary of what happened, highlights the most likely root cause, and suggests next steps. This is especially useful when debugging unfamiliar stack traces or sifting through high-volume log output where the signal is buried in noise.
AI log analysis works with both live streaming logs and Loki history queries. The analysis uses whatever logs are currently displayed. Use the time range and search filters to narrow the context before clicking Analyse.
Requires an AI provider to be configured in the Settings page. See Configuration: AI provider settings for setup.
Disabling monitoring
On smaller servers (8-12 GB RAM), the monitoring stack can be disabled to free approximately 1-2 GB of memory for your applications. Live log streaming (kip app logs and the Console log viewer in the web console) continues to work; only persistent log storage and metrics collection are affected.
Disable
kip monitoring disable

Disabling monitoring stack...
✔ Loki scaled down
✔ Promtail scaled down
✔ Prometheus scaled down
✔ Grafana scaled down
Monitoring disabled. This frees ~1-2 GB of memory.
Run 'kip monitoring enable' to turn it back on.

Re-enable
kip monitoring enable

Enabling monitoring stack...
✔ Loki scaled up
✔ Promtail scaled up
✔ Prometheus scaled up
✔ Grafana scaled up
Monitoring enabled. Components may take a minute to become ready.
Run 'kip monitoring status' to check progress.

Check status
kip monitoring status

Monitoring stack:
✔ Loki running (1/1)
✔ Promtail running (1/1)
✔ Prometheus running (1/1)
✔ Grafana running (1/1)

When monitoring is disabled, kip status shows the components as "disabled" rather than unhealthy.
Resource usage
The observability stack is configured for single-node clusters:
| Component | Memory request | Memory limit |
|---|---|---|
| Prometheus | 256 MB | 512 MB |
| Loki | 128 MB | 512 MB |
| Grafana | 64 MB | 128 MB |
| Promtail | 32 MB | 128 MB |
| kube-state-metrics | 32 MB | 64 MB |
| node-exporter | 32 MB | 64 MB |
Total: approximately 550 MB requested at idle. Prometheus memory grows with the number of active time series.
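The idle total is simply the sum of the memory-request column above, which a quick check confirms:

```python
# Memory requests from the table above, in MB.
requests_mb = {
    "Prometheus": 256,
    "Loki": 128,
    "Grafana": 64,
    "Promtail": 32,
    "kube-state-metrics": 32,
    "node-exporter": 32,
}
total = sum(requests_mb.values())
print(total)  # 544 -> "approximately 550 MB"
```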
Data retention
- Metrics (Prometheus): 3 days
- Logs (Loki): 3 days
For longer retention, update the Helm values via the k3s HelmChart resource in kube-system.
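As a sketch of what such an update might look like, the fragment below edits retention on a k3s-managed HelmChart resource. The resource name and the chart value keys (server.retention follows the upstream prometheus chart convention) are assumptions and may differ in the chart versions Skipper installs; inspect the existing resources first with kubectl -n kube-system get helmcharts.

```yaml
# Sketch only: extend Prometheus retention from 3 to 7 days.
# Verify the resource name and value keys against your cluster first.
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: prometheus        # assumed name; match the installed resource
  namespace: kube-system
spec:
  valuesContent: |-
    server:
      retention: "7d"     # default documented above is 3 days
```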
Architecture
All components run in the monitoring namespace and are managed by Helm charts via k3s.
