Resource Management
Skipper automatically manages CPU and memory for your apps so you do not have to think about Kubernetes resource requests and limits. It monitors actual usage and adjusts allocations to match. It scales up when apps need more and scales down when they are over-provisioned.
Auto mode (default)
A background controller monitors resource usage via metrics-server every 60 seconds. When it detects sustained high or low usage, it adjusts CPU and memory requests and limits automatically.
How it works
| Condition | Threshold | Action |
|---|---|---|
| High usage | Above 80% for 3 consecutive checks | Increase by 50% |
| Low usage | Below 20% for 3 consecutive checks | Halve (with minimums) |
| OOM kill | Immediate | Double memory (subject to the node-based cap below) |
| Stuck pod | In ContainerCreating for 5+ minutes | Delete pod to trigger recreation |
The controller only acts when usage is consistently high or low. A single spike does not trigger a scale-up, and a brief idle period does not trigger a scale-down. That way, temporary load changes don't cause thrashing.
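The consecutive-check rule can be sketched as a small decision function (a hypothetical illustration, not Skipper's actual source; names and signatures are invented):

```python
def decide(usage_history, high=0.80, low=0.20, checks=3):
    """Return a scaling action based on the most recent usage samples.

    usage_history: utilisation ratios (actual usage / request), newest
    last. A decision requires `checks` consecutive readings on the same
    side of a threshold; anything else is treated as noise.
    """
    if len(usage_history) < checks:
        return "none"
    recent = usage_history[-checks:]
    if all(u > high for u in recent):
        return "scale_up"    # increase requests and limits by 50%
    if all(u < low for u in recent):
        return "scale_down"  # halve, subject to profile minimums
    return "none"
```

A single spike in the middle of normal readings (`[0.1, 0.95, 0.1]`) produces no action, which is exactly the anti-thrashing behaviour described above.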
Profile-based minimums
The controller never scales below the resource profile defaults. This prevents databases and heavy applications from being starved:
| Profile | Min CPU | Min memory |
|---|---|---|
| lightweight | 50m | 64 Mi |
| standard | 100m | 128 Mi |
| database | 250m | 256 Mi |
| jvm | 500m | 2 Gi |
Database services (PostgreSQL, MySQL, MongoDB, OpenSearch) automatically get the database profile.
OOM memory cap
OOM doubling is capped at 50% of total node allocatable memory (minimum 8 Gi). On a 16 GB node, the cap is 8 Gi. If an OOM-killed pod is already at the cap, the controller creates a critical alert instead of doubling further.
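As a sketch, the doubling-with-cap rule might look like this (an illustration of the rule as described, not Skipper's source):

```python
def oom_new_memory(current_mib, node_allocatable_mib):
    """Double memory after an OOM kill, capped at 50% of node
    allocatable memory, with the cap never lower than 8 Gi."""
    cap_mib = max(node_allocatable_mib // 2, 8 * 1024)
    if current_mib >= cap_mib:
        return None  # already at the cap: raise a critical alert instead
    return min(current_mib * 2, cap_mib)
```

On a 16 GiB node (treating all 16384 MiB as allocatable for simplicity), the cap is 8192 MiB: a pod at 6 Gi is raised to the 8 Gi cap rather than to 12 Gi.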
All values are rounded to clean boundaries: CPU to the nearest 50m, memory to the nearest 64 Mi. If rounding would produce the same value as the current setting, the controller skips the update.
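Sketched with hypothetical helpers (Skipper's internals may differ):

```python
def round_cpu(millicores):
    """Round CPU to the nearest 50m boundary."""
    return round(millicores / 50) * 50

def round_memory(mib):
    """Round memory to the nearest 64 Mi boundary."""
    return round(mib / 64) * 64

def needs_update(current, proposed):
    """Skip the update when rounding lands on the current value."""
    return proposed != current
```

For example, a proposed 430m rounds to 450m; if the pod is already at 450m, `needs_update` returns `False` and no restart is triggered.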
Startup grace period
Pods younger than 5 minutes are excluded from CPU and memory calculations. Without this grace period, the controller would react to transient startup spikes. JVM applications, for example, often use 100% CPU during class loading and JIT compilation for several minutes before settling to idle. OOM detection is unaffected and works immediately regardless of pod age.
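As a sketch, the grace period amounts to filtering young pods out before any usage maths (illustrative; the field names are invented):

```python
from datetime import datetime, timedelta, timezone

def eligible_pods(pods, grace=timedelta(minutes=5)):
    """Exclude pods younger than the grace period from CPU and memory
    usage calculations. Note: OOM detection does NOT use this filter;
    it applies regardless of pod age."""
    now = datetime.now(timezone.utc)
    return [p for p in pods if now - p["started_at"] >= grace]
```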
Single-replica apps
For apps with a single replica, the controller only scales up and never scales down. Every resource change triggers a pod restart, and with one replica that means a brief outage. Scaling down is only safe with 2+ replicas, where Kubernetes performs a rolling update and at least one pod stays up.
The Scale tab in the web console shows a message explaining this when an app has one replica and auto mode is active.
Autoscaling (HPA)
Autoscaling adjusts the number of pods based on CPU and memory utilisation. It works independently from the resource controller, which adjusts CPU and memory per pod. Together, they give you both vertical scaling (right-sized pods) and horizontal scaling (right number of pods).
How the two controllers interact
| Concern | Who owns it | What it does |
|---|---|---|
| CPU and memory per pod | Resource controller (auto mode) | Monitors usage, adjusts requests and limits |
| Number of pods (replicas) | HPA (Kubernetes built-in) | Monitors utilisation %, scales between min and max |
| Deployment shape (image, env, volumes) | App reconciler | Syncs the Deployment to match the App CR |
When autoscaling is enabled, the App reconciler stops writing spec.replicas to the Deployment and lets the HPA own that field. When autoscaling is disabled, the App reconciler owns replicas again.
When to use what
The resource controller and autoscaling solve different problems. They complement each other, but you don't always need both.
Resource management only (auto mode, no autoscaling)
Best for apps with predictable traffic where you don't yet know the right CPU and memory values. Skipper figures out the right size over time. Examples: a small internal tool, a staging environment, a service handling a steady stream of background jobs. You don't need multiple replicas; you just need the pod to be the right size.
Autoscaling only (expert mode with HPA)
Best when you know exactly how much CPU and memory each pod needs, but traffic varies. A public API that gets 10 requests per second at night and 500 during business hours. You've profiled the app and set the resources yourself. You just need Kubernetes to add and remove pods as load changes.
Both together
Best for production apps where traffic varies AND you want Skipper to handle the right-sizing automatically. The resource controller finds the right CPU and memory per pod over time. The HPA handles traffic spikes by adding pods quickly, without any restarts. When a traffic spike hits, the HPA responds in seconds by adding pods. The resource controller only adjusts resources after sustained changes over minutes.
Here's a typical sequence with both enabled:
- App starts with standard profile defaults (100m CPU, 128Mi memory)
- Resource controller watches usage over a few minutes and adjusts. Maybe the app actually needs 500m CPU. That triggers one rolling restart, but the HPA ensures 2+ pods, so there's no downtime.
- A traffic spike hits. CPU goes above 70% across all pods.
- The HPA adds pods within seconds. No restarts, just more pods handling requests.
- The resource controller sees the HPA has scaled out, so it leaves the per-pod resources alone. No interference.
- Traffic drops. The HPA removes the extra pods.
- If baseline usage is still higher than before, the resource controller will eventually adjust. But only after sustained readings, not from a temporary spike.
Common scenarios
| Your situation | Recommended setup |
|---|---|
| Small internal tool, one user | Auto mode only, 1 replica |
| Staging environment, testing | Auto mode only, 1 replica |
| Production API, steady traffic | Auto mode, 2 replicas (no autoscaling) |
| Production API, variable traffic | Auto mode + autoscaling, min 2 / max 5 |
| JVM app you've already tuned | Expert mode + autoscaling |
| Database or cache | Auto mode only (databases should not be horizontally scaled) |
| Batch worker, periodic spikes | Auto mode + autoscaling based on CPU |
Enabling autoscaling
From the Scale tab in the web console, toggle Autoscaling on. Set the minimum and maximum replicas and a CPU target percentage. Click Save autoscaling.
From the CLI or GitOps:
```yaml
apiVersion: getkipper.com/v1alpha1
kind: App
metadata:
  name: api
  namespace: your-name-prod
spec:
  image: registry.example.com/api:latest
  port: 8080
  autoscale:
    enabled: true
    minReplicas: 2
    maxReplicas: 5
    cpuTarget: 70
```

The HPA checks metrics every 15 seconds. When average CPU across all pods exceeds the target, it adds pods. When utilisation drops, it removes pods (down to minReplicas).
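The HPA's replica count follows Kubernetes' standard formula, desiredReplicas = ceil(currentReplicas × currentUtilisation / target), clamped to the configured bounds. A sketch (ignoring the HPA's tolerance band and stabilisation windows):

```python
import math

def desired_replicas(current_replicas, current_utilisation, target,
                     min_replicas=2, max_replicas=5):
    """Kubernetes HPA scaling formula: scale the replica count by the
    ratio of observed utilisation to the target, clamped to min/max."""
    desired = math.ceil(current_replicas * current_utilisation / target)
    return max(min_replicas, min(max_replicas, desired))
```

With a 70% target, 2 replicas averaging 140% CPU scale to 4; when utilisation falls back to 20%, the count returns to the 2-replica floor.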
Recommended settings
Set minReplicas to at least 2 when using auto mode. This gives two benefits:
- The resource controller can safely scale resources down. With 2+ replicas, Kubernetes performs a rolling update so at least one pod stays available during the restart
- Your app has basic high availability. If one pod crashes, the other continues serving traffic
A good starting point for most apps:
| Setting | Value |
|---|---|
| Min replicas | 2 |
| Max replicas | 5 |
| CPU target | 70% |
| Memory target | 0 (disabled) |
Memory-based autoscaling is usually less useful because most applications do not release memory when load drops. CPU-based scaling responds faster to actual load changes.
What happens under the hood
- You enable autoscaling on the App CR
- The App reconciler creates an HPA targeting the app's Deployment
- The HPA reads CPU metrics from metrics-server and adjusts deployment.spec.replicas
- The resource controller independently adjusts CPU and memory requests based on per-pod usage
- When the App reconciler runs (e.g. after an image update), it updates the Deployment template but preserves the replica count set by the HPA
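One plausible way to implement that hand-off (a sketch, not the actual reconciler): omit spec.replicas from the applied Deployment spec whenever autoscaling is enabled, so the HPA's value is never overwritten.

```python
def build_deployment_patch(app):
    """Build the Deployment spec the reconciler applies. When autoscaling
    is enabled, `replicas` is omitted so the HPA keeps ownership of it."""
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": app["name"], "image": app["image"]}
                    ]
                }
            }
        }
    }
    if not app.get("autoscale", {}).get("enabled"):
        patch["spec"]["replicas"] = app.get("replicas", 1)
    return patch
```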
Disabling autoscaling
Toggle autoscaling off in the Scale tab and click Save autoscaling. The HPA is deleted and the App reconciler takes over replica management again, setting replicas to app.Spec.Replicas (defaults to 1).
OOM recovery
When a pod is terminated by the kernel for exceeding its memory limit (OOMKilled), the controller doubles the memory immediately, without waiting for 3 consecutive checks. This handles cases where an app needs significantly more memory than its initial allocation, such as a Java application starting with 64 Mi but requiring 512 Mi+ for the JVM.
The controller detects OOM kills even when the pod is in a crash loop and has no metrics. It checks the pod's termination state directly from the Kubernetes API, not just from metrics data.
Resource profiles
When an app has no resource requests configured, the controller applies defaults based on the app's resource profile label (getkipper.com/resource-profile):
| Profile | CPU | Memory | Use case |
|---|---|---|---|
| lightweight | 50m | 64 Mi | Static sites, proxies, lightweight APIs |
| standard | 100m | 128 Mi | Typical web applications (default) |
| compute-heavy | 500m | 256 Mi | Image processing, data transformation |
| memory-heavy | 100m | 512 Mi | Caching layers, in-memory databases, ML inference |
| database | 250m | 256 Mi | PostgreSQL, MySQL, MongoDB, OpenSearch |
| jvm | 500m | 2 Gi | Java/JVM applications, Spring Boot, heavy runtimes |
If no profile label is set, standard is used. Database services automatically get the database profile.
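The lookup itself is straightforward; sketched below with the table's values (illustrative, not Skipper's source):

```python
PROFILE_DEFAULTS = {
    "lightweight":   ("50m",  "64Mi"),
    "standard":      ("100m", "128Mi"),
    "compute-heavy": ("500m", "256Mi"),
    "memory-heavy":  ("100m", "512Mi"),
    "database":      ("250m", "256Mi"),
    "jvm":           ("500m", "2Gi"),
}

def default_resources(labels):
    """Resolve (CPU, memory) defaults from the resource-profile label,
    falling back to the standard profile."""
    profile = labels.get("getkipper.com/resource-profile", "standard")
    return PROFILE_DEFAULTS.get(profile, PROFILE_DEFAULTS["standard"])
```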
Custom resources
For workloads that don't fit any profile (like a Java application with `-Xms4g` or a data pipeline needing 8 Gi), you can set explicit CPU and memory values at deploy time.
From the CLI:
```shell
kip app deploy --name exchange-service --image registry.example.com/exchange:latest \
  --port 8080 --memory 4Gi --cpu 1
```

From the web console:
Select Custom... from the resource profile dropdown when deploying an app. Two fields appear for memory and CPU. Use Kubernetes resource notation: 256Mi, 1Gi, 4Gi for memory; 250m, 500m, 1, 2 for CPU.
Custom values override the profile defaults. The auto controller still adjusts from there based on actual usage. Your values are the starting point, not a ceiling.
Resource log
Every change the controller makes is logged and visible under Settings in the web console. The log shows:
- Time: when the change happened
- App and namespace: which workload was adjusted
- Action: what changed (increased memory, decreased CPU, applied defaults)
- From / To: old and new values
- Reason: why the change was made (usage at 92%, OOM kill detected)
The system retains the most recent 50 log entries.
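A minimal sketch of such a capped log (field names taken from the list above; the implementation itself is hypothetical):

```python
from datetime import datetime, timezone

MAX_ENTRIES = 50

def log_change(log, app, namespace, action, old, new, reason):
    """Record one resource change, keeping only the newest 50 entries."""
    log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "app": app, "namespace": namespace,
        "action": action, "from": old, "to": new, "reason": reason,
    })
    del log[:-MAX_ENTRIES]  # drop everything older than the newest 50
    return log
```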
Expert mode
Switch to expert mode when you want full control over resource allocation. The auto controller stops making changes, and all CPU and memory values are set manually through the Resources tab in the app detail panel.
Toggle between modes in Settings in the web console. Only admins can change the mode.
```
PUT /api/v1/settings/mode
{"mode": "auto"}  // or "expert"
```

In expert mode, you can still view the resource log to see what the controller changed before you switched.
Alerts
Every action the controller takes generates an alert visible in the console bell icon:
- Critical (red): OOM kills, emergency memory doubling
- Warning (yellow): resource increases, stuck pod recovery
- Info (green): scale-downs, default profile application
See Alerts for details on the alerting system and Slack integration.
Slack notifications
Resource changes can be forwarded to Slack. See Configuration for setup.
What Skipper manages
The auto controller manages resources for Skipper workloads defined as Custom Resources (getkipper.com/v1alpha1):
- Apps: web apps, APIs, frontends
- Services: databases, caches, message queues
- Functions: serverless workloads (resources set at creation, not auto-tuned while idle)
- Jobs: scheduled and one-off batch tasks
It does not manage system components (Traefik, cert-manager, Longhorn) or the KEDA autoscaler itself.
GitOps
Skipper resources are defined as Custom Resource Definitions (CRDs) under getkipper.com/v1alpha1. This means you can manage your entire cluster declaratively with tools like ArgoCD or Flux:
```yaml
apiVersion: getkipper.com/v1alpha1
kind: App
metadata:
  name: api
  namespace: your-name-test
spec:
  image: registry.example.com/api:v2.1.0
  port: 8080
  replicas: 2
  resources:
    profile: jvm
    memory: "4Gi"
  env:
    LOG_LEVEL: "info"
  route:
    host: api.example.com
```

Apply with kubectl apply -f app.yaml or commit to a Git repo and let your GitOps tool sync it. Skipper's reconcilers ensure the underlying Kubernetes resources (Deployment, Service, Ingress, Secrets) match the CR spec.
Available CRDs: App, Service, Function, Project, Job, Volume.
For a more user-friendly approach, use the skipper.yaml manifest format with kip apply. See the full GitOps guide for details, including ArgoCD and Flux integration examples.
