A Pod is a one-shot unit. A Deployment is the mechanism that guarantees the right number of pods is always running, knows how to update them, and can roll back changes.
What Is a Deployment and Why You Need It
A ReplicaSet ensures that a specified number of identical pods is running at any given time. If a Pod crashes, the ReplicaSet creates a new one. If a node fails, replacement pods are created on healthy nodes; the original pods are not moved.
A Deployment is a higher-level abstraction on top of ReplicaSet. It adds:
- Version management (revision history)
- Rolling update: updates with zero downtime
- Rollback: revert to a previous version with a single command
You almost never create a ReplicaSet directly — only through a Deployment.
Describing Desired State
Kubernetes operates on the desired state principle. You describe how the system should look; K8s converges the actual state to match and keeps it there.
You: "I want 3 replicas of nginx:1.25"
K8s: [starts 3 pods]
[one crashes] → [creates a new one]
[node dies] → [reschedules pods]
This is the declarative approach, as opposed to the imperative one (“create a container”, “restart it”, “add another one”).
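To make the desired-state idea concrete, here is a minimal sketch of the comparison a controller performs on every pass. The reconcile function is hypothetical and purely illustrative; the real controllers work through the Kubernetes API, not a shell script.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubectl or Kubernetes): compare the desired
# pod count with the actual one and print the action that would close the gap.
reconcile() {
  desired=$1
  actual=$2
  if [ "$actual" -lt "$desired" ]; then
    echo "create $((desired - actual))"
  elif [ "$actual" -gt "$desired" ]; then
    echo "delete $((actual - desired))"
  else
    echo "nothing to do"
  fi
}

reconcile 3 3   # steady state
reconcile 3 2   # one pod crashed
reconcile 3 5   # replicas was lowered from 5 to 3
```

You only ever change the desired number; the loop works out whether that means creating or deleting pods.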
Deployment YAML Example
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: nginx:1.25
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "64Mi"
              cpu: "100m"
            limits:
              memory: "128Mi"
              cpu: "250m"
Key parts:
- replicas: 3 sets how many pods should be running
- selector.matchLabels lists the labels the Deployment uses to find “its” pods
- template is the pod template; its labels must match the selector
# Apply
kubectl apply -f deployment.yaml
# Check
kubectl get deployments
# NAME READY UP-TO-DATE AVAILABLE AGE
# my-app 3/3 3 3 1m
kubectl get pods
# NAME READY STATUS RESTARTS AGE
# my-app-7d4b8f9c6-abcd1 1/1 Running 0 1m
# my-app-7d4b8f9c6-efgh2 1/1 Running 0 1m
# my-app-7d4b8f9c6-ijkl3 1/1 Running 0 1m
Rolling Update Strategy
By default, a Deployment uses the RollingUpdate strategy, with maxUnavailable and maxSurge each defaulting to 25%. You can set both explicitly:
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most 1 Pod may be unavailable
      maxSurge: 1         # at most 1 extra Pod above the replica count
During an image update:
- A new Pod with the new image is created
- Kubernetes waits until it becomes Ready
- One old Pod is terminated
- Steps repeat for all replicas
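Throughout the rollout, at least replicas minus maxUnavailable pods stay in service, so the application remains available, provided the pods only report Ready once they can actually take traffic.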
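The two strategy settings translate into simple bounds on the pod count during a rollout. The sketch below computes them for absolute values; rollout_bounds is a hypothetical helper name, and percentage values (which Kubernetes also accepts) are not handled here.

```shell
#!/bin/sh
# Hypothetical helper: given replicas, maxUnavailable and maxSurge as absolute
# numbers, print the fewest pods that must stay available and the most pods
# that may exist at once while the rollout is in progress.
rollout_bounds() {
  replicas=$1
  max_unavailable=$2
  max_surge=$3
  echo "min_available=$((replicas - max_unavailable)) max_total=$((replicas + max_surge))"
}

rollout_bounds 3 1 1   # prints: min_available=2 max_total=4
rollout_bounds 3 0 1   # prints: min_available=3 max_total=4
```

Setting maxUnavailable to 0 keeps full capacity during the update at the cost of needing room for the surge pod.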
# Update the image
kubectl set image deployment/my-app app=nginx:1.26
# Watch the rollout
kubectl rollout status deployment/my-app
# Waiting for deployment "my-app" rollout to finish: 1 out of 3 new replicas have been updated...
# Waiting for deployment "my-app" rollout to finish: 2 out of 3 new replicas have been updated...
# deployment "my-app" successfully rolled out
The alternative strategy is Recreate: all old pods are terminated first, then new ones are started. This causes downtime, but the old and new versions never run side by side, so they do not need to be compatible with each other.
kubectl rollout: History and Rollback
# Revision history for the Deployment
kubectl rollout history deployment/my-app
# REVISION CHANGE-CAUSE
# 1 <none>
# 2 kubectl set image deployment/my-app app=nginx:1.26
# Details of a specific revision
kubectl rollout history deployment/my-app --revision=2
# Roll back to the previous version
kubectl rollout undo deployment/my-app
# Roll back to a specific revision
kubectl rollout undo deployment/my-app --to-revision=1
# Pause a rolling update
kubectl rollout pause deployment/my-app
# Resume it
kubectl rollout resume deployment/my-app
To record a description for the current revision in the history, set the change-cause annotation after each change:
kubectl annotate deployment/my-app kubernetes.io/change-cause="upgrade to nginx 1.26"
HorizontalPodAutoscaler
HPA automatically scales the replica count based on load.
# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
kubectl apply -f hpa.yaml
# Watch HPA state
kubectl get hpa
# NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS
# my-app-hpa Deployment/my-app 45%/70% 2 10 3
When average CPU utilization across the pods, measured against their CPU requests, exceeds 70%, HPA adds replicas, up to maxReplicas. When load drops, it removes them, but never below minReplicas.
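The replica count HPA aims for follows the documented formula desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), clamped to the configured range. Here is a rough sketch in integer shell arithmetic; hpa_desired is a hypothetical helper, and it deliberately ignores the tolerance band and stabilization behaviour the real controller applies.

```shell
#!/bin/sh
# Hypothetical helper: approximate the HPA target replica count.
# Arguments: current replicas, current utilization %, target utilization %,
# minReplicas, maxReplicas.
hpa_desired() {
  current_replicas=$1
  current_util=$2
  target_util=$3
  min=$4
  max=$5
  # Integer ceiling of current_replicas * current_util / target_util.
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))
  # Clamp the result to the [min, max] range.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired 3 90 70 2 10   # 3 * 90 / 70 = 3.86, rounded up to 4
hpa_desired 3 45 70 2 10   # 3 * 45 / 70 = 1.93, rounded up to 2
hpa_desired 3 10 70 2 10   # 0.43 rounds up to 1, then clamped to minReplicas: 2
```

The second call corresponds to the 45%/70% reading shown above: the formula points at 2 replicas, so a count of 3 would be expected to shrink once the controller decides to scale down.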
HPA requires the Metrics Server to be installed in the cluster:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml