In This Guide
- What Kubernetes Is (Plain English)
- Why Kubernetes Exists: The Container Scale Problem
- Docker vs. Kubernetes: The Confusion Cleared Up
- Core Kubernetes Concepts: Pods, Deployments, Services, and More
- The Kubernetes Architecture: Control Plane vs. Nodes
- Your First Kubernetes Deployment
- Kubernetes Networking: How Pods Talk to Each Other
- Persistent Storage: PersistentVolumes and StatefulSets
- Helm Charts: The Package Manager for Kubernetes
- Managed Kubernetes: EKS, GKE, and AKS
- Kubernetes for AI/ML: Kubeflow and GPU Workloads
- Do You Actually Need Kubernetes?
Key Takeaways
- What is Kubernetes in simple terms? Kubernetes is a system that automatically manages containers running across many machines.
- What is the difference between Docker and Kubernetes? Docker builds and runs individual containers on a single machine; Kubernetes orchestrates those containers across a whole cluster of machines.
- Do I actually need Kubernetes for my project? Probably not at first. If you have a small application with a few services, Docker Compose or a managed platform like Heroku, Railway, or Fly.io is usually the better fit.
- What is a Helm chart in Kubernetes? A Helm chart is a package of pre-configured Kubernetes resource definitions.
I have run production Kubernetes clusters and taught container orchestration to beginners — this guide bridges those two worlds. Kubernetes has a reputation. It is the thing that senior engineers talk about in hushed reverence. It is the technology that spawned a thousand conference talks, a certification program, and an entire ecosystem of tooling. It is also — if you have tried reading the official documentation — seemingly designed to make your brain hurt.
But here is the truth: the concepts behind Kubernetes are not complicated. The implementation details are dense, but what Kubernetes actually does is something any developer can understand in an afternoon. This guide will give you that understanding — what Kubernetes is, why it exists, how its pieces fit together, and whether you actually need it for what you are building.
What Kubernetes Is (Plain English)
Kubernetes is a system for managing containers across many machines — it automatically schedules where containers run, restarts crashed containers, scales replicas up and down based on load, routes network traffic, and rolls out new versions with zero downtime, all without any human in the loop for routine operations. Everything else — Pods, Deployments, Services, Ingress controllers, etcd, the scheduler — is implementation detail on top of that one sentence.
Here is the simplest analogy. Imagine you have a fleet of 50 servers and 200 containers that need to run across them. Without Kubernetes, you would have to manually decide which container goes on which server, monitor each one, restart crashed containers by hand, and figure out load balancing yourself. Kubernetes automates all of that. It watches your containers, moves them around machines as needed, restarts failures, and routes traffic — all automatically.
Kubernetes was originally built at Google, where it was based on an internal system called Borg that managed Google's own infrastructure. Google open-sourced it in 2014. The Cloud Native Computing Foundation (CNCF) now governs it. The name comes from the Greek word for "helmsman" or "pilot" — the person who steers a ship. The logo is a ship's wheel. The abbreviation K8s comes from replacing the eight middle letters of "Kubernetes" with the number 8.
The One-Sentence Summary
Kubernetes is an operating system for a cluster of servers — it manages containers the same way your laptop's OS manages processes, but across hundreds or thousands of machines simultaneously.
Why Kubernetes Exists: The Container Scale Problem
Docker runs one container on one machine and answers none of the operational questions that arise at scale: which server should this container run on, who restarts it at 3am when it crashes, how do you scale from 2 to 10 copies under load, and how do containers on different machines find each other? Kubernetes answers all of those questions, across many machines, automatically.

Containers are great for packaging applications. Docker made it easy for any developer to wrap an application and all its dependencies into a portable, reproducible unit. That solved a real problem: "it works on my machine" became "it works in this container."
But containers at scale introduce a new set of problems:
- Placement: You have 10 servers and 50 containers. Which container runs on which server? What if one server has more CPU available? What if a container needs a GPU?
- Failure recovery: A container crashes at 3am. Who restarts it? How quickly? On which machine?
- Scaling: Traffic spikes. You need 10 copies of your API container instead of 2. Who spins them up? Who tears them down when traffic drops?
- Networking: Container A needs to talk to Container B, but they might be on different machines. How do they find each other? How does external traffic reach the right container?
- Configuration and secrets: Your app needs a database password. You can't hardcode it. Where does each container get it?
- Rolling deployments: You want to update your app without any downtime. How do you drain traffic from old containers and route it to new ones?
Docker answers none of these questions. Docker runs a container on one machine. Kubernetes answers all of these questions across many machines. That is the entire reason it exists.
Docker vs. Kubernetes: The Confusion Cleared Up
Docker builds container images and runs them on a single machine. Kubernetes runs those images across a cluster of machines — it uses Docker (or containerd) as the runtime on each node but handles all the orchestration: scheduling, scaling, self-healing, service discovery, rolling deployments, and secrets management that Docker alone cannot do. Docker and Kubernetes are not competitors. They work together.
Docker's job is to build and run containers. You write a Dockerfile, run docker build, and you get a container image. Run docker run and you have a live container on your machine. Docker is a single-machine tool. It is excellent at what it does.
Kubernetes' job is to manage containers across many machines. Kubernetes does not replace Docker — it uses Docker (or another container runtime like containerd) to actually run the containers on each individual node. Kubernetes decides where and when to run containers. Docker handles the how at the machine level.
| Capability | Docker | Kubernetes |
|---|---|---|
| Build container images | Yes | No |
| Run containers on one machine | Yes | Not directly |
| Manage containers across many machines | No | Yes |
| Auto-restart crashed containers | Limited (with restart policies) | Yes — automatically |
| Auto-scale based on CPU/memory/traffic | No | Yes |
| Rolling zero-downtime deployments | No | Yes |
| Multi-service local development | Yes (Docker Compose) | Possible (Minikube, kind), but usually overkill |
The typical workflow: you use Docker to build your image, push it to a container registry (Docker Hub, Amazon ECR, Google Artifact Registry), and then tell Kubernetes to pull and run that image across your cluster. Docker and Kubernetes are teammates, not rivals.
"Docker packages your app. Kubernetes runs your app everywhere."
Core Kubernetes Concepts: Pods, Deployments, Services, and More
The eight Kubernetes concepts you encounter on day one: Pod (one or more containers sharing network/storage, the smallest schedulable unit), Deployment (manages N identical Pod replicas with rolling updates), Service (stable network address routing to ephemeral Pods), Ingress (HTTP/HTTPS routing by hostname or path), ConfigMap (non-sensitive config), Secret (sensitive credentials), Namespace (virtual cluster for isolation), and Node (the machine running your Pods).
Pods
A Pod is the smallest deployable unit in Kubernetes. A Pod contains one or more containers that share a network namespace and storage. In practice, most Pods contain a single container. Think of a Pod as a "wrapper" around your container that Kubernetes can schedule, start, stop, and restart. Pods are ephemeral — they are created and destroyed, not moved or updated in place.
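For illustration, here is a minimal Pod manifest — the name, labels, and image are just examples, and in practice you rarely create bare Pods like this; you let a Deployment manage them:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod        # example name
  labels:
    app: hello
spec:
  containers:
    - name: web
      image: nginx:1.25  # any container image works here
      ports:
        - containerPort: 80
```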
Deployments
A Deployment describes a desired state for a set of Pods. You tell Kubernetes: "I want 3 copies of this container image running at all times." The Deployment controller watches the cluster and makes it so. If a Pod crashes, the Deployment spins up a replacement. If you push a new image version, the Deployment handles rolling it out Pod by Pod with zero downtime. Deployments are what you use for stateless applications — web servers, APIs, microservices.
Services
A Service gives a stable network address to a set of Pods. Since Pods are ephemeral and their IP addresses change when they restart, you need a stable endpoint. A Service provides that. It acts as a load balancer in front of your Pods. There are several types: ClusterIP (internal only), NodePort (accessible from outside the cluster on a specific port), and LoadBalancer (provisions a cloud load balancer automatically on EKS/GKE/AKS).
Ingress
An Ingress manages external HTTP and HTTPS traffic into your cluster. It acts as a reverse proxy — routing api.yourapp.com to one Service and app.yourapp.com to another. You need an Ingress controller (like nginx-ingress or Traefik) installed in your cluster for Ingress resources to work. Think of it as the front door to your cluster.
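As a sketch, an Ingress implementing exactly that routing might look like the following — the hostnames and Service names are hypothetical, and it assumes an nginx Ingress controller is already installed:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress
spec:
  rules:
    - host: api.yourapp.com            # routes to the API Service
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service      # hypothetical Service name
                port:
                  number: 80
    - host: app.yourapp.com            # routes to the frontend Service
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend-service # hypothetical Service name
                port:
                  number: 80
```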
ConfigMaps
A ConfigMap stores non-sensitive configuration data as key-value pairs — environment variables, config files, command-line arguments. Your containers read from ConfigMaps at runtime. This decouples configuration from the container image, so you can change settings without rebuilding the image.
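A minimal sketch of a ConfigMap, with the Pod-spec snippet that loads every key as an environment variable (the names and values are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  CACHE_TTL_SECONDS: "300"
---
# In the Pod template of your Deployment, consume it with envFrom:
#   containers:
#     - name: api
#       envFrom:
#         - configMapRef:
#             name: app-config
```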
Secrets
Secrets are like ConfigMaps but for sensitive data — passwords, API keys, TLS certificates, database connection strings. Kubernetes stores Secrets base64-encoded (note: this is encoding, not encryption — use a secrets manager like HashiCorp Vault or AWS Secrets Manager for true secrets management). Pods can consume Secrets as environment variables or as mounted files.
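A hedged example using the stringData field, which lets you write plain text and have Kubernetes base64-encode it on save (the name and value here are placeholders only):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
stringData:                 # plain-text input; stored base64-encoded
  DB_PASSWORD: "change-me"  # placeholder — never commit real secrets to Git
```

Pods then reference it with a secretKeyRef under env, or mount it as a file.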
Namespaces
A Namespace is a virtual cluster within a physical cluster. You use Namespaces to separate environments (dev, staging, production) or teams (frontend, backend, data) within the same cluster. Resources in different Namespaces are isolated by default. Modern clusters start with four Namespaces out of the box: default, kube-system, kube-public, and kube-node-lease.
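Creating one is a one-resource manifest (the name here is just an example):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: staging
```

After applying it, you deploy into it by adding -n staging to your kubectl commands or setting namespace: staging in your manifests.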
How These Fit Together
A Deployment creates Pods. A Service gives those Pods a stable address. An Ingress routes external traffic to the Service. A ConfigMap provides environment configuration to the Pods. A Secret provides sensitive credentials. A Namespace keeps it all organized. That is the complete picture for a typical application.
The Kubernetes Architecture: Control Plane vs. Nodes
A Kubernetes cluster has two roles: the control plane (API server, scheduler, controller manager, and etcd — the brain that stores all cluster state and makes scheduling decisions) and worker nodes (running kubelet, kube-proxy, and the container runtime — the machines that actually execute your containers). With managed Kubernetes (EKS, GKE, AKS), you never touch the control plane; the cloud provider runs it for you.
The Control Plane
The control plane manages the entire cluster. It runs on dedicated machines (typically 1 for small clusters, 3 for high availability) and consists of four main components:
- API Server (kube-apiserver) — The front door to the cluster. Everything communicates through it: kubectl, other control plane components, and your CI/CD pipelines. It validates and persists all cluster state.
- etcd — A distributed key-value store that holds all cluster state. Every Deployment, Pod, Service, ConfigMap — all of it lives in etcd. If you lose etcd without a backup, you lose the cluster. It is the single source of truth.
- Scheduler (kube-scheduler) — Watches for new Pods that don't have a node assigned and picks the best node for them based on resource availability, affinity rules, taints, and tolerations.
- Controller Manager (kube-controller-manager) — Runs a collection of controllers that watch the cluster state and take corrective action. The Deployment controller, ReplicaSet controller, and Node controller all live here. If a Pod dies and violates the desired state of a Deployment, the Deployment controller notices and creates a replacement.
Worker Nodes
Worker nodes are the machines that actually run your application containers. Each node runs three components:
- kubelet — The node agent. Talks to the API server, receives Pod specs, and ensures the containers described in those specs are running and healthy.
- kube-proxy — Manages network rules on each node. Handles the networking magic that lets Services route traffic to the right Pod IP addresses.
- Container Runtime — The software that actually runs containers. Kubernetes supports containerd (most common), CRI-O, and (historically) Docker.
The Reconciliation Loop: Kubernetes' Core Pattern
The fundamental pattern in Kubernetes is the reconciliation loop. You declare the desired state (3 copies of this Pod). Kubernetes continuously compares actual state to desired state. When they diverge, Kubernetes takes action to reconcile them. This is why Kubernetes is called "declarative" — you describe what you want, not how to do it.
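The loop itself is not exotic machinery — it can be sketched in a few lines of Python. This is an illustration of the pattern, not actual Kubernetes code; the dictionaries and function name are invented for the example:

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Compare desired vs. actual replica counts and emit corrective
    actions, the way a Deployment controller does on every iteration."""
    actions = []
    for name, want in desired.items():
        have = actual.get(name, 0)
        if have < want:
            actions.append(("create", name, want - have))  # scale up
        elif have > want:
            actions.append(("delete", name, have - want))  # scale down
    return actions

# Declared state: 3 replicas of "api". Observed state: one Pod crashed.
print(reconcile({"api": 3}, {"api": 2}))  # -> [('create', 'api', 1)]
```

Real controllers run this comparison continuously against the API server's watch stream, which is why a crashed Pod is replaced within seconds without anyone being paged.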
Your First Kubernetes Deployment
kubectl is the command-line interface for Kubernetes — use kubectl apply -f deployment.yaml to deploy, kubectl get pods -n production to inspect, kubectl describe pod <name> to debug, kubectl logs <pod> to view output, and kubectl rollout undo deployment/my-api to roll back a bad deployment instantly. Everything in Kubernetes is managed through kubectl applying YAML manifests to the API server.
Running Kubernetes Locally
For local development, two options dominate:
- Minikube — Spins up a single-node Kubernetes cluster inside a virtual machine or Docker container on your laptop. Simple to set up, good for learning. Run minikube start and you have a cluster in under two minutes.
- kind (Kubernetes in Docker) — Runs a Kubernetes cluster using Docker containers as nodes. Faster than Minikube, preferred for CI pipelines and testing.
A Simple Deployment YAML
Here is what a real Deployment looks like. This deploys an nginx web server with 3 replicas:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports:
            - containerPort: 80
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
              cpu: "500m"
```
Apply it with:
```bash
kubectl apply -f deployment.yaml
kubectl get pods
kubectl get deployments
```
Kubernetes will create 3 Pods, each running the nginx container. If any Pod dies, Kubernetes replaces it automatically. To expose it with a Service:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: LoadBalancer
```
The selector: app: nginx field is how the Service finds the right Pods — it matches the labels on the Pods defined in the Deployment. This label-based selection is fundamental to how Kubernetes connects resources.
Kubernetes Networking: How Pods Talk to Each Other
Kubernetes networking follows a few core rules:
- Every Pod gets its own IP address. No NAT between Pods on the same cluster.
- Pods on any node can communicate with Pods on any other node directly.
- Services provide stable virtual IPs that don't change when Pods restart.
In practice, you rarely deal with Pod IPs directly. Instead, you use Service DNS names. Kubernetes automatically creates DNS entries for every Service. If you have a Service named my-api in the default namespace, other Pods can reach it at my-api.default.svc.cluster.local — or simply my-api if they're in the same namespace.
For traffic from the outside world, the flow is: Internet → Load Balancer → Ingress Controller → Service → Pod. The Ingress controller (typically nginx or Traefik) handles TLS termination, path-based routing, and virtual host routing. It is the only component that talks directly to the internet; everything behind it is internal cluster networking.
Network Policies
By default, all Pods in a cluster can communicate with all other Pods. NetworkPolicy resources let you restrict this — only allowing specific Pods to communicate with specific other Pods on specific ports. This is essential for production security.
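A sketch of a NetworkPolicy that allows only Pods labeled app: api to reach Pods labeled app: db, and only on port 5432 — the labels and port are hypothetical, and enforcement requires a CNI plugin that supports NetworkPolicy (such as Calico or Cilium):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-allow-api-only
spec:
  podSelector:
    matchLabels:
      app: db              # the Pods being protected
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api     # only these Pods may connect
      ports:
        - protocol: TCP
          port: 5432
```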
Persistent Storage: PersistentVolumes and StatefulSets
Containers are stateless by default — when a container restarts, any data written to its filesystem is gone. For stateful applications like databases, you need persistent storage that survives container restarts.
Kubernetes handles this with two abstractions:
- PersistentVolume (PV) — A piece of storage in the cluster, provisioned by an admin or dynamically by a StorageClass. It exists independently of any Pod. Think of it as a hard drive that the cluster knows about.
- PersistentVolumeClaim (PVC) — A request for storage by a Pod. You say "I need 10GB of storage with ReadWriteOnce access," and Kubernetes finds or creates a matching PV. The Pod mounts the PVC like a directory.
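The "I need 10GB with ReadWriteOnce access" request above translates directly into a small manifest; a sketch (the claim name is illustrative, and the StorageClass depends on your cluster, so it is commented out):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce        # mountable read-write by a single node at a time
  resources:
    requests:
      storage: 10Gi
  # storageClassName: gp3  # uncomment to request a specific StorageClass
```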
For stateful applications (databases, message queues, caches), you use StatefulSets instead of Deployments. StatefulSets give each Pod a stable identity (pod-0, pod-1, pod-2), stable storage, and ordered startup/shutdown. This is essential for clustered databases like PostgreSQL, Cassandra, or Kafka where each node needs a stable identity and its own persistent volume.
The Practical Advice on Databases in Kubernetes
Running databases in Kubernetes is possible, but adds operational complexity. Many teams use managed databases (RDS, Cloud SQL, PlanetScale) outside the cluster and only run stateless applications in Kubernetes. Both approaches are valid — the right choice depends on your team's Kubernetes expertise and operational requirements.
Helm Charts: The Package Manager for Kubernetes
Deploying a real application in Kubernetes means writing a lot of YAML — Deployments, Services, ConfigMaps, Secrets, Ingress rules, HorizontalPodAutoscalers. For a production application with 10 microservices, that might be 50+ YAML files with hundreds of lines each.
Helm solves this problem. Helm is the package manager for Kubernetes. A Helm chart is a collection of pre-written Kubernetes templates, parameterized with variables. Instead of writing all that YAML from scratch, you install a chart.
```bash
# Add the Bitnami chart repository
helm repo add bitnami https://charts.bitnami.com/bitnami

# Install PostgreSQL with custom values
helm install my-postgres bitnami/postgresql \
  --set auth.postgresPassword=mysecretpassword \
  --set primary.persistence.size=20Gi
```
That single command creates a StatefulSet, a Service, Secrets, PersistentVolumeClaims, and every other resource needed to run a production-grade PostgreSQL instance on Kubernetes. Without Helm, writing all of that correctly from scratch would take hours.
Helm charts also handle upgrades and rollbacks. helm upgrade applies changes to an existing release. helm rollback reverts to a previous version. For your own applications, you can write Helm charts that allow teammates to deploy your service with a single command and a small values.yaml file specifying environment-specific settings.
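For a chart you write yourself, the values.yaml a teammate edits might look like this — the keys are entirely hypothetical, since every chart defines its own:

```yaml
# values.yaml — environment-specific settings for a hypothetical chart
replicaCount: 3
image:
  repository: registry.example.com/my-api   # hypothetical registry/image
  tag: "1.4.2"
ingress:
  enabled: true
  host: api.staging.example.com
```

A teammate then deploys with something like helm install my-api ./chart -f values.yaml, overriding only the values that differ per environment.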
Managed Kubernetes: EKS, GKE, and AKS
Running Kubernetes yourself means managing the control plane — which is genuinely hard. In production, most teams use a managed Kubernetes service where the cloud provider handles the control plane for you. You just manage the worker nodes and your workloads.
| Service | Provider | Best For | Notes |
|---|---|---|---|
| Amazon EKS | AWS | Teams already on AWS | Deepest AWS integrations (IAM, ALB, EBS). Most widely used. Control plane costs ~$0.10/hr. |
| Google GKE | Google Cloud | Teams wanting best-in-class Kubernetes experience | Google invented Kubernetes. Autopilot mode removes node management entirely. Generally considered most polished. |
| Azure AKS | Microsoft Azure | Teams on Azure or .NET stacks | Free control plane. Deep Active Directory integration. Natural choice for enterprise Microsoft shops. |
All three are solid production choices. The decision usually comes down to where your other infrastructure lives. If you are on AWS already, EKS is the path of least resistance. If you are greenfield and care about developer experience, GKE Autopilot is genuinely impressive — it removes the concept of nodes entirely and just charges you for the resources your Pods consume.
Fargate and Cloud Run: The Serverless Middle Ground
AWS Fargate (serverless compute for EKS/ECS) and Google Cloud Run let you run containers without managing nodes at all. If Kubernetes feels like overkill but Docker Compose is not enough, these services are worth evaluating. Cloud Run in particular is remarkably simple — deploy a container image and get a URL. No cluster management required.
Kubernetes for AI/ML: Kubeflow and GPU Workloads
Kubernetes has become the default infrastructure layer for machine learning workloads at scale, and for good reason. Training a large model often requires orchestrating dozens of GPU-equipped machines in parallel — exactly the kind of coordination Kubernetes was built for.
GPU Scheduling
Kubernetes can schedule Pods to nodes with GPUs using resource requests. You simply declare your GPU requirement in the Pod spec:
```yaml
resources:
  limits:
    nvidia.com/gpu: 2  # Request 2 GPUs for this Pod
```
Kubernetes — with the NVIDIA device plugin installed — will only schedule that Pod onto nodes that have 2 available GPUs. This makes it straightforward to manage a heterogeneous cluster with some CPU nodes and some GPU nodes, letting Kubernetes handle placement automatically.
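In context, a complete training Pod spec using that request might look like the following sketch — the name, image, and command are hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: train-job            # example name
spec:
  restartPolicy: Never       # training jobs run to completion, not forever
  containers:
    - name: trainer
      image: registry.example.com/trainer:latest  # hypothetical image
      command: ["python", "train.py"]
      resources:
        limits:
          nvidia.com/gpu: 2  # scheduled only onto nodes with 2 free GPUs
```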
Kubeflow
Kubeflow is an open-source ML platform built on top of Kubernetes. It adds components specifically for ML workflows: Kubeflow Pipelines for defining ML training pipelines as DAGs, KServe (formerly KFServing) for model serving with autoscaling, and Katib for hyperparameter optimization. If you are running ML workloads at scale in a self-managed cluster, Kubeflow is the ecosystem to evaluate first.
Model Serving at Scale
Deploying a trained model as an API endpoint is where Kubernetes shines for ML teams. A model serving container (using FastAPI, Triton Inference Server, or TorchServe) is just a container — Kubernetes can run 1 copy or 100 copies depending on traffic, automatically scale down to 0 when idle (with Knative), and roll out new model versions without downtime using standard Deployments. The same infrastructure that runs your web services can run your model APIs.
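Scaling such a serving Deployment on traffic is a standard HorizontalPodAutoscaler; a sketch, where the Deployment name and thresholds are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: model-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: model-api          # hypothetical Deployment serving the model
  minReplicas: 1
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas above 70% average CPU
```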
Do You Actually Need Kubernetes?
This is the question most tutorials refuse to answer honestly. The answer is: probably not yet, and maybe never.
Kubernetes has real operational overhead. You need to understand YAML, kubectl, RBAC, networking, storage classes, Helm, and cluster upgrades. Even with a managed service like EKS or GKE, you are managing worker nodes, configuring Ingress controllers, and debugging scheduling issues. This is not a weekend project — it is an ongoing operational commitment.
For most early-stage applications, simpler options are the right choice:
- Docker Compose — For local development and small single-server deployments. If your entire application fits on one server, Docker Compose handles multiple containers beautifully without any Kubernetes complexity.
- Fly.io or Railway — Deploy Docker containers with a simple CLI. Auto-scales. Global edge network. No YAML configuration files to manage. For applications with modest scale requirements, these services remove weeks of infrastructure work.
- AWS App Runner / Google Cloud Run — Point at a container image, get a URL, auto-scales to zero. No cluster to manage. Costs nothing when idle. Perfect for APIs, background workers, and ML model serving at moderate scale.
- Heroku — Still the fastest path from code to production for web applications. Uses Dynos (containers) under the hood without exposing any Kubernetes concepts.
When Kubernetes is genuinely the right answer:
- You have multiple services that need to run across multiple machines with sophisticated orchestration
- You need GPU scheduling across a cluster for ML training or inference workloads
- You require custom scaling logic, advanced networking, or multi-tenancy in a single cluster
- You are running on-premises infrastructure where managed cloud services are not an option
- Your team already has Kubernetes expertise and the operational overhead is understood
The Honest Rule of Thumb
Start with Docker Compose or a simple cloud platform. Add Kubernetes when the pain of not having it is greater than the pain of operating it. Many startups do not reach for Kubernetes until around Series B, or until they have a dedicated platform engineering team. There is no shame in running on Fly.io until you have 10 engineers.
The goal of this guide was not to make you a Kubernetes expert — it was to make sure that the next time you encounter Kubernetes in a job description, a team meeting, or a system design interview, you understand what it actually is and why it exists. The vocabulary is dense but the underlying ideas are simple: run containers at scale, automatically, across many machines. Everything else follows from that.
The bottom line: Kubernetes is the production standard for containerized workloads — if your team runs more than a handful of containers or needs zero-downtime deployments, horizontal autoscaling, or cross-machine resilience, K8s is the right tool. Start with Docker and Docker Compose, graduate to Kubernetes on a managed service (EKS, GKE, or AKS) when operational requirements demand it, and learn kubectl fluently — it is the one command-line tool every cloud engineer needs regardless of which managed K8s platform they use.
Sources: AWS Documentation, Gartner Cloud Strategy, CNCF Annual Survey