Automates (orchestrates) Docker containers: scaling, detecting and replacing failed containers, and so on.
Both, combined with Docker, allow us to deploy immutable infrastructure, where both the system and the application are represented as a single artifact. Combined with git, such artifacts gain a "single source of truth": a repository that contains the underlying configuration.
CNCF Landscape
![](/Attachments/Pasted-image-20250527095536.png)
- CRI (Container Runtime Interface) - used to execute and run container processes on the system
- Mostly containerd and CRI-O
- CNI (Container Network Interface) - used to define networking
- Cloud-specific, or open source: Calico, Flannel, Cilium
- CSI (Container Storage Interface) - manages storage and volumes
- Cloud-specific, plus cert-manager and the Secrets Store CSI Driver
Theory
![](/Attachments/Pasted-image-20230122040407.png)
Architecture (control plane)
The cluster control plane manages all scheduling, application deployment, and scaling in Kubernetes. Nodes serve as workers in the K8s cluster. Required components which must be installed on every node:
- Container runtime - runs the containers
- Kubelet - intermediary between the container runtime and the node. Executes operations on pods on the node
- Kube-proxy - applies smart routing of requests (load balancing), e.g. sending requests to a DB replica that is closer (inside the same node)
Services are used for communication between pods across nodes.
Master (control plane) nodes act as controllers for the worker nodes. You can deploy multiple of them for redundancy. Required components which must be installed on every master node:
- API Server (kube-apiserver) - accepts all requests coming into or queries from the cluster (UI, API, CLI). Gatekeeps authentication of requests to the cluster.
- Scheduler (kube-scheduler) - places new pods on nodes, balancing deployments and scaling requirements against available computing resources.
- Controller Manager (kube-controller-manager) - detects state changes of pods and triggers redeployment by communicating with the Scheduler.
- etcd - key-value store, the cluster brain; cluster state changes are written there. Holds data on load, usage, etc. Does not store app data.
It all works in layers: when you create a Deployment, it creates a ReplicaSet, which creates the pods. If a stray pod with a matching label already exists, it is counted against the replica limit, so the excess pod will be removed, since the Deployment's own pods are created regardless.
- Pod - an abstraction over containers; can contain multiple containers
- e.g. an init container (runs before), plus sidecars (run alongside) and the primary container
- Usually 1 app/DB per pod
- Pods get an internal (ephemeral) IP address
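A minimal sketch of a pod manifest (name, labels, and image are illustrative):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-pod            # pod name (hypothetical)
  labels:
    app: web
spec:
  containers:
    - name: web            # container name, distinct from the pod name
      image: nginx:1.27
      ports:
        - containerPort: 80
```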
- Job - an ad-hoc (single-use) run-to-completion container configuration: creates one or more pods, then completes. Pods are assigned random name suffixes. Set backoffLimit to cap retries (see the sketch below)
- CronJob - like a Job, but can be scheduled using general cron rules
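A minimal Job sketch with a retry cap (name and command are made up):
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: one-off-task       # hypothetical name
spec:
  backoffLimit: 3          # stop retrying after 3 failed attempts
  template:
    spec:
      restartPolicy: Never # Jobs require Never or OnFailure
      containers:
        - name: task
          image: busybox:1.36
          command: ["sh", "-c", "echo doing work && sleep 2"]
```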
- DaemonSet - runs a copy of a pod on all (or a selected subset of) nodes in the cluster (brings propagation). Good for monitoring, log aggregation, or a storage daemon. Does not target control plane nodes by default (they are tainted)
- Namespace - separates resources into groups to avoid name conflicts, but doesn't act as a security/network boundary by default
- The default namespaces: default, kube-system, kube-public, kube-node-lease
- Node - hosts pods
- Service - networking; see the "Networking - Service or Ingress" section below
- Ingress - helps set up URLs, secure protocols (TLS), and domain names
- Accepts requests and can route to multiple Services
- Acts like an API gateway (K8s is slowly replacing Ingress with the Gateway API)
- Gateway API - evolves from Ingress and supports network gateways
- ConfigMap - maps variables between pods and external services (like DB URLs), as property-like keys and file-like keys
- Passwords are not stored in a ConfigMap
- Secret - where you store credentials; values are base64-encoded, access can be controlled via authorization policies
- Encryption at rest is not enabled by default
- Both Secrets and ConfigMaps can be used by the app through env vars (or mounted files)
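A minimal sketch of consuming both from a pod via env vars (all names are illustrative, and the Secret `app-secret` is assumed to already exist):
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DB_URL: postgres://db:5432/app   # property-like key
  conf.yml: |                      # file-like key (mountable as a file)
    log_level: info
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "env && sleep 3600"]
      env:
        - name: DB_URL
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: DB_URL
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-secret     # assumed to exist
              key: password
```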
- ReplicaSet - adds replication (use Deployments instead)
- Labels link ReplicaSets and Pods
- (Persistent) Volumes - store data persistently; function as emulated physical storage
- Provisioned as PVs, either manually by a user/admin or dynamically via StorageClasses (SC)
- The backing storage can be local or remote (e.g. S3)
- Deployment - creates blueprints for pods, automates replication, adds the concepts of rollout and rollback (see the sketch below)
- Allows replication of pods, e.g. by deploying a new node with the same pod setup
- Can't replicate DB pods, because they have state (data)
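A minimal Deployment sketch (names are illustrative); note that the selector must match the pod template labels:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                # desired pod count, managed through a ReplicaSet
  selector:
    matchLabels:
      app: web               # must match the template labels below
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27
```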
- StatefulSet - gives pods a sticky identity (-0, -1, -2... naming); each pod mounts a separate volume, and rollout happens in order. Helps preserve state, e.g. for DBs with a primary vs read replicas. A Deployment for stateful applications (see the sketch below)
- Pods are created in order, one after another; when scaling down, the last created are deleted first
- Pods get predictable names: -0/1/2/3/4
- Either use this or deploy the DB outside K8s on highly available infrastructure, like a cloud
- You cannot modify many fields after creation from YAML, like the storage size request!
- Has a serviceName field, which creates a DNS entry for each replica independently (as if each pod got its own DNS name), so in the Service declaration you should make it headless with clusterIP: None
- Why headless? "Each connection to the service is forwarded to one randomly selected backing pod. But what if the client needs to connect to all of those pods? What if the backing pods themselves need to each connect to all the other backing pods? Connecting through the service clearly isn't the way to do this." (Service | Kubernetes)
- "StatefulSets currently require a Headless Service to be responsible for the network identity of the Pods. You are responsible for creating this Service."
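A minimal sketch of a StatefulSet with its headless Service (names, image, and sizes are illustrative):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: db                   # referenced by serviceName below
spec:
  clusterIP: None            # headless: DNS resolves directly to pod IPs
  selector:
    app: db
  ports:
    - port: 5432
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db            # pods get DNS names like db-0.db
  replicas: 2
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: postgres
          image: postgres:16
          env:
            - name: POSTGRES_PASSWORD
              value: example # demo only; use a Secret in practice
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:      # dynamically provisions one PVC per pod
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```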
- PersistentVolume and PersistentVolumeClaim - a PVC is a declaration of a need for storage that can at some point become available/satisfied, i.e. bound to an actual PV. A PVC consumes a PV. A StatefulSet's volumeClaimTemplates enable dynamic provisioning of PVs (see the sketch below)
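A minimal standalone PVC sketch (assumes a StorageClass named `standard` exists in the cluster):
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim           # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: standard # assumed StorageClass
  resources:
    requests:
      storage: 1Gi
```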
- RBAC (ServiceAccount, Role, RoleBinding) - a ServiceAccount is an auth identity, a Role defines which actions are allowed on which resources, and a RoleBinding merges these two entities. A Role is namespace-level, while a ClusterRole is cluster-level (see the sketch below)
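A minimal RBAC sketch granting read access to pods (all names are made up):
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role                   # namespace-scoped; use ClusterRole for cluster-wide
metadata:
  name: pod-reader
rules:
  - apiGroups: [""]          # "" means the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding            # binds the Role to the ServiceAccount
metadata:
  name: read-pods
subjects:
  - kind: ServiceAccount
    name: app-sa
    namespace: default
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```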
- Labels and Annotations - labels are key-value identifiers that selectors match on; annotations hold non-identifying metadata (for tools, not for selectors)
You can get an explanation of template formats via
`kubectl explain <component>`
or `kubectl explain <component>.<sub>`. See [GitHub - BretFisher/podspec: Kubernetes Pod Specification Good Defaults](https://github.com/BretFisher/podspec) ![[Pasted image 20250527170641.png]]
Helm
Distribution and versioning of Kubernetes applications.
![](/Attachments/Pasted-image-20250612225045.png)
It is good at templating.
![](/Attachments/Pasted-image-20250612225407.png)
values.yaml can have a supplementary values.schema.json to define a schema for the values. Useful for error prevention.
Helm keeps the history of releases in Secrets in the cluster.
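A minimal sketch of how the templating fits together (chart and values are illustrative):
```yaml
# values.yaml
replicaCount: 2
image:
  repository: nginx
  tag: "1.27"
```
```yaml
# templates/deployment.yaml (fragment, rendered by Helm)
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```
`helm template .` renders the chart locally; `helm install <release> . -f custom-values.yaml` overrides the defaults.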
Networking - Service or Ingress
![](/Attachments/Pasted-image-20250528140816.png)
- Kind: Service (based on selector match with labels defined in Deployment)
- ClusterIP: Internal to Cluster. Services are reachable by pods/services in the Cluster.
- NodePort: Listens on each node in the cluster. Services are reachable by clients on the same LAN / clients who can reach the K8s host nodes (and by pods/services in the cluster). Note: for security, your K8s host nodes should be on a private subnet, so clients on the internet won't be able to reach this service.
- LoadBalancer: Provisions an external LB. Services are reachable by everyone connected to the internet. (A common architecture is an L4 LB that is publicly accessible on the internet, placed in a DMZ, or given both a private and a public IP while the K8s host nodes sit on a private subnet.) (Run sudo $(which cloud-provider-kind) for local K8s so that the LB works.)
- A Service uses selectors to determine which Pods it should route traffic to, based on the labels those Pods have.
- Kind: Ingress
- Exposes to the internet, routes to multiple services, sets up advanced HTTP features; works similarly to an API gateway.
Kubernetes has an internal DNS service (like Docker Compose's), so you can resolve a Service from within a pod.
![](/Attachments/Pasted-image-20250528142410.png)
If you are within the same namespace, you can resolve the short name:
![](/Attachments/Pasted-image-20250528143121.png)
If you are reaching across namespaces, use the FQDN:
![](/Attachments/Pasted-image-20250528144320.png)
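For example (assuming a Service my-service in namespace other-ns):
- `curl http://my-service` - short name, same namespace
- `curl http://my-service.other-ns.svc.cluster.local` - FQDN, across namespaces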
In practice
kubectl
List all pods in the namespace
kubectl get pods -n new-prod
List pods and include additional info
kubectl get pods -o wide
Create a curl pod (use this only as a demonstration)
kubectl run curl-pod -it --rm --image=curlimages/curl --command -- sh
![](/Attachments/Pasted-image-20250528143121.png)
How replication works in practice:
![](/Attachments/Pasted-image-20250616085032.png)
Dry run to preview changes:
kubectl create -f replicaset-definition-1.yaml --dry-run=client
![](/Attachments/Pasted-image-20250616225056.png)
Execute command in the container, like docker exec:
kubectl exec configmap-example -c nginx -- cat /etc/config/conf.yml
Pod name, container, command.
![](/Attachments/Pasted-image-20250528215644.png)
You want to create a config-file type of secret, like:
![](/Attachments/Pasted-image-20250528222046.png)
To simplify, you can do it via the CLI and then add
--dry-run=client -o yaml
so you get output ready to copy and paste into a file. Here is a comparison, where the first command creates the secret and the other prints it for the file: ![](/Attachments/Pasted-image-20250528222011.png)
![](/Attachments/Pasted-image-20250528222234.png)
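A sketch of the pattern (secret name and literal are made up):
`kubectl create secret generic db-creds --from-literal=password=s3cret --dry-run=client -o yaml > secret.yaml`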
Direct monitoring is possible (e.g. `kubectl top`, via metrics-server), but it is better to use Prometheus and Grafana.
To apply multiple:
kubectl apply -f 01_pods.yml -f 02_pvc.yml
Deployment
Two ways of scaling down the deployments:
- Reduce the replicas parameter in the manifest and `kubectl replace -f xx.yaml`
- `kubectl scale <type> <name> --replicas=6` or `kubectl scale --replicas=6 -f xx.yaml`
In a deployment, labels work as additional identifiers; for example, you can distinguish environments (prod, dev, stage, test) with them.
Selectors let the Deployment know which pods belong to it: the selector is matched against the pod template's metadata labels.
Labels, Selectors & Namespaces
Labels are classic key-value pairs, modifiable.
Selectors are used in manifests to filter and act on resources carrying the label.
Namespaces, on the other hand, strictly separate resources in the cluster, so they can have separate RBAC, quotas, and network policies.
| Resource | Namespaced? | Visible across Namespaces? | Notes |
|---|---|---|---|
| PersistentVolume (PV) | ❌ No | ✅ Yes | Like a shared storage pool |
| PersistentVolumeClaim (PVC) | ✅ Yes | ❌ No | Tied to 1 namespace |
| Pod | ✅ Yes | ❌ No | Must mount PVCs in its own namespace |
Standard practice is to name the key `app` if you deploy an application, so: `app: nginx`
Service
Also matches labels to know which pods it should be linked with.
port = service port / targetPort = container port (usually).
In common practice, port and targetPort can be the same, to keep things simpler.
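A minimal Service sketch (names and ports are illustrative):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP            # default type
  selector:
    app: web                 # links the Service to pods labeled app=web
  ports:
    - port: 80               # service port (what clients call)
      targetPort: 8080       # container port (where traffic lands)
```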
Statuses and Errors
Tip
- BackOff - incremental increase of the wait time before a retry
- ImagePullBackOff - the initial image pull failed; will retry with BackOff
- ErrImagePull - failed to pull the image; it does not exist or there is no access
- CrashLoopBackOff - the container keeps crashing after starting; the kubelet restarts it with an increasing BackOff delay
Q&A
- `metadata.name` vs `spec.containers[].name`
- The name in metadata refers to the pod name, while the name in the container spec refers to a container's name.
- no matches for kind "ReplicaSet" in version "v1"
- You probably made a mistake in `apiVersion` (ReplicaSet lives in `apps/v1`)
- Limits vs Requests
- Requests define how much a pod is going to use; the scheduler uses this information to find a proper node
- Limits restrict how much a pod can actually use
- If they are equal, the QoS class becomes "Guaranteed" (see the fragment below)
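A container-spec fragment showing both (values are illustrative):
```yaml
# fragment of a container spec
resources:
  requests:              # reserved at scheduling time to pick a node
    cpu: "250m"
    memory: "128Mi"
  limits:                # hard runtime cap
    cpu: "250m"          # equal to requests => QoS class "Guaranteed"
    memory: "128Mi"
```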
- How to go inside pods and get logs from containers?
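- `kubectl exec -it <pod> -c <container> -- sh` opens a shell inside a container; `kubectl logs <pod> -c <container>` fetches its logs (add `-f` to follow)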
1. `kubectl apply` vs `kubectl replace`
- Some resources have immutable fields that apply won't let you change; for example, a deployment cannot have its selectors changed. Replace can be used in these situations.
- x node(s) had untolerated taint
- See [Taints and Tolerations | Kubernetes](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/)
- If you are in a lab, make sure you aren't deploying to a control plane node. Control plane nodes have taints by default.