What is etcd and what role does it play in a Kubernetes cluster?
etcd is a distributed, reliable key-value store that serves as Kubernetes’ primary datastore for all cluster state and configuration data.
Role in Kubernetes
etcd is the single source of truth for a Kubernetes cluster. Every object you create (pods, services, configmaps, secrets, etc.) is stored in etcd. The API server is the only component that reads from and writes to etcd directly; every other component reaches cluster state through the API server.
Key Characteristics
Distributed Consensus
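etcd uses the Raft consensus algorithm to keep data consistent across multiple etcd members. A write is committed only after a majority (quorum) of members has accepted it, so the cluster keeps working as long as a majority is healthy. A cluster typically runs 3 or 5 etcd members to achieve fault tolerance.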
Watch Mechanism
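Kubernetes controllers are notified of changes through watches. They do not connect to etcd themselves: they open watches against the API server, which in turn relies on etcd’s watch API to learn about changes. For example, the scheduler watches for unscheduled pods and the controller manager watches for deployment changes.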
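A minimal sketch of observing the same kind of change stream from the command line, assuming kubectl is already configured for the cluster. The function name `watch_pod_events` is hypothetical; kubectl opens a watch against the API server, which serves it from its own watch on etcd.

```sh
# Stream pod change events (ADDED, MODIFIED, DELETED) from the API server.
# Assumption: kubectl is installed and has access to the cluster.
watch_pod_events() {
  kubectl get pods --all-namespaces --watch --output-watch-events
}
```

Calling `watch_pod_events` and then creating or deleting a pod in another terminal should print one event line per change until the command is interrupted.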
Strong Consistency
etcd provides linearizable reads and writes by default: once a write is acknowledged, every later read returns that value or a newer one, regardless of which member serves the request.
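As a rough illustration, etcdctl lets a client choose the read consistency level per request. The sketch below assumes a kubeadm-style certificate layout and the default `/registry` key prefix; the function names and the example key are hypothetical.

```sh
# Connection settings are read by etcdctl from ETCDCTL_* environment variables.
# Assumption: paths follow the kubeadm layout used elsewhere in this article.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# Linearizable read (the default): confirmed through the Raft quorum.
read_linearizable() {
  etcdctl get "$1" --keys-only --consistency=l
}

# Serializable read: answered by the local member alone, so it is cheaper
# but may return slightly stale data.
read_serializable() {
  etcdctl get "$1" --keys-only --consistency=s
}
```

For example, `read_linearizable /registry/namespaces/default` checks that the key exists using a quorum-backed read, while `read_serializable` trades that guarantee for lower latency.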
What’s Stored in etcd
- All Kubernetes objects (Pods, Deployments, Services, etc.)
- Cluster configuration
- RBAC policies
- Secrets (encrypted at rest only if encryption is configured on the API server; stored unencrypted otherwise)
- Node information
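To see how these objects are laid out, one option is to list the keys directly. This is a sketch that assumes the API server uses its default `/registry` key prefix and that the certificates live in the kubeadm default location; `list_registry_keys` is a hypothetical helper name.

```sh
# Connection settings are read by etcdctl from ETCDCTL_* environment variables.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

# List key names only (values are binary-encoded and not human-readable).
# $1 is an optional prefix; it defaults to the whole Kubernetes keyspace.
list_registry_keys() {
  etcdctl get "${1:-/registry}" --prefix --keys-only
}
```

Running `list_registry_keys /registry/secrets` narrows the listing to Secrets; keys generally follow the pattern `/registry/<resource>/<namespace>/<name>`.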
etcd in Production
Backup Strategy
# Create etcd snapshot
ETCDCTL_API=3 etcdctl snapshot save snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key
High Availability
- Run an odd number of members (3, 5, or 7); an even count adds no extra failure tolerance
- A cluster of 3 tolerates 1 failure
- A cluster of 5 tolerates 2 failures
- Use dedicated SSDs for low latency
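The failure tolerances above follow from Raft's majority rule: a cluster of n members needs floor(n/2) + 1 of them to agree, so it survives the loss of the rest. The helper names `quorum` and `tolerance` below are illustrative only.

```sh
# Majority needed for a cluster of $1 members.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority remains.
tolerance() {
  echo $(( ($1 - 1) / 2 ))
}

# Print the arithmetic for cluster sizes 1 through 7.
for n in 1 2 3 4 5 6 7; do
  printf '%s members: quorum %s, tolerates %s failure(s)\n' \
    "$n" "$(quorum "$n")" "$(tolerance "$n")"
done
```

The output shows that 4 members tolerate the same single failure as 3, and 6 the same two failures as 5, which is why odd sizes are preferred.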
Why etcd Performance Matters
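etcd latency directly impacts API server response time. A slow etcd means slow kubectl commands, slow deployments, and, in severe cases, cluster instability. Always monitor etcd disk I/O and latency, for example the etcd_disk_wal_fsync_duration_seconds and etcd_disk_backend_commit_duration_seconds metrics.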
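For a quick manual check, etcdctl can report per-member health and status. This is a sketch, not a substitute for proper metrics monitoring; it assumes the kubeadm certificate layout, and `etcd_health` is a hypothetical helper name.

```sh
# Connection settings are read by etcdctl from ETCDCTL_* environment variables.
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://127.0.0.1:2379
export ETCDCTL_CACERT=/etc/kubernetes/pki/etcd/ca.crt
export ETCDCTL_CERT=/etc/kubernetes/pki/etcd/server.crt
export ETCDCTL_KEY=/etc/kubernetes/pki/etcd/server.key

etcd_health() {
  # Health of every member, including how long each check took.
  etcdctl endpoint health --cluster
  # Database size, leader flag, and Raft term/index for every member.
  etcdctl endpoint status --cluster --write-out=table
}
```

Health checks that take noticeably longer on one member than on the others, or a database size that keeps growing, are reasonable prompts to look at disk latency on that node.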