What is Google Cloud Run and when should you use it instead of GKE?
Google Cloud Run is a fully managed serverless container platform that automatically scales containerized workloads, including to zero when not in use.
What is Cloud Run?
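Cloud Run runs any stateless container that listens for HTTP requests on the port given by the PORT environment variable (8080 by default). You bring a container image, and Cloud Run handles the infrastructure: load balancing, scaling, TLS termination, and usage metering.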
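To make "a stateless container that listens on HTTP" concrete, here is a minimal sketch of such a service using only the Python standard library. It assumes the port arrives in the `PORT` environment variable; `HelloHandler` and `make_server` are names invented for this example, not part of any Cloud Run API.

```python
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Stateless handler: each response depends only on the incoming request."""

    def do_GET(self):
        body = b"Hello from a container\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request logging to keep the example quiet.
        pass


def make_server(port: int) -> ThreadingHTTPServer:
    # Bind to all interfaces so traffic from outside the container can reach it.
    return ThreadingHTTPServer(("0.0.0.0", port), HelloHandler)


if __name__ == "__main__":
    # Fall back to 8080 when PORT is not set, e.g. when running locally.
    port = int(os.environ.get("PORT", "8080"))
    make_server(port).serve_forever()
```

Because the handler keeps nothing in memory between requests, any instance can serve any request, which is what lets the platform add and remove instances freely.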
Key Characteristics
- Serverless: No infrastructure management, scales to zero
- Pay-per-use: With the default request-based billing, you pay per request plus CPU/memory while requests are being processed
- Knative-based: Built on open Knative standards
- Any language/framework: Works with any OCI-compatible (Docker) container image
Cloud Run vs GKE
| Aspect | Cloud Run | GKE |
|---|---|---|
| Infrastructure | Fully managed | Partially managed |
| Scaling | Automatic (0 to N) | Manual, HPA, or cluster autoscaler |
| Cost model | Per request plus CPU/memory time | Per node hour |
| Startup time | Cold starts, typically ~1-2s (varies with image and runtime) | N/A (pods warm) |
| Stateful workloads | No | Yes |
| Custom networking | Limited | Full control |
| Persistent storage | No local persistent disk (use Cloud Storage/Cloud SQL) | Yes (PV/PVC) |
| Long-running jobs | Limited (request timeout, at most 60 minutes) | Yes |
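The "Cost model" row can be illustrated with a rough comparison of the two billing shapes. This is only a sketch: the function names are hypothetical, every rate is a placeholder rather than a real price, and it assumes one request per instance at a time with no warm minimum instances.

```python
def cloud_run_monthly_cost(requests: int, seconds_per_request: float,
                           price_per_million_requests: float,
                           price_per_instance_second: float) -> float:
    """Per-request model: pay for requests plus compute time while serving them."""
    request_fee = requests / 1_000_000 * price_per_million_requests
    compute_fee = requests * seconds_per_request * price_per_instance_second
    return request_fee + compute_fee


def gke_monthly_cost(nodes: int, price_per_node_hour: float,
                     hours_per_month: float = 730.0) -> float:
    """Per-node-hour model: pay for nodes whether or not they serve traffic."""
    return nodes * price_per_node_hour * hours_per_month


if __name__ == "__main__":
    # Placeholder rates for illustration only; check current pricing before use.
    run = cloud_run_monthly_cost(
        requests=1_000_000,
        seconds_per_request=0.1,
        price_per_million_requests=0.5,
        price_per_instance_second=0.00002,
    )
    gke = gke_monthly_cost(nodes=1, price_per_node_hour=0.1)
    print(f"Cloud Run: {run:.2f}  GKE: {gke:.2f}")
```

The shape matters more than the numbers: the Cloud Run cost grows with traffic and is zero at zero traffic, while the GKE cost is flat for a fixed node count.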
When to Use Cloud Run
- API backends and microservices: HTTP APIs that can be stateless
- Event-driven workloads: Triggered by Pub/Sub, Cloud Scheduler, Eventarc
- Batch processing: Short-lived tasks from message queues
- Variable or spiky traffic: Scaling to zero saves cost for low-traffic services
- Prototyping and MVPs: Fast deployment without cluster setup
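For the event-driven case, a Pub/Sub push subscription delivers each message to the service as an HTTP POST whose JSON body wraps the payload, base64-encoded, under `message.data`. The sketch below decodes that envelope; `decode_pubsub_push` is a hypothetical helper name, and wiring it into an HTTP handler is left out.

```python
import base64
import json


def decode_pubsub_push(body: bytes) -> dict:
    """Extract payload and attributes from a Pub/Sub push request body.

    Raises ValueError if the body is not a push envelope, so the caller
    can answer with a 4xx status instead of crashing.
    """
    try:
        envelope = json.loads(body)
    except json.JSONDecodeError as exc:
        raise ValueError("body is not valid JSON") from exc

    message = envelope.get("message") if isinstance(envelope, dict) else None
    if not isinstance(message, dict):
        raise ValueError("missing 'message' field")

    # 'data' is base64-encoded and may be absent for attribute-only messages.
    data = base64.b64decode(message.get("data", "")).decode("utf-8")
    return {
        "message_id": message.get("messageId"),
        "attributes": message.get("attributes", {}),
        "data": data,
    }
```

A handler would typically return a success status once the message is processed, since an error status leads Pub/Sub to redeliver it.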
When to Use GKE
- Stateful applications: Databases, message brokers with persistent storage
- Long-running background jobs: No timeout constraints
- Complex networking: Service mesh, custom ingress controllers
- GPU/specialized hardware: Machine learning training workloads
- Multiple containers per pod: Full pod-level control over sidecar patterns (Envoy, log agents); Cloud Run also supports sidecar containers, but with fewer options
- Fine-grained scaling control: Custom HPA metrics
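The two lists can be condensed into a rough decision helper. This is a sketch, not an official rule set: the `Workload` fields and `recommend_platform` are hypothetical names covering a subset of the criteria above, and real decisions will weigh cost and team experience as well.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload description; fields mirror the criteria above."""
    stateful: bool = False
    exceeds_request_timeout: bool = False
    needs_custom_networking: bool = False
    needs_custom_scaling_metrics: bool = False
    trains_on_gpus: bool = False


def recommend_platform(w: Workload) -> tuple[str, list[str]]:
    """Return a platform name and the reasons that pushed toward GKE, if any."""
    checks = {
        "stateful application": w.stateful,
        "long-running background job": w.exceeds_request_timeout,
        "complex networking": w.needs_custom_networking,
        "custom scaling metrics": w.needs_custom_scaling_metrics,
        "GPU training": w.trains_on_gpus,
    }
    reasons = [name for name, hit in checks.items() if hit]
    # With no GKE-specific requirement, default to the simpler platform.
    return ("GKE" if reasons else "Cloud Run", reasons)


if __name__ == "__main__":
    print(recommend_platform(Workload()))
    print(recommend_platform(Workload(stateful=True, trains_on_gpus=True)))
```

Defaulting to Cloud Run reflects the framing above: GKE is chosen when a specific requirement calls for it, not the other way around.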
Cloud Run Example
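# Deploy a container image to Cloud Run (PROJECT is a placeholder for your project ID).
# --min-instances 1 keeps one instance warm to avoid cold starts; set it to 0 to allow scale to zero.
gcloud run deploy my-service \
  --image gcr.io/PROJECT/my-app:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --min-instances 1 \
  --max-instances 100 \
  --memory 512Mi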