Real Production Scenarios
Real-world architecture, system migration, and design challenges.
What is the purpose of a staging environment and what tests should run there?
Staging is a production-mirror environment used to catch bugs that only appear with real data, full infrastructure, and realistic load — things unit tests can’t surface. Tests to run in staging:
- Integration tests: Real database connections, real API calls to third parties.
- E2E tests: Cypress, Playwright, or Selenium to simulate real user journeys.
- Smoke tests: Quick sanity checks that critical paths work after deployment.
- Performance tests: Load tests with k6 or Locust to catch regressions.
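As a rough illustration of the smoke-test idea above, the sketch below checks a handful of critical paths and reports which ones answered with a 2xx status. The path list and the `fetch_status` callable are assumptions made for this example; in a real pipeline `fetch_status` would wrap an HTTP client pointed at the staging URL.

```python
from typing import Callable, Dict, Sequence

# Hypothetical critical paths; a real suite would list the service's own endpoints.
CRITICAL_PATHS = ("/health", "/login", "/api/orders")


def run_smoke_tests(
    fetch_status: Callable[[str], int],
    paths: Sequence[str] = CRITICAL_PATHS,
) -> Dict[str, bool]:
    """Return a pass/fail map; a path passes when it answers with HTTP 2xx."""
    results: Dict[str, bool] = {}
    for path in paths:
        try:
            status = fetch_status(path)
            results[path] = 200 <= status < 300
        except Exception:
            # A connection error counts as a failed check, not a crashed run.
            results[path] = False
    return results


def smoke_passed(results: Dict[str, bool]) -> bool:
    """The deployment is considered sane only if every critical path passed."""
    return all(results.values())
```

A pipeline step could call `smoke_passed(run_smoke_tests(fetch))` and exit non-zero when it returns `False`.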
How do you handle database migrations in a CI/CD pipeline without downtime?
Database migrations are one of the riskiest parts of deployment. The golden rule: migrations must be backward-compatible because during a rolling deploy, old code and new code run simultaneously.
Safe migration checklist:
- Never: Rename or drop a column in the same deploy as the code change that stops using it; old code still running during the rollout would break.
- Step 1: Add new column (nullable, backward-compatible).
- Step 2: Deploy code that writes to both old and new columns.
- Step 3: Migrate existing data.
- Step 4: Deploy code using only the new column.
- Step 5: Drop the old column.
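The five steps can be sketched end to end with Python's built-in `sqlite3` module. This is a minimal illustration, not a migration framework: the `users` table and the `name` / `full_name` columns are invented for the example, and each function stands in for what would be a separate deploy in practice.

```python
import sqlite3


def expand(conn):
    # Step 1: add the new column as nullable so old code keeps working.
    conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")


def dual_write(conn, user_id, name):
    # Step 2: new code writes to both the old and the new column.
    conn.execute(
        "UPDATE users SET name = ?, full_name = ? WHERE id = ?",
        (name, name, user_id),
    )


def backfill(conn):
    # Step 3: copy old values into rows the new code has not touched yet.
    conn.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")


def contract(conn):
    # Step 5: drop the old column once nothing reads it.
    # ALTER TABLE ... DROP COLUMN needs SQLite 3.35 or newer, so guard it.
    if sqlite3.sqlite_version_info >= (3, 35, 0):
        conn.execute("ALTER TABLE users DROP COLUMN name")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada'), ('Linus')")

expand(conn)
# Old code that only knows about `name` still works after step 1.
conn.execute("INSERT INTO users (name) VALUES ('Grace')")
dual_write(conn, 1, "Ada Lovelace")
backfill(conn)
# Step 4: new code reads only the new column.
rows = conn.execute("SELECT id, full_name FROM users ORDER BY id").fetchall()
contract(conn)
```

The point of the ordering is that at every moment both the previous and the next version of the code can run against the current schema.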
How do you implement a multi-environment deployment pipeline (dev → staging → prod)?
A professional multi-environment pipeline uses gates between stages:
- Build once: A single immutable artifact (Docker image with SHA tag) is promoted — never rebuilt.
- Deploy to Dev: Automatic on every merge to main.
- Deploy to Staging: Automatic after dev health checks pass. Run integration and smoke tests.
- Deploy to Prod: Manual approval gate + scheduled deployment window.
The key is that the same image moves through all environments. This ensures what you tested in staging is exactly what runs in production.
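One way to picture "build once, promote everywhere" is the sketch below. The `Artifact` class, the registry name, and the gate callables are hypothetical; a gate stands for whatever must pass before an environment is entered (dev health checks before staging, a manual approval before prod).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

ENVIRONMENTS = ("dev", "staging", "prod")


@dataclass(frozen=True)
class Artifact:
    """An immutable build output, identified by its commit SHA tag."""
    image: str
    sha: str

    @property
    def ref(self) -> str:
        return f"{self.image}:{self.sha}"


def promote(
    artifact: Artifact,
    gates: Dict[str, Callable[[Artifact], bool]],
) -> List[Tuple[str, str]]:
    """Deploy the same artifact to each environment in order.

    Promotion stops at the first environment whose gate fails.
    Environments without a gate are entered automatically.
    """
    deployed: List[Tuple[str, str]] = []
    for env in ENVIRONMENTS:
        gate = gates.get(env, lambda _artifact: True)
        if not gate(artifact):
            break
        deployed.append((env, artifact.ref))
    return deployed
```

Because `promote` only ever receives one `Artifact`, there is no code path that could rebuild or retag the image between environments.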
How do you speed up slow CI pipelines?
Slow pipelines kill developer productivity. Key optimizations:
- Caching: Cache dependencies (node_modules, pip packages, Go modules) between runs.
- Parallelism: Split test suites and run jobs in parallel.
- Test selection: Only run tests affected by the changed code.
- Optimized Docker builds: Use layer caching and BuildKit.
- Self-hosted runners: Eliminate queue time and use faster hardware.
- Fail fast: Run linting and unit tests first; integration tests only if those pass.
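Two of these optimizations, caching and parallelism, come down to small deterministic functions. The sketch below is an assumption about how one might implement them by hand; most CI systems provide equivalents. `cache_key` derives a key from the lockfile contents so the cache is reused until dependencies change, and `shard` splits a test list across parallel jobs.

```python
import hashlib
from typing import List


def cache_key(prefix: str, lockfile_bytes: bytes) -> str:
    """Build a cache key that changes only when the lockfile changes."""
    digest = hashlib.sha256(lockfile_bytes).hexdigest()[:16]
    return f"{prefix}-{digest}"


def shard(tests: List[str], index: int, total: int) -> List[str]:
    """Return the tests for job `index` out of `total` parallel jobs.

    Sorting first makes the split identical on every runner, so each
    test lands in exactly one shard.
    """
    if not 0 <= index < total:
        raise ValueError("index must be in range [0, total)")
    return sorted(tests)[index::total]
```

For example, with five test files and two jobs, job 0 would run the first, third, and fifth file in sorted order and job 1 the remaining two.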
What is a pipeline artifact and what are common examples?
A pipeline artifact is any file produced by a CI/CD job that needs to be passed to downstream jobs or stored for later use.
Common examples:
- Compiled binary or JAR file (Java/Go)
- Built Docker image pushed to a registry
- Frontend build output (dist/ or build/ folder)
- Test reports and coverage reports
- SBOM (Software Bill of Materials) files
- Terraform plan output
What is GitOps and how does it differ from traditional CI/CD?
Traditional CI/CD: The pipeline has credentials and directly pushes deployments to environments (push-based).
GitOps: Git is the single source of truth for the desired state of your infrastructure and applications. An agent running in the cluster (like ArgoCD or Flux) continuously reconciles the actual state with the desired state in Git (pull-based).
Benefits of GitOps: Drift detection, audit trail in Git history, easy rollback (git revert), and no cluster credentials stored in CI, because the in-cluster agent pulls changes instead of the pipeline pushing them.
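The reconciliation idea can be shown with a toy diff between desired and actual state. This is only an illustration of the concept and does not reflect how ArgoCD or Flux are implemented; the dictionaries mapping resource names to specs are an assumption made for the example.

```python
from typing import Dict, List, Tuple


def plan_reconcile(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[Tuple[str, str]]:
    """Compute the actions needed to make `actual` match `desired`.

    `desired` plays the role of what is committed in Git and `actual`
    the role of what is running in the cluster.
    """
    actions: List[Tuple[str, str]] = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            # Drift: the live resource differs from what Git declares.
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions
```

An agent would run this kind of comparison in a loop and apply the resulting actions, which is why a `git revert` is enough to roll back: the desired state changes and the next reconciliation follows it.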
How do you implement automated rollback in a deployment pipeline?
Automated rollback is triggered when post-deployment health checks fail. A robust implementation:
- Health check gate: After deployment, poll the health endpoint for 2-3 minutes.
- Metric thresholds: Monitor error rate and p99 latency for 5 minutes post-deploy.
- Rollback trigger: If error rate exceeds a threshold, automatically re-deploy the previous image tag.
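# Generic shell rollback logic.
# deploy, health_check_passes, and alert_pagerduty are placeholders for your own functions or scripts.
NEW_VERSION="v2.0"
PREV_VERSION="v1.9"
deploy "$NEW_VERSION"
if ! health_check_passes; then
  echo "Rollback triggered"
  deploy "$PREV_VERSION"
  alert_pagerduty "Automatic rollback executed"
fi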
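The metric-threshold variant of the same logic might look like the following sketch. The `deploy` and `sample_error_rate` callables are hypothetical hooks into your deployment tool and monitoring system, and the 5% threshold and five one-minute samples are assumed values chosen to match the 5-minute window mentioned above.

```python
import time
from typing import Callable


def deploy_with_rollback(
    deploy: Callable[[str], None],
    sample_error_rate: Callable[[], float],
    new_version: str,
    prev_version: str,
    samples: int = 5,
    max_error_rate: float = 0.05,
    interval_s: float = 60.0,
    wait: Callable[[float], None] = time.sleep,
) -> str:
    """Deploy `new_version`, watch the error rate, and roll back on a breach.

    Returns the version that is live when monitoring ends.
    """
    deploy(new_version)
    for _ in range(samples):
        if sample_error_rate() > max_error_rate:
            # Threshold breached: re-deploy the previous image tag.
            deploy(prev_version)
            return prev_version
        wait(interval_s)
    return new_version
```

Passing `wait` as a parameter keeps the function testable without real delays; the same pattern extends to a p99 latency check by adding a second sampler and threshold.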
How do you structure a monorepo CI/CD pipeline to avoid unnecessary builds?
In a monorepo with 20+ services, you must only trigger builds for services that actually changed. Strategies:
- Path filters: Use the GitHub Actions paths: filter to trigger workflows only when specific directories change.
- Nx / Turborepo: Task runners with build graph awareness that skip unchanged services.
- git diff: Compare changed files against the base branch and only build affected services.
# GitHub Actions path filter
on:
  push:
    paths:
      - "services/api/**"
      - "shared/lib/**"
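The git diff strategy can be sketched as a function that maps changed file paths to the services that need rebuilding. The directory layout (`services/<name>/...`) and the `SERVICE_DEPS` table are assumptions for illustration; the list of changed files would typically come from something like `git diff --name-only origin/main...HEAD`.

```python
from pathlib import PurePosixPath
from typing import Dict, Iterable, List

# Hypothetical map of each service to the shared directories it depends on.
SERVICE_DEPS: Dict[str, List[str]] = {
    "api": ["shared/lib"],
    "web": ["shared/ui"],
    "worker": ["shared/lib"],
}


def affected_services(
    changed_files: Iterable[str],
    service_deps: Dict[str, List[str]] = SERVICE_DEPS,
    services_root: str = "services",
) -> List[str]:
    """Return the sorted names of services touched by the changed files."""
    affected = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # A change inside services/<name>/ affects that service directly.
        if len(parts) >= 2 and parts[0] == services_root and parts[1] in service_deps:
            affected.add(parts[1])
            continue
        # A change in a shared directory affects every service depending on it.
        for service, deps in service_deps.items():
            if any(path.startswith(dep.rstrip("/") + "/") for dep in deps):
                affected.add(service)
    return sorted(affected)
```

Files outside any known service or shared directory (a top-level README, for instance) produce an empty list, so no build is triggered.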
What is the difference between a Blue/Green deployment and a Canary deployment?
Blue/Green: You maintain two identical environments. “Blue” is live, “Green” has the new version. You switch all traffic from Blue to Green at once. Rollback is instant — just switch back. Downside: doubles infrastructure cost.
Canary: You gradually shift traffic from the old version to the new one — e.g., 5% → 25% → 50% → 100%. You analyze metrics and errors at each stage. Slower but safer for catching issues that only appear under real production load.
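The canary progression can be expressed as a short loop. In this sketch `set_traffic_percent` and `is_healthy` are hypothetical hooks into a load balancer or service mesh and a metrics check; the step percentages are the ones from the example above.

```python
from typing import Callable, Sequence

CANARY_STEPS = (5, 25, 50, 100)


def run_canary(
    set_traffic_percent: Callable[[int], None],
    is_healthy: Callable[[], bool],
    steps: Sequence[int] = CANARY_STEPS,
) -> bool:
    """Shift traffic to the new version step by step.

    Returns True if the rollout reached the final step, or False if a
    health check failed and traffic was sent back to the old version.
    """
    for percent in steps:
        set_traffic_percent(percent)
        if not is_healthy():
            # Abort: route all traffic back to the old version.
            set_traffic_percent(0)
            return False
    return True
```

A Blue/Green switch is the degenerate case of the same loop with `steps=(100,)`: one jump, and rollback means setting the percentage back to 0.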
Why do you use branch protection rules in a CI/CD workflow?
Branch protection rules on the main or production branch enforce quality gates before any code is merged:
- Require pull request reviews (at least 1-2 approvals)
- Require status checks to pass (CI build, tests, linting)
- Require branches to be up to date before merging
- Prevent force pushes and branch deletion
This ensures no untested or unreviewed code ever reaches production, which is the foundation of a trustworthy deployment pipeline.
How do you implement secret management in a GitHub Actions pipeline?
Never hardcode secrets in your pipeline files. GitHub Actions provides an encrypted Secrets store:
- Go to Repository Settings → Secrets and Variables → Actions → New Repository Secret.
- Reference in your workflow: ${{ secrets.MY_SECRET }}
- name: Deploy to AWS
  env:
    AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
    AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
  run: aws s3 sync ./dist s3://my-bucket
For more advanced use cases, use OIDC to get short-lived tokens from AWS/GCP instead of storing static credentials.
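On the consuming side, a deploy script receives those secrets as environment variables. The sketch below shows one defensive way a script might read them: fail early with a clear message when one is missing, and never print a full value. Both helper names are invented for this example.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def require_secrets(names: Iterable[str], env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the named secrets from the environment or raise if any is unset."""
    env = os.environ if env is None else env
    names = list(names)
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Report only the names, never the values.
        raise RuntimeError("Missing required secrets: " + ", ".join(missing))
    return {name: env[name] for name in names}


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters, for safe log output."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]
```

A script would call `require_secrets(["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"])` at startup so a misconfigured workflow fails before any deployment step runs.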
How do you secure a CI/CD pipeline from supply chain attacks?
Supply chain attacks (like SolarWinds, XZ Utils) target the build pipeline itself. Defense layers:
- Pin action versions: Use a commit SHA, not floating tags like @v2. Example: uses: actions/checkout@abc123
- SBOM generation: Generate a Software Bill of Materials at build time using Syft.
- Image signing: Sign images with Cosign (Sigstore). Verify signatures before deployment.
- Least privilege: GitHub Actions tokens should have minimal permissions. Set permissions: read-all by default.
- Dependency review: Use Dependabot or Renovate for automated dependency updates.
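The pinning rule is easy to enforce mechanically. The sketch below is a simple text scan, not a full YAML parser, that flags any `uses:` reference not pinned to a full 40-character commit SHA. The function name and regular expressions are assumptions for this example, and local (`./`) and `docker://` references are skipped because they are not Git refs.

```python
import re
from typing import List

# Matches the value of a `uses:` key, with or without a leading list dash.
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> List[str]:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned: List[str] = []
    for raw_ref in USES_RE.findall(workflow_text):
        ref = raw_ref.strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned
```

Run against every file under `.github/workflows/` in a CI job, a non-empty result can fail the build so that floating tags never reach the main branch.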
What is the difference between Continuous Integration, Continuous Delivery, and Continuous Deployment?
Continuous Integration (CI): Developers merge code frequently (multiple times a day). Every merge triggers an automated build and test run to catch integration issues early.
Continuous Delivery (CD): Every passing build is automatically prepared for release to production. A human approves the final deployment step.
Continuous Deployment: Extends Delivery — every passing build is automatically deployed to production with no human intervention.