Beginner Questions
Core concepts, syntax, and foundational command-line knowledge.
What is Infrastructure as Code (IaC) and what are its main benefits?
Infrastructure as Code means managing and provisioning infrastructure through machine-readable configuration files instead of manual processes.
Key benefits:
- Reproducibility: Spin up identical environments on demand.
- Version control: Track all infrastructure changes in Git. Know who changed what and when.
- Auditability: Compliance teams can review what infrastructure is being provisioned.
- Self-documentation: The code is the documentation.
- Disaster recovery: Re-create an entire environment from scratch in minutes.
Intermediate Questions
Infrastructure management, deployment strategies, and delivery flows.
How do you handle sensitive values like passwords in Terraform without exposing them in state?
Terraform state files contain sensitive values in plaintext — this is a known limitation. Mitigations:
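- Mark as sensitive: sensitive = true on variables and outputs prevents them from appearing in CLI output. It does not remove them from state.
- Avoid storing in state: Use AWS Secrets Manager or Vault to generate and store secrets externally. Reference via data source or environment variable, keeping in mind that any value Terraform reads through a data source is still written to state. Where possible, pass only the secret's name or ARN and let the application fetch the value at runtime.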
- Encrypt state: S3 backend with server-side encryption (SSE-KMS).
- Restrict access: The S3 bucket containing state should have strict IAM policies — only CI/CD roles should have access.
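The first two mitigations might look roughly like the sketch below. It is an illustration under stated assumptions, not a complete configuration: the variable name db_password and the secret name prod/db/password are hypothetical, and the secret is assumed to already exist in AWS Secrets Manager.

```hcl
# Hypothetical variable: marked sensitive so plan/apply output shows
# "(sensitive value)". The value still lands in state if it is passed
# to a resource argument.
variable "db_password" {
  description = "Database master password"
  type        = string
  sensitive   = true
}

# Look up only the secret's metadata (assumed name). This data source
# returns the ARN, not the secret value, so the value never enters state.
data "aws_secretsmanager_secret" "db" {
  name = "prod/db/password"
}

# Hand the ARN to whatever consumes it and let the application
# fetch the value at runtime.
output "db_secret_arn" {
  value = data.aws_secretsmanager_secret.db.arn
}
```

The design choice here is to keep Terraform responsible for wiring (names and ARNs) rather than for the secret material itself.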
How do Terraform modules work and what makes a good module?
A Terraform module is a reusable group of resource configurations. Every directory with .tf files is a module. You call modules from a root module to avoid repeating code.
What makes a good module:
- Single responsibility: One module for VPC, another for EKS, another for RDS.
- Parameterized: Accept variables to customize behavior per environment.
- Versioned: Pin module versions, using the version argument for registry modules or a ?ref= tag in the source attribute for Git sources.
- Outputs: Expose useful outputs (VPC ID, subnet IDs) for other modules to consume.
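As a sketch of these properties, the root module below calls one registry module and one Git-hosted module. The Git URL, the eks module's input names (vpc_id, subnet_ids), and the variable names are hypothetical and only illustrate the pattern.

```hcl
variable "environment" {
  type = string
}

variable "vpc_cidr" {
  type = string
}

# Registry module: pinned with the version argument.
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  # Parameterized per environment.
  name = "app-${var.environment}"
  cidr = var.vpc_cidr
}

# Git module (hypothetical repository): pinned with ?ref= in the source.
module "eks" {
  source = "git::https://example.com/infra/terraform-modules.git//eks?ref=v1.2.0"

  # Outputs of one module feed the inputs of another.
  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets
}
```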
What is Terraform state and why must it be stored remotely in a team environment?
Terraform state is a JSON file (terraform.tfstate) that maps your configuration to real-world resources. Terraform uses it to know what already exists before planning changes.
Storing it locally breaks team collaboration:
- Team members would each have different state files, causing conflicts.
- The state file is lost if the local machine breaks.
- There is no locking mechanism, so two engineers could run apply simultaneously and corrupt state.
Remote backends (S3 + DynamoDB for locking, GCS, Terraform Cloud) solve all three problems.
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-lock" # Table used for state locking
    encrypt        = true             # Server-side encryption of the state object
  }
}
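The bucket and lock table referenced by the backend have to exist before terraform init can use them. A minimal sketch of a separate bootstrap configuration that creates them is shown here; the resource labels are hypothetical, and enabling versioning is an added assumption (it makes earlier state versions recoverable) rather than something the backend requires.

```hcl
resource "aws_s3_bucket" "state" {
  bucket = "my-terraform-state"
}

# Assumption: keep old versions of the state object for recovery.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string partition key named LockID.
resource "aws_dynamodb_table" "lock" {
  name         = "terraform-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```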
Advanced Questions
Enterprise orchestration, deep architectural concepts, and scaling issues.
What are Terraform providers and how do you handle provider version pinning?
Providers are plugins that translate Terraform configuration into API calls to AWS, GCP, Azure, etc. Always pin provider versions to prevent unexpected changes from provider upgrades:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # Allows 5.x but not 6.x
    }
  }
  required_version = ">= 1.7.0"
}

provider "aws" {
  region = "us-east-1"
}
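terraform init creates or updates a .terraform.lock.hcl file that records the exact provider versions and checksums selected. Run terraform providers lock to add checksums for additional platforms (for example, when developers and CI run on different operating systems). Commit this file to Git.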
How do you manage multiple environments (dev/staging/prod) in Terraform? Workspaces vs. directory structure.
Two main approaches:
Terraform Workspaces: Use the same code but switch workspace to change state. Simple, but the same configuration and backend serve every environment, so per-environment values require -var-file flags or lookups keyed on terraform.workspace, and it is easy to apply to the wrong workspace. Suitable for simple differences.
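One common way to vary values per workspace is a map keyed on terraform.workspace. The sketch below assumes workspaces named dev, staging, and prod; the instance types are illustrative only.

```hcl
locals {
  # Hypothetical per-workspace settings.
  instance_types = {
    dev     = "t3.micro"
    staging = "t3.medium"
    prod    = "c5.2xlarge"
  }

  # Fall back to the smallest size for any workspace not listed above.
  instance_type = lookup(local.instance_types, terraform.workspace, "t3.micro")
}

output "selected_instance_type" {
  value = local.instance_type
}
```

This keeps a single code path, at the cost of burying environment differences inside lookups.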
Separate Directories (recommended): Each environment has its own directory with its own terraform.tfvars and remote state. This is explicit, auditable, and allows environments to diverge safely.
environments/
  dev/
    main.tf → calls shared module
    terraform.tfvars
  staging/
    main.tf
    terraform.tfvars
  prod/
    main.tf
    terraform.tfvars
modules/
  vpc/
  eks/
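Under this layout, an environment's main.tf could look something like the following sketch. The module input vpc_cidr is hypothetical, and the backend values simply reuse the bucket and table names from the earlier backend example with a per-environment key.

```hcl
# environments/dev/main.tf (sketch)
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "dev/terraform.tfstate" # One state key per environment
    region         = "us-east-1"
    dynamodb_table = "terraform-lock"
  }
}

variable "vpc_cidr" {
  type = string
}

# Shared module, referenced by relative path.
module "vpc" {
  source   = "../../modules/vpc"
  vpc_cidr = var.vpc_cidr # Hypothetical module input
}

# environments/dev/terraform.tfvars would then contain, for example:
# vpc_cidr = "10.0.0.0/16"
```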
Real Production Scenarios
Real-world architecture, system migration, and design challenges.
Explain the Terraform resource lifecycle and meta-arguments like create_before_destroy.
The lifecycle block gives you fine-grained control over how Terraform manages resource replacement:
resource "aws_instance" "web" {
  ami           = "ami-12345"
  instance_type = "t3.medium"

  lifecycle {
    create_before_destroy = true  # New instance created before old one is destroyed
    ignore_changes        = [ami] # Ignore external AMI changes
    prevent_destroy       = true  # Reject any plan that would destroy this resource, including a replacement
  }
}
create_before_destroy is critical for zero-downtime replacements. Without it, Terraform destroys the old resource first, creating a gap in availability.
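Because the old and new resources briefly coexist, create_before_destroy only works when the resource does not rely on a fixed unique name. A minimal sketch using a security group with name_prefix follows; the variable vpc_id is a hypothetical input.

```hcl
variable "vpc_id" {
  type = string
}

resource "aws_security_group" "web" {
  # name_prefix lets AWS generate a unique name, so the replacement
  # can be created while the old group still exists.
  name_prefix = "web-"
  vpc_id      = var.vpc_id

  lifecycle {
    create_before_destroy = true
  }
}
```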
How do you implement Terraform in a CI/CD pipeline safely?
Running Terraform in CI/CD requires careful guardrails:
- PR triggers plan: On every pull request, run terraform plan and post the output as a PR comment (using tools like Atlantis or terraform-pr-commenter).
- Merge triggers apply: Only apply after the PR is merged to main. Require manual approval for production.
- State locking: Ensure DynamoDB locking is configured to prevent concurrent applies.
- OIDC credentials: Use OIDC to get short-lived tokens from AWS instead of storing long-lived access keys.
- Plan artifacts: Save the plan file and apply that exact file — never re-plan at apply time.
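The OIDC guardrail can itself be expressed in Terraform. The sketch below assumes GitHub Actions as the CI system (the source does not name one), an OIDC provider already registered in the AWS account, and a hypothetical repository example-org/infrastructure; permissions policies for the role are omitted.

```hcl
# Assumption: the GitHub OIDC provider already exists in this account.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

# Trust policy: only workflows on the main branch of the hypothetical
# repository may assume the role, using short-lived web identity tokens.
data "aws_iam_policy_document" "ci_trust" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:example-org/infrastructure:ref:refs/heads/main"]
    }
  }
}

resource "aws_iam_role" "terraform_ci" {
  name               = "terraform-ci"
  assume_role_policy = data.aws_iam_policy_document.ci_trust.json
}
```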
What is the purpose of terraform.tfvars files?
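terraform.tfvars files provide values for your declared variables, keeping configuration separate from the variable definitions. This allows you to have different values per environment without modifying the core modules. Terraform loads terraform.tfvars and *.auto.tfvars automatically; any other file, such as production.tfvars, must be passed explicitly with -var-file.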
# variables.tf — defines the variable
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
}

# production.tfvars — provides the value
instance_type = "c5.2xlarge"

# development.tfvars
instance_type = "t3.micro"
Never commit .tfvars files containing sensitive values to Git. Use .gitignore and pass sensitive values via environment variables (TF_VAR_*) in CI/CD.
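Variable definitions can also guard against bad values arriving from a .tfvars file or a TF_VAR_* environment variable. A small sketch follows, assuming a hypothetical environment variable and an illustrative default; neither comes from the original example.

```hcl
variable "environment" {
  description = "Deployment environment"
  type        = string

  # Reject values outside the expected set at plan time.
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "The environment must be one of: dev, staging, prod."
  }
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro" # Used only when no .tfvars file or TF_VAR_instance_type sets a value
}
```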
What is Terraform state drift and how do you handle it?
State drift occurs when the real infrastructure differs from what Terraform state believes it to be — typically due to manual changes made in the AWS console or another tool.
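Detection: Run terraform plan; drift appears as changes you did not make in code. terraform plan -refresh-only shows only the differences between state and real infrastructure.
Resolution options:
- Import: terraform import to bring manually created resources into state.
- Refresh: terraform apply -refresh-only to update state to match reality (the standalone terraform refresh command is deprecated in favor of the -refresh-only mode).
- Accept drift: Use lifecycle { ignore_changes = [...] } for attributes that are intentionally managed externally.
Prevention: Restrict manual changes to production, for example with AWS Organizations service control policies (SCPs) and read-only console roles.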
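The import and accept-drift options can both be written declaratively. The sketch below uses an import block (available from Terraform 1.5) as an alternative to the terraform import command; the bucket name and resource label are hypothetical, and ignoring tags assumes an external tool manages them.

```hcl
# Adopt a manually created bucket into state on the next apply.
import {
  to = aws_s3_bucket.logs
  id = "example-logs-bucket" # Hypothetical bucket name
}

resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket"

  lifecycle {
    # Assumption: tags are managed by another tool, so changes to them
    # are not treated as drift.
    ignore_changes = [tags]
  }
}
```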
What are Terraform data sources and how do they differ from resources?
A resource creates, updates, or destroys infrastructure. A data source reads existing infrastructure that is managed outside of your current Terraform code — it is read-only.
# Data source — reads an existing VPC by tag, does not create it
data "aws_vpc" "main" {
  tags = {
    Environment = "production"
  }
}

# Use the data source output
resource "aws_subnet" "app" {
  vpc_id = data.aws_vpc.main.id
  # ...
}
Data sources are essential for referencing shared infrastructure managed by a different team or Terraform root module.
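When the shared infrastructure lives in another Terraform root module, its outputs can be read with the terraform_remote_state data source. The sketch below assumes the other root module stores state at a hypothetical key in the same S3 bucket and exposes an output named vpc_id; the CIDR block is illustrative.

```hcl
# Read the outputs of another root module's state (read-only).
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "my-terraform-state"
    key    = "prod/network/terraform.tfstate" # Hypothetical state key
    region = "us-east-1"
  }
}

resource "aws_subnet" "app" {
  vpc_id     = data.terraform_remote_state.network.outputs.vpc_id
  cidr_block = "10.0.10.0/24"
}
```

Compared with a tag-based lookup, this couples the consumer to the producer's state location, so it suits teams that already share a state bucket.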
What does terraform plan do and why should you always review it before applying?
terraform plan creates an execution plan — a preview of what Terraform will do before it actually makes changes. It shows additions, modifications, and destructions.
Always review the plan because:
- It may show unexpected destructions (e.g., a stateful database being replaced instead of modified)
- It catches misconfiguration before real infrastructure is affected
- In a CI/CD pipeline, save the plan output and apply that exact plan in the next step to ensure consistency
terraform plan -out=tfplan
terraform apply tfplan