Terraform in 2026: Complete Infrastructure as Code Guide for Cloud Engineers

In This Article

  1. What Infrastructure as Code Is and Why It Matters
  2. Terraform vs Pulumi vs CloudFormation vs CDK
  3. HCL Syntax Basics
  4. Providers: AWS, Azure, and GCP
  5. State Management: Local vs Remote Backend
  6. Modules and Reusable Infrastructure
  7. Terraform Cloud for Teams
  8. Terraform for AI Infrastructure
  9. CI/CD with Terraform
  10. The HashiCorp Licensing Change and OpenTofu Fork
  11. Frequently Asked Questions

Key Takeaways

Infrastructure as Code is no longer an advanced specialty reserved for platform engineering teams at large companies. In 2026, it is a baseline expectation for any cloud engineer, DevOps practitioner, or backend developer who deploys to the cloud with any regularity. And in that world, Terraform is the most important tool to understand.

This guide covers everything you need to go from zero to productive with Terraform: what it does and why it exists, how it compares to every major competitor, the HCL syntax you will write every day, state management done right, modules for reusable infrastructure, Terraform Cloud for team workflows, provisioning AI infrastructure including GPU instances, CI/CD integration, and the licensing fork that split the community in 2023 and what it means for you today.

What Infrastructure as Code Is and Why It Matters

Infrastructure as Code (IaC) means your cloud resources — servers, databases, networks, load balancers — are defined in version-controlled text files, so every environment can be reproduced exactly, every change is reviewed and auditable, and standing up a new copy of your entire infrastructure takes minutes instead of days of console clicking. Instead of clicking through a cloud console to provision resources, you write a declaration of what you want, and the IaC tool figures out what API calls to make to create it.

The benefits are not theoretical. They are operational:

More than 3,000 Terraform providers are available in the registry, covering AWS, Azure, GCP, Kubernetes, Datadog, PagerDuty, Cloudflare, and hundreds more. No competing IaC tool comes close to that coverage, which is why Terraform can manage multi-cloud and SaaS resources in a single workflow.

Terraform specifically uses a declarative approach: you describe the desired end state of your infrastructure, not the steps to get there. If a resource already exists and matches your config, Terraform does nothing. If it needs to change, Terraform plans the minimal set of changes. This is fundamentally different from writing imperative shell scripts that create resources, which fail unpredictably when run a second time.
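As a minimal sketch of the declarative model (the bucket name is hypothetical), this config describes an end state, not a procedure:

```hcl
# Desired end state: one S3 bucket with versioning enabled.
resource "aws_s3_bucket" "artifacts" {
  bucket = "example-artifacts-bucket" # hypothetical; bucket names are globally unique
}

resource "aws_s3_bucket_versioning" "artifacts" {
  bucket = aws_s3_bucket.artifacts.id
  versioning_configuration {
    status = "Enabled"
  }
}
```

Running terraform apply a second time compares this declaration against the real bucket, finds no drift, and reports zero changes. An equivalent imperative script would need explicit existence checks to be safely re-runnable.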

Terraform vs Pulumi vs CloudFormation vs CDK

Choose Terraform (HCL, multi-cloud, 3,000+ providers) for most teams; Pulumi (TypeScript/Python/Go) when your team wants general-purpose programming languages instead of a DSL; CloudFormation or CDK when you are AWS-only and want deep native integration with no state file management; OpenTofu as the MIT-licensed Terraform fork if the BSL license is a concern. Here is a clear breakdown of the four tools most engineering teams evaluate:

| Tool | Language | Multi-Cloud | State Mgmt | Learning Curve | Best For |
|---|---|---|---|---|---|
| Terraform | HCL (declarative DSL) | Yes | Self-managed or Terraform Cloud | Low–Medium | Multi-cloud teams, most orgs |
| Pulumi | TypeScript, Python, Go, C# | Yes | Pulumi Cloud or self-hosted | Medium (depends on language) | Teams wanting full programming languages |
| CloudFormation | JSON or YAML | AWS only | Fully managed (AWS) | Medium–High | AWS-only shops, compliance-heavy orgs |
| AWS CDK | TypeScript, Python, Java, Go | AWS only | Via CloudFormation | Medium | AWS-native teams with dev backgrounds |

The practical answer for most teams is Terraform. It has the largest community, the most comprehensive provider ecosystem, and the most demand in the job market. The declarative HCL syntax is easier to read and review than YAML-heavy CloudFormation, and it works across AWS, Azure, GCP, and dozens of SaaS providers simultaneously.

Pulumi is worth evaluating if your team has deep TypeScript or Python expertise and needs complex conditional logic, loops, or dynamic resource generation. Pulumi's programming language integration is genuinely more powerful for certain advanced use cases. But the majority of infrastructure does not need that complexity, and for most teams, Terraform's explicit declarative syntax is clearer in code review.

CloudFormation in 2026

CloudFormation is not going away — AWS uses it under the hood for CDK and many managed services. But writing raw CloudFormation YAML in 2026, when Terraform and CDK are both available, is a choice most teams have moved away from. The verbosity is substantial and the debugging experience is worse than either alternative. If you are AWS-only and prefer a programming language, CDK is the better choice over raw CloudFormation.

HCL Syntax Basics

HCL is Terraform's declarative DSL — you write resource blocks that declare what you want (aws_instance, aws_s3_bucket, aws_rds_cluster), variable blocks that parameterize values, output blocks that expose results, and data blocks that reference existing resources. Run terraform plan to see what will change, terraform apply to execute, and terraform destroy to tear down. HCL is readable, concise, and designed specifically for infrastructure descriptions.

Resources — The Core Building Block

A resource block declares a single infrastructure object. The block type is resource, followed by the provider resource type and a local name you give it.

main.tf — Basic AWS EC2 instance
```hcl
# Declare an EC2 instance
resource "aws_instance" "web_server" {
  ami           = "ami-0c55b159cbfafe1f0"
  instance_type = var.instance_type

  tags = {
    Name        = "web-server-${var.environment}"
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

# Reference the instance in an output
output "web_server_ip" {
  value       = aws_instance.web_server.public_ip
  description = "Public IP of the web server"
}
```

Variables — Parameterizing Your Config

variables.tf — Input variable declarations
```hcl
variable "environment" {
  description = "Deployment environment (dev, staging, prod)"
  type        = string
  default     = "dev"

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}
```

Data Sources — Reading Existing Resources

data.tf — Look up an existing VPC and AMI
```hcl
# Read an existing VPC by tag
data "aws_vpc" "main" {
  tags = {
    Name = "main-vpc"
  }
}

# Look up the latest Ubuntu 22.04 AMI
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

# Reference data sources in resources. Note that aws_instance has no
# vpc_id argument -- the VPC is reached through a subnet in that VPC.
data "aws_subnets" "main" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.main.id]
  }
}

resource "aws_instance" "app" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  subnet_id     = data.aws_subnets.main.ids[0]
}
```

The three core workflow commands you run constantly: terraform init (download providers and set up backends), terraform plan (show what will change without making any changes), and terraform apply (execute the plan after reviewing it). Always review the plan before applying. Always.

Providers: AWS, Azure, and GCP

Providers are plugins that allow Terraform to interact with specific cloud platforms and services. Each provider exposes resource types and data sources specific to that platform. Configuring a provider is the first thing in any Terraform config.

providers.tf — Multi-cloud provider config
```hcl
terraform {
  required_version = ">= 1.7.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

provider "azurerm" {
  features {}
}

provider "google" {
  project = var.gcp_project_id
  region  = var.gcp_region
}
```

Version constraints on providers matter. Pinning to a major version with ~> 5.0 allows patch and minor updates but prevents breaking major version upgrades from running unexpectedly. Always pin provider versions in production configs.
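The common constraint operators behave like this (the tighter pin below is illustrative, not a recommendation for a specific release):

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
      # "~> 5.0"        allows any 5.x release (>= 5.0, < 6.0)
      # "~> 5.31.0"     allows only patch releases (>= 5.31.0, < 5.32.0)
      # ">= 5.0, < 6.0" is the explicit form of "~> 5.0"
      version = "~> 5.31.0" # illustrative patch-level pin
    }
  }
}
```

Terraform also records the exact provider versions and checksums it selected in .terraform.lock.hcl, so the same versions are used on every machine that runs init.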

Learn Terraform, AI tools, and cloud engineering hands-on.

Three days. Real infrastructure. From IaC fundamentals to deploying AI model endpoints — with engineers who do this work, not just teach it.

Reserve Your Seat

$1,490 · Denver · Los Angeles · New York City · Chicago · Dallas · October 2026

State Management: Local vs Remote Backend

Never use local Terraform state in team environments — use an S3 remote backend with DynamoDB state locking. The S3 backend stores terraform.tfstate encrypted in an S3 bucket; the DynamoDB table prevents two engineers from running terraform apply simultaneously, which would corrupt state. Local state is only acceptable for solo experimentation. Terraform state is the mechanism by which Terraform tracks what resources it manages, comparing config against current state to determine what needs to change on every apply.

By default, Terraform writes state to a local file. This works fine for learning and solo projects. It is a serious problem in team environments for two reasons: multiple engineers running terraform apply simultaneously will corrupt state, and anyone who does not have the state file cannot manage the infrastructure.

Remote Backend with S3 and DynamoDB (AWS)

backend.tf — S3 remote backend with DynamoDB locking
```hcl
terraform {
  backend "s3" {
    bucket  = "my-company-terraform-state"
    key     = "prod/main/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true

    # DynamoDB table for state locking
    dynamodb_table = "terraform-state-locks"
  }
}

# The locking table itself (created separately, before the backend config)
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-state-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Never Commit terraform.tfstate to Version Control

State files contain sensitive information: passwords, connection strings, private keys embedded in resource attributes. They also contain resource IDs that, if corrupted, can cause Terraform to lose track of existing resources and attempt to recreate them. Add *.tfstate and *.tfstate.backup to your .gitignore immediately. Use a remote backend for any shared environment.
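A typical .gitignore for a Terraform repository looks like the sketch below. Treating *.tfvars as secret is a judgment call, not a universal rule; if yours hold no secrets, commit them.

```
# Never commit state or local working directories
*.tfstate
*.tfstate.backup
.terraform/
crash.log

# Often ignored because they hold secrets; commit a *.tfvars.example instead
*.tfvars

# Note: .terraform.lock.hcl SHOULD be committed -- it pins provider checksums
```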

State Commands You Need to Know

terraform state list shows every resource tracked in the current state, and terraform state show <address> prints the recorded attributes of one of them. terraform state mv <source> <destination> renames or moves a resource in state without destroying it, while terraform state rm <address> stops tracking a resource without deleting the real infrastructure. Finally, terraform import <address> <id> brings an existing, manually created resource under Terraform management.

Modules and Reusable Infrastructure

Modules are the primary mechanism for reuse in Terraform. A module is simply a directory of .tf files with defined inputs and outputs. You call a module from another configuration, pass it variables, and it provisions a consistent set of resources.

The practical value is significant: instead of writing an S3 bucket configuration with all the right encryption, access control, and lifecycle policies from scratch in every project, you write it once as a module, publish it, and call it everywhere.

main.tf — Calling a module from the Terraform Registry
```hcl
# Call the official AWS VPC module
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "my-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway = true
  enable_vpn_gateway = false

  tags = local.common_tags
}

# Reference module outputs in other resources
resource "aws_eks_cluster" "main" {
  name     = "my-cluster"
  role_arn = aws_iam_role.eks.arn

  vpc_config {
    subnet_ids = module.vpc.private_subnets
  }
}
```

The Terraform Registry contains thousands of community and official modules for common patterns: EKS clusters, RDS instances, VPCs, Lambda functions, and more. Using battle-tested registry modules for standard infrastructure patterns is almost always faster and more reliable than writing everything from scratch.
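For internal patterns, a local module is just a directory of .tf files. A hypothetical layout and call might look like this (the path, input names, and the arn output are assumptions for illustration):

```hcl
# Directory layout (hypothetical):
#   modules/s3-bucket/main.tf       -- resources
#   modules/s3-bucket/variables.tf  -- inputs
#   modules/s3-bucket/outputs.tf    -- outputs

# Calling the local module from a root configuration:
module "logs_bucket" {
  source      = "../modules/s3-bucket" # local path, not a registry address
  bucket_name = "example-logs"         # hypothetical input variable
  environment = var.environment
}

output "logs_bucket_arn" {
  value = module.logs_bucket.arn # assumes the module defines an "arn" output
}
```

The module's encryption, access control, and lifecycle settings live in one place; every caller gets them for free.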

Terraform Cloud for Teams

Terraform Cloud (now rebranded as HCP Terraform under HashiCorp's product consolidation) is a managed platform that solves the operational problems of running Terraform in a team: remote state storage with locking, a collaborative run interface, policy as code via Sentinel, private module registry, and audit logging.

Free tier: up to 5 users, with remote state, runs, and basic collaboration at no cost
Plus tier: $20 per user per month, adding SSO, audit logs, and Sentinel policies
VCS integration: connects to GitHub, GitLab, and Bitbucket; plans automatically on PR, applies on merge

For small teams, Terraform Cloud's free tier is the easiest path to a production-safe workflow. You get remote state with locking, a shared run history, and VCS integration without managing any backend infrastructure. Larger teams with compliance requirements will want the Sentinel policy framework, which lets you write rules like "all S3 buckets must have encryption enabled" and enforce them before any plan can apply.
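Pointing an existing configuration at Terraform Cloud takes one block (the organization and workspace names here are hypothetical):

```hcl
terraform {
  cloud {
    organization = "example-org" # hypothetical organization name

    workspaces {
      name = "prod-infrastructure"
    }
  }
}
```

After terraform login and terraform init, state storage, locking, and run history move to the remote workspace with no other changes to your resources.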

Self-Hosted Alternative: Atlantis

Atlantis is an open-source Terraform pull request automation tool that you host yourself. It runs terraform plan on every PR and posts the output as a comment, then runs terraform apply when you merge. Many teams use Atlantis instead of Terraform Cloud when they need to keep everything on-premise or want to avoid vendor lock-in. It integrates with GitHub, GitLab, and Bitbucket and is widely used at organizations that cannot use SaaS tooling for compliance reasons.
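A minimal repo-level Atlantis config is a sketch like the following; the project name and directory layout are hypothetical:

```yaml
# atlantis.yaml at the repository root
version: 3
projects:
  - name: production
    dir: environments/prod
    autoplan:
      # Re-plan when the project's own files or shared modules change
      when_modified: ["*.tf", "../../modules/**/*.tf"]
    # Require PR approval and a mergeable PR before `atlantis apply` runs
    apply_requirements: [approved, mergeable]
```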

Terraform for AI Infrastructure

One of the most practically important use cases for Terraform in 2026 is provisioning AI and ML infrastructure: GPU-backed instances for training, managed model endpoints, vector databases, and the networking required to connect them. This infrastructure is expensive, configuration-sensitive, and exactly the kind of thing you want version-controlled and repeatable.

Provisioning GPU Instances on AWS

ai-infra.tf — GPU instance for model training
```hcl
# p3.2xlarge:   1x NVIDIA V100, 16GB GPU memory
# p4d.24xlarge: 8x NVIDIA A100, 320GB GPU memory (training at scale)
resource "aws_instance" "gpu_trainer" {
  ami                    = data.aws_ami.deep_learning.id
  instance_type          = "g5.xlarge" # NVIDIA A10G, cost-effective for fine-tuning
  subnet_id              = module.vpc.private_subnets[0]
  vpc_security_group_ids = [aws_security_group.gpu.id]
  iam_instance_profile   = aws_iam_instance_profile.gpu.name

  root_block_device {
    volume_size = 200
    volume_type = "gp3"
    encrypted   = true
  }

  tags = {
    Name     = "gpu-trainer-${var.run_id}"
    Purpose  = "model-training"
    AutoStop = "true" # Tag for automated shutdown lambda
  }
}

# SageMaker endpoint for model inference
resource "aws_sagemaker_endpoint" "inference" {
  name                 = "llm-inference-${var.environment}"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.llm.name
}
```

Terraform for AI Platform Infrastructure Patterns

The GPU Cost Problem

A single p4d.24xlarge instance costs roughly $32 per hour on demand. Left running over a weekend by accident, that one 8x A100 instance burns through more than $2,000. Terraform helps because infrastructure-as-code makes it easy to include auto-shutdown mechanisms, enforce maximum instance lifetimes via IAM conditions, and audit exactly what is running and for how long. Teams that manage GPU infrastructure manually (clicking in the console) consistently pay more than those using IaC with enforced cost controls.
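One hedged sketch of an auto-shutdown control: a CloudWatch alarm that stops an idle trainer using the built-in EC2 stop alarm action. The threshold, evaluation window, and region here are assumptions to tune, and CPU is only a rough idle proxy for GPU jobs (GPU utilization from the CloudWatch agent or DCGM is a better signal).

```hcl
resource "aws_cloudwatch_metric_alarm" "gpu_idle_stop" {
  alarm_name          = "gpu-trainer-idle-stop"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  period              = 300 # seconds per datapoint
  evaluation_periods  = 6   # 30 minutes below threshold fires the alarm
  comparison_operator = "LessThanThreshold"
  threshold           = 5   # percent CPU; assumed idle level, tune for your workload

  dimensions = {
    InstanceId = aws_instance.gpu_trainer.id
  }

  # Built-in EC2 alarm action: stop this instance (region is an assumption)
  alarm_actions = ["arn:aws:automate:us-east-1:ec2:stop"]
}
```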

CI/CD with Terraform

The standard Terraform CI/CD workflow in 2026 is: run terraform plan on every pull request, post the plan output as a PR comment for review, then run terraform apply automatically on merge to the main branch. This ensures every infrastructure change is reviewed before it reaches production and creates a complete audit trail of what changed and when.

.github/workflows/terraform.yml — GitHub Actions workflow
```yaml
name: Terraform

on:
  pull_request:
    branches: [main]
  push:
    branches: [main]

jobs:
  terraform:
    runs-on: ubuntu-latest
    env:
      # Credentials at the job level so init, plan, and apply can all reach AWS
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
    steps:
      - uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: "1.7.0"

      - name: Terraform Init
        run: terraform init

      - name: Terraform Plan (on PR)
        if: github.event_name == 'pull_request'
        run: terraform plan -no-color -out=tfplan

      - name: Terraform Apply (on merge)
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -auto-approve
```

Key practices for production Terraform CI/CD: run terraform fmt -check and terraform validate as fast early checks, post the plan output as a PR comment so reviewers see exactly what will change, prefer short-lived OIDC credentials over long-lived access keys stored as secrets, pin the Terraform and provider versions used in CI so local and pipeline runs agree, and never run apply against production from a developer laptop.

The HashiCorp Licensing Change and the OpenTofu Fork

In August 2023, HashiCorp announced that Terraform — which had been licensed under the Mozilla Public License (MPL 2.0), a true open-source license — would be relicensed under the Business Source License (BSL). The BSL is not an open-source license. It prohibits using Terraform to build products that compete with HashiCorp's commercial offerings.

The response from the community was immediate. Within weeks, a group of companies and individual contributors announced OpenTofu, a fork of Terraform under the MPL 2.0 license. OpenTofu is now governed by the Linux Foundation and actively maintained by a consortium of companies including Spacelift, env0, Scalr, and others.

Where OpenTofu Stands in 2026

OpenTofu 1.8 and beyond have diverged meaningfully from HashiCorp's Terraform in some areas, adding features like early variable evaluation and provider-defined functions that Terraform has not shipped. The two tools remain highly compatible for the vast majority of configurations — standard HCL, providers, modules, and state files work with both. The choice between them is primarily driven by organizational policy (BSL compliance requirements) and tooling ecosystem preference, not technical capability differences in most cases.

For most engineers learning IaC today, the practical answer is straightforward: the HCL you write works on both Terraform and OpenTofu. Learn the concepts and syntax — they transfer completely. If your employer or a government contract requires open-source licensing, OpenTofu is the choice. If you are in a commercial environment with no BSL restrictions and your organization is already on Terraform, there is no urgent reason to migrate.

"The fork did not fracture the community — it validated it. The fact that thousands of engineers could fork the project, find a foundation to host it, and ship production-quality releases within months is a testament to how mature the IaC ecosystem has become."

The longer-term competitive pressure from the OpenTofu fork has also influenced HashiCorp's behavior — the company has been more careful about licensing terms and more transparent about its roadmap since the fork. Competition, even within open-source governance, is healthy.

From IaC fundamentals to production deployments.

The Precision AI Academy bootcamp covers Terraform, cloud infrastructure, AI tooling, and deployment workflows in three intensive days. Small cohort. Hands-on from day one. Real engineers as instructors.

Reserve Your Seat — $1,490

Denver · Los Angeles · New York City · Chicago · Dallas · October 2026

The bottom line: Terraform is the highest-ROI IaC skill in 2026 — learn HCL, use S3+DynamoDB remote state with locking from day one, organize infrastructure into reusable modules, and run terraform plan in CI before every merge. The OpenTofu fork is fully compatible with existing configs if BSL licensing is a concern. At minimum, every cloud engineer should be able to write a Terraform module that provisions a VPC, an RDS instance, and an ECS service — that combination covers 80% of what most teams deploy.

Frequently Asked Questions

Is Terraform still worth learning in 2026?

Yes — Terraform remains the most widely adopted Infrastructure as Code tool in 2026, used by the majority of cloud engineering teams across AWS, Azure, and GCP. Despite competition from Pulumi and CDK, Terraform's declarative HCL syntax, massive provider ecosystem (over 3,000 providers), and deep community knowledge base make it the safest and most valuable IaC skill to invest in. The OpenTofu fork has also resolved most concerns about the licensing change, giving teams a fully open-source alternative that is compatible with existing Terraform configurations.

What is the difference between Terraform and OpenTofu?

OpenTofu is a community-maintained, open-source fork of Terraform created in response to HashiCorp's 2023 decision to change Terraform's license from the Mozilla Public License (MPL) to the Business Source License (BSL). OpenTofu is governed by the Linux Foundation and is fully compatible with existing Terraform HCL configurations, state files, and providers for the vast majority of use cases. The choice between them is primarily a matter of organizational policy: teams that cannot use BSL-licensed software — including many government agencies and open-source projects — use OpenTofu. Teams without that constraint often continue with HashiCorp's Terraform.

Should I use Terraform or Pulumi in 2026?

For most cloud engineers, Terraform is the better starting point in 2026 due to its dominant market share, more comprehensive job demand, and the fact that HCL is easier to read and review than infrastructure written in a general-purpose language. Pulumi is the stronger choice when your team has deep TypeScript or Python expertise and needs complex conditional logic, loops, or dynamic resource generation that is cumbersome in HCL. Pulumi's programming language integration is genuinely more powerful for advanced use cases — but the majority of infrastructure does not need that complexity, and Terraform handles it with less overhead and more readability in code review.

How do I manage Terraform state safely in a team environment?

Managing Terraform state safely in a team requires a remote backend with state locking. The most common approach on AWS is an S3 bucket for state storage with DynamoDB for distributed locking — this prevents two engineers from running terraform apply simultaneously and corrupting the state file. On Azure, Azure Blob Storage with native locking works equivalently. Terraform Cloud and HCP Terraform provide managed state storage with locking, access controls, and run history out of the box. The critical rule is: never store terraform.tfstate locally in a team environment, and never commit it to version control. State files contain sensitive credentials and resource mappings that, if corrupted, can cause Terraform to recreate existing resources.



Bo Peng

AI Instructor & Founder, Precision AI Academy

Bo has trained 400+ professionals in applied AI across federal agencies and Fortune 500 companies. Former university instructor specializing in practical AI tools for non-programmers. Kaggle competitor and builder of production AI systems. He founded Precision AI Academy to bridge the gap between AI theory and real-world professional application.
