Helping you be lazy—by distilling a decade of trial, error, and infrastructure patterns.
I began developing an approach to eliminate manual configuration and reduce complexity. With the early release of Terraform in 2014, it became clear this tool could lower the barrier to entry, minimize the blast radius of changes, and enable a more dynamic way to build and manage infrastructure. I introduced the first iteration in a SysAdvent article in 2016.
Since then, I’ve implemented this approach across multiple organizations—large and small—with lasting success. In every case, teams were able to maintain and evolve it long after I was gone. That kind of staying power is rare in infrastructure work.
In this post, I’ll cover the repository layout, the root module pattern, the terraform.sh wrapper that drives it, and how remote state ties modules together across the hierarchy.
The HCL formatting throughout follows my Terraform style guide, shaped by lessons learned and feedback from real-world rollouts.
The benefits follow directly from those goals: less manual configuration, a smaller blast radius for changes, a consistent layout across every account, and infrastructure that teams can maintain and evolve on their own.
GitOps is a modern DevOps practice that uses Git as the single source of truth for managing infrastructure and application configurations. It brings the principles of version control, collaboration, compliance, and automation to infrastructure operations, making it easier to manage complex systems reliably and at scale.
At the highest level, we maintain a dedicated GitHub repository to manage infrastructure. This is where Terraform runs and where we define our root modules—the entry points for applying changes.
The directory structure mirrors the AWS resource hierarchy. Top-level folders manage account-wide resources, and each nested directory scopes resources further—by region, VPC, environment, and finally, service. Each directory contains a root module that calls the relevant, versioned child module, providing consistency, isolation, and traceability at every layer.
A core tenet of this approach is consistency. Every account follows the same layout, and each root module shares a standardized pattern—making it easier to onboard new team members, enforce standards, and scale infrastructure reliably.
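To make the hierarchy concrete, here is a sketch of how such a repository might be laid out. The account and service names are taken from the examples later in this post; the VPC and environment names (main, production) are placeholders:

infrastructure/
  utilities/
    terraform.sh
  aws/
    curiqa-prod/                  # account-level root module
      us-east-1/                  # region-level root module
        main/                     # VPC-level root module
          production/             # environment-level root module
            curiqa-api/           # service-level root module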
Each root module consists of Terraform configuration files (backend.tf, providers.tf, main.tf, and versions.tf), a variable definition file (variables.tf), and an output values file (outputs.tf). The main.tf file is responsible for calling and configuring the relevant child module(s).
The final piece is terraform.sh, a Bash wrapper that handles several configuration steps and serves as the backbone of this approach. It’s shared across all root modules via a symlink to infrastructure/utilities/terraform.sh.
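Inside any one of those directories, a root module then looks roughly like this (a sketch; the generated files are covered next):

curiqa-api/
  .terraform-version
  backend.tf       # generated by terraform.sh
  providers.tf     # generated by terraform.sh
  versions.tf      # generated by terraform.sh
  main.tf          # calls the versioned child module
  variables.tf     # declares the path-derived input variables
  outputs.tf       # exports the child module's outputs
  terraform.sh     # symlink to infrastructure/utilities/terraform.sh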
The consistent folder structure and module composition provide a few key guarantees: every root module looks the same, and its position in the tree fully determines its account, region, VPC, environment, and service context. The terraform.sh wrapper builds on this by using the directory path to populate Terraform input variables, applying a convention-over-configuration approach.
The script accepts standard Terraform subcommands (apply, plan, destroy, import, etc.) and performs the following during root module setup:

- Installs (if needed) and selects the Terraform version pinned in the root module’s .terraform-version file (for example, 1.9.4) using tfenv
- Generates the providers.tf file and sets default tags
- Generates the backend.tf file and configures the remote state location
- Generates the versions.tf file with Terraform and AWS provider versions
- Sets TF_PLUGIN_CACHE_DIR so provider plugins are cached in $HOME/.terraform.d/plugin-cache and shared across root modules
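As a concrete example of the convention (hypothetical path; the VPC and environment names are placeholders), running the wrapper from a service-level directory derives the following:

# Working directory:
#   infrastructure/aws/curiqa-prod/us-east-1/main/production/curiqa-api
#
# Variables exported by terraform.sh:
#   TF_VAR_aws_account      = curiqa-prod
#   TF_VAR_aws_region       = us-east-1
#   TF_VAR_vpc_name         = main
#   TF_VAR_environment_name = production
#   TF_VAR_service_name     = curiqa-api
#
# Remote state key:
#   aws/us-east-1/main/production/curiqa-api/terraform.tfstate

At the account level, the path yields just the account name (the region falls back to the default), and the generated backend.tf looks like this: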
# Generated by terraform.sh
terraform {
backend "s3" {
bucket = "curiqa-prod-terraform-state-use1"
key = "aws/terraform.tfstate"
profile = "curiqa-prod"
region = "us-east-1"
}
}
module "account" {
source = "git@github.com:Curiqa/terraform-modules.git//account?ref=117dd812c"
aws_account = var.aws_account
domain_name = var.domain_name
}
output "account" {
value = module.account
}
# Generated by terraform.sh
provider "aws" {
profile = var.aws_account
region = var.aws_region
default_tags {
tags = {
aws_account = "curiqa-prod"
}
}
}
#!/bin/bash
ACCOUNT_REGION="us-east-1"
ACCOUNT_REGION_SHORTNAME="use1"
help_message() {
echo -e "Usage: $0 [apply|destroy|plan|refresh|show|state|output]\n"
echo -e "The following arguments are supported:"
echo -e "\tapply \t Refresh the Terraform remote state, perform a \"terraform get -update\", and issue a \"terraform apply\""
echo -e "\tdestroy \t Refresh the Terraform remote state and destroy the Terraform stack"
echo -e "\timport \t Refresh the Terraform remote state and import existing resources"
echo -e "\tplan \t Refresh the Terraform remote state, perform a \"terraform get -update\", and issues a \"terraform plan\""
echo -e "\toutput \t Refresh the Terraform remote state and perform a \"terraform output\""
echo -e "\trefresh \t Refresh the Terraform remote state"
echo -e "\tshow \t Refresh and show the Terraform remote state"
echo -e "\tstate \t Refresh the Terraform remote state and perform a \"terraform state\""
exit 1
}
###############################
#### Terraform subcommands ####
###############################
apply() {
plan $@
echo -e "\n\n***** Running \"terraform apply\" *****"
if [ $# -gt 1 ]; then
shift
terraform apply "$@" -auto-approve=true
else
terraform apply -auto-approve=true
fi
}
destroy() {
shift
plan -destroy $@
echo -e "\n\n***** Running \"terraform destroy\" *****"
terraform destroy "$@"
}
import() {
refresh
echo -e "\n\n***** Running \"terraform import\" *****"
shift
terraform import $@
}
mv() {
refresh
echo $@
shift
echo -e "\n\n***** Running \"terraform state mv\" *****"
terraform state mv "$@"
}
output() {
refresh
shift
terraform output $@
}
plan() {
refresh
echo -e "\n\n***** Running \"terraform plan\" *****"
if [[ $1 == -destroy ]]; then
terraform plan -detailed-exitcode $@
elif [ $# -gt 1 ]; then
shift
terraform plan -detailed-exitcode $@
else
terraform plan -detailed-exitcode
fi
}
show() {
refresh
echo -e "\n\n***** Running \"terraform show\" *****"
terraform show
}
state() {
refresh
shift
echo -e "\n\n***** Running \"terraform state $@\" *****"
terraform state $@
}
#####################################
#### Terraform Root Module Setup ####
#####################################
refresh() {
# Start providers config to set default tags
/bin/cat > providers.tf <<EOL
provider "aws" {
profile = var.aws_account
region = var.aws_region
default_tags {
tags = {
EOL
# split pwd on "/" and keep everything from the "aws" directory onward
IFS="/" read -ra PARTS <<< "$(pwd)"
for i in "${!PARTS[@]}"; do
if [ "${PARTS[$i]}" == "aws" ]; then
START_POS=$i
fi
done
PARTS=("${PARTS[@]:${START_POS}}")
# set account-level variables
if [ "${#PARTS[@]}" -ge "2" ]; then
root=${PARTS[0]}
aws_account=${PARTS[1]}
key="${root}/terraform.tfstate"
export TF_VAR_root="$root"
export TF_VAR_aws_account="$aws_account"
echo " aws_account = \"$aws_account\"" >> providers.tf
fi
# set region-level variables
if [ "${#PARTS[@]}" -ge "3" ]; then
aws_region=${PARTS[2]}
key="${root}/${aws_region}/terraform.tfstate"
export TF_VAR_aws_region="$aws_region"
echo " aws_region = \"$aws_region\"" >> providers.tf
fi
# set vpc-level variables
if [ "${#PARTS[@]}" -ge "4" ]; then
vpc_name=${PARTS[3]}
key="${root}/${aws_region}/${vpc_name}/terraform.tfstate"
export TF_VAR_vpc_name="$vpc_name"
echo " vpc_name = \"$vpc_name\"" >> providers.tf
fi
# set environment-level variables
if [ "${#PARTS[@]}" -ge "5" ]; then
environment_name=${PARTS[4]}
key="${root}/${aws_region}/${vpc_name}/${environment_name}/terraform.tfstate"
export TF_VAR_environment_name="$environment_name"
echo " environment_name = \"$environment_name\"" >> providers.tf
echo " env = \"$environment_name\"" >> providers.tf
fi
# set service-level variables
if [ "${#PARTS[@]}" -ge "6" ]; then
service_name=${PARTS[5]}
key="${root}/${aws_region}/${vpc_name}/${environment_name}/${service_name}/terraform.tfstate"
export TF_VAR_service_name="$service_name"
echo " service_name = \"$service_name\"" >> providers.tf
echo " service = \"$service_name\"" >> providers.tf
fi
# Complete providers config
/bin/cat >> providers.tf <<EOL
}
}
}
EOL
# Set region for account level applies
[ ! -z "$aws_region" ] || TF_VAR_aws_region="${ACCOUNT_REGION}"
export TF_PLUGIN_CACHE_DIR="$HOME/.terraform.d/plugin-cache"
tf_init
}
tf_init() {
create_versions_config
create_backend_config
echo -e "\n\n***** Refreshing State and Upgrading Modules *****"
terraform init -get=true \
-upgrade \
-input=false
[ $? -eq 0 ] || exit 1
}
create_versions_config() {
[ -f ".terraform-version" ] || { echo ".terraform-version is required, exiting" && exit 1; }
v=$(cat .terraform-version)
tfenv use $v > /dev/null 2>&1 || { tfenv install $v && tfenv use $v; }
/bin/cat > versions.tf <<EOL
terraform {
required_version = "${v}"
required_providers {
aws = {
version = "~> 5.0"
}
}
}
EOL
}
create_backend_config() {
local bucket="${aws_account}-terraform-state-$ACCOUNT_REGION_SHORTNAME"
/bin/cat > backend.tf <<EOL
terraform {
backend "s3" {
bucket = "${bucket}"
key = "${key}"
profile = "${aws_account}"
region = "${ACCOUNT_REGION}"
}
}
EOL
}
## Begin script ##
unset AWS_ACCESS_KEY_ID AWS_SESSION_TOKEN AWS_SECRET_ACCESS_KEY
if [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
help_message
fi
[ -d $HOME/.terraform.d/plugin-cache ] || mkdir -p $HOME/.terraform.d/plugin-cache
ACTION="$1"
case $ACTION in
apply|destroy|import|plan|output|refresh|show|sso_login|state)
$ACTION $@
;;
*)
echo "That is not a valid choice."
help_message
;;
esac
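Putting the pieces together, a typical run from a root module directory looks something like this (the path is hypothetical, and output is elided):

cd infrastructure/aws/curiqa-prod/us-east-1/main/production/curiqa-api
./terraform.sh plan    # regenerates providers.tf, backend.tf, versions.tf, runs terraform init, then plan
./terraform.sh apply   # same setup, then terraform apply -auto-approve=true

The path-derived values are exported as TF_VAR_* environment variables, which satisfy the variable declarations in each root module. The account-level root module’s variables.tf, for example, simply declares what it expects, plus module-specific values such as domain_name: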
variable "aws_account" {}
variable "aws_region" {}
variable "domain_name" {
default = "curiqa.com"
}
# Generated by terraform.sh
terraform {
required_version = "1.0.9"
required_providers {
aws = {
version = "~> 5.0"
}
}
}
Terraform creates resources defined in child modules, and any output values declared within those modules are available only within the root module’s context. These outputs are not directly accessible via remote state unless they are explicitly exposed by the root module.
In this setup, the root module’s outputs.tf file exports the entire child module’s outputs as a single object. This approach lets us define output values once in the child module and makes them available to other modules through terraform_remote_state.
# Child module outputs.tf file
output "admin_iam_role_arn" {
value = aws_iam_role.admin_iam_role.arn
}
output "domain_name" {
value = var.domain_name
}
...
# Root module outputs.tf file
output "account" {
value = module.account
}
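Because the root module re-exports the whole child module object, any consumer can reach those values through a single remote state data source. A minimal sketch of the consuming side (the data source configuration mirrors the account example shown later in this post):

# Consuming module
data "terraform_remote_state" "account" {
  backend = "s3"
  config = {
    bucket  = "${var.aws_account}-terraform-state-use1"
    key     = "aws/terraform.tfstate"
    profile = var.aws_account
    region  = "us-east-1"
  }
}

# The exported object keeps the child module's output names
locals {
  admin_iam_role_arn = data.terraform_remote_state.account.outputs.account.admin_iam_role_arn
  domain_name        = data.terraform_remote_state.account.outputs.account.domain_name
}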
This sets the stage for discussing child modules—and how dynamic variable naming based on folder structure becomes a powerful pattern for scaling and reuse.
The power behind this approach lies in Terraform’s remote state as a data source. In the past, configuration values lived in files, or you could use something like Consul or ZooKeeper as a discovery tool. While those tools are great for managing dynamic configurations, we want to lean into AWS’s native services to handle the dynamic environment for us.
Let’s bring it all together.
We’ll start by looking at the main.tf and variables.tf files in a service’s root module, similar to how you might structure the deployment of an application. From there, we’ll explore how these variables are passed into the service’s child module, and how that child module can use terraform_remote_state to pull in outputs from higher-level root modules such as account, region, vpc, and environment.
The service’s root module lives at the deepest level of the hierarchy (for example, aws/&lt;account&gt;/&lt;region&gt;/&lt;vpc&gt;/&lt;environment&gt;/&lt;service&gt;). The main.tf and variables.tf files live in this directory, along with the previously mentioned files required for a root module. Since the Terraform script dynamically populates the variables based on the directory path, we never need to manually assign values to the baseline variables.
# Root service module variables.tf
variable "aws_account" {}
variable "aws_region" {}
variable "vpc_name" {}
variable "environment_name" {}
variable "service_name" {}
...
# Root service module main.tf
module "curiqa_api" {
source = "git@github.com:Curiqa/curiqa-api.git//terraform?ref=b2b45e"
aws_account = var.aws_account
environment_name = var.environment_name
service_name = var.service_name
vpc_name = var.vpc_name
...
}
In this final section, we’ll focus on how the folder structure enables easy access to remote state from other modules. This is just the beginning—diving deeper into module composition is beyond the scope of this post.
For this example, the variables.tf file in our child service module is essentially a copy of the root module’s. Because the folder structure in the infrastructure repo mirrors the S3 key structure used for remote state, we can easily construct remote state data sources with predictable, consistent paths.
data "aws_region" "current" {}
data "aws_caller_identity" "current" {}
# Import Account Data
data "terraform_remote_state" "account" {
backend = "s3"
config = {
bucket = "${var.aws_account}-terraform-state-use1"
key = "aws/terraform.tfstate"
profile = var.aws_account
region = "us-east-1"
}
}
# Import Region Data
data "terraform_remote_state" "region" {
backend = "s3"
config = {
bucket = "${var.aws_account}-terraform-state-use1"
key = "aws/${data.aws_region.current.name}/terraform.tfstate"
profile = var.aws_account
region = "us-east-1"
}
}
# Import VPC Data
data "terraform_remote_state" "vpc" {
backend = "s3"
config = {
bucket = "${var.aws_account}-terraform-state-use1"
key = "aws/${data.aws_region.current.name}/${var.vpc_name}/terraform.tfstate"
profile = var.aws_account
region = "us-east-1"
}
}
# Import Environment Data
data "terraform_remote_state" "environment" {
backend = "s3"
config = {
bucket = "${var.aws_account}-terraform-state-use1"
key = "aws/${data.aws_region.current.name}/${var.vpc_name}/${var.environment_name}/terraform.tfstate"
profile = var.aws_account
region = "us-east-1"
}
}
The child modules contain configuration blocks used to create infrastructure objects. The HCL block below creates a CloudWatch event rule and demonstrates how to reference an output value from the terraform_remote_state of a region module: data.terraform_remote_state.region.outputs.region.aws_region_shortname.
resource "aws_cloudwatch_event_rule" "cloudwatch_event_rule" {
name = "${var.environment_name}-step-function-event-rule-${data.terraform_remote_state.region.outputs.region.aws_region_shortname}"
description = "Trigger step function"
schedule_expression = var.schedule_expression
}
By aligning folder structure with remote state key paths, we gain a predictable and scalable way to wire modules together—without hardcoding values or introducing brittle logic. The result is a system where even complex infrastructures remain composable, traceable, and easy to evolve.
This post focused on the structural and operational foundations—hierarchy, root modules, child modules, and the tooling that ties it all together. There’s much more to explore in areas like module composition, authentication, and other config management, which I’ll cover in future posts.
Until then, you can check out the Terraform style guide for more detailed conventions and examples.