Terraform Modules As An Internal Platform: How To Build A Self-Service Infrastructure Layer

The team has 15 services. Each one’s Terraform is a copy-modify of a previous service’s Terraform. When AWS changes a default or a security policy needs updating, somebody has to touch 15 directories. New services take a week to bootstrap because they’re cargo-cult-copying everything from the most recent service that “looked similar.”

The fix is internal Terraform modules: one reusable building block per common pattern. A new service becomes 10 lines of HCL calling the right modules. Updates propagate by bumping a version. The module is the platform.

This post covers module design that scales: composition, versioning, testing, and the four traps that catch teams the first time.

What a “good” module looks like

A module solves one concern. Examples of right-sized modules:

ecs-service: runs a container on ECS Fargate with the right log group, IAM role, autoscaling.
rds-postgres: Postgres instance with backups, alarms, secret in Secrets Manager.
sqs-queue: queue plus dead-letter queue plus consumer IAM policy.
cloudfront-spa: static site bucket, CloudFront, ACM cert, origin access control.

Each is small (200-500 lines of HCL), has a focused interface, and is composable.
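To make “one concern” concrete, here is a minimal sketch of what the inside of sqs-queue could look like. The variable names, the max_receive_count default, and the output names are assumptions for illustration, not a published module.

```hcl
# modules/sqs-queue/main.tf (illustrative sketch; names are assumptions)
variable "name" {
  type = string
}

variable "max_receive_count" {
  type    = number
  default = 5
}

# Dead-letter queue that receives messages after repeated failed deliveries.
resource "aws_sqs_queue" "dlq" {
  name = "${var.name}-dlq"
}

# Main queue, redriving to the DLQ after max_receive_count failed receives.
resource "aws_sqs_queue" "this" {
  name = var.name

  redrive_policy = jsonencode({
    deadLetterTargetArn = aws_sqs_queue.dlq.arn
    maxReceiveCount     = var.max_receive_count
  })
}

# Least-privilege policy a consumer needs to read and acknowledge messages.
data "aws_iam_policy_document" "consumer" {
  statement {
    actions   = ["sqs:ReceiveMessage", "sqs:DeleteMessage", "sqs:GetQueueAttributes"]
    resources = [aws_sqs_queue.this.arn]
  }
}

resource "aws_iam_policy" "consumer" {
  name   = "${var.name}-consumer"
  policy = data.aws_iam_policy_document.consumer.json
}

output "queue_url" {
  value = aws_sqs_queue.this.url
}

output "consumer_policy_arn" {
  value = aws_iam_policy.consumer.arn
}
```

Three resources, one policy, two outputs. The caller never has to remember the redrive policy or the IAM actions.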

A bad module:

everything-for-our-app: provisions the whole stack of one specific service. Not reusable.
aws-resource-wrapper: adds nothing on top of aws_* resources. Indirection without value.
multi-environment-config: switches behavior based on var.env. Confusing; replace with separate calls.

The module interface

A module exposes inputs (variables) and outputs:

# modules/ecs-service/variables.tf
variable "name"   { type = string }
variable "image"  { type = string }
variable "cpu" {
  type    = number
  default = 256
}
variable "memory" {
  type    = number
  default = 512
}
variable "env"    {
  type = map(string)
  default = {}
}
variable "secrets" {
  type = map(string)
  default = {}
}
variable "vpc_id" { type = string }
variable "subnet_ids" { type = list(string) }

# modules/ecs-service/outputs.tf
output "service_name" { value = aws_ecs_service.this.name }
output "task_role_arn" { value = aws_iam_role.task.arn }
output "log_group_name" { value = aws_cloudwatch_log_group.this.name }

The interface is the contract. Input names are stable across versions. Adding inputs is non-breaking; renaming or removing them is.
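A non-breaking addition is a new input with a default, so existing callers keep working untouched. A sketch, assuming a hypothetical desired_count input added in a minor release:

```hcl
# modules/ecs-service/variables.tf (hypothetical addition in a minor release)
variable "desired_count" {
  type        = number
  default     = 1 # existing callers that don't set it see no change
  description = "Number of tasks the service should keep running."

  validation {
    condition     = var.desired_count >= 1
    error_message = "desired_count must be at least 1."
  }
}
```

The default is what keeps this a minor bump. The same variable without a default would force every caller to change, which makes it a breaking release.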

Calling a module

# infra/prod/api/main.tf
module "api" {
  source = "git::https://github.com/example/tf-modules.git//ecs-service?ref=v1.4.0"

  name   = "api"
  image  = "ghcr.io/example/api:abc123"
  cpu    = 1024
  memory = 2048

  vpc_id     = data.aws_vpc.main.id
  subnet_ids = data.aws_subnets.private.ids

  env = {
    NODE_ENV = "production"
  }
  secrets = {
    DATABASE_URL = data.aws_secretsmanager_secret.db.arn
  }
}

Ten lines plus inputs. The module does the rest: task definition, service, log group, IAM, alarms.

?ref=v1.4.0 is the version pin. Treat modules as versioned dependencies; bump deliberately.
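The call above reads from three data sources it doesn't show. A minimal sketch of what they could look like; the tag keys, tag values, and secret name are assumptions, so substitute whatever your network and secrets setup actually uses.

```hcl
# infra/prod/api/data.tf (illustrative; tag and secret names are assumptions)

# Look up the shared VPC by its Name tag.
data "aws_vpc" "main" {
  tags = {
    Name = "main"
  }
}

# Find the private subnets inside that VPC.
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.main.id]
  }

  tags = {
    Tier = "private"
  }
}

# Reference an existing secret; the module only receives its ARN.
data "aws_secretsmanager_secret" "db" {
  name = "prod/api/database-url"
}
```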

Versioning modules

Three approaches:

1. Tagged Git releases. git::https://github.com/.../tf-modules.git//ecs-service?ref=v1.4.0. Standard, simple, free.

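2. Terraform Registry. The public registry hosts open-source modules; a private registry comes with HCP Terraform (formerly Terraform Cloud) and Terraform Enterprise. Better discovery, version listings, and version constraints such as ~> 1.4, which Git sources don't support.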

3. Module-per-repo with semver. Each module has its own repo with semver tags. More overhead, cleaner ownership.

For most teams, (1) is enough: a monorepo with tagged versions, where calls reference ?ref=v.... Promote modules to (2) or (3) only if scale demands.
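For comparison, this is roughly what the same call looks like against a private registry. The organization name example-org is a placeholder, and the sketch assumes the module was published there as ecs-service for the aws provider.

```hcl
# infra/prod/api/main.tf (registry variant; "example-org" is a placeholder)
module "api" {
  # Private registry address: <hostname>/<organization>/<module name>/<provider>
  source  = "app.terraform.io/example-org/ecs-service/aws"
  version = "~> 1.4" # accepts 1.4 and later 1.x releases, never 2.0

  name       = "api"
  image      = "ghcr.io/example/api:abc123"
  vpc_id     = data.aws_vpc.main.id
  subnet_ids = data.aws_subnets.private.ids
}
```

The version argument only works with registry sources. With a Git source the pin lives in ?ref=, and it is always an exact ref rather than a range.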

Composition over parameterization

A common temptation: keep adding inputs to a module to handle every case. The module grows from 50 inputs to 200. Configuration becomes overwhelming.

The better pattern is composition: small modules that combine. Instead of one giant service module with options for “should we have a queue,” “should we have a database,” “should we have a cron”:

# infra/prod/api/main.tf
module "api"       { source = "...//ecs-service" }     # ...
module "api_queue" { source = "...//sqs-queue" }       # ...
module "api_db"    { source = "...//rds-postgres" }    # ...
module "api_cron"  { source = "...//cloudwatch-cron" } # ...

Each module is small. The service file describes what this service has by what modules it calls. New patterns add new modules; existing ones stay focused.
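Composition works because one module's outputs feed another's inputs. A sketch of wiring a queue into a service, assuming the sqs-queue module exposes queue_url and consumer_policy_arn and the ecs-service module exposes task_role_name. All three output names are hypothetical; the ecs-service outputs shown earlier only include the role ARN.

```hcl
# infra/prod/api/main.tf (illustrative wiring; output names are assumptions)
module "api_queue" {
  source = "git::https://github.com/example/tf-modules.git//sqs-queue?ref=v1.4.0"

  name = "api-jobs"
}

module "api" {
  source = "git::https://github.com/example/tf-modules.git//ecs-service?ref=v1.4.0"

  name       = "api"
  image      = "ghcr.io/example/api:abc123"
  vpc_id     = data.aws_vpc.main.id
  subnet_ids = data.aws_subnets.private.ids

  env = {
    # The service learns the queue URL from the queue module's output.
    JOBS_QUEUE_URL = module.api_queue.queue_url
  }
}

# Glue that belongs to this service, not to either module:
# let the API's task role consume from its queue.
resource "aws_iam_role_policy_attachment" "api_consumes_queue" {
  role       = module.api.task_role_name            # hypothetical output
  policy_arn = module.api_queue.consumer_policy_arn # hypothetical output
}
```

Neither module knows the other exists. The relationship lives in the service file, where a reader expects to find it.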

Testing modules

Terraform tests catch breaking changes before they propagate. Three approaches:

1. terraform plan against a fixture. Each module has a test/ directory with a sample call. CI runs terraform plan against a known state file and asserts that the plan matches the expected output.

2. Terratest. Go-based test framework. Provisions real infrastructure, runs assertions, tears down. Slow (minutes) but high-confidence.

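package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestEcsService(t *testing.T) {
	options := &terraform.Options{
		TerraformDir: "../examples/basic",
	}
	defer terraform.Destroy(t, options) // tear down even if an assertion fails
	terraform.InitAndApply(t, options)

	serviceName := terraform.Output(t, options, "service_name")
	assert.Contains(t, serviceName, "test-")
}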

3. terraform-compliance. Asserts against the plan: “every RDS instance must have backups enabled,” “every S3 bucket must have encryption.”

For most modules, (1) is enough. Use (2) for modules that provision complex infrastructure where misconfigurations are subtle, and (3) as a policy gate across all modules.
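One way to approximate the plan-level check in (1) is Terraform's built-in test framework (terraform test, available from Terraform 1.6; mock providers from 1.7). This is a swapped-in technique, not the state-file comparison described above. The file path, the input values, and the assumption that service_name equals the name input are all hypothetical.

```hcl
# modules/ecs-service/tests/basic.tftest.hcl (hypothetical path)

# Mock the AWS provider so the run needs no credentials (Terraform 1.7+).
# Depending on the module's internals, a mock may need extra overrides,
# or you may prefer to run against a real sandbox account instead.
mock_provider "aws" {}

# Inputs shared by every run block in this file.
variables {
  name       = "test-api"
  image      = "ghcr.io/example/api:abc123"
  vpc_id     = "vpc-00000000"
  subnet_ids = ["subnet-00000000"]
}

run "plan_basic_service" {
  command = plan # plan only; nothing is created

  assert {
    # Assumes the module names the ECS service after var.name.
    condition     = output.service_name == "test-api"
    error_message = "service_name should match the name input."
  }
}
```

Run it with terraform test from the module directory. Plan-level assertions only work on values known at plan time, so stick to attributes derived from inputs.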

The four traps

1. Modules that wrap a single resource. A module that just wraps aws_lambda_function adds nothing. Use the resource directly.

2. Modules that change behavior based on environment. Conditionals like backup = var.env == "prod" creep into every resource. Now reading the module requires understanding three branches. Better: separate “prod-grade” and “dev-grade” presets, or call the module with explicit inputs from each environment.

3. Modules that grow without versioning. “Just push to main; everyone is on the latest.” No way to update one consumer without updating all. Tag every change.

4. Modules with no examples. A module with 30 inputs and no example call is unusable. Every module’s repo should have an examples/ directory with realistic invocations.
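For trap 2, the explicit-inputs alternative can be sketched like this. The module takes plain knobs and each environment states its own values. The input names backup_retention_days and multi_az are assumptions, and the comments mark three separate files.

```hcl
# --- modules/rds-postgres/variables.tf (hypothetical inputs) ---
variable "backup_retention_days" {
  type    = number
  default = 7
}

variable "multi_az" {
  type    = bool
  default = false
}

# --- infra/prod/api/main.tf ---
module "api_db" {
  source = "git::https://github.com/example/tf-modules.git//rds-postgres?ref=v1.4.0"

  name                  = "api"
  backup_retention_days = 14
  multi_az              = true
}

# --- infra/dev/api/main.tf (a separate root, so the same label is fine) ---
module "api_db" {
  source = "git::https://github.com/example/tf-modules.git//rds-postgres?ref=v1.4.0"

  name                  = "api"
  backup_retention_days = 1
  multi_az              = false
}
```

The module has no idea what “prod” means. The difference between environments is visible in a diff of two short files.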

The platform team’s role

If you have a dedicated platform team:

They maintain modules and runtime config (provider version, default tags, IAM patterns).
They version modules and notify consumers of breaking changes.
They handle migrations when AWS changes resource shapes.

If you don’t (a team of 10 sharing infra responsibility):

Designate a “module reviewer” rotation. Module changes require their approval.
Document breaking changes loudly in release notes.
Have a #infra Slack channel where consumers can ask questions.

The pattern works either way. The point is that modules are infrastructure code, owned by someone, with the same review discipline as application code.
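The “runtime config” a platform owner maintains usually comes down to two small pieces: version constraints inside each module and provider defaults in each root. A sketch, with the version ranges, region, and tag keys chosen purely for illustration:

```hcl
# --- modules/ecs-service/versions.tf ---
# Modules declare what they are compatible with; they don't configure providers.
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 5.0, < 6.0"
    }
  }
}

# --- infra/prod/api/providers.tf ---
# Roots configure the provider once; default_tags apply to every taggable
# resource the modules create.
provider "aws" {
  region = "eu-west-1"

  default_tags {
    tags = {
      ManagedBy   = "terraform"
      Service     = "api"
      Environment = "prod"
    }
  }
}
```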

A starter set of modules

For a typical AWS-based team, start with these:

ecs-service: runs a container.
rds-postgres: managed Postgres with sensible defaults.
s3-bucket: versioned, encrypted, blocked from public.
sqs-queue: queue + DLQ.
lambda-function: Lambda with log group, IAM role.
cloudfront-spa: static site delivery.
route53-record: DNS record with validation.

Most apps need a subset of these. Composing 4-6 of them describes most services.
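As a reference point for size, here is a sketch of the s3-bucket module covering the three properties listed: versioned, encrypted, blocked from public. It assumes AWS provider v4 or later, where these settings are separate resources. The variable and output names, and the choice of KMS over AES256, are assumptions.

```hcl
# modules/s3-bucket/main.tf (illustrative sketch; names are assumptions)
variable "name" {
  type = string
}

resource "aws_s3_bucket" "this" {
  bucket = var.name
}

# Keep prior object versions so overwrites and deletes are recoverable.
resource "aws_s3_bucket_versioning" "this" {
  bucket = aws_s3_bucket.this.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Encrypt every object at rest by default.
resource "aws_s3_bucket_server_side_encryption_configuration" "this" {
  bucket = aws_s3_bucket.this.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

# Block every form of public access at the bucket level.
resource "aws_s3_bucket_public_access_block" "this" {
  bucket = aws_s3_bucket.this.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}

output "bucket_arn" {
  value = aws_s3_bucket.this.arn
}
```

One input, four resources, one output. A caller gets a safe bucket from a three-line module block.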

What modules cannot do

A module abstracts the resources, not the workflow. Provisioning a database is one Terraform call; running migrations is something else. Modules don’t run migrations.

Some teams build complementary tooling around modules:

A CLI (internal-cli new-service) that scaffolds the Terraform call and the application repo.
A workflow that runs terraform plan, opens a PR, runs DB migrations after apply.

These layer on top of modules; they don’t replace them.

The takeaway

Internal Terraform modules turn “spin up a new service” from a week of work into ten lines of HCL. The investment is real: designing, testing, versioning modules takes engineering time. The payoff is real too. Every service after the first is faster, more consistent, and easier to update.

Build small focused modules, compose them, version them, test them, document examples. Avoid the four traps (single-resource wrappers, env-conditional logic, no versioning, no examples). The team that does this has infrastructure as a product, not as a copy-paste exercise.

A note from Yojji

The kind of platform-engineering discipline that turns infrastructure code from a per-service liability into a reusable asset (versioned modules, composition patterns, testing) is the kind of long-haul DevOps work Yojji’s teams build into the platforms they ship for clients.

Yojji is an international custom software development company founded in 2016, with teams across Europe, the US, and the UK. They specialize in the JavaScript ecosystem, cloud platforms (AWS, Azure, GCP), and Terraform-based infrastructure, including the module design and platform-engineering work that decides whether new services take a week or an hour to bootstrap.