Eryn wrote the post I wish I'd had the clarity to write a year ago: Form Factors: How Software Vendors Define Where Their Software Can Run. Read it if you want the full treatment. The short version: a form factor is a structured declaration of where a vendor's software is allowed to run — environment, connectivity, required services, security posture. Tensor9 keeps a registry of service equivalents so it knows what to swap an AWS service for in some other target.
This post is about one very specific, fun implementation detail sitting underneath that. When a vendor writes
aws_db_instance.engine_version = var.postgres_version, and Tensor9 has
to pick a specific CloudNative-PG chart version for the on-prem form factor, how does
it pick? The value it needs isn't a primitive — it's an expression. Often a
conditional, and often threaded through three modules before it even reaches the resource.
This is the third post in a series. The first walked through how we represent infrastructure as a typed graph (STIR). The second walked through how service dialects let us raise AWS into a canonical form and lower to Kubernetes. This one walks through one specific mechanism inside the Tensor9 compiler: phi tracing.
The problem. To swap an AWS service for its target-environment equivalent, the compiler needs concrete values — which Postgres version, which instance class, which chart. Real vendor Terraform hands it expressions, not primitives.
What phi tracing does. It walks backward through the expression
to collect every value it could take and the Terraform condition under which it
takes each one. Output: one pre-compiled specialization per value, each wrapped
in a count-gated module. Terraform itself picks the active branch
at plan time.
Why it matters. The vendor never rewrites their Terraform. The compiled output reads like the original; the plan is the source of truth; every decision is visible before apply. No runtime magic, no new DSL.
Tensor9 is a platform that lets software vendors take products they built for their own AWS account and deploy them into customer-owned environments: the customer's AWS, the customer's on-prem Kubernetes cluster, the customer's GCP project. The vendor hands Tensor9 the Terraform they already wrote for AWS, and the platform emits the stack the customer needs to run the same application in their environment. The application's behavior is preserved; the underlying infrastructure is translated. A compiler at the heart of the platform does that translation, and this post is about one specific mechanism inside it. You can learn more here: docs.tensor9.com.
Doing that well means the compiler has to reason about real vendor Terraform, not toy Terraform. Real vendor Terraform is full of variables, conditionals, and values threaded through layers of modules. That's what this post is about.
A few terms appear throughout. Most are defined more carefully in Eryn's post and in post one; we'll restate the ones that carry weight here:

- Service replacement: aws_db_instance becomes a CloudNative-PG cluster on K8s; aws_elasticache_replication_group becomes a Valkey operator deployment.
- Service compiler: the component that compiles aws_db_instance resources into their target-specific equivalents.
- Universe: a finite set of values a field can take, e.g. {"14.9", "15.4", "16.2", "17.2"} for Postgres engine versions. Formally a widening operator parameterized by what the service compiler can actually specialize for; not every field has a finite universe.

A roadmap. Part 1 frames the problem: what naive service compilers do, why that fails on real stacks. Part 2 is a compact detour through SSA phi nodes, because the terminology is borrowed from there and it's worth being honest about what we borrowed and what we didn't. Parts 3 through 5 walk through the mechanism: backward data flow, symbolic conditions, template specialization. Part 6 lists the four scenarios you see in practice. Part 7 is a short list of adjacent topics we skipped for space.
A vendor's origin stack might have something like this:
variable "postgres_version" {
type = string
default = "15.4"
}
variable "instance_class" {
type = string
// no default; set by the environment
}
resource "aws_db_instance" "app" {
engine = "postgres"
engine_version = var.postgres_version
instance_class = var.instance_class
allocated_storage = 100
}
If the vendor runs terraform apply directly against their own AWS account,
Terraform resolves var.postgres_version at plan time and RDS takes the value.
Terraform doesn't care that the field was an expression; by the time the AWS API call
goes out, it's a string.
When we're compiling this aws_db_instance into a
CloudNative-PG Cluster resource for a Kubernetes form factor, we
need to know the engine version at compile time, not at apply
time.† Why? Because different major
versions map to different Helm chart versions, different CRD shapes, different
default parameter groups. The service compiler has to pick which of those to
emit, and it has to pick before the vendor ever runs anything.
So our naive first pass was: if the field isn't a primitive, fail. Emit a blocking
stack issue, ask the vendor to hardcode the value. We even thought this was
reasonable — Terraform itself does this in lots of places
(count, for_each, provider aliasing). Vendors are used
to it.‡
† Compilation happens before anyone runs terraform plan and apply. So "compile time" here means earlier than either plan or apply — when there's no runtime context to draw on.
‡ Reasonable. Also wrong in practice. Vendors use variables precisely so they can change these values without editing code. Telling them to hardcode is telling them to give up the knob.
Here's the problem. Real vendor stacks almost never hardcode. They parameterize. They pass variables through modules. They conditionalize on environment. A more realistic shape of the same stack:
// root module
module "database" {
source = "./modules/postgres"
engine_version = var.customer_env == "prod" ? "15.4" : "14.9"
instance_class = local.instance_class
}
// modules/postgres/main.tf
variable "engine_version" { type = string }
variable "instance_class" { type = string }
resource "aws_db_instance" "app" {
engine = "postgres"
engine_version = var.engine_version
instance_class = var.instance_class
}
Three hops from the literal to the field. A conditional in the middle. A local
pointing at who-knows-what. None of it is a primitive where the service compiler needs one.
As a STIR graph, the same stack looks like this. The field the service compiler is trying to read is the green box at the top right; everything else is the data flow the compiler would have to chase to find an actual value.
aws_db_instance
is the entry point; getting to a primitive means crossing two module boundaries,
walking one RefTo chain, and unwinding a ConditionalExpr.
So the compiler has two options. Option 1: give up and block, as the naive first pass did, until the vendor hardcodes the value. Option 2: follow the expression backward and work out every value it could take, and the condition under which it takes each one.
Option 2 is phi tracing.
The name is borrowed. It's worth being honest about what we borrowed.
In compiler theory, SSA (Static Single Assignment) form is a graph representation where
every variable is assigned exactly once. When control flow joins — after an
if/else, at the top of a loop — you need a way to say
"this variable is either x1 (if we came from block A) or
x2 (if we came from block B)." That's a phi node:
φ(x1, x2). Classical SSA phi nodes are
positional; they look at which predecessor block you came from. That's enough
when you have a control flow graph.
Our context is different. We analyze data flow through Terraform, not control
flow through basic blocks. The "predecessor" of a value isn't a block — it's a
conditional expression, or a module boundary, or a for_each. And we care
about why a branch was taken, not just which one. A positional phi is the
wrong shape.
What we actually need is closer to GSA (Gated Static Assignment) γ
(gamma) nodes. GSA extends SSA by making control dependence explicit in the data
structure: each branch of a phi carries its own predicate — the condition
that must hold for that branch to be active. γ(cond →
x1, ¬cond → x2).†
We kept the "phi" name because nobody says "gamma nodes" colloquially. But in our
implementation, every branch carries its own PhiCondition tree. That's
the GSA design. More on conditions in Part 4.
The first design had phi as a first-class STIR node type, following the pattern used for count-gated generators. We pulled it out during implementation. Phi analysis results live for microseconds between the tracer that produces them and the specializer that consumes them. They never get serialized, emitted, or inspected by any other pass. Adding a node type would have meant exhaustive handlers across every pass, serialization code, graph image format changes — a lot of weight for something that isn't a durable part of the graph. Phi tracing produces transient analysis results, not graph structure.
Forward data flow starts at declarations and pushes values forward through the program. Backward data flow starts at a use and works out what could have produced it. Phi tracing is backward.
The service compiler is the driver. When it hits a field like
engine_version = var.postgres_version and wants a primitive, it calls into
the tracer with the STIR node representing that expression. That node is the seed.
The STIR graph has a lot of edge types. For phi, only a handful matter:
- Val — the "is assigned" edge from a local to its value expression.
- Field — the edge from a resource to one of its fields, or from a module call to the value it passes to a downstream parameter.
- ExprArg — the edge from an expression node to one of its sub-expressions (condition, true branch, false branch, function arguments).
- GenIter — the edge from the iterator local of a for_each/count block back to the collection it iterates over.
Reference edges are used as a bridge, not as a primary traversal.
When we hit a scope reference like var.x, we jump to
x's definition node and continue the trace from there. The bridge
keeps the tracer focused on what produces values, not on how names
resolve.†
Every trace returns one of three things. The relationship is containment: Resolved (a single known value) sits inside Bounded (a finite set we can enumerate), which sits inside Unbounded — the outer region of possibilities we couldn't narrow down. There's a fourth state the animation makes visible: a branch set that is finite but has blown past the size limit — Bounded, but too large — which the tracer widens to Unbounded with that specific reason, rather than emitting a specialization per branch. Keeping analysis and output both finite is non-negotiable for a compiler that has to fit in a reasonable wall-clock budget. The animation starts tight and widens outward: from the one value we proved, to the few we know the selector chooses from, to the ones we counted but gave up on, to everything else we couldn't pin down.
(Animation: the three possible trace results, widening outward as analysis loses precision. The tightest is Resolved: "15.4" after following through modules.)
Results flow through function calls without losing precision. Applying
lower() to a Resolved value stays Resolved with the value lowercased.
Applying it to a Bounded result stays Bounded with each branch value lowercased.
Applying it to Unbounded stays Unbounded. Every transfer function the tracer
applies — string ops, equality, arithmetic — is monotone
with respect to this containment order, which is what makes composition sound.
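As a sketch, with hypothetical type names (the real implementation differs), the containment order and a monotone element-wise transform look like this:

```kotlin
// Sketch of the three trace results and a monotone element-wise transform.
// Names are illustrative, not Tensor9's actual API.
sealed interface PhiCondition  // detailed in Part 4

data class PhiBranch(val value: String, val condition: PhiCondition?)

sealed interface TraceResult {
    data class Resolved(val value: String) : TraceResult
    data class Bounded(val branches: List<PhiBranch>) : TraceResult
    data class Unbounded(val reason: String) : TraceResult
}

// A pure transform like lower() never gains precision and never changes the
// branch count: Resolved stays Resolved, Bounded stays Bounded, Unbounded
// stays Unbounded. That is the monotonicity requirement.
fun TraceResult.mapValues(f: (String) -> String): TraceResult = when (this) {
    is TraceResult.Resolved -> TraceResult.Resolved(f(value))
    is TraceResult.Bounded ->
        TraceResult.Bounded(branches.map { it.copy(value = f(it.value)) })
    is TraceResult.Unbounded -> this
}
```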
The running example, from Part 1, is the engine_version field on the
aws_db_instance inside the ./modules/postgres module, where
the root module passes a conditional.
The tracer starts at the field's value expression. Step by step:

- Start at the Field edge for aws_db_instance.app.engine_version; its value expression is NameRef var.engine_version.
- Bridge through the reference to the module parameter var.engine_version.
- Cross the module boundary to the caller's ModCall field "engine_version".
- Arrive at the ConditionalExpr the root module passes.
The ConditionalExpr is where the fork happens. The tracer recurses down
ExprArg("true") with an accumulated path condition of
Existing(env=="prod"), and down ExprArg("false") with
Not(Existing(env=="prod")). Each branch terminates at a primitive, each
with its own condition.
The result is a Bounded value with two branches. Each branch carries the value it would produce and the symbolic condition under which it would apply. The selector — the expression the branches disagree on — is stored once, separately, so the specializer can reuse it when gating.
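A sketch of that fork, under the same illustrative types as above (Existing and Not are assumed condition constructors, not the real ones):

```kotlin
// Sketch: tracing a ConditionalExpr forks the trace, accumulating the path
// condition on each side. Names are illustrative.
sealed interface PhiCondition {
    data class Existing(val conditionNodeId: String) : PhiCondition
    data class Not(val inner: PhiCondition) : PhiCondition
}

data class PhiBranch(val value: String, val condition: PhiCondition)

// Trace the true arm under Existing(cond), the false arm under Not(Existing(cond)).
fun traceConditional(
    conditionNodeId: String,
    traceTrueArm: (PhiCondition) -> List<PhiBranch>,
    traceFalseArm: (PhiCondition) -> List<PhiBranch>,
): List<PhiBranch> {
    val cond = PhiCondition.Existing(conditionNodeId)
    return traceTrueArm(cond) + traceFalseArm(PhiCondition.Not(cond))
}

fun main() {
    val branches = traceConditional(
        conditionNodeId = "var.customer_env == \"prod\"",
        traceTrueArm = { c -> listOf(PhiBranch("15.4", c)) },  // arm is a Prim
        traceFalseArm = { c -> listOf(PhiBranch("14.9", c)) },
    )
    branches.forEach(::println)
}
```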
The running example is the happy path. In real stacks, the tracer has to deal with:
- Function calls: lower(var.x), tostring(var.n), try(var.a, var.b), coalesce(...), lookup(map, key, default). Each function has a per-function trace strategy that tells the tracer how to push results through. For a pure transform like lower, the tracer traces the argument and applies lower to each resolved value. For try, the tracer takes an arm-union: it specializes both arms when their static resolvability is the same, and widens to the union when they diverge.† The underlying function semantics are shared with the graph evaluator so the tracer and the evaluator can't drift.

† Why arm-union, not "first resolvable arm": try(a, b) in real Terraform catches eval-time errors (null traversals, type coercion failures), not static unresolvability. Picking the first statically-resolvable arm would silently specialize a when a would have errored at apply and b would have fired. Arm-union is wider than the vendor's original intent, but it's always sound.

- for_each/count iterators: each.value inside a for_each block traces back to the collection being iterated over, and the tracer extracts one branch per collection entry, gated by Eq(selector, entry_key). Dynamic blocks take the same path with one wrinkle: the iteration keys must be plan-time-known (Terraform already enforces this), but the per-iteration content expressions evaluate in each iteration's scope. The tracer keeps the iteration axis and the value axis separate so the two don't conflate.

- Variables with no default: the tracer falls back to the universe the service compiler supplied, emitting one branch per universe value, gated Eq(selector, v). If no universe, Unbounded.

- Safety limits: a visited set catches cycles. A depth limit (default 20) catches runaway recursion through deeply nested modules. A size limit on the branch set (default 16) bounds the output: once the bounded set grows past the threshold, the tracer widens to Unbounded with the reason "bounded, but too large to specialize" (a sketch of this widening check follows the list). This is the classical abstract-interpretation move for keeping analysis finite, and it's also the mechanism that prevents specialization blow-up in downstream output (see the "Bounded, but too large" phase in the animation above). Any limit violation returns Unbounded with an actionable reason.
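The widening check itself is tiny. A sketch, with the default limit assumed from above and illustrative names:

```kotlin
// Sketch: widen a finite-but-too-large branch set to Unbounded instead of
// emitting one specialization per branch.
const val MAX_BRANCHES = 16

sealed interface TraceResult {
    data class Bounded(val values: List<String>) : TraceResult
    data class Unbounded(val reason: String) : TraceResult
}

fun boundedOrWiden(values: List<String>): TraceResult =
    if (values.size > MAX_BRANCHES)
        TraceResult.Unbounded(
            "bounded, but too large to specialize (${values.size} > $MAX_BRANCHES)")
    else
        TraceResult.Bounded(values)
```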
For the tracer to be sound, every function it pushes values through must be
monotone with respect to the containment order Resolved ⊂ Bounded ⊂
Unbounded. The easy cases are pure element-wise transforms like lower
and tostring: apply to each branch value; result size unchanged. The
interesting cases are the ones that can blow up or lose information:
- Collection operations (contains(list, x), concat, merge). When both operands are Bounded, the tracer computes the cross-product of branch pairs subject to the size limit: if |A|×|B| exceeds the threshold, we widen to Unbounded with the "too large" reason rather than emitting a specialization per cell. Sound, finite, and bounded-output.

- Decoders (jsondecode, yamldecode). The tracer doesn't descend into the decoded structure. If the argument is a literal string, the decoder runs at trace time and the result is Resolved. If the argument is anything else, the decoder's output is Unbounded — we can't enumerate the shape of the structure without executing it, and executing it against arbitrary Bounded inputs is potentially unsafe (decoders can throw). A vendor who wants specialization through a decoded blob can refactor to pass the decoded fields as separate variables.

- Nondeterministic functions (timestamp, uuid, bcrypt). These are not plan-time-stable — they re-evaluate on every plan. The tracer refuses to use them as a gate selector and returns Unbounded with the reason "plan-stability violation". Using them as a pass-through value (the right-hand side of an expression, not the selector) is fine and stays Resolved.

- Data sources. data.aws_rds_engine_version.latest.version behaves exactly like var.x with no default: the tracer asks the service compiler for a universe for the surrounding field and, if one is supplied, emits one branch per universe value gated Eq(data.…, v). Data-source values evaluate at Terraform refresh (before plan), so using them as gate selectors is plan-time-stable. If no universe is supplied, the result is Unbounded — the same fallback as a variable without conditionals. There's nothing special-cased about data sources.

- New Terraform functions get added to the strategy registry, not to the tracer core. Each registration carries the monotonicity proof obligation: show that the function maps the containment order forward. In practice that's a two-line argument per function family. (A sketch of the registry follows this list.)
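A sketch of what a per-function strategy registry could look like. The registry shape and every name here are assumptions for illustration, not the real API:

```kotlin
// Sketch: per-function trace strategies live in a registry keyed by function
// name; the tracer core never special-cases individual Terraform functions.
fun interface TraceStrategy {
    // Push already-traced argument values through the function, preserving
    // the containment order (the monotonicity obligation).
    fun trace(argValues: List<String>): String
}

object StrategyRegistry {
    private val strategies = mutableMapOf<String, TraceStrategy>()

    fun register(name: String, strategy: TraceStrategy) {
        strategies[name] = strategy
    }

    fun lookup(name: String): TraceStrategy? = strategies[name]
}

fun main() {
    // Pure element-wise transforms are one-liners to register.
    StrategyRegistry.register("lower") { args -> args[0].lowercase() }
    StrategyRegistry.register("tostring") { args -> args[0] }
    println(StrategyRegistry.lookup("lower")?.trace(listOf("M5.XLARGE")))  // m5.xlarge
}
```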
One design principle we enforced hard: the tracer is read-only with respect to the graph.
The reason is prosaic. Tracing gets called a lot, from a lot of places. A service compiler might trace one field, look at the result, decide it needs to trace a sibling field, look at that result, decide to back off, and never produce any output at all. If every trace mutated the graph by synthesizing condition nodes, we'd pile garbage into STIR that had to be cleaned up. We'd also have subtle double-counting if the same trace ran twice.
So the tracer builds condition trees symbolically. A PhiCondition
is one of four shapes, each describing a way a gate can be expressed without committing
any graph nodes:

- Existing: reuse a condition node that's already in the graph.
- Eq: the selector equals a specific primitive value.
- Not: negation of an inner condition.
- And: conjunction of two conditions.

Existing reuses a node that's already in the graph (typically the
condition expression from a ConditionalExpr). The other three are
synthesized during the trace — but only as descriptions, not as graph nodes.
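As data, the four shapes could look like this — a sketch with simplified field types:

```kotlin
// Sketch of the four PhiCondition shapes as an immutable tree. Field types
// are simplified for illustration.
sealed interface PhiCondition {
    // Reuse a condition node already present in the graph (e.g. the condition
    // of a ConditionalExpr). Pinning it freezes the node for the rest of the pass.
    data class Existing(val graphNodeId: String) : PhiCondition

    // "Selector == primitive", synthesized during the trace as a description.
    data class Eq(val selectorNodeId: String, val primitive: String) : PhiCondition

    data class Not(val inner: PhiCondition) : PhiCondition
    data class And(val left: PhiCondition, val right: PhiCondition) : PhiCondition
}
```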
One invariant that falls out of this: once a PhiCondition pins an
Existing node, that node is frozen for the rest of the
pass.† Later specialization can still wrap
it, index into it, or reference it, but it can't rewrite it in place. If a pass
genuinely needs to rewrite, it clones first. Without that rule, the
read-only-analysis story would leak — a later mutation would silently
change the semantics of every materialized gate pointing at the same node. The
enforcement isn't discipline: STIR nodes are immutable data classes, and the
Kotlin compiler refuses any attempt to write through a reference. The "clone
first" path goes through a dedicated constructor that takes ownership of the
new copy.
Path conditions during tracing always compose via conjunction. If we're in the true branch
of A == prod, then in the false branch of B == us, the path
condition is And(Existing(A=="prod"), Not(Existing(B=="us"))). There is no
disjunction during the trace; each branch corresponds to exactly one path, and paths
accumulate via AND.
A three-way example. Nested Terraform conditional:
locals {
  x = var.env == "prod" ? "m5.xlarge" : (var.region == "us" ? "t3.medium" : "t3.small")
}
The tracer walks both conditionals and produces three branches, each with its own symbolic path condition:

- "m5.xlarge" under Existing(env == "prod")
- "t3.medium" under And(Not(Existing(env == "prod")), Existing(region == "us"))
- "t3.small" under And(Not(Existing(env == "prod")), Not(Existing(region == "us")))
The specializer is where graph mutation happens — it's the component that actually
rewrites the STIR graph into a specialized, count-gated form. When the specializer needs
a concrete graph node for a condition (because count = cond ? 1 : 0 needs
cond to be a real expression node in the graph), it calls a function that
walks the symbolic PhiCondition tree and synthesizes the equivalent
concrete STIR expression nodes:
And/Not/Eq combinators.
Existing is cheap — reuse the node that's already there.
Eq becomes a BinaryOp("==") between the selector and a
primitive literal. Not wraps an inner materialization in a
UnaryOp("!"). And wraps two sub-materializations in a
BinaryOp("&&"). The whole thing is a straight recursive descent.
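A sketch of that descent. The Expr constructors are stand-ins for the real STIR node types:

```kotlin
// Sketch: materializing a symbolic PhiCondition into concrete expression
// nodes. Expr and its constructors are stand-ins for the real STIR types.
sealed interface PhiCondition {
    data class Existing(val graphNodeId: String) : PhiCondition
    data class Eq(val selectorNodeId: String, val primitive: String) : PhiCondition
    data class Not(val inner: PhiCondition) : PhiCondition
    data class And(val left: PhiCondition, val right: PhiCondition) : PhiCondition
}

sealed interface Expr
data class NodeRef(val graphNodeId: String) : Expr
data class Prim(val value: String) : Expr
data class UnaryOp(val op: String, val operand: Expr) : Expr
data class BinaryOp(val op: String, val left: Expr, val right: Expr) : Expr

// Straight recursive descent: Existing reuses, the other three synthesize.
fun materialize(cond: PhiCondition): Expr = when (cond) {
    is PhiCondition.Existing -> NodeRef(cond.graphNodeId)
    is PhiCondition.Eq -> BinaryOp("==", NodeRef(cond.selectorNodeId), Prim(cond.primitive))
    is PhiCondition.Not -> UnaryOp("!", materialize(cond.inner))
    is PhiCondition.And -> BinaryOp("&&", materialize(cond.left), materialize(cond.right))
}
```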
This function is the only place where phi-derived conditions turn into graph structure. It runs exactly once per specialized branch, so there's no garbage and no double-counting. And it lets us do cheap things with conditions before committing: coalescing branches that share a value, negation normalization, shared-subtree detection.
Analysis passes and graph mutation are different kinds of work: analysis is exploratory, while mutation is committing. Mixing them — synthesizing nodes during exploratory analysis — makes passes non-idempotent and piles up garbage. Symbolic conditions let the tracer reason (building, combining, simplifying) without touching the graph. Materialization is where the analysis hands its final artifact to the graph, at the moment the graph needs it. Everything between is pure computation over immutable inputs.
The tracer hands the service compiler a Bounded result: N branches, each with
a value and a symbolic condition. Now what?
The service compiler's existing logic already knows how to compile for one known value. That's the function it had before phi tracing existed. Phi doesn't change that function; it just calls it N times.†
† This is classical offline partial evaluation — specifically the first Futamura projection. The compile function is the interpreter; each branch value is a statically-knowable binding; specializing the compile call against each binding produces residual code, and the count gate re-dispatches the residuals on the dynamic selector at plan time. Jones, Gomard, and Sestoft's Partial Evaluation and Automatic Program Generation is the canonical reference.

The entry point the service compiler calls takes four things: the expression it wants resolved, the universe of values it knows how to handle, a compile function that turns one value into a list of graph nodes, and a fallback for the unbounded case. Everything else is the phi system's job.
The specializer runs four steps per branch (a sketch follows the list):

1. Compile: call the service compiler's existing compile function with the branch's concrete value.
2. Materialize: turn the branch's symbolic condition into a concrete gate expression, count = cond ? 1 : 0.
3. Gate: wrap the compiled output in a count-gated Gen node carrying that expression.
4. Rewire: update every reference to the gated node with the [0] index.

Gen is STIR's existing abstraction over count and for_each. It takes a child node (a resource, a module call) and a count/iteration expression, and at lowering time emits the underlying Terraform with the appropriate count = or for_each = attached. Gen nodes are how we represent "zero-or-one" and "one-per-key" in a single place instead of threading those concerns through every pass.
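Putting the entry point and the per-branch loop together, as a sketch under illustrative types — these are not the real signatures:

```kotlin
// Sketch of the phi-aware entry point and the per-branch specialization loop.
data class Node(val description: String)
data class PhiBranch(val value: String, val gateExpr: String)

sealed interface TraceResult {
    data class Resolved(val value: String) : TraceResult
    data class Bounded(val branches: List<PhiBranch>) : TraceResult
    data class Unbounded(val reason: String) : TraceResult
}

fun specialize(
    traced: TraceResult,              // result of tracing the field's expression
    universe: Set<String>,            // values the service compiler can handle
    compile: (String) -> List<Node>,  // existing one-value compile logic
    fallback: (String) -> List<Node>, // unbounded case: emit a stack issue
): List<Node> = when (traced) {
    // One value: no branching, no gating.
    is TraceResult.Resolved -> compile(traced.value)
    // N branches: compile each value, wrap each output in a count-gated module.
    is TraceResult.Bounded -> traced.branches.flatMap { branch ->
        require(branch.value in universe) { "value outside universe" }
        compile(branch.value).map { node ->
            Node("gen(count = ${branch.gateExpr} ? 1 : 0) { ${node.description} }")
        }
    }
    is TraceResult.Unbounded -> fallback(traced.reason)
}
```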
Output:
module "db_v15_4" {
count = var.customer_env == "prod" ? 1 : 0
source = "./modules/postgres-v15-4"
instance_class = local.instance_class
}
module "db_v14_9" {
count = var.customer_env == "prod" ? 0 : 1
source = "./modules/postgres-v14-9"
instance_class = local.instance_class
}
As a STIR graph, the specialized output looks like this. The original ModCall is gone;
in its place are two count-gated Gen nodes, each wrapping its own
specialization. The count field on each Gen is the
materialized branch condition — the same selector as the original conditional,
just with its truth direction flipped per branch.
count expressions reference the same shared selector; the branches
disagree only on which side of the boolean they activate.
One of these resolves to count = 1 at plan time; the other to
count = 0. Terraform produces exactly one deployment. The vendor's
original conditional logic is preserved in the gate expression, but now it selects
between two properly specialized compilations instead of trying to thread a single
specialization through both versions.
All phi gate conditions are guaranteed to be known at Terraform plan time, not apply time. That is the invariant that makes this whole scheme work.†
† Terraform's rule: count and for_each values must be known at plan time. If a gate depended on, say, aws_db_instance.db.arn — a value that only exists after apply — you'd get the classic error: "The `count` depends on a value that will not be known until apply."
The tracer enforces this by construction. It follows data flow only through variables, locals, conditional expressions, and data sources (which evaluate at refresh, before plan) — never through resource attributes. Variables resolve at plan time. Locals are just expressions over variables. Conditionals evaluate over those. None of the sources of a phi condition is something Terraform has to call a cloud API — or run an apply — to learn.
The enforcement isn't a convention in the tracer code. STIR distinguishes
plan-time-stable expressions from apply-time values at the type level: a resource
attribute like aws_db_instance.db.arn is a different node kind than a
variable or local, and the tracer's pattern matching simply doesn't have an arm that
recurses through it. If a vendor expression tries to use a resource attribute as a
selector, the tracer returns Unbounded with the reason "selector depends on an
apply-time value" and the vendor gets a stack issue before any output is emitted.
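A sketch of how a type-level distinction makes the rule structural rather than a convention. The node kinds are stand-ins for the real STIR types:

```kotlin
// Sketch: plan-time-stable node kinds vs apply-time values at the type level.
// The tracer's match simply has no recursive arm for resource attributes.
sealed interface StirNode
data class Variable(val name: String) : StirNode
data class LocalValue(val name: String) : StirNode
data class DataSourceAttr(val path: String) : StirNode  // refresh-time: before plan
data class ResourceAttr(val path: String) : StirNode    // apply-time: not traceable

sealed interface SelectorCheck
data class Traceable(val node: StirNode) : SelectorCheck
data class Rejected(val reason: String) : SelectorCheck

fun admitAsSelector(node: StirNode): SelectorCheck = when (node) {
    is Variable, is LocalValue, is DataSourceAttr -> Traceable(node)
    is ResourceAttr ->
        Rejected("selector depends on an apply-time value: ${node.path}")
}
```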
This design decision has a sharper consequence than it might sound: it's the reason we
don't need a second "apply-time phi" mechanism. Every specialization that phi produces
can be decided at plan. The vendor reads the plan, sees which module is coming up with
+ count = 1, and reviews the actual specialization that will run.
Terraform treats count-gated resources as lists. If the module module.db
used to be referenced elsewhere as module.db.output, once count
is attached, the reference must become module.db[0].output.
The specializer handles this automatically. It walks every reference edge pointing
at a count-gated node, finds the corresponding scope-traversal expression, and
splices an index-zero step into the reference path right after the root reference.
This composes through reference chains: only direct references to the gated node
need [0]; references that go through other nodes carry the index
along for the ride.
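A sketch of the splice; the path representation is an assumption for illustration:

```kotlin
// Sketch: splice an index-zero step right after the root of a reference path
// when the referenced node became count-gated.
sealed interface Step
data class Name(val segment: String) : Step  // e.g. "module", "db", "output"
data class Index(val i: Int) : Step          // e.g. [0]

// module.db.output -> module.db[0].output (rootLen segments form the root ref)
fun indexGatedReference(path: List<Step>, rootLen: Int = 2): List<Step> =
    path.take(rootLen) + Index(0) + path.drop(rootLen)

fun main() {
    val ref = listOf(Name("module"), Name("db"), Name("output"))
    println(indexGatedReference(ref))  // splices Index(0): module.db[0].output
}
```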
Hand-authored depends_on edges get the same treatment, with one
important rule: the rewrite depends on every branch module, not just the
active one. When a vendor writes
depends_on = [aws_db_instance.app] and the specializer has split the
resource into module.db_v14_9, module.db_v15_4, and
module.db_v16_2, the rewritten reference is
depends_on = [module.db_v14_9, module.db_v15_4, module.db_v16_2]. Inactive
branches have count = 0, which Terraform treats as an empty list — a
valid dependency that contributes no edges — so the downstream resource waits
exactly for the active branch. Picking just one branch to depend on would fail on every
plan where a different branch was active. User-declared ordering survives
specialization; silently dropping those edges would be the kind of correctness bug
that turns into an overnight page.
A subtlety worth flagging: specializing aws_db_instance.app into
module.db_v15_4.aws_db_instance.app is a resource address change.
Without care, existing Terraform state would see the old address disappear and
the new one appear — a destroy-and-recreate on first apply after
adoption, which is a very bad surprise for a database.†
The same address-stability question applies across compiler versions: a Tensor9 upgrade that changed generated module names would show up as a state diff on the customer's next plan, which is unacceptable for production infra.‡
† The compiler emits Terraform moved {} blocks alongside each specialization so state migrates in place. The first post-compile plan shows a "moved" diff with no destroy/create, and apply is a no-op against unchanged infrastructure. Provider aliases on the original resource are carried onto the specialized modules through the same mechanism.
‡ Tensor9 guarantees generated resource address stability across compiler versions. The specialization naming scheme is a public contract. If a future version needs to change it, the compiler ships a migration artifact (moved {} blocks again) so the first plan after upgrade is a no-op. Customers never have to choose between upgrading Tensor9 and getting a clean plan.
If the tracer returns Resolved, there's only one value; no branching, no
gating. The specializer just calls compile(value) and returns the result
directly. Similarly, if Bounded came back with branches that all carry the
same value (a pathological but possible case), the specializer dedups to one module and
drops the gating. No wasted modules in the output.
Not every service compiler output is directly countable. If the template returns a
field of an already-gated resource, or a node that already has its own
count/for_each, wrapping it in Gen directly
doesn't compose. The specializer falls back to synthetic module wrapping:
each branch's output is placed inside a generated sub-module, which is
countable, and the parent count-gates the sub-module. The service compiler never sees
the distinction; it just gets a list of nodes back.
The nested-count semantics are what you'd expect. When the parent gate evaluates to
count = 0, the whole sub-module is inert — the inner gate never
evaluates. When the parent gate evaluates to count = 1, the inner gate
fires on its own selector exactly as if no wrapping had happened. There is no case
where the two gates "disagree" at runtime: the outer gate is an on/off switch for
the inner, not a competing condition.
The debugging question operators actually care about: "Postgres 14.9 just got deployed; it was supposed to be 15.4 — why?" At 3am, you don't want to reverse-engineer the answer from the generated Terraform.
The specializer emits provenance in two places, by default, for every specialization:

- An inline comment on the generated module. Operators read terraform plan output, not sidecar files, and the comment is sitting right where they're already looking.
- A companion JSON record written alongside the generated Terraform.

The inline comment looks like this:
# phi: engine_version (aws_db_instance.app) — gate: Not(Existing(var.customer_env == "prod")) — trace: modules/postgres/main.tf:12-18 — compiler: tensor9 1.8.4
module "db_v14_9" { count = (var.customer_env != "prod") ? 1 : 0 ... }
The companion JSON record carries everything the inline comment trims for brevity:
{
"module": "db_v14_9",
"source_field": "aws_db_instance.app.engine_version",
"branch_value": "14.9",
"gate": "Not(Existing(var.customer_env == \"prod\"))",
"trace_path": [
"Field aws_db_instance.app.engine_version",
"NameRef var.engine_version",
"cross-module: caller's ModCall field \"engine_version\"",
"ConditionalExpr false-arm (path: Not(Existing(...)))",
"Prim \"14.9\""
],
"source_span": "modules/postgres/main.tf:12-18",
"compiler_version": "tensor9 1.8.4"
}
compiler_version is a load-bearing field, not decoration. When an
operator is forensically reading a provenance record a year after the compile that
produced it, the first question they ask is which Tensor9 version generated it
— because the specializer's decisions and the naming scheme are versioned.
Two things fall out of having provenance in both forms:

- Anyone reading the terraform plan output sees which trace produced each count-gated module and why its gate evaluates the way it does. No reverse-engineering required.
- The records are small (typically a few hundred bytes per specialization), written alongside the generated Terraform, and not consulted at apply time — they exist purely for human and tooling consumption after the fact.
In practice, what the tracer hands the specializer falls into four buckets, listed roughly in order of prevalence in our experience.
The most common. A vendor passes engine_version = "15.4" from the root
module, maybe through two or three layers of modules, maybe transformed once or twice
by a function like tostring. Each hop is traversable, and at the end, a
primitive. The tracer returns Resolved("15.4"). The specializer emits one
module, no gating.
Most parameterization in real stacks is of this form: vendors use variables so they can change values later, but in the current deployment they're threading a single concrete value through. Phi tracing for this case is "follow the yarn to the spool," and the cost is small.
The running example. var.env == "prod" ? "15.4" : "14.9". Two branches,
each with a concrete value and a path condition. Specialize twice, gate on the condition
expression. The vendor's original intent — "use 15.4 in prod, 14.9 elsewhere"
— survives compilation as two modules gated on the same boolean.
The interesting subcase: nested conditionals. A three-way conditional produces three branches with nested AND/NOT conditions. The specializer emits three modules, each with its own gate. Branches that happen to share a value across different paths get coalesced via disjunction so the output doesn't carry duplicate specializations.
A vendor has variable "instance_size" { type = string } — no default,
no conditional setting it in this stack. The vendor wants customers to provide the value
at deployment time.
This is the case that motivated the universe parameter the service compiler supplies when it invokes the phi-aware entry point. The service compiler usually knows the domain of the field it cares about, even if the vendor's HCL doesn't say. Instance sizes are a small closed set. Engine versions are a small closed set. TLS versions are a small closed set.
The service compiler passes the universe in. The tracer, on hitting a Param with no default,
falls back to the universe: one branch per universe value, each gated with
Eq(selector, v). The specializer generates N specializations. At apply
time, whichever value the customer picks for instance_size — provided
it's in the universe — keeps exactly one specialization alive.
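The expansion itself is a one-liner. A sketch with assumed names, producing the gates shown in the output below:

```kotlin
// Sketch: a variable with no default plus a service-compiler universe expands
// into one Eq-gated branch per universe value.
data class Branch(val value: String, val gate: String)

fun expandUniverse(selector: String, universe: List<String>): List<Branch> =
    universe.map { v -> Branch(v, "$selector == \"$v\"") }

fun main() {
    expandUniverse("var.instance_size", listOf("small", "medium", "large"))
        .forEach { println("count = ${it.gate} ? 1 : 0   // ${it.value}") }
}
```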
// universe = {"small", "medium", "large"}
// var.instance_size has no default
module "db_small" {
count = var.instance_size == "small" ? 1 : 0
...
}
module "db_medium" {
count = var.instance_size == "medium" ? 1 : 0
...
}
module "db_large" {
count = var.instance_size == "large" ? 1 : 0
...
}
If the customer supplies a value outside the universe, the compiler emits a
Terraform-native validation {} block asserting
contains(universe, var.instance_size), so the customer gets a clear error
at plan time rather than a silent no-op.
The expression is a variable with no default, no conditional driving it, and no universe the service compiler knows about. Or it's a function call whose semantics we don't have a strategy for. Or the trace hit the cycle/depth/branch-count safety limits.
The tracer returns an unbounded result. Rather than guess, the compiler stops and emits a stack issue that points at the exact expression and tells the vendor how to unblock it:
⚠ Blocking: can't determine the value of engine_version
The engine_version field on aws_db_instance.app is set to
var.postgres_version, but we couldn't narrow that variable down to a
specific value or a small set of values. We need to know the possible
values up front so we can pick the right PostgreSQL version for each
of your customer deployments.
Why:
var.postgres_version has no default value, and no validation
block lists the allowed values.
Fix (any one of these works):
1. Give the variable a default:
variable "postgres_version" {
type = string
default = "15.4"
}
2. Constrain the variable with a validation block:
validation {
condition = contains(["14.9", "15.4", "16.2"], var.postgres_version)
error_message = "postgres_version must be 14.9, 15.4, or 16.2"
}
3. Hardcode the value if it doesn't need to be configurable:
engine_version = "15.4"
Option 2 is usually the right choice — your customers can still pick
between versions, and we'll compile a specialization for each one.
The issue points at the specific expression and offers concrete, copy-pasteable fixes. That matters: a stack issue that tells a vendor which field, which variable, which module — and how to fix it — turns into a five-minute edit. A generic "this field isn't a primitive" turns into a support ticket.
| Scenario | Trace result | Specialization | Gate |
|---|---|---|---|
| One known value | Resolved(prim) | 1 compilation, no gate | — |
| Conditional | Bounded: 2-3 branches, each with symbolic condition | N compilations | the original conditional |
| Bounded universe | Bounded: one branch per universe value, Eq-gated | N compilations | var.x == "v", one per value |
| Unbounded | Unbounded (with a human-readable reason) | none; blocking stack issue with a concrete fix | — |
Phi tracing has enough surface area to fill a small book. A few adjacent topics we skipped in this post, which we may come back to later:
Branch coalescing. When different paths produce the same value, the branches merge:
var.env == "prod" ? "m5.xlarge"
    : (var.region == "us-east-1" ? "m5.xlarge" : "t3.small")
Three branches become two modules, not three. The algorithm for turning a
conjunctive-branch forest into a coalesced DNF is worth its own post.
Address stability. Specialization changes resource addresses: aws_db_instance.app becomes
module.db_v15_4.aws_db_instance.app. When the compiler itself
changes (different specialization strategy, different sub-module structure),
addresses can shift across compiler versions. The compiler emits
moved {} blocks with each output so the first plan after upgrade
is a no-op, but that handles the easy case. A customer who skipped three
Tensor9 versions, a vendor whose stack hits a renamed sub-module path, a
universe element removed between compiler versions: each of these is its own
design decision. Worth a post on its own.
Universe inference from validation {} blocks
(contains(["s","m","l"], var.size) and friends). The lifter recognizes
these and feeds the constraint into the tracer as an implicit universe, which is a
nice way to let the HCL itself drive specialization without the service compiler having
to enumerate valid values. The implementation details get into AST traversal and
constraint inference that didn't fit here.
Binning for continuous attributes. The compiler bins a continuous value (a
memory size, say) into a small set (small, medium, large) and specializes per bin, with a stack
warning to the vendor that fine-grained tuning requires making the attribute known
at compile time. The interesting design question is where the bin boundaries come
from — service-compiler-provided, inferred from resource pricing tiers, or
vendor-configurable.
Eryn's post explained what form factors are — the contract that pins down where a vendor's software is allowed to run. For that contract to hold, the compiler has to perform service replacements correctly for each form factor. Service replacements depend on specific values of resource fields — engine versions, instance classes, memory sizes. Real vendor stacks don't hand the compiler primitives; they hand the compiler expressions.
Phi tracing is how we bridge that gap. Backward data flow analysis over the STIR graph, producing one of three trace results (Resolved / Bounded / Unbounded) with symbolic condition trees on each branch. When bounded, the specializer calls the service compiler's existing compile logic once per branch value and wraps each output in a count-gated module. The plan-time-knowability invariant falls out of only tracing through variables, locals, and conditionals, never through resource attributes. The tracer is read-only; the specializer owns graph mutation; condition materialization is the narrow bridge between them.
The effect, from the outside: a service compiler that used to require primitives now handles parameterized stacks without changes to its core compile logic. It wraps its existing compile function in a phi-aware entry point. The phi-tracing system does the rest.
Next, maybe I'll take a deeper look at how the compiler decides which target service to reach for in the first place, and how that choice plays with the version-picking mechanism described here.
— mtp