[WIP] Ephemeral Values prototype #35078

apparentlymart · 2024-04-24T19:35:26Z

This is another attempt at introducing to Terraform the idea of objects and values being "ephemeral", which means something like "lives only for the duration of one Terraform phase".

Terraform already has at least two concepts that meet this definition, despite us not previously naming it:

Provider configurations (provider blocks): Terraform re-evaluates the arguments in a provider block separately during the plan and apply phases, and doesn't mind if the configuration is different between the two as long as the apply-time configuration allows performing the actions that were proposed during the plan phase.
Provisioners (provisioner and connection blocks): Terraform fully evaluates these only during the apply phase, so they aren't really considered during the plan phase at all, aside from basic static validation.

However, because the idea of "ephemeral" is not available in the rest of the language, it's tough to actually benefit from this ephemerality. This prototype aims to introduce "ephemeral" as a cross-cutting concern supported broadly across the language.

Ephemeral Values

The most fundamental idea is that values used in expressions can either be ephemeral or non-ephemeral. This is an idea similar to "sensitive" in that Terraform will perform dynamic analysis such that any value derived from an ephemeral value is itself ephemeral. Ephemeral values can then be used only in parts of the language which would not require persisting the value either between the plan phase and the apply phase, or from one plan/apply round to the next.

Considering only pre-existing language features, ephemeral values can be freely used in provider blocks, provisioner blocks, connection blocks, and in local values. The following sections describe some new additions that either accept or produce ephemeral values.

resource blocks (aside from special nested parts like the aforementioned provisioner blocks) do not accept ephemeral values, because preserving resource configuration unchanged between the plan and apply phases is a fundamental part of how Terraform works to keep its promise of either doing what the plan described or returning an error explaining why that's not possible.
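As a rough illustration of those rules, here is a sketch using only built-in features: the `terraform_data` resource type and the `remote-exec` provisioner. The `ephemeral = true` variable argument is the one used in the examples later in this PR; the variable name, host address, and user name are placeholders, and the exact error reported for the commented-out argument isn't specified in this description.

```hcl
terraform {
  experiments = [ephemeral_values]
}

variable "ssh_password" {
  type      = string
  ephemeral = true
  sensitive = true
}

locals {
  # Derived from an ephemeral value, so this local value is ephemeral too.
  login = "deploy:${var.ssh_password}"
}

resource "terraform_data" "bootstrap" {
  # A plain resource argument must be identical at plan and apply time,
  # so an ephemeral value would be rejected here:
  # input = local.login

  # connection and provisioner blocks are fully evaluated only during
  # apply, so ephemeral values are acceptable inside them.
  connection {
    type     = "ssh"
    host     = "192.0.2.10" # placeholder address
    user     = "deploy"     # placeholder user name
    password = var.ssh_password
  }

  provisioner "remote-exec" {
    inline = ["echo connected"]
  }
}
```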

Because ephemeral values are not expected to persist from plan to apply or between plan/apply rounds, there is no need to save them in saved plan files or state snapshots, thus finally giving a plausible answer for what to do about #516, which has been on my mind since long before I worked at HashiCorp.

Ephemeral Input Variables

An ephemeral input variable is, in the most general terms, just an input variable that is declared as accepting ephemeral values. A non-ephemeral input variable cannot accept ephemeral values, while an ephemeral input variable will accept both ephemeral and non-ephemeral values, but the value will always be treated as ephemeral when used inside the declaring module.

The main interesting case is when a root module declares an ephemeral input variable. In that case, Terraform will no longer remember the value for the variable provided during planning and will instead expect any ephemeral variable set during the plan step to be provided again -- possibly with a different value -- during the apply step.

The primary goal of this is to be able to use input variables to set arguments in ephemeral contexts. For example, an input variable that's both ephemeral and sensitive could provide a JSON Web Token to be used when configuring a specific provider, and then automation around Terraform could provide separate JSON Web Tokens across the plan and apply phases so that the apply phase isn't subject to the expiration time for the plan-time JWT, and so that the plan-time JWT doesn't get persisted to disk as part of a saved plan.
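A minimal sketch of that JWT use-case, assuming a hypothetical `example` provider with a hypothetical `token` argument (neither is a real plugin or a real argument name):

```hcl
terraform {
  experiments = [ephemeral_values]
}

# ephemeral: not saved in the plan file, and must be supplied again at apply.
# sensitive: redacted in Terraform's UI output.
variable "api_jwt" {
  type      = string
  ephemeral = true
  sensitive = true
}

# Hypothetical provider and argument name, for illustration only.
provider "example" {
  token = var.api_jwt
}
```

Automation wrapping Terraform could then issue one token for the plan phase and a fresh one for the apply phase, for example by passing a different `-var-file` to each command.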

Ephemeral Output Values

An ephemeral output value is essentially the opposite of an ephemeral input variable, allowing a module to expose an ephemeral value to its caller. As with input variables, a non-ephemeral output value will reject having an ephemeral value assigned to it. An ephemeral output value can have both ephemeral and non-ephemeral values assigned to it, but the calling module will always see it as ephemeral.

To start, the utility of this is limited to echoing back values derived from ephemeral input variables, since nothing else I've described so far actually produces ephemeral values. However, allowing this is important to ensure that ephemeral values are supported symmetrically and will cooperate well with all other language features.
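A sketch of how that echoing might look inside a child module. The `ephemeral = true` argument on the `output` block is an assumption made by analogy with the variable syntax used elsewhere in this PR, since this description doesn't show the output syntax directly; the module path and names are hypothetical.

```hcl
# modules/token/main.tf (hypothetical child module)

terraform {
  experiments = [ephemeral_values]
}

variable "raw_token" {
  type      = string
  ephemeral = true
}

# Echoes a value derived from the ephemeral input back to the caller.
# The calling module always sees this output as ephemeral.
output "auth_header" {
  value     = "Bearer ${var.raw_token}"
  ephemeral = true # assumed argument name
}
```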

Ephemeral Resources

The final idea in this prototype -- one which this prototype probably won't explore fully just yet, introducing only enough to validate that it fits in well with everything else -- is a new resource mode for representing remote objects that are ephemeral themselves.

Terraform currently has two "resource modes": managed resources (resource blocks) describe objects that Terraform is directly managing, while data resources (data blocks) describe objects that are managed elsewhere that the current configuration depends on. But in both cases the assumption is that those objects persist in some sense from plan to apply and from one plan/apply round to the next, and that Terraform is supposed to detect and react to any changes to those objects and therefore needs to persist information about them itself.

Ephemeral resources (ephemeral blocks), on the other hand, represent objects that -- at least, as far as Terraform is concerned -- exist only briefly during a single Terraform phase, and then get cleaned up once the phase is complete. This idea is an evolution of some much earlier design work I did before I even worked at HashiCorp 😀 in relation to #8367, which was about establishing temporary SSH tunnels, and the HashiCorp Vault provider I wrote in #9158 (which evolved into today's official hashicorp/vault).

The general idea of ephemeral resources, then, is that their lifecycle includes three events:

OpenEphemeral: Prepares the object for use. For some kinds of objects this would represent a "create" action, but for others it might just open a temporary session to something that already exists, such as in the SSH tunnel use-case.

This operation is the one that establishes the result attributes that can be accessed from other parts of the module where the resource is declared. All of these results would be ephemeral values, so that they can vary from plan to apply. For example, opening an SSH tunnel is likely to cause a different local TCP port number to be allocated each time, and so consistency between plan and apply phases is not expected.
RenewEphemeral: Some ephemeral remote objects need to be periodically refreshed in order to stay "live", such as leases for Vault secrets.

This optional operation is therefore opted into by the provider's OpenEphemeral response, by providing a private set of data that should be sent back to the provider's RenewEphemeral implementation and a deadline before which Terraform must renew it. The provider can then do whatever is needed to keep the object from expiring, and optionally return another renew request with a new deadline in order to repeat this renewal process.
CloseEphemeral: Once Terraform has completed work for all objects that refer to the ephemeral resource, this operation gives the provider an explicit signal that the object is no longer required so that it can be promptly destroyed or invalidated.

This detail is particularly helpful for the Vault provider and fixes a limitation I ran into immediately back in 2016: a dynamic secret fetched using a data block can never have its lease explicitly terminated, because data resources were intended only to read information about an object someone else is managing, not to directly manage an object (a Vault lease).

Because the results from ephemeral resources are ephemeral values, they're primarily useful in configuration for other ephemeral objects: provider blocks, provisioner/connection blocks, and of course other ephemeral blocks.
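To make the three lifecycle events concrete, here is a sketch of an ephemeral block feeding a provider configuration. The `secrets_lease` resource type, its argument and attributes, and the `database` provider are all hypothetical; the `ephemeral` block syntax and the `ephemeral.<type>.<name>` reference form follow the SSH tunnel example later in this PR.

```hcl
terraform {
  experiments = [ephemeral_values]
}

# Hypothetical ephemeral resource type.
# OpenEphemeral would run before anything that refers to this block, and
# CloseEphemeral once everything that refers to it has finished its work.
# If the provider asked for renewal, RenewEphemeral would run in between.
ephemeral "secrets_lease" "db" {
  role = "readonly"
}

# Hypothetical provider. Its configuration is re-evaluated in each phase,
# so it can consume the ephemeral result attributes.
provider "database" {
  username = ephemeral.secrets_lease.db.username
  password = ephemeral.secrets_lease.db.password
}
```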

Actually changing the provider protocol and implementing real providers is not in scope for my initial prototyping work here, and so I intend to prototype this in a more limited way that just emulates how this mechanism might behave, so we can see how well it interacts with the rest of the language and the other ephemeral values discussed here.

I've also been considering a mechanism to allow managed resource types to declare individual arguments as being "write-only", such as for an RDS database password that only needs to be provided during creation and should not be provided again unless the operator actually intends to reset it. I don't intend to prototype that in here, but I intend to lay the foundations for it by having a convention that ephemeral input values and write-only arguments both treat null as meaning "don't set or change" and non-null as "set or change", thereby creating a small imperative-shaped niche in the otherwise-declarative Terraform Language to allow for using Terraform to manage objects that have write-only (typically, sensitive) arguments without needing to persist them in plan and state.
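A purely hypothetical sketch of that convention follows. Write-only arguments are not part of this prototype, so the `example_database` resource type and its `password` argument are invented for illustration, and it's an assumption that an ephemeral variable can declare a `null` default.

```hcl
variable "new_db_password" {
  type      = string
  ephemeral = true
  sensitive = true
  default   = null # null means "leave the current password alone"
}

# Hypothetical resource type with a hypothetical write-only argument.
resource "example_database" "main" {
  name = "app"

  # null     -> don't set or change the password
  # non-null -> set or reset the password to this value
  password = var.new_db_password
}
```

Under this convention an operator would supply `new_db_password` only in the plan/apply round where they intend to set or reset the password.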

I'm still working on this, so not everything described above is in here yet, but the foundations for ephemeral values themselves are already in. I've opened this draft largely just because I need to put this work down for a while for a team offsite and don't want to lose the context.

For now these graph nodes don't actually do anything, but the graph shape is at least plausible for what we'll need.

We now need to clean up any straggling ephemeral resource instances before we complete each graph walk, and ephemeral resource instances are ultimately owned by the graph walker, so the graph walker now has a Close method that's responsible for cleaning up anything that the walker owns which needs to be explicitly closed at the end of a walk.

Because ephemeralResourceCloseTransformer runs very late in the transform sequence, it's too late to get provider open and close nodes associated with it automatically. We don't actually need to worry about the provider _open_ dependency because our close node always depends on all of our open nodes and they will in turn depend on the provider open they need. But for close we need to delay closing the provider until all of the associated ephemeral resources have been closed, so we need to do a little fixup: if any of a particular ephemeral resource's open nodes have provider close nodes depending on them, those provider close nodes should also depend on the ephemeral resource close node. That then describes that the provider should remain open for as long as at least one ephemeral resource instance owned by that provider remains live, which makes it okay for us to do our periodic background renew requests and our final close requests.

…alysis Previously we had a special interface graphNodeEphemeralResourceConsumer and a helper for implementing it in terms of GraphNodeReferencer, but for the moment we'll just use GraphNodeReferencer directly with that helper because that gives us broad coverage across many node types without having to make such sprawling changes just to support a prototype. The separated interface design might return later if we discover a need for a node to report that it uses an ephemeral resource without actually including any expression references for it, but we'll wait to see if that additional complexity is actually needed.

Ephemeral resources work quite differently than managed or data resources in that their instances live only in memory and are never persisted, and in that we need to handle the possibility of the object having become invalid by the time we're evaluating a reference expression. Since we're just prototyping ephemeral resources for now, this works as a totally separate codepath in the evaluator. The resource reference handling in the evaluator is long overdue for being reworked so that it doesn't depend so directly on the implementation details of how we keep track of resources, and the new ephemeral codepath is perhaps a simplified example of what that might look like in future, but for now it's used only for ephemeral resources to limit the invasiveness of this prototype.

I'm honestly not really sure yet how to explain _why_ ephemeral resource nodes are getting pruned when they shouldn't; for the sake of prototyping this is just a hard-coded special exception to just not consider them at all in the pruneUnusedNodesTransformer. The later ephemeralResourceCloseTransformer has its own logic for deciding that an ephemeral resource isn't actually needed in the current graph and pruning both their open and close nodes, so these will still get pruned but it will happen in different circumstances and based on a later form of the graph with more nodes and edges already present, thus preventing some cases of ephemeral resources being pruned when they shouldn't be.

The modules runtime should always use a different strategy to keep track of live ephemeral resource instances, and should never persist them in the plan or state. These checks are here just to reduce the risk that a bug in the modules runtime could inadvertently result in an ephemeral resource instance being persisted. This is a bit of a "defense-in-depth" strategy, because the state and plan types all have most of their fields exported and so we can't be sure that all modifications will go through the mutation methods.

This is just enough to skip writing and reading ephemeral resources and their instances in the plan and state, so that we can reach the code that manages them in their own separate data structure. This relies on the new idea of some resource modes not being persisted between rounds and not being persisted from plan to apply, although for now EphemeralResourceMode is the only mode that doesn't do both of those things.

To support ephemeral values we need a more complicated set of rules about what input variables can and must be set when applying a saved plan. The command does not have enough information to implement those rules itself, so we'll let the variable values pass through and check them in the local backend's apply phase instead. The local backend's apply phase already has basic support for dealing with apply-time variable values, so this just removes the blocker that was preventing values from reaching that logic.

We need to remember which ephemeral values were set during planning so that we can require them to be set again (possibly to a different value) during the apply step.

apparentlymart · 2024-05-10T16:07:27Z

As of this comment there's enough in this branch to demonstrate using ephemeral input variables to meet the use-case of certain values needing to change between plan and apply, as is often necessary when using time-limited transient credentials.

In today's Terraform we avoid dealing with that by encouraging providers to look for such information "ambiently", such as by using environment variables or configuration files discovered in the user's home directory. That remains a good, pragmatic answer for simple cases -- and has the potential benefit of integrating well with other software for the same platform running on the same computer -- but it doesn't work so well when things get more complicated, such as when there are two configurations for the same provider that each need different credentials.

Here's a configuration I used to exercise this:

terraform {
  experiments = [ephemeral_values]

  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

variable "aws_assume_role" {
  type      = object({
    role_arn       = string
    session_name   = optional(string)
    external_id    = optional(string)
    identity_token = optional(string)
  })
  ephemeral = true
  sensitive = true

  validation {
    condition = var.aws_assume_role.identity_token != null || var.aws_assume_role.external_id != null
    error_message = "Must set either identity_token or external_id."
  }

  validation {
    condition = var.aws_assume_role.identity_token == null || var.aws_assume_role.external_id == null
    error_message = "Must set only one of identity_token or external_id."
  }
}

variable "aws_region" {
  type = string
}

provider "aws" {
  region = var.aws_region

  dynamic "assume_role" {
    for_each = [
      for ar in var.aws_assume_role[*] : ar
      if ar.external_id != null
    ]
    content {
      role_arn     = assume_role.value.role_arn
      session_name = assume_role.value.session_name
      external_id  = assume_role.value.external_id
    }
  }

  dynamic "assume_role_with_web_identity" {
    for_each = [
      for ar in var.aws_assume_role[*] : ar
      if ar.identity_token != null
    ]
    content {
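      # The iterator symbol defaults to the dynamic block's own label.
      role_arn           = assume_role_with_web_identity.value.role_arn
      session_name       = assume_role_with_web_identity.value.session_name
      web_identity_token = assume_role_with_web_identity.value.identity_token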
    }
  }
}

data "aws_caller_identity" "current" {
}

output "caller_identity_arn" {
  value = data.aws_caller_identity.current.arn
}

The aws_assume_role variable is declared as ephemeral = true, which means:

Its value is not persisted as part of the plan.
If it's set during the planning phase then it must be provided again during the apply phase, possibly with a different value.
The expression var.aws_assume_role in this module produces a value that's marked as being ephemeral, which blocks it from being used in non-ephemeral situations such as resource configurations. A provider configuration (like the provider "aws" block in this example) is ephemeral -- it's re-configured separately from scratch for each plan or apply -- and so it's valid to use var.aws_assume_role in there.

I exercised this using the following sequence of commands, where plan.tfvars and apply.tfvars both contain different values for aws_assume_role where the former has only read access while the latter has full access:

terraform plan -var-file=plan.tfvars -out=tfplan
terraform apply -var-file=apply.tfvars tfplan

In current Terraform the second command would be invalid because we disallow using -var-file in conjunction with a saved plan file. That rule is weakened in this branch so that it can be the local backend which deals with that option, and it does so using a more nuanced set of rules that makes sure that non-ephemeral variables stay set to what they were planned to be, while also making sure that ephemeral variables are provided again during the apply phase.
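For illustration, the two variable files might look something like the following. The account ID, role names, external IDs, and region are placeholders, not the values actually used in this test.

```hcl
# plan.tfvars: a role with read-only access, used while planning
aws_region = "us-west-2"

aws_assume_role = {
  role_arn    = "arn:aws:iam::123456789012:role/terraform-plan-readonly"
  external_id = "plan-external-id"
}
```

```hcl
# apply.tfvars: a role with full access, used while applying
# aws_region is not ephemeral, so it is kept identical to the planned value.
aws_region = "us-west-2"

aws_assume_role = {
  role_arn    = "arn:aws:iam::123456789012:role/terraform-apply-full"
  external_id = "apply-external-id"
}
```

Only `aws_assume_role` differs between the two files, which matches the rule that ephemeral variables may change between phases while non-ephemeral ones may not.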

This example also illustrates an interesting design wrinkle, which isn't a totally new problem but perhaps "feels" more sinister in a world with explicit support for ephemeral values:

The aws_caller_identity data source in hashicorp/aws returns a result that varies based on how the provider is configured. Since I'm intentionally switching to a role with different permissions in the apply step, that data source would return a different result depending on whether it were read during the plan phase or during the apply phase.

That is already true today if someone achieves this same result by resetting the environment variables the AWS provider responds to, but I think it's still worth noting that this interesting seam exists, and might prove confusing for some uses.

This situation doesn't break Terraform, because if Terraform reads the data source during the planning phase then it will use that same result during the apply phase without checking it again. Terraform effectively "locks in" a particular data source result for each plan/apply round.

But it does seem that if we move forward with a change like this, which would promote intentionally varying credentials between plan and apply, we should explain the implications carefully in all relevant documentation. Although I was just using two different roles in the same AWS account here, it's technically possible for the credentials provided during apply to be for an entirely different AWS account, which would cause a very confusing result.

(I wonder if there's something in letting providers return some data from the ConfigureProvider call made during planning that Terraform Core sends back to the provider for ConfigureProvider during the apply phase, so that the provider can "remember" how it was configured and return an error if the configuration has changed to such an extent that the plan would've been invalidated. But that's a separate idea for another time.)

Instead of a test for whether the type name is different than the one we expect, we'll use a switch statement. This does nothing for now, but a future commit will add a new ephemeral resource type that's intended only for prototyping, exploiting the fact that this particular provider can offer ephemeral resource types without us first extending the provider plugin protocol with that concept.

… type This is here only for the purposes of prototyping ephemeral resources. If we move forward with a "real" implementation then something like this would be better placed in a separate SSH provider, rather than built into Terraform CLI itself. This is just a basic implementation to get started with. It's probably not very robust and will probably need fixes and additions in future commits.

When a provider configuration is using an ephemeral resource, we need the closure of the resource instances to depend on the closure of the provider instance because otherwise we'll leave the ephemeral resource instance live only long enough to configure the provider, and that's useless for taking any other actions with the provider after it's been configured.

apparentlymart · 2024-05-11T00:18:20Z

As of this comment there's enough to use an ephemeral resource instance to open an SSH tunnel, though for now I did it just using a temporary ephemeral resource type I bolted into the built-in Terraform provider because there's no plugin protocol support for ephemeral resources yet. (In practice I would expect this to belong to a separate provider plugin focused on SSH.)

For example, I just applied the following configuration that configures the hashicorp/consul provider to access a Consul server using an SSH tunnel:

terraform {
  experiments = [ ephemeral_values ]

  required_providers {
    terraform = {
      source = "terraform.io/builtin/terraform"
    }
    consul = {
      source = "hashicorp/consul"
    }
  }
}

ephemeral "terraform_ssh_tunnels" "foo" {
  server = "127.0.0.1:2222"
  username = "not really my username"

  auth_methods = [
    { password = "not really my password" },
  ]

  tcp_local_to_remote "consul" {
    remote = "127.0.0.1:8500"
  }
}

provider "consul" {
  address = ephemeral.terraform_ssh_tunnels.foo.tcp_to_remote["consul"].local
}

data "consul_keys" "app" {
  key {
    name    = "thingy"
    path    = "thingy"
  }
}

output "result" {
  value     = data.consul_keys.app.var.thingy
  sensitive = true
}

With this configuration:

Terraform notices that provider "consul" depends on ephemeral.terraform_ssh_tunnels.foo, and so it opens the SSH tunnel before configuring the provider and then waits until the provider has done all of its work before closing the SSH tunnel.
The Consul provider's address argument is set to an ephemeral string containing something like 127.0.0.1:1234 where 1234 is a randomly-selected port number representing the local end of the tunnel.
When the Consul provider connects to 127.0.0.1:1234, the terraform_ssh_tunnels implementation accepts the connection, then connects to 127.0.0.1:8500 on the remote side, and then forwards bytes back and forth between the two as a proxy.

Here's the plan graph for this configuration, showing the above diagrammatically:

(You should read this graph bottom-to-top, because it's using Terraform's old-style graph rendering with the "root" -- the final node visited -- at the top.)

This achieves the use-case described in #8367.

apparentlymart added enhancement config labels Apr 24, 2024

apparentlymart self-assigned this Apr 24, 2024

apparentlymart mentioned this pull request Apr 24, 2024

[WIP] Ephemeral Values prototype #35077

Closed

apparentlymart mentioned this pull request Apr 29, 2024

A way to refresh provider credentials #29182

Open

apparentlymart added 10 commits May 9, 2024 09:42

terraform: Graph nodes for closing ephemeral resource instances

ce438c4

resources/ephemeral: A place to track ephemeral resource instances

d6f04d1

terraform: Plumb in an ephemeral.Resources object for each graph walk

8b98ccd

terraform: Open and close ephemeral resource instances during graph walk

4dedb78

apparentlymart added 2 commits May 10, 2024 08:25

terraform: Propagate apply-time variables from plan to apply

23ebf2d

terraform: Better error message for inconsistent ephemeral resource result

0a1b9d3
