Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drift in cloudflare_ruleset for http_request_firewall_custom phase for computed args id, last_updated, ref, version #2690

Open
2 tasks done
nitrocode opened this issue Aug 16, 2023 · 37 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@nitrocode
Copy link

nitrocode commented Aug 16, 2023

Confirmation

  • My issue isn't already found on the issue tracker.
  • I have replicated my issue using the latest version of the provider and it is still present.

Terraform and Cloudflare provider version

Terraform 1.5.5
Cloudflare provider 4.12.0 latest

Affected resource(s)

  • cloudflare_ruleset

Terraform configuration files

resource "cloudflare_ruleset" "custom" {
  kind    = "zone"
  name    = "default"
  phase   = "http_request_firewall_custom"
  zone_id = local.zone_id

  # ... 30 rules ...

  # e.g. pre-existing rule
  rules {
    enabled     = true
    action      = "log"
    description = "<snip>"
    expression  = <<-EOT
              <snip>
            EOT
  }

  # added additional rule which caused a drift. could be any rule
  rules {
    enabled     = true
    action      = "log"
    description = "<snip>"
    expression  = <<-EOT
              any(http.request.headers["custom-header"][*] matches ".?")
            EOT
  }

Link to debug output

none

Panic output

N/A

Expected output

No perma drift

Actual output

Perma drift

Steps to reproduce

  1. Add a new rule or modify an existing one
  2. terraform plan

Additional factoids

This happened after we bumped the provider from 3.x to 4.x

We deleted the cloudflare_ruleset and reimported.

If we do not make a change, we see No changes as expected. Any change we make, and we see a bunch of drift between rules, all affecting arguments that we do not set such as id and version

We have 30 something rules and we see these computed arguments changing for every rule.

in 4.12.0 cloudflare provider

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # cloudflare_ruleset.custom will be updated in-place
  ~ resource "cloudflare_ruleset" "custom" {
        id      = "<snip>"
        name    = "default"
        # (3 unchanged attributes hidden)

      ~ rules {
          ~ id           = "<snip>" -> (known after apply)
          ~ last_updated = "2023-08-18 16:36:15.176813 +0000 UTC" -> (known after apply)
          ~ ref          = "<snip>" -> (known after apply)
          ~ version      = "3" -> (known after apply)
            # (4 unchanged attributes hidden)
        }
      ~ rules {
          ~ id           = "<snip>" -> (known after apply)
          ~ last_updated = "2023-08-18 16:36:15.176813 +0000 UTC" -> (known after apply)
          ~ ref          = "<snip>" -> (known after apply)
          ~ version      = "15" -> (known after apply)
            # (4 unchanged attributes hidden)

          ~ action_parameters {
              + version  = (known after apply)
                # (3 unchanged attributes hidden)
            }

            # (1 unchanged block hidden)
        }
      + rules {
          + action       = "log"
          + description  = "log traffic with headers"
          + enabled      = true
          + expression   = <<-EOT
                any(http.request.headers["custom-header"][*] matches ".?")
            EOT
          + id           = (known after apply)
          + last_updated = (known after apply)
          + ref          = (known after apply)
          + version      = (known after apply)
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.

In 4.7.1 cloudflare provider

      ~ rules {
          ~ id           = "<snip>" -> (known after apply)
          + last_updated = (known after apply)
          ~ version      = "3" -> (known after apply)
            # (4 unchanged attributes hidden)
        }
      ~ rules {
          ~ id           = "<snip>" -> (known after apply)
          + last_updated = (known after apply)
          ~ version      = "3" -> (known after apply)
            # (4 unchanged attributes hidden)

          ~ action_parameters {
              + version  = (known after apply)
                # (1 unchanged attribute hidden)
            }

            # (1 unchanged block hidden)
        }
      + rules {
          + action       = "log"
          + description  = "log traffic with headers"
          + enabled      = true
          + expression   = <<-EOT
                any(http.request.headers["custom-header"][*] matches ".?")
            EOT
          + id           = (known after apply)
          + last_updated = (known after apply)
          + ref          = (known after apply)
          + version      = (known after apply)
        }

We also cannot set a lifecycle ignore rule because this does not work for us

  lifecycle {
    ignore_changes = [
      "rules.*.id",
      "rules.*.version",
      "rules.*.last_updated",
      "rules.*.ref",
      # "rules[*].id",
      # "rules.id",
    ]
  }

I'm told this might work but it's not scaleable. We have over 30 rules. I tried this and still see the drift!

  lifecycle {
    ignore_changes = [
      rules[0].id,
      rules[0].version,
      rules[0].last_updated,
      rules[0].ref,
      rules[1].id,
      rules[1].version,
      rules[1].last_updated,
      rules[1].ref,
      rules[2].id,
      rules[2].version,
      rules[2].last_updated,
      rules[2].ref,
      rules[3].id,
      rules[3].version,
      rules[3].last_updated,
      rules[3].ref,
      # ...
      rules[n].id,
      rules[n].version,
      rules[n].last_updated,
      rules[n].ref,
    ]
  }

We did not see this issue in 3.x cloudflare provider

I don't see any DiffSuppressFunc for any of the above drifted computed arguments. Unsure if the drift needs to be suppressed or if it needs to be investigated.

https://github.com/cloudflare/terraform-provider-cloudflare/blob/adbc0accaafe0df39e78fcfd61d092573e13a25d/internal/framework/service/rulesets/resource.go

References

Related issues

Related pr

From change log

BREAKING CHANGES:

@nitrocode nitrocode added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 16, 2023
@github-actions
Copy link
Contributor

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions
Copy link
Contributor

Thank you for reporting this issue! For maintainers to dig into issues it is required that all issues include the entirety of TF_LOG=DEBUG output to be provided. The only parts that should be redacted are your user credentials in the X-Auth-Key, X-Auth-Email and Authorization HTTP headers. Details such as zone or account identifiers are not considered sensitive but can be redacted if you are very cautious. This log file provides additional context from Terraform, the provider and the Cloudflare API that helps in debugging issues. Without it, maintainers are very limited in what they can do and may hamper diagnosis efforts.

This issue has been marked with triage/needs-information and is unlikely to receive maintainer attention until the log file is provided making this a complete bug report.

@github-actions github-actions bot added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Aug 16, 2023
@jessegonzalez-life360
Copy link

jessegonzalez-life360 commented Aug 19, 2023

I can provide this, but I'm not comfortable posting that here in a public forum. I'm an enterprise customer, can I create a case?

Rules, IPs, etc. would take a long while to redact.

Log uploaded to case id: #2901904

@nitrocode nitrocode changed the title Drift in cloudflare_ruleset for http_request_firewall_custom phase Drift in cloudflare_ruleset for http_request_firewall_custom phase for computed args id, last_updated, ref, version Aug 19, 2023
@jessegonzalez-life360
Copy link

Thanks for the lifecycle hint. It can help with perma drift until a proper solution is identified.

@github-actions
Copy link
Contributor

Marking this issue as stale due to 30 days of inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed. Maintainers can also remove the lifecycle/stale label.
If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

@nitrocode

This comment was marked as spam.

@nitrocode

This comment was marked as spam.

@DDuarte
Copy link

DDuarte commented Oct 10, 2023

Confirming this bug as well, any time there's a change to 1 rule, all the other rules in the ruleset get their mentioned fields updated in diff

image

@nitrocode

This comment was marked as spam.

@jessegonzalez-life360
Copy link

jessegonzalez-life360 commented Nov 1, 2023

@DDuarte and @nitrocode if you define a unique ref for the rules block, you should no longer see permanent drift.

Edit: Spoke too soon, the issue persists.

@gabrielqs
Copy link

+1 dealing with this issue.
given that we're migrating from multiple deprecated cloudflare_firewall_rules into a single new cloudflare_ruleset, this makes the plan outputs become really hard to read. currently using a list of 300 attributes being ignored by lifecycle rules 😅

@vojkny
Copy link

vojkny commented Nov 27, 2023

I tried to fix this something like:

    lifecycle {
      ignore_changes = ["rules[*].ref", "rules[*].last_updated"]
    }

However there is nothing like [*] in Terraform syntax, so this couldn't work.

@nitrocode
Copy link
Author

@gabrielqs could you share what you use? I tried setting the ignore_changes and I could not get it to work.

Copy link
Contributor

Marking this issue as stale due to 30 days of inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed. Maintainers can also remove the lifecycle/stale label.
If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

@gabrielqs
Copy link

This is still a problem

@gabrielqs
Copy link

@nitrocode sorry, only seeing your message now. here's what we had to do to get around this:

  lifecycle {
    ignore_changes = [
      rules[0].id,
      rules[0].version,
      rules[0].last_updated,
      rules[0].ref,
      rules[0].action_parameters[0].version,
      rules[1].id,
      rules[1].version,
      rules[1].last_updated,
      rules[1].ref,
      rules[1].action_parameters[0].version,
      rules[2].id,
      rules[2].version,
      rules[2].last_updated,
      rules[2].ref,
      rules[2].action_parameters[0].version,
      ...

It is terrible, and we need to remind ourselves to add this every time we add a new rule, but at least the drift doesn't show up on every plan

@nitrocode
Copy link
Author

I'm happy it works for you but that doesn't work for us. We still see the drift even with explicitly specifying each individual attributes per rule in ignore_changes.

Since these rules are all in a single terraform resource and we cannot get the plan in a reviewable state, we may have to consider moving these rules out of terraform and having them managed with a script that supplants all the rules upon pr merge.

cc @jacobbednarz friendly ping to get cloudflare's feedback here. This is the 2nd highest voted issue and ideally we should be able to manage and review changes within terraform without so much noise.

@troymjones
Copy link
Contributor

I am providing a full TF_LOG=DEBUG for 2 scenarios, both of which are problematic when working with the Cloudflare ruleset resource. Please note that while there is a workaround (manually verifying them), this is a major issue for my team with a hundred rules in a zone and could result in a bad rule change making its way into the WAF.

Scenarios:

  1. A rule is added in the middle of existing rules. Note that it looks like rules are drastically changing and a 'new' rule is being added to the bottom of the list. However, that 'new' rule is just an existing rule at the end. The rules are correctly being shown, but it is extremely confusing. When there are more than a couple rules, it becomes almost impossible to parse through the changes to make sure no mistakes were made. Also, all rules show changes to id, ref, version, etc, further complicating the issue. Link to gist

  2. A rule is modified (ie name, expression, etc) but no other changes are made. This results in the plan showing all rules having changed. Again, if there are more than a couple rules, it makes it almost impossible to verify that the changes desired are what is shown in the plan. Link to gist

@nitrocode

This comment was marked as spam.

@saintsinnerr

This comment has been minimized.

jacobbednarz added a commit that referenced this issue Jan 31, 2024
Within the rulesets schema, we have a handful of fields that are `Computed`.
The expectation of this directive is that it is a value not known at run time
and _may_ be provided by the remote. However, this premise will break down and
not work correctly if the value is ever provided to the plan since the value is
no longer "unknown". This subtle bug is one part of what is contributing to the
additional output toil in #2690. To address this, I've removed the lines that
modified these values and were being supplied to the plan operation.

This takes the confusing and bloated output from looking like this:

```
Terraform will perform the following actions:

  # cloudflare_ruleset.rate_limiting_example will be updated in-place
  ~ resource "cloudflare_ruleset" "rate_limiting_example" {
        id          = "87e1099f077f4e49bbfbe159217ff605"
        name        = "restrict API requests count"
        # (4 unchanged attributes hidden)

      ~ rules {
          ~ id           = "e3b97fdf354c4c24a7ac091c3537fbc5" -> (known after apply)
          ~ last_updated = "2024-01-31 04:02:55.651275 +0000 UTC" -> (known after apply)
          ~ ref          = "e3b97fdf354c4c24a7ac091c3537fbc5" -> (known after apply)
          ~ version      = "1" -> (known after apply)
            # (4 unchanged attributes hidden)

            # (1 unchanged block hidden)
        }
      + rules {
          + action       = "block"
          + description  = "rate limit for API"
          + enabled      = true
          + expression   = "(http.request.uri.path matches \"^/api0/\")"
          + id           = (known after apply)
          + last_updated = (known after apply)
          + ref          = (known after apply)
          + version      = (known after apply)

          + ratelimit {
              + characteristics     = [
                  + "cf.colo.id",
                  + "ip.src",
                ]
              + mitigation_timeout  = 600
              + period              = 60
              + requests_per_period = 100
              + requests_to_origin  = false
            }
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.
```

To a clearer and more concise output:

```
Terraform will perform the following actions:

  # cloudflare_ruleset.rate_limiting_example will be updated in-place
  ~ resource "cloudflare_ruleset" "rate_limiting_example" {
        id          = "87e1099f077f4e49bbfbe159217ff605"
        name        = "restrict API requests count"
        # (4 unchanged attributes hidden)

      ~ rules {
            id           = "39ed6015367444e3905d838cadc1b9c2"
          + last_updated = (known after apply)
          + version      = (known after apply)
            # (5 unchanged attributes hidden)

            # (1 unchanged block hidden)
        }
      + rules {
          + action       = "block"
          + description  = "rate limit for API"
          + enabled      = true
          + expression   = "(http.request.uri.path matches \"^/api0/\")"
          + id           = (known after apply)
          + last_updated = (known after apply)
          + ref          = (known after apply)
          + version      = (known after apply)

          + ratelimit {
              + characteristics     = [
                  + "cf.colo.id",
                  + "ip.src",
                ]
              + mitigation_timeout  = 600
              + period              = 60
              + requests_per_period = 100
              + requests_to_origin  = false
            }
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.
```

The second part of this confusing output is the `last_updated` and `version`
output. The `last_updated` is pretty self explanatory however, there are some
minor but important details about the `version` field here. Within ERE, it
captures and is tied to a particular state of the ruleset rule at any point and
is incremented when changes are detected to any parts of the rule. The naunce
here is that ERE does not perform a diff of the current state and what is
proposed. Instead, it autoincrements this field **even if the payload is
identical**. So why is this important for the Terraform resource? Well, since
we use explicit ordering via [`ListNestedObject`][1] schema attribute, we are
sending all rules to the API even when only a single one changes to ensure we
maintain the correctness the end user expects. Doing this causes the
`last_updated` and `version` fields to always show as unknown values and result
in updates. There are some plans to potentially look at de-duping the payloads
to prevent spurious versions being created however, that won't help here until
such a time that we individually manage `rules` as their own entities.

Closes #2690 by improving the output and marking the `version` and
`last_updated` as known values that will always be changing due to the way we
update all the resources.

[1]: https://github.com/cloudflare/terraform-provider-cloudflare/blob/f7c05af8be05e0ef9da72c10baa5a5bf8440e5d7/internal/framework/service/rulesets/schema.go#L104
jacobbednarz added a commit that referenced this issue Jan 31, 2024
Within the rulesets schema, we have a handful of fields that are `Computed`.
The expectation of this directive is that it is a value not known at run time
and _may_ be provided by the remote. However, this premise will break down and
not work correctly if the value is ever provided to the plan since the value is
no longer "unknown". This subtle bug is one part of what is contributing to the
additional output toil in #2690. To address this, I've removed the lines that
modified these values and were being supplied to the plan operation.

This takes the confusing and bloated output from looking like this:

```
Terraform will perform the following actions:

  # cloudflare_ruleset.rate_limiting_example will be updated in-place
  ~ resource "cloudflare_ruleset" "rate_limiting_example" {
        id          = "87e1099f077f4e49bbfbe159217ff605"
        name        = "restrict API requests count"
        # (4 unchanged attributes hidden)

      ~ rules {
          ~ id           = "e3b97fdf354c4c24a7ac091c3537fbc5" -> (known after apply)
          ~ last_updated = "2024-01-31 04:02:55.651275 +0000 UTC" -> (known after apply)
          ~ ref          = "e3b97fdf354c4c24a7ac091c3537fbc5" -> (known after apply)
          ~ version      = "1" -> (known after apply)
            # (4 unchanged attributes hidden)

            # (1 unchanged block hidden)
        }
      + rules {
          + action       = "block"
          + description  = "rate limit for API"
          + enabled      = true
          + expression   = "(http.request.uri.path matches \"^/api0/\")"
          + id           = (known after apply)
          + last_updated = (known after apply)
          + ref          = (known after apply)
          + version      = (known after apply)

          + ratelimit {
              + characteristics     = [
                  + "cf.colo.id",
                  + "ip.src",
                ]
              + mitigation_timeout  = 600
              + period              = 60
              + requests_per_period = 100
              + requests_to_origin  = false
            }
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.
```

To a clearer and more concise output:

```
Terraform will perform the following actions:

  # cloudflare_ruleset.rate_limiting_example will be updated in-place
  ~ resource "cloudflare_ruleset" "rate_limiting_example" {
        id          = "87e1099f077f4e49bbfbe159217ff605"
        name        = "restrict API requests count"
        # (4 unchanged attributes hidden)

      ~ rules {
            id           = "39ed6015367444e3905d838cadc1b9c2"
          + last_updated = (known after apply)
          + version      = (known after apply)
            # (5 unchanged attributes hidden)

            # (1 unchanged block hidden)
        }
      + rules {
          + action       = "block"
          + description  = "rate limit for API"
          + enabled      = true
          + expression   = "(http.request.uri.path matches \"^/api0/\")"
          + id           = (known after apply)
          + last_updated = (known after apply)
          + ref          = (known after apply)
          + version      = (known after apply)

          + ratelimit {
              + characteristics     = [
                  + "cf.colo.id",
                  + "ip.src",
                ]
              + mitigation_timeout  = 600
              + period              = 60
              + requests_per_period = 100
              + requests_to_origin  = false
            }
        }
    }

Plan: 0 to add, 1 to change, 0 to destroy.
```

The second part of this confusing output is the `last_updated` and `version`
output. The `last_updated` is pretty self explanatory however, there are some
minor but important details about the `version` field here. Within ERE, it
captures and is tied to a particular state of the ruleset rule at any point and
is incremented when changes are detected to any parts of the rule. The naunce
here is that ERE does not perform a diff of the current state and what is
proposed. Instead, it autoincrements this field **even if the payload is
identical**. So why is this important for the Terraform resource? Well, since
we use explicit ordering via [`ListNestedObject`][1] schema attribute, we are
sending all rules to the API even when only a single one changes to ensure we
maintain the correctness the end user expects. Doing this causes the
`last_updated` and `version` fields to always show as unknown values and result
in updates. There are some plans to potentially look at de-duping the payloads
to prevent spurious versions being created however, that won't help here until
such a time that we individually manage `rules` as their own entities.

Closes #2690 by improving the output and marking the `version` and
`last_updated` as known values that will always be changing due to the way we
update all the resources.

[1]: https://github.com/cloudflare/terraform-provider-cloudflare/blob/f7c05af8be05e0ef9da72c10baa5a5bf8440e5d7/internal/framework/service/rulesets/schema.go#L104
@pranit-p

This comment has been minimized.

@eephillip
Copy link

eephillip commented Feb 13, 2024

I tried to fix this something like:

    lifecycle {
      ignore_changes = ["rules[*].ref", "rules[*].last_updated"]
    }

However there is nothing like [*] in Terraform syntax, so this couldn't work.

Related hashicorp/terraform#24188

@pranit-p
Copy link

pranit-p commented Feb 13, 2024

Yes @eephillip

But, this code is not workable

I tried to fix this something like:

    lifecycle {
      ignore_changes = ["rules[*].ref", "rules[*].last_updated"]
    }

However there is nothing like [*] in Terraform syntax, so this couldn't work.

Related hashicorp/terraform#24188

Copy link
Contributor

Marking this issue as stale due to 30 days of inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed. Maintainers can also remove the lifecycle/stale label.
If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

@fcoelho
Copy link

fcoelho commented Mar 15, 2024

this is still an issue

@gabrielqs
Copy link

Still an issue

@Anticmos
Copy link

Still an Issue

@Anticmos
Copy link

@nitrocode can you provide the TF_LOG=DEBUG output so that they can remove needs-information tag by chance?

@nitrocode
Copy link
Author

nitrocode commented Mar 26, 2024

@Anticmos please see this comment where the debug log was attached to an internal cloudflare case #2901904 (see #2690 (comment) in Aug 2023)

@nitrocode
Copy link
Author

@Anticmos feel free to add another one. 😄

@jogarao96
Copy link

jogarao96 commented Apr 8, 2024

work around is that deleting the rule from cloudflare cli and recreating it through code. altough you might deleted at cloudflare gui we still have that rule exsits in cli version of cloudflare.
Fix is that

  1. curl https://api.cloudflare.com/client/v4/zones/<ZONE_ID>/rulesets --header "X-Auth-Key:{you key here}" --header "X-Auth-Email:{your email here}" >rulesets.yml
  2. Identify the thr rule with no discription but the phase you are trying at in my case http_request_firewall_custom
  3. Delete that rule with cloudflare cli using curl --request DELETE https://api.cloudflare.com/client/v4/zones/<ZONE_ID>/rulesets/<RULESETID> --header "X-Auth-Key:{API KEY}" --header "X-Auth-Email:{EMAIL}" you can get rule id from above file
  4. Re run terraform script to create a new rule which doesnt colide with any rule set ID

@boudekerk
Copy link

We're also running into this, as others have mentioned this kind of defeats the purpose of using terraform as the plan becomes unreadable. I.e. at least in the UI it's clear which rule is being edited before saving.

@boudekerk
Copy link

boudekerk commented Apr 12, 2024

@gabrielqs We tried your workaround, but we got errors during apply when trying to lifecycle ignore versions and last_updated.
Is there anything special you did to get it to work?

I think one issue is that the action parameter version is always being added, but never ends up in the state, so is added again on each subsequent run.

@gabrielqs
Copy link

@boudekerk nothing spectial. the only caveat might be we're running a relatively old version of the cloudflare provider, so maybe there's a discrepancy between what that version is doing and yours?

@nitrocode
Copy link
Author

nitrocode commented Apr 12, 2024

@gabrielqs what version are you running? Maybe we can pin until this issue is resolved.

Is it between version 4 and 4.8.0?

Copy link
Contributor

Marking this issue as stale due to 30 days of inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 7 days it will automatically be closed. Maintainers can also remove the lifecycle/stale label.
If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

@nitrocode
Copy link
Author

Unstale

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/needs-information Indicates an issue needs more information in order to work on it.
Projects
None yet
Development

Successfully merging a pull request may close this issue.