feature request: parallelism parameter for resources with count #14258

Open
dkiser opened this issue May 5, 2017 · 37 comments

Comments

@dkiser

dkiser commented May 5, 2017

As a Terraform user, I should be able to specify a parallelism limit on resource instances containing a count parameter, so that I can create API resources via providers at a more sensible rate and/or cope with mediocre API backends.

Example Terraform Configuration File

resource "providera_resourcea" "many" {
   count = 1000
   parallelism = 10

   attributeA = "a"
   attributeB = "b"
   attributeC = "c"
}

Expected Behavior

GIVEN terraform apply -parallelism=X where X < ${providera_resourcea.many.*.parallelism}
WHEN terraform creates/deletes/refreshes resources
THEN I expect only X concurrent goroutines creating this resource type.

GIVEN terraform apply -parallelism=X where X >= ${providera_resourcea.many.*.parallelism}
WHEN terraform creates/deletes/refreshes resources
THEN I expect only ${providera_resourcea.many.*.parallelism} concurrent goroutines creating this resource type.

@dkiser
Author

dkiser commented May 5, 2017

Possibly related to #7388


@pbusquemdf

I just tried to create 3 vsphere_virtual_machine resources.
Because all 3 virtual machines are created at the same time, they take disproportionately longer to create, causing the apply operation to time out.

Creating 1 machine takes 5 minutes, so all 3 machines take 15 minutes with a single thread.
Creating 3 machines in parallel times out after 10 minutes, because each machine now takes longer than 5 minutes: disk creation, disk balancing, and reconfiguration all bottleneck on the vSphere server.

Other resources work fine, though. So I should be able to limit the number of simultaneous jobs on either the resource or the provider (or both).

@invidian
Contributor

invidian commented Aug 8, 2019

Another use case I see for this is a rolling update of immutable infrastructure, where you roll out the update one server/resource at a time.

@mkjmdski

I found this issue when trying to run apply against the "pass" provider, which needs to communicate with a git repository, so its operations should happen one by one. But my infrastructure covers many other resource types (from different providers), so I'd like to run with high parallelism overall, limited to 1 only for resources of a certain type (or provider, as @invidian mentioned).

@davidquarles

I'm hitting this today. There is still no known workaround, I take it?

Anecdotally, what I'm trying to do is create many managed instance groups in GCP that are all backends for the same load-balancer (using count) but can't be collapsed into one because of upstream constraints and how we're partitioning outbound traffic. Doing so forfeits rolling update semantics, of course.

I started to hack at having each instance group depend on the prior one after the head of the list until I realized that depends_on is static and the entire resource group is actually a single node in the DAG. Any ideas? As it stands, my only real strategy is to move this stuff out into a dedicated repo run with -parallelism=1 and use the remote state provider to loosely couple back to our primary repository :(
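A rough sketch of that split, assuming an S3 backend and made-up resource, bucket, and output names (purely illustrative, not a drop-in fix):

# In the dedicated root module for the instance groups, applied with -parallelism=1:
output "instance_group_urls" {
  value = google_compute_instance_group_manager.backend.*.instance_group
}

# In the primary configuration, loosely couple back via remote state:
data "terraform_remote_state" "migs" {
  backend = "s3"
  config = {
    bucket = "example-tf-state"                  # hypothetical bucket
    key    = "instance-groups/terraform.tfstate" # hypothetical key
    region = "us-east-1"
  }
}

# data.terraform_remote_state.migs.outputs.instance_group_urls can then feed
# the load balancer's backend configuration in the primary repository.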

@brendan-sherrin

I'm getting an error: Deleting CloudWatch Events permission failed: ConcurrentModificationException: EventBus default was modified concurrently

I believe this suggestion would let me work around the issue by applying a parallelism limit on permissions affecting the default event bus on the account.

i.e. adding a parallelism attribute to this resource:
resource "aws_cloudwatch_event_permission" "PerAccountAccess" {
for_each = local.accountslist

@mrsimonemms

I've found a similar issue with multiple Google SQL databases on a private IP where this would be incredibly useful (detailed on SO).

@delwaterman

👍 Have this issue with a custom redshift provider. Need to limit the number of concurrent requests being made.

@hege-aliz

Same here with AWS task definitions within the same family.

@antanof

antanof commented Dec 6, 2020

Similar issue with Azure DNS and Public IP:
I want to create several A records for the same public IP.

resource "azurerm_dns_a_record" "new" {
  count               = length(var.subdomains)
  name                = coalesce(var.subdomains[count.index])
  zone_name           = var.zone
  resource_group_name = var.dns_rgname
  ttl                 = 60
  target_resource_id  = azurerm_public_ip.public_ip.id

  depends_on          = [azurerm_public_ip.public_ip]
}

I have an issue with terraform apply:

dns.RecordSetsClient#CreateOrUpdate: Failure responding to request: StatusCode=409 -- Original Error: autorest/azure: Service returned an error. Status=409 Code="Conflict" 

No issue with terraform apply --parallelism=1

@clarsonneur

Like many others, I'm facing the same kind of issue for some resources with Google Cloud (network peering, Firestore indexes, ...).

@tshawker

tshawker commented Feb 17, 2021

I've also run into this when making changes to load balancers and target groups. Certain changes destroy everything before recreating anything. I'd like to be able to group the changes so that only some of them are applied at a time. Alternatively, changes to the lifecycle sequencing would be just as useful.

In this case, we aren't using count but for_each. I don't think that should make a difference for limiting parallelism.
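For the destroy-before-recreate half of this, the existing per-resource lifecycle block can at least invert the replacement ordering; a minimal sketch with assumed variable names (it does not address the batching request itself):

resource "aws_lb_target_group" "app" {
  for_each = var.target_groups # hypothetical map of target group settings

  name_prefix = each.value.name_prefix
  port        = each.value.port
  protocol    = "HTTP"
  vpc_id      = var.vpc_id

  lifecycle {
    # Build the replacement target group before destroying the old one,
    # instead of destroying everything up front.
    create_before_destroy = true
  }
}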

@cbus-guy

We are experiencing issues when bootstrapping Chef using a null resource, or when using the Chef provisioner to build servers, with either VMware or Azure. Vault permissions are not always assigned properly to the node in Chef server. This succeeds when we set parallelism to 1, but fails intermittently (though fairly consistently) when it isn't. It would be nice to set parallelism to 1 only for the bootstrap null resource, while everything else is allowed to run in parallel.

@zhujinhe

+1. # 2021-07-16

@iyinoluwaayoola

I'm surprised there is no news on this. I have only one resource that requires parallelism to be 1, but the only native solution is to disable parallelism for the entire infrastructure (of many, many objects) using terraform apply --parallelism=1. I'd love to see this feature for resources with for_each or count.

@mhaddon

mhaddon commented Apr 19, 2022

+1 for #currentyear.

Even being able to set parallelism at the module level would be great.

@surajsbharadwaj

surajsbharadwaj commented May 24, 2022

Need this very much. For me, the count parameter is creating subnets in the same VLAN. I wish I could control count parallelism; I need the subnets to be created one after the other.

@guidooliveira

+1 here. I can easily accomplish this using batchSize on any copy loop with ARM templates; surprised this isn't ready after almost 5 years.

@mukundjalan

Very much needed feature. I need to run a module using for_each, but the system runs out of space in some situations. If I could limit the parallelism, I could easily manage this.

@AresiusXP

I have the same need. My main issue is managing subnets in Azure within the same VNET: Azure doesn't allow modifying multiple subnets at the same time. My only workaround is using a null_resource with az cli commands, which is a very sucky way of working. If I had a way to set the parallelism for this, I could reduce it to 1 and have everything managed by Terraform code.


@zhujinhe

zhujinhe commented Jul 22, 2022

First of all, I like HashiCorp products a lot; they have really helped me. ❤️ @armon @mitchellh

And I know that as a non-paying user I don't have any right to ask you to do anything, but feature requests with the core tag have been open for over 5 years.

I love Terraform and Nomad, but they occasionally feel one last mile short of production ready. Other requests, like hashicorp/nomad#1635, have been open for over 6 years.

I really hope the developers at HashiCorp find the time and willingness to make a plan for the old core feature requests while developing new features.

@sherifkayad

Hi folks,

I too am struggling with this missing feature! I have some resources and modules that can't handle parallelism, while others work perfectly fine with it.

Setting the parallelism flag to 1 for the whole apply is excruciatingly slow! Can we somehow get this prioritized for the sake of all those having similar issues?

@danjamesmay

Same issue with hundreds of monitored projects in Google: hashicorp/terraform-provider-google#12883

TF also doesn't seem to handle 429's very well and completely screws up plan/apply steps when heavily rate limited.

@abij

abij commented Apr 6, 2023

We are having the same issue in Azure with multiple PrivateEndpoints into a Subnet. PE creation also fails while performing VNet peering. @AresiusXP can you explain the null_resource in detail? Are you validating that the subnet is ready, or controlling parallelism? Comparable with issues in terraform-provider-azurerm: #21293 #16182

@AresiusXP

AresiusXP commented Apr 6, 2023

We are having the same issue in Azure with multiple PrivateEndpoints into a Subnet. PE creation also fails while performing VNet peering. @AresiusXP can you explain the null_resource in detail? Are you validating that the subnet is ready, or controlling parallelism? Comparable with issues in terraform-provider-azurerm: #21293 #16182

We have 2 null_resource blocks that have a depends_on on the vnet-with-subnets resource. Once it's done, each runs an az cli command on every subnet to add private endpoints and to associate a route_table.

resource "null_resource" "endpoints" {
  triggers = {
    subnets = join(" ", azurerm_virtual_network.vnet.subnet.*.name)
  }

  provisioner "local-exec" {
    command     = "az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $ARM_TENANT_ID; az account set --subscription ${var.global_settings.subscription_id}; for subnet in ${self.triggers.subnets}; do az network vnet subnet update -g ${var.rg_name} -n $subnet --vnet-name ${var.vnet_name} --service-endpoints ${join(" ", local.service_endpoints)}; done"
    interpreter = ["/bin/bash", "-c"]
  }

  depends_on = [
    azurerm_virtual_network.vnet
  ]
}

resource "null_resource" "rt" {
  count = var.rt_enabled ? 1 : 0

  triggers = {
    subnets = join(" ", azurerm_virtual_network.vnet.subnet.*.name)
  }

  provisioner "local-exec" {
    command     = "az login --service-principal -u $ARM_CLIENT_ID -p $ARM_CLIENT_SECRET --tenant $ARM_TENANT_ID; az account set --subscription ${var.global_settings.subscription_id}; for subnet in ${self.triggers.subnets}; do az network vnet subnet update -g ${var.rg_name} -n $subnet --vnet-name ${var.vnet_name} --route-table ${var.route_table_id}; done"
    interpreter = ["/bin/bash", "-c"]
  }

  depends_on = [
    azurerm_virtual_network.vnet,
    null_resource.endpoints
  ]
}

@pspot2

pspot2 commented May 6, 2023

My use-case for this would be building multiple Docker images (using the registry.terraform.io/providers/kreuzwerker/docker provider) from a single (parametrized) Dockerfile in a for_each loop. I'd like to reduce the parallelism of the docker_image resource (and only of that resource) to 1, because my Dockerfile installs Python packages, and having N of these installations run concurrently can cause sporadic issues, since pip presently does not have any synchronization mechanism around its cache directory (which is shared between the Docker processes).

@M0NsTeRRR

Having the same issue on PowerDNS with the SQLite backend because of concurrent access to the database.

@cbus-guy

+1
We are using the buggy dns provider, which will only work consistently if we set parallelism to 1. It's ridiculous that the build process for many servers has to slow down just because a single provider is buggy. I should be able to set parallelism on a per-resource basis.

@bschaatsbergen
Member

bschaatsbergen commented Oct 12, 2023

I would be happy to look into this issue and see if I can come up with something 👍🏼

Note: in the issue description, the second given/when/then seems redundant, as it describes the same behaviour as the first given/when/then.

@bschaatsbergen
Member

@jbardin, a couple years ago you replied on a similar issue: #24433 (comment)

Has your view changed on this, and if so, would this be something I could take a stab at? It seems like quite a few people are running into issues because this isn't possible.

@jbardin
Member

jbardin commented Oct 16, 2023

@bschaatsbergen, no, there have been no architectural changes around this that would alter the situation. If a resource type (or even the individual account controlling those resources) has limitations on concurrency, that is something which is in the provider's domain to control. In fact many providers already frequently use internal concurrency limits, along with API rate limiting and retries for similar reasons.

A CLI flag is not appropriate for this type of per-resource configuration, which is one of the reasons the other issue was closed outright. So while this issue remains possible to implement, it's a bit more invasive than I would like to see for something the provider should handle directly. Since it is not a required feature for operation, it would also first have to be approved by product management before implementation begins.

Thanks!
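As an illustration of the provider-side knobs described above (retry and rate-limit handling rather than per-resource concurrency), the AWS provider, for example, exposes a max_retries argument on the provider block:

provider "aws" {
  region      = "us-east-1"
  # Cap how many times the SDK retries throttled or failed API calls.
  max_retries = 5
}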

@cbus-guy

cbus-guy commented Oct 18, 2023

I still think this would be an extremely useful update. Some resources don't always behave as expected. If you have code creating 20-30 resources, setting parallelism to 1 as a whole slows things down to the extreme, versus setting it to 1 only for the specific resource that is giving you trouble.

@mng1dev

mng1dev commented May 18, 2024

I am not sure why, 5 years later, this request is still open and ignored.

If I need to create many resources of the same type using count/for_each, it would be quite helpful if I could set how many resources are created concurrently, especially when these resources share a lock and throw an error when they cannot acquire it. This is as easy as implementing a for loop.

While I agree that this must be implemented at the provider level, it would be nice to have some high-level option to orchestrate resource creation in case the provider's logic has no way to implement this mechanism and/or is not maintained.

Moreover, I don't agree with fully passing the buck to the maintainers of providers, because this could also be a strict requirement of my very own deployment, for any number of reasons, so I would expect my IaC tool of choice to offer this degree of flexibility, and not the individual provider.

There is the -parallelism option, but if I am creating 2000 resources and only 10 of them need to be created sequentially, I don't see why I should create all of them sequentially.

@kuteninja

I'm having this exact issue with aws_appautoscaling_scheduled_action, since these cannot be modified concurrently.

I have a list of actions, and I need them to be executed one at a time, but they try to run simultaneously, resulting in "ConcurrentUpdateException: You already have a pending update to an Auto Scaling resource".

I also have issues related to this with Postgres server roles, since trying to remove and add roles at the same time causes a tuple exception.
