
Failure to apply a lot of changes in one pass #30

Open
Code0x58 opened this issue Mar 13, 2019 · 3 comments
Comments
Code0x58 commented Mar 13, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Affected Resource(s)

All during Create/Update/Delete operations

Terraform Configuration Files

pastebin link - 390 lines, 3 variables

Debug Output

5 error(s) occurred:

* artifactory_local_repository.maven_release: 1 error(s) occurred:

* artifactory_local_repository.maven_release: PUT https://code0x58test.jfrog.io/code0x58test/api/repositories/x-libs-release-local: 400 [{Status:400 Message:Could not merge and save new descriptor [org.jfrog.common.ExecutionFailed: Last retry failed: exceeded number of retries 5. Not trying again (Should update revision 252)]
}]
* artifactory_local_repository.maven_snapshot: 1 error(s) occurred:

* artifactory_local_repository.maven_snapshot: PUT https://code0x58test.jfrog.io/code0x58test/api/repositories/x-libs-snapshot-local: 400 [{Status:400 Message:Could not merge and save new descriptor [org.jfrog.common.ExecutionFailed: Last retry failed: exceeded number of retries 5. Not trying again (Should update revision 252)]
}]
* artifactory_local_repository.rpm: 1 error(s) occurred:

* artifactory_local_repository.rpm: GET https://code0x58test.jfrog.io/code0x58test/api/repositories/x-rpm-local: 400 [{Status:400 Message:Bad Request}]
* artifactory_remote_repository.npm: 1 error(s) occurred:

* artifactory_remote_repository.npm: PUT https://code0x58test.jfrog.io/code0x58test/api/repositories/x-npm-remote: 400 [{Status:400 Message:Could not merge and save new descriptor [org.jfrog.common.ExecutionFailed: Last retry failed: exceeded number of retries 5. Not trying again (Should update revision 252)]
}]
* artifactory_virtual_repository.pypi: 1 error(s) occurred:

* artifactory_virtual_repository.pypi: GET https://code0x58test.jfrog.io/code0x58test/api/repositories/x-pypi: 400 [{Status:400 Message:Bad Request}]

Expected Behavior

I'd expect the apply to succeed, as it did when the configuration was smaller, or as it eventually does after a couple of repeated applies.

Actual Behavior

Artifactory can't keep up. It looks like there's a race to save the central config, which is worked around with server-side retries, but that isn't enough when too many changes arrive at once.

Steps to Reproduce

  1. terraform apply

Important Factoids

I suspect it would help to do something like set MaxConnsPerHost to 1 on the transport of the HTTP client; that way a single instance of the Terraform provider shouldn't introduce the races that it otherwise would.

Workarounds include:

  • running terraform apply --parallelism=1, which isn't great since resources from other, non-Artifactory providers also lose parallelism
  • repeating terraform apply until the state converges (which can leave bad state)

There is a server-side config option mentioned here that sets the number of retries; while not a solution, it could be a lead for further reading if needed.

@Code0x58 (Author)

I tried a crude patch to limit MaxConnsPerHost, but it didn't fix it:

diff --git a/pkg/artifactory/provider.go b/pkg/artifactory/provider.go
index cb41084..6f10fcd 100644
--- a/pkg/artifactory/provider.go
+++ b/pkg/artifactory/provider.go
@@ -62,16 +62,21 @@ func providerConfigure(d *schema.ResourceData) (interface{}, error) {
        password := d.Get("password").(string)
        token := d.Get("token").(string)
 
+       t := http.DefaultTransport.(*http.Transport)
+       t.MaxConnsPerHost = 1
+
        var client *http.Client
        if username != "" && password != "" {
                tp := artifactory.BasicAuthTransport{
-                       Username: username,
-                       Password: password,
+                       Username:  username,
+                       Password:  password,
+                       Transport: http.DefaultTransport,
                }
                client = tp.Client()
        } else if token != "" {
                tp := &artifactory.TokenAuthTransport{
-                       Token: token,
+                       Token:     token,
+                       Transport: http.DefaultTransport,
                }
                client = tp.Client()
        } else {

dillon-giacoppo (Contributor) commented Mar 13, 2019

Duplicate of #9. The suggested workaround is to set parallelism to 1.

JFrog also provided an alternative solution: they recommended increasing the property artifactory.central.config.save.number.of.retries=20 in artifactory.system.properties. With this you can keep Terraform multithreaded; however, we have noticed that infrequent errors with access-related resources (such as users, groups, and permissions) can still occur during batch operations. The ticket you linked is to fix these errors.
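For reference, that is a one-line addition to the server's artifactory.system.properties file (the value 20 is the setting JFrog recommended above; a restart of Artifactory is typically needed for system properties to take effect):

```properties
artifactory.central.config.save.number.of.retries=20
```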

I have looked at client-side throttling in the past, but I think the ideal solution would be retries with exponential backoff; this would have to be added to every resource. It is not a priority, however, since the issue is easily worked around.

@kad-meedel

Look at https://www.jfrog.com/jira/browse/RTFACT-16638
I still have an issue with that.
