Retry get Bucket on failure if possible #9

Open
wants to merge 1 commit into
base: master

jnuxadrian

GoogleUtils.getStorageBucket can throw StorageException
upon failure. This commit wraps the call in ExponentialBackOff
retry logic.
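
As an illustration of the idea described above, here is a minimal sketch of retrying a call with exponential backoff. It is not the code from this commit: it uses only the Java standard library rather than the ExponentialBackOff class, and `RetryWithBackoff`, `TransientStorageException`, and all parameter names are hypothetical stand-ins (the exception plays the role of com.google.cloud.storage.StorageException).

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

final class RetryWithBackoff {

    /** Hypothetical stand-in for com.google.cloud.storage.StorageException. */
    static final class TransientStorageException extends RuntimeException {
        TransientStorageException(String message) {
            super(message);
        }
    }

    /**
     * Runs the call, retrying on TransientStorageException up to maxAttempts
     * times in total. The delay doubles after each failure, capped at maxDelayMs.
     */
    static <T> T retry(Supplier<T> call, int maxAttempts, long initialDelayMs, long maxDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (TransientStorageException e) {
                if (attempt >= maxAttempts) {
                    throw e; // out of attempts: surface the last failure
                }
                // Jitter: sleep somewhere between delay/2 and delay.
                long sleepMs = delayMs / 2 + ThreadLocalRandom.current().nextLong(delayMs / 2 + 1);
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while backing off", ie);
                }
                delayMs = Math.min(delayMs * 2, maxDelayMs);
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated bucket lookup that fails twice before succeeding.
        String bucket = retry(() -> {
            if (++calls[0] < 3) {
                throw new TransientStorageException("Error getting access token");
            }
            return "my-bucket";
        }, 5, 10, 100);
        System.out.println(bucket + " after " + calls[0] + " attempts");
        // prints "my-bucket after 3 attempts"
    }
}
```

In a real plugin the catch clause would presumably target StorageException and might retry only when the failure is marked as retryable; that choice is left open here.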

@dtretyakov
Contributor

@jnuxadrian, thanks for the contribution. Could you please share some details and the stack trace of the problem that this PR fixes?

@jnuxadrian
Author

@dtretyakov Do you know of a way to enable debug logs for this plugin only? Our TeamCity deployment is so large that when we tried enabling debug logging in the past, the logs were rotated every 5 seconds.

@jnuxadrian
Author

[2018-11-07 02:26:50,175] WARN - blish.GoogleArtifactsPublisher - Failed to publish files: com.google.cloud.storage.StorageException: Error getting access token for service account: (enable debug to see stacktrace)
[2018-11-07 02:26:50,176] WARN - jetbrains.buildServer.AGENT - Failed to publish artifacts: jetbrains.buildServer.agent.ArtifactPublishingFailedException: Failed to publish files: Error getting access token for service account: (enable debug to see stacktrace)

@jnuxadrian
Author

We are using GoogleRegularFileUploader.kt.

There are two calls that could throw com.google.cloud.storage.StorageException:

  1. bucket.create
  2. GoogleUtils.getStorageBucket

We ruled out the first one because we could not find anything in the build logs related to this retry message: https://github.com/JetBrains/teamcity-google-storage/blob/master/google-storage-agent/src/main/kotlin/jetbrains/buildServer/artifacts/google/publish/GoogleRegularFileUploader.kt#L53

The second one, by contrast, is caught here: https://github.com/JetBrains/teamcity-google-storage/blob/master/google-storage-agent/src/main/kotlin/jetbrains/buildServer/artifacts/google/publish/GoogleArtifactsPublisher.kt#L60 and its message matches what we found in the posted logs.

@dtretyakov
Contributor

@jnuxadrian, to see debug logs you need to configure the logging preset on the build agent. To do that, edit the %buildAgent%/conf/teamcity-agent-log4j.xml file and uncomment the DEBUG priority in the corresponding section, like this:

<category name="jetbrains.buildServer">
  <priority value="DEBUG"/>
  <appender-ref ref="ROLL"/>
</category>

The teamcity-agent.log file will then contain the complete stack trace of this error.

AFAIK GoogleUtils.getStorageBucket cannot throw the mentioned IOException "Error getting access token for service account:", so I suspect that something unexpected happened while the access token was being refreshed during file upload, but it's unclear what caused it: https://github.com/googleapis/google-auth-library-java/pull/206/files
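
One way to check the suspicion above without full debug logging is to walk an exception's cause chain and look for a wrapped IOException. The following is a rough sketch under assumptions: `CauseChain` and its methods are hypothetical helpers, and a plain RuntimeException stands in for StorageException, whose real wrapping behavior is not shown in this thread.

```java
import java.io.IOException;

final class CauseChain {

    /** Returns true if the throwable or any of its causes is an instance of the given type. */
    static boolean causedBy(Throwable t, Class<? extends Throwable> type) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (type.isInstance(c)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the innermost cause, or the throwable itself if it has no cause. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Assumption: the token failure is an IOException wrapped by a higher-level exception.
        IOException tokenFailure =
                new IOException("Error getting access token for service account: ");
        RuntimeException wrapped = new RuntimeException(tokenFailure.getMessage(), tokenFailure);

        System.out.println(causedBy(wrapped, IOException.class)); // prints "true"
        System.out.println(rootCause(wrapped).getClass().getName()); // prints "java.io.IOException"
    }
}
```

Logging the root cause's class and message at WARN level would show whether the failure comes from the token refresh or from the storage call itself.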

If you could enable debug logging on the build agents and reproduce it, that would help us investigate the problem.

Another option is to enable the signed URL option and check whether it prevents such problems.

@jnuxadrian
Author

@dtretyakov Thanks for the log suggestion. I still think it is worth considering a retry for that call to Google: it would make the plugin more reliable, and you already have one in place for creating the object.

If you look at the gist of logs provided, you will see that StorageException is not handled by GoogleRegularFileUploader.

Don't you think that adding a retry to an external service call makes sense, or do you think the problem is happening somewhere else?
