How best to handle ETIMEDOUT? #425

Closed
rfennell opened this issue Dec 4, 2020 · 12 comments

@rfennell

rfennell commented Dec 4, 2020

This is a follow-up to #379, which was never resolved. I have created a new issue because I have now done a lot more research.

Environment

Node version: v14.15.1 for local dev and I think the same on the Azure DevOps Hosted Agents
Npm version: 6.14.8
OS and version: Azure DevOps Hosted Agents
azure-devops-node-api version: 8.1.1

Issue Description

I have a Node-based Azure DevOps Pipeline Extension that generates release notes based on a Handlebars template.

This task uses the Node SDK to make many calls, commonly well over 100, to the Azure DevOps API to get details of WI, CS, PR and tests associated with a build or release. These results are then injected into a Handlebars template.

In most cases this task works without error, but in some cases my users see an error of the form:

Error: connect ETIMEDOUT 13.107.42.18:443

What can I do to handle these timeouts?

Is my only option to abandon the Azure DevOps Node SDK and make all my REST calls natively?

Expected behaviour

The task should reliably complete, irrespective of the number of Azure DevOps REST API calls made. The SDK should handle timeouts, retries and throttling of the API.

Actual behaviour

The task fails intermittently, with an error of the form:

Error: connect ETIMEDOUT 13.107.42.18:443

The task can run fine multiple times, then fail for a few runs. Retrying the job commonly fixes the immediate problem. My assumption is that the Azure DevOps REST API is being saturated and the SDK retry logic is inadequate.

Steps to reproduce

This is hard to reproduce reliably. It appears to occur more often:

  • On the hosted agents
  • In the late afternoon and evening UK time (when US-based clients are busier?)
  • When a set of release notes contains a lot of associated WI/CS and PRs, hence more API calls to get the details

What I have tried

I have tried all of the following; none has helped.

API Retry Settings
I have altered the creation of my WebApi instance to enable retries and raise the maximum retry count:

 const credentialHandler = getCredentialHandler(pat);
 const options = {
     allowRetries: true,
     maxRetries: 20,
 } as vstsInterfaces.IRequestOptions;
 const organisation = new webApi.WebApi(tpcUri, credentialHandler, options);

Retry on failure
I added a try/catch block around all of my SDK API calls. If there was a failure, I retried the call (using my own code, not the SDK retry system). Within this retry logic I was able to set the number of retries and the time to pause before a retry.

This had no effect. It seemed that once the SDK reported a timeout, no reconnection was possible, even if I paused for 60 seconds before retrying.
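
For readers comparing notes, a hand-rolled retry of the kind described above might look roughly like the sketch below. It is an illustration only, not the author's actual code and not part of azure-devops-node-api: `withRetry`, `isTimeout`, `RetryOptions` and the default values are all hypothetical names and numbers. It assumes the failure surfaces as an error whose `code` or `message` contains `ETIMEDOUT`. Note that the author reports this style of retry did not resolve the problem.

```typescript
// Hypothetical helper, not part of azure-devops-node-api.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // upper bound on the delay before the second attempt
  maxDelayMs: number;  // upper bound on any single delay
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Assumption: a connect timeout shows up as ETIMEDOUT in `code` or `message`.
function isTimeout(err: unknown): boolean {
  const e = err as { code?: string; message?: string } | null;
  return e?.code === "ETIMEDOUT" || (e?.message ?? "").includes("ETIMEDOUT");
}

async function withRetry<T>(
  call: () => Promise<T>,
  opts: RetryOptions = { maxAttempts: 5, baseDelayMs: 1000, maxDelayMs: 60000 },
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-timeout errors, or once the attempts are exhausted.
      if (!isTimeout(err) || attempt >= opts.maxAttempts) throw err;
      // Exponential backoff with random jitter so retries do not line up.
      const cap = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (attempt - 1));
      await sleep(Math.random() * cap);
    }
  }
}
```

An SDK call would be wrapped as `await withRetry(() => someApi.someCall(args))`, where `someApi` and `someCall` stand in for whatever client and method are in use.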

Recreate the WebAPI instance on timeout
I refactored my retry logic to recreate the WebApi object on each retry. This again had no effect; the error still occurred.

Logs

The only message seen is

Error: connect ETIMEDOUT 13.107.42.18:443

Examples can be seen in issue 648 on my Release Notes Task repo.

@jjguijt

jjguijt commented Dec 17, 2020

I am encountering the same issue, and agree with the expected behaviour of the library.

@rfennell
Author

Yes, I have seen a good few reports in various forums.

It would be good to hear whether there is a solution to the problem via the API, or whether it is a constraint of the underlying Azure DevOps REST instances that we can do nothing about. Even if that is the case, a workaround is really needed.

@github-actions

This issue has had no activity in 90 days. Please comment if it is not actually stale

@github-actions github-actions bot added the stale label Mar 17, 2021
@rfennell
Author

Still heard nothing about this issue, and I still see the problem.

@sommmen

sommmen commented Jun 10, 2021

Still heard nothing about this issue, and I still see the problem.

Loving how the bot just closed the issue :sigh:

@sunilsurana
Member

@rfennell have you got any clue how to handle this? We are also facing the same issue.

@rfennell
Author

rfennell commented Feb 9, 2022

Sorry, no. I never got a solution.

@greengumby

This just started happening and now I have no release notes. Re-tried the build multiple times and no success. Looks like I have to remove XplatGenerateReleaseNotes?

@jon-freed

jon-freed commented Feb 16, 2023

In my case, this error happened consistently on a call to getWorkItem with the expand parameter set to expand relations. However, it wasn't always for the same work item. I'm not sure whether getWorkItem with expand relations was the most pertinent factor or whether something else was, like the run time or the number of API calls. Fortunately, setting the WebApi options for retries got me past the error.

@wmcnamara

Getting this error as well.

@cjblomqvist

I recommend reading this, which might fix your (and others') issue: https://stackoverflow.com/questions/63064393/getting-axios-error-connect-etimedout-when-making-high-volume-of-calls
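
The linked question is about connect ETIMEDOUT errors under a high volume of calls. One general mitigation in that situation is to cap how many requests are in flight at once. The sketch below shows that idea only; it is not taken from the linked answer or from azure-devops-node-api, `mapWithLimit` is a hypothetical helper, and it has not been confirmed to fix the error discussed in this thread.

```typescript
// Hypothetical helper: apply `worker` to every item with at most `limit`
// calls in flight at any time. Results keep the order of `items`.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }

  // Start no more runners than there are items, and at least one.
  const count = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: count }, runner));
  return results;
}
```

For example, `await mapWithLimit(workItemIds, 5, (id) => fetchDetails(id))` would fetch details for every ID while keeping at most five requests open, where `workItemIds` and `fetchDetails` are placeholders for the caller's own data and API call. A rejected `worker` call rejects the whole batch, so it may be worth combining this with whatever retry handling is already in place.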

@wmcnamara

wmcnamara commented Feb 16, 2024

Just to update: I solved my problem. It was an issue with my proxy config, and was my mistake.
