Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HttpClient throws TaskCanceledException on timeout #21965

Closed
awithy opened this issue May 25, 2017 · 121 comments · Fixed by #2281
Closed

HttpClient throws TaskCanceledException on timeout #21965

awithy opened this issue May 25, 2017 · 121 comments · Fixed by #2281
Assignees
Labels
area-System.Net.Http enhancement Product code improvement that does NOT require public API changes/additions
Milestone

Comments

@awithy
Copy link

awithy commented May 25, 2017

HttpClient is throwing a TaskCanceledException on timeout in some circumstances. This is happening for us when the server is under heavy load. We were able to work around by increasing the default timeout. The following MSDN forum thread captures the essence of the experienced issue: https://social.msdn.microsoft.com/Forums/en-US/d8d87789-0ac9-4294-84a0-91c9fa27e353/bug-in-httpclientgetasync-should-throw-webexception-not-taskcanceledexception?forum=netfxnetcom&prof=required

Thanks

@danmoseley
Copy link
Member

@awithy what version of .NET Core are you using?

@awithy
Copy link
Author

awithy commented May 25, 2017

1.1

@stephentoub
Copy link
Member

Whether the right design or not, by design OperationCanceledExceptions are thrown for timeouts (and TaskCanceledException is an OperationCanceledException).
cc: @davidsh

@awithy
Copy link
Author

awithy commented May 25, 2017

Our team found this unintuitive, but it does seem to be working as designed. Unfortunately, when we hit this, we wasted a bit of time trying to find a source of cancelation. Having to handle this scenario differently from task cancelation is pretty ugly (we created custom handler to manage this). This also seems to be an overuse of cancelation as a concept.

Thanks again for thee quick response.

@karelz
Copy link
Member

karelz commented Dec 29, 2017

Seems like by design + there were only 2 customers complains in last 6+ months.
Closing.

@karelz karelz closed this as completed Dec 29, 2017
@jamescrowley
Copy link

@karelz adding a note as another customer impacted by this misleading exception :)

@karelz
Copy link
Member

karelz commented Feb 14, 2018

Thanks for info, please upvote also top post (issues can be sorted by it), thanks!

@thomaslevesque
Copy link
Member

Whether the right design or not, by design OperationCanceledExceptions are thrown for timeouts

This is a bad design IMO. There's no way to tell if the request was actually canceled (i.e. the cancellation token passed to SendAsync was canceled) or if the timeout was elapsed. In the latter case, it would make sense to retry, whereas in the former case it wouldn't. There's a TimeoutException that would be perfect for this case, why not use it?

@csrowell
Copy link

csrowell commented Mar 1, 2018

Yeah, this also threw our team for a loop. There's an entire Stack Overflow thread dedicated to trying to deal with it: https://stackoverflow.com/questions/10547895/how-can-i-tell-when-httpclient-has-timed-out/32230327#32230327

@thomaslevesque
Copy link
Member

I published a workaround a few days ago: https://www.thomaslevesque.com/2018/02/25/better-timeout-handling-with-httpclient/

@csrowell
Copy link

csrowell commented Mar 1, 2018

Awesome, thanks.

@karelz
Copy link
Member

karelz commented Mar 1, 2018

Reopening as the interest is now bigger - the StackOverflow thread has now 69 upvotes.

cc @dotnet/ncl

@karelz karelz reopened this Mar 1, 2018
@Davilink
Copy link

Any news ? Still happen in dotnet core 2.0

@karelz
Copy link
Member

karelz commented Mar 21, 2018

It is on our list of top problems for post-2.1. It won't be fixed in 2.0/2.1 though.

@karelz
Copy link
Member

karelz commented Jun 17, 2018

We have several options what to do when timeout happens:

  1. Throw TimeoutException instead of TaskCanceledException.
    • Pro: Easy to distinguish timeout from explicit cancellation action at runtime
    • Con: Technical breaking change - new type of exception is thrown.
  2. Throw TaskCanceledException with inner exception as TimeoutException
    • Pro: Possible to distinguish timeout from explicit cancellation action at runtime. Compatible exception type.
    • Open question: Is it ok to throw away original TaskCanceledException stack and throw a new one, while preserving InnerException (as TimeoutException.InnerException)? Or should we preserve and just wrap the original TaskCanceledException?
      • Either: new TaskCanceledException -> new TimeoutException -> original TaskCanceledException (with original InnerException which may be null)
      • Or: new TaskCanceledException -> new TimeoutException -> original TaskCanceledException.InnerException (may be null)
  3. Throw TaskCanceledException with message mentioning timeout as the reason
    • Pro: Possible to distinguish timeout from explicit cancellation action from the message / logs
    • Con: Cannot be distinguished at runtime, it is "debug only".
    • Open question: Same as in [2] - should we wrap or replace the original TaskCanceledException (and it stack)

I am leaning towards option [2], with discarding original stack of original TaskCanceledException.
@stephentoub @davidsh any thoughts?

BTW: The change should be fairly straightforward in HttpClient.SendAsync where we set up the timeout:
https://github.com/dotnet/corefx/blob/30fb78875148665b98748ede3013641734d9bf5c/src/System.Net.Http/src/System/Net/Http/HttpClient.cs#L433-L461

@davidsh
Copy link
Contributor

davidsh commented Jun 17, 2018

The top-level exception should always be an HttpRequestException. That is the standard API model for HttpClient.

Within that, as inner exception, "timeouts" should probably be a TimeoutException. In fact, if we were to combine this with the other issues (about modifying HttpRequestion to include a new enum property similar to WebExceptionStatus), then developers would only have to check that enum and not have to inspect inner exceptions at all.

This is basically option 1) but continuing to preserve current app-compat API model that top-level exceptions from HttpClient APIs are always HttpRequestException.

Note: that any "timeouts" from directly reading an HTTP response stream from Stream APIs like:

var client = new HttpClient();
Stream responseStream = await client.GetStreamAsync(url);

int bytesRead = await responseStream.ReadAsync(buffer, 0, buffer.Length);

should continue to throw System.IO.IOException etc. as top-level exceptions (we actually have a few bugs in SocketsHttpHandler about this).

@karelz
Copy link
Member

karelz commented Jun 17, 2018

@davidsh are you suggesting to throw HttpRequestException even in cases when there is explicit user cancellation? Those throw TaskCanceledException today and it seems fine. Not sure if it is something we want to change ...
I'd be ok to expose client-side timeouts as HttpRequestException instead of TaskCanceledException.

@davidsh
Copy link
Contributor

davidsh commented Jun 17, 2018

@davidsh are you suggesting to throw HttpRequestException even in cases when there is explicit user cancellation? Those throw TaskCanceledException today and it seems fine. Not sure if it is something we want to change …

I need to test out the different behaviors of the stacks including .NET Framework to verify the exact current behaviors before I can answer this question.

Also, in some cases, some of our stacks are throwing HttpRequestException with inner exception of OperationCanceledException. TaskCanceledException is a derived type of OperationCanceledException.

when there is explicit user cancellation

We also need clear definition of what "explicit" is. For example, is setting the HttpClient.Timeout property "explicit"? I think we're saying no. What about calling .Cancel() on the CancellationTokenSource object (which affects the passed in CancellationToken object)? Is that "explicit"? There are probably other examples we can think of.

@karelz
Copy link
Member

karelz commented Jun 17, 2018

.NET Framework also throws TaskCanceledException when you set HttpClient.Timeout.

@eocron
Copy link

eocron commented Oct 9, 2018

If you can't distinguish between cancellation from client and cancellation from implementation - just check both. Before using client token create derived one:

var clientToken = ....
var innerTokenSource = CancellationTokenSource.CreateLinkedTokenSource(clientToken);

If at some point you catch OperationCancelledException:

try
{
     //invocation with innerToken
}
catch(OperationCancelledException oce)
{
      if(clientToken.IsCancellationRequested)
          throw; //as is, it is client initiated and in higher priority
      if(innerToken.IsCancellationRequested)
           //wrap it in HttpException, TimeoutException, whatever, wrap it in another linked token until you process all cases.
}

Basically you filter all possible layers from client token to deepest one, until you find token which caused cancellation.

@ericsampson
Copy link

Any updates on this making 2.2 or 3.0? @karelz @davidsh, cc @NickCraver . I kind of wish it would just do option 1) of throwing a TimeoutException, or alternatively a new HttpTimeoutException that is derived from HttpRequestException. Thanks!

@thomaslevesque
Copy link
Member

Any updates on this making 2.2 or 3.0?

Too late for 2.2, it was released last week ;)

@ericsampson
Copy link

ericsampson commented Dec 14, 2018

Too late for 2.2, it was released last week ;)

Argh, that's what happens when I take a break ;)

@Hobray
Copy link

Hobray commented Dec 21, 2018

Albeit a little goofy, the most simplistic workaround I've found is to nest a catch handler for OperationCanceledException and then evaluate the value of IsCancellationRequested to determine whether or not the cancellation token was the result of a timeout or an actual session abortion. Clearly the logic performing the API call would likely be in a business logic or service layer but I've consolidated the example for simplicity:

[HttpGet]
public async Task<IActionResult> Index(CancellationToken cancellationToken)
{
	try
	{
		//This would normally be done in a business logic or service layer rather than in the controller itself but I'm illustrating the point here for simplicity sake
		try
		{
			//Using HttpClientFactory for HttpClient instances      
			var httpClient = _httpClientFactory.CreateClient();

			//Set an absurd timeout to illustrate the point
			httpClient.Timeout = new TimeSpan(0, 0, 0, 0, 1);

			//Perform call that requires special timeout logic                
			var httpResponse = await httpClient.GetAsync("https://someurl.com/api/long/running");

			//... (if GetAsync doesn't fail, handle the response as desired)
		}
		catch (OperationCanceledException innerOperationCanceled)
		{
			//If a canceled token exception occurs due to a timeout, "IsCancellationRequested" should be false
			if (cancellationToken.IsCancellationRequested)
			{
				//Bubble exception to global handler
				throw innerOperationCanceled;
			}
			else
			{
				//... (perform timeout logic here)
			}                    
		}                

	}
	catch (OperationCanceledException operationCanceledEx)
	{
		_logger.LogWarning(operationCanceledEx, "Request was aborted by the end user.");
		return new StatusCodeResult(499);
	}
	catch (Exception ex)
	{
		_logger.LogError(ex, "An unexepcted error occured.");
		return new StatusCodeResult(500);
	}
}

I've seen other articles about how to handle this scenario more elegantly but they all require digging into the pipeline etc. I realize that this approach isn't perfect by my hope is that there will be better support for actual timeout exceptions with a future release.

@dotnetchris
Copy link

Working with http client is utterly abysmal. HttpClient needs deleted and started over from scratch.

@dotnetchris
Copy link

Albeit a little goofy, the most simplistic workaround I've found is to nest a catch handler for OperationCanceledException and then evaluate the value of IsCancellationRequested to determine whether or not the cancellation token was the result of a timeout or an actual session abortion. Clearly the logic performing the API call would likely be in a business logic or service layer but I've consolidated the example for simplicity:

[HttpGet]
public async Task<IActionResult> Index(CancellationToken cancellationToken)
{
	try
	{
		//This would normally be done in a business logic or service layer rather than in the controller itself but I'm illustrating the point here for simplicity sake
		try
		{
			//Using HttpClientFactory for HttpClient instances      
			var httpClient = _httpClientFactory.CreateClient();

			//Set an absurd timeout to illustrate the point
			httpClient.Timeout = new TimeSpan(0, 0, 0, 0, 1);

			//Perform call that requires special timeout logic                
			var httpResponse = await httpClient.GetAsync("https://someurl.com/api/long/running");

			//... (if GetAsync doesn't fail, handle the response as desired)
		}
		catch (OperationCanceledException innerOperationCanceled)
		{
			//If a canceled token exception occurs due to a timeout, "IsCancellationRequested" should be false
			if (cancellationToken.IsCancellationRequested)
			{
				//Bubble exception to global handler
				throw innerOperationCanceled;
			}
			else
			{
				//... (perform timeout logic here)
			}                    
		}                

	}
	catch (OperationCanceledException operationCanceledEx)
	{
		_logger.LogWarning(operationCanceledEx, "Request was aborted by the end user.");
		return new StatusCodeResult(499);
	}
	catch (Exception ex)
	{
		_logger.LogError(ex, "An unexepcted error occured.");
		return new StatusCodeResult(500);
	}
}

I've seen other articles about how to handle this scenario more elegantly but they all require digging into the pipeline etc. I realize that this approach isn't perfect by my hope is that there will be better support for actual timeout exceptions with a future release.

It's absolutely bonkers we need to write this atrocious code. HttpClient is the leakiest abstraction ever created by Microsoft

@Hobray
Copy link

Hobray commented Jan 8, 2019

I agree that it seems rather short sighted to not include more specific exception information around timeouts. Seems like a pretty basic thing for people to want to handle actual timeouts differently than cancelled token exceptions.

@ericsampson
Copy link

Working with http client is utterly abysmal. HttpClient needs deleted and started over from scratch.\

That's not very helpful or constructive feedback @dotnetchris . Customers and MSFT employees alike are all here to try and make things better, and that's why folks like @karelz are still here listening to customer feedback and trying to improve. There is no requirement for them to do development in the open, and let's not burn their good will with negativity or they could decide to take their ball and go home, and that would be worse for the community. Thanks! :)

@andre-ss6
Copy link

We agreed on that point @ericsampson but that would be breaking change for many existing users. As @scalablecory mentioned we compromised to make improvement while limiting compatibility damage.

Although I don't necessarily disagree with the wrapping option, I think it's worth pointing out that - as far as I've understood - the issue is not that the derived option is a breaking change, but that the wrapping one is simpler. As @stephentoub mentioned earlier in the discussion, throwing a derived exception is not considered a breaking change by the corefx guidelines.

@sksk571
Copy link

sksk571 commented Nov 2, 2019

Deriving HttpTimeoutException from TaskCanceledException violates “is a” relationship between these two exceptions. It will still make calling code to think that cancellation happened, unless it specifically handles HttpTimeoutException. This is basically the same handling that we have to do at the moment when we check for cancellation token being canceled to determine whether it’s a timeout or not. Besides, having two timeout exceptions in core library is confusing.
Adding TimeoutException as an inner exception to TaskCanceledException will not provide any additional information in the most cases, as we can already check cancellation token and determine whether it’s timeout or not.

I think the only sensible solution is to change exception thrown on timeout to TimeoutException. Yes, it will be a breaking change, but the issue already spans several major releases and breaking changes is what major releases are for.

@qidydl
Copy link

qidydl commented Nov 4, 2019

Can the documentation for existing versions be updated to indicate that methods could throw TaskCanceledException and/or OperationCanceledException (both concrete types appear to be a possibility) when a timeout occurs? Currently the documentation indicates HttpRequestException deals with timeouts, which is incorrect and misleading.

@scalablecory
Copy link
Contributor

@qidydl that's the plan... will fully document any ways timeouts can manifest and how to detail with them.

knocte referenced this issue in nblockchain/geewallet Nov 18, 2019
Hopefully the last attempt to fix the weird cancellation bug [1][2]
(that currently doesn't crash the app anymore, but generates a warning
and marks the balance with '?' to mean 'unknown').

This change makes it so that there's independent cancellationTokenSources
for each job, instead of having one shared across the FaultTPClient.Query
work, because I suspected that one job running HttpClient and facing the
TCE (to denote a timeout) could make the other jobs be cancelled as well.

[1] https://gitlab.com/knocte/geewallet/issues/125
[2] https://github.com/dotnet/corefx/issues/20296 ?
knocte referenced this issue in nblockchain/geewallet Nov 24, 2019
I think I just found the culprit for the deadly TCE problem I've been
having for the past few weeks/months:
https://gitlab.com/knocte/geewallet/issues/125

The root cause might have been this .NETCore's BCL bug: [1].

However, after working on a workaround (which I committed here[2]),
I was still getting cancellation[3] in code that wasn't supposed to
cancel at all (at least from an external cancellationSource, since
this only is supposed to happen when the user sends the app to the
background while it is in the middle of refreshing balances; unlike
what happens with the first retreival of balances on the Loading page).

After committing the workaround, I started ironing out issues but which
were just derived from the new workaround approach, while still overlooking
the real root cause, which I just realised is the following: balance
queries in the Ether world (i.e. ETH, ETC, DAI) need sometimes 2 distinct
queries (not just one), to retreived both the confirmed and the unconfirmed
balance.

This was the reason why I was only reproducing this problem in this kind
of currencies (which led me to believe the real culprit was the HttpClient
bug[1], because the UTXO-based ones don't use HttpClient...). And another
reason why this was hard to reproduce is because wallets that already have
cached balances used them as a reference point (against unconfirmed balance
amount) instead of querying for both unconfirmed&confirmed balances, so
most people would reproduce this only the first time they used geewallet.

The fix (see the diff of this commit) might look too simple, but it would
have not been possible without the change in [2] (and especially, [4]). It
basically consists of avoiding to dispose(or cancel) the cancelSource from
within FaultTolerantParallelClient itself, and only let external code do
it. This way, finishing a confirmed-balance-query would not cancel an
unconfirmed-balance-query for the same account, and viceversa.

I was actually going to commit this change just to dismiss this issue
altogether and hide the shit under the rug, but the only thing that needed
to be done besides the change is *UNDERSTAND* why it's needed, hence this
long commit message.

[1] https://github.com/dotnet/corefx/issues/20296
[2] 54b2a36
[3] e.g. https://sentry.io/organizations/nblockchain/issues/1353874620
[4] ab586a6
@alnikola alnikola self-assigned this Jan 27, 2020
@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 5.0 milestone Jan 31, 2020
alnikola added a commit that referenced this issue Feb 19, 2020
…on when request times out (#2281)

Currently, HttpClient throws the same TaskCancellationException regardless of the request cancellation reason that can be caller's cancellation, all pending request cancellation or timeout. This makes it impossible to handle a request timeout in a way different from all other cases (e.g. special retry logic).

This PR adds a timeout detection logic into HttpClient. It watches for all TaskCancelledExceptions and catches the one caused by timeout. Then, it creates two new exceptions and build a hierarchy. The first is a TimeoutException having its InnerException set to the original TaskCancelledException. The second is a new TaskCancelledException having its InnerException set to that new TimeoutException, but preserving the original stack trace, message and cancellation token. Finally, this top-level TaskCancelledException gets thrown.

Fixes #21965
@ericsampson
Copy link

Thanks @alnikola and @scalablecory and @davidsh and @karelz and everyone else involved in this discussion/implementation, I really appreciate it.

@adydeshmukh
Copy link

adydeshmukh commented Dec 11, 2020

Don't see a Timeout exception wrapped inside TaskCanceledException on PostAsync() call, it is returning under aggregate exception - .net version 4.5
System.AggregateException: One or more errors occurred. (A task was canceled.)
---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
at System.Threading.Tasks.Task1.GetResultCore(Boolean waitCompletionNotification) at System.Threading.Tasks.Task1.get_Result()

@wfurt
Copy link
Member

wfurt commented Dec 11, 2020

This discussion his not relevant to .NET Framework @adydeshmukh. That is completely different implementation.

@dotnet dotnet locked as resolved and limited conversation to collaborators Jan 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net.Http enhancement Product code improvement that does NOT require public API changes/additions
Projects
None yet