Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronous Lambda Timeout with Lambda SDK invocation #3227

Closed
ThomasGaboriau opened this issue Mar 18, 2024 · 8 comments
Closed

Synchronous Lambda Timeout with Lambda SDK invocation #3227

ThomasGaboriau opened this issue Mar 18, 2024 · 8 comments
Labels
bug This issue is a bug. module/sdk-custom

Comments

@ThomasGaboriau
Copy link

ThomasGaboriau commented Mar 18, 2024

Describe the bug

I invoke a possibly long running synchronous Lamba (it can either run for 30 seconds or ~10 minutes) with the AmazonLambdaClient, and I wait the result.

The invocation is done from a WebAPI HostedService, hosted in ECS.

Short runs are fine, but longer ones will always end after the configured timeout (15 minutes in our case), even though the lambda has finished running successfully.

Expected Behavior

The lambda runs fine.
The InvokeAsync methods returns the result after the lambda ends.

Current Behavior

The lambda runs fine.

timestamp message
2024-03-18T16:28:06.714 INIT_START Runtime Version: dotnet:8.v7 Runtime Version ARN: {arn}
2024-03-18T16:28:07.088 START RequestId: {id} Version: $LATEST
2024-03-18T16:34:08.144 END RequestId: {id}
2024-03-18T16:34:08.144 REPORT RequestId: {id} Duration: 361056.16 ms Billed Duration: 361057 ms Memory Size: 1792 MB Max Memory Used: 172 MB Init Duration: 372.30 ms

The InvokeAsync throws a TimeoutException after 15 minutes

timestamp message
2024-03-18T16:28:06.178 AWSSDKUtils 10|2024-03-18T16:28:06.177Z|DEBUG|Double encoded /2015-03-31/functions/{FunctionName}/invocations with endpoint https://lambda.eu-west-1.amazonaws.com/ for canonicalization: /2015-03-31/functions/arn%{arn}
2024-03-18T16:43:06.197 AmazonLambdaClient 11|2024-03-18T16:43:06.193Z|ERROR|An exception of type TimeoutException was handled in ErrorHandler. --> System.TimeoutException: The operation was canceled.
 ---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
 ---> System.Net.Sockets.SocketException (125): Operation canceled
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](CancellationToken cancellationToken, Int32 estimatedSize)
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](Memory`1 buffer, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   --- End of inner exception stack trace ---
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
AmazonLambdaClient 12|2024-03-18T16:43:06.198Z|ERROR|TimeoutException making request InvokeRequest to https://lambda.eu-west-1.amazonaws.com/. Attempt 1. --> System.TimeoutException: The operation was canceled.
 ---> System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
 ---> System.Net.Sockets.SocketException (125): Operation canceled
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Security.SslStream.EnsureFullTlsFrameAsync[TIOAdapter](CancellationToken cancellationToken, Int32 estimatedSize)
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at System.Net.Security.SslStream.ReadAsyncInternal[TIOAdapter](Memory`1 buffer, CancellationToken cancellationToken)
   at System.Runtime.CompilerServices.PoolingAsyncValueTaskMethodBuilder`1.StateMachineBox`1.System.Threading.Tasks.Sources.IValueTaskSource<TResult>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   --- End of inner exception stack trace ---
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken)
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Signer.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)

Reproduction Steps

At first I used the AmazonLambdaConfig Timeout property, and the tried to use a factory to add a keep-alive.

Custom Http Factory :

public class AwsLambdaHttpClientFactory : HttpClientFactory
{
    private readonly IHttpClientFactory _httpClientFactory;

    public AwsLambdaHttpClientFactory(IHttpClientFactory clientFactory): base()
    {
        _httpClientFactory = clientFactory; 
    }

    public override HttpClient CreateHttpClient(IClientConfig clientConfig)
    {
        HttpClient client = _httpClientFactory.CreateClient();
        client.Timeout = TimeSpan.FromMinutes(15);
        client.DefaultRequestHeaders.Connection.Add("Keep-Alive");
        client.DefaultRequestHeaders.Add("Keep-Alive", "timeout=901");
        return client;
    }
}

Lambda client configuration, with AWS loggin config :

_lambda = new AmazonLambdaClient(new AmazonLambdaConfig()
{
    LogResponse = true,
    MaxErrorRetry = 0,
    HttpClientFactory = new AwsLambdaHttpClientFactory(httpClientFactory),
});
AWSConfigs.LoggingConfig.LogMetrics = true;
AWSConfigs.LoggingConfig.LogTo = LoggingOptions.Console;
AWSConfigs.LoggingConfig.LogResponses = ResponseLoggingOption.Always; 

Retries are at 0 because the lambda would retry after a timeout that succeeded.

Lambda invocation :

InvokeRequest lambdaRequest = new()
{
    FunctionName = Constants.AWS.RecoveryTaskArn,
    Payload = json,
};
InvokeResponse response = await _lambda.InvokeAsync(lambdaRequest, stoppingToken).ConfigureAwait(false); 

Possible Solution

Did I miss or missused some information or configuration ?

Additional Information/Context

No response

AWS .NET SDK and/or Package version used

AWSSDK.Lambda 3.7.304.2

Targeted .NET Platform

.NET 8

Operating System and version

Debian 12 (docker image : 8.0-bookworm-slim-arm64v8)

@ThomasGaboriau ThomasGaboriau added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Mar 18, 2024
@ashishdhingra
Copy link
Contributor

@ThomasGaboriau Good afternoon. Kindly advise if you

  • Have tried setting ReadWriteTimeout on AmazonLambdaConfig, in addition to Timeout property.
  • Instead of implementing custom HttpClientFactory, rely on SDK's mechanism to use the default Http client and retry mechanism.
  • If your Lambda function waiting for response to the sent back to response stream?
  • What happens if you change last line to InvokeResponse response = _lambda.InvokeAsync(lambdaRequest, stoppingToken).Result?

Thanks,
Ashish

@ashishdhingra ashishdhingra added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed needs-triage This issue or PR still needs to be triaged. labels Mar 18, 2024
@ThomasGaboriau
Copy link
Author

Hello @ashishdhingra , thanks for the quick answer !

  • At first I ignored it as it is marked obsolete. Just added it, with the same 15 minutes than the timeout, and it didn't change anything
  • Yes my first tries were with the default HttpClient. I implemented a custom factory to try to set the timeout directly on the client, and to add the keep alive header.
  • I'm not sure what you mean there. Does the lambda stream the response back ? No, It is sent as a single block at the end of the function
  • I change the last line, but it also didn't change anaything.

So here is what I now have, just to be sure :

_lambda = new AmazonLambdaClient(new AmazonLambdaConfig()
{
    LogResponse = true,
    MaxErrorRetry = 0,
    Timeout = TimeSpan.FromMinutes(15),
    ReadWriteTimeout = TimeSpan.FromMinutes(15),
});
.....
InvokeRequest lambdaRequest = new()
{
    FunctionName = Constants.AWS.RecoveryTaskArn,
    Payload = json,
};
InvokeResponse response = _lambda.InvokeAsync(lambdaRequest, stoppingToken).Result;

The lambda still succeeded after 6 minutes (forced with a Task.Delay), and the caller still timeout after 15 minutes.

Any other ideas ?

Thanks,
Thomas

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Mar 20, 2024
@ashishdhingra
Copy link
Contributor

@ThomasGaboriau Apologies for delayed reply. We would need to reproduce the issue to come to any conclusion. Could you please share your Lambda configuration, if there is any setting that might be causing the issue. Are you able to reproduce it when running locally, instead of from ECS hosted service?

Thanks,
Ashish

@ashishdhingra ashishdhingra added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Mar 29, 2024
@ThomasGaboriau
Copy link
Author

ThomasGaboriau commented Apr 3, 2024

Hey @ashishdhingra

When running locally and calling the Lambda, I do not reproduce the issue.
I guess the answer might be hidden somewhere in the layers of networking of AWS.

Here is the lambda configuration, please tell me if you need more infos :

  • Runtime : .NET 8
  • Architecture : arm64
  • Timeout : 15min
  • Memory : 1792 MB
  • Ephemeral storage : 512 MB
  • Concurrency : 1

And there is a VPC, but I don't know much about that one.

Thanks,
Thomas

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Apr 4, 2024
@ashishdhingra ashishdhingra added needs-reproduction This issue needs reproduction. and removed needs-review labels Apr 5, 2024
@sudsy
Copy link

sudsy commented Apr 11, 2024

I am seeing this in a similar environment but I thought it might be worth highlighting the differences and what I have tried in case it helps to diagnose the issue.

In my case I first saw the issue running the lambda client from an asp.net web application running .net 8.x in EKS on x86. AWSSDK.Lamda - 3.7.305.5
My eks cluster and the lambda are both connected to a vpc.
The lambda is running .net 8.x on x86

I've tried downgrading both the calling and lambda ends to .net 6 but I didn't see any improvement.

@sudsy
Copy link

sudsy commented Apr 12, 2024

I also built a quick client based on dotnet HttpClient and AwsSignatureVersion4 and I see the same results.

@dscpinheiro
Copy link
Contributor

I'm going to close this as both reports in this issue seem related to the networking setup (which the SDK doesn't have access to).

If you have access to AWS Support, I recommend reaching out as they'll be able to troubleshoot your specific scenario.

If you don't, one option is to use the Reachability Analyzer: VPC Reachability Analyzer determines network reachability for your specified network path when you run the analysis. If your network path is reachable, we display details of the components of the network path. If the path is not reachable, we identify the blocking components. For example, you can select the Lambda ENI as source and the ECS task ENI as destination, and the tool will display hop-by-hop details of the network path between them (note: there's a cost associated with each analysis, see Network Analysis in the Amazon VPC pricing page).

@dscpinheiro dscpinheiro removed needs-reproduction This issue needs reproduction. needs-review labels Apr 26, 2024
Copy link

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. module/sdk-custom
Projects
None yet
Development

No branches or pull requests

4 participants