Calls Hanging when calling /metrics to export Large Number of Metrics when using .net core and IIS #440

Open
robert-thorne opened this issue Sep 20, 2023 · 5 comments

Comments

@robert-thorne

robert-thorne commented Sep 20, 2023

We are hosting an ASP.NET Core app (.NET 6; we also tried .NET 7) in IIS. It exposes a large number of Prometheus metrics (several thousand) through the middleware provided by prometheus-net.

We've run into an issue where requests to the /metrics endpoint hang, whether made with curl or from a browser.

After a deep investigation, this appears to be caused by the way prometheus-net writes metrics to the HttpResponseStream with WriteAsync in TextSerializer.cs. For example, WriteFamilyDeclarationAsync makes 9 WriteAsync calls and WriteIdentifierPartAsync makes around 13, and some of these write only a newline or a space character.

The problem is that there appears to be a limit of 65535 WriteAsync calls to an HttpResponseStream, after which a cancellation token is triggered that kills the response, and the client then hangs forever. This does not happen when IIS is not used. The middleware appears to ignore the cancellation, treating it as if the client had cancelled the request, which is not the case.

Our suggestion is to change how this stream is written: push the data into, say, a MemoryStream or StringBuilder first, then output it to the HttpResponseStream in larger chunks.
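
A minimal sketch of that idea, not prometheus-net's actual code: the many small writes go to an in-memory staging stream, which is then copied to the response stream afterwards. `StagedWriter`, `WriteStagedAsync`, and the `writeBody` callback are hypothetical names used only for illustration; the target is assumed to be any writable `Stream`, such as the response body.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class StagedWriter
{
    // Hypothetical helper: runs a serializer that makes many small writes
    // against an in-memory stream, then copies the result to the target.
    public static async Task WriteStagedAsync(
        Stream target,
        Func<Stream, CancellationToken, Task> writeBody,
        CancellationToken cancel = default)
    {
        using var staging = new MemoryStream();

        // Every small write lands in memory instead of on the target stream.
        await writeBody(staging, cancel);

        // Rewind, then hand the accumulated bytes to the target.
        staging.Position = 0;
        await staging.CopyToAsync(target, cancel);
    }
}
```

The trade-off is that the whole payload is held in memory before anything is sent, which should be acceptable for a few thousand metrics but is worth measuring for very large exports.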

@DzmitryBratchuk

Hi @robert-thorne, is there any news on this issue? I hit the same hang on IIS today.
@sandersaares, can you take a look at it when you have time?

@DzmitryBratchuk

I think I was able to solve this problem for myself.
I removed endpoints.MapMetrics() from Startup and added an endpoint whose implementation uses a MemoryStream.

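[HttpGet]
[Route("metrics")]
public async Task<IActionResult> GetMetrics(CancellationToken cancellationToken)
{
	// Serialize all metrics into memory first, so the many small writes never reach the response stream.
	using var memoryStream = new MemoryStream();

	await Metrics.DefaultRegistry.CollectAndExportAsTextAsync(memoryStream, cancellationToken);

	// Rewind before reading back what was just written.
	memoryStream.Position = 0;

	using var streamReader = new StreamReader(memoryStream);

	// Note: the ReadToEndAsync(CancellationToken) overload requires .NET 7 or later.
	var metrics = await streamReader.ReadToEndAsync(cancellationToken);

	// Return the whole payload as a single text response.
	return Content(metrics);
}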

@mwasson74

Facing the same hanging issue here, too.

@Daniel15
Contributor

Daniel15 commented Dec 6, 2023

Lots of small writes also reduce the efficiency of compression. In particular, a few years ago I noticed that compressing prometheus-net responses with Brotli actually produced responses that were larger than the uncompressed ones: dotnet/aspnetcore#21685, dotnet/runtime#36245

ASP.NET MVC uses a 16 KB buffer (https://github.com/dotnet/aspnetcore/blob/b98185b6376b966c5a051926986acaf204fe4e76/src/Http/WebUtilities/src/HttpResponseStreamWriter.cs#L18) to mitigate issues caused by a large number of small writes. That might be a good idea for prometheus-net, too, for example by using BufferedStream.
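
As a rough sketch of the BufferedStream idea (an illustration under assumptions, not what prometheus-net does internally): small writes are collected in a 16 KB buffer and forwarded to the underlying stream only when the buffer fills or is flushed. `BufferedExport`, `WriteLinesAsync`, and the `sample_metric` lines are hypothetical stand-ins for the real serializer output.

```csharp
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class BufferedExport
{
    // 16 KB, mirroring the buffer size mentioned above; tune as needed.
    private const int BufferSize = 16 * 1024;

    // Hypothetical helper: writes many tiny fragments through a BufferedStream.
    public static async Task WriteLinesAsync(
        Stream target, int lineCount, CancellationToken cancel = default)
    {
        // Deliberately not disposed: disposing a BufferedStream also
        // closes the underlying stream, which the caller still owns.
        var buffered = new BufferedStream(target, BufferSize);
        byte[] newline = Encoding.UTF8.GetBytes("\n");

        for (int i = 0; i < lineCount; i++)
        {
            byte[] line = Encoding.UTF8.GetBytes($"sample_metric{{id=\"{i}\"}} 1");

            // Two small writes per line, both absorbed by the buffer.
            await buffered.WriteAsync(line, 0, line.Length, cancel);
            await buffered.WriteAsync(newline, 0, newline.Length, cancel);
        }

        // Push whatever is still buffered down to the target.
        await buffered.FlushAsync(cancel);
    }
}
```

Unlike staging the whole payload in memory, this keeps memory use bounded by the buffer size while still cutting the number of writes that reach the underlying stream.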

Edit: Looks like buffering was just added by @sandersaares two days ago: 1bff752. I suspect this may fix the issue.

@mwasson74

Do we have any updates on this issue?
