Intermittent System.IO.IOException for Schema Registry #2195

Open
4 tasks
betmix-matt opened this issue Mar 1, 2024 · 0 comments

betmix-matt commented Mar 1, 2024

Description

We have a persistent problem with Schema Registry: we can't get requests to it to succeed reliably.
This happens both with Schema Registry in Confluent Cloud and using the Schema Registry deployed through Confluent for Kubernetes.

This happens only very infrequently (possibly less than 1% of the time), so it's very hard to reproduce consistently, but it causes our integration test run to fail nearly 100% of the time because 1 test out of hundreds fails.

At random, a publish request fails with the following error and stack trace:

System.Threading.Tasks.TaskCanceledException : The request was canceled due to the configured HttpClient.Timeout of 30 seconds elapsing.
---- System.TimeoutException : The operation was canceled.
-------- System.Threading.Tasks.TaskCanceledException : The operation was canceled.
------------ System.IO.IOException : Unable to read data from the transport connection: Operation canceled.
---------------- System.Net.Sockets.SocketException : Operation canceled
   at System.Net.Http.HttpClient.HandleFailure(Exception e, Boolean telemetryStarted, HttpResponseMessage response, CancellationTokenSource cts, CancellationToken cancellationToken, CancellationTokenSource pendingRequestsCts)
   at System.Net.Http.HttpClient.<SendAsync>g__Core|83_0(HttpRequestMessage request, HttpCompletionOption completionOption, CancellationTokenSource cts, Boolean disposeCts, CancellationTokenSource pendingRequestsCts, CancellationToken originalCancellationToken)
   at Confluent.SchemaRegistry.RestService.ExecuteOnOneInstanceAsync(Func`1 createRequest)
   at Confluent.SchemaRegistry.RestService.RequestAsync[T](String endPoint, HttpMethod method, Object[] jsonBody)
   at Confluent.SchemaRegistry.RestService.LookupSchemaAsync(String subject, Schema schema, Boolean ignoreDeletedSchemas, Boolean normalize)
   at Confluent.SchemaRegistry.Serdes.ProtobufSerializer`1.<>c__DisplayClass16_0.<<RegisterOrGetReferences>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at Confluent.SchemaRegistry.Serdes.ProtobufSerializer`1.RegisterOrGetReferences(FileDescriptor fd, SerializationContext context, Boolean autoRegisterSchema, Boolean skipKnownTypes)
   at Confluent.SchemaRegistry.Serdes.ProtobufSerializer`1.<>c__DisplayClass16_0.<<RegisterOrGetReferences>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at Confluent.SchemaRegistry.Serdes.ProtobufSerializer`1.RegisterOrGetReferences(FileDescriptor fd, SerializationContext context, Boolean autoRegisterSchema, Boolean skipKnownTypes)
   at Confluent.SchemaRegistry.Serdes.ProtobufSerializer`1.SerializeAsync(T value, SerializationContext context)

How to reproduce

Publish any message that needs to make a request to Schema Registry. We have even tried duplicating the URLs in the schema registry config so that a failed request is retried against the next URL in the list; however, this doesn't seem to have helped.

I would love it if the Schema Registry Client had some kind of retry semantics built in so that it could handle intermittent network failures like this.
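
To illustrate the kind of retry semantics I mean, here is a minimal sketch of a caller-side wrapper. It is not part of Confluent.SchemaRegistry: `TransientRetry`, `ExecuteAsync`, and `IsTransient` are hypothetical names, and the attempt count, delays, and the set of exception types treated as transient are assumptions based on the stack trace above. The `Main` method only simulates a flaky call so the sketch runs on its own.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of Confluent.SchemaRegistry.
public static class TransientRetry
{
    // Runs the operation up to maxAttempts times, backing off exponentially
    // between attempts. The last failure is rethrown to the caller.
    public static async Task<T> ExecuteAsync<T>(
        Func<Task<T>> operation,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        TimeSpan delay = baseDelay ?? TimeSpan.FromMilliseconds(200);

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < maxAttempts && IsTransient(ex))
            {
                // Wait baseDelay, then 2x, then 4x, ... before the next attempt.
                long ticks = delay.Ticks << (attempt - 1);
                await Task.Delay(TimeSpan.FromTicks(ticks)).ConfigureAwait(false);
            }
        }
    }

    // Assumption: these are the exception types worth retrying, taken from the
    // trace above. Note that TaskCanceledException also covers cancellation
    // requested by the caller, which a real implementation should not retry.
    public static bool IsTransient(Exception ex) =>
        ex is TaskCanceledException
        || ex is TimeoutException
        || ex is IOException
        || ex is HttpRequestException;
}

public static class Program
{
    public static async Task Main()
    {
        int calls = 0;

        // Simulated flaky operation: fails twice, then succeeds.
        string result = await TransientRetry.ExecuteAsync(() =>
        {
            calls++;
            if (calls < 3) throw new IOException("simulated transport failure");
            return Task.FromResult("ok");
        }, maxAttempts: 3, baseDelay: TimeSpan.FromMilliseconds(10));

        Console.WriteLine($"{result} after {calls} calls"); // prints "ok after 3 calls"
    }
}
```

In our case the wrapped operation would be the produce or serialize call that triggers the Schema Registry lookup. Because each failed attempt first waits out the 30 second HttpClient timeout, a wrapper like this only hides the failure from the tests; it does not make the slow request any faster, which is why retry handling inside the client would be preferable.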

Checklist

Please provide the following information:

  • A complete (i.e. we can run it), minimal program demonstrating the problem. No need to supply a project file.
  • [2.0.2] Confluent.Kafka nuget version.
  • [7.3.0] Apache Kafka version.
  • [ X] Client configuration.
  • Operating system.
  • [ X] Provide logs (with "debug" : "..." as necessary in configuration).
  • Provide broker log excerpts.
  • Critical issue.