Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken gRPC calls with Istio Service Mesh and Envoy #1237

Open
paoloyx opened this issue Jun 7, 2021 · 1 comment
Open

Broken gRPC calls with Istio Service Mesh and Envoy #1237

paoloyx opened this issue Jun 7, 2021 · 1 comment

Comments

@paoloyx
Copy link

paoloyx commented Jun 7, 2021

Hi everyone,

I don't really know if here is the right place to ask, but I'll try anyway. For the sake of clarity, I've asked the same on Envoy GitHub as I don't really know where the problem lies: envoyproxy/envoy#16811

A bit of context

We've an Istio 1.8.3 installation on a 1.18.x Kubernetes clusters. There are two pods (let's say ORIGIN and DEST) where ORIGIN is making a gRPC call to DEST, through a Kubernetes service that is proxying the DEST pod.

The situation is this:

  • We are using higherkindness/mu-scala
  • without Istio sidecars (i.e. Envoy proxies) all is going well.
  • With the sidecars the gRPC communication is not working, we receive an error like Received unexpected EOS on empty DATA frame from server.

Here are the logs, one for successful communication (without sidecars) and one for bad communication (with sidecars)

Successful communications logs

ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.417 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=false settings={ENABLE_PUSH=0, MAX_CONCURRENT_STREAMS=0, INITIAL_WINDOW_SIZE=1048576, MAX_HEADER_LIST_SIZE=8192}																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.430 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=983041																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.515 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=false settings=[id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100]																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.516 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.535 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.542 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[:authority: dest-service:9100, :path: /x.y.z/Create, :method: POST, :scheme: http, content-type: application/grpc, te: trailers, user-agent: grpc-java-netty/1.36.1, grpc-accept-encoding: gzip] streamDependency=0 weight=16 exclusive=false padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.551 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND DATA: streamId=3 padding=0 endStream=true length=50 bytes=000000002d0a2439383937636436392d646636632d346362612d386631392d30613562316332613237663412057465737439																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:44.718 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=9934465																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.500 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND HEADERS: streamId=3 headers=GrpcHttp2ResponseHeaders[:status: 200, content-type: application/grpc+proto, grpc-encoding: gzip, date: Thu, 03 Jun 2021 16:18:45 GMT] padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc8bb1580201004c0dcd01a48f73de0be9473cad27f093ae15c6df98c71b6224c079429a8ca05299fbd18215177eb41ef...																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND PING: ack=false bytes=1234																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.521 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND HEADERS: streamId=3 headers=GrpcHttp2ResponseHeaders[grpc-status: 0, date: Thu, 03 Jun 2021 16:18:45 GMT] padding=0 endStream=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.530 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND PING: ack=true bytes=1234																									
ORIGIN-POD ORIGIN-POD INFO  2021-06-03 16:18:45.542 [ioapp-compute-0] i.r.c.Server$ - [APPLICATION_LOGS_BLAH_BLAH]																								
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.552 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] OUTBOUND GO_AWAY: lastStreamId=0 errorCode=0 length=0 bytes=																									

Bad Communication logs

ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.427 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=false settings={ENABLE_PUSH=0, MAX_CONCURRENT_STREAMS=0, INITIAL_WINDOW_SIZE=1048576, MAX_HEADER_LIST_SIZE=8192}																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.475 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=983041																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.487 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=false settings={HEADER_TABLE_SIZE=4096, MAX_CONCURRENT_STREAMS=2147483647, INITIAL_WINDOW_SIZE=268435456,=0}																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.490 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.511 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND SETTINGS: ack=true																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND WINDOW_UPDATE: streamId=0 windowSizeIncrement=268369921																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.521 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND HEADERS: streamId=3 headers=GrpcHttp2OutboundHeaders[:authority: dest-service:9100, :path: /x.y.z/Create, :method: POST, :scheme: http, content-type: application/grpc, te: trailers, user-agent: grpc-java-netty/1.36.1, grpc-accept-encoding: gzip] streamDependency=0 weight=16 exclusive=false padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:30.533 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND DATA: streamId=3 padding=0 endStream=true length=50 bytes=000000002d0a2439383937636436392d646636632d346362612d386631392d30613562316332613237663412057465737439																																															
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.071 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND HEADERS: streamId=3 headers=GrpcHttp2ResponseHeaders[:status: 200, grpc-encoding: gzip, date: Thu, 03 Jun 2021 15:13:31 GMT, content-type: application/grpc+proto, x-envoy-upstream-service-time: 529, server: envoy] padding=0 endStream=false																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.113 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc9b911c0300804c0dca16b50ca0c423ca21c71c6fd97606fbad710e6ee9a2099aca406a1b4d8a4484b632b71dc2375c9...																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.114 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND PING: ack=false bytes=1234																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.125 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=true length=0 bytes=																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.129 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND PING: ack=true bytes=1234																									
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.160 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] OUTBOUND GO_AWAY: lastStreamId=0 errorCode=0 length=0 bytes=
[APPLICATION-LOGS_BLAH_BLAH]																							
ORIGIN-POD ORIGIN-POD io.grpc.StatusRuntimeException: INTERNAL: Received unexpected EOS on empty DATA frame from server																									
ORIGIN-POD ORIGIN-POD	at io.grpc.Status.asRuntimeException(Status.java:535)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.DelayedClientCall$DelayedListener$3.run(DelayedClientCall.java:464)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.DelayedClientCall$DelayedListener.delayOrExecute(DelayedClientCall.java:428)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.DelayedClientCall$DelayedListener.onClose(DelayedClientCall.java:461)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)																								
ORIGIN-POD ORIGIN-POD	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)	

Differences in logs

Unfortunately I'm no expert in debugging gGPC services, and There's one thing that i clearly see:

  • in succesful calls (without envoys) we've got only one INBOUND DATA log line:
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 16:18:45.512 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x0a2ce423, L:/10.172.22.165:60318 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc8bb1580201004c0dcd01a48f73de0be9473cad27f093ae15c6df98c71b6224c079429a8ca05299fbd18215177eb41ef...	
  • in errored calls (with envoys) we've got TWO INBOUND DATA log lines:
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.113 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=false length=89 bytes=01000000541f8b08000000000000000dc9b911c0300804c0dca16b50ca0c423ca21c71c6fd97606fbad710e6ee9a2099aca406a1b4d8a4484b632b71dc2375c9...																									
																								
ORIGIN-POD ORIGIN-POD DEBUG 2021-06-03 15:13:31.125 [grpc-nio-worker-ELG-1-1] i.g.n.NettyClientHandler - [id: 0x600c3d09, L:/10.172.22.162:33746 - R:dest-service/10.175.246.61:9100] INBOUND DATA: streamId=3 padding=0 endStream=true length=0 bytes=	

and the last one has length=0 bytes, so I suppose that this is giving the Received unexpected EOS on empty DATA frame from server error

Envoy logs

Just to add an important info, here are the Envoy proxies logs for above communication

ORIGIN pod

[2021-05-28T15:32:52.440Z] "POST /x.y.z/Create HTTP/2" 200 - via_upstream - "-" 50 89 534 529 "-" "grpc-java-netty/1.36.1" 2b95ea42-d089-4ae5-9d45-b13dba9e983f"  "dest-service:9100" 10.172.24.7:9000 outbound|9100||dest-service.default.svc.cluster.local 10.172.22.162:35838 10.175.246.61:9100 10.172.22.162:33746 - default

DEST pod

[2021-05-28T15:32:52.442Z]  "POST /x.y.z/Create HTTP/2"  200  -  via_upstream  -  "-"  0  135  643  640  "-"  "grpc-java-netty/1.36.1"  "96fa90ba-ab5e-4d03-818e-c762f5 │  │ 92aeb1"  "dest-service:9100"  "127.0.0.1:9000"  inbound|9000||  127.0.0.1:45170  10.172.24.7:9000  10.172.22.162:35838  outbound_.9100_._.dest-service.default.svc.cluster.local  default

Hope that this is enough to let someone of you understand what my problem is.
Thanks

@mvazquezrius
Copy link

Did you figure it out?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants