GrpcServiceException: INTERNAL: RST_STREAM closed stream #881

Open
patriknw opened this issue May 4, 2023 · 1 comment
patriknw commented May 4, 2023

When testing the Projection over gRPC samples in AWS with an Application Load Balancer (ALB ingress controller), the following error occurs:

akka.grpc.GrpcServiceException: INTERNAL: RST_STREAM closed stream. HTTP/2 error code: INTERNAL_ERROR

client logs:

[2023-05-04 11:37:42,136] [DEBUG] [io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler] [] [] [grpc-default-worker-ELG-1-2] - [id: 0x5670e347, L:/192.168.54.227:34182 - R:k8s-shopping-shopping-604179632a-148180922.us-east-2.elb.amazonaws.com/3.14.190.250:443] INBOUND RST_STREAM: streamId=5 errorCode=1

[2023-05-04 11:39:29,467] [DEBUG] [io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler] [] [] [grpc-default-worker-ELG-1-2] - [id: 0x5670e347, L:/192.168.54.227:34182 - R:k8s-shopping-shopping-604179632a-148180922.us-east-2.elb.amazonaws.com/3.14.190.250:443] OUTBOUND PING: ack=false bytes=1111
[2023-05-04 11:39:29,569] [DEBUG] [io.grpc.netty.shaded.io.grpc.netty.NettyClientHandler] [] [] [grpc-default-worker-ELG-1-2] - [id: 0x5670e347, L:/192.168.54.227:34182 - R:k8s-shopping-shopping-604179632a-148180922.us-east-2.elb.amazonaws.com/3.14.190.250:443] INBOUND PING: ack=true bytes=1111

This tears down the connection, and the projections are restarted.

There is nothing in the server logs that explains it.

It happens once per minute, which is the idle timeout of the load balancer.

Tried this on the server, without success:

akka.http.server.http2.ping-interval=10s

In the client we have these channelBuilderOverrides:

    _.keepAliveWithoutCalls(true)
      .keepAliveTime(10, TimeUnit.SECONDS)
      .keepAliveTimeout(5, TimeUnit.SECONDS)
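
For context, a minimal sketch of how these overrides could be wired in, assuming akka-grpc's GrpcClientSettings.withChannelBuilderOverrides and a consumer that connects via connectToServiceAt (the host and port are placeholders):

    import java.util.concurrent.TimeUnit
    import akka.actor.typed.ActorSystem
    import akka.grpc.GrpcClientSettings

    // Sketch only: pass the Netty keepalive settings quoted above to the
    // channel builder used by the consumer's gRPC client.
    def clientSettings(implicit system: ActorSystem[_]): GrpcClientSettings =
      GrpcClientSettings
        .connectToServiceAt("shopping-cart-service.example.com", 443) // placeholder endpoint
        .withTls(true)
        .withChannelBuilderOverrides(
          _.keepAliveWithoutCalls(true)             // ping even when no call is in flight
            .keepAliveTime(10, TimeUnit.SECONDS)    // client ping interval
            .keepAliveTimeout(5, TimeUnit.SECONDS)) // close the connection if the ping is not acked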

One way to work around the problem is to periodically update the consumer filter, i.e. a request from the client. If this is a problem that we need to solve, we could automatically emit keep-alive messages from the GrpcReadJournal.
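
A hypothetical sketch of that first workaround, assuming the ConsumerFilter extension and its UpdateFilter command from akka-projection-grpc (the stream id and criteria would be whatever the consumer already uses):

    import scala.concurrent.duration._
    import akka.actor.typed.ActorSystem
    import akka.projection.grpc.consumer.ConsumerFilter

    // Sketch only: periodically re-send the current (unchanged) filter so that
    // some traffic flows on the stream before the ALB idle timeout expires.
    def scheduleFilterKeepAlive(
        system: ActorSystem[_],
        streamId: String,
        currentCriteria: Vector[ConsumerFilter.FilterCriteria]): Unit = {
      import system.executionContext
      system.scheduler.scheduleWithFixedDelay(30.seconds, 30.seconds) { () =>
        ConsumerFilter(system).ref ! ConsumerFilter.UpdateFilter(streamId, currentCriteria)
      }
    }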

Maybe related: https://stackoverflow.com/questions/66818645/http2-ping-frames-over-aws-alb-grpc-keepalive-ping

@johanandren

I highly suspect the issue is specific to ALB, which does not pass HTTP/2 pings through, does not seem to care about pings from the server to the ALB, and signals timeouts in a weird way: an RST_STREAM with protocol_error to the client (which, as far as I understand it, is supposed to mean the client/server is speaking invalid HTTP/2).

Looks like a possible workaround would be to tune the ALB config to a much longer idle timeout.
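
For example, with the AWS Load Balancer Controller the idle timeout can be raised through an Ingress annotation along these lines (the value is just illustrative):

    alb.ingress.kubernetes.io/load-balancer-attributes: idle_timeout.timeout_seconds=4000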

It may still be worth working around this with gRPC message-level keepalives, but as far as I know we have not seen this from any load balancers/proxies other than ALB, so an upstream fix of some sort also sounds like it would make sense. I couldn't find any public issue tracker for ALB/ELB to look for existing reports.
