[Bug]: Regression: spannerio RESOURCE_EXHAUSTED: No session available in the pool #32697

Closed
1 of 17 tasks
nielm opened this issue Oct 8, 2024 · 3 comments

nielm commented Oct 8, 2024

What happened?

#31663 introduced a regression in Dataflow for streaming pipelines using SpannerIO.

The Spanner client allows a maximum of 400 sessions in a session pool by default, and each read or write operation checks out one session.

Streaming pipelines on Dataflow can run with a high degree of parallelisation, and each work item can both read from and write to Spanner. Dataflow will have up to 300 threads per worker, which potentially means 300 simultaneous read sessions and 300 simultaneous downstream writes, i.e. more sessions than the pool can hold.
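The arithmetic behind this can be sketched as follows. This is a rough illustration only: the class and method names are hypothetical (not Beam or Spanner APIs), and it assumes the worst case where every thread holds one session for its read and one for its write at the same time.

```java
// Illustrative only: names are hypothetical, not part of Beam or the Spanner client.
public final class SessionDemand {

    /**
     * Worst-case number of sessions held on one worker, assuming every thread
     * has one session checked out per Spanner operation (read, write) at once.
     */
    static int peakSessions(int threadsPerWorker, int opsPerWorkItem) {
        return threadsPerWorker * opsPerWorkItem;
    }

    /** True when the worst-case demand exceeds the pool limit. */
    static boolean canExhaust(int peak, int maxSessions) {
        return peak > maxSessions;
    }

    public static void main(String[] args) {
        int peak = peakSessions(300, 2); // 300 threads, read + write
        System.out.println("peak=" + peak + " max=400 canExhaust=" + canExhaust(peak, 400));
        // prints: peak=600 max=400 canExhaust=true
    }
}
```

Real pipelines will rarely hit the full worst case, which fits with the failure only showing up in some scenarios.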

In some scenarios, this can lead to session pool exhaustion, after which a worker fails because of this flag, with the following stack trace:

Error message from worker: generic::unknown:
org.apache.beam.sdk.util.UserCodeException: com.google.cloud.spanner.SpannerException: RESOURCE_EXHAUSTED: No session available in the pool. Maximum number of sessions in the pool can be overridden by invoking 
SessionPoolOptions#Builder#setMaxSessions. Client can be made to block rather than fail by setting
SessionPoolOptions#Builder#setBlockIfPoolExhausted.
There are currently 400 sessions checked out:
  
com.google.cloud.spanner.SessionPool$LeakedSessionException: Session was checked out from the pool at 2024-10-03T08:30:13.020Z
 
com.google.cloud.spanner.SessionPool$PooledSessionFuture.markCheckedOut(SessionPool.java:1338)
com.google.cloud.spanner.SessionPool$PooledSessionFuture.access$6200(SessionPool.java:1317)
com.google.cloud.spanner.SessionPool.checkoutSession(SessionPool.java:3270)
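For readers unfamiliar with the two behaviours the error message mentions (fail vs. block when the pool is exhausted), here is a minimal stand-alone model. It is not the Spanner client's implementation: `BoundedSessionPool` and its methods are hypothetical names, and a `java.util.concurrent.Semaphore` stands in for the real session pool.

```java
import java.util.concurrent.Semaphore;

// Toy model of a bounded pool; NOT the Spanner client's SessionPool.
public final class BoundedSessionPool {
    private final Semaphore permits;
    private final boolean blockIfExhausted;

    BoundedSessionPool(int maxSessions, boolean blockIfExhausted) {
        this.permits = new Semaphore(maxSessions);
        this.blockIfExhausted = blockIfExhausted;
    }

    /** Check out one session: wait for a free one, or fail fast if none is left. */
    void checkout() {
        if (blockIfExhausted) {
            permits.acquireUninterruptibly(); // caller waits until a session is released
        } else if (!permits.tryAcquire()) {
            throw new IllegalStateException("RESOURCE_EXHAUSTED: No session available in the pool");
        }
    }

    /** Return one session to the pool. */
    void release() {
        permits.release();
    }

    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        BoundedSessionPool pool = new BoundedSessionPool(400, false);
        for (int i = 0; i < 400; i++) {
            pool.checkout(); // 400 sessions checked out, none released
        }
        try {
            pool.checkout(); // the 401st checkout fails in fail-fast mode
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the fail-fast mode the 401st caller gets an exception, which matches the shape of the failure above; in the blocking mode the same caller would wait instead, trading the error for added latency.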

Reverted in #32694.

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner

github-actions bot commented Oct 8, 2024

Label cannot be managed because it does not exist in the repo. Please check your spelling.

nielm commented Oct 8, 2024

.add-labels spanner,io,streaming,gcp,dataflow

github-actions bot added the dataflow, gcp, io, spanner, and streaming labels on Oct 8, 2024

nielm commented Oct 8, 2024

@Abacn
