[Failing Test]: apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest is flaky #32069

Closed
2 of 17 tasks
tvalentyn opened this issue Aug 2, 2024 · 2 comments · Fixed by #32293

@tvalentyn

What happened?

The test_batch_size_with_auto_sharding scenario seems to have become flaky recently; I encountered this error in the coverage test suite on some seemingly unrelated PRs:

=================================== FAILURES ===================================
____ PipelineBasedStreamingInsertTest.test_batch_size_with_auto_sharding_0 _____
[gw5] linux -- Python 3.8.18 /runner/_work/beam/beam/sdks/python/test-suites/tox/py38/build/srcs/sdks/python/target/.tox-py38-cloudcoverage/py38-cloudcoverage/bin/python

a = (<apache_beam.io.gcp.bigquery_test.PipelineBasedStreamingInsertTest testMethod=test_batch_size_with_auto_sharding_0>,)
kw = {}

    @wraps(func)
    def standalone_func(*a, **kw):
>       return func(*(a + p.args), **p.kwargs, **kw)

target/.tox-py38-cloudcoverage/py38-cloudcoverage/lib/python3.8/site-packages/parameterized/parameterized.py:620: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
apache_beam/io/gcp/bigquery_test.py:2239: in test_batch_size_with_auto_sharding
    self.assertEqual(out1['colA_values'], ['value1', 'value3'])
E   AssertionError: Lists differ: ['value1', 'value5'] != ['value1', 'value3']
E   
E   First differing element 1:
E   'value5'
E   'value3'
E   
E   - ['value1', 'value5']
E   ?                  ^
E   
E   + ['value1', 'value3']
E   ?                  ^

=============================== warnings summary ===============================
https://github.com/apache/beam/actions/runs/10219876243/job/28279049265?pr=32066

Issue Failure

Failure: Test is flaky

Issue Priority

Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@tvalentyn

cc: @damccorm

damccorm self-assigned this Aug 20, 2024
@damccorm

Looks like this is likely just a bad test. The assertion here assumes that data will be processed in order, but that's not guaranteed. I'll clean the test up a bit.
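
For illustration only, here is a minimal sketch of what an order-insensitive check could look like. It is not the change that landed in the linked fix. The helper summarize_batches and the sample batches are hypothetical, and the sketch assumes the test only needs to verify batch sizes and the overall set of values, not which values end up in the same batch.

```python
import unittest
from collections import Counter


def summarize_batches(batches):
    """Return (sorted batch sizes, multiset of all values) for a list of batches.

    Hypothetical helper: both results are independent of the order in which
    elements arrived and of which elements were grouped together.
    """
    sizes = sorted(len(batch) for batch in batches)
    values = Counter(value for batch in batches for value in batch)
    return sizes, values


class OrderInsensitiveBatchTest(unittest.TestCase):
    def test_batches_ignore_arrival_order(self):
        # Hypothetical output where 'value5' was batched with 'value1',
        # as in the failure above, instead of 'value3'.
        observed = [['value1', 'value5'], ['value2', 'value4'], ['value3']]
        sizes, values = summarize_batches(observed)

        # Batch sizes are checked without assuming which batch came first.
        self.assertEqual(sizes, [1, 2, 2])
        # Every input value appears exactly once across all batches.
        self.assertEqual(
            values,
            Counter(['value1', 'value2', 'value3', 'value4', 'value5']))
```

With this shape, a run that produces ['value1', 'value5'] and a run that produces ['value1', 'value3'] both pass, as long as the batch sizes and the full set of values are the same.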
