[Core][VL] Add random parquet data generator and ShuffleWriterFuzzerTest #3584
Thanks for opening a pull request! Could you open an issue for this pull request on GitHub Issues? https://github.com/oap-project/gluten/issues Then could you also rename the commit message and pull request title in the following format?
Run Gluten Clickhouse CI
86300fd to fcff6af
Run Gluten Clickhouse CI
fcff6af to 3aa0ef9
Run Gluten Clickhouse CI
e89a860 to 2a7d2fd
Run Gluten Clickhouse CI
This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.
2a7d2fd to fa4c97e
Run Gluten Clickhouse CI
fa4c97e to 080ca94
Run Gluten Clickhouse CI
@zhouyuan Could you help review? Thanks!
@marin-ma It seems several unit tests failed. I guess it's due to some gaps with the main branch; can you please rebase and check again? -yuan
080ca94 to 0d3869e
Run Gluten Clickhouse CI
👍
this seems a standalone component, looks good to me
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
Add `ShuffleWriterFuzzerTest`, which uses `RandomParquetDataGenerator` to generate a random schema, input batch size, and data for shuffle. This test aims to evaluate the shuffle module's accuracy and spill behavior. Developers should first pass the C++ unit tests and then manually run this test after any modification that could affect the shuffle module.

By default, each test runs 10 iterations. Failed iterations are printed as `Failed to run test 'testname' with seed: xxxxx, ...`. Developers can take the SQL query corresponding to the test name, together with the seeds from the log, and modify the `reproduce` test to reproduce the failure. It's recommended to build cpp with the Debug build type before running this test.

If any iterations fail because of OOM, they are printed as an error log: `Out of memory while running test 'testname' with seed: xxxxx, ...`. Iterations with OOM won't fail the test case.

`ShuffleWriterFuzzerTest` is tagged as `SkipTestTags`, so it won't be run in CI.
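For readers unfamiliar with seed-based fuzzing, the workflow described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code in this PR: `random_schema`, `random_batch`, `run_fuzz`, the column type list, and the `check` callback are all hypothetical names chosen for the example, and `MemoryError` stands in for whatever OOM signal the real test detects. The point it shows is that every random choice derives from one seed, so a logged seed is enough to replay a failing iteration.

```python
import random

# Hypothetical column types; the real generator's type list is not stated in this PR.
COLUMN_TYPES = ("int", "long", "double", "string", "boolean")


def random_schema(rng, max_columns=8):
    """Return a list of (column_name, type_name) pairs drawn from rng."""
    num_columns = rng.randint(1, max_columns)
    return [(f"c{i}", rng.choice(COLUMN_TYPES)) for i in range(num_columns)]


def random_value(rng, type_name):
    """Draw one value of the given (hypothetical) type from rng."""
    if type_name in ("int", "long"):
        return rng.randint(-1000, 1000)
    if type_name == "double":
        return rng.uniform(-1.0, 1.0)
    if type_name == "boolean":
        return rng.random() < 0.5
    # Remaining case: a short random string.
    return "".join(rng.choice("abcdef") for _ in range(rng.randint(0, 8)))


def random_batch(seed, max_rows=64):
    """Build a schema and rows that depend only on the seed."""
    rng = random.Random(seed)
    schema = random_schema(rng)
    num_rows = rng.randint(1, max_rows)
    rows = [tuple(random_value(rng, t) for _, t in schema) for _ in range(num_rows)]
    return schema, rows


def run_fuzz(test_name, check, seeds):
    """Run check(schema, rows) once per seed; return (failed_seeds, oom_seeds)."""
    failed, oom = [], []
    for seed in seeds:
        schema, rows = random_batch(seed)
        try:
            ok = check(schema, rows)
        except MemoryError:
            # OOM iterations are logged but do not count as failures.
            print(f"Out of memory while running test '{test_name}' with seed: {seed}")
            oom.append(seed)
            continue
        if not ok:
            print(f"Failed to run test '{test_name}' with seed: {seed}")
            failed.append(seed)
    return failed, oom
```

To replay a failure in this sketch, pass the logged seed back in, for example `run_fuzz("testname", check, [seed])`, which plays the role that the `reproduce` test plays in the description above.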