[GLUTEN-7267][CH]Support nested column pruning for HiveTableScan json/parquet/orc format #7268
base: main
"select id, d1.c, d1.d[0].x, d2.d['m124'].y from %s where day = '2024-09-26' and hour = '12'"
  .format(pq_table_name)
withSQLConf(
  ("spark.sql.hive.convertMetastoreParquet" -> "false"),
In what usage scenarios are these two switches, for orc and parquet, set to false?
These two configurations need to be set to false when the table should be read with the Hive parquet/orc SerDe rather than with Spark's built-in parquet/orc reader. @taiyang-li
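As a rough illustration of the reply above, the sketch below shows one way to turn both switches off when building a session. It assumes the second switch is Spark's `spark.sql.hive.convertMetastoreOrc` (the thread only names the parquet key explicitly); the application name and `some_hive_table` are hypothetical placeholders, and a Spark build with Hive support is required to run it.

```scala
import org.apache.spark.sql.SparkSession

object HiveSerdeReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("hive-serde-read-sketch") // hypothetical application name
      .enableHiveSupport()
      // Keep Hive parquet/orc tables on the Hive SerDe read path instead of
      // converting them to Spark's built-in parquet/orc readers.
      .config("spark.sql.hive.convertMetastoreParquet", "false")
      .config("spark.sql.hive.convertMetastoreOrc", "false") // assumed to be the second switch
      .getOrCreate()

    // `some_hive_table` is a placeholder; substitute a real Hive table.
    spark
      .sql("select id, d1.c from some_hive_table where day = '2024-09-26' and hour = '12'")
      .explain() // print the physical plan to inspect which scan node is used

    spark.stop()
  }
}
```

In a test suite the same two keys can be scoped to a single query with `withSQLConf`, as the snippet under review does for the parquet key.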
Performance test table schema: test_tbl (a STRING, b STRUCT<x1: STRING, x2: STRING, x3: STRING, x4: STRING, x5: STRING>)
Average time before optimization:
Average time after optimization:
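To make concrete what pruning means for a schema like test_tbl, here is a minimal, self-contained sketch. It is not the implementation in this pull request: `DataType`, `StructType`, and `prune` are simplified, hypothetical stand-ins (not Spark's or Gluten's classes) that only show how a list of requested field paths narrows a struct to the sub-fields a query actually touches.

```scala
// Hypothetical, simplified schema model used only for illustration.
sealed trait DataType
case object StringType extends DataType
final case class StructType(fields: List[(String, DataType)]) extends DataType

object NestedPruningSketch {
  /** Keep only the sub-fields reachable through the requested field paths.
    * Each path is the list of field names still to be resolved below `dt`.
    */
  def prune(dt: DataType, paths: List[List[String]]): DataType = dt match {
    // Every request goes deeper into the struct: keep only the named children.
    case StructType(fields) if paths.forall(_.nonEmpty) =>
      val byHead = paths.groupBy(_.head)
      StructType(fields.collect {
        case (name, child) if byHead.contains(name) =>
          name -> prune(child, byHead(name).map(_.tail))
      })
    // A leaf type, or some request asked for the whole value: keep it as is.
    case other => other
  }
}
```

For column `b` of test_tbl, requesting only `b.x1` and `b.x3` (paths `List("x1")` and `List("x3")` relative to `b`) yields a struct with just those two fields, so a reader driven by the pruned schema would not need to materialize `x2`, `x4`, or `x5`.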
What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
(Fixes: #7267)
How was this patch tested?
By unit tests (UT).