[VL] add support for spilling to on-heap memory before spilling to file #356

Open
wants to merge 59 commits into
base: main


zuochunwei
Collaborator

WIP

zhejiangxiaomai and others added 30 commits July 3, 2023 16:23
Related PR:
Update build dependencies oap-project#185
Related PRs:

add decimal column reader support oap-project#254
Add utility method MemoryUsageTracker::highUsage() oap-project#227
Support parquet read case sensitive option oap-project#126
Make varchar and varbinary compatible oap-project#115
Create folder if not exits on HDFS write oap-project#267
Related PRs:

Add expand op in velox oap-project#199
Add ValueStreamNode operator oap-project#204
Allow decimal in casting string to int oap-project#215
Related PRs:

add support for reading ORC oap-project#229
Parquet: Optimize parquet write perf oap-project#238
Expand timestamps in page reader oap-project#260
Add processedStrides and processedSplits metrics oap-project#264
Related PRs:

Fix hashjoin runtime issue oap-project#106
INVALID_STATE on HashJoin when spill is turned on oap-project#154
SIGABRT on DecimalAvgAggregate<UnscaleLongDecimal, UnscaleShortDecimal> when spilling is engaged oap-project#236
Support kPreceeding & kFollowing for window range frame type oap-project#287
Related PRs:

Allow decimal in casting string to int oap-project#215
Add mapping from named_struct to row_constructor oap-project#214
Fix semantic issues in cast function oap-project#280
Related PRs:

Fix replace SparkSQL function oap-project#277
Support kPreceeding & kFollowing for window range frame type oap-project#287
support timestamp hash oap-project#269
Spark sum can overflow oap-project#101
Support float & double types in pmod function oap-project#157
Implement datetime functions in velox/sparksql. oap-project#81
Fix type check in MapFunction oap-project#273
Let function validation fail for lookaround pattern in RE2-based implementation oap-project#124
Register lpad/rpad functions for Spark SQL. oap-project#63
Support substring_index sql function oap-project#189
Fix First/Last aggregate functions intermediate type and support decimal oap-project#245
Support date_add spark sql function oap-project#144
Related PR:

Serialize and deserialize RowVector oap-project#250
Related PRs:

Check a fallback case in validation: using literal partition key in window function oap-project#148
Fix might_contain validate fallback and support struct literal oap-project#137
Implement datetime functions in velox/sparksql. oap-project#81
Parse options in SingularOrList correctly oap-project#48
Add SingularOrList support oap-project#45
Support if then in filter oap-project#74
Fix semi join output type and support existence join oap-project#67
Support decimal as partition column oap-project#167
Add the window support oap-project#61
Add expand operator oap-project#65
Support more cases of filter and its pushdown oap-project#14
Related PRs:

Support more data types for read filter oap-project#139
Fix cast double to decimal oap-project#179
Fix casting from string to decimal oap-project#281
Support cast decimal to int oap-project#177
Fix null on overflow and multiply as spark precision and support cast varchar to decimal oap-project#169
Disable tokenizing the path by dot oap-project#109
Serialize and deserialize RowVector oap-project#250
Support datetime pattern in spark oap-project#94
On Ubuntu, thrift is installed manually by the setup scripts, but Arrow still compiles its own thrift; Velox then uses the system thrift together with the pre-built arrow/parquet.
On CentOS, Velox cannot find the system thrift, so Arrow and thrift end up being compiled twice.
Since Arrow compiles thrift in every environment, let's use these pre-built shared libs to save time and keep the toolchain consistent.

Allow native dependencies to be overridden by environment variables.
Co-authored-by: zuochunwei <[email protected]>
JkSelf and others added 26 commits July 3, 2023 16:29
* fix decimal add
* remove unused code
Co-authored-by: ‘zhaozhenhui’ <‘[email protected]’>