forked from facebookincubator/velox
Expose isUnquotedPathCharacter for validation #375
Closed
rui-mo force-pushed the token branch 14 times, most recently from e404d87 to 463e601 (August 9, 2023 08:45)
rui-mo force-pushed the token branch 3 times, most recently from fafcbe0 to abdb894 (August 23, 2023 07:08)
rui-mo force-pushed the token branch 9 times, most recently from 7baa130 to ca26676 (September 18, 2023 02:06)
Summary:
To improve performance, the raw string buffer is pre-allocated and written to directly instead of creating intermediate strings. `std::to_chars` converts integers into a character string by successively filling the range. For buffer allocation, instead of calculating the precise size from intermediate strings, we pre-allocate a sufficient buffer based on an estimate from the decimal precision and scale, and set the precise size after all strings are written.

The previous implementation used `DecimalUtil::toString`, which produced many intermediate strings during conversion and was also called to calculate the string buffer size. The optimized implementation uses `std::to_chars` to convert integers to strings and avoids all intermediate strings.

As the benchmarks below show, the optimized implementation is roughly 4x faster than the previous one (about 4.6x for short decimals and 3.8x for long decimals).

Cast from decimal to varchar benchmark | cast##cast_short_decimal | cast##cast_long_decimal
-- | -- | --
previous (DecimalUtil::toString) | 45.43ms | 132.09ms
optimized (std::to_chars) | 9.87ms | 35.00ms

Pull Request resolved: facebookincubator#6210
Reviewed By: xiaoxmeng
Differential Revision: D49315826
Pulled By: mbasmanova
fbshipit-source-id: 1f419aa9edcb080752c3bed567d390cc7a461cce
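The approach described in this summary can be illustrated with a minimal sketch. It is not the Velox implementation: `decimalToString` is a hypothetical helper, it only handles short decimals (precision up to 18, unscaled value stored in an `int64_t`), and it assumes the value actually fits the declared precision. The buffer is sized from precision and scale up front, `std::to_chars` writes digits straight into it, and the string is shrunk to the exact length at the end.

```cpp
#include <algorithm>
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <string>
#include <system_error>

// Hypothetical helper (not part of Velox): formats an unscaled short decimal.
// Assumes 1 <= precision <= 18, 0 <= scale <= precision, and that the value
// has at most `precision` digits.
std::string decimalToString(int64_t unscaled, int precision, int scale) {
  assert(precision >= 1 && precision <= 18 && scale >= 0 && scale <= precision);

  // Estimated upper bound: sign + leading "0" + decimal point + digits.
  std::string out(static_cast<std::size_t>(precision) + 3, '\0');
  char* p = out.data();
  char* const end = p + out.size();

  // Work on the magnitude so that INT64_MIN is handled without overflow.
  uint64_t magnitude = static_cast<uint64_t>(unscaled);
  if (unscaled < 0) {
    *p++ = '-';
    magnitude = 0 - magnitude;
  }

  uint64_t pow10 = 1;
  for (int i = 0; i < scale; ++i) {
    pow10 *= 10;
  }

  // Integer part, written directly into the output buffer.
  auto whole = std::to_chars(p, end, magnitude / pow10);
  assert(whole.ec == std::errc());
  p = whole.ptr;

  if (scale > 0) {
    *p++ = '.';
    // Fractional part: write the digits, then right-align them within
    // `scale` characters and pad the front with zeros.
    auto frac = std::to_chars(p, end, magnitude % pow10);
    assert(frac.ec == std::errc());
    const auto digits = frac.ptr - p;
    if (digits < scale) {
      std::copy_backward(p, frac.ptr, p + scale);
      std::fill(p, p + (scale - digits), '0');
    }
    p += scale;
  }

  // Set the precise size once everything has been written.
  out.resize(static_cast<std::size_t>(p - out.data()));
  return out;
}
```

For example, `decimalToString(12345, 5, 2)` yields `123.45` and `decimalToString(-5, 3, 2)` yields `-0.05`. The only allocation is the output string itself, which is the point of the optimization described above.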
Summary:
When Velox was used as a third-party library with `SIMDJsonExtractor`, the JSON function tests failed. We found that `-DSIMDJSON_THREADS_ENABLED=1` was not set when generating libvelox_functions_json.a. The fix changes "simdjson" to "simdjson::simdjson" in target_link_libraries.

Fixes facebookincubator#6564

Pull Request resolved: facebookincubator#6565
Reviewed By: Yuhta
Differential Revision: D49285542
Pulled By: kgpai
fbshipit-source-id: f9bc093b278288a2a73bbb289bb91b5dd7061097
facebookincubator#6599)

Summary:
Pull Request resolved: facebookincubator#6599

When the type kinds are not equal and one of the types is non-primitive, we would crash by accessing a null type pointer after a dynamic cast. The fix is to bail out of descending the type tree whenever the type kinds differ. The bug sneaked in when we replaced throw() with log(1) in the type checking code.

Reviewed By: Yuhta
Differential Revision: D49338549
fbshipit-source-id: 987f1df62016f68d7796f40c0aedfcd1becf5f1e
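The failure mode and the fix can be sketched as follows. This is an illustration only: `Kind`, `Type`, `ArrayType` and `compatible` are hypothetical stand-ins, not Velox's type classes. The key point is that the kind check returns before any `dynamic_cast`, so the cast results are never null when they are dereferenced.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical, simplified type hierarchy for illustration.
enum class Kind { kInteger, kVarchar, kArray };

struct Type {
  explicit Type(Kind k) : kind(k) {}
  virtual ~Type() = default;
  Kind kind;
};
using TypePtr = std::shared_ptr<const Type>;

struct ArrayType : Type {
  explicit ArrayType(TypePtr e) : Type(Kind::kArray), element(std::move(e)) {}
  TypePtr element;
};

// Returns true if both type trees have the same shape.
bool compatible(const Type& a, const Type& b) {
  // Bail out before descending. Without this early return, a mismatch such
  // as ARRAY vs INTEGER would reach the casts below, one of which would
  // yield nullptr and be dereferenced.
  if (a.kind != b.kind) {
    return false;
  }
  if (a.kind == Kind::kArray) {
    const auto* arrayA = dynamic_cast<const ArrayType*>(&a);
    const auto* arrayB = dynamic_cast<const ArrayType*>(&b);
    assert(arrayA != nullptr && arrayB != nullptr);
    return compatible(*arrayA->element, *arrayB->element);
  }
  return true;  // Primitive types with equal kinds.
}
```

If the mismatch is only logged rather than thrown, control flow continues past the check, which is how a missing early return can turn into a null dereference.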
…okincubator#6404)

Summary:
Pass down the scan table schema to the Parquet column reader.

Details:
Currently, the `requestedType`, which is available in [ParquetColumnReader.cpp](https://github.com/facebookincubator/velox/blob/517e3e3a0c8308c96ca068444dfeee37204f7773/velox/dwio/parquet/reader/ParquetColumnReader.cpp#L37C60-L37C68), is set based on the schema present in the Parquet file (the file data type) instead of the scan table schema. The issue occurs when the expected output of the TableScan differs from the schema of the Parquet file.

Spark's data format for some types differs from Parquet's format. Similar to schema evolution, when the type differs, Spark performs an implicit conversion. The conversions that Spark performs can be seen in [ParquetVectorUpdaterFactory.java](https://github.com/apache/spark/blob/6ca45c52b7416e7b3520dc902cb24f060c7c72dd/sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/ParquetVectorUpdaterFactory.java#L67C3-L185C6).

This PR fixes the issue by passing the scan table schema data type to the Parquet column reader as the `requestedType`.

It is a follow-up to facebookincubator#5786 and addresses the issue following Yuhta's comments. See facebookincubator#5786 for detailed context.

This update is one of the modifications necessary for issue facebookincubator#5770.

Pull Request resolved: facebookincubator#6404
Reviewed By: pedroerp
Differential Revision: D49330580
Pulled By: Yuhta
fbshipit-source-id: bd56bda6efd708691ee35b5b66d5ba9536df525f
Summary:
Fixes facebookincubator#6417

Pull Request resolved: facebookincubator#6463
Reviewed By: amitkdutta
Differential Revision: D49371431
Pulled By: mbasmanova
fbshipit-source-id: 8956b04abe608bfcb76b0a3b49cefd0689284bb2
…n crashes (facebookincubator#6402)

Summary:
Pull Request resolved: facebookincubator#6402

This adds an experimental flag, 'experimental_velox_save_input_on_fatal_signal'. When set to true, it serializes the input vector data and all the SQL expressions in the currently executing ExprSet whenever a fatal signal is encountered. Enabling this flag makes the signal handler async-signal-unsafe, so it should only be used for debugging.

Reviewed By: kgpai
Differential Revision: D48891649
fbshipit-source-id: 47722d726c76a8602cf436c1840d2a0d720e2c35
…MONTH() and DATE() to avoid copying (facebookincubator#6615)

Summary:
Pull Request resolved: facebookincubator#6615

This removes unnecessary copying in INTERVAL_DAY_TIME(), INTERVAL_YEAR_MONTH() and DATE() calls. These functions return (a copy of) a constant shared_ptr, which makes the calls very expensive.

Reviewed By: Yuhta, bikramSingh91
Differential Revision: D49347369
fbshipit-source-id: 6930970d9f2807347b16065fc224d7a7f5f57b69
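The cost being removed here can be illustrated with a small sketch. The summary does not spell out the exact change, so returning a const reference is an assumption about one common way to avoid the copy; `DateType`, `dateInstance`, `dateByValue` and `dateByReference` are hypothetical names, not the Velox functions. Returning a `shared_ptr` by value performs an atomic reference-count increment and decrement on every call, while returning a const reference to the singleton does not touch the count.

```cpp
#include <cassert>
#include <memory>

// Hypothetical singleton type for illustration.
struct DateType {};
using TypePtr = std::shared_ptr<const DateType>;

// Single shared instance, created on first use.
inline const TypePtr& dateInstance() {
  static const TypePtr kInstance = std::make_shared<DateType>();
  return kInstance;
}

// Copies the shared_ptr: each call bumps the atomic reference count.
inline TypePtr dateByValue() {
  return dateInstance();
}

// Hands out a reference: no reference-count traffic at all.
inline const TypePtr& dateByReference() {
  return dateInstance();
}

// Small demonstration: reports the reference count seen while a by-value
// result is alive, relative to the count seen through the reference.
inline long extraOwnersWhileCopyAlive() {
  const long before = dateByReference().use_count();
  TypePtr copy = dateByValue();
  assert(copy.get() == dateByReference().get());
  return copy.use_count() - before;  // One extra owner: the copy itself.
}
```

In a hot path that calls such an accessor per row or per expression, the by-value form pays for an atomic increment and decrement each time, which is consistent with the "very expensive" observation in the summary.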
Summary:
Pull Request resolved: facebookincubator#6309

Reviewed By: xiaoxmeng
Differential Revision: D49394977
Pulled By: pedroerp
fbshipit-source-id: ba5fa3dda474505093d7d9d2f00aaa8c3d2d7e81
No description provided.