From 43fdaeaab7f900ab6a14bb52496edf44243fca4a Mon Sep 17 00:00:00 2001
From: Yuan
Date: Thu, 30 Nov 2023 09:31:34 +0800
Subject: [PATCH] [VL] Doc refresh (#3882)

* update configurations

Signed-off-by: Yuan Zhou

* update operators/functions

Signed-off-by: Yuan Zhou

* fix maxBatchSize doc

Signed-off-by: Yuan Zhou

* fix operator support status

Signed-off-by: Yuan Zhou

---------

Signed-off-by: Yuan Zhou
---
 docs/Configuration.md                  |  10 +-
 docs/velox-backend-support-progress.md | 124 ++++++++++++++++++-------
 2 files changed, 96 insertions(+), 38 deletions(-)

diff --git a/docs/Configuration.md b/docs/Configuration.md
index e66c5e6034e7..a80f0f716c5e 100644
--- a/docs/Configuration.md
+++ b/docs/Configuration.md
@@ -20,20 +20,23 @@ You can add these configurations into spark-defaults.conf to enable or disable t
 | spark.plugins | To load Gluten's components by Spark's plug-in loader | com.intel.oap.GlutenPlugin |
 | spark.shuffle.manager | To turn on Gluten Columnar Shuffle Plugin | org.apache.spark.shuffle.sort.ColumnarShuffleManager |
 | spark.gluten.enabled | Enable Gluten, default is true. Just an experimental property. Recommend to enable/disable Gluten through the setting for `spark.plugins`. | true |
+| spark.gluten.sql.columnar.maxBatchSize | Number of rows to be processed in each batch. Default value is 4096. | 4096 |
 | spark.gluten.memory.isolation | (Experimental) Enable isolated memory mode. If true, Gluten controls the maximum off-heap memory can be used by each task to X, X = executor memory / max task slots. It's recommended to set true if Gluten serves concurrent queries within a single session, since not all memory Gluten allocated is guaranteed to be spillable. In the case, the feature should be enabled to avoid OOM. Note when true, setting spark.memory.storageFraction to a lower value is suggested since storage memory is considered non-usable by Gluten. 
| false | | spark.gluten.sql.columnar.scanOnly | When enabled, this config will overwrite all other operators' enabling, and only Scan and Filter pushdown will be offloaded to native. | false | | spark.gluten.sql.columnar.batchscan | Enable or Disable Columnar BatchScan, default is true | true | | spark.gluten.sql.columnar.hashagg | Enable or Disable Columnar Hash Aggregate, default is true | true | | spark.gluten.sql.columnar.project | Enable or Disable Columnar Project, default is true | true | | spark.gluten.sql.columnar.filter | Enable or Disable Columnar Filter, default is true | true | -| spark.gluten.sql.columnar.codegen.sort | Enable or Disable Columnar Sort, default is true | true | +| spark.gluten.sql.columnar.sort | Enable or Disable Columnar Sort, default is true | true | | spark.gluten.sql.columnar.window | Enable or Disable Columnar Window, default is true | true | | spark.gluten.sql.columnar.shuffledHashJoin | Enable or Disable ShuffledHashJoin, default is true | true | | spark.gluten.sql.columnar.forceShuffledHashJoin | Force to use ShuffledHashJoin over SortMergeJoin, default is true | true | -| spark.gluten.sql.columnar.sort | Enable or Disable Columnar Sort, default is true | true | | spark.gluten.sql.columnar.sortMergeJoin | Enable or Disable Columnar Sort Merge Join, default is true | true | | spark.gluten.sql.columnar.union | Enable or Disable Columnar Union, default is true | true | | spark.gluten.sql.columnar.expand | Enable or Disable Columnar Expand, default is true | true | +| spark.gluten.sql.columnar.generate | Enable or Disable Columnar Generate, default is true | true | +| spark.gluten.sql.columnar.limit | Enable or Disable Columnar Limit, default is true | true | +| spark.gluten.sql.columnar.tableCache | Enable or Disable Columnar Table Cache, default is false | true | | spark.gluten.sql.columnar.broadcastExchange | Enable or Disable Columnar Broadcast Exchange, default is true | true | | spark.gluten.sql.columnar.broadcastJoin | 
Enable or Disable Columnar BroadcastHashJoin, default is true | true | | spark.gluten.sql.columnar.shuffle.codec | Set up the codec to be used for Columnar Shuffle. If this configuration is not set, will check the value of spark.io.compression.codec. By default, Gluten use software compression. Valid options for software compression are lz4, zstd. Valid options for QAT and IAA is gzip. | lz4 | @@ -55,12 +58,11 @@ You can add these configurations into spark-defaults.conf to enable or disable t | spark.gluten.sql.columnar.backend.velox.bloomFilter.numBits | The default number of bits to use for the velox bloom filter. | 8388608L | | spark.gluten.sql.columnar.backend.velox.bloomFilter.maxNumBits | The max number of bits to use for the velox bloom filter. | 4194304L | -Below is an example for spark-default.conf, if you are using conda to install OAP project. +Below is an example for spark-default.conf: ``` ##### Columnar Process Configuration -spark.sql.sources.useV1SourceList avro spark.plugins io.glutenproject.GlutenPlugin spark.shuffle.manager org.apache.spark.shuffle.sort.ColumnarShuffleManager spark.driver.extraClassPath ${GLUTEN_HOME}/package/target/gluten-<>-jar-with-dependencies.jar diff --git a/docs/velox-backend-support-progress.md b/docs/velox-backend-support-progress.md index 7fabddba788a..dbbfd219deae 100644 --- a/docs/velox-backend-support-progress.md +++ b/docs/velox-backend-support-progress.md @@ -27,7 +27,7 @@ The total supported functions' number for [Spark3.3 is 387](https://spark.apache ### Operator Map -Gluten supports 20 operators (Draw to right to see all data types) +Gluten supports 28 operators (Draw to right to see all data types) | Executor | Description | Gluten Name | Velox Name | BOOLEAN | BYTE | SHORT | INT | LONG | FLOAT | DOUBLE | STRING | NULL | BINARY | ARRAY | MAP | STRUCT(ROW) | DATE | TIMESTAMP | DECIMAL | CALENDAR | UDT | | --------------------------- | 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------- | --------------------- | ------ | --- | ---- | --- | --- | ---- | ----- | ----- | --- | ----- | ---- |-----| ---------- |-----| --------- | ------ | -------- | --- | @@ -47,7 +47,7 @@ Gluten supports 20 operators (Draw to right to see all data types) | UnionExec | The backend for the union operator | UnionExecTransformer | N | S | S | S | S | S | S | S | S | S | S | NS | NS | NS | S | NS | NS | NS | NS | | DataWritingCommandExec | Writing data | Y | TableWriteNode | S | S | S | S | S | S | S | S | S | S | S | NS | S | S | NS | S | NS | NS | | CartesianProductExec | Implementation of join using brute force | N | CrossJoinNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | -| ShuffleExchangeExec | The backend for most data being exchanged between processes | N | ExchangeNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | +| ShuffleExchangeExec | The backend for most data being exchanged between processes | ColumnarShuffleExchangeExec | ExchangeNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | | | The unnest operation expands arrays and maps into separate columns | N | UnnestNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | | | The top-n operation reorders a dataset based on one or more identified sort fields as well as a sorting order | N | TopNNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | | | The partitioned output operation redistributes data based on zero or more distribution fields | N | PartitionedOutputNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | @@ -57,31 +57,34 @@ 
Gluten supports 20 operators (Draw to right to see all data types) | | Partitions input data into multiple streams or combines data from multiple streams into a single stream | N | LocalPartitionNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | | | The enforce single row operation checks that input contains at most one row and returns that row unmodified | N | EnforceSingleRowNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | | | The assign unique id operation adds one column at the end of the input columns with unique value per row | N | AssignUniqueIdNode | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | NS | S | S | S | S | S | +| ReusedExchangeExec | A wrapper for reused exchange to have different output | ReusedExchangeExec | N | | | | | | | | | | | | | | | | | | | | CollectLimitExec | Reduce to single partition and apply limit | N | N | | | | | | | | | | | | | | | | | | | | BroadcastExchangeExec | The backend for broadcast exchange of data | Y | Y | S | S | S | S | S | S | S | S | S | S | NS | NS | NS | S | NS | S | NS | NS | | ObjectHashAggregateExec | The backend for hash based aggregations supporting TypedImperativeAggregate functions | N | N | | | | | | | | | | | | | | | | | | | | SortAggregateExec | The backend for sort based aggregations | N | N | | | | | | | | | | | | | | | | | | | -| CoalesceExec | Reduce the partition numbers | N | N | | | | | | | | | | | | | | | | | | | -| GenerateExec | The backend for operations that generate more output rows than input rows like explode | N | N | | | | | | | | | | | | | | | | | | | +| CoalesceExec | Reduce the partition numbers | CoalesceExecTransformer | N | | | | | | | | | | | | | | | | | | | +| GenerateExec | The backend for operations that generate more output rows than input rows like explode | GenerateExecTransformer | UnnestNode | | | | | | | | | | | | | | | | | | | | RangeExec | The backend for range 
operator | N | N | | | | | | | | | | | | | | | | | | | | SampleExec | The backend for the sample operator | N | N | | | | | | | | | | | | | | | | | | | | SubqueryBroadcastExec | Plan to collect and transform the broadcast key values | Y | Y | S | S | S | S | S | S | S | S | S | S | NS | NS | NS | S | NS | S | NS | NS | | TakeOrderedAndProjectExec | Take the first limit elements as defined by the sortOrder, and do projection if needed | Y | Y | S | S | S | S | S | S | S | S | S | S | NS | NS | NS | S | NS | S | NS | NS | | CustomShuffleReaderExec | A wrapper of shuffle query stage | N | N | | | | | | | | | | | | | | | | | | | -| InMemoryTableScanExec | Implementation of InMemory Table Scan | N | N | | | | | | | | | | | | | | | | | | | +| InMemoryTableScanExec | Implementation of InMemory Table Scan | Y | Y | | | | | | | | | | | | | | | | | | | | BroadcastNestedLoopJoinExec | Implementation of join using brute force. Full outer joins and joins where the broadcast side matches the join side (e.g.: LeftOuter with left broadcast) are not supported | N | N | | | | | | | | | | | | | | | | | | | | AggregateInPandasExec | The backend for an Aggregation Pandas UDF, this accelerates the data transfer between the Java process and the Python process | N | N | | | | | | | | | | | | | | | | | | | | ArrowEvalPythonExec | The backend of the Scalar Pandas UDFs. Accelerates the data transfer between the Java process and the Python process | N | N | | | | | | | | | | | | | | | | | | | | FlatMapGroupsInPandasExec | The backend for Flat Map Groups Pandas UDF, Accelerates the data transfer between the Java process and the Python process | N | N | | | | | | | | | | | | | | | | | | | | MapInPandasExec | The backend for Map Pandas Iterator UDF. 
Accelerates the data transfer between the Java process and the Python process | N | N | | | | | | | | | | | | | | | | | | | | WindowInPandasExec | The backend for Window Aggregation Pandas UDF, Accelerates the data transfer between the Java process and the Python process | N | N | | | | | | | | | | | | | | | | | | | +| HiveTableScanExec | The Hive table scan operator. Column and partition pruning are both handled | Y | Y | | | | | | | | | | | | | | | | | | | +| InsertIntoHiveTable | Command for writing data out to a Hive table | Y | Y | | | | | | | | | | | | | | | | | | | | Velox2Row | Convert Velox format to Row format | Y | Y | S | S | S | S | S | S | S | S | NS | S | NS | NS | NS | S | S | NS | NS | NS | | Velox2Arrow | Convert Velox format to Arrow format | Y | Y | S | S | S | S | S | S | S | S | NS | S | S | S | S | S | NS | S | NS | NS | ### Function support -Gluten supports 164 functions. (Draw to right to see all data types) +Gluten supports 199 functions. (Draw to right to see all data types) | Spark Functions | Velox/Presto Functions | Velox/Spark functions | Gluten | Restrictions | BOOLEAN | BYTE | SHORT | INT | LONG | FLOAT | DOUBLE | DATE | TIMESTAMP | STRING | DECIMAL | NULL | BINARS | CALENDAR | ARRAS | MAP | STRUCT | UDT | |-------------------------------|------------------------|-----------------------|--------|------------------------|---------|------|-------|-----|------| ----- | ------ |------| --------- | ------ | ------- | ---- | ------ | -------- | ----- | ---- | ------ | ---- | @@ -103,16 +106,16 @@ Gluten supports 164 functions. 
(Draw to right to see all data types) | >= | gte | greaterthanorequal | S | | S | S | S | S | S | S | S | | | S | | | | | | | | | | ^ | bitwise_xor | | S | | | | S | S | S | | | | | | | | | | | | | | | | | bitwise_or | bitwise_or | S | | | | S | S | S | | | | | | | | | | | | | | -| || | | | | | | | | | | | | | | | | | | | | | | | +| || | | | S | | | | | | | | | | | | | | | | | | | | | ~ | bitwise_not | | S | | | | S | S | S | | | | | | | | | | | | | | | and | | | S | | S | S | S | S | S | S | S | | | S | | | | | | | | | | between | between | between | S | | S | S | S | S | S | S | S | S | | S | | | | | | | | | -| bit_and | | | | | | | | | | | | | | | | | | | | | | | +| bit_and | bitwise_and_agg | | S | | | S | S | S | S | S | | | | | | | | | | | | | | bit_count | bit_count | bit_count | S | | S | S | S | S | S | | | | | | | | | | | | | | | bit_get | | bit_get | S | | | S | S | S | S | | | | | | | | | | | | | | -| bit_or | | | | | | | | | | | | | | | | | | | | | | | +| bit_or | | | S | | | | | | | | | | | | | | | | | | | | | bit_xor | | | | | | | | | | | | | | | | | | | | | | | -| case | | | | | | | | | | | | | | | | | | | | | | | +| case | | | S | | | | | | | | | | | | | | | | | | | | | div | | | | | | | | | | | | | | | | | | | | | | | | getbit | | | | | | | | | | | | | | | | | | | | | | | | if | | | S | | | | | | | | | | | | | | | | | | | | @@ -131,7 +134,7 @@ Gluten supports 164 functions. 
(Draw to right to see all data types) | ascii | | ascii | S | | | | | | | | | | | S | | | | | | | | | | base64 | | | | | | | | | | | | | | | | | | | | | | | | bin | | bin | | | | | | | | | | | | | | | | | | | | | -| bit_length | | | | | | | | | | | | | | | | | | | | | | | +| bit_length | | | S | | | | | | | | | | | | | | | | | | | | | btrim | | | S | | | | | | | | | | | | | | | | | | | | | char, chr | chr | chr | S | | | | | | | | | | | S | | | | | | | | | | char_length, character_length | length | length | S | | | | | | | | | | | S | | | | | | | | | @@ -158,7 +161,7 @@ Gluten supports 164 functions. (Draw to right to see all data types) | lpad | lpad | | S | | | | | | | | | | | S | | | | | | | | | | ltrim | ltrim | ltrim | S | | | | | | | | | | | S | | | | | | | | | | octet_length | | | | | | | | | | | | | | | | | | | | | | | -| overlay | | overlay | | | | | | | | | | | | | | | | | | | | | +| overlay | | overlay | S | | | | | | | | | | | | | | | | | | | | | parse_url | | | | | | | | | | | | | | | | | | | | | | | | position | strpos | | | | | | | | | | | | | | | | | | | | | | | printf | | | | | | | | | | | | | | | | | | | | | | | @@ -171,7 +174,7 @@ Gluten supports 164 functions. (Draw to right to see all data types) | sentences | | | | | | | | | | | | | | | | | | | | | | | | soundex | | | | | | | | | | | | | | | | | | | | | | | | space | | | | | | | | | | | | | | | | | | | | | | | -| split | split | split | | Mismatched | | | | | | | | | | | | | | | | | | | +| split | split | split | S | Mismatched | | | | | | | | | | | | | | | | | | | | split_part | split_part | | | Mismatched | | | | | | | | | | | | | | | | | | | | startswith | | startsWith | | | | | | | | | | | | | | | | | | | | | | substr, substring | substr | substring | S | | | | | | | | | | | S | | | | | | | | | @@ -183,6 +186,15 @@ Gluten supports 164 functions. 
(Draw to right to see all data types) | unbase64 | | | | | | | | | | | | | | | | | | | | | | | | unhex | | | | | | | | | | | | | | | | | | | | | | | | upper, ucase | upper | upper | S | | | | | | | | | | | S | | | | | | | | | +| xpath | | | | | | | | | | | | | | | | | | | | | | | +| xpath_boolean | | | | | | | | | | | | | | | | | | | | | | | +| xpath_double | | | | | | | | | | | | | | | | | | | | | | | +| xpath_float | | | | | | | | | | | | | | | | | | | | | | | +| xpath_int | | | | | | | | | | | | | | | | | | | | | | | +| xpath_long | | | | | | | | | | | | | | | | | | | | | | | +| xpath_number | | | | | | | | | | | | | | | | | | | | | | | +| xpath_short | | | | | | | | | | | | | | | | | | | | | | | +| xpath_string | | | | | | | | | | | | | | | | | | | | | | | | like | like | | S | | | | | | | | | | | S | | | | | | | | | | regexp | | rlike | S | Not support lookaround | | | | | | | | | | S | | | | | | | | | | regexp_extract | regexp_extract | regexp_extract | S | Not support lookaround | | | | | | | | | | S | | | | | | | | | @@ -224,8 +236,9 @@ Gluten supports 164 functions. (Draw to right to see all data types) | pow, power | pow,power | power | | | | | S | S | S | S | S | | | | | | | | | | | | | power, pow | power,pow | power | S | | | S | S | S | S | S | S | | | | | | | | | | | | | radians | radians | | S | | | S | S | S | S | S | S | | | | | | | | | | | | +| rand | rand | rand | S | | | | | | | | | | | | | | | | | | | | | rand | rand | rand | | | | | | | | | | | | | | | | | | | | | -| random | random | | | | | | | | | | | | | | | | | | | | | | +| random | random | | S | | | | | | | | | | | | | | | | | | | | | rint | | | | | | | | | | | | | | | | | | | | | | | | round | round | round | S | | | S | S | S | S | S | S | | | | | | | | | | | | | shiftleft | bitwise_left_shift | shiftleft | S | | | S | S | S | S | S | S | | | | | | | | | | | | @@ -241,13 +254,13 @@ Gluten supports 164 functions. 
(Draw to right to see all data types) | width_bucket | width_bucket | | | | | | | | | | | | | | | | | | | | | | | array | | array | S | | | | | | | | | | | | | | | | S | | | | | aggregate | aggregate | reduce | S | | | | | | | | | | | | | | | | S | | | | -| array_contains | | array_contains | | | | | | | | | | | | | | | | | | | | | +| array_contains | | array_contains | S | | | | | | | | | | | | | | | | | | | | | array_distinct | array_distinct | | | | | | | | | | | | | | | | | | | | | | | array_except | array_except | | | | | | | | | | | | | | | | | | | | | | -| array_intersect | array_intersect | array_intersect | | | | | | | | | | | | | | | | | | | | | -| array_join | array_join | | | | | | | | | | | | | | | | | | | | | | -| array_max | array_max | | | | | | | | | | | | | | | | | | | | | | -| array_min | array_min | | | | | | | | | | | | | | | | | | | | | | +| array_intersect | array_intersect | array_intersect | S | | | | | | | | | | | | | | | | | | | | +| array_join | array_join | | S | | | | | | | | | | | | | | | | | | | | +| array_max | array_max | | S | | | | | | | | | | | | | | | | | | | | +| array_min | array_min | | S | | | | | | | | | | | | | | | | | | | | | array_position | array_position | | | | | | | | | | | | | | | | | | | | | | | array_remove | | | | | | | | | | | | | | | | | | | | | | | | array_repeat | | | | | | | | | | | | | | | | | | | | | | | @@ -262,7 +275,7 @@ Gluten supports 164 functions. 
(Draw to right to see all data types) | explode_outer, explode | | | | | | | | | | | | | | | | | | | | | | | | filter | filter | filter | | | | | | | | | | | | | | | | | | | | | | flatten | flatten | | | | | | | | | | | | | | | | | | | | | | -| map | map | map | | | | | | | | | | | | | | | | | | | | | +| map | map | map | S | | | | | | | | | | | | | | | | | | | | | map_concat | map_concat | | | | | | | | | | | | | | | | | | | | | | | map_entries | map_entries | | | | | | | | | | | | | | | | | | | | | | | map_filter | map_filter | map_filter | | | | | | | | | | | | | | | | | | | | | @@ -272,12 +285,17 @@ Gluten supports 164 functions. (Draw to right to see all data types) | map_keys | map_keys | | | | | | | | | | | | | | | | | | | | | | | map_values | map_values | | S | | | | | | | | | | | | | | | | | S | | | | named_struct,struct | row_construct | named_struct | S | | | | | | | | | | | | | | | | | | S | | +| posexplode_outer,posexplode | | | | | | | | | | | | | | | | | | | | | | | | sequence | | | | | | | | | | | | | | | | | | | | | | | | shuffle | shuffle | | | | | | | | | | | | | | | | | | | | | | -| size | | size | | | | | | | | | | | | | | | | | | | | | +| size | | size | S | | | | | | | | | | | | | | | | | | | | | slice | slice | | | | | | | | | | | | | | | | | | | | | | | sort_array | | sort_array | | | | | | | | | | | | | | | | | | | | | -| struct, named_struct | | | S | | | | | | | | | | | | | | | | | | S | | +| str_to_map | | | | | | | | | | | | | | | | | | | | | | | +| transform | transform | transofrm | | | | | | | | | | | | | | | | | | | | | +| transform_keys | transform_keys | | | | | | | | | | | | | | | | | | | | | | +| transform_values | transform_values | | | | | | | | | | | | | | | | | | | | | | +| zip_with | zip_with | | | | | | | | | | | | | | | | | | | | | | | add_months | | | | | | | | | | | | | | | | | | | | | | | | current_date | | | S* | | | | | | | | | | | | | | | | | | | | | current_timestamp | | | S* | | | | | | | | | | | | | | | | | | | 
| @@ -285,18 +303,25 @@ Gluten supports 164 functions. (Draw to right to see all data types) | date | date | | S | | | | | | | | | | | | | | | | | | | | | date_add | date_add | date_add | S | | | S | S | S | | | | S | S | | | | | | | | | | | date_format | date_format | | | | | | | | | | | | | | | | | | | | | | -| date_sub | | | | | | | | | | | | | | | | | | | | | | | +| date_from_unix_date | | | | | | | | | | | | | | | | | | | | | | | +| date_part | | | | | | | | | | | | | | | | | | | | | | | +| date_sub | | | S | | | | | | | | | | | | | | | | | | | | | date_trunc | date_trunc | | | | | | | | | | | | | | | | | | | | | | | datediff | date_diff | | S | | | | | | | | | S | S | | | | | | | | | | | day | day | | S | | | | | | | | | S | S | | | | | | | | | | | dayofmonth | day_of_month | | S | | | | | | | | | S | S | | | | | | | | | | | dayofweek | day_of_week,dow | | S | | | | | | | | | S | S | | | | | | | | | | | dayofyear | day_of_year,doy | | S | | | | | | | | | S | S | | | | | | | | | | +| extract | | | | | | | | | | | | S | S | | | | | | | | | | | from_unixtime | from_unixtime | | | | | | | | | | | | | | | | | | | | | | | from_utc_timestamp | | | | | | | | | | | | | | | | | | | | | | | | hour | hour | | | | | | | | | | | | | | | | | | | | | | | last_day | | last_day | | | | | | | | | | | | | | | | | | | | | -| make_date | | make_date | | | | | | | | | | | | | | | | | | | | | +| make_date | | make_date | S | | | | | | | | | | | | | | | | | | | | +| make_dt_interval | | | | | | | | | | | | | | | | | | | | | | | +| make_interval | | | | | | | | | | | | | | | | | | | | | | | +| make_timestamp | | | | | | | | | | | | | | | | | | | | | | | +| make_ym_interval | | | | | | | | | | | | | | | | | | | | | | | | minute | minute | | | | | | | | | | | | | | | | | | | | | | | month | month | | S | | | | | | | | | S | S | | | | | | | | | | | months_between | | | | | | | | | | | | | | | | | | | | | | | @@ -304,38 +329,51 @@ Gluten supports 164 functions. 
(Draw to right to see all data types) | now | | | S | | | | | | | | | S | S | | | | | | | | | | | quarter | quarter | | S | | | | | | | | | S | S | | | | | | | | | | | second | second | | | | | | | | | | | | | | | | | | | | | | +| session_window | | | | | | | | | | | | | | | | | | | | | | | | timestamp | | | | | | | | | | | | | | | | | | | | | | | +| timestamp_micros | | | | | | | | | | | | | | | | | | | | | | | +| timestamp_millis | | | | | | | | | | | | | | | | | | | | | | | +| timestamp_seconds | | | | | | | | | | | | | | | | | | | | | | | | to_date | | | S | | | | | | | | | S | S | | | | | | | | | | | to_timestamp | | | | | | | | | | | | | | | | | | | | | | | -| to_unix_timestamp | to_unixtime | to_unix_timestamp | | | | | | | | | | | | | | | | | | | | | +| to_unix_timestamp | to_unixtime | to_unix_timestamp | S | | | | | | | | | | | | | | | | | | | | | to_utc_timestamp | | | | | | | | | | | | | | | | | | | | | | | | trunc | | | | | | | | | | | | | | | | | | | | | | | | unix_timestamp | | unix_timestamp | | | | | | | | | | | | | | | | | | | | | -| weekofyear | week,week_of_year | | | | | | | | | | | | | | | | | | | | | | +| weekday | | | | | | | | | | | | | | | | | | | | | | | +| weekofyear | week,week_of_year | | S | | | | | | | | | | | | | | | | | | | | | window | | | | | | | | | | | | | | | | | | | | | | | | year | year | year | S | | | | | | | | | S | S | | | | | | | | | | +| aggregate | | aggregate | S | | | | | | | | | | | | | | | | | | | | +| any | | | | | | | | | | | | | | | | | | | | | | | | approx_count_distinct | approx_distinct | | S | | S | S | S | S | S | S | S | S | | S | | | | | | | | | +| approx_percentile | | | | | | | | | | | | | | | | | | | | | | | | avg | avg | | S | Ansi Off | | S | S | S | S | S | | | | | | | | | | | | | -| bit_and | bitwise_and_agg | | S | | | S | S | S | S | S | | | | | | | | | | | | | -| bit_or | bitwise_or_agg | | S | | | S | S | S | S | S | | | | | | | | | | | | | -| bit_xor | | bit_xor | S | | | S | S | S | S | S | 
| | | | | | | | | | | | -| collect_list | | | | | | | | | | | | | | | | | | | | | | | -| collect_set | | | | | | | | | | | | | | | | | | | | | | | -| corr | corr | | | | | | | | | | | | | | | | | | | | | | +| bool_and | | | | | | | | | | | | | | | | | | | | | | | +| bool_or | | | | | | | | | | | | | | | | | | | | | | | +| collect_list | | | S | | | | | | | | | | | | | | | | | | | | +| collect_set | | | S | | | | | | | | | | | | | | | | | | | | +| corr | corr | | S | | | | | | | | | | | | | | | | | | | | | count | count | | S | | | | S | S | S | S | S | | | | | | | | | | | | | count_if | count_if | | | | | S | S | S | S | S | | | | | | | | | | | | | +| count_min_sketch | | | | | | | | | | | | | | | | | | | | | | | | covar_pop | covar_pop | | S | | | S | S | S | S | S | | | | | | | | | | | | | | covar_samp | covar_samp | | S | | | S | S | S | S | S | | | | | | | | | | | | | +| every | | | | | | | | | | | | | | | | | | | | | | | | first | | first | S | | | | | | | | | | | | | | | | | | | | | first_value | | first_value | S | | | | | | | | | | | | | | | | | | | | +| grouping | | | | | | | | | | | | | | | | | | | | | | | | grouping_id | | | | | | | | | | | | | | | | | | | | | | | | kurtosis | | | | | | | | | | | | | | | | | | | | | | | | last | | last | S | | | | | | | | | | | | | | | | | | | | | last_value | | last_value | S | | | | | | | | | | | | | | | | | | | | | max | max | | S | | | | S | S | S | S | S | | | | | | | | | | | | +| max_by | | | | | | | | | | | | | | | | | | | | | | | | mean | avg | | S | Ansi Off | | | | | | | | | | | | | | | | | | | | min | min | | S | | | | S | S | S | S | S | | | | | | | | | | | | +| min_by | | | | | | | | | | | | | | | | | | | | | | | | skewness | | | | | | | | | | | | | | | | | | | | | | | +| some | | | | | | | | | | | | | | | | | | | | | | | | std,stddev | stddev | | S | | | | S | S | S | S | S | | | | | | | | | | | | | stddev,std | stddev | | S | | | | S | S | S | S | S | | | | | | | | | | | | | stddev_pop | stddev_pop | | S | 
| | S | S | S | S | S | | | | | | | | | | | | | @@ -351,24 +389,42 @@ Gluten supports 164 functions. (Draw to right to see all data types) | nth_value | nth_value | nth_value | PS | | | | | | | | | | | | | | | | | | | | | ntile | | | | | | | | | | | | | | | | | | | | | | | | percent_rank | percent_rank | | S | | | | | | | | | | | | | | | | | | | | -| rank | rank | | | | | | | | | | | | | | | | | | | | | | +| rank | rank | | S | | | | | | | | | | | | | | | | | | | | | row_number | row_number | | S | | | | S | S | S | | | | | | | | | | | | | | +| from_csv | | | | | | | | | | | | | | | | | | | | | | | | from_json | | | | | | | | | | | | | | | | | | | | | | | | get_json_object | json_extract_scalar | get_json_object | S | | | | | | | | | | | | | | | | | | S | | | json_array_length | json_array_length | | S | | | | | | | | | | | | | | | | | | S | | | json_tuple | | | | | | | | | | | | | | | | | | | | | | | +| schema_of_csv | | | | | | | | | | | | | | | | | | | | | | | | schema_of_json | | | | | | | | | | | | | | | | | | | | | | | +| to_csv | | | | | | | | | | | | | | | | | | | | | | | | to_json | | | | | | | | | | | | | | | | | | | | | | | +| assert_true | | | | | | | | | | | | | | | | | | | | | | | +| coalesce | | | PS | | | | | | | | | | | | | | | | | | | | | crc32 | crc32 | | S | | | | | | | | | | | S | | | | | | | | | | current_user | | | S* | | | | | | | | | | | S | | | | | | | | | +| current_catalog | | | S | | | | | | | | | | | | | | | | | | | | +| current_database | | | S | | | | | | | | | | | | | | | | | | | | | greatest | greatest | greatest | S | | | | | | S | S | S | S | S | | | | | | | | | | | hash | hash | hash | S | | S | S | S | S | S | S | S | | | | | | | | | | | | +| inline | | | | | | | | | | | | | | | | | | | | | | | +| inline_outer | | | | | | | | | | | | | | | | | | | | | | | | input_file_name | | | | | | | | | | | | | | | | | | | | | | | +| input_file_block_length | | | | | | | | | | | | | | | | | | | | | | | +| input_file_block_start | | | | | | 
| | | | | | | | | | | | | | | | | +| java_method | | | | | | | | | | | | | | | | | | | | | | | | least | least | least | S | | | | | | S | S | S | S | S | | | | | | | | | | | md5 | md5 | | S | | | S | | | | | | | | | | | | | | | | | | monotonically_increasing_id | | | | | | | | | | | | | | | | | | | | | | | +| nanvl | | | | | | | | | | | | | | | | | | | | | | | +| nvl | | | | | | | | | | | | | | | | | | | | | | | +| nvl2 | | | | | | | | | | | | | | | | | | | | | | | +| raise_error | | | | | | | | | | | | | | | | | | | | | | | +| reflect | | | | | | | | | | | | | | | | | | | | | | | | sha | | | S | | | | | | | | | | | S | | | | | | | | | | sha1 | sha1 | sha1 | S | | | | | | | | | | | S | | | | | | | | | | sha2 | | sha2 | S | | | | | | | | | | | S | | | | | | | | | -| spark_partition_id | | | | | | | | | | | | | | | | | | | | | | | +| spark_partition_id | | | S | | | | | | | | | | | | | | | | | | | | +| stack | | | | | | | | | | | | | | | | | | | | | | | | xxhash64 | xxhash64 | xxhash64 | | | | | | | | | | | | | | | | | | | | |