address comment
marin-ma committed Apr 15, 2024
1 parent 3daa82d commit b4b4656
Showing 2 changed files with 43 additions and 28 deletions.
8 changes: 4 additions & 4 deletions velox/docs/functions/spark/binary.rst
@@ -7,29 +7,29 @@ Binary Functions
Computes the hash of one or more input values using seed value of 42. For
multiple arguments, their types can be different.
Supported types are: BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, VARCHAR,
- VARBINARY, REAL, DOUBLE, HUGEINT and TIMESTAMP.
+ VARBINARY, REAL, DOUBLE, HUGEINT, TIMESTAMP, ARRAY, MAP and ROW.


.. spark:function:: hash_with_seed(seed, x, ...) -> integer
Computes the hash of one or more input values using specified seed. For
multiple arguments, their types can be different.
Supported types are: BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, VARCHAR,
- VARBINARY, REAL, DOUBLE, HUGEINT and TIMESTAMP.
+ VARBINARY, REAL, DOUBLE, HUGEINT, TIMESTAMP, ARRAY, MAP and ROW.

.. spark:function:: xxhash64(x, ...) -> bigint
Computes the xxhash64 of one or more input values using seed value of 42.
For multiple arguments, their types can be different.
Supported types are: BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, VARCHAR,
- VARBINARY, REAL, DOUBLE, HUGEINT and TIMESTAMP.
+ VARBINARY, REAL, DOUBLE, HUGEINT, TIMESTAMP, ARRAY, MAP and ROW.

.. spark:function:: xxhash64_with_seed(seed, x, ...) -> bigint
Computes the xxhash64 of one or more input values using specified seed. For
multiple arguments, their types can be different.
Supported types are: BOOLEAN, TINYINT, SMALLINT, INTEGER, BIGINT, VARCHAR,
- VARBINARY, REAL, DOUBLE, HUGEINT and TIMESTAMP.
+ VARBINARY, REAL, DOUBLE, HUGEINT, TIMESTAMP, ARRAY, MAP and ROW.

.. spark:function:: md5(x) -> varbinary
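The documentation above says each function hashes "one or more input values" starting from a seed (42 by default). The following is a minimal sketch of how that could work for 32-bit integers only. It assumes two things the diff does not state: that `hash` uses the Murmur3 x86_32 algorithm, as Spark's `hash` does, and that multiple arguments are folded by feeding each result in as the seed for the next argument. The names `hashInt` and `hashInts` are hypothetical and are not Velox APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Rotate a 32-bit value left by r bits (0 < r < 32).
inline uint32_t rotl32(uint32_t x, int r) {
  return (x << r) | (x >> (32 - r));
}

// Murmur3 x86_32 building blocks.
inline uint32_t mixK1(uint32_t k1) {
  k1 *= 0xcc9e2d51u;
  k1 = rotl32(k1, 15);
  k1 *= 0x1b873593u;
  return k1;
}

inline uint32_t mixH1(uint32_t h1, uint32_t k1) {
  h1 ^= k1;
  h1 = rotl32(h1, 13);
  return h1 * 5u + 0xe6546b64u;
}

inline uint32_t fmix(uint32_t h1, uint32_t length) {
  h1 ^= length;
  h1 ^= h1 >> 16;
  h1 *= 0x85ebca6bu;
  h1 ^= h1 >> 13;
  h1 *= 0xc2b2ae35u;
  h1 ^= h1 >> 16;
  return h1;
}

// Hypothetical helper: hash a single 4-byte integer with the given seed.
int32_t hashInt(int32_t input, int32_t seed) {
  const uint32_t k1 = mixK1(static_cast<uint32_t>(input));
  const uint32_t h1 = mixH1(static_cast<uint32_t>(seed), k1);
  return static_cast<int32_t>(fmix(h1, 4));
}

// Hypothetical helper: fold several integers into one hash. Assumption:
// the result for one argument becomes the seed for the next one.
int32_t hashInts(const std::vector<int32_t>& values, int32_t seed = 42) {
  int32_t result = seed;
  for (const int32_t value : values) {
    result = hashInt(value, result);
  }
  return result;
}

// Small self-check: folding two values equals chaining two single hashes.
void checkChaining() {
  assert(hashInts({1, 2}) == hashInt(2, hashInt(1, 42)));
}
```

Under these assumptions, `hashInts({1, 2})` plays the role of `hash(1, 2)` and `hashInts({1, 2}, 7)` the role of `hash_with_seed(7, 1, 2)`. Other types, including the nested ARRAY, MAP and ROW types this commit documents, would need their own per-type hashing and are not covered here.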
63 changes: 39 additions & 24 deletions velox/functions/sparksql/Hash.cpp
@@ -592,6 +592,45 @@ class XxHash64Function final : public exec::VectorFunction {
const std::optional<int64_t> seed_;
};

+bool checkHashElementType(const TypePtr& type) {
+  switch (type->kind()) {
+    case TypeKind::BOOLEAN:
+    case TypeKind::TINYINT:
+    case TypeKind::SMALLINT:
+    case TypeKind::INTEGER:
+    case TypeKind::BIGINT:
+    case TypeKind::VARCHAR:
+    case TypeKind::VARBINARY:
+    case TypeKind::REAL:
+    case TypeKind::DOUBLE:
+    case TypeKind::HUGEINT:
+    case TypeKind::TIMESTAMP:
+      return true;
+    case TypeKind::ARRAY:
+      return checkHashElementType(type->asArray().elementType());
+    case TypeKind::MAP:
+      return checkHashElementType(type->asMap().keyType()) &&
+          checkHashElementType(type->asMap().valueType());
+    case TypeKind::ROW: {
+      const auto& children = type->asRow().children();
+      return std::all_of(
+          children.begin(), children.end(), [](const auto& child) {
+            return checkHashElementType(child);
+          });
+    }
+    default:
+      return false;
+  }
+}
+
+void checkArgTypes(const std::vector<exec::VectorFunctionArg>& args) {
+  for (const auto& arg : args) {
+    if (!checkHashElementType(arg.type)) {
+      VELOX_USER_FAIL("Unsupported type for hash: {}", arg.type->toString())
+    }
+  }
+}

} // namespace

// Not all types are supported by now. Check types when making hash function.
@@ -604,30 +643,6 @@ std::vector<std::shared_ptr<exec::FunctionSignature>> hashSignatures() {
.build()};
}

-void checkArgTypes(const std::vector<exec::VectorFunctionArg>& args) {
-  for (const auto& arg : args) {
-    switch (arg.type->kind()) {
-      case TypeKind::BOOLEAN:
-      case TypeKind::TINYINT:
-      case TypeKind::SMALLINT:
-      case TypeKind::INTEGER:
-      case TypeKind::BIGINT:
-      case TypeKind::VARCHAR:
-      case TypeKind::VARBINARY:
-      case TypeKind::REAL:
-      case TypeKind::DOUBLE:
-      case TypeKind::HUGEINT:
-      case TypeKind::TIMESTAMP:
-      case TypeKind::ARRAY:
-      case TypeKind::MAP:
-      case TypeKind::ROW:
-        break;
-      default:
-        VELOX_USER_FAIL("Unsupported type for hash: {}", arg.type->toString())
-    }
-  }
-}

std::shared_ptr<exec::VectorFunction> makeHash(
const std::string& name,
const std::vector<exec::VectorFunctionArg>& inputArgs,
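The change in `Hash.cpp` replaces a flat check, which accepted any ARRAY, MAP or ROW, with a recursive one that also validates the nested element, key, value and field types. The sketch below illustrates that recursion on a toy type model so it can be compiled without Velox. `Kind`, `ToyType`, `makeType` and `isHashable` are hypothetical stand-ins invented for this example; they are not Velox classes or functions, and only two scalar kinds are modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a type kind enum; kUnknown models any
// kind that the hash functions do not support.
enum class Kind { kInteger, kVarchar, kUnknown, kArray, kMap, kRow };

struct ToyType;
using ToyTypePtr = std::shared_ptr<const ToyType>;

// Hypothetical stand-in for a type: a kind plus child types.
// ARRAY has one child (element), MAP has two (key, value),
// ROW has one child per field.
struct ToyType {
  Kind kind;
  std::vector<ToyTypePtr> children;
};

ToyTypePtr makeType(Kind kind, std::vector<ToyTypePtr> children = {}) {
  return std::make_shared<ToyType>(ToyType{kind, std::move(children)});
}

// Returns true only if the type and every type nested inside it
// is supported. Mirrors the shape of the recursive check in the diff.
bool isHashable(const ToyTypePtr& type) {
  switch (type->kind) {
    case Kind::kInteger:
    case Kind::kVarchar:
      return true;
    case Kind::kArray:
    case Kind::kMap:
    case Kind::kRow:
      // Complex types are hashable only if all of their children are.
      return std::all_of(
          type->children.begin(),
          type->children.end(),
          [](const ToyTypePtr& child) { return isHashable(child); });
    default:
      return false;
  }
}

// Small self-check: an array of integers passes, an array of an
// unsupported kind does not.
void checkNestedTypes() {
  assert(isHashable(makeType(Kind::kArray, {makeType(Kind::kInteger)})));
  assert(!isHashable(makeType(Kind::kArray, {makeType(Kind::kUnknown)})));
}
```

With the flat check, a type such as an array of an unsupported element kind would have been accepted at function creation time; with the recursive check it is rejected up front, which appears to be the intent of the commit.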
