[#562] docs(hive): add user doc of Hive catalog #569
A few minor formatting and English issues need fixing.
license: "Copyright 2023 Datastrato.
This software is licensed under the Apache License version 2."
---
## Using Hive as a Catalog in Gravitino
Maybe we can describe more of the Hive catalog's capabilities? For example, it currently works in proxy mode and supports basic namespace and table DDL operations, but does not support partition operations yet. Can it manage tables that were not created by Gravitino? How do those differ from tables created by Gravitino?
I am just keeping the document format consistent with the Iceberg catalog (#537). Additionally, the table-related operations you mentioned should be placed in a sub-directory of the Hive catalog document directory, but the current document framework does not support a hierarchical directory.
docs/gravitino-manage-hive.md
Outdated
```
}
```

* `provider`: Set this to "hive" to use Hive as the catalog provider.
`provider` is immutable.
This does not mention that the `provider` is mutable. What would you recommend revising this sentence to?
We should tell users that they can't change `provider` when they alter catalog info. Otherwise, besides create, should we also add docs for drop, alter, load, and list?
How about changing it to:
* `provider`: Must set this to "hive" in order to use Hive as the catalog provider.
docs/gravitino-manage-hive.md
Outdated
## After the catalog is initialized

You can manage and operate on tables using the following URL format:
The Hive catalog provides some custom table properties, such as `format`; we should tell users about them.
Yeah, but it's a table property, not a catalog property.
I think you need to list all the table properties here to tell users how to configure them.
You can manage and operate on tables using the following URL format:

```shell
http://{GravitinoServerHost}:{GravitinoServerPort}/api/metalakes/{metalake}/catalogs/{catalog}/schemas/{schema}/tables
```
I suggest providing a simple example of creating a Hive table.
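As a starting point for such an example, here is a minimal sketch of how the table endpoint quoted above could be assembled. Only the URL layout comes from the doc under review; the helper name `tables_url`, the host, the port, and the metalake, catalog, and schema names are hypothetical. The request body for creating a table is not shown in this thread, so it is left out rather than guessed.

```python
from urllib.parse import quote


def tables_url(host: str, port: int, metalake: str, catalog: str, schema: str) -> str:
    """Build the table-management URL in the format quoted from the doc."""
    # Percent-encode each path segment so unusual names cannot break the path.
    m, c, s = (quote(part, safe="") for part in (metalake, catalog, schema))
    return f"http://{host}:{port}/api/metalakes/{m}/catalogs/{c}/schemas/{s}/tables"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(tables_url("localhost", 8090, "my_metalake", "hive_catalog", "sales"))
```

Encoding each segment separately keeps a name containing `/` or a space from changing the shape of the path.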
@yuqi1129 @justinmclean I have made the corresponding modifications based on the comments. Could you help me review it again?
docs/gravitino-manage-hive.md
Outdated
### configuration

| Configuration item | Description | value |
Is `value` the default value or just an example value?
Also, could you please add a column named `Since version`, as @jerryshao suggested, to mark the version in which we introduced this configuration?
Added.
docs/gravitino-manage-hive.md
Outdated
* `provider`: Set this to "hive" to use Hive as the catalog provider.
* `metastore.uris`: This is a required configuration, and it should be the Hive metastore service URIs.
* Other configuration parameters with the `gravitino.bypass.` prefix can be added to the "properties" section and passed down to the underlying Hive metastore.
I think we'd better add an example such as `gravitino.bypass.hive.metastore.client.capability.check`, in which case we would pass `hive.metastore.client.capability.check` to the Hive metastore.
What does `gravitino.bypass.hive.metastore.client.capability.check` mean? I think we could offer meaningful examples that are useful for the Hive catalog, but I don't know which ones exist.
`hive.metastore.client.capability.check` is the key of an actual configuration in `HiveConf`; it's just an example of how to use the `gravitino.bypass.` prefix. If users want to override a default Hive value, they can use `gravitino.bypass.xxxx` to overwrite `xxxx` in Hive.
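To make the prefix behaviour described above concrete, here is a minimal sketch of the mapping. It is an illustration of the rule as stated in this thread, not Gravitino's actual implementation; the function name `split_bypass_properties` and the sample metastore URI are hypothetical.

```python
from typing import Dict, Tuple

BYPASS_PREFIX = "gravitino.bypass."


def split_bypass_properties(
    properties: Dict[str, str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Separate catalog properties from those passed down to the Hive metastore."""
    catalog_props: Dict[str, str] = {}
    hive_props: Dict[str, str] = {}
    for key, value in properties.items():
        if key.startswith(BYPASS_PREFIX):
            # Strip the prefix so Hive sees its own configuration key.
            hive_props[key[len(BYPASS_PREFIX):]] = value
        else:
            catalog_props[key] = value
    return catalog_props, hive_props


if __name__ == "__main__":
    # Hypothetical "properties" section of a catalog definition.
    props = {
        "metastore.uris": "thrift://localhost:9083",
        "gravitino.bypass.hive.metastore.client.capability.check": "false",
    }
    catalog_props, hive_props = split_bypass_properties(props)
    print(catalog_props)
    print(hive_props)
```

In this sketch, `metastore.uris` stays with the catalog while `hive.metastore.client.capability.check` is what the Hive side would receive.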
Added to the catalog properties section.
What changes were proposed in this pull request?
Add a user doc for the Hive catalog.
Why are the changes needed?
Fix: #562
Does this PR introduce any user-facing change?
no
How was this patch tested?
Not needed.