From issue 3371 #1

Closed
wants to merge 584 commits into from

coolderli
Owner

What changes were proposed in this pull request?

(Please outline the changes and how this PR fixes the issue.)

Why are the changes needed?

(Please clarify why the changes are needed. For instance,

  1. If you propose a new API, clarify the use case for a new API.
  2. If you fix a bug, describe the bug.)

Fix: # (issue)

Does this PR introduce any user-facing change?

(Please list the user-facing changes introduced by your change, including

  1. Change in user-facing APIs.
  2. Addition or removal of property keys.)

How was this patch tested?

(Please test your changes, and provide instructions on how to test it:

  1. If you add a feature or fix a bug, add a test to cover your changes.
  2. If you fix a flaky test, repeat it many times to prove it works.)

yuqi1129 and others added 30 commits April 24, 2024 11:15
### What changes were proposed in this pull request?

Change Java doc link to 0.5.0.

### Why are the changes needed?

Version 0.5.0 is about to be released.

### Does this PR introduce _any_ user-facing change?

N/A.

### How was this patch tested?

N/A.
…rl (apache#3162)

### What changes were proposed in this pull request?

Use `http://127.0.0.1:9001/iceberg/v1/config` to verify the
availability of the Iceberg service.

### Why are the changes needed?

The original verification URL is not accessible in some environments.

Fix: apache#3158 

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

just document
…r, Group and Role NameIdentifier (apache#3143)

### What changes were proposed in this pull request?

add `checkUser`, `checkGroup`, `checkRole` in `NameIdentifier`

### Why are the changes needed?

Fix: apache#3132 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

ut

---------

Co-authored-by: yangliwei <[email protected]>
…pache#2717)

### What changes were proposed in this pull request?
Support retrieving Iceberg metadata columns, such as `_spec_id`,
`_partition`, `_file`, `_pos`, `_deleted`.

### Why are the changes needed?

Support retrieving Iceberg metadata columns, because row-level operations
depend on them.

Fix: apache#2587

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

New integration test.
…che#3166)

### What changes were proposed in this pull request?

Modify the description of `CATALOG_OPERATION_IMPL`, avoid the use of
"hack"

### Why are the changes needed?

Fix: apache#3081 

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?
  
No need to test

---------

Co-authored-by: TimWang <[email protected]>
…pache#3171)

### What changes were proposed in this pull request?

Change the dummy port to a real valid port.

### Why are the changes needed?

PostgreSQL is different from MySQL and is stricter about obtaining
JDBC drivers. We need to provide a real port or we can't obtain the driver.

Fix: apache#3161 

### Does this PR introduce _any_ user-facing change?

N/A.

### How was this patch tested?

IT.
…to support more iceberg catalog backends (apache#3164)

### What changes were proposed in this pull request?
Refactor Spark-connector IT.


### Why are the changes needed?
to support more iceberg catalog backends, such as testing hive, jdbc,
rest catalog backends.

Fix: apache#3163

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Existing ITs.
… hive (apache#3169)

### What changes were proposed in this pull request?
transform `hive` provider to `text` format

### Why are the changes needed?
Fix: apache#3129 

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
add UT and IT
### What changes were proposed in this pull request?
Change `\n` to `%n` in `String.format()` calls.

### Why are the changes needed?

Fix: apache#3034 

### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
no

Co-authored-by: 韩望欣 <[email protected]>
### What changes were proposed in this pull request?

Replace static fields

### Why are the changes needed?
Fix: apache#3028

### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
Use existing test cases in TestAuthorizationUtils.class

Co-authored-by: 韩望欣 <[email protected]>
…apache#3041)

### What changes were proposed in this pull request?

Catch Exception instead of Throwable in HiveCatalogOperations.java and
HiveClientPool.java

### Why are the changes needed?

Fix: apache#3035 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

Existing UTs.

Co-authored-by: yangliwei <[email protected]>
…3147)

### What changes were proposed in this pull request?

Use ? rather than {0,1} in regex in GravitinoVersion.java
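
The two quantifiers are equivalent, so the change only affects readability. A minimal Python sketch shows the equivalence; the version pattern here is made up for illustration and is not the regex used in `GravitinoVersion.java`:

```python
import re

# Hypothetical version pattern for illustration only; it is NOT the
# pattern used in GravitinoVersion.java.
VERBOSE = re.compile(r"(\d+)\.(\d+)\.(\d+)(-\w+){0,1}")
CONCISE = re.compile(r"(\d+)\.(\d+)\.(\d+)(-\w+)?")


def same_result(text: str) -> bool:
    """Return True if both patterns agree on whether and how `text` matches."""
    a = VERBOSE.fullmatch(text)
    b = CONCISE.fullmatch(text)
    if a is None or b is None:
        return a is None and b is None
    return a.groups() == b.groups()
```

Calling `same_result` on strings with and without the optional suffix should return `True` in both cases, since `{0,1}` and `?` both mean "zero or one occurrence".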

### Why are the changes needed?

Fix: apache#3077 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

Existing UTs.

Co-authored-by: yangliwei <[email protected]>
…ule `integration-test-common` (apache#3201)

### What changes were proposed in this pull request?

Trigger integration test when there is a change in module
`integration-test-common`.

### Why are the changes needed?

We should start the integration tests if there are changes in module
`integration-test-common`. However, the CI run
https://github.com/datastrato/gravitino/actions/runs/8872110159/job/24356012525?pr=3197
of PR apache#3197 did not start the
CI pipeline.

Fixed: apache#3207 

### Does this PR introduce _any_ user-facing change?

N/A.

### How was this patch tested?

CI passed.
… improve the wording (apache#3219)

### What changes were proposed in this pull request?

Fix the wrong sequence number issue in doc and improve the wording.

### Why are the changes needed?

Fix: apache#3218

### Does this PR introduce _any_ user-facing change?

No.
### How was this patch tested?

N/A
…log after all tests are finished (apache#3197)

### What changes were proposed in this pull request?

close containers and upload container log after all tests are finished

### Why are the changes needed?

Fix: apache#3024 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

existing ITs

---------

Co-authored-by: zhanghan18 <[email protected]>
Co-authored-by: Qi Yu <[email protected]>
…apache#3224)

### What changes were proposed in this pull request?
 polish iceberg rest catalog document
- format table
- replace spark-shell with spark sql
- add `--package`
- add `;`

### Why are the changes needed?
more user friendly

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
just document
…arbage Collector cleaning fileset version info (apache#3191)

### What changes were proposed in this pull request?

When obtaining the maximum version result of fileset, pass the result
into the object to solve the `ClassCastException` problem.

### Why are the changes needed?

Fix: apache#3190 

### How was this patch tested?

Add some ITs.

---------

Co-authored-by: xiaojiebao <[email protected]>
…ConfigOption (apache#3217)

### What changes were proposed in this pull request?
Fix the issue of method `checkValue` in the ConfigOption. If we use
`checkValue` in the wrong position, it won't work.

### Why are the changes needed?

Fix: apache#3216

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
UT.

Co-authored-by: Heng Qin <[email protected]>
…catalog (apache#3235)

### What changes were proposed in this pull request?

Update playground docs for using simple catalog

### Why are the changes needed?

Fix: apache#3234 

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

NO
…ithout UTF-8 (apache#3179)

### What changes were proposed in this pull request?

Currently in the Java client, the JSON result is not encoded with
`UTF-8` when requesting the server, so `ISO_8859_1` is used by
default, which causes some Chinese characters to be garbled. This PR
fixes this.
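
The garbling can be reproduced outside the Java client. The sketch below is a Python illustration of the same effect, not the actual fix; the JSON payload is invented for the example:

```python
# UTF-8 bytes of a made-up JSON payload containing Chinese characters.
payload = '{"comment": "中文注释"}'.encode("utf-8")

# Decoding with the wrong charset does not raise an error; every byte is
# mapped to one Latin-1 character, which produces garbled text.
garbled = payload.decode("iso-8859-1")

# Decoding with UTF-8 restores the original text.
correct = payload.decode("utf-8")

print(garbled != correct)
```

Because ISO-8859-1 maps each byte to exactly one character, the wrong decoding fails silently, which is why the problem only shows up as garbled output.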

![image](https://github.com/datastrato/gravitino/assets/26177232/9342dd32-1ded-4670-a3c7-37d9a5673955)

### Why are the changes needed?

Fix: apache#3165 

### How was this patch tested?

Add some ITs.

---------

Co-authored-by: xiaojiebao <[email protected]>
…pache#3238)

### What changes were proposed in this pull request?

use simple catalog name in
`com.datastrato.gravitino.integration.test.container.TrinoContainer#checkSyncCatalogFromGravitino`,
and verify it where it is called.

### Why are the changes needed?

Fix: apache#3237 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

ITs

---------

Co-authored-by: zhanghan18 <[email protected]>
…nt (apache#3204)

### What changes were proposed in this pull request?

* Add a pylint rule for logging: log calls should use the old format (%),
not f-strings
* Unify all the logging formats
* Disable many rules for future modification (I added TODO tags after
them and will work on them after this PR)
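
As a rough illustration of the logging style the first bullet asks for, the sketch below uses %-style arguments with the standard `logging` module. The logger name, message text, and in-memory handler are assumptions made for the example and do not come from the client code:

```python
import io
import logging

# Hypothetical logger name and messages, used only for illustration.
logger = logging.getLogger("gravitino_example")
buffer = io.StringIO()
logger.addHandler(logging.StreamHandler(buffer))
logger.setLevel(logging.INFO)
logger.propagate = False

catalog = "hive_catalog"

# Preferred: pass arguments separately so formatting is deferred until
# the record is actually emitted.
logger.info("Loaded catalog %s", catalog)

# Below the INFO threshold, so the %-style arguments are never formatted.
logger.debug("Catalog details: %s", catalog)

# Flagged by pylint's logging-fstring-interpolation check: the string is
# built eagerly even when the record is discarded.
# logger.debug(f"Catalog details: {catalog}")

output = buffer.getvalue()
```

Deferring the formatting avoids building strings for records that are filtered out, which is the usual motivation for this pylint check.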

### Why are the changes needed?

Fix: apache#3203 

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?
```bash
./gradlew :clients:client-python:test
```

---------

Co-authored-by: TimWang <[email protected]>

### What changes were proposed in this pull request?
Fixed assertion



Fix:  apache#3251

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

Ran test locally multiple times

---------

Signed-off-by: Rohit Satya <[email protected]>
…order (apache#3255)

### What changes were proposed in this pull request?

Equality assertions did not have correct expected and actual order. 

### Why are the changes needed?

Provides code clarity. 

Fix: apache#3252 

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Test only changes and ran associated tests.
### What changes were proposed in this pull request?

Removing Inheritance

### Why are the changes needed?

Fix: apache#3227 

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

existing tests
…ead#sleep` (apache#3239)

### What changes were proposed in this pull request?

use `Awaitility#await` instead of `Thread#sleep`
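
Awaitility is a Java library, so the snippet below is only a Python analogue of the idea: poll a condition until it holds or a timeout expires, instead of sleeping for a fixed time. The helper name `wait_until` and its parameters are hypothetical:

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 5.0,
    interval: float = 0.05,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(interval)
    # One last check so a condition that became true right at the deadline
    # is not reported as a failure.
    return condition()
```

Compared with a fixed sleep, polling returns as soon as the condition holds and fails explicitly when it never does, which tends to make integration tests both faster and less flaky.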

### Why are the changes needed?

Fix: apache#3214 

### Does this PR introduce _any_ user-facing change?

N/A 

### How was this patch tested?

ITs

---------

Co-authored-by: zhanghan18 <[email protected]>
… GravitinoVirtualFileSystem.java (apache#3098)

### What changes were proposed in this pull request?

replace the non-capturing group `?:` with independent, non-capturing
group (atomic group) `?>` to eliminate backtracking.

### Why are the changes needed?

To prevent stack overflow

Fix: apache#3037 
Fix: apache#3086 

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
Add UT
…ation (apache#3267)

### What changes were proposed in this pull request?

Added `@Override` annotation on getTableBuilder in
PostgreSqlTableOperations

### Why are the changes needed?

Fix: apache#3250
…nformation_schema` for Doris catalog. (apache#3276)

### What changes were proposed in this pull request?

Filter system database `information_schema` when `list` or `get` table
for Doris catalogs.

### Why are the changes needed?

The system database/table should not be accessible to users directly,
and both MySQL and PG have filtered it.

Fix: apache#3275 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

Added test `testListSystemDatabase`.
…ore the Trino started (apache#3175)

### What changes were proposed in this pull request?

Load catalogs before Trino starts.

### Why are the changes needed?

Fix: apache#2627 

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

UT
diqiu50 and others added 28 commits June 17, 2024 10:22
…for TrinoQueryTestTool (apache#3845)

### What changes were proposed in this pull request?

1. Support `ignore_failed` when running TrinoQueryTestTool.
2. Summarize the results of all test cases when the tests finish.

### Why are the changes needed?

Fix: apache#3637

### Does this PR introduce _any_ user-facing change?

NO

### How was this patch tested?

Existing ITs
…apache#3875)

### What changes were proposed in this pull request?

Fix mistakes in Trino connection configuration document

### Why are the changes needed?

It's a bug that needs to be fixed.

Fix: apache#3874 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

N/A.
… running IT in deploy mode (apache#3735)

### What changes were proposed in this pull request?
1. Use Awaitility to wait for the Gravitino server to start in ITs

### Why are the changes needed?

1. To prevent a bug where ITs run before the Gravitino server has started.
  
Fix: apache#3612

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

IT passed
…type (apache#3505)

### What changes were proposed in this pull request?

Support MySQL unsigned integer types

### Why are the changes needed?
Fix: apache#2340

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
UT
…roupId='language' queryString> (apache#3861)

### What changes were proposed in this pull request?

This pull request updates all `<Tabs>` components in the documentation
to support multi-language switching by adding `groupId='language'` and
`queryString` attributes. It also ensures that all related links in the
documentation include the appropriate `queryString` parameter for
language selection.

Changes include:
- Updating all `<Tabs>` components to the following format:
  ````markdown
  <Tabs groupId='language' queryString>
    <TabItem value="Json" label="Json">
      ```json
      {
        "direction": "asc",
        "nullOrder": "NULLS_LAST",
        "sortTerm":  {
          "type": "field",
          "fieldName": ["score"]
        }
      }
      ```
    </TabItem>
    <TabItem value="java" label="Java">
      ```java
      SortOrders.of(NamedReference.field("score"), SortDirection.ASCENDING,
          NullOrdering.NULLS_LAST);
      ```
    </TabItem>
  </Tabs>
  ````

### Why are the changes needed?

Fix: apache#3409
These changes are needed to enable multi-language support in the
documentation, allowing users to easily switch between different
language examples. This improves the usability and accessibility of the
documentation for a wider audience.

### Does this PR introduce any user-facing change?
Yes, this PR introduces the following user-facing changes:

- Users can now switch between different language examples in the
documentation using a consistent interface.
- Documentation links now include `queryString` parameters to ensure
the correct language example is displayed.

### How was this patch tested?

Not yet tested.

Co-authored-by: LanceLin <[email protected]>
…ild Ranger Docker (apache#3775)

### What changes were proposed in this pull request?

Fix: apache#3776 

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Unit and Integration tests.

TODO:
- [x] Change ranger image name in ContainerSuite.java back to default
- [x] Change `RANGER_DOWNLOAD_URL` to the location of
`datastrato/apache-ranger`.
- [x] @xunliu needs to push new image release to docker hub.
- [x] @unknowntpo needs to update `docker-image-details.md`
…-build.md (apache#3736)

### What changes were proposed in this pull request?

This pull request adds detailed instructions for building Gravitino on
Windows using the Windows Subsystem for Linux (WSL). The new section
provides a comprehensive step-by-step guide to setting up the necessary
environment and dependencies on Windows to successfully build and run
Gravitino. It also includes specific instructions for integrating with
IntelliJ IDEA and Visual Studio Code (VS Code).

### Why are the changes needed?

These changes are needed to provide Windows users with clear and concise
instructions on how to build Gravitino using WSL. This documentation
expands the accessibility of the project to a wider audience who may be
using Windows as their primary development environment. The detailed
steps ensure that developers can set up a consistent development
environment across different operating systems.

Fix: apache#3812

### Does this PR introduce _any_ user-facing change?

No user-facing APIs or properties were changed. The addition is purely
documentation, aimed at helping developers set up their environment on
Windows using WSL.

### How was this patch tested?

The documentation changes were reviewed for accuracy and clarity. The
steps outlined were followed to ensure they work as described, using an
environment set up with Windows 11 and Ubuntu 22.04 on WSL. The specific
instructions were used to successfully build the project, confirming
their effectiveness.

---------

Co-authored-by: LanceLin <[email protected]>
Co-authored-by: Jerry Shao <[email protected]>
… entities (apache#3623)

### What changes were proposed in this pull request?
Support to import the entities when loading entities

### Why are the changes needed?

Fix: apache#3607 

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Add ut.

---------

Co-authored-by: Heng Qin <[email protected]>
Co-authored-by: Rory <[email protected]>
Co-authored-by: Jerry Shao <[email protected]>
…ache#3886)

### What changes were proposed in this pull request?

fix type cast issue in `DTOConverters`.

### Why are the changes needed?

Fix: apache#3884 

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Not needed.

---------

Co-authored-by: zhanghan18 <[email protected]>
…#3880)

### What changes were proposed in this pull request?
Add Flink and Spark IT logs

### Why are the changes needed?
The Flink and Spark IT logs were missing

### Does this PR introduce _any_ user-facing change?
no

### How was this patch tested?
existing tests
…g schemas and tables (apache#2335)

### What changes were proposed in this pull request?

Improve security when creating and dropping schemas and tables.

This PR adds the following checks for identifier names using the
capability framework:
- Regex check
- As a best practice, it's generally advised to avoid including spaces
in database names. In this PR, database names that include a space are
considered illegal.
- String length check, since SQL injection usually requires longer
strings
    - MySQL: at most 64 characters
    - PostgreSQL: at most 63 characters

We refer to the specifications of the earliest database versions that
Gravitino currently supports:
- PostgreSQL identifier rules:
https://www.postgresql.org/docs/12/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
- MySQL identifier naming:
https://dev.mysql.com/doc/refman/5.7/en/identifiers.html
- MySQL identifier length limit:
https://dev.mysql.com/doc/refman/5.7/en/identifier-length.html
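
The checks described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not Gravitino's capability framework: the regex (letters, digits, and underscores only) and the function name `is_legal_name` are hypothetical, and only the two length limits come from the PR description.

```python
import re

# Length limits quoted in the PR description.
MAX_LENGTH = {"mysql": 64, "postgresql": 63}

# Hypothetical pattern: letters, digits and underscores only. The real
# rules live in Gravitino's capability framework and may differ.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_legal_name(name: str, backend: str) -> bool:
    """Return True if `name` passes the regex check and the length check."""
    if NAME_PATTERN.fullmatch(name) is None:
        return False
    return len(name) <= MAX_LENGTH[backend]
```

With this sketch, a name containing a space or SQL punctuation is rejected by the regex check, and a 64-character name is accepted for `"mysql"` but rejected for `"postgresql"`.
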
### Why are the changes needed?

Fix: apache#2179 

### Does this PR introduce _any_ user-facing change?
Add name identifier checks before attempting to create or drop schemas
and tables.

### How was this patch tested?
Add IT tests.
…em in Python (apache#3528)

### What changes were proposed in this pull request?

Support Gravitino Virtual File System in Python so that we can read and
write Fileset storage data. The first PR only supports HDFS.

Our research shows that the following popular cloud storage services and
vendors have implemented their own FileSystem based on
fsspec (https://filesystem-spec.readthedocs.io/en/latest/index.html):
1. S3 (https://github.com/fsspec/s3fs)
2. Azure (https://github.com/fsspec/adlfs)
3. GCS (https://github.com/fsspec/gcsfs)
4. OSS (https://github.com/fsspec/ossfs)
5. Databricks (https://github.com/fsspec/filesystem_spec/blob/master/fsspec/implementations/dbfs.py)
6. Snowflake (https://github.com/snowflakedb/snowflake-ml-python)

So this PR will implement GVFS based on the fsspec interface.

### Why are the changes needed?

Fix: apache#2059 

### How was this patch tested?

Add some UTs and ITs.

---------

Co-authored-by: xiaojiebao <[email protected]>
…on for Paimon catalog (apache#3889)

### What changes were proposed in this pull request?

Based on apache#2746 contributed
by @SteNicholas.

Add a basic code skeleton for Paimon catalog.

### Why are the changes needed?

Fix: apache#3885

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Just build.

---------

Co-authored-by: caican <[email protected]>  SteNicholas <[email protected]>
…artup speed for Doris (apache#3883)

### What changes were proposed in this pull request?

- remove chmod in Doris Container start.sh
- add chmod in Doris Dockerfile

### Why are the changes needed?

accelerate the startup speed for Doris Container

Fix: apache#3881

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

Manual
…OConverters` (apache#3904)

### What changes were proposed in this pull request?

Convert literal values of partition in `DTOConverters`.

### Why are the changes needed?

Fix: apache#3903 

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

UT.

Co-authored-by: zhanghan18 <[email protected]>
…ystem (apache#3908)

### What changes were proposed in this pull request?

This PR proposes to add a basic `TagManager` framework. The current
framework is not ready to work since it misses the core logic.

### Why are the changes needed?

This subtask adds a basic tag framework without actual logic. The reason
for adding it separately is to keep the PR small, since there are many
changes in RDBMS support. If we made it complete, the PR would be very
big.

Fix: apache#3895 

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

The current PR is not ready to work, so tests will be added later on.
### What changes were proposed in this pull request?

In this PR, I've changed the tool tip to show the metalake comment and
catalog comment when hovering over the metalake and catalog in each row
respectively.

### Why are the changes needed?

Before the PR when hovering over the metalake row, it would display the
metalake name which is redundant. Additionally, when hovering over the
catalog name it did not show any tooltip at all.

Fix: apache#3286 

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?
I tested this using the web UI to check that the tooltip displays
correctly on hover.
…o add more principals and key tables (apache#3851)

### What changes were proposed in this pull request?

Add more proxy users in the file `core-site.xml` in the Kerberos Hive
docker file

### Why are the changes needed?

As we are going to support schema or fileset level user authentication,
we need more principals and key tables, so we have to change the docker
image file.

Fix: apache#3850 

### Does this PR introduce _any_ user-facing change?

N/A.

### How was this patch tested?

N/A.
coolderli pushed a commit that referenced this pull request Jul 5, 2024
…che#4006)

### What changes were proposed in this pull request?

The frontend integration test failed in certain headless environments,
such as EC2 instances on AWS.

### Why are the changes needed?

 Frontend integration test failed due to the following error:
```
 MetalakePageTest > initializationError FAILED
    org.openqa.selenium.WebDriverException: unknown error: Chrome failed to start: exited abnormally.
      (chrome not reachable)
      (The process started from chrome location /actions-runner/_work/gravitino-test/gravitino-test/gravitino/integration-test/build/chrome/chrome-linux/chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
    Build info: version: '3.141.59', revision: 'e82be7d358', time: '2018-11-14T08:17:03'
    System info: host: 'ip-172-31-5-251', ip: '172.31.5.251', os.name: 'Linux', os.arch: 'amd64', os.version: '6.5.0-1020-aws', java.version: '1.8.0_412'
    Driver info: driver.version: ChromeDriver
    remote stacktrace: #0 0x561b8bc79869 <unknown>
    #1 0x561b8bc14383 <unknown>
    #2 0x561b8b9f6ca3 <unknown>
    #3 0x561b8ba1a286 <unknown>
    #4 0x561b8ba157cd <unknown>
    #5 0x561b8ba4f11d <unknown>
    #6 0x561b8ba49963 <unknown>
    #7 0x561b8ba1fe36 <unknown>
    #8 0x561b8ba20fd5 <unknown>
    #9 0x561b8bc41f90 <unknown>
    #10 0x561b8bc53d80 <unknown>
```

### Does this PR introduce _any_ user-facing change?

N/A

### How was this patch tested?

<img width="1492" alt="image"
src="https://github.com/datastrato/gravitino/assets/154112360/3e16870e-ad51-4677-aec5-8e8be0c68f9b">
@coolderli coolderli closed this Jul 5, 2024