[486] Revert artifact name with incubator prefix #505

Merged 1 commit on Aug 7, 2024
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -110,7 +110,7 @@ catalogOptions: # all other options are passed through in a map
key1: value1
key2: value2
```
5. run with `java -jar incubator-xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml [--hadoopConfig hdfs-site.xml] [--convertersConfig converters.yaml] [--icebergCatalogConfig catalog.yaml]`
5. run with `java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml [--hadoopConfig hdfs-site.xml] [--convertersConfig converters.yaml] [--icebergCatalogConfig catalog.yaml]`
The bundled jar includes hadoop dependencies for AWS, Azure, and GCP. Sample hadoop configurations for configuring the converters
can be found in the [xtable-hadoop-defaults.xml](https://github.com/apache/incubator-xtable/blob/main/utilities/src/main/resources/xtable-hadoop-defaults.xml) file.
The custom hadoop configurations can be passed in with the `--hadoopConfig [custom-hadoop-config-file]` option.
6 changes: 3 additions & 3 deletions demo/notebook/demo.ipynb
@@ -27,9 +27,9 @@
"import $ivy.`org.apache.hudi:hudi-spark3.2-bundle_2.12:0.14.0`\n",
"import $ivy.`org.apache.hudi:hudi-java-client:0.14.0`\n",
"import $ivy.`io.delta:delta-core_2.12:2.0.2`\n",
"import $cp.`/home/jars/incubator-xtable-core-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/incubator-xtable-api-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/incubator-xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/xtable-core-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/xtable-api-0.1.0-SNAPSHOT.jar`\n",
"import $cp.`/home/jars/xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar`\n",
"import $ivy.`org.apache.iceberg:iceberg-hive-runtime:1.3.1`\n",
"import $ivy.`io.trino:trino-jdbc:431`\n",
"import java.util._\n",
6 changes: 3 additions & 3 deletions demo/start_demo.sh
@@ -23,9 +23,9 @@ cd $XTABLE_HOME

mvn install -am -pl xtable-core -DskipTests -T 2
mkdir -p demo/jars
cp xtable-hudi-support/xtable-hudi-support-utils/target/incubator-xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-api/target/incubator-xtable-api-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-core/target/incubator-xtable-core-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-hudi-support/xtable-hudi-support-utils/target/xtable-hudi-support-utils-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-api/target/xtable-api-0.1.0-SNAPSHOT.jar demo/jars
cp xtable-core/target/xtable-core-0.1.0-SNAPSHOT.jar demo/jars

cd demo
docker-compose up
8 changes: 4 additions & 4 deletions pom.xml
@@ -21,7 +21,7 @@
<modelVersion>4.0.0</modelVersion>

<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable</artifactId>
<artifactId>xtable</artifactId>
<name>xtable</name>

<parent>
@@ -88,17 +88,17 @@
<dependencies>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-api</artifactId>
<artifactId>xtable-api</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-core</artifactId>
<artifactId>xtable-core</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-hudi-support-utils</artifactId>
<artifactId>xtable-hudi-support-utils</artifactId>
<version>${project.version}</version>
</dependency>

4 changes: 2 additions & 2 deletions website/docs/biglake-metastore.md
@@ -25,7 +25,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_key.json
```
5. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
6. Download the [BigLake Iceberg JAR](gs://spark-lib/biglake/biglake-catalog-iceberg1.2.0-0.1.0-with-dependencies.jar) locally.
Apache XTable™ (Incubating) requires the JAR to be present in the classpath.

@@ -117,7 +117,7 @@ catalogOptions:
From your terminal under the cloned Apache XTable™ (Incubating) directory, run the sync process using the below command.

```shell md title="shell"
java -cp xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:/path/to/downloaded/biglake-catalog-iceberg1.2.0-0.1.0-with-dependencies.jar org.apache.xtable.utilities.RunSync --datasetConfig my_config.yaml --icebergCatalogConfig catalog.yaml
java -cp xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar:/path/to/downloaded/biglake-catalog-iceberg1.2.0-0.1.0-with-dependencies.jar org.apache.xtable.utilities.RunSync --datasetConfig my_config.yaml --icebergCatalogConfig catalog.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions website/docs/bigquery.md
@@ -35,9 +35,9 @@ If you are not planning on using Iceberg, then you do not need to add these to y
:::

#### Steps to add additional configurations to the Hudi writers:
1. Add the extensions jar (`incubator-xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path
1. Add the extensions jar (`xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path
For example, if you're using the Hudi [quick-start guide](https://hudi.apache.org/docs/quick-start-guide#spark-shellsql)
for spark you can just add `--jars incubator-xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to the end of the command.
for spark you can just add `--jars xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to the end of the command.
2. Set the following configurations in your writer options:
```shell md title="shell"
hoodie.avro.write.support.class: org.apache.xtable.hudi.extensions.HoodieAvroWriteSupportWithFieldIds
2 changes: 1 addition & 1 deletion website/docs/fabric.md
@@ -98,7 +98,7 @@ An example hadoop configuration for authenticating to ADLS storage account is as
```

```shell md title="shell"
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml --hadoopConfig hadoop.xml
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml --hadoopConfig hadoop.xml
```

Running the above command will translate the table `people` in Iceberg or Hudi format to Delta Lake format. To validate
4 changes: 2 additions & 2 deletions website/docs/glue-catalog.md
@@ -19,7 +19,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
also set up access credentials by following the steps
[here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-quickstart.html)
3. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)

## Steps
### Running sync
@@ -84,7 +84,7 @@ Replace with appropriate values for `sourceFormat`, `tableBasePath` and `tableNa
From your terminal under the cloned xtable directory, run the sync process using the below command.

```shell md title="shell"
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions website/docs/hms.md
@@ -17,7 +17,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
or a distributed system like Amazon EMR, Google Cloud's Dataproc, Azure HDInsight etc.
This is a required step to register the table in HMS using a Spark client.
3. Clone the XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
4. This guide also assumes that you have configured the Hive Metastore locally or on EMR/Dataproc/HDInsight
and is already running.

@@ -88,7 +88,7 @@ datasets:

From your terminal under the cloned Apache XTable™ (Incubating) directory, run the sync process using the below command.
```shell md title="shell"
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions website/docs/how-to.md
@@ -24,7 +24,7 @@ history to enable proper point in time queries.
1. A compute instance where you can run Apache Spark. This can be your local machine, docker,
or a distributed service like Amazon EMR, Google Cloud's Dataproc, Azure HDInsight etc
2. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
3. Optional: Setup access to write to and/or read from distributed storage services like:
* Amazon S3 by following the steps
[here](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) to install AWSCLIv2
@@ -351,7 +351,7 @@ Authentication for GCP requires service account credentials to be exported. i.e.
In your terminal under the cloned Apache XTable™ (Incubating) directory, run the below command.

```shell md title="shell"
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

**Optional:**
4 changes: 2 additions & 2 deletions website/docs/unity-catalog.md
@@ -17,7 +17,7 @@ This document walks through the steps to register an Apache XTable™ (Incubatin
3. Create a Unity Catalog metastore in Databricks as outlined [here](https://docs.gcp.databricks.com/data-governance/unity-catalog/create-metastore.html#create-a-unity-catalog-metastore).
4. Create an external location in Databricks as outlined [here](https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-location.html).
5. Clone the Apache XTable™ (Incubating) [repository](https://github.com/apache/incubator-xtable) and create the
`incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)
`xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` by following the steps on the [Installation page](/docs/setup)

## Pre-requisites (for open-source Unity Catalog)
1. Source table(s) (Hudi/Iceberg) already written to external storage locations like S3/GCS/ADLS or local.
@@ -48,7 +48,7 @@ datasets:
From your terminal under the cloned Apache XTable™ (Incubating) directory, run the sync process using the below command.

```shell md title="shell"
java -jar xtable-utilities/target/incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
java -jar xtable-utilities/target/xtable-utilities-0.1.0-SNAPSHOT-bundled.jar --datasetConfig my_config.yaml
```

:::tip Note:
4 changes: 2 additions & 2 deletions xtable-api/pom.xml
@@ -19,12 +19,12 @@
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>incubator-xtable-api</artifactId>
<artifactId>xtable-api</artifactId>
<name>xtable-api</name>

<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable</artifactId>
<artifactId>xtable</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>

8 changes: 4 additions & 4 deletions xtable-core/pom.xml
@@ -19,23 +19,23 @@
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>incubator-xtable-core</artifactId>
<artifactId>xtable-core</artifactId>
<name>xtable-core</name>

<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable</artifactId>
<artifactId>xtable</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>

<dependencies>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-api</artifactId>
<artifactId>xtable-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-hudi-support-utils</artifactId>
<artifactId>xtable-hudi-support-utils</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
4 changes: 2 additions & 2 deletions xtable-hudi-support/pom.xml
@@ -21,11 +21,11 @@
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable</artifactId>
<artifactId>xtable</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>

<artifactId>incubator-xtable-hudi-support</artifactId>
<artifactId>xtable-hudi-support</artifactId>
<packaging>pom</packaging>


6 changes: 3 additions & 3 deletions xtable-hudi-support/xtable-hudi-support-extensions/README.md
@@ -21,8 +21,8 @@
### When should you use them?
The Hudi extensions provide the ability to add field IDs to the parquet schema when writing with Hudi. This is a requirement for some engines, like BigQuery and Snowflake, when reading an Iceberg table. If you are not planning on using Iceberg, then you do not need to add these to your Hudi writers.
### How do you use them?
1. Add the extensions jar (`incubator-xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path.
For example, if you're using the Hudi [quick-start guide](https://hudi.apache.org/docs/quick-start-guide#spark-shellsql) for spark you can just add `--jars incubator-xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to the end of the command.
1. Add the extensions jar (`xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path.
For example, if you're using the Hudi [quick-start guide](https://hudi.apache.org/docs/quick-start-guide#spark-shellsql) for spark you can just add `--jars xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar` to the end of the command.
2. Set the following configurations in your writer options:
`hoodie.avro.write.support.class: org.apache.xtable.hudi.extensions.HoodieAvroWriteSupportWithFieldIds`
`hoodie.client.init.callback.classes: org.apache.xtable.hudi.extensions.AddFieldIdsClientInitCallback`
@@ -33,7 +33,7 @@ For example, if you're using the Hudi [quick-start guide](https://hudi.apache.or
### When should you use them?
If you want to use XTable with Hudi [streaming ingestion](https://hudi.apache.org/docs/hoodie_streaming_ingestion) to sync each commit into other table formats.
### How do you use them?
1. Add the extensions jar (`incubator-xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path.
1. Add the extensions jar (`xtable-hudi-extensions-0.1.0-SNAPSHOT-bundled.jar`) to your class path.
2. Add `org.apache.xtable.hudi.sync.XTableSyncTool` to your list of sync classes
3. Set the following configurations based on your preferences:
`hoodie.xtable.formats.to.sync: "ICEBERG,DELTA"` (or simply use one format)
8 changes: 4 additions & 4 deletions xtable-hudi-support/xtable-hudi-support-extensions/pom.xml
@@ -19,22 +19,22 @@
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>incubator-xtable-hudi-support-extensions</artifactId>
<artifactId>xtable-hudi-support-extensions</artifactId>

<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-hudi-support</artifactId>
<artifactId>xtable-hudi-support</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>

<dependencies>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-hudi-support-utils</artifactId>
<artifactId>xtable-hudi-support-utils</artifactId>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-core</artifactId>
<artifactId>xtable-core</artifactId>
</dependency>

<!-- Logging API -->
4 changes: 2 additions & 2 deletions xtable-hudi-support/xtable-hudi-support-utils/pom.xml
@@ -19,11 +19,11 @@
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>incubator-xtable-hudi-support-utils</artifactId>
<artifactId>xtable-hudi-support-utils</artifactId>

<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-hudi-support</artifactId>
<artifactId>xtable-hudi-support</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>

8 changes: 4 additions & 4 deletions xtable-utilities/pom.xml
@@ -20,21 +20,21 @@
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<parent>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable</artifactId>
<artifactId>xtable</artifactId>
<version>0.1.0-SNAPSHOT</version>
</parent>
<modelVersion>4.0.0</modelVersion>

<artifactId>incubator-xtable-utilities</artifactId>
<artifactId>xtable-utilities</artifactId>

<dependencies>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-api</artifactId>
<artifactId>xtable-api</artifactId>
</dependency>
<dependency>
<groupId>org.apache.xtable</groupId>
<artifactId>incubator-xtable-core</artifactId>
<artifactId>xtable-core</artifactId>
</dependency>

<!-- command line arg parsing -->
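Every change in this PR follows the same rule: strip the `incubator-` prefix from each Maven `artifactId` (and from the jar file names derived from it), while the `groupId` `org.apache.xtable` and the version stay unchanged. A minimal sketch of that mapping — the module list below is taken from the diff above, not from any XTable API:

```python
OLD_PREFIX = "incubator-"

def renamed(artifact_id: str) -> str:
    """Return the artifactId with the incubator- prefix stripped, if present."""
    if artifact_id.startswith(OLD_PREFIX):
        return artifact_id[len(OLD_PREFIX):]
    return artifact_id

# artifactIds touched by this PR (copied from the hunks above)
for old in [
    "incubator-xtable",
    "incubator-xtable-api",
    "incubator-xtable-core",
    "incubator-xtable-utilities",
    "incubator-xtable-hudi-support",
    "incubator-xtable-hudi-support-utils",
    "incubator-xtable-hudi-support-extensions",
]:
    print(old, "->", renamed(old))
```

The same rename applies to the bundled jar paths in the docs, e.g. `incubator-xtable-utilities-0.1.0-SNAPSHOT-bundled.jar` becomes `xtable-utilities-0.1.0-SNAPSHOT-bundled.jar`.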