Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CORE] Refine maven Spark dependency #3625

Merged
merged 4 commits into from
Nov 8, 2023
Merged

Conversation

ulysses-you
Copy link
Contributor

@ulysses-you ulysses-you commented Nov 6, 2023

What changes were proposed in this pull request?

  • Remove the profile in shim and ut module, so when we add a new Spark version, we do not need change these two modules. Instead, add spark.shim.module and spark.test.module.
  • Remove spark32, spark33, spark34 properties, we only need spark.version and change it in each spark version profile
  • Remove the spark-hive-thriftserver dependency

How was this patch tested?

PASS CI

Copy link

github-actions bot commented Nov 6, 2023

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

github-actions bot commented Nov 6, 2023

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Nov 6, 2023

Run Gluten Clickhouse CI

1 similar comment
Copy link

github-actions bot commented Nov 6, 2023

Run Gluten Clickhouse CI

@@ -11,7 +11,7 @@

<artifactId>gluten-celeborn-package</artifactId>
<packaging>jar</packaging>
<name>Gluten Celeborn Common</name>
<name>Gluten Celeborn Package</name>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a typo fix

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -198,6 +198,9 @@
<ignoreClass>org.apache.spark.unused.UnusedStubClass</ignoreClass>
<!-- jdo -->
<ignoreClass>javax.jdo.*</ignoreClass>
<!-- javax -->
<ignoreClass>javax.transaction.*</ignoreClass>
<ignoreClass>javax.xml.*</ignoreClass>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it happy with spark-hive. It actually does nothing since we mark spark-hive as provided.

Copy link

github-actions bot commented Nov 6, 2023

Run Gluten Clickhouse CI

@ulysses-you
Copy link
Contributor Author

cc @zhouyuan @PHILO-HE @zzcclp thank you

PHILO-HE
PHILO-HE previously approved these changes Nov 7, 2023
pom.xml Outdated
@@ -237,6 +216,8 @@
<id>spark-ut</id>
<modules>
<module>gluten-ut</module>
<module>gluten-ut/common</module>
<module>gluten-ut/${spark.shim.version}</module>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your improvement!
Although same module names (spark32/spark33/spar34) are used, it may be better to define & use spark.test.module instead of spark.shim.version for readability since they are under gluten-ut/, not shims. Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added spark.test.module to decouple with spark.shim.module

pom.xml Outdated
<spark.version>${spark32.version}</spark.version>
<delta.version>${delta20.version}</delta.version>
<sparkbundle.version>3.2</sparkbundle.version>
<spark.shim.version>spark32</spark.shim.version>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, spark.shim.module is better.

Copy link

github-actions bot commented Nov 7, 2023

Run Gluten Clickhouse CI

@PHILO-HE
Copy link
Contributor

PHILO-HE commented Nov 8, 2023

Please rebase the code to fix red CI. Thanks!

Copy link

github-actions bot commented Nov 8, 2023

Run Gluten Clickhouse CI

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Please also update PR description to reflect the latest changes. Thanks!

@kerwin-zk
Copy link
Contributor

LGTM. Thanks!

@ulysses-you ulysses-you merged commit 427971d into apache:main Nov 8, 2023
18 checks passed
@ulysses-you ulysses-you deleted the mvn branch November 8, 2023 04:05
@zzcclp
Copy link
Contributor

zzcclp commented Nov 8, 2023

After merging this pr, do you meet a problem which can not modify the specified module when changing spark version in IDEA?
image

@zzcclp
Copy link
Contributor

zzcclp commented Nov 8, 2023

It seems that it can not remove the spark version profile for the gluten-ut module and shim module.

@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_3625_time.csv log/native_master_11_07_2023_e3eff1d8f_time.csv difference percentage
q1 34.08 34.38 0.293 100.86%
q2 24.70 25.03 0.323 101.31%
q3 38.03 38.14 0.111 100.29%
q4 37.93 37.57 -0.356 99.06%
q5 70.66 71.50 0.847 101.20%
q6 7.99 6.26 -1.727 78.38%
q7 84.56 82.22 -2.339 97.23%
q8 86.30 86.95 0.644 100.75%
q9 118.63 119.81 1.182 101.00%
q10 51.30 51.26 -0.034 99.93%
q11 20.00 19.73 -0.268 98.66%
q12 24.82 24.39 -0.427 98.28%
q13 48.57 50.30 1.729 103.56%
q14 20.58 17.67 -2.912 85.85%
q15 32.85 30.35 -2.499 92.39%
q16 16.25 16.20 -0.050 99.70%
q17 102.24 101.51 -0.738 99.28%
q18 148.94 148.26 -0.675 99.55%
q19 14.81 16.17 1.366 109.23%
q20 29.25 30.31 1.067 103.65%
q21 223.50 224.88 1.383 100.62%
q22 13.05 14.08 1.035 107.93%
total 1249.02 1246.98 -2.043 99.84%

@PHILO-HE
Copy link
Contributor

PHILO-HE commented Nov 8, 2023

After merging this pr, do you meet a problem which can not modify the specified module when changing spark version in IDEA? image

Yes, I also encounter this issue on my IDE. With this patch, it seems IDE cannot identify the shim/test modules for spark3.3 when using spark-3.3 profile.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants