[BUG] NoClassDefFoundError when attempting to read from Vertica #559

Open
padraic-mcatee opened this issue Apr 26, 2024 · 0 comments
Labels: bug (Something isn't working)

padraic-mcatee commented Apr 26, 2024

Environment

  • Spark version: 3.5.0
  • Hadoop version: 3.3.6
  • Vertica version: 11
  • Vertica Spark Connector version: 3.3.5
  • Java version: 8
  • Additional Environment Information:
    • EMR 7.0.0

Problem Description

The read fails with a NoClassDefFoundError for org/apache/spark/sql/internal/SQLConf$LegacyBehaviorPolicy$. I see vertica-spark is built against Spark 3.3, so the class may have been removed or relocated in a later Spark release (we are running 3.5).
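For reference, a quick way to confirm the class is actually absent from the runtime classpath (a diagnostic sketch added here, using PySpark's internal `_jvm` handle, not a connector API):

```python
# Diagnostic sketch: probe the driver JVM for the class the connector needs.
# Assumes an active PySpark session; `_jvm` is an internal PySpark attribute.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

try:
    # Standard Java reflection, reached through py4j.
    spark.sparkContext._jvm.java.lang.Class.forName(
        "org.apache.spark.sql.internal.SQLConf$LegacyBehaviorPolicy$"
    )
    print("class found on driver classpath")
except Exception as err:  # py4j wraps the underlying ClassNotFoundException
    print("class missing:", err)
```

Note this only probes the driver; the stack trace shows the lookup failing on an executor, but on a uniform EMR image the two classpaths should match.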

  1. Steps to reproduce: read a Vertica table through the connector on Spark 3.5 (EMR 7.0.0) and write the result out (the failing call was DataFrameWriterV2.createOrReplace).
  2. Expected behaviour: the read/write completes.
  3. Actual behaviour: read tasks fail with java.lang.NoClassDefFoundError and the job aborts after four attempts per task.
  4. Error message/stack trace:
py4j.protocol.Py4JJavaError: An error occurred while calling o254.createOrReplace.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3099 in stage 0.0 failed 4 times, most recent failure: Lost task 3099.3 in stage 0.0 (TID 3281) ([2600:1f18:41ad:2102:1022:9d72:bf0:2463] executor 102): java.lang.NoClassDefFoundError: org/apache/spark/sql/internal/SQLConf$LegacyBehaviorPolicy$
	at com.vertica.spark.datasource.fs.HadoopFileStoreLayer.openReadParquetFile(FileStoreLayerInterface.scala:380)
	at com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.$anonfun$startPartitionRead$2(VerticaDistributedFilesystemReadPipe.scala:429)
	at scala.util.Either.flatMap(Either.scala:341)
	at com.vertica.spark.datasource.core.VerticaDistributedFilesystemReadPipe.startPartitionRead(VerticaDistributedFilesystemReadPipe.scala:416)
	at com.vertica.spark.datasource.core.DSReader.openRead(DSReader.scala:65)
	at com.vertica.spark.datasource.v2.VerticaBatchReader.<init>(VerticaDatasourceV2Read.scala:273)
	at com.vertica.spark.datasource.v2.VerticaReaderFactory.createReader(VerticaDatasourceV2Read.scala:261)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:84)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.hasNext(Unknown Source)
	at org.apache.spark.sql.execution.WholeStageCodegenEvaluatorFactory$WholeStageCodegenPartitionEvaluator$$anon$1.hasNext(WholeStageCodegenEvaluatorFactory.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$1(WriteToDataSourceV2Exec.scala:441)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1409)
	at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:486)
	at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:425)
	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:491)
	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:388)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:93)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:143)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:629)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:95)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:632)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.internal.SQLConf$LegacyBehaviorPolicy$
	... 32 more
  5. Code sample or example on how to reproduce the issue: see the sketch below.
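A minimal reproduction sketch, assuming the connector's documented option names; the host, credentials, table, staging URL, and target table below are placeholders:

```python
# Reproduction sketch with placeholder connection values.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("vertica-read-repro").getOrCreate()

df = (
    spark.read.format("com.vertica.spark.datasource.VerticaSource")
    .option("host", "vertica.example.com")         # placeholder
    .option("user", "dbadmin")                     # placeholder
    .option("password", "...")                     # placeholder
    .option("db", "mydb")                          # placeholder
    .option("table", "my_table")                   # placeholder
    .option("staging_fs_url", "s3a://bucket/tmp")  # connector staging area
    .load()
)

# The NoClassDefFoundError surfaces once tasks open the staged parquet files,
# e.g. during a V2 write mirroring the createOrReplace call in the trace
# (placeholder target table):
df.writeTo("my_catalog.default.my_table_copy").createOrReplace()
```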

Spark Connector Logs

padraic-mcatee added the bug label Apr 26, 2024