
Spark application run in cluster mode not able to read and write from hbase on hdp 3.1 cluster #342

Open · jainshashank24 opened this issue Dec 2, 2020 · 2 comments

@jainshashank24 commented Dec 2, 2020

A Spark application run in cluster mode is not able to read from or write to HBase on an HDP 3.1 cluster. The job launches in YARN, and after some time the driver shows the following error:

2020-12-01 09:40:23 [DEBUG] [org.apache.hadoop.hbase.client.ConnectionImplementation:919] - locateRegionInMeta parentTable='hbase:meta', attempt=0 of 36 failed; retrying after sleep of 36
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=36, exceptions:
Tue Dec 01 09:40:23 UTC 2020, null, java.net.SocketTimeoutException: callTimeout=60000, callDuration=60528: Call to hdp-slv-01.hadoop-store.back.christine.info/10.181.66.21:16020 failed on local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=0, waitTime=60227, rpcTimeout=59994 row 't,,99999999999999' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hdp-slv-01.hadoop-store.back.christine.info,16020,1606739459240, seqNum=-1

    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:298)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:242)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:268)
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:436)
    at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:311)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:596)
    at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(ConnectionImplementation.java:852)
    at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(ConnectionImplementation.java:755)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:325)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:153)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:58)
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:192)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:268)
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:436)
    at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:311)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:596)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD$$anon$2.hasNext(HBaseTableScan.scala:187)
    at scala.collection.Iterator$JoinIterator.hasNext(Iterator.scala:216)
    at scala.collection.Iterator$ConcatIterator.hasNext(Iterator.scala:192)
    at org.apache.spark.sql.execution.datasources.hbase.HBaseTableScanRDD$$anon$3.hasNext(HBaseTableScan.scala:215)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(generated.java:28)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:836)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:836)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:49)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

Can someone help in identifying the issue?
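
For context, the failing read goes through the standard SHC data source, so the read path looks roughly like the sketch below. The actual job code and catalog are not shown in this report: the table name "t" is inferred from the hbase:meta lookup in the trace, and the column layout is an assumption.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.datasources.hbase.HBaseTableCatalog

object HBaseReadSketch {
  // Hypothetical SHC catalog. The table name "t" matches the hbase:meta
  // lookup in the trace above; family and column names are assumed.
  val catalog: String =
    """{
      |  "table":{"namespace":"default", "name":"t"},
      |  "rowkey":"key",
      |  "columns":{
      |    "key":{"cf":"rowkey", "col":"key", "type":"string"},
      |    "value":{"cf":"cf1", "col":"value", "type":"string"}
      |  }
      |}""".stripMargin

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shc-read-sketch").getOrCreate()

    // This read is what ends up in HBaseTableScanRDD in the stack trace.
    val df = spark.read
      .options(Map(HBaseTableCatalog.tableCatalog -> catalog))
      .format("org.apache.spark.sql.execution.datasources.hbase")
      .load()

    df.show()
    spark.stop()
  }
}
```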

@beejay19 commented Dec 3, 2020

@jainshashank24: I'm facing a similar issue. Did it work in client mode? Which shc-core version are you using, and what dependencies did you add?

Thanks

@jainshashank24 (Author) commented
Hi @beejay19, yes, it did work in client mode, but in cluster mode it does not. I used this version of the jar: shc-core-1.1.0.3.1.0.0-78.jar.
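
One common difference between the two modes is where hbase-site.xml is visible: in client mode the driver runs on the edge node and picks up the local HBase client config, while in cluster mode the driver runs inside a YARN container and may not see it. Below is a hedged spark-submit sketch that ships the config and the connector jar explicitly; the main class, app jar, and paths are assumptions, and the shc-core version is the one from this thread.

```bash
# Main class, app jar, and paths are assumptions for illustration.
# --files ships hbase-site.xml into the YARN container working directory,
# which is on the classpath of the driver and executors in cluster mode.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --class com.example.HBaseReadSketch \
  --jars /path/to/shc-core-1.1.0.3.1.0.0-78.jar \
  --files /etc/hbase/conf/hbase-site.xml \
  my-app.jar
```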
