Pig HcatStorer fails with AWS Glue Data Catalog as metastore for Hive. #37
Comments
+1 |
+1, I get another error: |
I had the same issue on
In my case the solution was to manually specify the missing jar:
On other EMR version, Then to load data from a Glue table:
then check:
|
Thanks @moneroexamples! Yes, it did the trick. Instead of adding the jar as you describe, you can also use the REGISTER command in the script. It looks like this solution works only for EMR 5.x releases (Hive 2); it doesn't work for 6.x. Does anyone have any advice? |
@Oleks777 I just checked on
As a side note, on EMR 6.6,
giving error:
You can solve this by setting up
|
@moneroexamples many thanks! I spent a lot of time compiling the client for Hive 2, and it is good to know there is a compiled version available from AWS. |
I request everyone to either support the premise of the issue title or confirm whether HCatStorer partition writes work with the Glue Data Catalog as the Hive metastore. I completely understand the iterations above that make basic commands work with Pig on EMR. Thanks |
@Oleks777 Sadly I don't know how to configure EMR so that the extra paths/jars are loaded for |
Any update on this issue? We also encountered the same |
I'm getting the same error when storing data to ORC or Parquet tables with the latest EMR version, 6.12.0. It seems support for writing to Glue tables is broken. |
After a little bit of digging we can see the problem originates here:
at org.apache.hive.hcatalog.mapreduce.FileOutputCommitterContainer.cancelDelegationTokens(FileOutputCommitterContainer.java:1012)
If we look at the file:
We can see that cancelling the delegation tokens is the last thing that happens. We can also see how it's used:
All we really need to do is return null instead of throwing "operation not supported"; the delegation token cancel method should then work fine. |
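To illustrate the fix proposed above, here is a minimal, self-contained Java sketch. It does not use the real Hive or Glue client classes: `TokenClient`, `ThrowingClient`, `NullReturningClient`, and the simplified `cancelDelegationTokens` are hypothetical stand-ins, and the sketch assumes the committer only cancels a token when the lookup returns a non-null value. Under that assumption, a client that returns null lets the final step complete, while one that throws makes the job report failure even though the data has already been written.

```java
public class DelegationTokenSketch {

    /** Hypothetical stand-in for the metastore client calls used at the end of a commit. */
    interface TokenClient {
        String getTokenStrForm();
        void cancelDelegationToken(String token);
    }

    /** Mimics the reported behaviour: the token lookup is rejected as unsupported. */
    static class ThrowingClient implements TokenClient {
        public String getTokenStrForm() {
            throw new UnsupportedOperationException("getTokenStrForm is not supported");
        }
        public void cancelDelegationToken(String token) {
            throw new UnsupportedOperationException("cancelDelegationToken is not supported");
        }
    }

    /** Proposed change (assumption): report "no token" by returning null. */
    static class NullReturningClient implements TokenClient {
        public String getTokenStrForm() {
            return null;
        }
        public void cancelDelegationToken(String token) {
            throw new UnsupportedOperationException("cancelDelegationToken is not supported");
        }
    }

    /**
     * Simplified final commit step: cancel only when a token exists.
     * Returns true if a token was cancelled, false if there was nothing to cancel.
     */
    static boolean cancelDelegationTokens(TokenClient client) {
        String token = client.getTokenStrForm();
        if (token != null) {
            client.cancelDelegationToken(token);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // With the null-returning client the step finishes without an exception.
        System.out.println("cancelled=" + cancelDelegationTokens(new NullReturningClient()));

        // With the throwing client the step fails, even though output is already written.
        try {
            cancelDelegationTokens(new ThrowingClient());
        } catch (UnsupportedOperationException e) {
            System.out.println("commit step failed: " + e.getMessage());
        }
    }
}
```

Whether the real client can safely return null here depends on how other callers treat that value, so this should be read as a sketch of the idea rather than a patch.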
Use case
Running the example here: https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hcatalog-pig.html
Outcome: the Pig script fails when Glue is the Hive metastore. The script reports a failed status.
The files are written to S3, though.
Error logs