Queries fail in spark when the column contains '$' character #48

ramkumar71 · 2021-03-25T04:48:23Z

Hello,

I used the 0.2.0 release of this library to query data on EMR with Spark. I created a Glue table with the necessary schema and location, and used AWS Glue as the catalog for the EMR cluster.

I used the following Ion data:

{
  domain_id:11111,
  item_id:"0061137456",
  version_time:2012-01-26T17:52:54.749Z, 
  marketplace_id:1,
  product:{
    merchant:[
      {
        value:"0061137456",
        $ims_state:{
          changed_at_version:6681,
          value:tom
        }
      }
    ]
  }
}

Running the query below produces the following error and stack trace (the full stack trace is in the attached file):

spark-sql> select * from table;
[test.txt](https://github.com/amzn/ion-hive-serde/files/6202107/test.txt)

21/03/22 19:27:19 ERROR Table: Unable to get field from serde: com.amazon.ionhiveserde.IonHiveSerDe
java.lang.IllegalArgumentException: Error: name expected at the position 102 of 'string:decimal(38,0):timestamp:struct<merchant:array<struct<value:string,$ims_state:struct<value:string,changed_at_version:decimal(38,0)>>>>' but '$' is found.
	at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.expect(TypeInfoUtils.java:354)
	at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseType(TypeInfoUtils.java:478)
	at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseType(TypeInfoUtils.java:447)
	at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseType(TypeInfoUtils.java:484)
	at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils$TypeInfoParser.parseTypeInfos(TypeInfoUtils.java:305)
	at org.apache.hadoop.hive.serde2.typeinfo.TypeInfoUtils.getTypeInfosFromTypeString(TypeInfoUtils.java:765)
	at com.amazon.ionhiveserde.IonHiveSerDe.readColumnTypes(IonHiveSerDe.java:190)
	at com.amazon.ionhiveserde.IonHiveSerDe.initialize(IonHiveSerDe.java:73)
	at org.apache.hadoop.hive.serde2.AbstractSerDe.initialize(AbstractSerDe.java:53)
	at org.apache.hadoop.hive.serde2.SerDeUtils.initializeSerDe(SerDeUtils.java:521)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:391)
	at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:276)
	at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:258)
	at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:605)
	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1$$anonfun$apply$7.apply(HiveClientImpl.scala:358)
	at org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$getTableOption$1$$anonfun$apply$7.apply(HiveClientImpl.scala:355)

Note: The same queries work as expected in Hive.
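
To make the failure easier to reason about, here is a minimal Python sketch of the kind of check that the error message suggests the type-string parser performs. It is not Hive's or Spark's actual code: the function name `first_invalid_name_char`, the set of allowed name characters, and the shortened type string are all assumptions made for illustration. The `extra_allowed` parameter models the guess that the Hive installation where the queries succeed accepts `$` in names, which this issue does not confirm.

```python
def first_invalid_name_char(type_string, extra_allowed=""):
    """Return (index, char) for the first character that is neither a
    name character nor a type-string delimiter, or None if there is none.

    Assumption: names may contain letters, digits, '_' and '.', plus any
    characters passed in extra_allowed. This is a simplified model only.
    """
    delimiters = set(":,<>()")
    for index, ch in enumerate(type_string):
        if ch.isalnum() or ch in "_." or ch in delimiters or ch in extra_allowed:
            continue
        return index, ch
    return None


# Shortened, hypothetical type string containing the same '$ims_state' field.
type_string = "struct<value:string,$ims_state:struct<value:string>>"

# Strict name rules: the '$' is reported as the offending character.
print(first_invalid_name_char(type_string))

# Lenient name rules (assumed): nothing is rejected.
print(first_invalid_name_char(type_string, extra_allowed="$"))
```

With the strict rules the sketch stops at the `$` that starts `$ims_state`, which mirrors the "name expected ... but '$' is found" message above. The position it reports applies only to the shortened string and is not meant to match the position in the real error.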
