It is kind of cheating, but a naive solution is to parse the JSON with pandas json_normalize and then convert the resulting pandas DataFrame into a Spark DataFrame. The logic seems a bit too simple to justify a dedicated helper function, though.
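For reference, a minimal sketch of what that could look like. The function names `flatten_json_records` and `to_spark` are made up for illustration, and it assumes the data is small enough to fit in driver memory, since everything goes through pandas. Passing `sep="_"` is my own choice to avoid dots in the resulting column names.

```python
import json

import pandas as pd


def flatten_json_records(json_strings):
    """Parse JSON strings and flatten nested objects into columns (hypothetical helper)."""
    records = [json.loads(s) for s in json_strings]
    # Nested keys become "parent_child" columns; missing keys become NaN.
    return pd.json_normalize(records, sep="_")


def to_spark(spark, json_strings):
    """Convert the flattened pandas DataFrame to Spark (needs an existing SparkSession)."""
    return spark.createDataFrame(flatten_json_records(json_strings))


rows = [
    '{"id": 1, "user": {"name": "a", "age": 3}}',
    '{"id": 2, "user": {"name": "b"}}',
]
pdf = flatten_json_records(rows)
print(sorted(pdf.columns))
```

Only `to_spark` touches Spark, so the flattening step can be checked locally without a cluster.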
It is possible only if you know the final schema. Otherwise you need to infer the schema first somehow. And even with a known schema, the simplest solution is still to use UDFs. So my first question is: do we know the schema in such a case? If not, I would suggest starting with a function like infer_json_schema(col).
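To make the idea concrete, here is a rough sketch of the inference logic in plain Python. It works on an iterable of JSON strings rather than a Spark column, the type names and the nested-dict output format are my own assumptions, and conflicting types simply fall back to string. It is not an existing API.

```python
import json


def _infer(value):
    """Infer a type descriptor for a single parsed JSON value."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "bigint"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        elem = "null"
        for item in value:
            elem = _merge(elem, _infer(item))
        return {"array": elem}
    if isinstance(value, dict):
        return {"struct": {k: _infer(v) for k, v in value.items()}}
    raise TypeError(f"unsupported JSON value: {value!r}")


def _merge(a, b):
    """Merge two type descriptors into one that covers both."""
    if a == b or b == "null":
        return a
    if a == "null":
        return b
    if isinstance(a, dict) and isinstance(b, dict):
        if "struct" in a and "struct" in b:
            fields = dict(a["struct"])
            for key, typ in b["struct"].items():
                fields[key] = _merge(fields.get(key, "null"), typ)
            return {"struct": fields}
        if "array" in a and "array" in b:
            return {"array": _merge(a["array"], b["array"])}
    if isinstance(a, str) and isinstance(b, str) and {a, b} == {"bigint", "double"}:
        return "double"
    return "string"  # assumption: incompatible types degrade to string


def infer_json_schema(json_strings):
    """Hypothetical helper: union the schemas of all sampled JSON documents."""
    schema = "null"
    for s in json_strings:
        schema = _merge(schema, _infer(json.loads(s)))
    return schema


samples = ['{"id": 1, "tags": ["x"]}', '{"id": 2.5, "user": {"name": "a"}}']
schema = infer_json_schema(samples)
print(schema)
```

A real helper would still need to sample the column and translate the result into a Spark schema, which this sketch leaves out.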
Suggestion from this Reddit thread.