[FEATURE] DataFrame type fails pydantic validation when using Spark Connect / Serverless #62

Closed
maxim-mityutko opened this issue Sep 9, 2024 · 6 comments
Labels
enhancement New feature or request
Milestone

Comments

@maxim-mityutko
Contributor

Is your feature request related to a problem? Please describe.

Pydantic enforces strict types. In the current implementation, all Spark-related logic (readers, writers, transforms, integrations) expects the DataFrame class (pyspark.sql.DataFrame) as input or output. However, in Spark Connect, and consequently on Serverless compute, the DataFrame class is pyspark.sql.connect.DataFrame, which causes pydantic model validation to fail.

Describe the solution you'd like

The model should accept both native and Connect DataFrames as valid input / output.
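
A minimal sketch of what accepting both types could look like, assuming pydantic v2. The two DataFrame classes below are hypothetical stand-ins so the snippet runs without a Spark installation, and `Step` is an illustrative model name, not a class from this project. This is not necessarily how the fix was eventually implemented.

```python
from typing import Union

from pydantic import BaseModel, ConfigDict, ValidationError


# Hypothetical stand-ins for the two Spark classes. In a real project these
# would presumably be imported from pyspark (native and Connect variants).
class NativeDataFrame:
    """Stand-in for the classic pyspark.sql.DataFrame."""


class ConnectDataFrame:
    """Stand-in for the Spark Connect DataFrame."""


# One alias that every reader, writer and transform can annotate with.
DataFrame = Union[NativeDataFrame, ConnectDataFrame]


class Step(BaseModel):
    """Illustrative model: accepts either DataFrame flavour."""

    # Non-pydantic classes are validated with isinstance() checks.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    df: DataFrame


# Both flavours pass validation.
native_step = Step(df=NativeDataFrame())
connect_step = Step(df=ConnectDataFrame())

# Anything else is still rejected.
try:
    Step(df="not a dataframe")
    rejected = False
except ValidationError:
    rejected = True
```

Keeping the union behind a single alias means the individual models do not need to know which Spark flavour is active.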

Describe alternatives you've considered

...

Additional context

...

@mikita-sakalouski
Contributor

We will have to integrate the changes you are making into this branch.

@mikita-sakalouski
Contributor

@maxim-mityutko
Contributor Author

@mikita-sakalouski to be honest, I would prefer that we treat this as a separate issue and merge it to main as part of #59.
The reason is that this small change only addresses the strict validation of the DataFrame types, nothing else. It unlocks the next steps for the serverless compute PoC.
The feature you are referring to is much bigger and will probably require more testing and time.

@dannymeijer added this to the 0.9.0 milestone on Oct 2, 2024
@mikita-sakalouski
Contributor

@maxim-mityutko It looks like PR #63 covers the requested functionality. Please check and close the request.

@dannymeijer
Member

Solved with #63

Projects
Status: Done

3 participants