-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(python): add plot namespace (which defers to hvplot) #13238
Conversation
203b007
to
fe06c3a
Compare
fe06c3a
to
2619c8a
Compare
@hoxbro @maximlt @jbednar @MarcSkovMadsen I think you're all involved in hvplot (apologies if not - your names are ones I've come across in related discussions)? If so, would be curious to hear your thoughts on this, whether it seems maintainable, and if there's any risks Polars should be aware of. hvplot has been friendly and welcoming to Polars so I hope this could take the collaboration further 🤝 (I'm aware it's xmas eve, so sorry for writing today - I don't expect a reply, I've just got a long train ride to visit family and so figured I'd try this out) |
For some people, Christmas comes early this year. |
But if plotting requires running some operations for example if using 'by' argument, making boxplot, histogram or using datashader, would you not want the plotting library to control when things are collected? (I'm not the expert) |
I'd suggest moving incrementally:
|
Yes.. that's a big no-no for me. |
This sounds great! I can't think of any risks for doing this; hvPlot is part of our daily workflows for Pandas and Dask DataFrames already, and I'm very pleased for Polars to be joining this list. I don't have experience with using Polars myself, but |
Hope so too! Looks like there's already an open issue holoviz/datashader#1199 Adding In the meantime, I think this would already be an improvement:
It's fairly common to hear user Polars users say they wish they could quickly add I'll mark as ready-for-review once CI is green |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The structure here looks great!
I only wonder where the limits are - are all data types supported, for example? What happens if we want to plot some dataframes with Decimal columns? What kind of error do I get if I try to plot an empty DataFrame / the column I am trying to plot is missing?
I would appreciate some additional tests to showcase this.
@@ -392,7 +392,7 @@ def __get__( # type: ignore[override] | |||
return self.fget( # type: ignore[misc] | |||
instance if isinstance(instance, cls) else cls | |||
) | |||
except AttributeError: | |||
except (AttributeError, ImportError): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adding this to avoid an importerror if building the docs without hvplot
installed. I don't think hvplot
should be required for building docs, and it's probably not worth showing plots in the docs themselves
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good - I am just a bit unsure about relying on the library internals like this. Feels like this would break easily when new hvplot versions are released. It would be nice to get some feedback from the hvplot people.
However, I think this version is probably good enough to be released and get some user feedback. It's probably stable enough on the short term and we can improve as we go along.
# private | ||
not name.startswith("_") | ||
# `.plot` not available on Expr | ||
and name != "plot" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be required - plot
is not an empty function so it's skipped. Everything runs fine when I comment this out - am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did you try building the docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm no, indeed it fails on building the docs. But the problem isn't that .plot
is not available on Expr. The problem is that it calls hvPlotTabularPolars
which imports polars, which doesn't exist while building the docs.
I'm not sure of the best solution here. I would ideally want to avoid hardcoding random stuff here as the functionality is already complex enough. But I'm not sure what the best solution would be 🤔
I think we should probably update sphinx_accessor
in some way to avoid this from happening. Looking into it as we speak...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, but my understanding is that expr_dispatch
is only needed for methods available on Expr
you're right about complexity, I just couldn't think of a solution which doesn't add any. maybe there's a more generic solution which would mean that this isn't necessary
cc @alexander-beedie in case you have ideas here
I have answered the questions and the changes LGTM. I would make one suggestion of adding a |
Thanks for your reviews! Sure, I've switched to using I'm not sure about adding a function which only raises for LazyFrame - any function which is defined is going to tab-complete for users, and if something only raises then I think I'd prefer to keep it out of the API (e.g. |
That seems like a fair point. My thought was more helping users. An example could be if they started with a DataFrame, did some work, and plotted, and after that, they changed it to a LazyFrame, they will get a non-obvious It was only a suggestion and I trust your judgment 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's just the one loose end around the interaction between Sphinx and expr_dispatch
. We can try to solve it now or keep it as a TODO. Otherwise, this looks good and I think it's a great addition!
Al right! Let's get this in. Thanks a lot @MarcoGorelli! Curious how this is received. |
Might be worth adding Polars in this section |
can you add this an issue to hvplot please? |
Done. Please verify this is what you suggest @alonme holoviz/hvplot#1244 |
Much simpler version of #13149
The idea to just defer to
hvplot
(much like howread_database_uri
, as far as I understand, defers toconnectorx
)Demo: (note: these plots are interactive, I took a screenshot with my mouse over one of them to show the popup that appears)
Doing
.collect
within a plotting function sounds potentially like a footgun, and a risk that people re-trigger the same expensive IO function over and over again? I'd suggest, at least to start, to only have.plot
onDataFrame
, so people are expected tocollect
before plotting