-
-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(python): Respect dtype
and strict
in pl.Series
's constructor for pyarrow arrays, numpy arrays, and pyarrow-backed pandas
#15962
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #15962 +/- ##
=======================================
Coverage 80.97% 80.98%
=======================================
Files 1386 1386
Lines 178479 178479
Branches 2877 2877
=======================================
+ Hits 144530 144539 +9
+ Misses 33458 33448 -10
- Partials 491 492 +1 ☔ View full report in Codecov by Sentry. |
Thanks, @MarcoGorelli, for the quick review. I believe I now understand what you meant and have made the necessary changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating! I think this is very close 💪
I'd just suggest expanding the tests, to include:
- a test where you convert from a pyarrow array with
strict=False
- a test where you convert from a pandas series with
strict=False
I believe the change has been made |
def test_series_from_pandas_with_strict() -> None: | ||
s = pl.Series(pd.Series([1, 2.5, 3]), strict=False) | ||
assert_series_equal(s, pl.Series([1.0, 2.5, 3.0], dtype=pl.Float64)) | ||
|
||
|
||
def test_series_from_pyarrow_with_strict() -> None: | ||
s = pl.Series(pa.array([1, 2.5, 3]), strict=False) | ||
assert_series_equal(s, pl.Series([1.0, 2.5, 3.0], dtype=pl.Float64)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be missing something, but I don't see what these tests add to what you already have
Rather than adding these, how about testing converting from a numpy array with [-1, 2, 3]
to a Polars Series with dtype pl.UInt8
, with strict=False and strict=True? I think this will cover the behaviour change you mentioned in https://github.com/pola-rs/polars/pull/15962/files#r1590912969
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two tests added earlier were because I felt there needed to be a simple test with strict=False
without any other parameters. However, I think you are right, these two tests are covered by others.
The behavior change mentioned in #15962 (comment) has been covered by the test using strict=False
like below. It was precisely because of your previous suggestion to modify the test that I discovered this potential problem.
s = pl.Series("foo", pd.Series([-1, 2, 3], dtype="Int8"), pl.UInt8, strict=False)
assert s.to_list() == [None, 2, 3]
assert s.dtype == pl.UInt8
Nevertheless, I will add a test with a numpy array and the strict
parameter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks @luke396 !
pl.Series
's dtype for pyarrow pathdtype
and strict
in pl.Series
's constructor for pyarrow arrays, numpy arrays, and pyarrow-backed pandas
Thanks, @MarcoGorelli, for your patience and selfless help. This PR wouldn't have been completed without you! |
…r for pyarrow arrays, numpy arrays, and pyarrow-backed pandas (pola-rs#15962)
xref #15948 (comment) and a follow-up PR of #15948
Fix wrong
dtype
when createpl.Series
frompd.Series
.when
before
after