Regression: Panic when running complex query on 0.20.17 backtrace included #15375

Closed
2 tasks done
kszlim opened this issue Mar 28, 2024 · 9 comments
Labels
A-panic (Area: code that results in panic exceptions); bug (Something isn't working); needs triage (Awaiting prioritization by a maintainer); python (Related to Python Polars)

Comments

@kszlim
Contributor

kszlim commented Mar 28, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

Can't share my code unfortunately

Log output

thread 'python' panicked at crates/polars-plan/src/logical_plan/optimizer/predicate_pushdown/mod.rs:357:54:
called `Option::unwrap()` on a `None` value
stack backtrace:
   0:     0x7f0f683ba4d8 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9a7c8ebfe0e5a9ea
   1:     0x7f0f65e3177b - core::fmt::write::hafeb62294a20d279
   2:     0x7f0f6838897e - std::io::Write::write_fmt::h83162bfc67f19d63
   3:     0x7f0f683bf6d9 - std::sys_common::backtrace::print::he94870e87e3768e8
   4:     0x7f0f683beff9 - std::panicking::default_hook::{{closure}}::hd193b85d1d659c30
   5:     0x7f0f683c00e6 - std::panicking::rust_panic_with_hook::ha4c10d4e371025d4
   6:     0x7f0f683bfa2c - std::panicking::begin_panic_handler::{{closure}}::h8d8b80cfb6d2af9a
   7:     0x7f0f683bf9b9 - std::sys_common::backtrace::__rust_end_short_backtrace::hedf50fb5defff019
   8:     0x7f0f683bf9a6 - rust_begin_unwind
   9:     0x7f0f64f82be5 - core::panicking::panic_fmt::hde3e2c796a3a5416
  10:     0x7f0f64f82cc0 - core::panicking::panic::h150613825d1ad83d
  11:     0x7f0f64f83018 - core::option::unwrap_failed::h83da879f26f94880
  12:     0x7f0f67f89887 - polars_plan::logical_plan::optimizer::predicate_pushdown::PredicatePushDown::push_down::{{closure}}::h803901bdf697e6c9
  13:     0x7f0f67f775cb - polars_plan::logical_plan::optimizer::predicate_pushdown::PredicatePushDown::push_down::hf47ef6c3f217f851
  14:     0x7f0f67f87f23 - polars_plan::logical_plan::optimizer::predicate_pushdown::PredicatePushDown::push_down::{{closure}}::h803901bdf697e6c9
  15:     0x7f0f67f775cb - polars_plan::logical_plan::optimizer::predicate_pushdown::PredicatePushDown::push_down::hf47ef6c3f217f851
  16:     0x7f0f67f8e271 - polars_plan::logical_plan::optimizer::predicate_pushdown::PredicatePushDown::pushdown_and_continue::h3f92b1bc19ce0409
  17:     0x7f0f67f83740 - polars_plan::logical_plan::optimizer::predicate_pushdown::PredicatePushDown::push_down::{{closure}}::h803901bdf697e6c9
  18:     0x7f0f67f775cb - polars_plan::logical_plan::optimizer::predicate_pushdown::PredicatePushDown::push_down::hf47ef6c3f217f851
  19:     0x7f0f67f7c41a - polars_plan::logical_plan::optimizer::cache_states::set_cache_states::h4a916d2445b89bff
  20:     0x7f0f66e31d0b - polars_lazy::frame::LazyFrame::optimize_with_scratch::h36aac1830dcf25ad
  21:     0x7f0f66fbfa14 - polars_lazy::frame::LazyFrame::prepare_collect::h5a8d7d9207e961e3
  22:     0x7f0f66fbf7eb - polars_lazy::frame::LazyFrame::collect::hba43825d72772acc
  23:     0x7f0f65c04512 - polars::lazyframe::_::<impl polars::lazyframe::PyLazyFrame>::__pymethod_collect__::h90e4526652b9c509
  24:     0x7f0f65640c87 - pyo3::impl_::trampoline::trampoline::hd22154ab9052e1b3
  25:     0x7f0f71a5fc51 - method_vectorcall_NOARGS
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Objects/descrobject.c:453
  26:     0x7f0f71a52d63 - _PyObject_VectorcallTstate
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/./Include/internal/pycore_call.h:92
  27:     0x7f0f71a52d63 - PyObject_Vectorcall
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Objects/call.c:299
  28:     0x7f0f719fbd59 - _PyEval_EvalFrameDefault
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/ceval.c:4769
  29:     0x7f0f71b59219 - _PyEval_EvalFrame
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/./Include/internal/pycore_ceval.h:73
  30:     0x7f0f71b59219 - _PyEval_Vector
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/ceval.c:6434
  31:     0x7f0f71b59219 - PyEval_EvalCode
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/ceval.c:1148
  32:     0x7f0f71ba3121 - run_eval_code_obj
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/pythonrun.c:1710
  33:     0x7f0f71ba3121 - run_mod
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/pythonrun.c:1731
  34:     0x7f0f71ba4a20 - pyrun_file
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/pythonrun.c:1626
  35:     0x7f0f71ba4a20 - _PyRun_SimpleFileObject
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/pythonrun.c:440
  36:     0x7f0f71ba4f4c - _PyRun_AnyFileObject
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Python/pythonrun.c:79
  37:     0x7f0f71bc8bae - pymain_run_file_obj
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Modules/main.c:360
  38:     0x7f0f71bc8bae - pymain_run_file
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Modules/main.c:379
  39:     0x7f0f71bc8bae - pymain_run_python
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Modules/main.c:601
  40:     0x7f0f71bc8bae - Py_RunMain
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Modules/main.c:680
  41:     0x7f0f71bc90a3 - pymain_main
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Modules/main.c:710
  42:     0x7f0f71bc90a3 - Py_BytesMain
                               at /tmp/python-build.20231205220011.25159/Python-3.11.7/Modules/main.c:734
  43:     0x7f0f70c0313a - __libc_start_main
  44:           0x40066a - _start
  45:                0x0 - <unknown>
Traceback (most recent call last):

Issue description

When running a complex query with several joins/filters/computations I get a panic

Expected behavior

Shouldn't panic

Installed versions

--------Version info---------
Polars:               0.20.17
Index type:           UInt32
Platform:             Linux-5.10.210-178.855.x86_64-x86_64-with-glibc2.26
Python:               3.11.7 (main, Dec  5 2023, 22:00:36) [GCC 7.3.1 20180712 (Red Hat 7.3.1-17)]

----Optional dependencies----
adbc_driver_manager:  <not installed>
cloudpickle:          3.0.0
connectorx:           <not installed>
deltalake:            <not installed>
fastexcel:            <not installed>
fsspec:               2024.2.0
gevent:               <not installed>
hvplot:               <not installed>
matplotlib:           3.8.3
nest_asyncio:         1.6.0
numpy:                1.26.4
openpyxl:             <not installed>
pandas:               2.2.1
pyarrow:              15.0.1
pydantic:             <not installed>
pyiceberg:            <not installed>
pyxlsb:               <not installed>
sqlalchemy:           <not installed>
xlsx2csv:             <not installed>
xlsxwriter:           <not installed>
kszlim added the bug, needs triage, and python labels Mar 28, 2024
@kszlim
Contributor Author

kszlim commented Mar 28, 2024

Looks like it's this line of code https://github.com/pola-rs/polars/blob/main/crates/polars-plan/src/logical_plan/optimizer/predicate_pushdown/mod.rs#L357C43-L357C44

Might be to do with hive partitioning

kszlim changed the title from "Panic when running complex query on 0.20.17 backtrace included" to "Regression: Panic when running complex query on 0.20.17 backtrace included" Mar 28, 2024
@kszlim
Contributor Author

kszlim commented Mar 28, 2024

Can confirm that turning off my hive partitioning prevents this from occurring.

@kszlim
Contributor Author

kszlim commented Mar 28, 2024

My complex query also fails with a missing column near the end of execution of the query (that previously worked in 0.20.16). I'll try to produce an MRE for that, though it might be hard to.

@kszlim
Contributor Author

kszlim commented Mar 28, 2024

Also found that the missing column regression still occurs with optimization turned entirely off.

@stinodego
Member

Thanks for the report, but this will be hard to fix without a reproducible example.

stinodego added the A-panic label Mar 28, 2024
@kszlim
Contributor Author

kszlim commented Mar 29, 2024

I tried to produce a repro for both issues, unfortunately couldn't get them working :(

I can pin down the exact commits that caused issues though if that helps?

@kszlim
Contributor Author

kszlim commented Mar 29, 2024

So the second bug (where I got a missing column) seems to come from this. I'm guessing at some point aliasing isn't carried through properly. I couldn't reproduce it with something small though... I changed my `group_by(new_name="old_name")` to `group_by("old_name")` and just settled with the old name, and it seemed to be a valid workaround.

The first bug with the panic seems to come from this. @ritchie46, I could only solve this by disabling hive partitioning.

@ritchie46
Member

This is fixed now right? After #15381

@kszlim
Contributor Author

kszlim commented Mar 29, 2024

Yep!

kszlim closed this as completed Mar 29, 2024
3 participants