
Move to numpy 2.0 #42

Open
CSchoel opened this issue Aug 6, 2024 · 8 comments · Fixed by #56
@CSchoel
Owner

CSchoel commented Aug 6, 2024

This is just a note to myself: I've seen some tests for the Lorenz system breaking under numpy >= 2.0. I'll need to investigate what has changed there and make sure we can support newer versions of numpy.

@CSchoel CSchoel self-assigned this Aug 6, 2024
@CSchoel CSchoel added this to the Release nolds 1.0 milestone Aug 10, 2024
@CSchoel
Owner Author

CSchoel commented Aug 10, 2024

We need to address this at some point. For now, we should make sure that we have regression tests that run all algorithms with settings that make them deterministic and check for exact result values. This should hopefully make us bulletproof against accidentally changing the output by updating a dependency.

@bramiozo

What errors do you get with numpy>=2?

I was surprised to see my numpy being downgraded during a poetry install :D.

@toni-neurosc

Hi, we're using Nolds in our project (PyNeuromodulation), and I was wondering what the issue with moving to NumPy 2 is. Nolds is the only package downgrading us to NumPy 1.26 right now. It's not a big deal, but I was wondering if there's anything I can do to help with the migration here, or which tests are failing at the moment.

@toni-neurosc

I have done a bit of digging into the issue and I have pinpointed the problem to the function datasets.lorenz_euler. It is essentially a casting problem between the dtypes float32 and float64, more specifically in how the intermediate results of the following calculation are typed:

    return np.array([
      sigma * (y - x), 
      rho * x - y - x * z,
      x * y - beta * z
    ], dtype="float32")

This is what I get in NumPy 1.26:

Input types:
 x=<class 'numpy.float32'>
 y=<class 'numpy.float32'>
 z=<class 'numpy.float32'>
 sigma=<class 'int'>
 rho=<class 'int'>
 beta=<class 'float'>
Values:
 x=1.0
 y=1.0
 z=1.0
 sigma=10
 rho=28
 beta=2.6666666666666665
Intermediate types:
 sigma * (y - x): <class 'numpy.float64'>
 rho * x - y - x * z: <class 'numpy.float64'>
 x * y - beta * z: <class 'numpy.float64'>
Result type: float32

But in NumPy 2.0:

Input types:
 x=<class 'numpy.float32'>
 y=<class 'numpy.float32'>
 z=<class 'numpy.float32'>
 sigma=<class 'int'>
 rho=<class 'int'>
 beta=<class 'float'>
Intermediate types:
 sigma * (y - x): <class 'numpy.float32'>
 rho * x - y - x * z: <class 'numpy.float32'>
 x * y - beta * z: <class 'numpy.float32'>
Result type: float32

It seems that NumPy 1.26 was casting either the inputs or the results of the intermediate calculations to np.float64, while in NumPy 2.0 the input precision is maintained through to the output. This change is documented in the NumPy 2.0 migration guide: https://numpy.org/devdocs/numpy_2_0_migration_guide.html#changes-to-numpy-data-type-promotion
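The promotion change can be seen in a minimal sketch (this snippet is my own illustration, not code from nolds): multiplying a NumPy float32 scalar by a plain Python float gives float64 under the old value-based rules, but keeps float32 under NEP 50.

```python
import numpy as np

x = np.float32(1.5)   # a NumPy float32 scalar, like x, y, z above
beta = 8.0 / 3.0      # a plain Python float, like the beta parameter

result = x * beta
# NumPy 1.x: the Python float promotes the result to float64.
# NumPy 2.0+ (NEP 50): Python scalars are "weak", so float32 is kept.
print(np.__version__, result.dtype)
```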

The floating-point error accumulates over the iterations, producing a different result in each version of NumPy.

I'm not sure whether this is a bug in the lorenz function or in the test's expected output. I would probably just calculate everything in float64 to maintain as much precision as possible, but then the test fails to match the expected result.
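To see how the accumulated rounding error plays out, here is a hypothetical stand-in for the Euler integration (lorenz_deriv and euler_trajectory are my own names, not the actual nolds internals): integrating the same Lorenz system in float32 and float64 produces trajectories that visibly drift apart, because the chaotic dynamics amplify the tiny per-step rounding differences.

```python
import numpy as np

def lorenz_deriv(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # Derivative of the Lorenz system; keeps the dtype of its input.
    x, y, z = state
    return np.array([sigma * (y - x),
                     rho * x - y - x * z,
                     x * y - beta * z], dtype=state.dtype)

def euler_trajectory(dtype, n_steps=2000, dt=0.01):
    # Plain explicit-Euler integration starting from (1, 1, 1).
    state = np.array([1.0, 1.0, 1.0], dtype=dtype)
    for _ in range(n_steps):
        state = (state + dt * lorenz_deriv(state)).astype(dtype)
    return state

final32 = euler_trajectory(np.float32)
final64 = euler_trajectory(np.float64)
# Identical start, same equations: only the working precision differs,
# yet the final states no longer agree.
print(final32, final64)
```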

@CSchoel
Owner Author

CSchoel commented Sep 30, 2024

Nice find! Thank you very much for putting in the work to dig through the code and the numpy changelog. 🙏 👍

I figured it would be something minor like that, since the test results are not far off from the expected value. To safeguard against this, I want to create regression tests in #50. I'll prioritize this issue to double-check that there aren't any other changes introduced with numpy 2.0.

I didn't plan to release another version between 0.6.0 and 1.0.0 (see https://github.com/CSchoel/nolds/milestone/1), but if the downgrade to numpy < 2.0 causes issues, I can try to fit in a 0.6.1 that makes nolds fully compatible with numpy 2.0.

@CSchoel
Owner Author

CSchoel commented Sep 30, 2024

Good news: After implementing the regression tests and checking them against separate versions of numpy and scikit-learn, I can confirm that none of the algorithms behave differently. It's only the code for the Lorenz system itself that seems to be affected, which makes sense, since a chaotic system is by definition sensitive to small changes in its parameters. 😄

I think I should be able to publish version 0.6.1 with relaxed version restrictions without issues.

@toni-neurosc

Hi @CSchoel, glad you figured out that the functionality isn't broken between NumPy versions. Our program doesn't actually break by staying on 1.26, but it makes certain optimizations harder, such as calling internal NumPy functions directly to skip validation checks (our data is already validated) and save time during real-time data processing, because those internal implementations have changed slightly between versions. Thank you for taking the time to fix this, and have a nice day!

@CSchoel CSchoel mentioned this issue Oct 1, 2024
@CSchoel
Owner Author

CSchoel commented Oct 1, 2024

I just released version 0.6.1 in #59. Please let me know if it works. 😄


3 participants