Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

problems with requiring an abandoned nmslib for py>3.10 (doesn't build) #1985

Open
stas00 opened this issue Sep 13, 2024 · 18 comments
Open

problems with requiring an abandoned nmslib for py>3.10 (doesn't build) #1985

stas00 opened this issue Sep 13, 2024 · 18 comments

Comments

@stas00
Copy link

stas00 commented Sep 13, 2024

pyserini depends on nmslib which appears to be an abandoned project

The last released version it builds with is py3.10

To use py>3.10 one has to build from git source

You can see the discussion here nmslib/nmslib#538 with the workaround proposed here nmslib/nmslib#538 (comment) - the problem is that nobody is releasing a new version (since years!) so the fix isn't available. The workaround requires building from git source and since it has to build a binary this is very problematic as it adds a huge overhead to the CI run.

The fix to the issue was merged 3 years ago in nmslib/nmslib#488 but no new releases have been made or are planned to be made.

Not sure how you can help here, but I thought I'd ask. Our project depends on pyserini and this nmslib dependency breaks our CI if we want to switch to py>3.10.

Thank you.

@lintool
Copy link
Member

lintool commented Sep 14, 2024

hey @searchivarius - what do you think? Indeed, last official release was in 2021: https://pypi.org/project/nmslib/

@stas00 thanks for bringing this up - simple solution is just to rip out all nmslib code. I don't think there's actually that much.

Let's discuss.

@searchivarius
Copy link

searchivarius commented Sep 14, 2024

I have been trying to find some time to fix the release pipeline, but there have always been something new. The key problem is that Travis basically defaulted on us and other OSS projects (on paper they are supportive, but they don't really give us capacity, their build servers just time out) and I have not replaced it with something like GitHub actions.

So, I certainly plan to fix the failed pipeline, but I can't give any guarantees. Feel free to implement whatever simple solution you find necessary to implement.

@stas00
Copy link
Author

stas00 commented Sep 14, 2024

@searchivarius, IMHO, making a new source release is all that is needed.

The binary wheels are a bonus, but if we can build from source from the desired version that's 99% fantastic.

Thank you.

@searchivarius
Copy link

The source release won't install on Windows (building anything on Windows is a separate hell). It is also going to be some trouble to install on Linux machines, b/c each installation will require building from sources, thoughts?

@lintool
Copy link
Member

lintool commented Sep 15, 2024

Hi @stas00 I'm not opposed to (temporarily?) ripping out the nmslib dependency for now. I don't think it's actively being used, and we've been wanting to prune our requirements chain anyway (e.g., make certain features optional).

Would you be willing to send a PR to do that?

@stas00
Copy link
Author

stas00 commented Sep 16, 2024

The source release won't install on Windows (building anything on Windows is a separate hell).

Then for a time being only the older version will be available on Windows.

It is also going to be some trouble to install on Linux machines, b/c each installation will require building from sources, thoughts?

There is no problem with building from source as long as it's a released source with a version. Then it gets cached and re-used on the next rebuild. The package manager knows that the latest version is in.

The problem is when one builds from git+https - in that case the package manager is forced to rebuild it every time since it can't "anchor" on a specific version.

So I suggest as a first step to release just the new source version that builds with higher python versions as well as the previous python versions. And then seek out how to build binary wheels as a bonus.

@stas00
Copy link
Author

stas00 commented Sep 16, 2024

@lintool, let's see if @searchivarius can make a new source release - which would immediately solve this dependency issue, even if it requires building from source.

@lintool
Copy link
Member

lintool commented Sep 16, 2024

@stas00 sure, let me know!

@searchivarius
Copy link

Then for a time being only the older version will be available on Windows.
The pypi source tar ball is not designated to a specific environment, AFAIK.

@stas00
Copy link
Author

stas00 commented Sep 16, 2024

I'm not sure what to do with the reply above, @searchivarius

Currently all users have to build from git+https to be able to use nmslib with py>3.10 - all I'm asking is to make a new source release, which is exactly the same as building from git+https but it'll be versioned and thus enable versioning and proper package management.

What is your objection to doing that?

@searchivarius
Copy link

@stas00 to make sure we are on the same page: are you talking about just fixing the source or about fixing the source and uploading it to pypi?

@stas00
Copy link
Author

stas00 commented Sep 16, 2024

Making a new version and uploading to pypi, yes. And thank you!

@searchivarius
Copy link

searchivarius commented Sep 16, 2024 via email

@stas00
Copy link
Author

stas00 commented Sep 16, 2024

oh, I think I now understand what you mean. You're saying the current HEAD isn't installable on Windows, so if you make a new release, Windows users will no longer be able to install nmslib unless their explicitly specify nmslib==2.1.1, correct?

Then perhaps change the README.md's Installation section to say:

Windows users have to install pip install nmslib==2.1.1 and only up to py<3.11 is supported.

Would that work?

I think that's better than nobody being able to install it. We aren't taking a way from those Windows users to install your library - it's just going to be restricted for now to nmslib==2.1.1 until things will further improve.

@searchivarius
Copy link

@stas00 yes, though it's not so simple though with the README, b/c we have a lot of users. We need to ensure smooth installation.

That said, I was lucky enough to fix Appveyor very quickly (unlike some prior times when it took me days, lol) and I have some second thoughts about Travis. If these thoughts are correct, I will be able to make a release very soon.

@lintool
Copy link
Member

lintool commented Sep 26, 2024

@stas00 following up here... please close issue if you've resolved or found a workaround?

@nunoFcul
Copy link

I've been trying for two weeks to use pyserini, to replicate a paper's results for research purposes.

Tried many possible workarounds mentioned on forums and nothing seems to work for my specific situation.

I'm currently trying to replicate the results from "https://github.com/texttron/hyde".

I was able to install nmslib from the source with the workaround mentioned on nmslib/nmslib#538 but ran into errors when importing from pyserini.search related to numpy.

The message:

ImportError: cannot import name 'broadcast_to' from 'numpy.lib.stride_tricks' (/home/nuno-marques/dev_env/thesis/hyde/.venv/lib/python3.10/site-packages/numpy/lib/stride_tricks.py)

Any insights or help would be much appreciated.

@searchivarius
Copy link

Apologies for this @nunoFcul and @stas00 I have fixed some issues with the build (more was broken than I thought, but not everything). Will soon create a release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants