HNSW failing to find the nearest neighbor #26
Comments
Have you tried varying the values for …? How does classic kNN work on the dataset? HNSW has a compatibility requirement in its tests, i.e. the order should match.
I modified the comparison test case found in the tests folder to use my data set. The k=1 tests fail, but the k=10 and k=20 tests pass, meaning that on average, 90% of the actual nearest neighbors are found. However, when I add a condition to check that at least one actual NN is found for every query, it fails. The same condition passes in the original test using randomized data. I guess the challenge in my data set is that the points are not uniformly distributed and there are large empty areas. Here's the test case
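For illustration only (this is not the test case referred to above, and it does not use the library's API): the two conditions, average recall and "at least one actual NN per query", can be sketched in plain Python against a brute-force baseline. The names `brute_force_knn` and `recall_report` and the toy data are hypothetical; `approx_results` stands in for whatever index lists the HNSW search returned.

```python
import math

def brute_force_knn(points, query, k):
    """Exact k nearest neighbors by scanning every point (indices, closest first)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]

def recall_report(points, queries, approx_results, k):
    """Compare approximate results (lists of point indices) with the exact ones.

    Returns (mean_recall, min_recall): the average share of true neighbors
    found over all queries, and the share for the worst single query.
    """
    recalls = []
    for query, approx in zip(queries, approx_results):
        exact = set(brute_force_knn(points, query, k))
        recalls.append(len(exact & set(approx[:k])) / k)
    return sum(recalls) / len(recalls), min(recalls)

# Hypothetical toy data: two clusters separated by a large empty area.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
queries = [(0.02, 0.01), (9.0, 9.5)]
# Pretend the index answered the second query from the wrong cluster.
approx_results = [[0, 1], [0, 1]]

mean_recall, min_recall = recall_report(points, queries, approx_results, k=2)
print(mean_recall, min_recall)
```

A `min_recall` of zero means at least one query got none of its true neighbors, even if `mean_recall` looks acceptable.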
I replaced |
I'm trying to replace my regular grid search with HNSW, but in some cases HNSW fails rather spectacularly to find the nearest neighbor. I understand that it's an approximate method, but it's a bit underwhelming when I ask for 50 nearest neighbors and the closest of them is 3x farther away than the actual nearest neighbor. What I'd really want is to get, say, 10 neighbors and be fairly certain that at least ~5 of the actual nearest neighbors are included.
Am I doing something wrong or is HNSW the wrong method for my need?
Below is a minimal example with my data (data file attached, they are nodes of a surface mesh). I've tried playing with the parameters of HierarchicalNSW but they don't seem to have much effect.
Output:
points.csv
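The "3x farther" gap described above could be measured per query with a small helper like the one below. This is a sketch under assumptions: `distance_ratio` is a hypothetical name, the points are plain coordinate tuples, and `returned_indices` stands in for the indices an approximate search gave back.

```python
import math

def distance_ratio(points, query, returned_indices):
    """Distance of the closest returned point divided by the true nearest distance.

    A value of 1.0 means the true nearest neighbor (or a point equally close)
    was among the returned ones; 3.0 means the best result is 3x farther away.
    """
    true_nearest = min(math.dist(p, query) for p in points)
    best_returned = min(math.dist(points[i], query) for i in returned_indices)
    if true_nearest == 0.0:
        # The query coincides with a data point; avoid dividing by zero.
        return 1.0 if best_returned == 0.0 else math.inf
    return best_returned / true_nearest

# Hypothetical toy data on a line; the search missed the point at x = 1.
points = [(1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
ratio = distance_ratio(points, (0.0, 0.0), returned_indices=[1, 2])
print(ratio)
```

Collecting this ratio over all queries shows whether the misses are rare outliers or concentrated near the empty regions of the data.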