Null Pointer Exception when searching by vector Id for an Id that doesn't exist #310

Closed
bcrastnopol opened this issue Sep 25, 2021 · 3 comments
Labels
bug Something isn't working


bcrastnopol commented Sep 25, 2021

Describe the bug
I have a cosine-lsh mapped index, and when I run nearest neighbor searches (exact and approximate) by vector id for an id that doesn't exist, I get the following error:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "runtime_exception",
        "reason" : "Failed to retrieve vector at index [my-index] id [1] field [v]"
      }
    ],
    "type" : "runtime_exception",
    "reason" : "Failed to retrieve vector at index [my-index] id [1] field [v]",
    "caused_by" : {
      "type" : "null_pointer_exception",
      "reason" : "Cannot invoke \"java.util.Map.get(Object)\" because the return value of \"org.elasticsearch.action.get.GetResponse.getSourceAsMap()\" is null"
    }
  },
  "status" : 500
}

Expected behavior
Searching for a vector with an id that doesn't exist should not throw a 500 error.
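
To illustrate the expected behavior, here is a minimal Python sketch that treats a missing document as an explicit "not found" condition instead of letting a null source propagate. It works on the JSON body of Elasticsearch's get-document API, which reports `"found": false` and omits `_source` when the id doesn't exist. `VectorNotFoundError` and `extract_vector` are hypothetical names for illustration only; this is not the plugin's code.

```python
class VectorNotFoundError(LookupError):
    """Hypothetical error type for a missing document or vector field."""


def extract_vector(get_response: dict, field: str):
    """Return the stored value of `field` from a get-document response body.

    Raises VectorNotFoundError instead of failing on a missing `_source`.
    """
    index = get_response.get("_index")
    doc_id = get_response.get("_id")
    source = get_response.get("_source")

    # A missing document has "found": false and no "_source" key.
    if not get_response.get("found", False) or source is None:
        raise VectorNotFoundError(
            f"Document with id [{doc_id}] not found in index [{index}]"
        )

    # The document exists but does not contain the requested field.
    if field not in source:
        raise VectorNotFoundError(
            f"Document [{doc_id}] in index [{index}] has no field [{field}]"
        )

    return source[field]
```

A caller can catch `VectorNotFoundError` and report it as a client error rather than an internal server error.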

Environment (please complete the following information):

  • Elastiknn version: 7.13.1.0 and 7.14.1.0
  • OS: ubuntu linux

To Reproduce
Steps to reproduce the behavior:
1. Create an index

{
 "settings": {
   "index": {
     "number_of_shards": 1,          
     "elastiknn": true               
   }
 }
}
2. Add a mapping
{
   "properties": {
       "my_vec": {
           "type": "elastiknn_dense_float_vector",
           "elastiknn": {
               "dims": 100,                      
               "model": "lsh",                   
               "similarity": "cosine",             
               "L": 99,                            
               "k": 1                              
           }
       }
   }
}
3. Search for a vector that doesn't exist by Id
{
  "_source": [
    "vid"
  ],
  "size": 10,
  "query": {
    "elastiknn_nearest_neighbors": {
      "model": "lsh",
      "similarity": "cosine",
      "candidates": 100,
      "field": "v",
      "vec": {
        "index": "my-index",
        "field": "v",
        "id": "1"
      }
    }
  }
}
4. See error
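
On an affected version, one possible client-side workaround is to check that the document exists before issuing the by-id search. The sketch below uses only the Python standard library and Elasticsearch's `HEAD /<index>/_doc/<id>` existence check. The `doc_exists` helper and the `http://localhost:9200` base URL are assumptions for illustration; the `opener` parameter exists only so the function can be exercised without a running cluster.

```python
import urllib.error
import urllib.parse
import urllib.request


def doc_exists(base_url, index, doc_id, opener=urllib.request.urlopen):
    """Return True if the document exists, False on a 404 response.

    Hypothetical helper: sends HEAD {base_url}/{index}/_doc/{doc_id}.
    """
    url = f"{base_url}/{index}/_doc/{urllib.parse.quote(str(doc_id), safe='')}"
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx; 404 means "no such document".
        if err.code == 404:
            return False
        raise  # Any other status is a real failure and should surface.


# Example usage (assumes a cluster at localhost:9200):
# if doc_exists("http://localhost:9200", "my-index", "1"):
#     ...  # run the elastiknn_nearest_neighbors query from step 3
```

This costs an extra round trip per query and leaves a small window in which the document could be deleted between the check and the search, so it is a stopgap rather than a fix.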

Additional context
I just want to say that this plugin is incredible! I'm running a very large cluster (100m+ vectors) and I've gotten great results compared to some other nearest neighbor libraries. Plus, this has the added benefit of incremental index updates! Keep up the great work here!

@alexklibisz
Owner

Thanks! It's probably a simple fix. I'll look into it in the next couple days.

@alexklibisz
Owner

Hi @bcrastnopol, the issue should be resolved in the 7.14.1.1 release. I resolved it in #311 and added some regression tests. Feel free to re-open this issue if it's still a problem.

I also opened #312 to look at some related latent issues and/or inconsistencies in exception handling.

Thanks for the kind words about the plugin. If you can, consider submitting a PR to add your use-case to the readme (here). I don't use the plugin in my day-to-day work so it's always neat to hear how it's being used.

@bcrastnopol
Author

Thank you for fixing this - it works great!

We're still evaluating a couple of solutions but I'm advocating for this. If we end up using it I'll find out what I'm allowed to share publicly and submit a PR!
