I encountered an issue regarding the LIMIT return. #36322

Milvus's current behavior confuses users when a collection contains duplicate primary keys. This behavior exists for historical reasons and is not easy to change.

The key points are:

  1. insert() doesn't check for duplicate primary keys because the check is time-consuming, especially for huge datasets.
  2. upsert() can avoid duplicate primary keys, but it is also a heavy operation for huge datasets.
  3. search()/query() returns only one item per duplicated primary key, because a top-k result like this would not make sense:
No.1  ID = 1, distance=0.01
No.2  ID = 1, distance=0.03
No.3  ID = 2, distance=0.06
No.4  ID = 2, distance=0.1
No.5  ID = 1, distance=0.2
......
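
To make point 3 concrete, here is a minimal client-side sketch in plain Python. It is not Milvus's internal implementation and calls no Milvus API; the names `collapse_hits_by_pk` and `dedupe_rows_by_pk` and the `"id"` field are hypothetical, chosen only for illustration. The first function collapses a raw top-k list like the one above to one hit per primary key, which shows how a result can end up shorter than the requested limit. The second shows one possible way to drop duplicate keys from a batch before passing it to insert(), assuming the last row for each key should win.

```python
from typing import Any, Dict, List, Tuple

Hit = Tuple[int, float]  # (primary key, distance); hypothetical shape for illustration


def collapse_hits_by_pk(hits: List[Hit], limit: int) -> List[Hit]:
    """Keep the closest hit per primary key, then return at most `limit` hits."""
    best: Dict[int, float] = {}
    for pk, distance in hits:
        if pk not in best or distance < best[pk]:
            best[pk] = distance
    # Rank the surviving hits by ascending distance.
    ranked = sorted(best.items(), key=lambda item: item[1])
    return ranked[:limit]


def dedupe_rows_by_pk(rows: List[Dict[str, Any]], pk_field: str = "id") -> List[Dict[str, Any]]:
    """Keep the last row seen for each primary key (assumption: last write wins)."""
    latest: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        latest[row[pk_field]] = row
    return list(latest.values())


# The five raw hits from the example above contain only two distinct IDs.
raw_hits = [(1, 0.01), (1, 0.03), (2, 0.06), (2, 0.1), (1, 0.2)]
collapsed = collapse_hits_by_pk(raw_hits, limit=5)
print(collapsed)  # [(1, 0.01), (2, 0.06)]
```

With limit=5 the collapsed list holds only two entries, one per distinct ID. Deduplicating a batch on the client is only practical when the batch fits in memory and does not detect keys that already exist in the collection.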

Answer selected by adol001