Bulk Insert and segment size #36342
-
Segment size is defined in milvus.yaml, not by bulk_insert().
bulk_insert() splits the data according to this value. Assume a numpy file is 10GB: bulk_insert() will split it into 10 segments of 1GB each. If you want an 18GB segment, set dataCoord.segment.maxSize to 18432 and try with a huge (18GB) numpy file.
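For example, the corresponding entry in milvus.yaml would look like this (the value is in MB, so 18432 corresponds to 18GB):

```yaml
dataCoord:
  segment:
    maxSize: 18432  # MB; 18432 MB = 18 GB per sealed segment
```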
-
Once you set the segment size to a larger value, compaction will help merge small segments into larger ones.
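If you want to trigger that merge manually instead of waiting for automatic compaction, here is a minimal sketch using pymilvus (the collection name is hypothetical):

```python
from pymilvus import connections, Collection

connections.connect(host="localhost", port="19530")

collection = Collection("features")  # hypothetical collection name

# Ask the server to merge small sealed segments into larger ones,
# then block until the compaction finishes.
collection.compact()
collection.wait_for_compaction_completed()
print(collection.get_compaction_state())
```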
-
Hi,
For well-justified reasons, I want my segment size to be about 18GB.
I use bulk_insert() to ingest a database of 500 million 96-dimensional float32 feature vectors (see the sketch after this post).
But it seems that the segment size is determined by the size of each numpy file sent during the bulk insert process (and each file is limited to 4GB).
How can I get a segment size of 18GB?
Should I use another method to ingest the data so that my segment size follows the value set in milvus.yaml?
Thanks!
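For reference, a sketch of the ingestion described above, assuming the newer pymilvus utility.do_bulk_insert() entry point (the collection name and file path are hypothetical):

```python
from pymilvus import connections, utility

connections.connect(host="localhost", port="19530")

# Submit one import task; each numpy file must stay under the 4GB limit,
# so a 500M x 96 float32 dataset (~192GB raw) is split across many tasks.
task_id = utility.do_bulk_insert(
    collection_name="features",   # hypothetical collection
    files=["vector.npy"],         # one .npy file per field
)

# Poll the import task until it reports a terminal state.
state = utility.get_bulk_insert_state(task_id=task_id)
print(state.state_name, state.row_count)
```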