Commit
Showing 11 changed files with 296 additions and 182 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,103 +1,55 @@ | ||
# ngt-rs   [![Latest Version]][crates.io] [![Latest Doc]][docs.rs] | ||
# ngt-rs | ||
|
||
[Latest Version]: https://img.shields.io/crates/v/ngt.svg | ||
[crates.io]: https://crates.io/crates/ngt | ||
[Latest Doc]: https://docs.rs/ngt/badge.svg | ||
[docs.rs]: https://docs.rs/ngt | ||
[![crate]][crate-ngt] [![doc]][doc-ngt] | ||
|
||
[crate]: https://img.shields.io/crates/v/ngt.svg | ||
[crate-ngt]: https://crates.io/crates/ngt | ||
[doc]: https://docs.rs/ngt/badge.svg | ||
[doc-ngt]: https://docs.rs/ngt | ||
|
||
Rust wrappers for [NGT][], which provides high-speed approximate nearest neighbor | ||
searches against a large volume of data in high dimensional vector data space (several | ||
ten to several thousand dimensions). | ||
ten to several thousand dimensions). The vector data can be `f32`, `u8`, or [f16][]. | ||
|
||
This crate provides the following indexes: | ||
* `NgtIndex`: Graph and tree-based index[^1] | ||
* `QgIndex`: Quantized graph-based index[^2] | ||
* `QbgIndex`: Quantized blob graph-based index | ||
* [`NgtIndex`][index-ngt]: Graph and tree based index[^1] | ||
* [`QgIndex`][index-qg]: Quantized graph based index[^2] | ||
* [`QbgIndex`][index-qbg]: Quantized blob graph based index | ||
|
||
Both quantized indexes are available through the `quantized` Cargo feature. Note that | ||
they rely on `BLAS` and `LAPACK` which thus have to be installed locally. The CPU | ||
running the code must also support `AVX2` instructions. Furthermore, `QgIndex` | ||
performances can be [improved][qg-optim] by using the `qg_optim` Cargo feature. | ||
they rely on `BLAS` and `LAPACK` which thus have to be installed locally. Furthermore, | ||
`QgIndex` performances can be [improved][qg-optim] by using the `qg_optim` Cargo | ||
feature. | ||
|
||
The `NgtIndex` default implementation is an ANNG, it can be optimized[^3] or converted | ||
The `NgtIndex` default implementation is an ANNG. It can be optimized[^3] or converted | ||
to an ONNG through the [`optim`][ngt-optim] module. | ||
|
||
By default `ngt-rs` will be built dynamically, which requires `CMake` to build NGT. This | ||
means that you'll have to make the build artifact `libngt.so` available to your final | ||
binary (see an example in the [CI][ngt-ci]). | ||
|
||
However the `static` feature will build and link NGT statically. Note that `OpenMP` will | ||
also be linked statically. If the `quantized` feature is used, then `BLAS` and `LAPACK` | ||
libraries will also be linked statically. | ||
|
||
Finally, NGT's [shared memory][ngt-sharedmem] and [large dataset][ngt-largedata] | ||
features are available through the features `shared_mem` and `large_data` respectively. | ||
|
||
## Usage | ||
|
||
Defining the properties of a new index: | ||
|
||
```rust,ignore | ||
use ngt::{NgtProperties, NgtDistance}; | ||
// Defaut properties with vectors of dimension 3 | ||
let prop = NgtProperties::<f32>::dimension(3)?; | ||
// Or customize values (here are the defaults) | ||
let prop = NgtProperties::<f32>::dimension(3)? | ||
.creation_edge_size(10)? | ||
.search_edge_size(40)? | ||
.distance_type(NgtDistance::L2)?; | ||
``` | ||
binary (see an example in the [CI][ngt-ci]). However the `static` feature will build and | ||
link NGT statically. Note that `OpenMP` will also be linked statically. If the | ||
`quantized` feature is used, then `BLAS` and `LAPACK` libraries will also be linked | ||
statically. | ||
|
||
Creating/Opening an index and using it: | ||
NGT's [shared memory][ngt-sharedmem] and [large dataset][ngt-largedata] features are | ||
available through the Cargo features `shared_mem` and `large_data` respectively. | ||
|
||
```rust,ignore | ||
use ngt::{NgtIndex, NgtProperties, EPSILON}; | ||
[^1]: [Graph and tree based method explanation][ngt-desc] | ||
|
||
// Create a new index | ||
let prop = NgtProperties::dimension(3)?; | ||
let index: NgtIndex<f32> = NgtIndex::create("target/path/to/index/dir", prop)?; | ||
[^2]: [Quantized graph based method explanation][qg-desc] | ||
|
||
// Open an existing index | ||
let mut index = NgtIndex::open("target/path/to/index/dir")?; | ||
// Insert two vectors and get their id | ||
let vec1 = vec![1.0, 2.0, 3.0]; | ||
let vec2 = vec![4.0, 5.0, 6.0]; | ||
let id1 = index.insert(vec1)?; | ||
let id2 = index.insert(vec2)?; | ||
// Build the index in RAM (not yet persisted on disk) | ||
// This is required in order to be able to search vectors | ||
index.build(2)?; | ||
// Perform a vector search (with 1 result) | ||
let res = index.search(&vec![1.1, 2.1, 3.1], 1, EPSILON)?; | ||
assert_eq!(res[0].id, id1); | ||
assert_eq!(index.get_vec(id1)?, vec![1.0, 2.0, 3.0]); | ||
// Remove a vector and check that it is not present anymore | ||
index.remove(id1)?; | ||
let res = index.get_vec(id1); | ||
assert!(res.is_err()); | ||
// Verify that now our search result is different | ||
let res = index.search(&vec![1.1, 2.1, 3.1], 1, EPSILON)?; | ||
assert_eq!(res[0].id, id2); | ||
assert_eq!(index.get_vec(id2)?, vec![4.0, 5.0, 6.0]); | ||
// Persist index on disk | ||
index.persist()?; | ||
``` | ||
[^3]: [NGT index optimizations in Python][ngt-optim-py] | ||
|
||
[ngt]: https://github.com/yahoojapan/NGT | ||
[ngt-desc]: https://opensource.com/article/19/10/ngt-open-source-library | ||
[ngt-sharedmem]: https://github.com/yahoojapan/NGT#shared-memory-use | ||
[ngt-largedata]: https://github.com/yahoojapan/NGT#large-scale-data-use | ||
[ngt-ci]: https://github.com/lerouxrgd/ngt-rs/blob/master/.github/workflows/ci.yaml | ||
[ngt-optim]: https://docs.rs/ngt/latest/ngt/optim/index.html | ||
[ngt-optim-py]: https://github.com/yahoojapan/NGT/wiki/Optimization-Examples-Using-Python | ||
[qg-desc]: https://medium.com/@masajiro.iwasaki/fusion-of-graph-based-indexing-and-product-quantization-for-ann-search-7d1f0336d0d0 | ||
[qg-optim]: https://github.com/yahoojapan/NGT#build-parameters-1 | ||
|
||
[^1]: https://opensource.com/article/19/10/ngt-open-source-library | ||
[^2]: https://medium.com/@masajiro.iwasaki/fusion-of-graph-based-indexing-and-product-quantization-for-ann-search-7d1f0336d0d0 | ||
[^3]: https://github.com/yahoojapan/NGT/wiki/Optimization-Examples-Using-Python | ||
[f16]: https://docs.rs/half/latest/half/struct.f16.html | ||
[index-ngt]: https://docs.rs/ngt/latest/ngt/#usage | ||
[index-qg]: https://docs.rs/ngt/latest/ngt/qg/ | ||
[index-qbg]: https://docs.rs/ngt/latest/ngt/qgb/ |
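
The README in this diff describes NGT as providing approximate nearest neighbor search. For orientation, here is a minimal sketch of the exact search that such an index approximates: a brute-force linear scan under squared L2 distance. It uses only the Rust standard library and is not NGT's algorithm or the `ngt` crate's API; the function names `squared_l2` and `brute_force_knn` are hypothetical and chosen for illustration only.

```rust
/// Squared Euclidean (L2) distance between two vectors.
/// Assumes both slices have the same length; extra elements are ignored by `zip`.
fn squared_l2(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact k-nearest-neighbor search by linear scan (hypothetical helper,
/// not part of the `ngt` crate). Returns `(position in data, squared distance)`
/// pairs, closest first, with at most `k` entries.
fn brute_force_knn(data: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(usize, f32)> {
    // Score every stored vector against the query.
    let mut scored: Vec<(usize, f32)> = data
        .iter()
        .enumerate()
        .map(|(i, v)| (i, squared_l2(v, query)))
        .collect();
    // Sort by distance; `total_cmp` gives a total order on f32 values.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}
```

Calling `brute_force_knn` with the two vectors from the removed usage example (`[1.0, 2.0, 3.0]` and `[4.0, 5.0, 6.0]`) and the query `[1.1, 2.1, 3.1]` ranks the first vector closest, which is the same ordering the removed example asserted. The scan costs time linear in the number of stored vectors, which is the cost that graph and tree based indexes aim to avoid on large datasets.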