Commit
Merge pull request #14 from lerouxrgd/ngt-2
NGT 2
Showing 18 changed files with 2,266 additions and 557 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
[package] | ||
name = "ngt" | ||
version = "0.4.5" | ||
version = "0.5.0" | ||
authors = ["Romain Leroux <[email protected]>"] | ||
edition = "2021" | ||
description = "Rust wrappers for NGT nearest neighbor search." | ||
|
@@ -11,17 +11,23 @@ license = "Apache-2.0" | |
readme = "README.md" | ||
|
||
[dependencies] | ||
ngt-sys = { path = "ngt-sys", version = "1.14.8-static" } | ||
num_enum = "0.5" | ||
openmp-sys = { version="1.2.3", features=["static"] } | ||
half = "2" | ||
ngt-sys = { path = "ngt-sys", version = "2.1.2" } | ||
num_enum = "0.7" | ||
scopeguard = "1" | ||
|
||
[dev-dependencies] | ||
rand = "0.8" | ||
rayon = "1" | ||
tempfile = "3" | ||
|
||
[features] | ||
default = [] | ||
static = ["ngt-sys/static"] | ||
shared_mem = ["ngt-sys/shared_mem"] | ||
large_data = ["ngt-sys/large_data"] | ||
quantized = ["ngt-sys/quantized"] | ||
qg_optim = ["quantized", "ngt-sys/qg_optim"] | ||
|
||
[package.metadata.docs.rs] | ||
features = ["quantized"] | ||
rustdoc-args = ["--cfg", "docsrs"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,79 +1,55 @@ | ||
# ngt-rs   [![Latest Version]][crates.io] [![Latest Doc]][docs.rs] | ||
# ngt-rs | ||
|
||
[Latest Version]: https://img.shields.io/crates/v/ngt.svg | ||
[crates.io]: https://crates.io/crates/ngt | ||
[Latest Doc]: https://docs.rs/ngt/badge.svg | ||
[docs.rs]: https://docs.rs/ngt | ||
[![crate]][crate-ngt] [![doc]][doc-ngt] | ||
|
||
Rust wrappers for [NGT][], which provides high-speed approximate nearest neighbor | ||
searches against a large volume of data. | ||
|
||
Building NGT requires `CMake`. By default `ngt-rs` will be built dynamically, which | ||
means that you'll need to make the build artifact `libngt.so` available to your final | ||
binary. You'll also need to have `OpenMP` installed on the system where it will run. If | ||
you want to build `ngt-rs` statically, then use the `static` Cargo feature, note that in | ||
this case `OpenMP` will be disabled when building NGT. | ||
|
||
Furthermore, NGT's shared memory and large dataset features are available through Cargo | ||
features `shared_mem` and `large_data` respectively. | ||
|
||
## Usage | ||
|
||
Defining the properties of a new index: | ||
[crate]: https://img.shields.io/crates/v/ngt.svg | ||
[crate-ngt]: https://crates.io/crates/ngt | ||
[doc]: https://docs.rs/ngt/badge.svg | ||
[doc-ngt]: https://docs.rs/ngt | ||
|
||
```rust | ||
use ngt::{Properties, DistanceType, ObjectType}; | ||
|
||
// Defaut properties with vectors of dimension 3 | ||
let prop = Properties::dimension(3)?; | ||
|
||
// Or customize values (here are the defaults) | ||
let prop = Properties::dimension(3)? | ||
.creation_edge_size(10)? | ||
.search_edge_size(40)? | ||
.object_type(ObjectType::Float)? | ||
.distance_type(DistanceType::L2)?; | ||
``` | ||
|
||
Creating/Opening an index and using it: | ||
|
||
```rust | ||
use ngt::{Index, Properties, EPSILON}; | ||
Rust wrappers for [NGT][], which provides high-speed approximate nearest neighbor | ||
searches against a large volume of data in high dimensional vector data space (several | ||
ten to several thousand dimensions). The vector data can be `f32`, `u8`, or [f16][]. | ||
|
||
// Create a new index | ||
let prop = Properties::dimension(3)?; | ||
let index = Index::create("target/path/to/index/dir", prop)?; | ||
This crate provides the following indexes: | ||
* [`NgtIndex`][index-ngt]: Graph and tree based index[^1] | ||
* [`QgIndex`][index-qg]: Quantized graph based index[^2] | ||
* [`QbgIndex`][index-qbg]: Quantized blob graph based index | ||
|
||
// Open an existing index | ||
let mut index = Index::open("target/path/to/index/dir")?; | ||
Both quantized indexes are available through the `quantized` Cargo feature. Note that | ||
they rely on `BLAS` and `LAPACK` which thus have to be installed locally. Furthermore, | ||
`QgIndex` performances can be [improved][qg-optim] by using the `qg_optim` Cargo | ||
feature. | ||
|
||
// Insert two vectors and get their id | ||
let vec1 = vec![1.0, 2.0, 3.0]; | ||
let vec2 = vec![4.0, 5.0, 6.0]; | ||
let id1 = index.insert(vec1)?; | ||
let id2 = index.insert(vec2)?; | ||
The `NgtIndex` default implementation is an ANNG. It can be optimized[^3] or converted | ||
to an ONNG through the [`optim`][ngt-optim] module. | ||
|
||
// Actually build the index (not yet persisted on disk) | ||
// This is required in order to be able to search vectors | ||
index.build(2)?; | ||
By default `ngt-rs` will be built dynamically, which requires `CMake` to build NGT. This | ||
means that you'll have to make the build artifact `libngt.so` available to your final | ||
binary (see an example in the [CI][ngt-ci]). However the `static` feature will build and | ||
link NGT statically. Note that `OpenMP` will also be linked statically. If the | ||
`quantized` feature is used, then `BLAS` and `LAPACK` libraries will also be linked | ||
statically. | ||
|
||
// Perform a vector search (with 1 result) | ||
let res = index.search(&vec![1.1, 2.1, 3.1], 1, EPSILON)?; | ||
assert_eq!(res[0].id, id1); | ||
assert_eq!(index.get_vec(id1)?, vec![1.0, 2.0, 3.0]); | ||
NGT's [shared memory][ngt-sharedmem] and [large dataset][ngt-largedata] features are | ||
available through the Cargo features `shared_mem` and `large_data` respectively. | ||
|
||
// Remove a vector and check that it is not present anymore | ||
index.remove(id1)?; | ||
let res = index.get_vec(id1); | ||
assert!(matches!(res, Result::Err(_))); | ||
[^1]: [Graph and tree based method explanation][ngt-desc] | ||
|
||
// Verify that now our search result is different | ||
let res = index.search(&vec![1.1, 2.1, 3.1], 1, EPSILON)?; | ||
assert_eq!(res[0].id, id2); | ||
assert_eq!(index.get_vec(id2)?, vec![4.0, 5.0, 6.0]); | ||
[^2]: [Quantized graph based method explanation][qg-desc] | ||
|
||
// Persist index on disk | ||
index.persist()?; | ||
``` | ||
[^3]: [NGT index optimizations in Python][ngt-optim-py] | ||
|
||
[ngt]: https://github.com/yahoojapan/NGT | ||
[ngt-desc]: https://opensource.com/article/19/10/ngt-open-source-library | ||
[ngt-sharedmem]: https://github.com/yahoojapan/NGT#shared-memory-use | ||
[ngt-largedata]: https://github.com/yahoojapan/NGT#large-scale-data-use | ||
[ngt-ci]: https://github.com/lerouxrgd/ngt-rs/blob/master/.github/workflows/ci.yaml | ||
[ngt-optim]: https://docs.rs/ngt/latest/ngt/optim/index.html | ||
[ngt-optim-py]: https://github.com/yahoojapan/NGT/wiki/Optimization-Examples-Using-Python | ||
[qg-desc]: https://medium.com/@masajiro.iwasaki/fusion-of-graph-based-indexing-and-product-quantization-for-ann-search-7d1f0336d0d0 | ||
[qg-optim]: https://github.com/yahoojapan/NGT#build-parameters-1 | ||
[f16]: https://docs.rs/half/latest/half/struct.f16.html | ||
[index-ngt]: https://docs.rs/ngt/latest/ngt/#usage | ||
[index-qg]: https://docs.rs/ngt/latest/ngt/qg/ | ||
[index-qbg]: https://docs.rs/ngt/latest/ngt/qgb/ |
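
As a rough illustration of the Cargo features described in the README changes above, a downstream crate might opt into them as sketched here. The version and the feature names are taken from the manifests in this commit. The downstream manifest itself and the particular combination of features are assumptions made only for the example.

```toml
# Hypothetical downstream Cargo.toml fragment (not part of this commit).
[dependencies]
# `quantized` exposes QgIndex and QbgIndex (requires BLAS and LAPACK);
# `static` builds and links NGT statically instead of using libngt.so.
ngt = { version = "0.5.0", features = ["quantized", "static"] }
```

Leaving `features` out keeps the default dynamic build, in which case `libngt.so` has to be made available to the final binary, as the README notes.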
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
[package] | ||
name = "ngt-sys" | ||
version = "1.14.8-static" | ||
version = "2.1.2" | ||
authors = ["Romain Leroux <[email protected]>"] | ||
edition = "2021" | ||
links = "ngt" | ||
|
@@ -18,4 +18,6 @@ cpp_build = { version = "0.5", optional = true } | |
[features] | ||
static = ["dep:cpp_build"] | ||
shared_mem = [] | ||
large_data = [] | ||
large_data = [] | ||
quantized = [] | ||
qg_optim = [] |
Submodule NGT updated: 87 files