Class Hnsw

Class Documentation

class n2::Hnsw

Public Functions

Hnsw()
Hnsw(int dim, std::string metric = "angular")

Creates an Hnsw index instance.

Return

A new Hnsw index.

Parameters
  • dim: Dimension of vectors.

  • metric: An optional parameter to choose a distance metric. ('angular' | 'L2' | 'dot') (default: 'angular').
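For orientation, the three metric names can be read against the usual textbook definitions. The sketch below is self-contained and does not call N2. The helper names (`Dot`, `L2Distance`, `AngularDistance`) are hypothetical, and treating angular distance as 1 minus cosine similarity is an assumption; N2's internal formulas may differ (for example in scaling or sign).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Inner product of two equal-length vectors.
float Dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// Euclidean (L2) distance between two equal-length vectors.
float L2Distance(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return std::sqrt(sum);
}

// Angular distance, taken here as 1 - cosine similarity (an assumption).
// Both vectors must be non-zero.
float AngularDistance(const std::vector<float>& a, const std::vector<float>& b) {
    const float norm_a = std::sqrt(Dot(a, a));
    const float norm_b = std::sqrt(Dot(b, b));
    assert(norm_a > 0.0f && norm_b > 0.0f);
    return 1.0f - Dot(a, b) / (norm_a * norm_b);
}
```

Under these definitions, orthogonal vectors have an angular distance of 1 and identical directions have an angular distance of 0, regardless of vector length.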

Hnsw(const Hnsw &other)
Hnsw(Hnsw &&other) noexcept
~Hnsw()
Hnsw &operator=(const Hnsw &other)
Hnsw &operator=(Hnsw &&other) noexcept
void AddData(const std::vector<float> &data)

Adds a vector to the Hnsw index.

Parameters
  • data: A vector with dimension dim.

void SetConfigs(const std::vector<std::pair<std::string, std::string>> &configs)

Sets configurations from key/value pairs.

To reset a configuration parameter to its default value, pass a negative value for it.
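Since SetConfigs() takes string pairs, numeric settings have to be converted to strings first. The sketch below shows one way to assemble such a list. It is self-contained and does not call N2; `ConfigList`, `AddIntConfig` and `MakeExampleConfigs` are hypothetical helpers, and the key names "M" and "efConstruction" are placeholders assumed for illustration, since this page does not list the accepted keys.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Same element type as the SetConfigs() parameter.
using ConfigList = std::vector<std::pair<std::string, std::string>>;

// Appends one key/value pair, converting the integer value to a string.
void AddIntConfig(ConfigList& configs, const std::string& key, int value) {
    assert(!key.empty());
    configs.emplace_back(key, std::to_string(value));
}

// Builds an example list. The key names are placeholders (assumptions),
// not keys confirmed by this documentation.
ConfigList MakeExampleConfigs() {
    ConfigList configs;
    AddIntConfig(configs, "M", 16);
    AddIntConfig(configs, "efConstruction", -1);  // negative: request the default
    return configs;
}
```

The resulting list could then be passed to SetConfigs() once the real key names are known.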

void Build(int m = -1, int max_m0 = -1, int ef_construction = -1, int n_threads = -1, float mult = -1, NeighborSelectingPolicy neighbor_selecting = NeighborSelectingPolicy::HEURISTIC, GraphPostProcessing graph_merging = GraphPostProcessing::SKIP, bool ensure_k = false)

Builds an HNSW graph with the given configurations.

For other available values of neighbor_selecting and graph_merging, refer to NeighborSelectingPolicy and GraphPostProcessing.

See

Fit(), SetConfigs()

Parameters
  • m: Maximum number of edges for nodes at level > 0 (default: 12).

  • max_m0: Maximum number of edges for nodes at level == 0 (default: 24).

  • ef_construction: Refer to the HNSW paper for its role (default: 150).

  • n_threads: Number of threads used to build the index.

  • mult: Level multiplier; keeping the default is recommended (default: 1/log(1.0*m)).

  • neighbor_selecting: Neighbor selecting policy (default: NeighborSelectingPolicy::HEURISTIC).

  • graph_merging: Graph merging heuristic (default: GraphPostProcessing::SKIP).
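The documented defaults can be summarized as a small resolution step: any negative argument is replaced by its default, and the default level multiplier is derived from m. The sketch below is self-contained and does not call N2. `BuildParams` and `ResolveBuildDefaults` are hypothetical names, and computing the default mult from the already-resolved m is an assumption about ordering.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the numeric build parameters.
struct BuildParams {
    int m;
    int max_m0;
    int ef_construction;
    float mult;
};

// Replaces negative arguments with the defaults documented above.
BuildParams ResolveBuildDefaults(int m, int max_m0, int ef_construction, float mult) {
    BuildParams p;
    p.m = m < 0 ? 12 : m;
    p.max_m0 = max_m0 < 0 ? 24 : max_m0;
    p.ef_construction = ef_construction < 0 ? 150 : ef_construction;
    if (mult < 0.0f) {
        assert(p.m > 1);  // log(m) must be positive for the default to make sense
        p.mult = 1.0f / std::log(1.0f * p.m);  // default: 1/log(1.0*m)
    } else {
        p.mult = mult;
    }
    return p;
}
```

With every argument left at -1, this yields m = 12, max_m0 = 24, ef_construction = 150 and a level multiplier of roughly 0.402.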

void Fit()

Builds an HNSW graph with the current configurations.

bool SaveModel(const std::string &fname) const

Saves the index to disk.

Parameters
  • fname: An index file name.

bool LoadModel(const std::string &fname, const bool use_mmap = true)

Loads an index from disk.

Parameters
  • fname: An index file name.

  • use_mmap: An optional parameter (default: true). If true, N2 loads the model through mmap.

void UnloadModel()

Unloads the loaded index file.

void SearchByVector(const std::vector<float> &qvec, size_t k, size_t ef_search, std::vector<int> &result)
void SearchByVector(const std::vector<float> &qvec, size_t k, size_t ef_search, std::vector<std::pair<int, float>> &result)

Searches for the k nearest items to a query item given as a vector.

Parameters
  • qvec: A query vector.

  • k: Number of nearest items to return.

  • ef_search: (default: 50 * k). If a negative value is passed, ef_search is set to the default value.

  • [out] result: k nearest items.
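The ef_search rule described above amounts to a one-line fallback. The sketch below is self-contained and does not call N2; `ResolveEfSearch` is a hypothetical helper. It takes a signed integer so that a negative request can be represented at all; how a negative value is passed through the size_t parameter in the signatures above is not covered here.

```cpp
#include <cassert>
#include <cstddef>

// Returns the ef_search to use for a query that asks for k items:
// the requested value, or 50 * k when the request is negative.
std::size_t ResolveEfSearch(long long requested_ef_search, std::size_t k) {
    assert(k > 0);
    if (requested_ef_search < 0) {
        return 50 * k;  // documented default
    }
    return static_cast<std::size_t>(requested_ef_search);
}
```

For k = 10, a negative request resolves to 500, while an explicit request of 200 is kept as 200.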

void SearchById(int id, size_t k, size_t ef_search, std::vector<int> &result)
void SearchById(int id, size_t k, size_t ef_search, std::vector<std::pair<int, float>> &result)

Searches for the k nearest items to a query item given by its id.

Parameters
  • id: A query id.

  • k: Number of nearest items to return.

  • ef_search: (default: 50 * k). If a negative value is passed, ef_search is set to the default value.

  • [out] result: k nearest items.
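The second overload fills result with (int, float) pairs rather than bare ids. The sketch below shows one way to split such a result into parallel arrays. It is self-contained and does not call N2; `SplitResult` is a hypothetical helper, and reading the float as the distance to the query is an assumption, since this page does not say what it holds.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Splits (id, value) pairs into two parallel vectors, preserving order.
// The float is assumed to be a distance to the query item.
void SplitResult(const std::vector<std::pair<int, float>>& result,
                 std::vector<int>& ids,
                 std::vector<float>& distances) {
    ids.clear();
    distances.clear();
    ids.reserve(result.size());
    distances.reserve(result.size());
    for (const auto& item : result) {
        ids.push_back(item.first);
        distances.push_back(item.second);
    }
    assert(ids.size() == result.size());
}
```

The input here would be the vector filled by the pair-returning overload of SearchById() or SearchByVector().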

void BatchSearchByVectors(const std::vector<std::vector<float>> &qvecs, size_t k, size_t ef_search, size_t n_threads, std::vector<std::vector<int>> &results)
void BatchSearchByVectors(const std::vector<std::vector<float>> &qvecs, size_t k, size_t ef_search, size_t n_threads, std::vector<std::vector<std::pair<int, float>>> &results)

Searches for the k nearest items to each query item given as a vector (multi-threaded batch search).

Parameters
  • qvecs: Query vectors.

  • k: Number of nearest items to return for each query.

  • ef_search: (default: 50 * k). If a negative value is passed, ef_search is set to the default value.

  • n_threads: Number of threads to use for search.

  • [out] results: Vector of the k nearest items for each input query item, in the order passed to parameter qvecs.
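One way to picture how a batch can be searched by several threads while keeping results in query order is to give each worker a contiguous range of query indices and have it write the answer for query i into slot i of the output. The sketch below only computes such ranges. It is self-contained, does not call N2, and is not a description of N2's actual scheduling; `SplitBatch` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Splits the index range [0, n_queries) into at most n_threads contiguous
// half-open ranges [begin, end). A worker that handles one range would write
// the answer for query i into results[i], so output order matches input order.
std::vector<std::pair<std::size_t, std::size_t>> SplitBatch(std::size_t n_queries,
                                                            std::size_t n_threads) {
    assert(n_threads > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    // Ceiling division: the largest number of queries any worker receives.
    const std::size_t chunk = (n_queries + n_threads - 1) / n_threads;
    for (std::size_t begin = 0; begin < n_queries; begin += chunk) {
        ranges.emplace_back(begin, std::min(begin + chunk, n_queries));
    }
    return ranges;
}
```

For 10 queries and 4 threads this produces the ranges [0, 3), [3, 6), [6, 9) and [9, 10). With more threads than queries, each query gets its own range and the surplus threads get none.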

void BatchSearchByIds(const std::vector<int> ids, size_t k, size_t ef_search, size_t n_threads, std::vector<std::vector<int>> &results)
void BatchSearchByIds(const std::vector<int> ids, size_t k, size_t ef_search, size_t n_threads, std::vector<std::vector<std::pair<int, float>>> &results)

Searches for the k nearest items to each query item given by its id (multi-threaded batch search).

Parameters
  • ids: Query ids.

  • k: Number of nearest items to return for each query.

  • ef_search: (default: 50 * k). If a negative value is passed, ef_search is set to the default value.

  • n_threads: Number of threads to use for search.

  • [out] results: Vector of the k nearest items for each input query item, in the order passed to parameter ids.

void PrintDegreeDist() const

Prints degree distributions.

void PrintConfigs() const

Prints index configurations.