
Vector Database SDK πŸš€

VecML provides an easy-to-use, fast, and accurate vector database SDK toolkit.


πŸ›  Creating a Fluffy Interface

To use all features of the SDK, include the following header:

#include "fluffy_interface.h"

First, we create a Fluffy Interface instance (fluffyInterface).
A Fluffy Interface manages one data collection and all the indices built for that collection.

std::string fluffy_license_path = "license.txt";
std::string base_path = "path/to/your/project";
fluffy::FluffyInterface fluffyInterface(base_path, fluffy_license_path);

A fluffy interface is created at base_path. Data and indices will be stored in this directory.


πŸ—‚ Adding Vectors to the Interface

We can now add a data vector to the data collection. Each vector is stored in a fluffy::Vector object with the following components:

  • string_id (required, str): a unique string identifier of the vector.
  • vector (required): the vector data (the supported format will be introduced below).
  • attributes (optional, str): extra properties of the vector, e.g., date, location, title.

Currently the VecML SDK supports the following vector types:

  • dense: the standard float32 dense vector. For example, [0, 1.2, 2.4, -10, 5.7]. Standard embedding vectors from language or vision models can be saved as this type.
  • dense8Bit: uint8 dense vectors, with integer vector elements ranging in [0, 255]. For example, [0, 3, 76, 255, 152]. 8-bit quantized embedding vectors can be saved as this type to save storage.
  • dense4Bit: 4-bit quantized dense vectors, with integer vector elements ranging in [0, 15].
  • dense2Bit: 2-bit quantized dense vectors, with integer vector elements ranging in [0, 3].
  • dense1Bit: 1-bit quantized dense vectors, with binary vector elements.
  • sparse: sparse vector formatted as a set of index:value pairs (e.g., the libsvm data format). This is useful for high-dimensional sparse vectors.
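Quantization itself happens outside the SDK; you pass in already-quantized values. As a standalone illustration (quantize_8bit is a hypothetical helper, not part of the SDK, and assumes a known value range [lo, hi]), a uniform 8-bit quantizer could look like:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Map float values in [lo, hi] to integer levels in [0, 255] (the 8-bit case).
std::vector<uint8_t> quantize_8bit(const std::vector<float>& v, float lo, float hi) {
    std::vector<uint8_t> out;
    out.reserve(v.size());
    for (float x : v) {
        float t = (x - lo) / (hi - lo);             // normalize to [0, 1]
        t = std::min(1.0f, std::max(0.0f, t));      // clamp out-of-range values
        out.push_back(static_cast<uint8_t>(t * 255.0f + 0.5f));  // round to nearest level
    }
    return out;
}
```

The same idea applies to the 4-/2-/1-bit types with 16, 4, or 2 levels; a real pipeline would typically pick lo and hi from the dataset's value range.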

Given a C++ vector, we first need to build the fluffy::Vector with the proper type:

// for Dense type
std::vector<float> cpp_vector = {0.1, 0.5, -0.7};     // float32 C++ vector
std::unique_ptr<fluffy::Vector> vector;
fluffyInterface.build_vector_dense(cpp_vector, vector);

// for X-Bit Dense type
std::vector<uint8_t> cpp_vector = {3, 217, 56};    // uint8_t C++ vector
std::unique_ptr<fluffy::Vector> vector;
int bits_per_element = 8;    // or 4, 2, 1
fluffyInterface.build_vector_bit(cpp_vector, bits_per_element, vector);

// for Sparse type
std::vector<std::pair<idx_t, float>> cpp_vector = {{0, 0.1}, {7, 0.5}, {102948, -0.3}};
std::unique_ptr<fluffy::Vector> vector;
int dim = 0;
fluffyInterface.build_vector_sparse(cpp_vector, vector, dim);
Note that for sparse vectors, the parameter dim can simply be set to zero.
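For intuition, the sparse input is the familiar libsvm-style index:value list. A standalone sketch (to_sparse and the idx_t alias here are illustrative, not SDK code) of deriving such pairs from a dense array by dropping zeros:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using idx_t = std::size_t;  // illustrative index type

// Keep only the nonzero entries as (index, value) pairs, libsvm-style.
std::vector<std::pair<idx_t, float>> to_sparse(const std::vector<float>& dense) {
    std::vector<std::pair<idx_t, float>> pairs;
    for (idx_t i = 0; i < dense.size(); ++i)
        if (dense[i] != 0.0f) pairs.push_back({i, dense[i]});
    return pairs;
}
```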

We can also set attributes of the vector by

std::string date = "2025-03-11";            
std::string sentence = "hello, world";     

vector->set_attribute("date", reinterpret_cast<const uint8_t*>(date.data()), date.size());
vector->set_attribute("sentence", reinterpret_cast<const uint8_t*>(sentence.data()), sentence.size());
Now we have set two attributes, "date" and "sentence", for the vector.

After building the fluffy::Vector object, insert it into the fluffy interface:

std::string string_id = "the_unique_id";
fluffyInterface.add_data(string_id, vector);
To insert multiple vectors, loop over the vectors with the above approach.

NOTE: Within one vector collection, all vectors must have the same type and dimensionality.


πŸ“’ Building an Index

After vectors are inserted into the data collection (managed by fluffyInterface), you can call attach_index() to create an ANN index for nearest neighbor search.

Distance Types. The VecML SDK supports the following distance function types to build the index:

  • Dense (float32) and sparse vectors: Euclidean (DistanceFunctionType::Euclidean), cosine (DistanceFunctionType::NegativeCosineSimilarity), inner product (DistanceFunctionType::NegativeInnerProduct)

  • Quantized dense vectors (8-bit, 4-bit, 2-bit, 1-bit): Euclidean (DistanceFunctionType::Euclidean), Hamming (DistanceFunctionType::Hamming)
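For intuition on the Hamming option, the distance between two bit-packed vectors is the number of differing bits. A standalone sketch of the metric itself (hamming is an illustrative helper, not the SDK's internal code):

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hamming distance between two equal-length byte arrays: count of differing bits.
int hamming(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    int dist = 0;
    for (std::size_t i = 0; i < a.size(); ++i)
        dist += std::bitset<8>(a[i] ^ b[i]).count();  // XOR marks differing bits
    return dist;
}
```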

NOTE: for cosine similarity and inner product, by design we add a negative sign to the true value (e.g., if cosine = 0.91, it will be shown as -0.91), so that smaller always means more similar. Take the absolute value to recover the original similarity.
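The sign convention can be checked independently of the SDK. The sketch below (neg_cosine is an illustrative helper, not SDK code) shows why negated cosine behaves like a distance: identical directions give -1, the minimum possible value.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Negative cosine similarity: -1 for identical directions, 0 for orthogonal ones.
float neg_cosine(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0, na = 0, nb = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];   // inner product
        na  += a[i] * a[i];   // squared norm of a
        nb  += b[i] * b[i];   // squared norm of b
    }
    return -dot / (std::sqrt(na) * std::sqrt(nb));
}
```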

std::string index_name = "index_1";
fluffy::DistanceFunctionType distance_type = fluffy::DistanceFunctionType::Euclidean;
int dim = ...;     // this is the dimension of your vector

// for dense vector
fluffyInterface.attach_index(dim, "dense", distance_type, index_name);

// for quantized dense vector
fluffyInterface.attach_index(dim, "dense4Bit", 4, distance_type, index_name);    // similar for 8-bit, 2-bit, 1-bit

// for sparse vector
fluffyInterface.attach_index(0, "sparse", distance_type, index_name);    // for sparse vectors, we can set dim = 0 
After running the function, an index named "index_1" is constructed for the data collection.

Alternative Approach

You can build the index and create the vector collection simultaneously.

  • Attach the index before adding data.
  • As you add vectors to the collection, they will also be indexed in real-time.

fluffyInterface.attach_index(1024, "dense", fluffy::DistanceFunctionType::Euclidean, "index_1");

std::vector<float> cpp_vector = ...;
std::unique_ptr<fluffy::Vector> vector;
fluffyInterface.build_vector_dense(cpp_vector, vector);

fluffyInterface.add_data(string_id, vector);
This will add the vector to the data collection, and at the same time insert it into the index "index_1".

Batch Insert for Indexing (Multi-Thread)

The VecML SDK supports batch insertion with multi-threaded computation (parallel vector insertion into an index). Currently, to use batch insertion, you must call attach_index BEFORE add_data_batch.

fluffyInterface.attach_index(3, "dense", distance_type, index_name);           // attach index first
std::vector<std::pair<std::string, std::unique_ptr<fluffy::Vector>>> batch_data;

for (int i=0; i<100; ++i) {
  std::vector<float> cpp_vector = {0.1+i, 0.5+i, -0.7+i};     // as illustration, we construct some synthetic vectors
  std::unique_ptr<fluffy::Vector> vector;
  fluffyInterface.build_vector_dense(cpp_vector, vector);

  batch_data.push_back({std::to_string(i), std::move(vector)});
}

int num_threads = 8;
fluffyInterface.add_data_batch(batch_data, num_threads);      // batch insert to index
The parameter num_threads specifies the number of threads used for parallel data insertion. The batch size is flexible; for example, 1000 vectors per batch is a reasonable choice.

Thread Safety

If batch insertion is used, do not add your own multithreading on top of it (e.g., std::thread in C++ or OpenMP).


πŸ” Searching the Index

After building an index, you can search for the top-k nearest neighbors of a query. We create a Query object for this.

fluffy::Query query;
query.top_k = k;     // set the number of returned nearest neighbors
query.similarity_measure = fluffy::DistanceFunctionType::Euclidean;

// Set up the query vector. Building it is the same as when adding vectors; we use a dense vector as an example
std::vector<float> cpp_query_vector = {0.0, -0.2, 0.6};
std::unique_ptr<fluffy::Vector> query_data;
fluffyInterface.build_vector_dense(cpp_query_vector, query_data);

query.vector = query_data.get();         // assign the raw pointer of query data to query.vector

// Perform the search
fluffy::InterfaceQueryResults result;
fluffyInterface.search(query, result, "index_1");    // specify the index name to search

// Access and print the results
int count = 0;
for (const auto& res : result.results) {
    std::cout << "Neighbor " << count + 1 << ": " << res.string_id << "  " << res.dis << std::endl;
    count++;
}

When k=3, the printed output would be something like:

Neighbor 1: "string_id_83"  0.52
Neighbor 2: "string_id_47203"  0.69
Neighbor 3: "string_id_2751"  0.72
The returned retrieval result is a size-k vector of (string_id, distance) pairs, representing the top-1, top-2, ... nearest neighbors. Top-1 is the most similar to the query.

The corresponding vector can also be retrieved from the data collection by its string_id:

// We use dense vector as an example
std::vector<float> retrieved_vector;
std::string string_id = "unique_string_id";
fluffyInterface.to_vector_dense(string_id, retrieved_vector);

βŒ› Search with Filter / Condition Constraints

The VecML SDK supports nearest neighbor search under filter/condition constraints. For example, you can search the top-k neighbors of a query under the constraint that the vector attribute date falls within a range. Fluffy allows very general and flexible conditions: the condition is simply a user-defined C++ lambda function on a fluffy::Vector that returns a bool, and it can depend on both the vector data and its attributes. An example is given below, where the filter is a date range.

std::string start_attr = "2024-03-11";
std::string end_attr = "2024-05-26";
auto eval_func = [start_attr, end_attr](const fluffy::Vector& vector) -> bool {
    std::string date;
    if (vector.get_attribute("date", date) == fluffy::ErrorCode::Success) {    // retrieve date from vector
        return date >= start_attr && date <= end_attr;      // check if the date is within the start and end range; if yes, return true
    }
    return false;
};
condition.set_eval(eval_func);      // condition is the SDK's condition object, declared beforehand

// Then add the condition to query and search
query.condition = &condition;
fluffy::InterfaceQueryResults result;
fluffyInterface.search(query, result, "index_1");
The retrieved results are guaranteed to satisfy the user-defined condition. Exactly k nearest neighbors will be returned, unless fewer than k vectors in the entire data collection satisfy the condition.
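The date comparison in the lambda above works because ISO-8601 dates (YYYY-MM-DD) order lexicographically the same way they order chronologically, so plain string comparison suffices. A standalone check of that property (in_range is an illustrative helper):

```cpp
#include <string>

// ISO-8601 dates (YYYY-MM-DD) compare correctly as plain strings.
bool in_range(const std::string& date, const std::string& lo, const std::string& hi) {
    return date >= lo && date <= hi;
}
```

This only holds for zero-padded, fixed-width date strings; other formats (e.g., "3/11/2024") would need to be parsed first.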

πŸ—‘ Remove a Vector

To remove a vector (both from data collection and the indices), simply call remove_data with the string ID of the vector:

std::string string_id = "unique_id";
fluffyInterface.remove_data(string_id);


Managing Storage and Performance

πŸ’Ύ Flushing Data to Disk

To ensure that all changes are persisted to disk, call flush():

fluffy::ErrorCode status = fluffyInterface.flush();
if (status == fluffy::ErrorCode::Success) {
    std::cout << "Data successfully flushed to disk." << std::endl;
}
Calling flush() prevents data loss in case of system crashes.

βœ‚οΈ Saving Memory

Efficient memory management is crucial when working with large vector datasets. Our SDK provides an offload() function that allows offloading memory to disk to reduce runtime memory usage.

Using Offload to Reduce Memory Usage

The fluffyInterface class provides the following method:

fluffyInterface.offload();

Calling offload() frees up memory by offloading unused or cached data to disk. The system will transparently reload data as needed, meaning you do not need to manage reloading manually.

When to Call Flush/Offload?

While flush() and offload() can be called at any time, frequent calls may slightly slow down the system due to repeated disk operations. Consider using them strategically:

βœ… Best Practices:

  • Call flush() after a large batch of data is inserted/indexed, or periodically after a relatively long time window.

  • Call offload() after indexing large datasets to free memory used during index creation.

  • Call offload() after performing queries if you have limited memory.

  • Do not call flush() or offload() in rapid succession (e.g., in a loop), as it may slow down the system. If necessary, call them after a batch of operations.

Example Usage

// Reduce memory usage after indexing
fluffy::ErrorCode status = fluffyInterface.offload();
if (status == fluffy::ErrorCode::Success) {
    std::cout << "Memory offloaded successfully." << std::endl;
} else {
    std::cerr << "Offload failed with error: " << static_cast<int>(status) << std::endl;
}

Memory Budget Considerations

If your application has a low memory budget, using offload() after indexing and querying is recommended. This approach ensures that only actively used data remains in memory while older or less frequently used data is offloaded.

By using offload(), you can efficiently manage memory usage without sacrificing functionality, as data will be transparently reloaded when needed.

πŸš€ Now your system can handle larger datasets with lower memory usage!