MinishLab/model2vec-rs

Fast State-of-the-Art Static Embeddings in Rust

model2vec-rs is a Rust crate providing an efficient implementation for inference with Model2Vec static embedding models. Model2Vec is a technique for creating compact and fast static embedding models from sentence transformers, yielding large reductions in model size and much faster inference. This Rust crate is optimized for performance, making it well suited to applications that require fast embedding generation.

Quickstart

You can use model2vec-rs in two ways:

  1. As a library in your Rust projects
  2. As a standalone Command-Line Interface (CLI) tool for quick terminal-based inference

1. Using model2vec-rs as a Library

Integrate model2vec-rs into your Rust application to generate embeddings within your code.

a. Add model2vec-rs as a dependency:

cargo add model2vec-rs

b. Load a model and generate embeddings:

use anyhow::Result;
use model2vec_rs::model::StaticModel;

fn main() -> Result<()> {
    // Load a model from the Hugging Face Hub or a local path.
    // Arguments: (repo_or_path, hf_token, normalize_embeddings, subfolder_in_repo)
    let model = StaticModel::from_pretrained(
        "minishlab/potion-base-8M", // Model ID from Hugging Face or local path to model directory
        None,                       // Optional: Hugging Face API token for private models
        None,                       // Optional: bool to override model's default normalization. `None` uses model's config.
        None                        // Optional: subfolder if model files are not at the root of the repo/path
    )?;

    let sentences = vec![
        "Hello world".to_string(),
        "Rust is awesome".to_string(),
    ];

    // Generate embeddings using default parameters
    // (Default max_length: Some(512), Default batch_size: 1024)
    let embeddings = model.encode(&sentences);
    // `embeddings` is a Vec<Vec<f32>>
    println!("Generated {} embeddings.", embeddings.len());

    // To generate embeddings with custom arguments:
    let custom_embeddings = model.encode_with_args(
        &sentences,
        Some(256), // Optional: custom max token length for truncation
        512,       // Custom batch size for processing
    );
    println!("Generated {} custom embeddings.", custom_embeddings.len());

    Ok(())
}
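
Because encode returns a plain Vec<Vec<f32>>, downstream tasks such as similarity scoring need no extra machinery. A minimal sketch follows; the cosine helper is illustrative and not part of the model2vec-rs API:

use model2vec_rs::model::StaticModel;

// Illustrative helper, not part of the model2vec-rs API.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() -> anyhow::Result<()> {
    let model = StaticModel::from_pretrained("minishlab/potion-base-8M", None, None, None)?;
    let embeddings = model.encode(&["Hello world".to_string(), "Rust is awesome".to_string()]);

    // Compare the two sentence embeddings.
    let sim = cosine(&embeddings[0], &embeddings[1]);
    println!("cosine similarity: {sim:.4}");
    Ok(())
}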

Alternative: Loading from in-memory bytes (wasm / embedded use cases):

When filesystem access is unavailable, use from_bytes instead of from_pretrained:

use model2vec_rs::model::StaticModel;

let model = StaticModel::from_bytes(
    include_bytes!("path/to/tokenizer.json").as_ref(),
    include_bytes!("path/to/model.safetensors").as_ref(),
    include_bytes!("path/to/config.json").as_ref(),
    None, // normalize: None reads the value from config.json
)?;

let embeddings = model.encode(&["Hello world".to_string()]);

See the feature flags section for wasm and local-only build configurations.
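
The same constructor also accepts buffers read at runtime rather than embedded at compile time. A minimal sketch, assuming the three files sit in a local model directory (paths are placeholders) and that the error type converts into anyhow::Error as in the quickstart:

use model2vec_rs::model::StaticModel;

fn load_from_disk() -> anyhow::Result<StaticModel> {
    // Read the three model files at runtime instead of embedding them with include_bytes!.
    let tokenizer = std::fs::read("path/to/tokenizer.json")?;
    let weights = std::fs::read("path/to/model.safetensors")?;
    let config = std::fs::read("path/to/config.json")?;
    let model = StaticModel::from_bytes(&tokenizer, &weights, &config, None)?;
    Ok(model)
}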


2. Using the model2vec-rs CLI

a. Install the CLI tool: This command compiles the crate in release mode (for speed) and installs the model2vec-rs executable into Cargo's binary directory (~/.cargo/bin/ by default).

cargo install model2vec-rs

Ensure ~/.cargo/bin/ is in your system's PATH to run model2vec-rs from any directory.

b. Generate embeddings via CLI: The compiled binary installed via cargo install is significantly faster (often >10x) than running via cargo run -- ... without release mode.

  • Encode a single sentence:

    model2vec-rs encode-single "Hello world" "minishlab/potion-base-8M"

    Embeddings will be printed to the console in JSON format. This command should take less than 0.1s to execute.

  • Encode multiple lines from a file and save to an output file:

    echo -e "This is the first sentence.\nThis is another sentence." > my_texts.txt
    model2vec-rs encode my_texts.txt "minishlab/potion-base-8M" --output embeddings_output.json
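
    To consume the saved embeddings from Rust, the file can be deserialized with serde_json. A minimal sketch, assuming the output file holds a JSON array of embedding vectors (one per input line) and that serde_json and anyhow are available as dependencies:

use std::fs;

fn main() -> anyhow::Result<()> {
    // Assumes embeddings_output.json is a JSON array of float vectors; check your output if it differs.
    let raw = fs::read_to_string("embeddings_output.json")?;
    let embeddings: Vec<Vec<f32>> = serde_json::from_str(&raw)?;
    println!("loaded {} embeddings of dimension {}", embeddings.len(), embeddings[0].len());
    Ok(())
}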

Features

  • Fast Inference: Optimized Rust implementation for fast embedding generation.
  • Hugging Face Hub Integration: Load pre-trained Model2Vec models directly from the Hugging Face Hub using model IDs, or use models from local paths.
  • Model Formats: Supports models with f32, f16, and i8 weight types stored in safetensors files.
  • Flexible Loading: Load models from the Hub (from_pretrained), raw bytes (from_bytes), or owned/borrowed data (from_owned, from_borrowed) for embedded and WASM use cases.
  • Batch Processing: Encodes multiple sentences in batches.
  • Configurable Encoding: Allows customization of maximum sequence length and batch size during encoding.

Feature flags

The crate exposes a few feature combinations for different runtimes:

  • default: native build with onig tokenization and optional Hugging Face Hub downloads
  • fancy-regex: alternative tokenizer backend for native builds
  • local-only: disable remote model downloads and restrict loading to local paths or from_bytes(...)
  • wasm: minimal WebAssembly-oriented feature set for in-memory loading via from_bytes(...)

Typical invocations:

# native local-only build
cargo build --no-default-features --features onig,local-only

# wasm (requires getrandom backend config)
RUSTFLAGS='--cfg getrandom_backend="wasm_js"' \
cargo check --no-default-features --features wasm --target wasm32-unknown-unknown
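
To depend on the crate with one of these feature sets from another project, cargo add accepts the same flags (feature names as listed above):

# add the crate as a dependency with the native local-only feature set
cargo add model2vec-rs --no-default-features --features onig,local-only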

What is Model2Vec?

Model2Vec is a technique to distill large sentence transformer models into highly efficient static embedding models. This process significantly reduces model size and computational requirements for inference. For a detailed understanding of how Model2Vec works, including the distillation process and model training, please refer to the main Model2Vec Python repository and its documentation.

This model2vec-rs crate provides a Rust-based engine specifically for inference using these Model2Vec models.

Models

A variety of pre-trained Model2Vec models are available on the Hugging Face Hub (MinishLab collection). These can be loaded by model2vec-rs using their Hugging Face model ID or by providing a local path to the model files.

| Model                    | Language     | Distilled From (Original Sentence Transformer) | Params | Task      |
|--------------------------|--------------|------------------------------------------------|--------|-----------|
| potion-base-32M          | English      | bge-base-en-v1.5                               | 32.3M  | General   |
| potion-multilingual-128M | Multilingual | bge-m3                                         | 128M   | General   |
| potion-retrieval-32M     | English      | bge-base-en-v1.5                               | 32.3M  | Retrieval |
| potion-code-16M          | Code         | CodeRankEmbed                                  | 16M    | Code      |
| potion-base-8M           | English      | bge-base-en-v1.5                               | 7.5M   | General   |
| potion-base-4M           | English      | bge-base-en-v1.5                               | 3.7M   | General   |
| potion-base-2M           | English      | bge-base-en-v1.5                               | 1.8M   | General   |

Performance

We compared the performance of the Rust implementation with the Python version of Model2Vec. The benchmark was run single-threaded on a CPU.

| Implementation | Throughput          |
|----------------|---------------------|
| Rust           | 8000 samples/second |
| Python         | 4650 samples/second |

The Rust version is roughly 1.7× faster than the Python version.

Relation to Python model2vec

  • model2vec-rs (This Crate): High-performance Rust engine for 1.7x faster Model2Vec inference.
  • model2vec (Python-based): Handles model distillation, training, fine-tuning, and slower Python-based inference.

License

MIT

Citing Model2Vec

If you use Model2Vec in your research, please cite the following:

@software{minishlab2024model2vec,
  author       = {Stephan Tulkens and {van Dongen}, Thomas},
  title        = {Model2Vec: Fast State-of-the-Art Static Embeddings},
  year         = {2024},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.17270888},
  url          = {https://github.com/MinishLab/model2vec},
  license      = {MIT}
}
