FastEmbed-rs: A Fast and Efficient Embedding Library in Rust
FastEmbed-rs is a Rust implementation of the popular @qdrant/fastembed project. It is designed for developers who need fast, reliable embedding generation for text and images, with an emphasis on speed, efficiency, and accuracy. Below, we take an in-depth look at FastEmbed-rs, covering its features, models, usage, and benefits.
Features
FastEmbed-rs stands out due to its efficient, high-performance capabilities:
- Synchronous Usage: Operates without a dependency on asynchronous runtimes such as Tokio, allowing straightforward integration into existing projects.
- ONNX Inference: Uses @pykeio/ort for high-performance ONNX model inference.
- Fast Tokenization: Leverages @huggingface/tokenizers for fast text encoding.
- Parallelism: Supports batch embedding generation with concurrent processing via @rayon-rs/rayon (see the batch-size sketch below).
Together, these features make the library well suited to high-throughput, large-scale data processing.
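To make the batching concrete, here is a minimal sketch of embedding a small document set with an explicit batch size; the document strings are placeholders, and the second argument to embed is the optional batch size (None falls back to the library default of 256):
use fastembed::TextEmbedding;
// Initialize the default text embedding model
let model = TextEmbedding::try_new(Default::default())?;
// Placeholder documents; any Vec of string-like items works
let documents = vec![
    "passage: FastEmbed-rs generates embeddings in Rust",
    "passage: Batches are processed concurrently",
];
// Pass Some(batch_size) to control how many documents are embedded per batch
let embeddings = model.embed(documents, Some(32))?;
assert_eq!(embeddings.len(), 2); // one embedding per document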
Alternatives in Other Languages
FastEmbed's versatility extends beyond Rust, with implementations in multiple languages:
- Python: fastembed
- Go: fastembed-go
- JavaScript: fastembed-js
Models
FastEmbed-rs offers a range of models tailored to different embedding needs:
Text Embedding
Default model: BAAI/bge-small-en-v1.5
Other models include:
- sentence-transformers/all-MiniLM-L6-v2
- mixedbread-ai/mxbai-embed-large-v1
The full list of supported text embedding models covers a range of languages and model sizes, so you can choose one that fits your requirements.
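As a sketch of selecting one of these non-default models (assuming the v3-style InitOptions struct; check the crate's EmbeddingModel enum for exact variant names such as AllMiniLML6V2):
use fastembed::{TextEmbedding, InitOptions, EmbeddingModel};
// Pick a non-default model and show download progress while the ONNX weights are fetched
let model = TextEmbedding::try_new(InitOptions {
    model_name: EmbeddingModel::AllMiniLML6V2,
    show_download_progress: true,
    ..Default::default()
})?;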
Sparse Text Embedding
Default model: prithivida/Splade_PP_en_v1
Sparse text embeddings represent each document as a small set of weighted terms rather than a dense vector; the sparsity can significantly reduce storage requirements and works naturally with inverted-index retrieval without compromising quality.
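A minimal sketch of generating sparse embeddings, assuming the SparseTextEmbedding type and a result type exposing indices and values fields as in recent fastembed-rs releases; verify the exact API against the crate docs for your version:
use fastembed::SparseTextEmbedding;
// Initialize the default sparse model (prithivida/Splade_PP_en_v1)
let model = SparseTextEmbedding::try_new(Default::default())?;
let documents = vec!["passage: Sparse vectors pair well with inverted indexes"];
let embeddings = model.embed(documents, None)?;
// Each sparse embedding stores only its non-zero dimensions
// (assumed fields: `indices` for token ids, `values` for weights)
println!("Non-zero entries: {}", embeddings[0].values.len());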
Image Embedding
Default model: Qdrant/clip-ViT-B-32-vision
Additional models:
- Qdrant/resnet50-onnx
- Qdrant/Unicom-ViT-B-16/32
These models cater to image processing tasks and facilitate robust embeddings for visual data.
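As a sketch of selecting a non-default image model (assuming ImageInitOptions accepts a model_name field and that the ResNet-50 variant is named Resnet50; verify against the crate's ImageEmbeddingModel enum):
use fastembed::{ImageEmbedding, ImageInitOptions, ImageEmbeddingModel};
// Swap the default CLIP vision model for ResNet-50 (variant name assumed)
let model = ImageEmbedding::try_new(ImageInitOptions {
    model_name: ImageEmbeddingModel::Resnet50,
    ..Default::default()
})?;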
Reranking
Default model: BAAI/bge-reranker-base
Other reranking models:
- BAAI/bge-reranker-v2-m3
Reranking models enhance search and information retrieval tasks by reordering results for improved relevance.
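Switching to the v2-m3 reranker only changes the model passed at initialization, as in this sketch (the BGERerankerV2M3 variant name is an assumption; check the crate's RerankerModel enum):
use fastembed::{TextRerank, RerankInitOptions, RerankerModel};
// Initialize the multilingual bge-reranker-v2-m3 model (variant name assumed)
let model = TextRerank::try_new(RerankInitOptions::new(RerankerModel::BGERerankerV2M3))?;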
Installation
Integrating FastEmbed-rs into your Rust project is simple. Execute the following command in your project directory:
cargo add fastembed
Or, add it to your Cargo.toml directly:
[dependencies]
fastembed = "3"
Usage
Text Embeddings
Generate text embeddings with the default model (BAAI/bge-small-en-v1.5):
use fastembed::{TextEmbedding, InitOptions, EmbeddingModel};
// Initialize the default model (BAAI/bge-small-en-v1.5)
let model = TextEmbedding::try_new(Default::default())?;
let documents = vec!["passage: Hello, World!"];
// None uses the default batch size
let embeddings = model.embed(documents, None)?;
println!("Embeddings length: {}", embeddings.len()); // -> 1 (one embedding per document)
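Each embedding is a plain Vec<f32>; with the default BAAI/bge-small-en-v1.5 model the vectors are 384-dimensional, which you can verify directly:
// The default model produces 384-dimensional vectors
println!("Embedding dimension: {}", embeddings[0].len()); // -> 384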
Image Embeddings
Support for image data processing is equally straightforward.
use fastembed::{ImageEmbedding, ImageInitOptions, ImageEmbeddingModel};
// Initialize the default image model (Qdrant/clip-ViT-B-32-vision)
let model = ImageEmbedding::try_new(Default::default())?;
// Paths to image files on disk
let images = vec!["assets/image_0.png"];
let embeddings = model.embed(images, None)?;
println!("Embeddings length: {}", embeddings.len()); // -> 1 (one embedding per image)
Candidate Reranking
Rerank a list of candidate documents against a query to improve result ordering.
use fastembed::{TextRerank, RerankInitOptions, RerankerModel};
let model = TextRerank::try_new(
    RerankInitOptions::new(RerankerModel::BGERerankerBase)
)?;
// Candidate documents to score against the query
let documents = vec!["panda is an animal", "i don't know", "panda is a restaurant"];
// `true` asks for the document text to be returned alongside each score
let results = model.rerank("what is panda?", documents, true, None)?;
println!("Rerank result: {:?}", results);
Under the Hood
Why fast? FastEmbed-rs justifies its "fast" moniker through:
- Quantized model weights: reduced model size with minimal accuracy loss.
- ONNX Runtime: supports inference on a range of hardware, including CPU and GPU.
Why light? The library minimizes bloat by avoiding hidden dependencies typically introduced through larger libraries like Huggingface Transformers.
Why accurate? The supported models outperform OpenAI Ada-002 and rank near the top of embedding leaderboards such as MTEB.
LICENSE
FastEmbed-rs is open source, licensed under Apache 2.0, promoting free use and development contributions.
FastEmbed-rs stands as an invaluable tool for developers needing robust embedding solutions, offering speed, efficiency, and flexibility across multiple data types and languages. Whether your project involves text processing, image analysis, or advanced ranking strategies, FastEmbed-rs provides the tools you need with ease of use in mind.