Client-Vector-Search Project Introduction
The client-vector-search project is a lightweight, efficient vector search library that runs both in client-side browsers and in server-side environments. It covers embedding, searching, and caching, and positions itself as a faster, simpler alternative to hosted options such as OpenAI's text-embedding-ada-002 for embeddings and Pinecone for vector search.
Key Features
- Transformer Embeddings: By default, the library uses the gte-small transformer model to embed text documents, converting them into numeric vectors directly in the browser or on the server.
- Cosine Similarity Calculation: Users can compute the cosine similarity between embeddings to measure how alike two text documents are (see the sketch after this list).
- Client-Side Indexing and Searching: Users can create an index and perform search operations directly on the client side, facilitating rapid access and retrieval of information.
- Caching Support: The library offers vector caching, including browser-based caching, to improve performance and reduce response times.
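To make the similarity measure concrete, here is a minimal sketch of cosine similarity computed by hand in plain JavaScript; it illustrates the formula dot(a, b) / (|a| * |b|) rather than any specific helper exported by the library.

// Cosine similarity between two equal-length vectors, in the range [-1, 1]
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Example: compare two embeddings produced by getEmbedding
// const score = cosineSimilarity(await getEmbedding('Apple'), await getEmbedding('Fruit'));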
Roadmap and Future Enhancements
The long-term vision for client-vector-search is to offer a fast, simple solution that scales comfortably with user needs, typically handling thousands of vectors efficiently. Here are some anticipated features and improvements:
- Introduction of an HNSW index suitable for both browser and Node.js environments, independent of third-party libraries.
- Implementation of a comprehensive testing framework, including health checks and performance benchmarks.
Installation and Quickstart
Installing the client-vector-search library is straightforward using npm:
npm i client-vector-search
Here's a simple guide to get you started:
- Import the Helpers: The getEmbedding and EmbeddingIndex exports come from the package installed above.
  import { getEmbedding, EmbeddingIndex } from 'client-vector-search';
- Embed Text: Use the getEmbedding function to convert text into an embedding asynchronously.
  const embedding = await getEmbedding("Apple");
- Initialize the Index: Build an index from objects that each include an 'embedding' attribute.
  const initialObjects = [...]; const index = new EmbeddingIndex(initialObjects);
- Search for Similar Items: Query the index with a query embedding and retrieve the most similar entries.
  const queryEmbedding = await getEmbedding('Fruit'); const results = await index.search(queryEmbedding, { topK: 5 });
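Putting the steps above together, a minimal end-to-end sketch looks like the following; the id and name fields are illustrative additions, and only the getEmbedding, EmbeddingIndex, and search calls shown in the quickstart are used.

import { getEmbedding, EmbeddingIndex } from 'client-vector-search';

// Embed a few documents; each object carries its own embedding
const initialObjects = [
  { id: 1, name: 'Apple', embedding: await getEmbedding('Apple') },
  { id: 2, name: 'Banana', embedding: await getEmbedding('Banana') },
  { id: 3, name: 'Car', embedding: await getEmbedding('Car') },
];

// Build the index on the client
const index = new EmbeddingIndex(initialObjects);

// Search with a query embedding; topK caps the number of results returned
const queryEmbedding = await getEmbedding('Fruit');
const results = await index.search(queryEmbedding, { topK: 2 });
console.log(results);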
Troubleshooting and Usage in Next.js
For developers using Next.js, modifications to next.config.js are necessary to prevent conflicts between the browser bundle and certain Node-only modules.
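The exact configuration depends on your Next.js and package versions. As an illustrative sketch, and assuming the pattern commonly recommended for running transformers.js-based embeddings in the browser, the webpack aliases below keep the Node-only onnxruntime-node and sharp modules out of the client bundle:

// next.config.js (illustrative; adjust to your project and verify against the library's docs)
module.exports = {
  webpack: (config) => {
    // Exclude Node-only packages from the browser bundle
    config.resolve.alias = {
      ...config.resolve.alias,
      'sharp$': false,
      'onnxruntime-node$': false,
    };
    return config;
  },
};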
Step-by-Step Usage Guide
Outlined below is a comprehensive step-by-step guide to utilizing all aspects of the library:
- Generate Embeddings: Convert strings into embeddings for further processing.
- Calculate Similarity: Measure the similarity between different text representations.
- Manage Index: Add, update, and remove items from the index as needed (see the first sketch after this list).
- Persistent Storage: Save your index to IndexedDB for future retrieval and searches (see the second sketch after this list).
- Database Management: Instructions for deleting and managing databases and object stores within IndexedDB.
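For index management, the sketch below mirrors the add, update, and remove operations listed above; the method names and filter shapes are assumptions based on the library's documented usage, so verify the exact signatures against the current README.

// Method names below are assumptions; check the library's README for exact signatures.
// Add a new object (it must carry an embedding, like the initial objects)
index.add({ id: 4, name: 'Cherry', embedding: await getEmbedding('Cherry') });

// Update an existing object, matched by a filter on its fields
index.update({ id: 1 }, { name: 'Apricot', embedding: await getEmbedding('Apricot') });

// Remove an object from the index
index.remove({ id: 2 });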
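For persistent storage, the sketch below saves the index into IndexedDB and then searches against the stored copy; saveIndex('indexedDB') and the useStorage option are assumptions about the library's API, and the deletion helpers for databases and object stores are left to the README since their exact names vary.

// Names below are assumptions; consult the library's README before relying on them.
// Save the in-memory index into IndexedDB (browser environments only)
await index.saveIndex('indexedDB');

// Later, search directly against the stored index
const storedResults = await index.search(await getEmbedding('Fruit'), {
  topK: 5,
  useStorage: 'indexedDB',
});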
This guide ensures a thorough understanding of the capabilities of the library, encouraging hands-on experimentation to fully grasp its functionality.
In summary, the client-vector-search library is an essential tool for developers looking to integrate efficient and scalable vector searching capabilities into their applications. With a focus on speed, simplicity, and ease of use, this library promises to meet a wide range of application needs, backed by continuous improvements and support from its founding team.