AnglE: Optimized Text Embeddings
AnglE is a library for training and running inference with powerful sentence embeddings. Sponsored by Mixedbread, AnglE is built on the research presented in the paper "AnglE: Angle-optimized Text Embeddings." The tool lets users train state-of-the-art BERT- and LLM-based sentence embeddings with minimal code, making it an accessible option for semantic textual similarity tasks.
Features
AnglE’s distinctiveness lies in its versatile features:
- Loss Functions: AnglE incorporates various loss functions tailored for different embedding tasks, including AnglE Loss, Contrastive Loss, CoSENT Loss, and the newly branded Espresso Loss (formerly known as 2DMSE).
- Model Backbones: The framework supports transformer-based models such as BERT, RoBERTa, and ALBERT, as well as LLM-based models like LLaMA and Mistral. It also supports bidirectional LLM-based models such as OpenELMo via dedicated GitHub repositories.
- Training Options: AnglE provides flexible training modes that accommodate both single-GPU and multi-GPU environments, catering to various computational resources and project scales.
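Of the loss functions above, CoSENT is a good one to see in miniature: it is a ranking loss that pushes every positive pair's cosine similarity above every negative pair's. Below is a minimal pure-Python sketch of that idea, not AnglE's actual implementation; the `scale` default of 20 is an assumption chosen for illustration.

```python
import math

def cosent_loss(pos_sims, neg_sims, scale=20.0):
    """Illustrative CoSENT-style ranking loss:
    loss = log(1 + sum over (p, n) of exp(scale * (n - p)))
    Any negative-pair similarity n that exceeds a positive-pair
    similarity p contributes a large penalty term.
    """
    total = sum(math.exp(scale * (n - p)) for p in pos_sims for n in neg_sims)
    return math.log1p(total)

# A well-separated batch (positives clearly above negatives) gives a tiny loss.
low = cosent_loss(pos_sims=[0.9, 0.8], neg_sims=[0.1, 0.2])

# A confused batch (a negative outscoring a positive) gives a large loss.
high = cosent_loss(pos_sims=[0.3], neg_sims=[0.7])
```

Because the loss depends only on similarity *differences*, it optimizes the ranking of pairs rather than their absolute similarity values, which is what makes it well suited to graded similarity labels.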
Achievements
AnglE has achieved outstanding performance benchmarks:
- In May 2024, the AnglE paper was accepted by the ACL 2024 Main Conference, reflecting the academic community's recognition of its innovation.
- Models trained with AnglE, such as mixedbread's embedding models and the universal sentence embedding WhereIsAI/UAE-Large-V1, have reached state-of-the-art results on the MTEB Leaderboard, achieving top average scores.
Official Pretrained Models
AnglE offers a variety of pretrained models for different applications:
- BERT-based Models: These models, such as WhereIsAI/UAE-Large-V1, are designed for general-purpose English language tasks or specific domains like medical or code similarity.
- LLM-based Models: The library also includes models like SeanLee97/angle-llama-13b-nli, which are fine-tuned for English semantic similarity measurement.
Quick Start Guide
Getting started with AnglE is straightforward:
- Installation: Users can easily install the library via pip, ensuring a hassle-free setup process.
- Inference: Whether using BERT-based or LLM-based models, AnglE's intuitive API allows users to encode text and measure semantic similarity quickly. It supports both prompt-based and non-prompt-based inference to cater to varying user needs.
- Training: AnglE offers comprehensive support for data preparation and training. Users can utilize the CLI for single- or multi-GPU training, and detailed documentation guides users through data transformation and model fitting.
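Whatever the backbone, the inference step ultimately reduces to comparing embedding vectors, most commonly by cosine similarity. The sketch below uses made-up 4-dimensional vectors purely for illustration; a real model would emit high-dimensional embeddings (e.g., 1024 dimensions for a large model).

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real model outputs.
query = [0.2, 0.7, 0.1, 0.5]
doc_similar = [0.25, 0.65, 0.15, 0.45]
doc_unrelated = [0.9, -0.3, 0.4, -0.8]

sim_pos = cosine_similarity(query, doc_similar)    # close to 1.0
sim_neg = cosine_similarity(query, doc_unrelated)  # well below sim_pos
```

Cosine similarity ranges from -1 to 1 and ignores vector magnitude, so it compares the direction of embeddings only, which is the standard convention for sentence-embedding retrieval and STS scoring.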
Fine-tuning and Custom Training
AnglE provides specialized tips and scripts for fine-tuning models using different dataset formats, optimizing loss weights accordingly to achieve the best results. The package also supports converting AnglE models for use with sentence-transformers, enhancing compatibility with other NLP frameworks.
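As a concrete example of dataset preparation for supervised fine-tuning, pair data is commonly organized as rows holding two texts and a similarity label. The field names `text1`, `text2`, and `label` below are hypothetical placeholders; consult AnglE's documentation for the exact schema the dataset format you choose expects.

```python
def to_pair_format(raw_examples):
    """Convert (sentence_a, sentence_b, score) tuples into dict rows.
    NOTE: the keys 'text1', 'text2', 'label' are illustrative only,
    not necessarily the schema AnglE's trainer requires.
    """
    return [
        {"text1": a, "text2": b, "label": float(score)}
        for a, b, score in raw_examples
    ]

rows = to_pair_format([
    ("A cat sits on a mat.", "A feline rests on a rug.", 1.0),
    ("A cat sits on a mat.", "Stock prices fell sharply.", 0.0),
])
```

Keeping labels as floats rather than integers accommodates graded similarity scores (e.g., STS-style 0-5 ratings rescaled to [0, 1]) as well as binary positive/negative pairs.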
In conclusion, AnglE stands out as a potent tool for developing and fine-tuning sentence embeddings, offering users robust features and easy-to-follow instructions for achieving excellence in various embedding tasks. Its combination of flexibility, efficiency, and cutting-edge technology makes it suitable for both research and practical applications in semantic textual similarity and beyond.