text-embeddings-inference
Text Embeddings Inference (TEI) is a high-performance toolkit for deploying and serving text embedding and sequence classification models. It provides efficient inference for popular embedding models such as FlagEmbedding and GTE, ships lightweight Docker images, and avoids a model graph compilation step, which enables fast boot times. Supported architectures include BERT and CamemBERT, and features such as dynamic batching and distributed tracing make it suitable for a wide range of deployment environments.
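As a minimal sketch of how a client might talk to a running TEI server, the snippet below posts a batch of texts to the `/embed` route and compares the returned vectors with cosine similarity. It assumes a server already listening on localhost port 8080 with the default JSON API; the helper names (`embed`, `cosine_similarity`) are illustrative, not part of TEI itself.

```python
import json
import urllib.request


def embed(texts, url="http://localhost:8080/embed"):
    """POST a batch of texts to a TEI server's /embed route and
    return the embedding vectors from the JSON response.
    Assumes a TEI server is running at `url`."""
    payload = json.dumps({"inputs": texts}).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


# Example usage (requires a running TEI server):
# vecs = embed(["What is deep learning?", "Deep learning is a branch of ML."])
# print(cosine_similarity(vecs[0], vecs[1]))
```

Because TEI batches requests dynamically on the server side, a client can simply send lists of inputs and let the server decide how to group them for the model.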