LLMLingua

Improve Large Language Model Efficiency and Cost-Effectiveness through Prompt Compression

Product Description

LLMLingua provides efficient tools for compressing prompts to large language models, achieving up to 20x compression with minimal performance degradation. It integrates seamlessly with frameworks such as LangChain and LlamaIndex, reducing costs and improving retrieval-augmented generation performance. LLMLingua-2 further improves task-agnostic compression, delivering 3x-6x speedups. The latest release adds MInference, which cuts inference latency by as much as 10x in long-context applications.
Project Details