llama-cpp-agent
Introduction
The llama-cpp-agent framework is a versatile tool designed to make interacting with Large Language Models (LLMs) easier and more efficient. It provides an intuitive interface for engaging in conversations with LLMs, executing function calls, generating structured output, performing Retrieval Augmented Generation (RAG) tasks, and processing text through agent chains that combine various tools.
One standout feature of this framework is its guided sampling technique. This approach helps ensure that the output from the models adheres to user-defined structures, even when the models weren't specifically trained for such tasks. The framework is highly adaptable, compatible with llama.cpp servers, llama-cpp-python, and other server options like TGI and vLLM.
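To make the guided-sampling idea concrete, here is a deliberately simplified sketch: at each decoding step the guide either forces the token required by the target structure or lets the model choose freely. The `REQUIRED` template and the function name are invented for illustration and are not part of llama-cpp-agent's actual API, which works with full grammars and JSON schemas rather than a fixed token list.

```python
# Illustrative sketch of guided sampling (NOT the framework's real API):
# the guide masks decoding so the output always matches a required shape.

REQUIRED = ['{', '"answer"', ':', None, '}']  # None = slot the model fills freely

def guided_sample(model_ranked_tokens, step):
    slot = REQUIRED[step]
    if slot is not None:
        return slot                   # structure forces this token
    return model_ranked_tokens[0]     # model's top choice is allowed here

out = []
for step in range(len(REQUIRED)):
    out.append(guided_sample(['"42"', '"hi"'], step))
result = "".join(out)
print(result)  # {"answer":"42"}
```

Even a model that was never trained to emit JSON produces a syntactically valid object under this scheme, because ill-formed continuations are simply never sampled.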
Key Features
- Simple Chat Interface: Users can seamlessly engage in conversations with LLMs.
- Structured Output: It can produce structured outputs from LLMs, such as typed objects.
- Function Calling: It supports single and parallel function executions.
- RAG - Retrieval Augmented Generation: Perform retrieval tasks with additional generation steps, assisted by ColBERT reranking.
- Agent Chains: Text can be processed using sophisticated agent chains that handle Conversational, Sequential, and Mapping tasks.
- Guided Sampling: Uses predefined grammars and JSON-schema generation, enabling most 7B LLMs to handle function calling and structured output.
- Multiple Providers Supported: The framework works seamlessly with llama-cpp-python, llama.cpp server, TGI server, and vLLM server.
- Compatibility and Flexibility: It is compatible with Python functions, Pydantic tools, llama-index tools, and OpenAI tool schemas, making it suitable for various applications, ranging from general chat interactions to specific function executions.
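The structured-output feature can be pictured with a minimal stand-alone sketch: the desired object shape is declared in Python, and the model's JSON reply is parsed and validated into it. The `Book` class and the raw reply string below are made-up examples, not framework API; llama-cpp-agent itself derives the schema and constrains generation for you.

```python
import json
from dataclasses import dataclass, fields

# Sketch of the structured-output idea: declare the target shape, then
# turn the (imagined) LLM JSON reply into a typed object.

@dataclass
class Book:
    title: str
    year: int

raw_reply = '{"title": "Dune", "year": 1965}'  # hypothetical LLM output
data = json.loads(raw_reply)
book = Book(**{f.name: data[f.name] for f in fields(Book)})
print(book)  # Book(title='Dune', year=1965)
```

In the framework itself, guided sampling makes replies like `raw_reply` conform to the schema by construction, so the parsing step cannot fail on malformed output.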
Installation
Installing the llama-cpp-agent framework is straightforward using pip:
pip install llama-cpp-agent
Documentation
Comprehensive and up-to-date documentation is available to help users navigate and utilize the framework effectively. Access it here.
Getting Started
For newcomers, a getting started guide is available to assist users in setting up the framework quickly. Find it here.
Discord Community
Join the llama-cpp-agent community on Discord to engage with other users and contributors. The community is active and can provide support and collaboration opportunities. Join here.
Usage Examples
The framework offers a variety of usage examples to demonstrate its diverse capabilities:
- Simple Chat: Initiate a chat using the llama.cpp server backend.
- Parallel Function Calling: Execute multiple functions concurrently.
- Structured Output: Generate structured data effortlessly.
- RAG - Retrieval Augmented Generation: Perform advanced retrieval tasks.
- llama-index Tools: Utilize llama-index tools for querying.
- Sequential Chain: Implement a product launch campaign.
- Mapping Chain: Summarize articles into a single cohesive summary.
- Knowledge Graph Creation: Create knowledge graphs using the framework's tools.
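As a rough illustration of the parallel function-calling pattern from the list above, the sketch below dispatches a batch of tool calls concurrently with `concurrent.futures`. The tool names and the JSON call format are invented for this example; the framework's own wire format and registration API may differ.

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Sketch of parallel function calling: the model emits a JSON list of tool
# calls, and each call is executed concurrently. Format is illustrative only.

def add(a, b):
    return a + b

def mul(a, b):
    return a * b

TOOLS = {"add": add, "mul": mul}

calls = json.loads('[{"name": "add", "args": {"a": 2, "b": 3}},'
                   ' {"name": "mul", "args": {"a": 4, "b": 5}}]')

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(TOOLS[c["name"]], **c["args"]) for c in calls]
    results = [f.result() for f in futures]
print(results)  # [5, 20]
```

Running independent tool calls in parallel like this, rather than one after another, is what distinguishes parallel function calling from plain single-call execution.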
Additional Information
Predefined Messages Formatter
The framework provides predefined message formatters in styles such as MISTRAL and CHATML.
Creating Custom Messages Formatter
Users can create custom message formatters by instantiating the MessagesFormatter class with the desired parameters.
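To show what a messages formatter does conceptually, here is a simplified ChatML-style stand-in: it turns a list of role-tagged messages into a single prompt string. This class is a conceptual sketch only; the real MessagesFormatter constructor takes its own set of parameters, which are documented by the project.

```python
# Simplified stand-in for a ChatML-style messages formatter
# (NOT the library's actual MessagesFormatter class).

class SimpleChatMLFormatter:
    def format(self, messages):
        parts = []
        for m in messages:
            parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
        parts.append("<|im_start|>assistant\n")  # cue the model to reply
        return "\n".join(parts)

prompt = SimpleChatMLFormatter().format(
    [{"role": "system", "content": "You are helpful."},
     {"role": "user", "content": "Hi!"}])
print(prompt)
```

A custom formatter is useful when a model was fine-tuned on a prompt template that none of the predefined styles match.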
Contributing
Contributions to the llama-cpp-agent framework are welcome. Interested contributors should adhere to the project's guidelines, which include forking the repository, maintaining code style, and thoroughly testing changes before submission. Detailed guidelines can be found on the project's GitHub page.
License
The llama-cpp-agent framework is available under the MIT License, allowing for broad usage and contribution.
FAQ
Common questions include installation tips for optional dependencies, guidelines for contribution, and compatibility assurances with the latest versions of related software. Further details are provided in the project's FAQ section.