Introducing BricksLLM: AI Gateway For Deploying LLMs in Production
Overview
BricksLLM is a powerful cloud-native AI gateway developed in Go, designed to streamline the deployment and management of Large Language Models (LLMs) in production environments. This versatile platform offers native support for popular AI models and providers such as OpenAI, Anthropic, Azure OpenAI, and vLLM, making it a valuable tool for businesses aiming to harness the power of AI on an enterprise scale.
Key Features
BricksLLM stands out with an impressive range of features that cater to different business needs:
- PII Detection and Masking: Automatically identify and obscure personally identifiable information to protect user privacy.
- Rate Limiting: Control the frequency of API requests to manage server load efficiently.
- Cost Control and Analytics: Monitor and limit expenditures associated with API calls and usage.
- Request Analytics: Gain insights into requests to optimize performance and cost.
- Caching and Request Retries: Cut latency and redundant provider calls with response caching, and improve resilience by retrying failed requests.
- Failover Capabilities: Improve service reliability by automatically falling back to alternative models or deployments when a request fails.
- Access Control: Define who can access specific models and endpoints to ensure security and manageability.
- Broad Provider Support: Easily integrate with OpenAI, Anthropic, Azure OpenAI, and vLLM, as well as other custom models and deployments.
- Datadog Integration: Seamlessly connect with Datadog for advanced monitoring and logging.
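A quick way to see what the gateway adds: the only client-side change is the base URL and the bearer token, while rate limiting, cost caps, PII masking, and analytics happen transparently at the proxy. As an illustrative sketch (the BricksLLM URL and key come from the Getting Started walkthrough below), a direct OpenAI call and its proxied equivalent differ only in those two places:

# Direct call to OpenAI
curl https://api.openai.com/v1/chat/completions \
   -H "Authorization: Bearer $OPENAI_API_KEY" \
   -H "Content-Type: application/json" \
   -d '{ "model": "gpt-3.5-turbo", "messages": [{ "role": "user", "content": "hi" }] }'

# The same call routed through the BricksLLM proxy (see step 6 below)
curl http://localhost:8002/api/providers/openai/v1/chat/completions \
   -H "Authorization: Bearer my-secret-key" \
   -H "Content-Type: application/json" \
   -d '{ "model": "gpt-3.5-turbo", "messages": [{ "role": "user", "content": "hi" }] }'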
Getting Started
BricksLLM is designed to be user-friendly, and getting started is straightforward:
1. Clone the Repository: Start by cloning the BricksLLM-Docker repository to your local system.
git clone https://github.com/bricks-cloud/BricksLLM-Docker
2. Navigate to the Directory: Switch into the BricksLLM-Docker directory.
cd BricksLLM-Docker
3. Local Deployment: Deploy BricksLLM locally using Docker, complete with PostgreSQL and Redis for data management.
docker compose up
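Before moving on, it is worth confirming that the gateway and its PostgreSQL and Redis dependencies came up cleanly; a standard Docker check (nothing BricksLLM-specific) suffices:

# List the services started by the compose file along with their status
docker compose ps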
4. Provider Setting Configuration: Configure a provider setting with the necessary details, such as your API key.
curl -X PUT http://localhost:8001/api/provider-settings \
   -H "Content-Type: application/json" \
   -d '{
          "provider": "openai",
          "setting": {
             "apikey": "YOUR_OPENAI_KEY"
          }
       }'
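The admin server should respond with the stored setting, including a generated id; that id is what the settingIds field in the next step (shown as ID_FROM_STEP_FOUR) refers to. The exact fields may vary by version, but the response is shaped roughly like this (id and timestamps illustrative):

{
    "id": "98daa3ae-961d-4253-bf6a-322a32fdca3d",
    "createdAt": 1693764982,
    "updatedAt": 1693764982,
    "provider": "openai"
}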
5. API Key Creation: Create an API key with limits on request rate and cost, referencing the provider setting id returned in step 4.
curl -X PUT http://localhost:8001/api/key-management/keys \
   -H "Content-Type: application/json" \
   -d '{
          "name": "My Secret Key",
          "key": "my-secret-key",
          "tags": ["mykey"],
          "settingIds": ["ID_FROM_STEP_FOUR"],
          "rateLimitOverTime": 2,
          "rateLimitUnit": "m",
          "costLimitInUsd": 0.25
       }'
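With this configuration, requests made with my-secret-key are capped at 2 per minute (rateLimitOverTime: 2 with rateLimitUnit: "m"), and the key stops working once $0.25 of usage has accrued (costLimitInUsd: 0.25). To double-check what was created, the admin API can be queried by tag; this GET endpoint is an assumption based on the key-management path used above:

# Assumed endpoint: list keys carrying the "mykey" tag
curl -X GET "http://localhost:8001/api/key-management/keys?tag=mykey"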
6. Request Redirection: Direct your requests through the BricksLLM proxy, using the key created in step 5 as the bearer token, for seamless interaction with AI models.
curl -X POST http://localhost:8002/api/providers/openai/v1/chat/completions \
   -H "Authorization: Bearer my-secret-key" \
   -H "Content-Type: application/json" \
   -d '{
          "model": "gpt-3.5-turbo",
          "messages": [
              {
                  "role": "system",
                  "content": "hi"
              }
          ]
       }'
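Because BricksLLM proxies the request, a successful call returns the provider's usual response format; for OpenAI chat completions that looks roughly like the following (id and token counts illustrative):

{
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "model": "gpt-3.5-turbo",
    "choices": [
        {
            "index": 0,
            "message": { "role": "assistant", "content": "Hello! How can I help you today?" },
            "finish_reason": "stop"
        }
    ],
    "usage": { "prompt_tokens": 8, "completion_tokens": 9, "total_tokens": 17 }
}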
Keeping BricksLLM Updated
To ensure the optimal performance of BricksLLM, regularly update to the latest version:
- Pull the latest version:
docker pull luyuanxin1995/bricksllm:latest
- Pull a specific version (for example, 1.4.0):
docker pull luyuanxin1995/bricksllm:1.4.0
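Note that pulling a new image does not restart running containers; assuming your compose file references the bricksllm image you just pulled, recreate the stack to pick up the update:

docker compose up -d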
Documentation
For a deeper dive into the technical setup, environment variables, and more advanced configurations, BricksLLM provides comprehensive documentation. It covers configuring the environment, working with the admin server, and using the proxy server, giving users the resources they need to manage their LLM deployments effectively.