Introduction to SimpleAI
SimpleAI is a self-hosted alternative to the conventional, less open AI APIs available today. It provides a versatile platform for experimenting with various large language models (LLMs). SimpleAI replicates the key endpoints commonly used in LLM workflows, making it a valuable tool for developers who want a capable AI service without the constraints of hosted providers.
Key Features
SimpleAI focuses on reproducing four main endpoints:
- Text Completion (/completions): streaming and non-streaming responses for dynamic text generation.
- Chat (/chat/completions): dialogue and conversational AI, with both streaming and non-streaming responses.
- Edits (/edits): modification and enhancement of text inputs based on given instructions.
- Embeddings (/embeddings): creation of text embeddings for various applications.
However, it currently does not support the image, audio, file handling, fine-tuning, or moderation endpoints.
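The request bodies for these endpoints follow the OpenAI API that SimpleAI replicates. As an illustrative sketch (the model name and field values below are placeholders; the exact fields a deployment accepts depend on the model behind each endpoint):

```python
# Illustrative request-body shapes for the four mirrored endpoints.
# Field names follow the OpenAI API; values are placeholders.
completion_req = {"model": "llama-7B", "prompt": "Hello everyone this is", "stream": False}
chat_req = {"model": "llama-7B", "messages": [{"role": "user", "content": "Hi!"}]}
edit_req = {"model": "llama-7B", "instruction": "Fix the spelling", "input": "Helo wrld"}
embedding_req = {"model": "llama-7B", "input": "text to embed"}
```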
Why SimpleAI?
SimpleAI was created as a fun and productive project with several practical advantages:
- Independence and Flexibility: It allows users to experiment with new models without dependency on specific API providers.
- Benchmarking: Users can create benchmarks to assess and identify the best performing models or approaches.
- Use-Case Specific: It supports scenarios where reliance on external services is not feasible, thus allowing customization without the need for complete rewriting.
SimpleAI encourages users to explore different applications and share their unique findings.
Installation and Setup
To get started with SimpleAI, users need Python version 3.9 or higher. Installation can be done either from source or via PyPI:
- From source: pip install git+https://github.com/lhenault/simpleAI
- From PyPI: pip install simple_ai_server
After installing, users should create a configuration file declaring their models with simple_ai init, which generates a models.toml file. The server can then be started with:
simple_ai serve [--host 127.0.0.1] [--port 8080]
Model Integration and Declaration
SimpleAI uses gRPC for model queries, making it accessible from different programming languages. Users can integrate models by implementing task-specific methods such as .embed() for embedding tasks. To declare a new model, deploy its gRPC service and configure it in the models.toml file. Here is a sample configuration for a locally deployed LLaMA model:
[llama-7B-4b]
[llama-7B-4b.metadata]
owned_by = 'Meta / ggerganov'
permission = []
description = 'C++ implementation of LlaMA model, 7B parameters, 4-bit quantization'
[llama-7B-4b.network]
url = 'localhost:50051'
type = 'gRPC'
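On the model side, the object wrapped by the gRPC service just needs to expose the relevant method. A hedged sketch with a dummy class (the names and vectors below are illustrative stand-ins, not SimpleAI's actual interface):

```python
from dataclasses import dataclass

@dataclass
class DummyEmbeddingModel:
    """Illustrative stand-in for a real embedding model served over gRPC."""
    dim: int = 4

    def embed(self, texts: list[str]) -> list[list[float]]:
        # A real model would run inference; this maps each text to a toy vector.
        return [[float(len(t) % 7)] * self.dim for t in texts]

model = DummyEmbeddingModel()
vectors = model.embed(["hello", "world"])  # two 4-dimensional vectors
```

A gRPC servicer backing the /embeddings endpoint would then forward each incoming request to model.embed().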
Usage
SimpleAI ships with a Swagger UI, making it easy to explore the various endpoints. Users can interact with the API using familiar tools such as cURL or the OpenAI Python client:
import openai

# Point the (pre-1.0) OpenAI Python client at the local SimpleAI server
openai.api_key = 'Free the models'
openai.api_base = "http://127.0.0.1:8080"

# Example of a completion request
completion = openai.Completion.create(model="llama-7B", prompt="Hello everyone this is")
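The server can also be reached without the OpenAI client. A sketch that builds the raw HTTP request for the chat endpoint using only the standard library (the payload shape follows the OpenAI chat API that SimpleAI mirrors); the request is constructed but not sent:

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:8080"  # the address the server was started on

body = json.dumps({
    "model": "llama-7B",
    "messages": [{"role": "user", "content": "Hello everyone this is"}],
}).encode("utf-8")

req = urllib.request.Request(
    f"{API_BASE}/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it once the server is running.
```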
Troubleshooting Common Issues
Common issues include missing CORS headers when calling the API from a browser and the need to serve the endpoints under a different prefix. SimpleAI addresses both by letting users extend the server with a small custom script.
Contribution
SimpleAI is a work in progress, open to contributions from the community. Whether it's code, documentation, or creative inputs like logos, any help is appreciated. Developers can set up their development environment using make and poetry:
make install-dev
This command installs all necessary development dependencies and configures pre-commit helpers, facilitating an efficient workflow for contributors.
In summary, SimpleAI offers a flexible and open environment for testing and deploying AI models, appealing to both professional developers and hobbyists interested in AI experimentation. Its self-hosted nature provides users with increased control over AI integrations, making it a unique and attractive option in the landscape of AI technologies.