Introducing the API for Open LLMs Project
The API for Open LLMs is a comprehensive initiative designed to provide a unified backend interface for open-source large language models (LLMs), maintaining compatibility with OpenAI's response standards. This project is ideal for developers and researchers looking to integrate diverse open-source models with minimal hassle, mimicking the well-known OpenAI ChatGPT API structure.
Recent Updates
The project continues to evolve with regular updates and support for new models:
- As of June 13, 2024, the project supports the MiniCPM-Llama3-V-2_5 model, which requires modifications to specific environment variables for seamless integration.
- On June 12, 2024, support was added for the GLM-4V model, featuring necessary environment variable changes for proper setup.
- With the introduction of the Qwen2 model on June 8, 2024, users can easily switch to this model through designated environment variable adjustments.
- Further back on June 5, 2024, the GLM4 model was supported with similar environment modifications.
- Earlier updates in April 2024 included support for Code Qwen and Rerank models, enhancing the project's flexibility and utility.
For a detailed timeline and more updates, please visit the news section.
Key Features
The project's main goal is to streamline the use of large language models by providing a robust API interface that feels familiar to those using OpenAI's tools. Here are some of the stand-out features:
- OpenAI-Like API: Users can interact with a range of open-source models via an API that mirrors the OpenAI ChatGPT interface.
- Streamed Response Support: Responses can be streamed token by token for a real-time, typewriter-style output, enhancing interactivity.
- Text Embedding Models: The API supports embedding models for document-level knowledge Q&A, widening application possibilities.
- LangChain Compatibility: Developers can leverage this powerful library's functionalities to build complex applications.
- Easy Model Switch: By simply altering environment variables, users can swap out models in a way that suits various applications, effectively substituting these models for ChatGPT.
- Custom Model Support: The platform supports the integration of custom-trained LoRA models, allowing personalized enhancements.
- Advanced Inference and Concurrency: With vLLM support, this project aids in accelerated inference and handles concurrent requests efficiently.
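The model-switching idea above can be sketched as follows. Note that the environment variable names (MODEL_NAME, MODEL_PATH) and the model values are illustrative assumptions; consult the project's documentation for the actual names it reads.

```python
import os

# Hypothetical backend environment variables -- the actual names used by
# the project may differ; check its documentation before relying on them.
os.environ["MODEL_NAME"] = "qwen2"          # which open-source model to serve
os.environ["MODEL_PATH"] = "/models/qwen2"  # local path to the model weights

def active_model() -> str:
    """Return the model currently selected via the environment."""
    return os.environ.get("MODEL_NAME", "chatglm4")

print(active_model())  # prints: qwen2
```

Because the model is chosen purely through the environment, swapping backends requires no client-side code changes.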
Supported Models
The project supports an array of models with varying complexities and parameter sizes. This includes popular names like Baichuan, ChatGLM, DeepSeek, LLaMA, Qwen, and Yi, among others. Users can find more detailed information on supported models and their start-up processes in the project's documentation.
How to Use
The setup process involves configuring environment variables such as OPENAI_API_KEY and OPENAI_API_BASE to point to the desired backend URL. Once set up, users can run a streamlined chat interface through Streamlit, a web app framework, providing a user-friendly way to interact with the models.
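As a minimal sketch, the client-side configuration might look like the following; the URL, key, and Streamlit script name are placeholders for your own deployment, not values defined by the project:

```python
import os

# Placeholder values -- replace with your own backend URL and key.
os.environ["OPENAI_API_BASE"] = "http://localhost:8000/v1"
os.environ["OPENAI_API_KEY"] = "xxx"  # self-hosted backends often accept any token

# The Streamlit chat UI would then be launched from the shell, e.g.:
#   streamlit run streamlit_app.py
# (the script name here is illustrative; see the project docs for the entry point)

print(os.environ["OPENAI_API_BASE"])
```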
For code-based interactions, users can utilize the Python openai library to execute different functionalities such as chat completions, text completions, and computing embeddings. Detailed usage examples can be found in the linked sections of the project's documentation.
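The following standard-library sketch shows the shape of the requests an OpenAI-compatible client sends under the hood, assuming the conventional /chat/completions and /embeddings endpoint paths; the model names ("qwen2", "m3e-base") are placeholders, and no network call is made here:

```python
import json
import os
import urllib.request

API_BASE = os.environ.get("OPENAI_API_BASE", "http://localhost:8000/v1")
API_KEY = os.environ.get("OPENAI_API_KEY", "xxx")

def _post(path: str, payload: dict) -> dict:
    """POST a JSON payload to the backend (requires a running server)."""
    req = urllib.request.Request(
        f"{API_BASE}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {API_KEY}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def chat_payload(model: str, prompt: str) -> dict:
    """Request body for an OpenAI-style /chat/completions call."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def embedding_payload(model: str, text: str) -> dict:
    """Request body for an OpenAI-style /embeddings call."""
    return {"model": model, "input": text}

# Build example payloads (no server needed); model names are placeholders.
chat = chat_payload("qwen2", "Hello!")
emb = embedding_payload("m3e-base", "document chunk to embed")
```

With a backend running, `_post("/chat/completions", chat)` would return a response in the same JSON shape the official OpenAI API uses, which is what lets existing ChatGPT clients work unmodified.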
Integration with Other Projects
One significant advantage of this API setup is its ability to fit into existing ChatGPT applications and interfaces with minimal configuration changes. Users can modify the OPENAI_API_BASE environment variable to integrate with:
- ChatGPT-Next-Web: A straightforward deployment of a well-designed ChatGPT web interface.
- Dify: A platform simplifying LLMOps, allowing seamless creation of AI-native applications.
Licensing and Contributions
The project is licensed under the Apache 2.0 License, encouraging wide usage and contribution. For more insights into licensing, refer to the LICENSE file.
Acknowledgments and References
This project draws inspiration from numerous pioneering works in the field of large language models and AI research, empowering users with state-of-the-art conversational AI capabilities. Visit the project page for a comprehensive list of references.
In summary, the API for Open LLMs provides a versatile and user-friendly platform for deploying and interacting with a variety of open-source large language models. Its design ensures ease of integration with existing AI applications, making it a valuable tool for developers and researchers alike.