Introduction to the Edgen Project
Edgen is a local GenAI API server that serves as a drop-in replacement for OpenAI's API. It exposes multiple AI endpoints, giving users a platform for deploying generative AI on local devices securely, privately, and at no cost.
Key Features of Edgen
- OpenAI Compliant API: Edgen is designed to be compatible with OpenAI's API, so users already familiar with OpenAI services can switch with little or no code change.
- Multi-Endpoint Support: It provides multiple AI endpoints, including chat completions with large language models (LLMs) and speech-to-text transcription with Whisper models.
- Model Agnostic: Edgen is compatible with a range of models, such as Llama 2 and Mistral, giving users flexibility in their choice of AI model.
- Optimized Inference: Users don't need advanced AI expertise to optimize inference; Edgen handles hardware and platform compatibility for optimal performance.
- Modular Approach: Its modular design makes it easy to add new models while selecting the best runtime configuration for the user's hardware.
- Model Caching: Foundational models are cached locally, so one model can power many applications without repeated downloads.
- Native Compatibility: Built in Rust, Edgen compiles natively for major platforms (Windows, macOS, and Linux) without requiring Docker.
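Because Edgen mirrors OpenAI's API, a client can target it simply by pointing OpenAI-style requests at a local base URL. The sketch below builds a chat-completions request body and shows how it would be posted; the host, port, and model name are placeholder assumptions, not confirmed Edgen defaults, so check the project documentation for the actual values.

```python
import json
import urllib.request

# Hypothetical local endpoint -- host, port, and model name are placeholders,
# not confirmed Edgen defaults; consult the Edgen docs for the real values.
EDGEN_BASE_URL = "http://localhost:33322/v1"


def build_chat_request(messages, model="default"):
    """Build an OpenAI-style chat-completions request body."""
    return {"model": model, "messages": messages}


def post_chat(messages, model="default"):
    """Send the request to a running Edgen server (requires Edgen running)."""
    body = json.dumps(build_chat_request(messages, model)).encode("utf-8")
    req = urllib.request.Request(
        f"{EDGEN_BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Only builds the payload; post_chat would need a running server.
    payload = build_chat_request([{"role": "user", "content": "Hello, Edgen!"}])
    print(json.dumps(payload))
```

Since the request and response shapes match OpenAI's, existing OpenAI client libraries should also work once their base URL is redirected to the local server.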
Usage and Setup
To start using Edgen, users simply download the application, which runs locally on their existing hardware. It provides a robust platform for developing and deploying generative AI applications while ensuring data privacy as all processes occur on-device. Edgen eliminates the necessity for additional cloud infrastructure, making it a scalable, reliable, and free solution.
Endpoints Provided by Edgen
- Chat Completions: Offers conversation and chat completion functionalities.
- Audio Transcriptions: Converts speech to text efficiently.
- Embeddings: Facilitates processes like semantic search and recommendations.
- Forthcoming Features: Image generation, multimodal chat completions, and additional audio and speech capabilities are under development.
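The embeddings endpoint returns vectors that power semantic search: documents are ranked by cosine similarity between their embedding and the query's embedding. The sketch below shows only that ranking step, using toy vectors in place of real Edgen responses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for embeddings returned by Edgen.
    query = [1.0, 0.0]
    docs = [[0.0, 1.0], [0.9, 0.1], [0.5, 0.5]]
    print(rank_by_similarity(query, docs))  # → [1, 2, 0]
```

In a real pipeline, the query and document vectors would come from the embeddings endpoint; the similarity math itself is independent of which model produced them.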
Supported Platforms and Models
Edgen supports Windows, Linux, and macOS, ensuring broad accessibility and compatibility. The documentation provides a detailed list of supported AI models to guide users through their options.
GPU Support
For advanced performance, Edgen supports GPU compilation and execution through Vulkan, CUDA, and Metal, enhancing model processing speed and efficiency. Detailed instructions on enabling these features are available in the project documentation.
Benefits of Local GenAI
- Data Privacy: Users’ data is processed locally, avoiding exposure to external servers.
- Scalability: Users scale their AI solutions without cloud-based limitations.
- Reliability: Operates without internet dependency, removing issues like downtime and rate limits.
- Cost Efficiency: Runs on existing hardware, eliminating cloud usage fees and additional infrastructure costs.
Getting Started
Users are encouraged to download Edgen and explore its capabilities through EdgenChat, a local chat application. With comprehensive documentation and community support, users can seamlessly integrate generative AI into their applications.
Edgen’s commitment to an open-source approach invites contributions and collaboration via its roadmap and communication channels, including Discord and GitHub. The project builds on foundational libraries such as llama.cpp, whisper.cpp, and ggml, which underpin its progress in the AI space.