modelz-llm
Modelz LLM is an inference server for running open source large language models such as FastChat, LLaMA, and ChatGLM, either locally or in the cloud. It exposes an OpenAI-compatible API, so existing clients such as the OpenAI Python SDK or LangChain can talk to it directly. Docker images are provided for deployment on Kubernetes and other cloud platforms, and the server offers API endpoints for tasks such as completions and embeddings.
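As a rough sketch of how the OpenAI-compatible API can be consumed, the snippet below points the OpenAI Python SDK at a locally running server. The base URL, the placeholder API key, and the model name are illustrative assumptions, not values confirmed by this description; adjust them to match your deployment.

```python
# Minimal sketch: talking to a local Modelz LLM server through the
# OpenAI Python SDK (>= 1.0). The base URL, API key, and model name
# are assumptions for illustration only.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000",  # assumed local Modelz LLM endpoint
    api_key="any",                     # placeholder; a local server may not validate it
)

# Chat completion served by the locally hosted model
chat = client.chat.completions.create(
    model="any",  # assumed: the server resolves the loaded model
    messages=[{"role": "user", "content": "Hello, who are you?"}],
)
print(chat.choices[0].message.content)

# Embeddings from the same OpenAI-compatible endpoint
emb = client.embeddings.create(
    model="any",
    input="Modelz LLM serves open source models behind an OpenAI-compatible API.",
)
print(len(emb.data[0].embedding))
```

Because the API surface mirrors OpenAI's, the same pattern applies to LangChain or any other OpenAI-compatible client by overriding the base URL.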