Sensei Search: An Overview
Sensei Search is an AI answer engine that uses large language models (LLMs) to deliver direct, relevant answers to user queries, with the goal of making information retrieval faster and more reliable.
Visual Overview
Sensei Search offers both light and dark modes to suit user preferences, as shown in the project's screenshots.
Insights from Open Source Models
Building with open-source LLMs yielded a number of practical lessons. These are discussed in a Reddit post on building an open-source AI with LLMs, which is a useful read for anyone weighing the benefits and challenges of using open models in their own projects.
Technical Infrastructure
Sensei Search is built on the following stack (a minimal sketch of how the pieces fit together follows the list):
- Frontend Development: Next.js and Tailwind CSS.
- Backend Development: FastAPI and the OpenAI client.
- Language Models (LLMs): Command-R, Qwen2-72B-Instruct, WizardLM-2 8x22B, Claude Haiku, and GPT-3.5-turbo.
- Search Capability: SearxNG and Bing.
- Memory Storage: Redis, used for conversation memory.
- Deployment Platforms: AWS, with Paka streamlining provisioning and deployment.
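As a rough illustration of how these pieces can fit together, the sketch below wires FastAPI, the OpenAI client (pointed at an OpenAI-compatible server such as Ollama's), and Redis-backed conversation memory. The endpoint path, base URL, model name, and Redis key layout are illustrative assumptions, not the project's actual code.

```python
# Minimal sketch of an answer-engine backend: FastAPI + OpenAI client + Redis.
# All names below (route, base URL, key layout) are assumptions for illustration.
import json

import redis
from fastapi import FastAPI
from openai import OpenAI

app = FastAPI()

# OpenAI client pointed at an OpenAI-compatible server, e.g. Ollama's /v1
# endpoint when running models locally. The base URL is an assumption.
llm = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Redis as short-term conversation memory (hypothetical key layout).
memory = redis.Redis(host="localhost", port=6379, decode_responses=True)


@app.get("/answer")
def answer(session_id: str, question: str) -> dict:
    # Load prior turns for this session, if any.
    history = [json.loads(m) for m in memory.lrange(f"chat:{session_id}", 0, -1)]
    messages = history + [{"role": "user", "content": question}]

    completion = llm.chat.completions.create(
        model="command-r",  # any model the backend serves
        messages=messages,
    )
    reply = completion.choices[0].message.content

    # Persist both turns so follow-up questions keep their context.
    memory.rpush(f"chat:{session_id}", json.dumps(messages[-1]))
    memory.rpush(f"chat:{session_id}", json.dumps({"role": "assistant", "content": reply}))
    return {"answer": reply}
```

Because the OpenAI client works unchanged against hosted APIs and OpenAI-compatible local servers, a design like this makes it straightforward to swap between models such as Command-R and GPT-3.5-turbo.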
Running Sensei Search
Sensei Search can run on local infrastructure or be deployed to the cloud, depending on your requirements.
Local Deployment
To run Sensei Search locally, follow these steps (a sketch of the commands follows the list):
- Set up the backend environment by configuring the `.env.development` file. An example configuration that supports running models through Ollama is provided.
- Launch the application using Docker Compose.
- Access the application in a web browser at `http://localhost:3000`.
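A minimal sketch of those steps from a shell, assuming the repository ships a Docker Compose file at its root and an example env file alongside the backend (the file names here are assumptions):

```bash
# 1. Configure the backend; the example file name is an assumption
cd backend
cp .env.development.example .env.development
# edit .env.development to point at your Ollama server and chosen model

# 2. Build and start the services from the repository root
cd ..
docker compose up --build

# 3. Open the app in a browser at http://localhost:3000
```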
Note: A capable GPU is recommended for larger models such as Command-R.
Cloud Deployment
Cloud deployment targets AWS, with the process streamlined by Paka. Before deploying, ensure your AWS account has sufficient GPU quota.
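You can check your current quota with the AWS CLI before starting; the quota code below covers On-Demand G and VT (GPU) instances, so verify it matches the instance family you plan to use:

```bash
# vCPU quota for On-Demand G and VT (GPU) instances in the target region
aws service-quotas get-service-quota \
  --service-code ec2 \
  --quota-code L-DB2E81BA \
  --region us-west-2
```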
To deploy in the cloud (a sketch of the commands follows the steps):
- Install Paka to handle cloud deployments.
- Provision and configure the AWS cluster as described in `cluster.yaml`, making sure to supply the required Hugging Face token for model access.
- Deploy the backend and frontend services, in that order.
- Retrieve the cloud application's URL and access it through a browser.
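Put together, the workflow might look like the sketch below. Paka is installable from PyPI, but the exact subcommands and flags should be checked against its documentation; the invocations here are assumptions for illustration:

```bash
# Install Paka
pip install paka

# Provision the AWS cluster defined in cluster.yaml
# (a Hugging Face token is needed for model downloads; variable name assumed)
export HF_TOKEN=<your-hugging-face-token>
paka cluster up -f cluster.yaml

# Deploy the backend, then the frontend
# (deployment subcommands are illustrative; see the Paka docs)

# Finally, retrieve the application's URL and open it in a browser
```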
With both paths available, teams can run Sensei Search on local hardware or on AWS, adapting the same application to either environment.