searchGPT - Explore an Open Source RAG-Powered Search Engine Providing Natural Language Answers

searchGPT: A Comprehensive Guide

searchGPT is an open-source project that uses Large Language Model (LLM) technology to make search engines more useful by answering queries in natural language. It is a minimal implementation in the spirit of the new Bing, focused on search and question answering.

Overview

searchGPT can ground its answers in both web search results and file content, such as documents and presentations. This lets users combine conventional search with LLM-generated answers in one place.

[Image: web UI]
[Image: explainability]

Key Features

Source Capabilities:
- Real-Time Web Search: Users can obtain up-to-the-minute information from the web.
- File Content Search: Supports searches from various file formats, including PPT, DOC, and PDF.
Semantic Search: Uses FAISS and PyTerrier for semantic search.
LLM Integration: Uses language models from OpenAI and GooseAI to generate answers.
User-Friendly Interface: Provides a simple web frontend for easy navigation.
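
To make the semantic search feature more concrete, here is a minimal sketch of the ranking step behind it. It is not searchGPT's code: the project uses FAISS and PyTerrier, whereas this example swaps in a toy bag-of-words vector and a brute-force cosine-similarity scan so that it runs with the standard library alone. The function names (`embed`, `cosine`, `top_k`) and the sample passages are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, passages: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Return the k passages most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical passages, e.g. chunks taken from web results or uploaded files.
passages = [
    "A ghost kitchen prepares food for delivery only and has no dining room.",
    "FAISS is a library for efficient similarity search over dense vectors.",
    "The weather in Paris is mild in spring.",
]

for score, passage in top_k("what is a ghost kitchen", passages):
    print(f"{score:.2f}  {passage}")
```

A real deployment would likely replace `embed` with a learned embedding model and the linear scan with an index, which is the role a library such as FAISS plays.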

Experience the Demo

You can try a demo of searchGPT at https://searchgpt-demo.herokuapp.com/index. Loading may take around 10 seconds, and please do not overload the service with automated requests.

Architecture and Evolution

[Image: architecture and roadmap]

searchGPT's architecture is built around its retrieval-based search methods. The project roadmap outlines planned enhancements, focusing on improving real-time factual accuracy, among other objectives.

Why Use RAG?

The Retrieval-Augmented Generation (RAG) approach addresses a core limitation of LLMs: a model cannot learn all current information during training. RAG retrieves up-to-date information at query time and has the model answer with reference to it, which grounds responses in sources and improves factual accuracy.

For example, when asked about a term such as "ghost kitchen," a RAG-based answer can give detailed, contextual information drawn from retrieved sources rather than a fabricated description.
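
The retrieve-then-answer idea can be sketched as a prompt-assembly step. This is an illustrative example under stated assumptions, not the prompt searchGPT actually uses: the `Source` class, the `build_prompt` function, the instruction wording, and the sample snippet are all hypothetical. The LLM call itself is left out.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A retrieved snippet; the field names are illustrative assumptions."""
    title: str
    url: str
    text: str


def build_prompt(question: str, sources: list[Source], max_chars: int = 500) -> str:
    """Assemble a grounded prompt: numbered sources first, then the question."""
    lines = [
        "Answer the question using only the numbered sources below.",
        "Cite sources as [n]. If the sources do not contain the answer, say so.",
        "",
    ]
    for n, src in enumerate(sources, start=1):
        snippet = src.text[:max_chars]  # truncate each snippet to a size budget
        lines.append(f"[{n}] {src.title} ({src.url})")
        lines.append(snippet)
        lines.append("")
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


# Made-up sample source standing in for a real search result.
sources = [
    Source(
        title="Example article",
        url="https://example.com/ghost-kitchens",
        text="A ghost kitchen is a food-preparation facility that serves delivery orders only.",
    ),
]

prompt = build_prompt("What is a ghost kitchen?", sources)
print(prompt)
# The prompt string would then be sent to an LLM completion endpoint (not shown).
```

Numbering the sources lets the model cite them, so a reader can trace each claim in the answer back to the passage it came from.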

Getting Started

To start with searchGPT, you need:

Python 3.10.8: Ensure your system supports the specified Python version.
API Keys: Obtain OpenAI or GooseAI API keys, alongside an Azure Bing Search Subscription Key.
- OpenAI provides a starting balance, which allows for extensive usage before charges.
- Azure Bing offers a free tier for basic usage.

Installation Steps

Set Up Environment: Create an environment using Python or Anaconda and install the necessary Python packages.

Using native Python:

# Ensure you are using python=3.10.8
pip install -r requirements.txt

With Anaconda:

conda create --name searchgpt python=3.10.8
conda activate searchgpt
pip install -r requirements.txt

Configure API Keys: Input the OpenAI and Azure Bing Search API keys into the config.yaml file within the backend configuration.
Initiate Application: Run app.py or flask_app.py for the web app, or execute main.py for a quick test with output in the console.
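
Before starting the application, it can help to check the prerequisites listed above. The following is a small preflight sketch, not part of searchGPT: it verifies the Python major.minor version and reports blank API keys in a flat mapping. The key names `openai_api_key` and `bing_search_subscription_key` are hypothetical placeholders, so consult config.yaml for the names the project actually uses.

```python
import sys

REQUIRED_PYTHON = (3, 10)  # the guide pins 3.10.8; this sketch checks major.minor only

# Hypothetical key names; config.yaml defines the names the project really uses.
REQUIRED_KEYS = ("openai_api_key", "bing_search_subscription_key")


def python_ok(version: tuple[int, int] = sys.version_info[:2]) -> bool:
    """True if the given (major, minor) version matches the required one."""
    return tuple(version) == REQUIRED_PYTHON


def missing_keys(config: dict[str, str]) -> list[str]:
    """Return required keys that are absent or blank in a flat config mapping."""
    return [k for k in REQUIRED_KEYS if not str(config.get(k) or "").strip()]


def preflight(config: dict[str, str]) -> list[str]:
    """Collect human-readable problems; an empty list means ready to run."""
    problems = []
    if not python_ok():
        found = ".".join(map(str, sys.version_info[:3]))
        problems.append(f"Python 3.10.x expected, found {found}")
    for key in missing_keys(config):
        problems.append(f"Missing API key: {key}")
    return problems


# Example with a placeholder config; real values would be read from config.yaml.
example = {"openai_api_key": "sk-placeholder", "bing_search_subscription_key": ""}
for problem in preflight(example):
    print(problem)
```

Returning a list of problems instead of raising on the first one lets a user fix every issue in a single pass.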

Contribution Opportunities

searchGPT welcomes contributions, especially from developers with experience in frontend technologies. Those interested should consult the project's contribution guidelines for more information on how to get involved.

Licensing Information

The searchGPT project is distributed under the MIT License, allowing for widespread use and adaptation.

In summary, searchGPT combines real-time search with LLM-generated answers to give users direct, informative responses to their queries.