searchGPT: A Comprehensive Guide
searchGPT is an open-source project that builds search-engine functionality on top of Large Language Model (LLM) technology. It answers queries in natural language and is a minimal implementation, akin to the new Bing, built specifically for search and question answering.
Overview
What sets searchGPT apart is its ability to derive answers from both web search content and file content, such as documents and presentations. This dual-source capability allows users to experience a seamless fusion of traditional search engine capabilities with advanced AI responses.
Key Features
- Source Capabilities:
  - Real-Time Web Search: Users can obtain up-to-the-minute information from the web.
  - File Content Search: Supports searches from various file formats, including PPT, DOC, and PDF.
- Semantic Search: Employs advanced tools like FAISS and PyTerrier for effective semantic search results.
- LLM Integration: Incorporates sophisticated language models from OpenAI and GooseAI to deliver accurate and meaningful responses.
- User-Friendly Interface: The application is built with an intuitive frontend, ensuring easy navigation and a hassle-free user experience.
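To make the semantic search feature more concrete, here is a minimal sketch of similarity-based passage ranking. It is not the project's implementation: searchGPT relies on FAISS and PyTerrier, whereas this stand-in uses plain bag-of-words vectors and cosine similarity from the standard library. The function names (vectorize, cosine, rank_passages) and the sample passages are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words vector (lowercased token counts)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    # An empty vector has no direction, so treat its similarity as zero.
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_passages(query: str, passages: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Return the top_k passages most similar to the query, best first."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(p)), p) for p in passages]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical passages standing in for web or file search results.
    passages = [
        "A ghost kitchen prepares food for delivery only.",
        "FAISS is a library for vector similarity search.",
        "Bananas are yellow.",
    ]
    for score, passage in rank_passages("what is a ghost kitchen", passages):
        print(f"{score:.3f}  {passage}")
```

A production setup would replace the word-count vectors with dense embeddings and an index such as FAISS, but the ranking step (score every candidate against the query, keep the best few) has the same shape.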
Experience the Demo
You can experience a demonstration of searchGPT at https://searchgpt-demo.herokuapp.com/index. Keep in mind that loading might take around 10 seconds, and it's recommended not to overload the system with automated programs.
Architecture and Evolution
searchGPT is structured with a cohesive architecture that facilitates its innovative search methods. The project roadmap outlines future enhancements, focusing on increasing real-time factual accuracy, among other objectives.
Why Use RAG?
The Retrieval-Augmented Generation (RAG) approach matters because an LLM cannot learn all current information during its training. RAG retrieves up-to-date information for the model to reference, which helps keep responses factually grounded.
For example, with RAG the description of a term like "ghost kitchen" contains detailed, contextual information rather than a made-up answer.
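The retrieve-then-generate flow described above can be sketched as follows. This is an illustration under stated assumptions, not searchGPT's actual prompt or code: build_rag_prompt and the prompt wording are hypothetical, and the retrieved snippet is hard-coded where a real pipeline would supply search results. The resulting string would then be sent to an LLM.

```python
def build_rag_prompt(question: str, snippets: list[str]) -> str:
    """Assemble a prompt that asks the model to answer only from numbered sources."""
    if not snippets:
        raise ValueError("RAG needs at least one retrieved snippet")
    # Number the sources so the model can cite them as [1], [2], ...
    sources = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    return (
        "Answer the question using only the sources below. "
        "Cite sources as [n]. If the sources are insufficient, say so.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    # Placeholder snippet; in practice this comes from web or file retrieval.
    snippets = ["A ghost kitchen is a food-preparation facility that serves delivery-only orders."]
    print(build_rag_prompt("What is a ghost kitchen?", snippets))
```

Numbering the sources is a design choice that lets the answer point back to the passage it used, which makes it easier to check the response against what was retrieved.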
Getting Started
To start with searchGPT, you need:
- Python 3.10.8: Ensure your system supports the specified Python version.
- API Keys: Obtain OpenAI or GooseAI API keys, alongside an Azure Bing Search Subscription Key.
  - OpenAI provides a starting balance, which allows for extensive usage before charges.
  - Azure Bing offers a free tier for basic usage.
Installation Steps
- Set Up Environment: Create an environment using Python or Anaconda and install the necessary Python packages.
  Using native Python:
      # Ensure you are using python=3.10.8
      pip install -r requirements.txt
  With Anaconda:
      conda create --name searchgpt python=3.10.8
      conda activate searchgpt
      pip install -r requirements.txt
- Configure API Keys: Input the OpenAI and Azure Bing Search API keys into the config.yaml file within the backend configuration.
- Initiate Application: Run app.py or flask_app.py for the web app, or execute main.py for a quick test with output in the console.
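Before launching the app, it can help to confirm that the keys were actually filled in. The sketch below checks a configuration dictionary for missing or blank values. The key names (openai_api_key, bing_search_subscription_key) are hypothetical placeholders; the real layout of config.yaml may differ, so adjust them to match the file. Loading the YAML itself (for example with PyYAML's yaml.safe_load) is left out so the snippet runs with the standard library only.

```python
# Hypothetical key names; check config.yaml for the names the project really uses.
REQUIRED_KEYS = ("openai_api_key", "bing_search_subscription_key")


def missing_keys(config: dict, required=REQUIRED_KEYS) -> list[str]:
    """Return the required keys that are absent or blank in the config."""
    missing = []
    for key in required:
        value = config.get(key)
        # Treat non-string and whitespace-only values as "not set".
        if not isinstance(value, str) or not value.strip():
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Example config with one key left empty on purpose.
    config = {"openai_api_key": "sk-example", "bing_search_subscription_key": ""}
    problems = missing_keys(config)
    if problems:
        print("Fill in these keys before starting the app:", ", ".join(problems))
    else:
        print("All required keys are set.")
```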
Contribution Opportunities
searchGPT welcomes contributions, especially from developers with experience in frontend technologies. Those interested should consult the project's contribution guidelines for more information on how to get involved.
Licensing Information
The searchGPT project is distributed under the MIT License, allowing for widespread use and adaptation.
In summation, searchGPT represents a significant step forward in search engine technology, offering an engaging and informative experience by leveraging the power of LLMs in real-time search environments.