Introduction to the Private Chatbot MPT-30B Langchain Project
Overview
The Private Chatbot MPT-30B Langchain project lets you chat with your documents privately, with no internet connection required. It uses MPT-30B, an open-source model with an 8k-token context length that reportedly outperforms the original GPT-3. The project runs a quantized version of MPT-30B, so the chatbot can run locally on your own computer.
Requirements
To run the project, your system should have a minimum of 32GB of RAM and be equipped with Python 3.10 or later.
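As a rough way to check a machine against these requirements, here is a minimal sketch. It is not part of the project: the function names are hypothetical, the 10% RAM margin is an assumption (installed memory is usually reported slightly below its nominal size), and RAM detection relies on os.sysconf, which is not available on Windows.

```python
import os
import sys

MIN_PYTHON = (3, 10)  # from the requirements above
# 32GB from the requirements above, minus an assumed 10% margin because
# the operating system usually reports slightly less than the nominal size.
MIN_RAM_BYTES = int(32 * 1024**3 * 0.9)


def meets_requirements(python_version, total_ram_bytes):
    """Return a list of problems; an empty list means the machine qualifies."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.10 or later is required")
    # A RAM value of None means "could not detect", so that check is skipped.
    if total_ram_bytes is not None and total_ram_bytes < MIN_RAM_BYTES:
        problems.append("at least 32GB of RAM is required")
    return problems


def detect_total_ram():
    """Best-effort RAM detection; returns None where os.sysconf is unavailable."""
    try:
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None


if __name__ == "__main__":
    for problem in meets_requirements(sys.version_info, detect_total_ram()):
        print("Requirement not met:", problem)
```

Run as a script, it prints one line per unmet requirement and nothing when both checks pass or when RAM cannot be detected.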
Installation
Here is a step-by-step guide to setting up the project:
- Install Poetry: Poetry is a package manager for Python, and you can install it using the command:
  pip install poetry
- Clone the Repository: Clone the GitHub repository to your local machine. Use the following command, replacing the placeholder with the actual repository URL:
  git clone {insert github repo url}
- Install Project Dependencies: Navigate to the project directory and install the necessary dependencies using Poetry:
  poetry install
- Configuration: Copy the example environment configuration file to a new .env file:
  cp .env.example .env
- Model Download: Download the model, which is approximately 19GB. You can do this in two ways:
  - Use the provided Python script:
    python download_model.py
  - Alternatively, download the model file manually from the linked Hugging Face page and place it in a models folder within the project's root directory.
- Ingestion of Documents: Place the documents you wish to query into the source_documents folder. The project supports various file types, including CSV, Word documents, PDF, and more. To ingest these documents, run:
  python ingest.py
  This process creates a local vectorstore and may take 20-30 seconds per document. All data remains on your local machine.
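Before running the ingestion step, it can help to confirm that the earlier steps left the expected files in place. The following is a minimal sketch, not a script shipped with the project: the preflight function is hypothetical, and it assumes the model file ends up in the models folder whether it was downloaded manually or by download_model.py.

```python
from pathlib import Path


def preflight(project_root):
    """Return a list of setup problems found under project_root."""
    root = Path(project_root)
    problems = []

    # The configuration step copies .env.example to .env.
    if not (root / ".env").is_file():
        problems.append("missing .env (copy .env.example to .env)")

    # Assumption: the downloaded model file sits directly inside models/.
    models = root / "models"
    if not models.is_dir() or not any(p.is_file() for p in models.iterdir()):
        problems.append("no model file in models/")

    # Documents to query are placed in source_documents/.
    docs = root / "source_documents"
    if not docs.is_dir() or not any(p.is_file() for p in docs.iterdir()):
        problems.append("no documents in source_documents/")

    return problems
```

Calling preflight(".") from the project's root directory returns an empty list when all three checks pass. It only checks that files exist, not that the model file is complete or that a document type is supported.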
Using the Chatbot
Once your documents are ingested, you can start querying them by running:
poetry run python question_answer_docs.py
Alternatively, you can use:
make qa
When prompted, type your question to receive an answer based on your documents. To end the session, type exit.
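The prompt-and-answer cycle described here can be pictured as a simple loop. This is an illustrative sketch only, not the code in question_answer_docs.py: the answer callable is a hypothetical stand-in for the model-backed lookup, and the prompt text, the skipping of empty input, and the case-insensitive match on exit are assumptions.

```python
def run_session(answer, read=input, write=print):
    """Prompt for questions until the user types exit; return how many were answered."""
    answered = 0
    while True:
        question = read("Question: ").strip()
        if question.lower() == "exit":
            break  # the user asked to close the session
        if not question:
            continue  # ignore empty input and prompt again
        write(answer(question))
        answered += 1
    return answered
```

Passing read and write as parameters keeps the loop easy to try without a terminal: for example, run_session(lambda q: "echo: " + q) echoes each question back until exit is typed.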
Running the Plain Chatbot
To chat with MPT-30B directly, without any documents, skip the document ingestion step and run:
poetry run python chat.py
Or use:
make chat
Credits
The project stands on the shoulders of giants, with contributions and templates from the likes of abacaj, imartinez, and TheBloke, who provided the MPT-30B GGML model.
Once set up, the project lets you query your documents with MPT-30B privately, with everything running on your own machine and no internet connection required.