ChatPDF-GPT Project Overview
Introduction
ChatPDF-GPT is a project built on LangChain, a framework for creating applications driven by language models. It lets users interact with PDF documents through a chat interface powered by OpenAI's language models.
The project demonstrates how to connect language models to external data sources. Users upload a PDF document, which is then processed and stored in Pinecone (a vector database) and Supabase storage. Once the PDF is uploaded, users can chat with the document and get answers drawn from its content.
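As a rough illustration of the processing step, the sketch below splits extracted PDF text into overlapping chunks, a common preparation before embedding text for a vector database. It is an assumption about how such a pipeline could work, not the project's actual code: the `chunkText` function, the `TextChunk` type, and the default sizes are all hypothetical.

```typescript
// Hypothetical helper for illustration; not taken from the ChatPDF-GPT codebase.
interface TextChunk {
  index: number; // position of the chunk within the document
  start: number; // character offset where the chunk begins
  text: string;  // chunk content
}

// Split text into fixed-size chunks that overlap, so that a sentence cut at a
// chunk boundary still appears whole in at least one neighbouring chunk.
function chunkText(text: string, chunkSize = 1000, overlap = 200): TextChunk[] {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("chunkSize must be positive and overlap must be smaller than chunkSize");
  }
  const chunks: TextChunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push({
      index: chunks.length,
      start,
      text: text.slice(start, start + chunkSize),
    });
    // Stop once the current chunk reaches the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

In a setup like the one described above, each chunk would then be embedded and saved as a vector, with `index` and `start` kept as metadata so that a chat answer can be traced back to its place in the PDF.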
The project is a full-stack web application built with the Next.js framework. The user interface uses the Radix UI library and Tailwind CSS for styling, based on the shadcn/ui template.
Features
- Upload a PDF: Users can upload PDF files, which are then stored in Pinecone and Supabase.
- Chat with PDF: The application processes the PDF content and enables users to have a conversation using the OpenAI API.
- PDF Preview: Users can view a preview of the PDF using the @react-pdf-viewer package.
- List PDFs: A list of all uploaded PDF documents in Supabase is available.
- Delete a PDF: Users can remove documents from the storage.
- Cite Sources: The chat interface provides PDF sources for references in AI responses, facilitating direct navigation to the information.
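To make the source citation feature more concrete, here is a minimal sketch of how the source references attached to an AI response could be reduced to a short list of page citations. The `SourceRef` shape and the `uniquePageCitations` function are hypothetical names chosen for illustration; the project's real data structures may differ.

```typescript
// Hypothetical shape of one source reference returned alongside an AI answer.
interface SourceRef {
  fileName: string;   // name of the uploaded PDF
  pageNumber: number; // 1-based page the passage came from
  snippet: string;    // passage used to build the answer
}

// Collapse references into one label per (file, page) pair,
// sorted by file name and then by page number.
function uniquePageCitations(sources: SourceRef[]): string[] {
  const sorted = [...sources].sort(
    (a, b) => a.fileName.localeCompare(b.fileName) || a.pageNumber - b.pageNumber
  );
  const seen = new Set<string>();
  const labels: string[] = [];
  for (const source of sorted) {
    const key = `${source.fileName}#${source.pageNumber}`;
    if (seen.has(key)) continue; // skip repeated citations of the same page
    seen.add(key);
    labels.push(`${source.fileName}, p. ${source.pageNumber}`);
  }
  return labels;
}
```

A chat interface could render each label as a link that scrolls the PDF preview to the cited page.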
Usage Examples
ChatPDF-GPT includes examples that demonstrate various functionalities, such as:
- Interacting with Pinecone for saving and deleting vector data.
- Managing file uploads and deletions in Supabase.
- Listing documents from Supabase.
- Previewing PDFs using @react-pdf-viewer.
- Navigating to specific sources in PDFs from AI responses.
Quick Testing Using the Demo
To experience the demo, users must input their own credentials for OpenAI, Supabase, and Pinecone. Instructions for obtaining Supabase credentials are detailed below, while guidance for OpenAI and Pinecone credentials can be found in their respective documentation.
OpenAI
Visit OpenAI for details about accessing API credentials.
Supabase Setup
- Create a Project: Visit Supabase and create a new project.
- Database Connection: After setting up, retrieve your database connection string under the "Database" tab.
- Connection Pooling: Obtain the Connection Pooling URL from the same area.
- Storage Keys: Access your SUPABASE_URL and SUPABASE_KEY in the "API" tab.
- Setup Supabase Bucket: Create or use an existing bucket in the "Storage" section.
- Configure Environment Variables: Define essential variables like DATABASE_URL, SUPABASE_KEY, etc.
- Manage Storage Bucket Policies: Adjust bucket policies as needed, with consideration for data security.
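Missing credentials are a common cause of startup failures, so it can help to check them early. The sketch below is one possible way to do that, not part of the project itself. It checks only the three variable names mentioned in this section; the full list lives in .env.example, and the `findMissingEnvVars` helper is a hypothetical name.

```typescript
// Variable names mentioned in this README; the complete list is in .env.example.
const REQUIRED_VARS = ["DATABASE_URL", "SUPABASE_URL", "SUPABASE_KEY"] as const;

// Hypothetical helper: returns the names of required variables that are
// unset or contain only whitespace.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
  required: readonly string[] = REQUIRED_VARS
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example usage in a Node.js entry point (assumes Node's `process.env`):
// const missing = findMissingEnvVars(process.env);
// if (missing.length > 0) {
//   throw new Error(`Missing environment variables: ${missing.join(", ")}`);
// }
```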
Pinecone
Visit Pinecone for guidance on acquiring API credentials.
Setup and Installation
To run ChatPDF-GPT locally, follow these steps:
- Clone the repository:
git clone https://github.com/anis-marrouchi/chatpdf-gpt.git
- Enter the project directory and install dependencies with pnpm:
cd chatpdf-gpt
pnpm install
- Create a .env file and fill in your credentials as per .env.example.
- Set up the database schema using Prisma:
npx prisma migrate dev --name init
- Start the development server:
npm run dev
Contribution
ChatPDF-GPT is an open-source endeavor inviting contributions from all. See the contributing guide for instructions on participating.
Credits
The ChatPDF-GPT project builds on the following projects and services:
- LangChain framework
- OpenAI for language models
- Supabase for backend services
- Pinecone for vector database management
- Next.js and Vercel for web application frameworks
- shadcn/ui for UI templates
- Radix UI for UI components
- @react-pdf-viewer for PDF previews
The project honors the collaborative efforts of the open-source community that made its development possible.
License
ChatPDF-GPT is available under the MIT license.