Chat with Documents using LLM
The "Chat with Documents using LLM" project, presented as part of a Bellingcat hackathon, is a tool that uses a large language model to let users query the contents of their documents through chat.
Team Members
The project was crafted by two key contributors:
- Radu Ciocan: Focused on coding the project.
- Ana State: Handled the design aspects.
Tool Description
This tool provides a chat interface where users upload documents in PDF or DOCX format and ask questions about their content in conversation. It uses ChatGPT, a large language model developed by OpenAI, to interpret the document's data and respond.
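As an illustration of the PDF/DOCX restriction, here is a minimal sketch of how an upload could be screened before it is processed. The helper name `detectDocumentKind` and the idea of cross-checking the extension against the reported MIME type are assumptions made for this example; the project's actual upload handling may differ.

```typescript
// Hypothetical helper, not taken from the project's source code.
type SupportedKind = "pdf" | "docx";

// Standard MIME types for the two supported formats.
const MIME_BY_KIND: Record<SupportedKind, string> = {
  pdf: "application/pdf",
  docx: "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
};

// Type guard that narrows an arbitrary string to a supported kind.
function isSupportedKind(value: string): value is SupportedKind {
  return value === "pdf" || value === "docx";
}

// Returns the document kind for a file name, or null if the file is unsupported.
// When a MIME type is supplied, it must agree with the file extension.
function detectDocumentKind(fileName: string, mimeType?: string): SupportedKind | null {
  const dot = fileName.lastIndexOf(".");
  const extension = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  if (!isSupportedKind(extension)) return null;
  if (mimeType && mimeType !== MIME_BY_KIND[extension]) return null;
  return extension;
}
```

Checking both the extension and the MIME type is a cheap first filter only; it does not prove that the file's bytes are a valid PDF or DOCX.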
Installation
To get started with this tool, follow these steps:
- Ensure that Node.js version 18 or later is installed on your computer.
- Clone the repository by executing the command:
git clone [email protected]:ciocan/langchain-chat-with-documents.git
- Navigate to the project directory and install the necessary packages:
cd langchain-chat-with-documents
npm install
- Duplicate the .env.example file, rename the copy to .env, then add your specific environment variable values:
WEAVIATE_HOST= # Enter your Weaviate host domain without 'https://'
WEAVIATE_API_KEY= # Your Weaviate API Key
CLOUDFLARE_ACCOUNT_ID=
CLOUDFLARE_SECRET_KEY=
CLOUDFLARE_SECRET_ACCESS_KEY=
OPENAI_API_KEY= # Your OpenAI API Key
- Weaviate: This is a vector database service where documents get vectorized and indexed. It's possible to install it locally or utilize their free cloud service.
- Cloudflare R2: An object storage solution compatible with AWS S3, with a free tier offering up to 10 GB of storage. More can be found on their official page.
- Obtain an OpenAI API key.
- Launch the tool with:
npm run dev
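The installation requirements above can be expressed as a small preflight check. This is a sketch under stated assumptions: the helper `checkSetup` is hypothetical and not part of the project, and it assumes that every variable listed in .env.example must be non-empty, which the instructions do not state explicitly.

```typescript
// Variable names taken from the .env.example listing in the installation steps.
const REQUIRED_ENV_VARS = [
  "WEAVIATE_HOST",
  "WEAVIATE_API_KEY",
  "CLOUDFLARE_ACCOUNT_ID",
  "CLOUDFLARE_SECRET_KEY",
  "CLOUDFLARE_SECRET_ACCESS_KEY",
  "OPENAI_API_KEY",
];

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns a list of setup problems (empty when all is well).
function checkSetup(env: Env, nodeVersion: string): string[] {
  const problems: string[] = [];

  // A Node.js version string looks like "18.17.0"; the major version comes first.
  const major = parseInt(nodeVersion.split(".")[0] ?? "", 10);
  if (isNaN(major) || major < 18) {
    problems.push(`Node.js 18 or later is required, found ${nodeVersion}`);
  }

  // Assumption: every listed variable must have a non-blank value.
  for (const name of REQUIRED_ENV_VARS) {
    const value = (env[name] ?? "").trim();
    if (value === "") problems.push(`${name} is not set`);
  }

  // The .env.example comment asks for the host without the URL scheme.
  const host = (env["WEAVIATE_HOST"] ?? "").trim();
  if (/^https?:\/\//i.test(host)) {
    problems.push("WEAVIATE_HOST should not include 'https://'");
  }

  return problems;
}
```

In a Node.js script this could be called as `checkSetup(process.env, process.versions.node)`, printing each returned problem before running `npm run dev`.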
Tech Stack
This project is built using the T3 Stack, which simplifies creating modern web applications by combining multiple technologies:
- Next.js: A React-based framework used for building server-side rendered applications.
- Tailwind CSS: A utility-first CSS framework for designing modern websites without leaving the HTML.
- tRPC: A tool for creating type-safe APIs using TypeScript.
Additional technologies include:
- Zustand: A minimalistic state management library utilized for managing application state.
- Mantine UI: Provides a collection of versatile UI components.
- LangChain: Used to interface with the OpenAI LLM.
High-Level Architecture
A diagram of the project's high-level architecture shows how the components and services used by the tool interact to support chatting with document contents.
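To make the component interaction more concrete, the sketch below walks through a typical retrieval flow for chatting with a document: split the text into chunks, vectorize and index them, retrieve the chunks closest to a question, and build a prompt for the language model. This is an assumed flow, not a description of the project's code. The bag-of-words `embed` function is a toy stand-in for the vectorization that Weaviate performs, the in-memory array stands in for the Weaviate index, and the prompt is only built, not sent to OpenAI. All function names are hypothetical.

```typescript
type Vector = Record<string, number>;
type Chunk = { id: number; text: string; vector: Vector };

// Splits text into chunks of at most `size` words.
function splitIntoChunks(text: string, size: number): string[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(" "));
  }
  return chunks;
}

// Toy "embedding": word counts. A real system would call an embedding model.
function embed(text: string): Vector {
  // Object.create(null) avoids inherited keys such as "constructor".
  const vector: Vector = Object.create(null);
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vector[word] = (vector[word] ?? 0) + 1;
  }
  return vector;
}

// Euclidean length of a sparse vector.
function norm(v: Vector): number {
  let sum = 0;
  for (const word of Object.keys(v)) sum += v[word] * v[word];
  return Math.sqrt(sum);
}

// Cosine similarity between two sparse vectors; 0 when either is empty.
function cosineSimilarity(a: Vector, b: Vector): number {
  let dot = 0;
  for (const word of Object.keys(a)) dot += a[word] * (b[word] ?? 0);
  const denominator = norm(a) * norm(b);
  return denominator === 0 ? 0 : dot / denominator;
}

// Indexing step: chunk the document and store one vector per chunk.
function indexDocument(text: string, chunkSize = 50): Chunk[] {
  return splitIntoChunks(text, chunkSize).map((chunkText, id) => ({
    id,
    text: chunkText,
    vector: embed(chunkText),
  }));
}

// Query step: return the `topK` chunks most similar to the question.
function retrieve(index: Chunk[], question: string, topK = 2): Chunk[] {
  const queryVector = embed(question);
  return index
    .map((chunk) => ({ chunk, score: cosineSimilarity(chunk.vector, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((scored) => scored.chunk);
}

// Builds the text that would be sent to the language model.
function buildPrompt(question: string, context: Chunk[]): string {
  const excerpts = context.map((c) => `[${c.id}] ${c.text}`).join("\n");
  return `Answer using only these excerpts:\n${excerpts}\n\nQuestion: ${question}`;
}
```

Retrieving only the most relevant chunks keeps the prompt short, which is the usual reason for placing a vector database between the uploaded document and the model.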
The "Chat with Documents using LLM" project simplifies how users explore complex documents by letting them ask questions in conversation.