chatpdflike - Effortless PDF Analysis with Advanced AI-Powered Natural Language Queries

Introduction to ChatPDFLike

ChatPDFLike is a web application for querying PDF documents in natural language. Users upload a PDF and ask questions about it, and the application uses a large language model such as OpenAI's GPT-3.5 Turbo to answer from the document's content. This project is independent and is not affiliated with ChatPDF.

Key Features

PDF Uploads: Users can easily upload PDF files from their computer or provide a link to a PDF on the web.
Natural Language Queries: Simply ask questions about the document content as if you were speaking to a person.
Precise Answers: Get brief and accurate responses derived from the document's content.
Reference Identification: See the exact sections of the document that were used to form the answers.
Multiple Model Support: Works with models from different providers, including OpenAI and Ollama.
User-Friendly Interface: Built with a combination of Flask and JavaScript, the platform ensures an intuitive user experience.

How ChatPDFLike Functions

Text Extraction and Processing:
- Text is extracted from the PDF and split into smaller chunks.
Generating Embeddings:
- Each text chunk is converted into an embedding vector by the selected embedding model, so that the vector represents the meaning of that chunk.
Handling User Queries:
- Each user question is converted into an embedding vector in the same way, so it can be compared with the chunk vectors.
Similarity Search:
- The application computes the cosine similarity between the question vector and each chunk vector, then selects the chunks most similar to the question.
Prompt Creation:
- A prompt that includes the user’s question and the relevant text sections is crafted for the language model.
Generating Answers:
- The prompt is sent to the language model, which generates a response from it.
Displaying Responses:
- The response is shown to the user in the application interface, complete with references to the original text.
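
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: `toy_embed` is a stand-in word-count vector used so the example runs without an API key, and every function name here is hypothetical. A real setup would replace `toy_embed` with a call to the chosen embedding model.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, max_words=50):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text):
    """Stand-in for a real embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question, chunks, k=2):
    """Return the k chunks whose vectors are most similar to the question."""
    q_vec = toy_embed(question)
    scored = [(cosine_similarity(q_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(question, context_chunks):
    """Assemble a prompt from numbered excerpts followed by the question."""
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(context_chunks, start=1))
    return (
        "Answer the question using only the numbered excerpts below.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Numbering the excerpts in the prompt is one simple way to let the answer point back to the passages it used, which is what the reference display in the last step relies on.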

Getting Started

To begin using ChatPDFLike, there's a short checklist of requirements and steps:

Prerequisites

Python: Ensure you have Python version 3.6 or newer.
API Keys Needed: Obtain necessary API keys from OpenAI for their models, and optionally from Ollama.

Installation Process

Clone the Repository: Perform a git clone to get the project's source code.
```
git clone https://github.com/Ulov888/chatpdflike.git
cd chatpdflike
```
Dependencies Installation: Use pip to install all needed packages as specified in the requirements.txt.
```
pip install -r requirements.txt
```

Setting API Keys

Get an OpenAI API key and set it as an environment variable.
```
export OPENAI_API_KEY="your_openai_api_key"
```
If using Ollama, similarly set the Ollama API key.
```
export OLLAMA_API_KEY="your_ollama_api_key"
```
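
Inside Python, keys exported this way can be read from the process environment. The helper below is a hypothetical sketch (the name `load_api_keys` is not taken from the project) that treats the OpenAI key as required and the Ollama key as optional, matching the prerequisites above.

```python
import os


def load_api_keys(env=os.environ):
    """Read API keys from environment variables.

    OPENAI_API_KEY is treated as required, OLLAMA_API_KEY as optional.
    """
    openai_key = env.get("OPENAI_API_KEY")
    if not openai_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return {"openai": openai_key, "ollama": env.get("OLLAMA_API_KEY")}
```

Failing early with a clear message is usually easier to debug than an authentication error raised later by the first model call.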

Utilizing the Application

Running the Server: Start the application using:
```
python run.py
```
Access the Application: Open it in your browser by navigating to http://localhost:8080.
Uploading PDFs: You can upload by selecting a file from your local storage or entering the URL of a PDF document.
Inquiry Process: Once uploaded, use the chat function to type in questions about the document.
Receiving Answers: The application will provide answers and reference sections used for those answers.

Customization Options

Modifying Prompt Strategies

Users can customize how the model operates by adjusting the prompt strategy in the code, specifically within generate_embedding.py. Different strategies cater to varying document types, such as scientific papers or financial handbooks.
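
As an illustration of what a prompt strategy might look like, the sketch below keeps one instruction string per document type and combines it with the retrieved excerpts. The table `PROMPT_STRATEGIES`, the function `render_prompt`, and the instruction wording are all assumptions made for this example; the actual contents of generate_embedding.py may be organized differently.

```python
# Hypothetical strategy table; the names and wording are illustrative only.
PROMPT_STRATEGIES = {
    "paper": (
        "You are answering questions about a scientific paper. "
        "Cite the excerpt numbers you rely on."
    ),
    "handbook": (
        "You are answering questions about a financial handbook. "
        "Quote figures exactly as they appear in the excerpts."
    ),
}


def render_prompt(strategy, question, excerpts, language="English"):
    """Combine a strategy's instructions with the excerpts and the question."""
    if strategy not in PROMPT_STRATEGIES:
        raise KeyError(f"unknown strategy: {strategy}")
    numbered = "\n".join(f"[{i}] {text}" for i, text in enumerate(excerpts, start=1))
    return (
        f"{PROMPT_STRATEGIES[strategy]}\n"
        f"Respond in {language}.\n\n"
        f"Excerpts:\n{numbered}\n\n"
        f"Question: {question}"
    )
```

Keeping the response language as a parameter, rather than hard-coding it into each strategy, is one way to make the output language easy to switch.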

Language and Responses

For certain strategies, the application currently generates responses in Chinese. This can be modified according to user needs.

Limitations to Consider

Cost of APIs: Utilizing OpenAI's services may incur charges.
Parsing Capabilities: Complex PDF structures might challenge the current parsing technology.
Embedding Constraints: There are limits to the size of text that can be processed at once.
Model Dependence: The quality of information provided relies heavily on the model and its training data.
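
One common way to work within an embedding size limit is to group chunks into batches that each stay under a budget. The sketch below is an assumption about how this could be handled, not a description of the project's code: `batch_by_budget` is a hypothetical helper, and it uses character count as a rough proxy, whereas real model limits are expressed in tokens.

```python
def batch_by_budget(chunks, max_chars=8000):
    """Group chunks into batches whose combined length is at most max_chars."""
    batches, current, used = [], [], 0
    for chunk in chunks:
        if len(chunk) > max_chars:
            # A single oversized chunk cannot fit; it must be split further first.
            raise ValueError("chunk exceeds max_chars; split it further")
        if current and used + len(chunk) > max_chars:
            # Adding this chunk would exceed the budget, so start a new batch.
            batches.append(current)
            current, used = [], 0
        current.append(chunk)
        used += len(chunk)
    if current:
        batches.append(current)
    return batches
```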

Contributions from the community are welcome. ChatPDFLike is an open-source project licensed under the Apache License.