ChatDocs Project Introduction
Overview
ChatDocs is a tool for chatting with your documents offline using artificial intelligence. The project emphasizes privacy: all processing happens locally on your machine, so no data leaves your system. An internet connection is needed only to install the tool and download AI models.
ChatDocs is inspired by PrivateGPT and extends it with additional features.
Key Features
- Model Support: ChatDocs supports various AI models, including GGML/GGUF models via CTransformers, 🤗 Transformers models, and GPTQ models.
- User Interfaces: The tool provides a web-based user interface (UI) as well as a command-line interface for users who prefer text-based commands.
- Configuration: Extensive customization options are available through a chatdocs.yml configuration file, allowing users to tailor the tool to their specific needs.
- Document Compatibility: ChatDocs can handle a wide range of document types, including but not limited to CSV, Word, EverNote, Email, EPub, HTML, Markdown, PDF, and PowerPoint files.
- GPU Support: The tool can leverage GPU resources for improved performance when dealing with large datasets or complex models.
Installation
Getting started with ChatDocs is straightforward. First, the tool is installed using the Python package manager with:
pip install chatdocs
Once installed, users need to download the required AI models:
chatdocs download
After these steps, ChatDocs can be used completely offline.
How to Use
To begin using ChatDocs, users add their document directory:
chatdocs add /path/to/documents
Processed documents are stored in a local directory named db by default. Users can then interact with their documents through the web UI by visiting http://localhost:5000 in a browser, or from the command line with:
chatdocs chat
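As a rough pre-flight check before running chatdocs add, the Python sketch below lists the files in a directory that look like supported documents and builds the add command shown above. The suffix set and the helper names (find_candidates, add_command, add_documents) are illustrative assumptions and not part of ChatDocs; the tool itself decides which files it can ingest.

```python
import subprocess
import tempfile
from pathlib import Path

# Assumption: a guess at how the listed document types map to file
# suffixes. ChatDocs itself may accept more or fewer than these.
ASSUMED_SUFFIXES = {
    ".csv", ".doc", ".docx", ".enex", ".eml", ".epub",
    ".html", ".md", ".pdf", ".ppt", ".pptx",
}


def find_candidates(root):
    """Return files under root whose suffix is in ASSUMED_SUFFIXES, sorted."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in ASSUMED_SUFFIXES
    )


def add_command(root):
    """Build the `chatdocs add` command from the text as an argument list."""
    return ["chatdocs", "add", str(Path(root))]


def add_documents(root, dry_run=True):
    """Print the command; run it only when dry_run is False."""
    cmd = add_command(root)
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)  # raises if chatdocs exits non-zero


if __name__ == "__main__":
    # Demo on a throwaway directory so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "notes.md").write_text("# hello\n")
        (Path(tmp) / "image.png").write_bytes(b"")
        print([p.name for p in find_candidates(tmp)])
        add_documents(tmp, dry_run=True)
```

Keeping dry_run=True by default means the sketch only prints what it would run; pass dry_run=False once the listed files look right.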
Configuration Options
The chatdocs.yml file is the heart of ChatDocs configuration. Users can change various settings, such as the embeddings model, by specifying:
embeddings:
  model: hkunlp/instructor-large
Similar configurations are available for CTransformers and 🤗 Transformers models. Users can specify a model type and location and adjust settings like GPU usage for performance enhancement.
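For scripted setups, a configuration like the embeddings example can be generated rather than typed by hand. The following is a minimal sketch, not a ChatDocs API: to_yaml and write_config are hypothetical helpers, the emitter handles only nested mappings of plain scalars, and writing chatdocs.yml into a directory of your choice assumes that is where ChatDocs will look for it.

```python
import tempfile
from pathlib import Path


def to_yaml(data, indent=0):
    """Render nested dicts of plain scalars as block-style YAML.

    Deliberately minimal: no lists, no quoting, no escaping.
    """
    lines = []
    pad = "  " * indent
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1).splitlines())
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)


def write_config(directory, config):
    """Write config as chatdocs.yml inside directory and return the path."""
    path = Path(directory) / "chatdocs.yml"
    path.write_text(to_yaml(config) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # The embeddings setting is the one shown in the text.
    config = {"embeddings": {"model": "hkunlp/instructor-large"}}
    with tempfile.TemporaryDirectory() as tmp:
        print(write_config(tmp, config).read_text(encoding="utf-8"))
```

For anything beyond simple key/value nesting, a full YAML library would be a safer choice than this hand-rolled emitter.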
GPU Utilization
For tasks requiring higher computational power, ChatDocs offers GPU support:
- Embeddings: Enable GPU by specifying the device type in the configuration.
  embeddings:
    model_kwargs:
      device: cuda
- CTransformers: For CTransformers models, GPU layers can be configured.
  ctransformers:
    config:
      gpu_layers: 50
- Transformers: Specify the device index to use GPU with 🤗 Transformers models.
  huggingface:
    device: 0
To use a GPU, users may need to install additional components, such as a PyTorch build with CUDA support.
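Before enabling the GPU settings above, it can help to confirm that PyTorch actually sees a CUDA device. The sketch below is one possible check under stated assumptions: detect_device and gpu_overrides are hypothetical helper names, the value 50 for gpu_layers is simply the figure from the example, and the returned dictionary only mirrors the three configuration fragments listed in this section.

```python
def detect_device():
    """Return "cuda" if PyTorch is installed and sees a CUDA GPU, else "cpu"."""
    try:
        import torch  # optional dependency; absent on CPU-only setups
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def gpu_overrides(device, gpu_layers=50):
    """Build the GPU-related settings from the text as a nested dict.

    Returns an empty dict when no CUDA device is available, so the
    defaults stay untouched.
    """
    if device != "cuda":
        return {}
    return {
        "embeddings": {"model_kwargs": {"device": "cuda"}},
        "ctransformers": {"config": {"gpu_layers": gpu_layers}},
        "huggingface": {"device": 0},  # index of the first GPU
    }


if __name__ == "__main__":
    device = detect_device()
    print("detected device:", device)
    print("overrides:", gpu_overrides(device))
```

If the check reports cpu on a machine that has an NVIDIA GPU, the installed PyTorch build most likely lacks CUDA support.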
Conclusion
ChatDocs represents a robust and flexible solution for offline document interaction using AI. With its privacy-first approach and comprehensive support for various document and model types, it serves as a powerful tool for users needing AI-driven document analysis and interaction.