Introducing the Dot Project
Dot is an open-source application that changes how users interact with their documents and files. It runs large language models (LLMs) locally and uses Retrieval-Augmented Generation (RAG) to answer questions about loaded documents, so it is usable by people without a programming background. The application draws inspiration from similar solutions like Nvidia's Chat with RTX and wraps everything in a simple graphical interface.
What Dot Can Do
Dot lets users load a variety of document types, including PDF, DOCX, PPTX, XLSX, and Markdown, into a local shared environment and then ask questions about them. Dot can also answer general questions through a feature known as Big Dot, which works much like ChatGPT. Users can therefore both query their own documents and ask questions unrelated to them.
How It Works
Dot is built with Electron JS and bundles a complete Python environment containing all the libraries and tools it needs. It uses FAISS to build local vector stores, along with Langchain, llama.cpp, and Huggingface, to construct its conversation chains. Together these components handle document indexing and the chat interaction locally.
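The retrieval step behind RAG can be illustrated with a toy sketch. This is not Dot's code: it replaces the embedding model with plain word counts and the FAISS vector store with a Python list, and it stops at building the prompt that would be handed to a local LLM. All function names here (embed, build_index, retrieve, build_prompt) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks):
    """Stand-in for a vector store: keep each chunk next to its embedding."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(index, question, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        f"Answer using only this context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


chunks = [
    "The quarterly report shows revenue grew by ten percent.",
    "The onboarding guide explains how to request a laptop.",
    "Revenue growth was driven by the new subscription plan.",
]
index = build_index(chunks)
top = retrieve(index, "Why did revenue grow?", k=2)
prompt = build_prompt("Why did revenue grow?", top)
print(prompt)
```

In a real pipeline the word counts would be replaced by dense vectors from an embedding model, the list by a vector index, and the prompt would be passed to the LLM to generate the answer.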
Installing Dot
For general users, installing Dot is straightforward:
- Visit the Dot website and download the application for either Apple Silicon or Windows.
For developers interested in the backend:
- Clone the GitHub repository using
git clone https://github.com/alexpinel/Dot.git
- Install Node.js, then run npm install within the project directory. If that step fails, use npm install --force instead.
Additionally, a complete Python bundle is recommended for creating a distributable environment. Instructions and bundles are accessible here and here. To prepare a bundle for Dot, rename it 'python' and place it in the llm directory. When installing additional libraries, specify the path to the bundle rather than using the usual pip install.
Required Python Libraries
- PyTorch (CPU version, to keep the install lightweight)
- Langchain
- FAISS
- HuggingFace
- llama.cpp (prefer CUDA for Nvidia GPUs)
- pypdf
- docx2txt
- Unstructured
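Installing libraries into the bundle rather than the system Python usually means invoking pip through the bundled interpreter. The sketch below only builds that command. The interpreter location (llm/python/bin/python3) is an assumption about the bundle layout, not something confirmed by the project, and bundled_pip_command is a hypothetical helper; adjust both to the actual bundle.

```python
from pathlib import Path


def bundled_pip_command(python_exe, packages):
    """Build the argv that installs packages using a specific interpreter's pip."""
    return [str(python_exe), "-m", "pip", "install", *packages]


# Hypothetical location of the bundled interpreter; the real path depends on
# the bundle and the platform (for example, Windows bundles ship python.exe).
bundle_python = Path("llm") / "python" / "bin" / "python3"

command = bundled_pip_command(bundle_python, ["pypdf", "docx2txt"])
print(" ".join(command))

# To actually run the install:
#   import subprocess
#   subprocess.run(command, check=True)
```

Running pip as a module of the bundled interpreter makes the packages land in that interpreter's site-packages, which is the point of specifying the bundle path instead of calling a bare pip install.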
To finish the setup and enable document embeddings, install the sentence-transformers library in a folder named mpnet within the llm directory. Additionally, download the Mistral 7B LLM.
Future Developments
Dot's roadmap includes several planned enhancements:
- Compatibility with Linux
- Expanded support for choosing different LLMs
- Image file support
- Advanced document awareness extending beyond current content
- Simplified file selection processes
- Enhanced local LLM security measures
- Broadened document type compatibility
- Optimized file database management for faster file retrieval
How to Get Involved
The Dot project welcomes contributions from the community! Whether your interest lies in coding, documentation, or suggesting new features, your input is highly valued. As Dot is managed by a student, every bit of assistance and collaboration goes a long way.
Join the Dot community and become part of a dynamic project that is innovating document interaction and management!