Introducing pyLLMSearch: Advanced Retrieval-Augmented Generation System
pyLLMSearch is a sophisticated Retrieval-Augmented Generation (RAG) framework designed to enhance the process of question-answering using local document collections. With a user-friendly YAML-based configuration, it simplifies interactions with various types of documents while offering significant improvements in numerous system components. This package supports both OpenAI’s models and custom Large Language Models (LLMs) installed locally, making it highly versatile and adaptable to different needs.
Key Features
Supported Document Formats
- Built-in Parsers:
  - Markdown (.md): Splits documents into sections based on headings, subheadings, and code blocks; it can also clean image links and attach custom metadata. A minimal sketch of this kind of heading-based splitting appears after this list.
  - PDF (.pdf): Utilizes a MuPDF-based parser.
  - DOCX (.docx): A custom parser that efficiently handles nested tables.
- Unstructured Formats: Other formats are supported by a pre-processor that handles various unstructured data types; see the documentation for the full list of supported formats.
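As a rough illustration of the heading-based splitting mentioned above, the snippet below chunks a markdown string by its headings. It is a minimal sketch only, not pyLLMSearch's actual parser, which additionally handles code blocks, image-link cleanup, and custom metadata:

```python
import re

def split_markdown_by_headings(text: str) -> list[dict]:
    """Split a markdown document into chunks keyed by their nearest heading."""
    chunks: list[dict] = []
    current_heading = ""
    buffer: list[str] = []
    for line in text.splitlines():
        if re.match(r"^#{1,6}\s", line):      # a new heading closes the previous chunk
            if buffer:
                chunks.append({"heading": current_heading, "text": "\n".join(buffer).strip()})
            current_heading = line.lstrip("#").strip()
            buffer = []
        else:
            buffer.append(line)
    if buffer:                                # flush the final chunk
        chunks.append({"heading": current_heading, "text": "\n".join(buffer).strip()})
    return chunks

sections = split_markdown_by_headings("# Intro\nHello\n\n## Usage\npip install ...")
print([s["heading"] for s in sections])      # ['Intro', 'Usage']
```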
Enhanced Parsing and Embeddings
- Table Parsing: Supported via the open-source gmft library or Azure Document Intelligence.
- Image Parsing: Optionally enabled through the Gemini API.
- Document Collections: Allows interaction with multiple document collections and filters search results by collection.
- Incremental Embedding Updates: Update embeddings without re-indexing entire document collections; a minimal sketch of one way to detect changed documents follows this list.
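A common way to implement incremental updates is to hash each document and re-embed only the files whose content changed since the last run. The helper below is a hypothetical illustration of that pattern; pyLLMSearch's internal change tracking may work differently:

```python
import hashlib
import json
from pathlib import Path

def changed_documents(doc_dir: Path, state_file: Path) -> list[Path]:
    """Return only the documents whose content hash changed since the last run."""
    previous = json.loads(state_file.read_text()) if state_file.exists() else {}
    current: dict[str, str] = {}
    to_reindex: list[Path] = []
    for path in sorted(doc_dir.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        current[str(path)] = digest
        if previous.get(str(path)) != digest:   # new or modified file
            to_reindex.append(path)
    state_file.write_text(json.dumps(current, indent=2))
    return to_reindex
```

Unchanged files keep their existing embeddings, so only the returned paths need to be parsed and embedded again.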
Embedding Capabilities
- Dense Embeddings: Generated with Hugging Face, Sentence Transformers, or Instructor-based embedding models and stored in a vector database.
- Sparse Embeddings: Generated via SPLADE to support hybrid (sparse + dense) search strategies; a sketch of one way to fuse the two score sets follows this list.
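Hybrid search needs to combine the relevance scores of the dense and sparse retrievers for each document. The function below sketches one common fusion scheme (min-max normalisation plus a weighted blend); the weighting is an assumption for illustration, not necessarily how pyLLMSearch combines SPLADE and dense scores:

```python
def hybrid_scores(dense: dict[str, float], sparse: dict[str, float],
                  alpha: float = 0.5) -> dict[str, float]:
    """Blend per-document dense and sparse relevance scores into one ranking."""
    def normalise(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    dense_n, sparse_n = normalise(dense), normalise(sparse)
    docs = set(dense_n) | set(sparse_n)
    # alpha weights the dense retriever; (1 - alpha) weights the sparse one
    return {d: alpha * dense_n.get(d, 0.0) + (1 - alpha) * sparse_n.get(d, 0.0)
            for d in docs}
```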
Advanced Search Techniques
- Retrieve and Re-rank: Implements semantic search with re-ranking strategies that leverage models such as ms-marco-MiniLM and the newer bge-reranker.
- HyDE (Hypothetical Document Embeddings): Can boost the quality of search results, especially in new or unfamiliar topic areas. Users should refer to the relevant documentation before enabling this feature.
- Multi-Querying: Inspired by RAG Fusion, this approach generates multiple variants of a query, offering diverse perspectives and improving search results. A sketch of re-ranking and rank fusion appears after this list.
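The sketch below illustrates two of these techniques in isolation: cross-encoder re-ranking, using one publicly available ms-marco-MiniLM checkpoint (the model actually used is set in the pyLLMSearch configuration), and reciprocal rank fusion over the result lists produced by multiple query variants, which is the core idea behind RAG Fusion:

```python
from sentence_transformers import CrossEncoder

def rerank(query: str, candidates: list[str], top_k: int = 5) -> list[str]:
    """Re-score retrieved passages with a cross-encoder and keep the best ones."""
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    scores = model.predict([(query, passage) for passage in candidates])
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [passage for passage, _ in ranked[:top_k]]

def reciprocal_rank_fusion(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked result lists from several query variants (the RAG Fusion idea)."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, doc in enumerate(results):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```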
Additional Functionality
- Chat History Support: Includes question contextualization for enhanced user interactions.
- Model Interaction: Interact with embedded documents through a range of LLM backends, including OpenAI's ChatGPT, Hugging Face models, llama.cpp, and AutoGPTQ (currently disabled).
- Interoperability: Works seamlessly with LiteLLM and Ollama through their OpenAI-compatible APIs, giving access to numerous models; see the sketch after this list.
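Because both Ollama and a LiteLLM proxy expose OpenAI-compatible endpoints, a standard OpenAI client can address a locally served model. The base URL, placeholder API key, and model name below are common defaults for Ollama, not values prescribed by pyLLMSearch:

```python
from openai import OpenAI

# Point the standard OpenAI client at a local OpenAI-compatible endpoint.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Summarise the indexed documents."}],
)
print(response.choices[0].message.content)
```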
User Interfaces and Miscellaneous Features
- User Interfaces: Provides simple command-line and web interfaces for ease of use.
- Deep Linking: Jump directly to document sections, specific PDF pages, or headers in markdown files.
- Logging: Allows saving responses to an offline database for subsequent analysis.
- Experimental API: Offers flexibility for developing custom applications or integrations.
Experience the Demo
A visual demonstration of pyLLMSearch is available, showcasing its comprehensive capabilities in action.
Explore the Documentation
For a deeper dive into pyLLMSearch’s functionalities and configurations, users can access the detailed Documentation.
In summary, pyLLMSearch is a robust and adaptable solution for leveraging the power of RAG systems, tailored to efficiently handle and search through extensive local document collections with precision and advanced functionalities. Whether for academic research, business intelligence, or personal projects, this tool provides a significant edge in managing and retrieving valuable information.