RAG Experiment Accelerator
Overview
The RAG Experiment Accelerator is a tool designed to streamline the process of conducting experiments and evaluations using Azure AI Search and the Retrieval-Augmented Generation (RAG) pattern. This guide covers the accelerator's purpose, features, installation, and usage.
Purpose
The primary function of the RAG Experiment Accelerator is to facilitate and expedite the experimentation and evaluation of search queries and the quality of responses from Azure OpenAI. It is tailored to assist researchers, data scientists, and developers in:
- Testing various Search and OpenAI hyperparameters to assess performance.
- Evaluating the success of different search methodologies.
- Fine-tuning and optimizing parameters for best performance.
- Discovering the most effective combination of hyperparameters.
- Crafting detailed reports and graphical representations from the results of experiments.
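Searching for the most effective combination of hyperparameters amounts to sweeping a parameter grid. A minimal sketch of such a sweep, using illustrative parameter names (the accelerator's actual configuration keys may differ):

```python
# Build a grid of experiment configurations from illustrative
# hyperparameters; these names are examples, not the tool's own keys.
from itertools import product

search_types = ["keyword", "vector", "hybrid"]
chunk_sizes = [256, 512]
top_k_values = [3, 5]

experiments = [
    {"search_type": s, "chunk_size": c, "top_k": k}
    for s, c, k in product(search_types, chunk_sizes, top_k_values)
]
print(len(experiments))  # 3 * 2 * 2 = 12 combinations
```

Each entry in `experiments` would then drive one indexing-and-evaluation run, so the grid size directly determines experiment cost.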
Latest Changes
As of March 18, 2024, the tool includes content sampling, allowing users to sample data by a specific percentage while keeping results representative of the entire dataset. Because this update introduced new dependencies, rebuilding your environment is advised if you used the tool before it.
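The idea behind percentage-based sampling can be sketched in a few lines. This is an illustrative implementation, not the accelerator's own sampling code, which may use a different strategy and parameters:

```python
# Illustrative percentage-based sampling with a fixed seed for
# reproducibility; the tool's actual sampling logic may differ.
import random

def sample_documents(docs, percentage, seed=42):
    """Return a reproducible random sample of `percentage` percent of docs."""
    k = max(1, round(len(docs) * percentage / 100))
    rng = random.Random(seed)
    return rng.sample(docs, k)

docs = [f"doc_{i}" for i in range(1000)]
subset = sample_documents(docs, 10)
print(len(subset))  # 100
```

Fixing the seed makes repeated runs comparable, which matters when you want sampled results to be indicative of the full dataset across experiments.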
Features
The tool is highly configurable and comes with a range of features:
- Experiment Setup: Customize experiments by defining search engine parameters, types, query collections, and evaluation metrics.
- Integration Capabilities: It interfaces smoothly with Azure AI Search, Azure Machine Learning, MLFlow, and Azure OpenAI.
- Extensive Search Indexing: Various search indexes are created based on configurations defined in the config file.
- Document Loading Flexibility: Multiple document loading options are available, from Azure Document Intelligence to basic LangChain loaders.
- Custom Document Intelligence Loader: Utilizes a custom loader for 'prebuilt-layout' API models to format data for enhanced readability and to handle table content efficiently.
- Query Generation: Capable of producing diverse and customizable query collections for specific experimentation purposes.
- Variety of Search Types: Supports different search methods, empowering detailed analysis.
- Complex Query Handling: Breaks down complex queries into simpler ones to generate relevant context.
- Re-Ranking: An LLM re-ranks query responses for contextual relevance.
- Metrics and Evaluation: Offers comprehensive metrics to assess generated answers against ground-truth answers, including similarity metrics and retrieval performance metrics.
- Report Automation: Generates detailed visual reports to facilitate analysis and sharing of results.
- Multi-Lingual Support: Includes language analyzers for linguistic support across different languages and specialized user-defined patterns.
- Data Sampling: Provides a sampling process to quickly and efficiently experiment with portions of large datasets.
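These features are driven by a configuration file. The sketch below shows the kind of experiment settings involved; the field names here are hypothetical and the real config.json schema may differ:

```python
# Hypothetical config sketch illustrating configurable experiment
# settings; actual config.json field names may differ.
import json

config = {
    "index_name_prefix": "rag-exp",      # hypothetical key
    "chunk_sizes": [256, 512],           # chunking variants to index
    "overlap_sizes": [64],               # chunk overlap variants
    "search_types": ["keyword", "vector", "hybrid"],
    "metric_types": ["cosine_similarity", "recall_at_k"],
    "sampling": {"enabled": True, "percentage": 10},
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```

Listing multiple values per setting (e.g. two chunk sizes and three search types) is what produces the cross-product of experiments the tool runs and compares.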
Products Used
- Azure AI Search Service: For search functionality and indexing.
- Azure OpenAI Service: For natural language processing and question answering.
- Azure Machine Learning Resources: To manage, deploy, and track experiments and their results.
Compute Setup
To run the tool, users can choose from the following methods:
1. Run within a Development Container
- A development container installs all necessary software for you; on Windows it runs via WSL.
- Prerequisites include installing Ubuntu via Windows Store, Docker Desktop, Visual Studio Code, and VS Code's Remote-Containers extension.
2. Local Install
- Users can manually install the tool on a Windows or Mac machine.
- Follow the steps to clone the repository and install dependencies using Anaconda or Miniconda.
Provision Infrastructure
There are three ways to set up the necessary Azure services:
- Azure Developer CLI: Use `azd` commands to provision infrastructure.
- Azure Portal: Deploy infrastructure directly from the Azure portal using a template.
- Azure CLI: Employ standard Azure CLI commands for deployment, with options for isolated networks.
How to Use
To use the RAG Experiment Accelerator, users need to:
- Clone the repository and manage configuration files.
- Run prepared Python scripts to set up indexes, generate question-answer pairs, conduct search queries, and evaluate results.
- Consider using an Azure ML pipeline for efficient execution on larger datasets.
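The evaluation step compares retrieved and generated results against ground truth. As one example of the kind of retrieval metric involved, here is a minimal recall@k sketch; this is illustrative code, not the accelerator's own implementation:

```python
# Illustrative recall@k: the fraction of relevant documents that
# appear in the top-k retrieved results. Not the tool's own code.
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant docs found in the top-k retrieved list."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

retrieved = ["d3", "d1", "d7", "d2"]
relevant = ["d1", "d2"]
print(recall_at_k(retrieved, relevant, 3))  # 0.5
```

Comparing such metrics across hyperparameter combinations is how the experiment reports identify the best-performing configuration.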
Running with Sampling
Sampling allows users to experiment quickly and keep costs down, with results roughly indicative of the full dataset. The tool offers options for running the full process locally or sampling the data to then run on Azure Machine Learning.
By configuring the tool and using its wide range of features, users will be able to effectively experiment with and enhance their search and response capabilities using the RAG pattern.