Ollama Grid Search and A/B Testing Desktop App
Ollama Grid Search is a desktop application, written in Rust, that provides a streamlined interface for evaluating large language models (LLMs), prompts, and model parameters. Its main objective is to help users select the most suitable models and configurations for a specific use case by running combinations of them and displaying the outcomes for comparison.
Purpose
This project automates the evaluation of LLMs, prompts, and inference parameters. Users can test different combinations and visually inspect the results to identify the best setup for their needs. The tool works with Ollama, which must be installed and serving endpoints, either locally or on a remote server.
Quick Example
In a typical scenario, a user tests a simple prompt across two different models while experimenting with two temperature settings, such as 0.7 and 1.0. The experiment interface then shows the results of every combination so they can be compared quickly.
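The four runs in this scenario can be sketched as request bodies for Ollama's `/api/generate` endpoint. This is only an illustration of the combinations involved, not the app's own code; the model names and the prompt are placeholders chosen for the example.

```python
from itertools import product

# Placeholder values for illustration; substitute models available on your server.
models = ["llama3", "mistral"]
temperatures = [0.7, 1.0]
prompt = "Write a haiku about the sea."

# One request body per (model, temperature) pair, in the shape accepted by
# Ollama's POST /api/generate endpoint.
payloads = [
    {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single JSON response instead of a stream
        "options": {"temperature": temperature},
    }
    for model, temperature in product(models, temperatures)
]

for payload in payloads:
    print(payload["model"], payload["options"]["temperature"])
```

Two models and two temperatures give four requests, one per cell of the comparison.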
Installation
To install the Ollama Grid Search tool, users should visit the project's releases page.
Features
- Model Access: Automatically retrieves models from local or remote Ollama servers.
- Iteration and Testing: Allows testing of different models, prompts, and parameters to generate inferences concurrently.
- Visual Comparison: Provides A/B testing capabilities for comparing different prompts or models.
- Repeatability: Supports multiple iterations for each parameter combination and re-runs of past experiments.
- Concurrency Management: Enables limited concurrency or synchronous inference calls to manage server resources.
- Detailed Insights: Outputs inference parameters and includes metadata for response time and token analysis.
- Experiment Management: Lists experiments in a downloadable JSON format and offers comprehensive views and re-run capabilities.
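The concurrency setting listed above can be pictured as a bounded worker pool. The sketch below is an assumption about the general technique, not a description of how the app implements it; `run_inference` is a hypothetical stand-in that sleeps instead of calling a server, and `max_concurrency` is a name invented for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_inference(job):
    """Hypothetical stand-in for an inference call; returns the job plus timing."""
    started = time.perf_counter()
    time.sleep(0.01)  # simulate server latency
    return {**job, "elapsed_s": time.perf_counter() - started}


def run_all(jobs, max_concurrency=1):
    """Run jobs with at most `max_concurrency` in flight; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(run_inference, jobs))


jobs = [
    {"model": model, "temperature": temperature}
    for model in ("model-a", "model-b")
    for temperature in (0.7, 1.0)
]

one_at_a_time = run_all(jobs, max_concurrency=1)  # calls run back to back
limited = run_all(jobs, max_concurrency=2)        # at most two calls at once
```

With a limit of 1 the calls run one after another, which is the gentlest option for a server with limited memory; raising the limit trades server load for a shorter total run time.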
Grid Search Concept
While traditional grid search refers to optimizing training hyperparameters, the Ollama Grid Search applies a similar strategy for evaluation. Users can select models, prompts, and parameter combinations to generate and compare results.
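As a rough sketch of the idea, the grid is the Cartesian product of the selected values. The function below is hypothetical and written only for this explanation; `expand_grid`, its arguments, and the example parameter names are assumptions rather than the app's actual data model.

```python
from itertools import product


def expand_grid(models, prompts, parameters, iterations=1):
    """Return one run description per model x prompt x parameter combination x iteration."""
    names = list(parameters)
    runs = []
    for model, prompt in product(models, prompts):
        # Every combination of the listed parameter values.
        for values in product(*(parameters[name] for name in names)):
            for iteration in range(1, iterations + 1):
                runs.append({
                    "model": model,
                    "prompt": prompt,
                    "options": dict(zip(names, values)),
                    "iteration": iteration,
                })
    return runs


runs = expand_grid(
    models=["model-a", "model-b"],
    prompts=["Summarize this paragraph."],
    parameters={"temperature": [0.7, 1.0], "top_p": [0.9, 0.95]},
    iterations=2,
)
print(len(runs))  # 2 models * 1 prompt * 4 parameter combinations * 2 iterations = 16
```

Because the number of runs is the product of the list lengths, each extra value multiplies the total, so it helps to keep the lists short when the server is slow.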
A/B Testing
The app also supports A/B testing. Users can select different models and compare their outputs for the same prompt, or apply different prompts and see how each performs under the same configuration.
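One way to picture such a comparison is to group results so that only one factor varies per row. The sketch below assumes a hypothetical result shape with `model`, `prompt`, and `response` keys; it does not reflect the app's internal structures.

```python
from collections import defaultdict


def side_by_side(results, vary="model"):
    """Group results by the fixed factor so the varied one can be read as A vs. B."""
    fixed = "prompt" if vary == "model" else "model"
    rows = defaultdict(dict)
    for result in results:
        rows[result[fixed]][result[vary]] = result["response"]
    return dict(rows)


# Hypothetical results: the same prompt answered by two models.
results = [
    {"model": "model-a", "prompt": "Define entropy.", "response": "Answer from A"},
    {"model": "model-b", "prompt": "Define entropy.", "response": "Answer from B"},
]

table = side_by_side(results, vary="model")
for prompt, answers in table.items():
    print(prompt, answers)
```

Passing `vary="prompt"` instead groups by model, which corresponds to comparing prompts under one configuration.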
Experiment Logs
For thorough analysis, users can access their experiment logs, which can be inspected or downloaded for further review.
Future Features
The development roadmap for the project includes:
- Grading and filtering results by quality.
- Storing experiments in a local database.
- Importing, exporting, and sharing experiment parameters.
Contributing
The project welcomes contributions. For simple changes, such as bug fixes, users are encouraged to submit a pull request directly. For more substantial modifications or suggestions, opening an issue for discussion is recommended before proceeding with development.
Development Setup
- Ensure Rust is installed.
- Clone the repository:
git clone https://github.com/dezoito/ollama-grid-search.git
cd ollama-grid-search
- Install frontend dependencies using your preferred package manager:
bun install
- Configure rust-analyzer to use Clippy checks (especially in VS Code).
- Start the app in development mode:
bun tauri dev
Citations
This repository and its contributions are recognized in academic works, such as theses from Santa Clara University about auto-tuning methods for machine learning hyperparameters.
Thank You!
The project extends gratitude to contributors such as @FabianLars, @pepperoni21, and @TomReidNZ for their support and contributions.