Mergoo Project Overview
The Mergoo project is a library designed to make it easy to merge multiple Large Language Model (LLM) experts and to train the merged model efficiently. It integrates knowledge from different LLM experts, whether they are domain-specific or general-purpose.
🚀 Key Features
- Merging Methods: Mergoo supports several merging methods, including Mixture-of-Experts, Mixture-of-Adapters, and layer-wise merging, providing flexibility in how expert models are integrated.
- Model Compatibility: It can be adapted to various base models: Llama (including Llama3), Mistral, Phi3, and BERT.
- Training Options: Users can choose between training only specific parts of the merged model (such as the routers of the Mixture-of-Experts layers) or fully fine-tuning it; see the sketch after this list.
- Device Support: Mergoo supports training on CPU, MPS, and GPU, ensuring adaptability in various computational environments.
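As a rough illustration of the router-only training option above, standard PyTorch can freeze every parameter except the router (gating) weights before fine-tuning. In this sketch, model stands for a merged model that has already been loaded (loading is covered under Getting Started below), and the "gate" substring is an assumed naming convention for the router parameters, not a confirmed Mergoo API detail:

# Hedged sketch: train only the router of a merged Mixture-of-Experts model.
# `model` is assumed to be a merged model already loaded in memory, and the
# "gate" substring is an assumed naming convention for router parameters.
for name, param in model.named_parameters():
    if "gate" not in name:
        param.requires_grad_(False)  # freeze everything except the router weights

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable (router) parameters: {trainable:,}")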
Installation and Integration
Mergoo can be installed via pip for the stable release, from GitHub via pip for the latest updates, or from a local clone of the source for those who prefer direct customization.
pip install mergoo
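For the latest changes, installing straight from GitHub typically uses pip's Git support; the Leeroo-AI/mergoo repository path below is assumed from the project's public page and should be confirmed there:

pip install git+https://github.com/Leeroo-AI/mergoo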
Getting Started
Setting up Mergoo involves specifying configurations such as model type, the number of experts per token, and the desired router layers when merging fully fine-tuned or LoRA fine-tuned LLM experts.
For example, merging fully fine-tuned LLM experts can look like this in Python:
config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "experts": [
        {"expert_name": "base_expert", "model_id": "mistralai/Mistral-7B-v0.1"},
        {"expert_name": "expert_1", "model_id": "meta-math/MetaMath-Mistral-7B"},
        {"expert_name": "expert_2", "model_id": "ajibawa-2023/Code-Mistral-7B"}
    ],
    "router_layers": ["gate_proj", "up_proj", "down_proj"]
}
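Once the config is defined, the experts are composed into a single checkpoint and saved. Below is a minimal sketch of that step, assuming a ComposeExperts class in mergoo.compose_experts with compose and save_checkpoint methods; both the names and the output path should be verified against the Mergoo README:

import torch
from mergoo.compose_experts import ComposeExperts  # assumed module and class names

# Merge the experts listed in `config` into one Mixture-of-Experts checkpoint.
expertmerger = ComposeExperts(config, torch_dtype=torch.float16)
expertmerger.compose()
expertmerger.save_checkpoint("data/mistral_math_code_moe")  # illustrative output path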
The library allows for seamless integration with the Hugging Face Trainer, enabling users to load and further fine-tune the merged expert models conveniently.
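A minimal sketch of that workflow is shown below, assuming the merged checkpoint saved above and Mergoo's Mistral model class (the mergoo.models.modeling_mistral import path is taken from the project's examples and should be double-checked); the training arguments and dataset are placeholders:

from transformers import Trainer, TrainingArguments
from mergoo.models.modeling_mistral import MistralForCausalLM  # assumed import path

# Load the merged Mixture-of-Experts checkpoint saved in the previous step.
model = MistralForCausalLM.from_pretrained("data/mistral_math_code_moe")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="checkpoints", per_device_train_batch_size=1),
    train_dataset=my_dataset,  # placeholder: any tokenized causal-LM dataset
)
trainer.train()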
Learning and Support
Mergoo offers a range of tutorials and detailed guides to help users get the most out of its features. Whether exploring fully fine-tuned experts or working with Mixture-of-Adapters, there are resources available to support implementation.
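For the Mixture-of-Adapters setting, the configuration mirrors the fully fine-tuned example above, except that the experts are LoRA adapters sharing one base model. The sketch below is an assumption about how such a config might look (in particular the base_model_id key and the placeholder adapter IDs); the exact schema is documented in the Mergoo tutorials:

# Hedged sketch of a Mixture-of-Adapters configuration; the keys and adapter
# IDs below are illustrative assumptions, not the library's confirmed schema.
adapter_config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "base_model_id": "mistralai/Mistral-7B-v0.1",
    "experts": [
        {"expert_name": "adapter_expert_1", "model_id": "<lora-adapter-id-1>"},
        {"expert_name": "adapter_expert_2", "model_id": "<lora-adapter-id-2>"},
    ],
    "router_layers": ["gate_proj", "up_proj", "down_proj"],
}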
The project has a clear and ambitious roadmap, aiming to introduce router load balancing, support for a broader range of expert models, and additional advanced merging techniques.
Community and Contribution
Mergoo is open-source and thrives through community contribution. The project welcomes suggestions, feature requests, and code enhancements from users. Interested individuals can join the Leeroo community through various platforms, such as Twitter, LinkedIn, Discord, and the Leeroo website.
The project documentation urges potential contributors to pitch in with new ideas, feature implementations, or even help improve existing infrastructure.
For inquiries or further information, the community can open GitHub issues or reach out via email for support or questions.
Mergoo represents a significant step forward in the domain of language modeling, setting the stage for more insightful and capable integrations of expert models.