mergekit

Optimize and Merge Language Models Using CPU and GPU

Product Description

MergeKit is a toolkit for merging pre-trained language models, with support for merge algorithms such as Linear, SLERP, and Task Arithmetic. It is suited to resource-constrained settings: merges can run on either CPU or GPU, with low VRAM requirements. Features include lazy tensor loading and layer-based model assembly. It is compatible with models such as Llama, Mistral, and GPT-NeoX, offers a GUI on Arcee's platform, and supports sharing merged models on the Hugging Face Hub. Custom merge strategies are defined through a YAML configuration.
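
To give a feel for the three merge algorithms named above, here is a minimal sketch in plain Python that applies each one to flat lists of floats standing in for weight tensors. This is not MergeKit's own code or API: the function names (`linear_merge`, `slerp_merge`, `task_arithmetic_merge`) and the list-based representation are assumptions made for illustration only.

```python
import math


def linear_merge(a, b, t):
    """Linear merge: element-wise weighted average (1 - t) * a + t * b."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def slerp_merge(a, b, t, eps=1e-8):
    """SLERP: interpolate along the arc between a and b instead of the chord."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a < eps or norm_b < eps:
        # The angle is undefined for a zero vector; fall back to linear.
        return linear_merge(a, b, t)
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding error
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        # Nearly parallel (or opposite) vectors; fall back to linear.
        return linear_merge(a, b, t)
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def task_arithmetic_merge(base, finetuned_models, weights):
    """Task arithmetic: add weighted task vectors (finetuned - base) to base."""
    merged = list(base)
    for model, w in zip(finetuned_models, weights):
        for i, (m, b0) in enumerate(zip(model, base)):
            merged[i] += w * (m - b0)
    return merged
```

For example, `linear_merge([0.0, 2.0], [2.0, 4.0], 0.5)` averages the two vectors element by element, while `slerp_merge` with the same `t` follows the arc between them, which keeps the result from shrinking toward the origin when the two vectors point in different directions. In a real merge the same arithmetic would be applied tensor by tensor rather than to a single flat list.
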
Project Details