#fine-tuning
xtuner
XTuner is a versatile toolkit for efficiently fine-tuning both language and vision models across a variety of GPU platforms. It supports models like InternLM, Mixtral, and Llama, as well as VLMs such as LLaVA, ensuring flexibility and scalability. With features such as FlashAttention and Triton kernels, XTuner optimizes training and integrates seamlessly with DeepSpeed. It supports several training algorithms, including QLoRA and LoRA, and provides a structured data pipeline that accommodates diverse dataset formats. XTuner models are ready for deployment through systems like LMDeploy and can be evaluated with tools such as OpenCompass. Recent updates expand model support and improve installation guidance.
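XTuner drives training through config files and its own CLI rather than a Python API; as a concept-level illustration of the QLoRA technique it supports, here is a minimal sketch using the Hugging Face peft and bitsandbytes libraries (not XTuner's own interface; the model ID and LoRA hyperparameters are placeholder assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Quantize the frozen base weights to 4-bit NF4 (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",                 # placeholder model choice
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable low-rank adapters to the attention projections
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()       # adapters are <1% of all weights
```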
refact
The repository provides a WebUI for fine-tuning and self-hosting open-source code models, enabling enhanced code completion and chat features within Refact plugins. It supports easy Docker-based server hosting, running multiple models on a single GPU, and integration with external GPT models through third-party keys. Notable features include model sharding, downloading and uploading of LoRAs, and compatibility with models like Refact/1.6B and the starcoder2 series. Comprehensive plugin support for VS Code and JetBrains allows seamless integration into development workflows, making it suitable for small teams or individual developers under the BSD-3-Clause license, with enterprise options available.
Agent-FLAN
Agent-FLAN improves large language models by effectively integrating agent abilities. It tackles issues such as hallucination and shifts in data distribution, improving performance on agent evaluation datasets by 3.5%. Built on the AgentInstruct and ToolBench datasets, it carefully paces the learning of agent skills. Available on HuggingFace and OpenXLab, it marks significant progress in both agent task performance and the general abilities of LLMs.
CogVideo
The CogVideoX series presents advanced, open-source models for video generation, enabling tasks such as text-to-video, video continuation, and image-to-video. The latest models, CogVideoX-5B and CogVideoX-5B-I2V, enhance video quality and visual effects, providing a flexible framework for GPU fine-tuning. Recent enhancements feature open-source access to key models, boosting inference efficiency and incorporating new prompt optimization tools. Supported by detailed technical documents and community interaction, the series offers innovative video generation capabilities, assisting both developers and researchers.
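As a hedged sketch of the text-to-video path, the 5B model can be run through the CogVideoX pipeline in a recent diffusers release (the prompt, step count, and guidance values below are illustrative assumptions):

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()   # trades speed for much lower VRAM use

video = pipe(
    prompt="a panda playing guitar in a bamboo forest",  # example prompt
    num_inference_steps=50,
    guidance_scale=6.0,
).frames[0]
export_to_video(video, "output.mp4", fps=8)
```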
DNABERT
DNABERT employs pre-trained encoders to enhance DNA sequence analysis, offering extensive resources like source codes and visualization tools. An extension of Hugging Face's transformers for genomic DNA, DNABERT is continually updated, featuring DNABERT-2 for multi-species genomes. It supports general and task-specific fine-tuning, offering efficiency and ease of use for researchers employing NVIDIA GPUs on Linux, ultimately facilitating advanced genomic insights.
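Since DNABERT-2 is distributed through the Hugging Face hub, a minimal embedding sketch looks like the following (the model ID and pooling choice are assumptions based on the project's public release):

```python
import torch
from transformers import AutoTokenizer, AutoModel

model_id = "zhihan1996/DNABERT-2-117M"      # assumed hub ID
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)

dna = "ACGTAGCATCGGATCTATCTATCGACACTTGGTTATCGATCTACGAGCATCTCGTTAGC"
input_ids = tok(dna, return_tensors="pt")["input_ids"]
with torch.no_grad():
    hidden = model(input_ids)[0]            # (1, seq_len, hidden_dim)
embedding = hidden.mean(dim=1)              # mean-pool to a single vector
print(embedding.shape)
```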
embedding_studio
This open-source framework transforms vector databases into comprehensive search engines by efficiently integrating embedding models. It supports clickstream data collection and the continuous enhancement of search functionality. Highly customizable, the framework dynamically optimizes search performance to align with varying data sources. It is particularly beneficial for handling rich, unstructured data or supporting customer-facing platforms, enabling quick adaptation to shifting user preferences. With further features under active development, it aims to keep search quality improving over time, offering a cost-effective option for complex data environments.
LLaMA-LoRA-Tuner
The tool facilitates LLaMA model evaluation and fine-tuning with low-rank adaptation (LoRA), featuring a 1-click setup on Google Colab for streamlined training, easy switching among base models like 'llama-7b-hf' and 'gpt4all-j', and compatibility with various dataset formats. Recent updates introduce a chat UI and a demo mode for interacting with models, though the latest version lacks fine-tuning capability. It remains a valuable asset for researchers seeking a versatile and accessible model exploration tool.
DisCo
DisCo provides a toolkit for creating realistic human dance sequences via disentangled control, supporting both video and image outputs across pre-training, fine-tuning, and human-specific fine-tuning. It generalizes efficiently to varied dance forms while minimizing the need for extensive fine-tuning, positioning itself as a leading solution in this field. The project includes user-friendly features for easy training and experimentation, well-suited to real-world applications of DisCo technology.
custom-diffusion
Learn how Custom Diffusion enables efficient fine-tuning of text-to-image models like Stable Diffusion. The approach introduces new concepts into a model by adjusting only a small set of key parameters, producing unique multi-concept images with minimal storage overhead. Newly released datasets are available, and the method is now integrated into diffusers for faster training and inference.
llm-engine
LLM Engine is a comprehensive tool for deploying and customizing large language models such as LLaMA, MPT, and Falcon. It supports hosted infrastructure or Kubernetes deployment, offering scalable solutions with ready-to-use APIs, efficient inference, and open-source integrations. Upcoming documentation on Kubernetes installation and cost-saving strategies aims to optimize resources further. Explore the potential of AI models with LLM Engine's detailed guidance and flexible deployment options.
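For the hosted path, the project's Python client follows a completion-style API; a hedged sketch (class and field names assumed from the project's documentation, and an API key must be configured):

```python
from llmengine import Completion

# Request a completion from a hosted open-source model
response = Completion.create(
    model="llama-2-7b",                 # assumed model identifier
    prompt="Explain LoRA in one sentence.",
    max_new_tokens=80,
    temperature=0.2,
)
print(response.output.text)
```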
gpu_poor
The tool estimates required GPU memory and token throughput for large language models (LLMs) on various GPUs and CPUs. It provides a detailed memory-usage breakdown for both training and inference, supporting quantization tools such as GGML and bitsandbytes, and frameworks like vLLM, llama.cpp, and HF. Key functions include estimating vRAM requirements, token generation rate, and approximate finetuning duration. The tool helps assess quantization suitability, maximum context capacity, and feasible batch sizes for a given GPU, offering valuable insight into GPU memory optimization.
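The kind of arithmetic the tool automates can be sketched by hand; a rough, assumption-laden example covering only weights and KV-cache (ignoring activation and framework overhead):

```python
def weights_vram_gb(n_params_billion, bytes_per_param=2.0):
    """fp16/bf16 = 2 bytes, int8 = 1, 4-bit quantization ~= 0.5."""
    return n_params_billion * bytes_per_param  # 1e9 params * bytes -> GB

def kv_cache_gb(n_layers, hidden_size, context_len, batch=1, bytes_per=2):
    # one K and one V tensor per layer, each [batch, context, hidden]
    return 2 * n_layers * batch * context_len * hidden_size * bytes_per / 1e9

print(weights_vram_gb(7))           # ~14 GB: a 7B model in fp16
print(weights_vram_gb(7, 0.5))      # ~3.5 GB: the same model at 4-bit
print(kv_cache_gb(32, 4096, 4096))  # ~2.1 GB KV-cache at 4k context
```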
tanuki.py
Tanuki enables smooth integration of LLM-powered functions into Python applications, with a focus on reliability and type safety. It offers simple implementation with lower cost and latency, backed by numerous well-known models. Tanuki automates model distillation and implements test-driven alignment, so functionality can be extended without hand-managing prompts. It offers scalable improvements and up to 90% cost savings, suiting developers who need efficient, structured output from LLMs across a range of applications.
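The typed-function pattern the project describes looks roughly like this (decorator names and the assert-based alignment style are assumed from the project's README conventions):

```python
from typing import Literal
import tanuki

@tanuki.patch
def classify_sentiment(msg: str) -> Literal["Good", "Bad"]:
    """Classify the sentiment of the user's message as Good or Bad."""

@tanuki.align
def align_classify_sentiment():
    # assert-based examples steer (and later distill) the model
    assert classify_sentiment("I love this library!") == "Good"
    assert classify_sentiment("This is terrible.") == "Bad"
```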
FireAct
Discover resources for optimizing language models using FireAct. The repository provides prompts, demo codes, and datasets tailored for language agent fine-tuning. It supports task exploration, OpenAI API integration, and SERP API utilization. FireAct guides through data generation, Alpaca and GPT format fine-tuning, and supervised learning for enhanced outcomes. Explore a model zoo with Llama family-based multitask models for effective language agent applications.
bert4torch
This open-source project supports a variety of tasks, including loading and fine-tuning large language models such as chatglm, llama, and baichuan. It simplifies deployment to a single command and covers models such as BERT, RoBERTa, ALBERT, and GPT for flexible finetuning. Extensive practical examples are provided, validated on public datasets. The project offers intuitive tools that incorporate effective techniques, allowing model loading from the transformers library and convenient progress monitoring. Initially developed against 'torch==1.10', it now also accommodates 'torch==2.0', making it a versatile resource for developers seeking flexibility and ease in model training and deployment.
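Its loading pattern mirrors the bert4keras style; a hedged sketch (function and argument names assumed from that convention, with a locally downloaded checkpoint directory as a placeholder):

```python
from bert4torch.models import build_transformer_model
from bert4torch.tokenizers import Tokenizer

# Paths into a locally downloaded BERT checkpoint (placeholders)
config_path = "bert-base-chinese/config.json"
checkpoint_path = "bert-base-chinese/pytorch_model.bin"
vocab_path = "bert-base-chinese/vocab.txt"

tokenizer = Tokenizer(vocab_path, do_lower_case=True)
model = build_transformer_model(config_path, checkpoint_path)

# Encode a sentence ("nice weather today") into ids for the model
token_ids, segment_ids = tokenizer.encode("今天天气不错")
```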
LLM-Finetuning-Toolkit
This toolkit provides a CLI solution for managing LLM fine-tuning experiments efficiently. It controls all key aspects of the experimentation pipeline—prompts, LLMs, optimization strategies, and testing—via a single YAML config file. Installation through pipx or pip makes it user-friendly. The toolkit’s modular design supports data ingestion, model configuration, and quality testing, with an open invitation for open-source contributions.
awesome-llms-fine-tuning
Discover a curated selection of resources for fine-tuning Large Language Models (LLMs) like GPT, BERT, and RoBERTa. This repository provides comprehensive tutorials, papers, tools, and best practices for advancing LLMs in specific domains. It serves machine learning practitioners and data scientists in optimizing LLM performance and ensuring alignment with particular tasks. Explore insights and guidelines from GitHub projects to courses and literature.
cookbook
Mistral Cookbook offers practical examples and tutorials from users and partners that highlight the diverse applications of Mistral models, including model evaluation and embedding techniques. Contributions are expected to be clear, original, and reproducible, adding value to the community. It accepts submissions in .md or .ipynb format with Colab-compatible examples, fostering a collaborative learning environment.
ChatGLM-Efficient-Tuning
The project implements advanced fine-tuning techniques for the ChatGLM-6B model, including LoRA, P-Tuning V2, and Reinforcement Learning with Human Feedback (RLHF). It features a comprehensive Web UI for single GPU-based training, evaluation, and inference, highlighting its role in optimizing large language models. The repository supports various datasets like Stanford Alpaca, BELLE, and GPT-4 generated data, enhancing ChatGLM's adaptability to diverse datasets and tuning methods. Although the project is no longer actively maintained, it has significantly contributed to the efficient tuning of language models.
LLaMA-Factory
LLaMA-Factory streamlines the fine-tuning of large language models with advanced algorithms and scalable resources. It supports various models such as LLaMA, LLaVA, and Mistral. Offering capabilities like full-tuning, freeze-tuning, and different quantization methods, it enhances training speed and GPU memory usage efficiency. The platform facilitates experiment tracking and offers fast inference through an intuitive API and interface, suitable for developers improving text generation projects.
xTuring
The platform provides an intuitive interface for fine-tuning open-source LLMs such as Mistral, LLaMA, and GPT-J. It facilitates customized model management while maintaining data privacy, supports data ingestion, scalable GPU usage, and employs memory-efficient methods like INT4 and LoRA to lower hardware expenses. Users can explore various tuning techniques, assess models using defined metrics, and leverage new library features such as LLaMA 2 integration and CPU inference capabilities for enhanced performance and precision.
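A hedged sketch of the fine-tuning flow through the library's Python interface (class names and the dataset path are assumptions based on the project's documented style):

```python
from xturing.datasets import InstructionDataset
from xturing.models import BaseModel

# Load an instruction dataset from a local directory (placeholder path)
dataset = InstructionDataset("./alpaca_data")

# Create a LoRA variant of the base model and fine-tune it
model = BaseModel.create("llama_lora")
model.finetune(dataset=dataset)

# Query the tuned model
output = model.generate(texts=["Why are LLMs useful?"])
print(output)
```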
aikit
AIKit is an adaptable platform for hosting, deploying, and fine-tuning large language models (LLMs). It exposes an OpenAI API-compatible endpoint, uses LocalAI for inference, and provides a flexible fine-tuning interface through Unsloth. Its minimal image size improves security, and it supports multimodal models. AIKit suits air-gapped environments, can host multiple models from a single image, deploys on Kubernetes, and supports AMD64, ARM64, and NVIDIA GPUs for faster inference.
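Because the endpoint is OpenAI API-compatible, any standard client can talk to a running AIKit container; a sketch (the base URL and model name are deployment-specific assumptions):

```python
from openai import OpenAI

# Point the client at the local AIKit container instead of api.openai.com
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="llama-3-8b-instruct",   # whatever model the image was built with
    messages=[{"role": "user", "content": "Say hello from AIKit."}],
)
print(resp.choices[0].message.content)
```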
LLaMa2lang
Explore a methodology to enhance the performance of LLaMa3-8B in multiple non-English languages through advanced fine-tuning techniques and Retrieval-Augmented Generation (RAG). This guide details the step-by-step process, from dataset translation to the use of QLoRA and PEFT for efficient language model tuning. It covers a variety of foundation models, including LLaMa3 and Mistral, providing broad compatibility. Notably cost-effective, the project can be executed using free GPU resources like Google Colab. Discover the integration of various translation paradigms and implementation of DPO for improved model responses, suitable for developers enhancing multilingual chat platforms.
mistral-finetune
The mistral-finetune project provides an efficient platform for fine-tuning Mistral models using the LoRA training approach, which conserves memory by freezing most weights and training only a small fraction of additional low-rank adapter weights. Tailored for multi-GPU environments, it also accommodates single-GPU use for smaller models like the 7B. It recently added support for models such as Mistral Large v2 and Mistral Nemo, which demand more memory but extend its finetuning capabilities. It serves as a straightforward entry point for finetuning Mistral models, with specific data-formatting and installation instructions essential for training across various systems.
chatglm_finetuning
This project enhances ChatGLM models by offering diverse tuning options with integrations for PyTorch Lightning, ColossalAI, and Transformer trainers. It includes guidance for LoRA and other fine-tuning methods, installation instructions, data scripts, and continual updates for improved model application.
Platypus
This project delivers advanced solutions to enhance transformer architectures like LLaMA and LLaMA-2 using LoRA and PEFT. It focuses on efficiency and affordability, allowing users to access fine-tuned models on HuggingFace with seamless integration. Recent advancements include improved data processing and scripts for easy model setup and tuning. Discover various data refinement techniques to ensure model training accuracy and uniqueness, with detailed CLI guidelines for local deployment.
florence2-finetuning
Discover methods to fine-tune Microsoft's Florence-2, a compact yet powerful vision-language model applicable in diverse tasks such as captioning and OCR. This comprehensive guide addresses specific task adaptation like DocVQA and provides insights on installation and training, including single and distributed GPU setups. Understanding model revisions coupled with appropriate datasets can significantly boost performance, positioning Florence-2 as a flexible choice in computer vision and language tasks.
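A hedged inference sketch through transformers, using the task-prompt convention from the public model card (the model ID and local image path are assumptions):

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

model_id = "microsoft/Florence-2-base"   # assumed hub ID
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("page.png").convert("RGB")  # placeholder input image
# Florence-2 selects the task via special prompts such as "<OCR>"
inputs = processor(text="<OCR>", images=image, return_tensors="pt")
ids = model.generate(
    input_ids=inputs["input_ids"],
    pixel_values=inputs["pixel_values"],
    max_new_tokens=256,
)
print(processor.batch_decode(ids, skip_special_tokens=False)[0])
```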
LLMs_interview_notes
This collection offers detailed interview notes for Large Language Models (LLMs), drawn from practitioners' experience. It spans foundational to advanced preparation, covering frequent interview questions, model architectures, and training objectives. The guide provides strategies for issues such as repetitive outputs and model selection across domains, as well as insights on distributed training, efficient tuning, and inference. It serves as a practical, no-frills resource for preparing for LLM-focused professional interviews.
simple-llm-finetuner
Explore a straightforward interface for tuning language models with the LoRA method on NVIDIA GPUs. The Simple LLM Finetuner uses the PEFT library to provide easy-to-use tools for dataset handling, parameter tweaking, and evaluating model inference. Suitable for beginners, it supports small datasets and can run on standard Colab instances. Adjust settings with ease to boost model performance with minimal effort.
CareGPT
Explore an open-source medical language model built on diverse datasets and efficient deployment methods. Features include ChatGPT-style fine-tuning, Gradio deployment, and LLaMA model support, integrating AI techniques with comprehensive medical knowledge. It provides tools for GPT-4/ChatGPT model distillation and resources for constructing medical knowledge bases. Ideal for developers and researchers, the platform covers over 60 medical departments and performs well on benchmarks such as CMB. Engage with advanced medical AI through downloadable models and online interaction.
LongLoRA
LongLoRA utilizes efficient fine-tuning to enhance long-context language models with techniques like shifted short attention and Flash-Attention compatibility. Supporting models from 7B to 70B and context lengths up to 100k, it integrates an open-sourced dataset, LongAlpaca-12k, while facilitating reduced memory usage through QLoRA. This approach expands models' capability for complex tasks and optimizes computational resources.
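The shifted short attention idea (S2-Attn in the paper) can be sketched in a few lines: attention is computed inside fixed-size groups, and half the heads operate on a half-group-shifted view so information flows across group boundaries. A toy shape-level sketch under those assumptions:

```python
import torch

def s2_attn_views(x, n_heads, group):
    """Return grouped per-head views of x: (batch, seq, dim)."""
    b, s, d = x.shape
    xh = x.view(b, s, n_heads, d // n_heads)
    shifted = xh.clone()
    # Shift the second half of the heads by half a group along seq,
    # so their local attention windows straddle group boundaries.
    shifted[:, :, n_heads // 2 :] = torch.roll(
        xh[:, :, n_heads // 2 :], shifts=-group // 2, dims=1
    )
    # Fold the sequence into (num_groups, group); attention then runs
    # independently within each group, keeping cost low at long context.
    return shifted.view(b, s // group, group, n_heads, d // n_heads)

x = torch.randn(1, 8192, 512)
print(s2_attn_views(x, n_heads=8, group=2048).shape)
# torch.Size([1, 4, 2048, 8, 64])
```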
mint
Discover a minimalistic PyTorch library implementing common Transformer architectures, ideal for model development from scratch. Engage with sequential tutorials featuring BERT, GPT, and additional models crafted to enhance understanding of Transformers. Utilize fast subword tokenization with HuggingFace tokenizers. The library supports pretraining on various dataset sizes using in-memory and out-of-memory techniques and includes fine-tuning capabilities. Experience features such as the BERT completer for masked string completion. A functional toolkit to support machine learning projects.
DALM
The DALM toolkit allows developers to implement domain-specific language models into their applications, improving AI systems by aligning them with distinct intellectual properties for better performance. This open-source resource facilitates efficient fine-tuning with Retrieval Augmented Generation frameworks and supports popular models like Llama and GPT. Accessible for demonstration, it offers a comprehensive framework for training and assessing domain-oriented models via contrastive learning and data preparation tools.
uniem
Discover the forefront of Chinese text embedding with this open-source project on HuggingFace. Featuring uniem's integration with sentence-transformers, text2vec, and tools like SGPT, recent updates in version 0.3.0 expand fine-tuning possibilities. Explore new text-classification and retrieval benchmarks with MTEB-zh and contribute to the evolving community under the Apache-2.0 license.
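The fine-tuning entry point is compact; a hedged sketch (class name, model ID, and dataset choice assumed from the project's documented example):

```python
from datasets import load_dataset
from uniem.finetuner import FineTuner

# A Chinese sentence-pair dataset (assumed example choice)
dataset = load_dataset("shibing624/nli_zh", "STS-B")

# Fine-tune an M3E embedding model on it
finetuner = FineTuner.from_pretrained("moka-ai/m3e-small", dataset=dataset)
finetuner.run(epochs=1)
```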
Phi-3CookBook
Explore a detailed guide on Microsoft's Phi-3 AI models, notable for their performance across language, reasoning, coding, and math tasks. The resource offers step-by-step methods for deploying and fine-tuning these adaptable models on platforms like Azure AI Studio, GitHub, and Hugging Face. Ideal for developers and AI enthusiasts, it highlights the potential of Phi-3 for customized AI applications.
Whisper-Finetune
Discover how to optimize Whisper, the advanced multilingual ASR model. The project emphasizes LoRA fine-tuning on non-timestamped, timestamped, and audio-less data, and accelerates inference via CTranslate2 and GGML for deployment on diverse platforms, including Windows and Android. Recent updates improve Chinese recognition and processing speed, and this comprehensive guide details setup, data preparation, and evaluation strategies for getting the most out of Whisper.
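For the CTranslate2 deployment path, a converted checkpoint can be served through the faster-whisper package; a hedged sketch (the model directory and audio file are placeholders, and the repo also ships its own conversion and inference scripts):

```python
from faster_whisper import WhisperModel

# Load a CTranslate2-converted Whisper checkpoint (placeholder path)
model = WhisperModel("whisper-large-v3-ct2", device="cuda",
                     compute_type="float16")  # use "int8" on CPU

segments, info = model.transcribe("speech.wav", language="zh", beam_size=5)
for seg in segments:
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")
```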
Feedback Email: [email protected]