#language models
LoRA
LoRA (Low-Rank Adaptation) adapts large language models by training small low-rank matrices instead of the full weights, which sharply reduces the number of trainable parameters. This keeps per-task storage small and adds no inference latency. The Python package works with PyTorch and is supported by the Hugging Face PEFT library, and it performs comparably to full fine-tuning on benchmarks like GLUE. LoRA is applied to specific Transformer components, such as the query and value projections, and has been used with models such as RoBERTa, DeBERTa, and GPT-2. The 'loralib' package can be installed to apply these techniques.
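As a rough illustration of the low-rank idea, the sketch below adds a trainable update B·A to a frozen weight matrix W. It is plain Python, not the 'loralib' API. The names (lora_forward, alpha, rank) and the sizes are assumptions chosen for the example.

```python
import random

def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def lora_forward(w, a, b, x, alpha):
    """Return W x + (alpha / r) * B (A x) without forming the merged matrix.

    w: frozen weights, shape d_out x d_in
    a: trainable down-projection, shape r x d_in
    b: trainable up-projection, shape d_out x r
    """
    rank = len(a)
    base = matvec(w, x)
    update = matvec(b, matvec(a, x))
    scale = alpha / rank
    return [h + scale * u for h, u in zip(base, update)]

# Illustrative sizes (assumptions, not taken from any real model).
rng = random.Random(0)
d_in, d_out, rank = 64, 64, 4

w = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
a = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]  # zero init: the adapter starts as a no-op
x = [rng.gauss(0, 1) for _ in range(d_in)]

full_params = d_in * d_out            # parameters updated by full fine-tuning
lora_params = rank * (d_in + d_out)   # parameters updated by the adapter
print(full_params, lora_params)
```

With these sizes the adapter trains 512 values in place of 4096. Because B starts at zero, the adapted layer initially produces the same output as the frozen one.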
raptor
RAPTOR offers an advanced approach to language models with its recursive tree structure, improving the efficiency of information retrieval in large texts. It supports integration with custom models for summarization and question-answering, making it highly adaptable to different research requirements. The open-source nature encourages continuous enhancement through community contributions.
OpenPrompt
OpenPrompt is an open-source framework for prompt-learning that adapts pre-trained language models (PLMs) to diverse NLP tasks through textual templates. Key features include integration with Hugging Face Transformers and flexible strategies that can be adapted to various applications. The project also announces related updates such as UltraChat for supervised instruction tuning. OpenPrompt offers a standardized platform for simplified and efficient NLP model deployment.
cognee
Cognee provides a flexible and scalable ECL pipeline solution designed to assist developers in effectively managing AI applications. It facilitates the integration and retrieval of historical conversations, documents, and audio transcriptions, thereby lowering hallucinations, development efforts, and costs. Supporting a range of tools like vector and graph storage alongside various LLMs, Cognee is apt for a wide range of data operations. Its modular nature and user management features not only improve development efficiency but also ensure robust and secure data management.
llms-from-scratch-cn
This project provides a detailed, step-by-step guide for building large language models (LLMs) from scratch. Focused on practical implementation and theoretical understanding, it includes tutorials and code examples for comprehension and creation of models like ChatGPT. Targeted at those interested in natural language processing and AI, the project emphasizes hands-on learning of LLM architecture, pre-training, and fine-tuning. Participants can explore models such as ChatGLM, Llama, and RWKV, enhancing their understanding of various model functionalities and mechanisms.
ScaleLLM
A high-performance inference system for large language models that uses techniques such as tensor parallelism and exposes OpenAI-compatible APIs. It supports leading open-source models like Llama3.1 and GPT-NeoX and targets efficient production deployment through Flash Attention and Paged Attention. The system is under active development, with enhancements such as CUDA Graph, Prefix Cache, and Speculative Decoding. It installs from PyPI and offers customization and a flexible server for various tasks, which suits deployments with performance and scalability needs.
minimal-chat
MinimalChat is an open-source app supporting multiple language models like GPT-4 Omni, designed for voice interaction and mobile responsiveness. It offers local hosting, ensuring offline access and secure data storage. Features include customization, markdown support, and easy model switching, making it a versatile tool for development and practical applications.
promptlib
Explore the impact of refined prompt engineering for large models such as GPT-4 and ChatGPT. This project demonstrates how well-structured prompts enhance natural language processing and leverage advanced language model capabilities. It aims to develop tools for developers and knowledge workers, laying a groundwork for broader usage.
Recurrent-LLM
Learn how RecurrentGPT enhances long-text generation by bringing the recurrence mechanism of RNNs to large language models like ChatGPT, addressing the length limits of GPT models. It generates texts of any length by storing its memory as natural language, which makes generation interactive and interpretable. This also positions RecurrentGPT as a building block for next-generation writing systems and AI As Contents, enabling personalized user interactions.
MiniChain
Explore a small library that aims to simplify the process of coding with large language models through streamlined prompt chaining. The library supports the creation of applications such as Retrieval-Augmented QA and Chat with memory, requiring minimal code. Compatible with backends like OpenAI, Hugging Face, and Python, it features prompt visualization and integration with external data sources. MiniChain distinguishes itself with the separation of prompt text using Jinja templates and effortless function annotation, ideal for developers focusing on simplicity without losing functionality.
gpt-neox
This repository offers a robust platform for training large-scale autoregressive language models with advanced optimizations and extensive system compatibility. Utilizing NVIDIA's Megatron and DeepSpeed, it supports distributed training through ZeRO and 3D parallelism on various hardware environments like AWS and ORNL Summit. Widely adopted by academia and industry, it provides predefined configurations for popular model architectures and integrates seamlessly with the open-source ecosystem, including Hugging Face libraries and WandB. Recent updates introduce support for AMD GPUs, preference learning models, and improved Flash Attention, promoting continued advancements in large-scale model research.
direct-preference-optimization
This repository provides an implementation of Direct Preference Optimization (DPO), including the conservative DPO and IPO variants, for training language models from preference data. It is compatible with HuggingFace models, makes it easy to add datasets, and supports a range of GPU setups, covering both the supervised fine-tuning and preference learning stages.
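The three objectives named above differ only in how they score the policy's log-probability margin against a frozen reference model. The sketch below works on per-example summed log-probabilities. It is not the repository's code. The function name and arguments are hypothetical, and the formulas follow the losses as they are commonly stated.

```python
import math

def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def preference_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected,
                    beta=0.1, label_smoothing=0.0, ipo=False):
    """Loss for one preference pair, given sequence log-probabilities.

    label_smoothing > 0 gives the conservative DPO variant;
    ipo=True switches to the squared IPO objective.
    """
    # How much more the policy prefers the chosen answer than the reference does.
    margin = (policy_chosen - policy_rejected) - (ref_chosen - ref_rejected)
    if ipo:
        return (margin - 1.0 / (2.0 * beta)) ** 2
    return (-(1.0 - label_smoothing) * log_sigmoid(beta * margin)
            - label_smoothing * log_sigmoid(-beta * margin))

# Policy identical to the reference: the DPO loss equals log(2).
print(preference_loss(-12.0, -15.0, -12.0, -15.0))
# Policy favors the chosen answer more than the reference does: lower loss.
print(preference_loss(-10.0, -16.0, -12.0, -15.0))
```

Here beta controls how strongly the policy is held to the reference model. A larger value penalizes the same margin shortfall more heavily.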
langsmith-sdk
LangSmith SDKs provide tools to debug, evaluate, and monitor language models and intelligent agents. Integrating seamlessly with LangChain's Python and JavaScript libraries, these SDKs support application tracing and performance analysis for any LLM application. Simplify workflows using LangSmith, from the developers of LangChain. Access detailed documentation and tutorials for best practices to fully leverage the LangSmith platform.
h2o-llmstudio
H2O LLM Studio is a no-code platform for fine-tuning large language models using an intuitive GUI. It features cutting-edge techniques like Low-Rank Adaptation, supports multiple hyperparameters, and offers model performance tracking through Neptune and W&B integration. Recent enhancements provide robust training and optimization methods.
RWKV-Runner
The RWKV-Runner project simplifies the use of large language models through automation and a lightweight executable. It is compatible with the OpenAI API, transforming ChatGPT clients into RWKV clients. Notable features include easy model startup, adaptable VRAM settings, user-friendly interfaces, and multilingual support. Additional tools for model conversion, download management, LoRA finetuning, and example server deployments are also available, making it suitable for users seeking efficient model management across various platforms.
Aquila2
The Aquila2 series includes open-source language models like AquilaChat2, known for its advanced long-text processing, surpassing other models on various benchmarks. With options such as Aquila2-7B and Aquila2-34B, alongside the experimental Aquila2-70B-Expr, the project facilitates finetuning and quantization, accompanied by comprehensive deployment guides for platforms like Hugging Face and ModelHub. This project provides significant improvements in reasoning tasks and long-context comprehension, ideal for complex language application development, with regular updates promoting continuous progress.
phasellm
PhaseLLM is an open-source framework that simplifies the integration and evaluation of large language models such as OpenAI's GPT-3.5, Anthropic's Claude, and Cohere. It offers standardized API interactions, evaluation tools, and automation to optimize model performance in applications like chatbots, with a focus on efficiency and cost-effectiveness for developers and data scientists.
chrome-ai
Chrome AI uses the built-in Gemini Nano model to deliver advanced language processing via the Vercel AI platform. The project allows integration for text generation and embedding, with customizable settings for personalized AI experiences. Though still under development, it showcases potential AI advancements in Chrome applications and supports browsers with WebGPU and WebAssembly, offering a versatile approach to AI.
vercel-llm-api
This reverse-engineered API wrapper gives access to language models such as OpenAI's ChatGPT and Cohere's Command Nightly through the Vercel AI Playground. It supports downloading the available models, text generation, and chat message customization. While it has limitations such as hardcoded user-agents and no authentication support, the library simplifies model access. Install it with 'pip3 install vercel-llm-api'; usage requires only a simple Python client setup. Models such as Bloom and GPT-3.5 can be used without subscription requirements.
laser
Layer-Selective Rank Reduction (LASER) enhances performance in language models by employing low-rank approximations of weight matrices. This technique optimizes reasoning tasks such as question-answering without further training by targeting specific layers and parameters. The project is under active development, focusing on refactoring for better flexibility and usability. It provides reproducible results across various models and benchmarks while encouraging community contributions and interaction. Core features include efficient hyperparameter tuning and adaptability for different language models.
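The core operation described here, replacing a weight matrix with a low-rank approximation, can be sketched in plain Python. The example finds the leading singular directions by power iteration with deflation. That choice is an assumption made to keep the example dependency-free and is not how the project computes them. Which layer and which rank to use are the hyperparameters LASER searches over, and they are not modeled here.

```python
import math
import random

def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def top_singular_triplet(matrix, iters=200, seed=0):
    """Estimate the largest singular value and its vectors by power iteration."""
    rng = random.Random(seed)
    transposed = [list(col) for col in zip(*matrix)]
    v = [rng.gauss(0, 1) for _ in range(len(matrix[0]))]
    for _ in range(iters):
        v = matvec(transposed, matvec(matrix, v))
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            break
        v = [x / norm for x in v]
    u = matvec(matrix, v)
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v

def low_rank_approx(matrix, rank):
    """Return a rank-`rank` approximation built from the top singular triplets."""
    rows, cols = len(matrix), len(matrix[0])
    residual = [row[:] for row in matrix]
    approx = [[0.0] * cols for _ in range(rows)]
    for _ in range(rank):
        sigma, u, v = top_singular_triplet(residual)
        for i in range(rows):
            for j in range(cols):
                term = sigma * u[i] * v[j]
                approx[i][j] += term
                residual[i][j] -= term  # deflate so the next pass finds the next direction
    return approx

weights = [[3.0, 0.0], [0.0, 1.0]]
print(low_rank_approx(weights, 1))  # keeps the dominant direction, drops the weaker one
```

In practice the same reduction would be applied to a selected weight matrix of a trained model using a library SVD routine, with no further training afterward.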
chat_gpt_sdk
The library enhances Flutter by integrating OpenAI's GPT-3.5 and GPT-4 models, supporting text and chat completions, function calling, and image processing. It facilitates easy management of assistants, threads, and runs, and includes features like message and error handling, translation, and image generation. With a straightforward API, it allows for seamless interaction with OpenAI's language models, suitable for various Flutter applications.
kani
Kani is a flexible framework optimized for chat-based language models, including both hosted and open-source versions like GPT and LLaMA. It offers robust customization for NLP researchers and developers, facilitating tool integration and function calls with comprehensive control. By managing chat memory and function execution efficiently, Kani allows for seamless incorporation and quick iteration, free from hidden operations. This framework caters to diverse applications ranging from academic studies to industry implementations, presenting a straightforward and adaptable choice compared to more rigid frameworks.
llama3
Explore the enhanced capabilities of Llama 3 models, ranging from 8B to 70B parameters, available on Hugging Face. Access model weights, tools, and community scripts for responsible AI innovation, detailed across various repositories with guidance on safe use.
machine-learning-list
The reading list systematically introduces fundamental and advanced machine learning concepts, especially focusing on language models. It serves as a guide to key principles, deployment strategies, reasoning techniques, and AI’s broader implications. Structured in tiers, it balances theory and practical application. Subjects include machine learning basics, transformers, training methods, and applications, with insights into AI safety, economic, and philosophical aspects — ideal for understanding and scaling machine learning models.
open-interpreter
Use LLMs to execute code locally through a terminal interface, with support for multiple languages for data analysis and content creation. Open Interpreter offers more flexibility than ChatGPT's Code Interpreter because it can access local libraries and the internet. Code runs only after the user verifies it, which keeps execution both safe and efficient. It integrates with development workflows and includes an interactive demo.
open-instruct
Investigate the tuning of language models with leading-edge approaches on publicly accessible datasets. This project provides a unified codebase for training and assessing, featuring modern enhancements like LoRA, QLoRA, and efficient parameter updates. Find further insights and advancements through related research publications. The repository contains datasets, evaluation scripts for key benchmarks, and offers models such as Tülu tailored to diverse datasets, facilitating improved language model outcomes. Engage in fine-tuning for instruction adherence, employing advanced practices and reliable evaluation techniques.
keras-llm-robot
Keras-llm-robot uses the LangChain and FastChat frameworks behind a Streamlit UI for offline deployment of Hugging Face models, with features like model integration, multimodal support, and customizations including quantization and fine-tuning. It also offers tools for retrieval, speech, and image recognition, plus environment setup guides for multiple operating systems, which makes it a good fit for developers exploring AI model deployment.
mergekit
MergeKit offers an effective solution for merging pre-trained language models with support for algorithms like Linear, SLERP, and Task Arithmetic. It is suitable for resource-constrained settings, functioning on both CPU and GPU with low VRAM requirements. Features include lazy tensor loading and layer-based model assembly. Compatible with models like Llama, Mistral, and GPT-NeoX, it also provides an intuitive GUI on Arcee's platform and supports sharing on the Hugging Face Hub. A versatile YAML configuration enables custom merge strategies.
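Two of the merge algorithms named above can be illustrated on flat weight vectors. The sketch below is not MergeKit's implementation. The function names and the treatment of each tensor as a plain list are assumptions made for the example.

```python
import math

def linear_merge(tensors, weights):
    """Weighted average of several weight vectors (weights are normalized)."""
    total = sum(weights)
    size = len(tensors[0])
    return [sum(w * t[i] for t, w in zip(tensors, weights)) / total
            for i in range(size)]

def slerp(a, b, t, eps=1e-8):
    """Spherical linear interpolation between weight vectors a and b."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # No direction to interpolate along: fall back to a straight line.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, cos_omega)))
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Nearly parallel vectors: the SLERP weights are ill-conditioned.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

print(linear_merge([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
print(slerp([1.0, 0.0], [0.0, 1.0], 0.5))
```

Unlike a linear average, SLERP follows the arc between the two vectors, so interpolating between unit-norm inputs keeps the result at unit norm.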
instruct-eval
InstructEval is a platform designed to evaluate instruction-tuned LLMs including Alpaca and Flan-T5, using benchmarks like MMLU and BBH. It supports many HuggingFace Transformer models, allows qualitative comparisons, and assesses generalization on tough tasks. With user-friendly scripts and detailed leaderboards, InstructEval shows model strengths. Additional datasets like Red-Eval and IMPACT enhance safety and writing assessments, providing researchers with in-depth performance insights.
floneum
Floneum facilitates local AI application development with Kalosm, a Rust interface for text, audio, and image model processing, offering quantization and acceleration. Floneum Editor allows intuitive design of AI workflows. Support includes models like Llama and Whisper, and tools for context extraction and web scraping. Engage with the community via Discord and GitHub.
datablations
Discover strategies for scaling language models in data-limited contexts. This repository includes experiments on data repetition and computational budgets, working with up to 900 billion tokens and models with 9 billion parameters. It offers a scaling law for computational efficiency, considering the decreasing utility of repeated tokens and excess parameters. Methods to address data limitations, such as code augmentation and filtering techniques including perplexity and deduplication, are explained. Access to over 400 training models and datasets is provided, supporting robust language model development in constrained environments.
bigscience
This workshop explores large language models based on the Megatron-GPT2 architecture through detailed training runs and experiments. It addresses model scaling, training dynamics, and instabilities, supported by extensive documentation and logs. By providing resources like code repositories and training scripts, the project fosters transparency and collaboration within the AI community and informs future work on language models.
DeepInception
Large language models, while successful, face risks from adversarial jailbreaks affecting their safety. DeepInception offers a novel, less resource-intensive method, inspired by the Milgram experiment, to bypass usage controls via personification and nested scenes. This approach highlights vulnerabilities in various LLMs, underlining the need for enhanced safety measures.
PanelGPT
Explore a new method for enhancing language model reasoning abilities through 'Panel Discussion' techniques. Taking inspiration from expert panels in conferences, this approach improves understanding and discourse, leading to better results in zero-shot prompting contexts. Evaluations on the GSM8K dataset underscore its effectiveness, establishing it as a superior method over strategies like Chain-of-Thought and Tree-of-Thought. The method's potential covers complex reasoning tasks, providing an efficient solution. Learn how integrating a collaborative discussion framework can enhance AI capabilities.
LLaMA-LoRA-Tuner
The tool facilitates LLaMA model evaluation and adjustment with low-rank adaptation (LoRA), featuring a 1-click setup on Google Colab for streamlined training, easy switching among primary base models like 'llama-7b-hf' and 'gpt4all-j', and compatibility with various dataset formats. Recent updates introduce a chat UI and demo mode for innovative model interaction, though the latest version lacks fine-tuning capability. It remains a valuable asset for researchers seeking a versatile and accessible model exploration tool.
genslm
GenSLMs utilizes large-scale language models to analyze SARS-CoV-2 evolution through sequence embeddings and synthetic sequence generation. Operating on supercomputers like Polaris and Perlmutter, it uses a hierarchical diffusion model for detailed genomic analysis, supporting efficient genome sequence modeling. The platform enhances research accuracy, serving as a robust tool for advancing virology studies.
SWE-bench
SWE-bench is a benchmark for testing language models' abilities to solve real-world GitHub software issues. It provides a containerized evaluation environment using Docker, ensuring repeatable assessments. Recent updates feature SWE-bench Verified, a collection of 500 engineer-confirmed solvable problems. Developed in collaboration with OpenAI, SWE-bench supports reproducible evaluations across different systems. Its resources are designed to help with model training, inference, and task creation, supporting NLP and machine learning applications in software engineering.
simple-evals
This repository provides a lightweight library for transparent evaluations of language models, emphasizing zero-shot and chain-of-thought methods. It includes benchmark results for models such as GPT-4, using tests like MMLU and HumanEval. The library favors simple, realistic instructions over complex prompting to better gauge real-world performance. While not actively maintained, it allows for updates such as bug fixes and new models. The setup supports OpenAI and Anthropic APIs for efficient, adaptable evaluations.
readme-ai
Boost development efficiency with an AI tool that automatically creates detailed README files, supporting numerous programming languages and customizable settings. Compatible with models like OpenAI and Google Gemini, it offers offline operation to meet various project documentation needs.
opencommit
OpenCommit streamlines version control by auto-generating meaningful commit messages using AI. It supports GPT-4, offers easy setup, GitMoji integration, and language customization, and is compatible with providers like OpenAI, Azure, and Ollama.
LLM-Kit
This open-source project provides a versatile WebUI toolkit designed to manage language model workflows effortlessly. Users can create custom models and applications without coding, in environments like Python and CUDA. The toolkit features robust modules, including APIs for prominent language models such as OpenAI and Baidu's Wenxin Yiyan. It supports functionalities including chat, image generation, dataset processing, and embedding models. Key features include role-play settings with memory and background libraries, and compatibility with large-scale models like ChatGLM and Phoenix-Chat. Operating under the AGPL-3.0 license, it encourages community involvement and shared development.
FastEdit
FastEdit provides a quick solution for injecting new information into large language models efficiently with just one command. It supports models such as GPT-J, LLaMA, and BLOOM, allowing for updated outputs. The tool requires Python 3.8+ and PyTorch 1.13.1+, leveraging Rank-One Model Editing for enhanced performance. Easy data preparation and installation enable effective model editing to maintain accuracy and relevance in multilingual contexts.
LLM-Zoo
This project details the release and characteristics of global large language models, providing a valuable resource for both open-source and closed-source LLMs developed after ChatGPT. It gathers essential data such as model sizes, supported languages, domains, and training datasets, alongside links to GitHub repositories, HuggingFace models, and academic publications. Regular updates keep users informed, with an invitation for contributions to enhance this dataset. Ideal for researchers and developers interested in the dynamics of natural language processing models.
GPT-Jailbreak
Access a repository with straightforward instructions for modifying language models such as GPT-3 and GPT-4. Discover how to personalize these models for enhanced AI functionality, without the need for installations, and contribute to improving their capabilities with community input.
LLM.swift
LLM.swift is a lightweight library offering interaction with large language models on macOS, iOS, watchOS, tvOS, and visionOS. It supports developers focusing on performance and ease of integration with Swift projects. With options for using HuggingFace models, it balances speed and stability. The library features customizable preprocess, postprocess, and update functions, providing precision and control for AI integration.
catalyst
Explore a fast and versatile C# Natural Language Processing library offering efficient non-destructive tokenization, flexible entity recognition, and reliable language detection. Catalyst facilitates FastText and StarSpace embeddings training, with readily available pre-trained models. Compatible with Windows, Linux, and macOS, it offers robust tools for semantic analysis. It suits projects requiring quick processing, aligning with .NET standard 2.0 for smooth pipeline integration.
helm
Stanford's CRFM-HELM project presents a framework for evaluating language models, including datasets such as NaturalQuestions and models like GPT-3. It expands evaluations beyond accuracy to metrics such as efficiency and bias, assesses robustness through perturbations, and offers access through a modular API and proxy server. The project also explores vision-language and text-to-image model evaluations with reliable findings. Comprehensive documentation supports effortless installation and use by researchers assessing language models.
codellama
Discover Code Llama, a family of language models based on Llama 2 and designed for coding, with features like code infilling and zero-shot learning. Models are available in Python-specialized and general variants, range from 7B to 34B parameters, and support inputs of up to 100K tokens. The release is suitable for both individuals and businesses, providing models for diverse use cases with essential safety measures. Access resources and starter code to explore these pretrained and fine-tuned models.
OLMo-Eval
OLMo-Eval, an evaluation framework for language models, leverages task sets for metric computation on NLP tasks. Built with ai2-tango and ai2-catwalk, it offers adaptable evaluation and integrates with Google Sheets for reporting. Deployment is simple through command line, supporting diverse models and datasets, facilitating ongoing development and analysis. Suited for benchmarking a variety of language models on standard tasks.
EasyContext
This project demonstrates how established methods can extend language models to handle contexts as long as 1 million tokens using efficient strategies such as sequence parallelism, DeepSpeed ZeRO-3 offload, and Flash Attention. It provides comprehensive training scripts, supports several parallelism approaches, and reports significant improvements in both perplexity and 'needle-in-a-haystack' evaluations for Llama2 models.