#HuggingFace

diffusers
Diffusers provides a range of pretrained models for creating images, audio, and 3D structures. The library includes user-friendly diffusion pipelines, adjustable schedulers, and modular components compatible with PyTorch and Flax. It ensures cross-platform support, even for Apple Silicon, offering resources for both new and experienced developers to start quickly, train models, and optimize performance.
infinity
Infinity is a REST API for high-throughput, low-latency text embedding services. It deploys models from HuggingFace using inference backends such as Torch, Optimum, and CTranslate2, and supports multi-modal orchestration so that diverse models can be deployed and managed together. Built with FastAPI, it follows OpenAI's API specifications, which keeps client integration straightforward. It offers dynamic batching and GPU acceleration on NVIDIA CUDA, AMD ROCm, and other hardware, and documents integration with serverless platforms for scaling.
docker-llama2-chat
Learn to efficiently deploy both official and Chinese LLaMA2 models with Docker for local use. This guide provides detailed instructions and scripts for setting up 7B and 13B models, suitable for GPU or CPU. Ideal for developers looking to test language models, it highlights the capabilities and advantages of using these models in different applications.
TabFormer
Discover hierarchical transformer modules for tabular data analysis, featuring a synthetic credit card transaction dataset and a modified Adaptive Softmax for data masking. Built on HuggingFace's transformers, the project models tabular time series with BERT and GPT-2 and is implemented in Python and PyTorch.
mPLUG-Owl
Follow the development of mPLUG-Owl, a family of multi-modal large language models that uses a modular architecture to strengthen multimodality. The latest release, mPLUG-Owl3, emphasizes long image-sequence comprehension, and mPLUG-Owl2 earned a CVPR 2024 accolade. The family also includes mPLUG-Owl2.1, a Chinese version with enriched features.
GoLLIE
Explore GoLLIE, a Large Language Model designed to excel in zero-shot information extraction through adherence to annotation guidelines. This model supports creating dynamic annotation schemas and goes beyond existing knowledge with detailed definitions. GoLLIE's improved performance is available to the public on the HuggingFace Hub, with comprehensive instructions for installation, usage, and dataset generation, aiding customization in information extraction tasks.
AutoAudit
AutoAudit, an open-source language model, enhances network security by offering tools for analyzing malicious code, detecting attacks, and predicting vulnerabilities. It supports security professionals with accurate, fast analysis and is integrated with ClamAV for seamless scanning operations. Future updates target improved reasoning, accuracy, and expanded tool integration.
llm.ts
llm.ts offers a unified interface to access over 30 language models through one API call, supporting environments such as Node, Deno, and browsers without any dependencies. This lightweight tool under 10kB allows easy API key management and simplifies executing multiple prompts. With support for models from OpenAI, Cohere, and HuggingFace, llm.ts provides a broad application range, making it suitable for developers looking for a simple, dependency-free way to integrate advanced language models.
dataspeech
Data-Speech provides efficient scripts for speech dataset annotation, emphasizing audio transformation and automated tagging essential for next-gen text-to-speech models. This toolset aids in crafting natural language labeled datasets, inspired by significant research on speech synthesis. It works seamlessly with renowned datasets like LibriTTS-R and MLS and integrates with the Parler-TTS library for both inference and training. Discover streamlined processes for annotating and generating detailed language prompts from speech data.
UniControl
Explore UniControl, a unified diffusion model enabling controllable visual generation for various tasks within one framework. It achieves pixel-level precision by combining visual conditions for structural guidance and language prompts for style. By leveraging pretrained text-to-image models and a task-specific HyperNet, UniControl efficiently handles diverse condition-to-image tasks. This framework outperforms single-task models of similar sizes, representing a significant advancement in visual generation. Access includes open-source code, model checkpoints, and datasets for further exploration.
ViTamin
ViTamin offers scalable vision models that excel in zero-shot ImageNet accuracy and open-vocabulary segmentation. It integrates with platforms like Hugging Face and timm, supporting applications like pre-training and detection. By using fewer parameters, ViTamin achieves high benchmark performances, contributing to advances in vision-language AI research.
direct-preference-optimization
This repository offers an implementation of Direct Preference Optimization, including the conservative DPO and IPO variants, for training language models from preference data. Compatible with HuggingFace models, it makes adding datasets easy and supports a range of GPU setups, covering both supervised fine-tuning and preference learning.
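As a rough illustration of the three objectives named above, the per-example losses can be sketched in plain Python. This is a minimal sketch of the published formulas, not the repository's code: the function name `preference_loss`, its parameter names, and the default `beta` are assumptions chosen for this example, and the inputs are summed log-probabilities of the chosen and rejected responses under the policy and the frozen reference model.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def preference_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,            # assumed default, for illustration only
    label_smoothing: float = 0.0,  # > 0 gives conservative DPO
    ipo: bool = False,             # True switches to the IPO objective
) -> float:
    # How much more the policy prefers "chosen" over "rejected"
    # than the reference model does.
    pi_logratio = policy_chosen_logp - policy_rejected_logp
    ref_logratio = ref_chosen_logp - ref_rejected_logp
    h = pi_logratio - ref_logratio

    if ipo:
        # IPO: squared distance of h from the target margin 1 / (2 * beta).
        return (h - 1.0 / (2.0 * beta)) ** 2

    # DPO (label_smoothing == 0) or conservative DPO (label_smoothing > 0),
    # where label_smoothing is the assumed probability that a label is flipped.
    return (
        -(1.0 - label_smoothing) * log_sigmoid(beta * h)
        - label_smoothing * log_sigmoid(-beta * h)
    )
```

With `label_smoothing=0.0` and `ipo=False` this reduces to the standard DPO loss; in practice the same expressions would be applied to batched tensors rather than Python floats.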
stable-diffusion-nvidia-docker
The project facilitates Stable Diffusion deployment using Docker, allowing GPU-based image generation without the need for coding skills. Features include a UI built with Gradio, support for the Stable Diffusion 2.0 model, and functionalities like img2img and image inpainting. Its Data Parallel approach enables multi-GPU support, optimizing inference speed for art and design tasks with straightforward installation for Ubuntu and Windows users.
LangChain-ChatGLM-Webui
The LangChain-ChatGLM-Webui project provides a WebUI utilizing LangChain and ChatGLM-6B series models for applications grounded in local knowledge. Supporting multiple text file formats like txt, docx, md, and pdf, it includes models such as ChatGLM-6B and Belle along with several embedding models. Designed for real-world AI model implementation, the project offers online accessibility through HuggingFace, ModelScope, and AIStudio. Compatible with Python 3.8.1+, it facilitates straightforward deployment. Continuous updates and community engagement keep the project moving, and developer participation is welcome.
opencompass
OpenCompass is a comprehensive platform for assessing large language models, featuring advanced algorithms and a user-friendly interface. It supports 20+ HuggingFace and API models, evaluating over 70 datasets with about 400,000 questions. The platform is proficient in distributed evaluations, providing billion-scale assessments within hours, and supports various paradigms including zero-shot and few-shot learning. OpenCompass is modular and easily extendable, accommodating new models and datasets. It also allows for API and accelerated evaluations with different backends, contributing to a fair, open, and reproducible benchmarking ecosystem with its tools like CompassKit, CompassHub, and CompassRank.
transformers
Access a wide range of pretrained transformer models suitable for various applications in text, vision, and audio, with easy integration using JAX, PyTorch, and TensorFlow. The Transformers library by Hugging Face offers tools for deploying and refining these models, promoting collaboration among developers and researchers. Benefit from reduced computational demands, flexible model configurations, and the ability to transition seamlessly across different frameworks. Applicable to tasks such as sentiment analysis, object detection, and speech recognition, these models support the development of contemporary AI solutions.
HuggingFaceModelDownloader
This tool facilitates the download of models and datasets from HuggingFace, supporting multithreaded downloads for large files with SHA256 verification. Key features include parallel connections, the ability to resume downloads, and storage flexibility. Compatible with Linux, Mac, and Windows WSL2, it integrates smoothly with projects in Go and Python. Users can fine-tune downloads using specific model filters and directory options while installation scripts simplify setup across various OS architectures.
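The SHA256 verification step mentioned above can be illustrated with a short Python sketch. It is not the tool's own code (the tool itself is a separate program); the function names `sha256_stream` and `verify` and the chunk size are assumptions made for this example. The idea is to hash the data in fixed-size chunks, so large model files never need to fit in memory, and then compare the result with the published digest.

```python
import hashlib
import io


def sha256_stream(stream, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a binary stream, read chunk by chunk."""
    digest = hashlib.sha256()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def verify(stream, expected_hex: str) -> bool:
    """Check a stream against an expected digest (case-insensitive)."""
    return sha256_stream(stream) == expected_hex.strip().lower()


# Usage with an in-memory stream; a downloaded file would be opened
# with open(path, "rb") instead.
ok = verify(
    io.BytesIO(b"hello"),
    "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824",
)
```

Reading in chunks keeps memory use flat regardless of file size, which matters for multi-gigabyte model weights.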
rulm
Discover developments in Russian language models through efficient implementations and detailed comparisons. Featuring the RuTurboAlpaca dataset, built with GPT-3.5-turbo, and the Saiga models, the project provides resources on HuggingFace and GitHub. It enables interaction with models from 7b to 70b and supports fine-tuning in Colab, with community activity around DataFest.
AMchat
AMchat is developed to address advanced mathematics challenges with a broad dataset comprising math problems and their solutions. Utilizing InternLM2-Math-7B as its foundation and fine-tuned through XTuner, the model exhibits strong capabilities in handling complex mathematical equations. Deployment is flexible, allowing usage via Docker, OpenXLab, or local installations. With the introduction of the Q8_0 quantized model version, it offers better performance. Various deployment options ensure wide accessibility, optimizing precision in mathematical problem-solving across different applications.
Awesome-LLM-Large-Language-Models-Notes
Explore a detailed compilation of large language models (LLMs), organized by year, size, and name. This resource covers foundational and recent models such as Transformer, GPT, BERT, GPT-4, and BLOOM, with links to research papers and implementations. An essential guide for NLP research and applications, complete with insightful articles and the significance of HuggingFace for model deployment.
mlx-llm
Explore real-time deployment of Large Language Models on Apple Silicon using MLX. Access a broad spectrum of models like LLaMA and Phi3, and leverage model quantization and embedding extraction for enhanced efficiency. Suitable for developers aiming to optimize LLMs on Apple devices or investigate fine-tuning with LoRA and RAG features.
speech-dataset-generator
The tool facilitates the creation of multilingual datasets for training text-to-speech and speech recognition models by transcribing and refining audio quality. It segments audio, identifies speaker gender, and utilizes pyannote embeddings for automatic speaker naming. Suitable for detecting multiple speakers, it enhances audio using deepfilternet, resembleai, or mayavoz. The tool supports input from local files, YouTube, LibriVox, and TED Talks, storing data efficiently in a Chroma database.
caduceus
Learn about bi-directional equivariant methods in DNA sequence modeling, aiding tasks such as genomic prediction and classification. The project uses pre-trained models from HuggingFace, supporting processes like pretraining and fine-tuning. It is a valuable resource for genomic researchers and bioinformaticians. Access detailed guides for model deployment and experiments with Python scripts in advanced computing setups.
evo
Evo is a biological model that supports genome and molecular sequence modeling, featuring StripedHyena architecture for efficient computation. With 7 billion parameters and training on the OpenGenome dataset, Evo provides deep sequence analysis. Model checkpoints such as 'evo-1-8k-base' and 'evo-1-131k-base' are available for molecular- and genome-scale tasks. Integration with HuggingFace and a web interface from Together AI make Evo accessible for genomics research.
Moore-AnimateAnyone
Explore a face reenactment method utilizing facial landmarks from driving videos for pose control while retaining image identity. Access inference codes and pretrained models for accurate face movements, with training scripts for custom model development. Test the demo at HuggingFace Spaces or apply the technology on Moore's AIGC platform for diverse uses. Engage with a community in advancing this evolving technology.
UltraFastBERT
UltraFastBERT targets exponentially faster language modeling by using fast feedforward networks (FFFs), which engage only a selected subset of neurons at inference time. The repository provides optimized CPU and CUDA implementations and step-by-step setup instructions, and the UltraFastBERT-1x11-long model integrates with HuggingFace transformers.
dolma
Dolma, from AI2, provides a 3 trillion token dataset for language model training, derived from diverse sources such as web content and academic materials. Available on HuggingFace, it includes a high-speed toolkit suitable for processing large datasets with parallel workflows, cross-platform portability, and efficient deduplication using Rust Bloom filters. Researchers can use built-in taggers and customize settings for AWS S3.
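To show why a Bloom filter suits deduplication at this scale, here is a minimal Python sketch of the general technique. It is not Dolma's Rust implementation: the class `BloomFilter`, the function `dedup`, and the default sizes are assumptions for illustration. A Bloom filter stores only a fixed bit array, so memory does not grow with the number of documents; the trade-off is a small false-positive rate, meaning a unique item can occasionally be dropped as a duplicate.

```python
import hashlib


class BloomFilter:
    """Fixed-size probabilistic set: no false negatives, rare false positives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions from one digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd stride
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item: str) -> bool:
        """Insert item; return True if it was (probably) already present."""
        seen = True
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                seen = False
                self.bits[byte] |= 1 << bit
        return seen


def dedup(items):
    """Keep the first occurrence of each item, in order."""
    bloom = BloomFilter()
    return [item for item in items if not bloom.add(item)]


unique = dedup(["a b c", "d e f", "a b c", "g h i", "d e f"])
```

A production setup would size the bit array from the expected item count and the acceptable false-positive rate, and would typically hash paragraphs or n-grams rather than whole strings.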
ml-engineering
Discover a comprehensive collection of methodologies, tools, and step-by-step guides designed for training large language and multi-modal models, aimed at engineers and ML operators. This resource includes practical scripts and commands and draws from the author's experience with models such as BLOOM-176B and IDEFICS-80B, as well as ongoing work with RAG models at Contextual.AI. It covers crucial aspects from hardware configurations to debugging and orchestration, supporting efficient development and inference of advanced machine learning models. Follow updates and enhancements through social media.
xgen
Salesforce AI Research introduces innovative models designed for long sequence modeling with support for input sequences up to 8K. The lineup includes the XGen-7B-4K-Base and XGen-7B-8K-Base models, along with a special instruction-finetuned variant for research applications. These models use the OpenAI Tiktoken package to ensure efficient auto-regressive sampling. Comprehensive citations and installation instructions are provided to assist researchers in utilizing these state-of-the-art models across various applications.
beto
BETO is a Spanish language BERT model trained on a vast Spanish corpus using the Whole Word Masking technique. Featuring architectures similar to BERT-Base, BETO offers both uncased and cased versions tailored for Spanish natural language processing tasks. It shows better performance than Multilingual BERT in various benchmarks, such as POS and NER-C, improving accuracy for Spanish language tasks. With a 31k BPE vocabulary, it ensures comprehensive coverage of linguistic structures. Available via the HuggingFace Transformers library, BETO supports a wide range of NLP applications in Spanish.
flan-alpaca
The flan-alpaca project offers improved problem-solving through the fine-tuning of Vicuna-13B on the Flan dataset, and also demonstrates FLAN-T5's capability in text-to-audio generation. This approach extends Stanford Alpaca's instruction tuning to various models, with all pre-trained models accessible via HuggingFace. It includes practical tools for interactive demos, benchmarking, data preprocessing, and training.
One-2-3-45
One-2-3-45 presents a novel approach in utilizing 2D diffusion models for 3D AI content generation with a forward-only paradigm that minimizes optimization time. It allows efficient creation of 3D models, as showcased by updates such as rendering scripts and APIs for inference. Accepted at NeurIPS 2023 and integrated with Hugging Face Spaces, the project offers revolutionary methods for 3D modeling with easy setup options. Interactive demos and model training using the Objaverse-LVIS dataset enhance user engagement.
PicoMLXServer
Pico MLX Server features an intuitive GUI for the MLX AI framework, streamlining AI model integration on macOS. The solution supports multiple server instances, real-time logging capabilities, and OpenAI protocol compatibility for AI chat clients. Users can download or compile the software directly from GitHub and utilize automated setup tools. Designed specifically for macOS 14.0 and above, Pico MLX Server facilitates downloading MLX models from HuggingFace and simplifies Python environment configuration. It lets users set up customized servers with ease and works alongside the Pico AI Assistant.
Lemur
The Lemur project offers an open source language model that combines natural language understanding with coding capabilities, providing a strong foundation for language agents. By harmonizing language and coding, the model performs well across benchmarks, allowing agents to execute tasks effectively. Explore models such as OpenLemur/lemur-70b-v1 and OpenLemur/lemur-70b-chat-v1 for advanced applications and regular updates. Review integration options and deployment strategies within diverse interactive environments.
notus
Discover a collection of models tailored for chat applications utilizing SFT, DPO, and RLHF techniques. Notus models emphasize a data-driven, human-focused methodology, demonstrated by performance metrics from MT-Bench, AlpacaEval, and Open LLM Leaderboard benchmarks. Named after the Greek god of the south wind, Notus acknowledges the open-source community's support. Learn how Notus 7B v1 outperforms earlier versions in recent assessments.
Segment-Any-Anomaly
Explore a new approach to zero-shot anomaly segmentation without additional training through hybrid prompt regularization combined with existing foundation models. Improve anomaly detection using models like Grounding DINO and Segment Anything. This repository features user-friendly demos available on Colab and HuggingFace, showcasing the efficacy of the SAA+ framework on datasets such as MVTec-AD, VisA, KSDD2, and MTD. SAA+ identifies anomalies with minimal setup, catering to computer vision researchers and developers. Discover recent advancements and the work that led to success at the VAND workshop.
CLAP
This open-source project uses contrastive learning to extract latent audio and text representations. Presented at IEEE ICASSP 2023, it works with large-scale datasets and diverse downstream tasks and integrates with Hugging Face Transformers. It is aimed at researchers in audio understanding and data enhancement, and pre-trained checkpoints are available.
chat_templates
This repository contains a variety of chat templates designed for instruction-tuned large language models (LLMs), supporting HuggingFace's Transformers library. It includes templates for the latest models like Meta's Llama-3.1 and Google's Gemma-2. These templates can be integrated into applications for enhanced interaction and response generation. Detailed examples and configurations make this resource useful for developers focusing on conversational AI. Contributions to add more templates are encouraged.
Chinese-Llama-2-7b
Chinese Llama 2 7B is an open-source, commercially usable Chinese version of the LLaMA2 model, featuring bilingual text-to-speech and text-to-vision datasets. Optimized for LLaMA-2-chat integration, it supports multimodal applications. Regular updates provide new model resources, quantized versions, GGML models, and deployment solutions like Docker and API. Resources are accessible on HuggingFace, Baidu, and Colab for both CPU and GPU users.
gpt-neo
This open-source project provides an advanced framework for developing large language models similar to GPT-3, utilizing model and data parallelism with the mesh-tensorflow library. It supports both TPU and GPU environments, featuring distinct capabilities such as local and linear attention, and Mixture of Experts, which set it apart in the AI landscape. Although active code development ceased in August 2021, the repository continues to be a valuable resource for enthusiasts and professionals interested in AI model training. The project's integration with HuggingFace Transformers allows for simplified model experimentation, catering to both beginner and advanced users. Additionally, the transition to a GPU-focused repository, GPT-NeoX, highlights its adaptability to the evolving hardware landscape, further driven by community contributions and open-source collaboration.
awesome-pretrained-chinese-nlp-models
This repository offers a curated selection of Chinese pretrained language models, including multimodal and large language models. It serves as a resource for NLP researchers and practitioners, providing a range of models from foundational to specialized dialogue and multimodal conversation models. Regular updates ensure access to the latest models. Key features include models such as BERT and GPT and various LLMs, covering general and domain-specific functionality, along with evaluation benchmarks, online model trials, and open datasets.
openlm
OpenLM allows integration with language models from various providers like HuggingFace and Cohere, using parameters compatible with OpenAI's Completion API. It supports multiple prompt completion in a single request with minimal setup. The installation process is simple via pip, and multiple examples showcase its capabilities, including API key setups and custom model additions. Future updates will support more standardized endpoints. The project operates under an MIT License, welcoming contributions.
transfer-learning-conv-ai
This project provides a well-structured codebase enabling the training of conversational agents via transfer learning from OpenAI's GPT and GPT-2 models. It replicates HuggingFace's successful outcomes from the NeurIPS 2018 ConvAI2 competition, simplifying over 3,000 lines of competition code into a concise 250-line script, optimized for distributed and FP16 training. The model can be trained on cloud instances within an hour, with a pre-trained version readily available for immediate deployment. The project includes setup instructions, Docker support, and detailed guidance for training, interaction, and evaluation, thus offering a comprehensive solution for creating cutting-edge conversational AI.
foldingdiff
The project details a diffusion model to create novel protein backbones using PyTorch and PyTorch Lightning, accessible via HuggingFace and SuperBio for browser-based interaction. Comprehensive scripts support data handling, model training, and structure sampling. Researchers can train personalized models on the CATH dataset and explore pre-trained models for structural assessment. The model also supports designability evaluation and amino acid sequence generation for high prediction accuracy.
GPT2-Chinese
The GPT2-Chinese project provides a comprehensive toolkit for training Chinese language models using GPT2 technology. It includes support for BERT tokenizer and BPE models, enabling the generation of varied textual content such as poems and novels. The repository offers diverse pre-trained models, from ancient Chinese to lyrical styles, ideal for NLP practitioners. This resource supports large training corpora and encourages community collaboration through discussions and model contributions, aiding developers in advancing their NLP expertise in a practical and informative manner.
Platypus
This project delivers advanced solutions to enhance transformer architectures like LLaMA and LLaMA-2 using LoRA and PEFT. It focuses on efficiency and affordability, allowing users to access fine-tuned models on HuggingFace with seamless integration. Recent advancements include improved data processing and scripts for easy model setup and tuning. Discover various data refinement techniques to ensure model training accuracy and uniqueness, with detailed CLI guidelines for local deployment.
PuLID
Explore sophisticated approaches to ID customization utilizing the contrastive alignment methodologies of PuLID. Investigate efficient GPU-compatible solutions backed by diverse demos and resources. Keep abreast of the latest PuLID models and enhancements like PuLID-FLUX, designed for optimal local and online execution. Enjoy straightforward integration aided by easy installation and detailed guides. Utilize ongoing updates and support for consistent performance advancements. PuLID serves as a significant resource for researchers and developers focused on improving AI-based image generation.
Multimodal-Toolkit
The toolkit incorporates HuggingFace transformers to integrate multimodal data, enhancing tasks like classification and regression with a combination of categorical, numerical, and text features. It supports transformers such as BERT and RoBERTa and offers detailed datasets and examples for configuration. Suitable for AI applications requiring comprehensive data input, it provides flexible and robust methodologies for predictive modeling. Refer to the blog post and Colab notebook for detailed use cases.
llama3-chinese
This project trains on high-quality Chinese and English multi-turn SFT data using the DoRA and LoRA+ methods to improve natural language processing capabilities. It provides downloadable models and instructions for merging LoRA models, and supports deployment and inference. The focus is on flexible usage and robust performance in Chinese NLP, making it a useful resource for research applications.
ChatGLM3
ChatGLM3-6B, developed by Zhipu AI and Tsinghua University's KEG Lab, is an advanced open-source dialogue model with strong foundational capabilities and extensive function support. Featuring innovative prompt formats and superior long text understanding, it leads among models with less than 10B parameters. It is open for both research and commercial use under open-source guidelines, offering enhanced features like system prompts and advanced function calls.