# Pre-trained Models

## wav2letter
wav2letter has been consolidated into Flashlight ASR, and this repository now hosts its pre-consolidation resources. It provides detailed recipes for reproducing significant research models, including ConvNet-based and sequence-to-sequence architectures, along with data preparation tools; recipes are verified for reproducibility against Flashlight 0.3.2. The MIT-licensed project covers both supervised and semi-supervised speech recognition and maintains an active community.
## Graphormer
Graphormer is a deep learning package for graph representation learning that supports molecular science research in areas such as drug and material discovery. It ships pre-trained models for several datasets and is compatible with frameworks such as PyG and DGL. Available through Azure Quantum Elements, Graphormer has also proven effective in competitions such as the Open Catalyst Challenge. Detailed documentation and resources help researchers and developers apply its full capabilities to scientific problems.
## CFLD
This project presents Coarse-to-Fine Latent Diffusion (CFLD), a pose-guided person image synthesis method showcased at CVPR 2024. It reports improved results at multiple resolutions on benchmarks such as DeepFashion. Code, pre-trained models, generated images, and a customizable Jupyter notebook are provided so researchers can reproduce the results and experiment with the synthesis pipeline.
## Awesome-Chinese-LLM
Explore a curated collection of more than 100 open-source Chinese large language models, applications, datasets, and tutorials. The project covers notable models such as ChatGLM, LLaMA-based variants, and Qwen, offering resources for a range of technical requirements. It serves as a collaborative hub for sharing open-source models and applications, spanning the field of Chinese language models from development to deployment and learning materials.
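Many of the listed models can be loaded through Hugging Face Transformers. A minimal sketch, assuming the model id `Qwen/Qwen1.5-7B-Chat` (swap in any model from the list that fits your hardware):

```python
# Minimal sketch: loading an open-source Chinese LLM with Hugging Face
# Transformers. The model id below is an assumption; substitute any model
# from the list that fits your hardware.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B-Chat"  # assumed model id, replace as needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Build a chat prompt and generate a short reply.
messages = [{"role": "user", "content": "请用一句话介绍大语言模型。"}]  # "Describe LLMs in one sentence."
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```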
## AliceMind
Discover Alibaba MinD Lab's comprehensive suite of advanced pre-trained models and techniques, including the multimodal large language model mPLUG-Owl2. Explore resources spanning vision-language understanding and cross-lingual tasks, with releases such as mPLUG-DocOwl and Youku-mPLUG designed for high-performance AI applications.
## spark-nlp
Utilize an efficient NLP library that delivers scalable annotation in 200+ languages for tasks such as tokenization and language translation. It integrates state-of-the-art transformers like BERT and GPT-2 and runs on Python, R, and JVM (Scala/Java) platforms. The library also supports importing models from frameworks including TensorFlow and ONNX, easing deployment in distributed machine learning systems.
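A minimal sketch of calling the library from Python, assuming the published English pipeline name `explain_document_ml` (substitute any pretrained pipeline, or assemble your own annotators):

```python
# Minimal Spark NLP sketch: start a Spark session and run a pretrained
# pipeline. The pipeline name "explain_document_ml" is an assumption based on
# commonly published English pipelines; replace it with the one you need.
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

spark = sparknlp.start()  # local Spark session with the Spark NLP jars attached

pipeline = PretrainedPipeline("explain_document_ml", lang="en")
result = pipeline.annotate("Spark NLP annotates text at scale across 200+ languages.")

print(result.keys())     # annotation types produced by this pipeline
print(result["token"])   # the tokenized input
```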
## HorNet
HorNet introduces Recursive Gated Convolution (gnConv) to model high-order spatial interactions in vision backbones. The models deliver leading results on ImageNet-1K and ImageNet-22K and support tasks such as image and 3D object classification. The PyTorch-based implementation provides detailed setup and training instructions, making it straightforward to integrate and scale across machine learning projects.
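A minimal PyTorch inference sketch, assuming a checkout of the HorNet repository and a constructor such as `hornet_tiny_7x7` in its model definitions (the import path, constructor name, and checkpoint layout are assumptions):

```python
# Minimal inference sketch for a HorNet backbone. The import path, constructor
# name, and checkpoint key are assumptions about the repository layout; adjust
# them to match the actual model definitions and released weights.
import torch
from hornet import hornet_tiny_7x7  # assumed module/constructor from the repo

model = hornet_tiny_7x7()
state = torch.load("hornet_tiny_7x7.pth", map_location="cpu")
model.load_state_dict(state.get("model", state))  # weights may be nested under "model"
model.eval()

# ImageNet-style input: one 3x224x224 image.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)
print(logits.shape)  # expected: torch.Size([1, 1000]) for ImageNet-1K
```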
## nlp_chinese_corpus
This project offers a wide array of Chinese-language corpora to support advances in natural language processing. It includes structured Wikipedia articles, varied news reports, and community question-and-answer datasets, addressing the limited availability of large-scale Chinese text datasets to researchers and developers. The project focuses on assembling extensive, high-quality text collections that improve pre-trained language models and support NLP tasks such as word-vector training and question answering. Recent updates have added community Q&A and translation datasets, further enriching the resources available for building advanced Chinese NLP models.
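Most of the corpora are distributed as JSON-lines text files. A minimal sketch for iterating over one of them, assuming a news-style file name and `title`/`content` fields (adjust both to the dataset you download):

```python
# Minimal sketch: stream a JSON-lines corpus file from this project.
# The file name and the "title"/"content" field names are assumptions based on
# the news-style corpora; check the dataset's documentation for exact fields.
import json

def iter_documents(path):
    """Yield (title, content) pairs from a JSON-lines corpus file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            yield record.get("title", ""), record.get("content", "")

# Example: print the first document's title and content length.
for title, content in iter_documents("news2016zh_train.json"):
    print(title, len(content))
    break
```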
## Awesome-Remote-Sensing-Foundation-Models
This repository delivers a comprehensive set of resources, including papers, datasets, benchmarks, code, and pre-trained weights, dedicated to Remote Sensing Foundation Models (RSFMs). It systematically categorizes models into types such as vision, vision-language, and generative, covering developments like PANGAEA, TEOChat, and SAR-JEPA. The curated structure makes it easy to navigate model types and associated projects, and the collection is kept up to date with research published at venues such as ICCV and NeurIPS. It serves professionals seeking a deeper understanding of RSFMs, with emphases on geographical knowledge, self-supervised learning, and multimodal fusion.