#dataset
seq2seq-couplet
This project uses a seq2seq model built on TensorFlow to generate Chinese couplets. It offers a demo and requires Python 3.6 and a dataset. The model is trained with 'couplet.py', and metrics such as loss and BLEU score can be tracked in TensorBoard. Training can be resumed from an earlier session. The trained model can also be deployed as a web service via 'server.py' or Docker. An example couplet pairs '天朗气清风和畅' with '云蒸霞蔚日光辉'. Suitable for those interested in NLP and language generation.
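As a rough illustration of the BLEU metric mentioned above, the following is a minimal sketch of a character-level, sentence-level BLEU score. It is not the project's own implementation: the function names, the choice of `max_n=2`, and the lack of smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sentence_bleu(reference, candidate, max_n=2):
    """Simplified BLEU for one reference string, scored per character.

    Hypothetical helper for illustration: no smoothing, so any n-gram
    order with zero matches makes the whole score 0.0.
    """
    ref = list(reference)
    cand = list(candidate)
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        total = sum(cand_counts.values())
        if total == 0:
            return 0.0
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


if __name__ == "__main__":
    print(sentence_bleu("云蒸霞蔚日光辉", "云蒸霞蔚日光辉"))
```

Scoring per character rather than per word avoids the need for a Chinese word segmenter, which fits short fixed-length lines such as couplets.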
WanJuan1.0
Intern · WanJuan 1.0 provides a comprehensive, open-source multimodal corpus including text, image-text, and video datasets with a total volume exceeding 2TB. Created by Shanghai AI Lab with rigorous data refinement, the dataset aims for high quality, easy integration, and alignment with Chinese values. It covers domains such as science and law, which helps AI models improve logical reasoning and generalization. The dataset supports training Multimodal Large Language Models for tasks such as semantic interpretation and visual analysis.
Anti-UAV
This project presents a solution for detecting and tracking Unmanned Aerial Vehicles (UAVs), implemented in both the PyTorch and Jittor frameworks. It addresses the growing need for reliable UAV monitoring as UAV applications expand. The project offers a dataset of high-quality video sequences in RGB and Thermal Infrared (IR) formats. Newly released Jittor versions improve compatibility with domestic hardware, targeting needs in security and defense. Licensed under MIT, the Anti-UAV project provides detailed evaluation metrics and comprehensive training resources.
fashion-mnist
Fashion-MNIST provides a modern alternative to the traditional MNIST dataset, featuring 28x28 grayscale images of Zalando's clothing articles across 10 categories. This dataset includes 60,000 training and 10,000 testing samples, making it suitable for rigorous testing of machine learning models. Unlike MNIST, Fashion-MNIST is designed to challenge algorithms with more complex image classification tasks. Researchers can easily integrate this dataset using popular libraries like TensorFlow and PyTorch, supporting the creation of more robust machine learning solutions.
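The sketch below shows one way to read the 28x28 images and their labels without a deep learning library. It assumes the files use the same IDX binary layout as the original MNIST (big-endian header, then raw unsigned bytes) and have already been decompressed into memory. The function names are hypothetical, and the label order is taken from the dataset's commonly published class list.

```python
import struct

# Class names for label ids 0-9 (assumed standard Fashion-MNIST order).
LABEL_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def parse_idx_images(data):
    """Decode an IDX3 image buffer into a list of images (rows of ints)."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("not an IDX3 image file (magic=%d)" % magic)
    size = rows * cols
    pixels = data[16:16 + count * size]
    if len(pixels) != count * size:
        raise ValueError("image data is truncated")
    images = []
    for i in range(count):
        start = i * size
        # Slice one image, then split it into rows of pixel values 0-255.
        image = [
            list(pixels[start + r * cols:start + (r + 1) * cols])
            for r in range(rows)
        ]
        images.append(image)
    return images


def parse_idx_labels(data):
    """Decode an IDX1 label buffer into a list of integer class ids."""
    magic, count = struct.unpack(">II", data[:8])
    if magic != 2049:
        raise ValueError("not an IDX1 label file (magic=%d)" % magic)
    labels = list(data[8:8 + count])
    if len(labels) != count:
        raise ValueError("label data is truncated")
    return labels


if __name__ == "__main__":
    # Synthetic buffers stand in for real files in this sketch.
    fake_images = struct.pack(">IIII", 2051, 2, 28, 28) + bytes(2 * 28 * 28)
    fake_labels = struct.pack(">II", 2049, 2) + bytes([0, 9])
    images = parse_idx_images(fake_images)
    labels = parse_idx_labels(fake_labels)
    print(len(images), [LABEL_NAMES[i] for i in labels])
```

With real data, the buffers would come from reading the decompressed files, for example via `gzip.open(path, "rb").read()` if the downloads are gzip-compressed.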
toon3d
Toon3D reconstructs 3D scenes from 2D cartoon images despite their geometric inconsistencies. Its image processing pipeline produces visually coherent structures, with applications in animation and virtual reality. The workflow covers environment setup, dataset management, and depth-based data processing.
ltu
The LTU and LTU-AS models bridge audio and language processing, achieving state-of-the-art results on both closed-ended and open-ended audio question tasks. The repository provides PyTorch implementations, pretrained checkpoints, and datasets for audio and speech AI research. Interactive demos are available on HuggingFace. Inference can be run through an API or a local setup.
Exclusively-Dark-Image-Dataset
The ExDark dataset includes 7,363 images captured in various low-light conditions, providing a valuable resource for research in object detection and image enhancement. It features 12 annotated object classes similar to PASCAL VOC, essential for low-light image analysis. The open-source project, backed by CVIU publications, offers code for image enhancement and is governed by a BSD-3 license.
carla_garage
This CARLA-based research project examines hidden biases in end-to-end autonomous driving models. The repository provides efficient, configurable code, extensive documentation, and pre-trained models as a foundation for autonomous driving research. Key features include dataset generation, model evaluation, and training methods designed for parallel processing. It is aimed at developers working on complex autonomous driving benchmarks.
lvis-api
The LVIS API provides access to a large-scale dataset containing over 2 million instance segmentation masks across more than 1200 object categories. It offers tools for annotation manipulation, visualization, and result evaluation in support of Large Vocabulary Instance Segmentation research. The publicly available v1.0 release is used in the LVIS Challenge at ECCV 2020. The API can be installed in a virtual environment and works alongside the COCO toolkit.
MNBVC
MNBVC is an expansive and continuously growing corpus of over 38,344 GB of Chinese texts spanning both mainstream and niche cultures. Aiming for 40 TB, it covers various formats such as news, literature, and online discussions. The corpus employs data cleaning tools to improve usability and serves as a valuable resource for AI and NLP research. The project invites community involvement to assist in expanding and processing the data. Access the repository to download and utilize the structured data for diverse linguistic and cultural studies.
kitti360LabelTool
The annotation tool, employing Python and JavaScript technologies, is designed for efficiently annotating the KITTI-360 dataset, which is pivotal for urban scene understanding. It includes features like demo data integration, user-task management, and XML result parsing. The tool's structured setup and compatibility ensure ease of use, making it a valuable resource for researchers. Offered under the MIT license, it is accessible and adaptable for varied research needs.
diffusiondb
DiffusionDB is a dataset of 14 million images generated with Stable Diffusion from real user prompts. It suits research on prompt interactions, deepfake detection, and AI tool development, and offers subsets for different storage needs. Images and metadata can be accessed online through several loading methods.