datasets

Open-Source Efficient Data Loading and Preprocessing Library for Machine Learning

Product Description

datasets is a community-driven, lightweight library for efficient data loading and preprocessing in machine learning applications. It provides one-line data loaders and preprocessing for formats such as CSV, JSON, and images. Features include smart caching, memory-mapping, and interoperability with NumPy, Pandas, PyTorch, and TensorFlow. It has built-in support for audio and image data, and streaming gives efficient access to large datasets. It suits researchers who need a fast, flexible tool that uses disk space efficiently.
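To make the idea of streaming with on-the-fly preprocessing concrete, here is a minimal sketch using only the Python standard library. It is not the library's API or implementation: the helper names stream_rows and preprocess, the column names, and the sample rows are all hypothetical and chosen for illustration. The point is only that records can be read and transformed one at a time, so memory use does not grow with file size.

```python
import csv
import os
import tempfile
from itertools import islice


def stream_rows(path):
    """Yield one CSV row at a time as a dict, without loading the whole file."""
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            yield row


def preprocess(row):
    """Hypothetical per-record step: normalise the text and cast the label to int."""
    return {"text": row["text"].strip().lower(), "label": int(row["label"])}


# Build a small throwaway CSV so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["text", "label"])
    writer.writerows([["  Hello ", "1"], ["WORLD", "0"], ["Again", "1"]])
    csv_path = tmp.name

# Rows are read and transformed lazily; islice stops after the first two,
# so the third row is never parsed.
rows = stream_rows(csv_path)
first_two = list(islice(map(preprocess, rows), 2))

rows.close()  # close the generator so the underlying file handle is released
os.remove(csv_path)

print(first_two)
```

A generator is used here because it keeps only the current record in memory. A real loader would typically add caching and format-specific decoding on top of this pattern.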
Project Details