GradCache
Gradient Cache is a technique for scaling contrastive learning batches beyond GPU/TPU memory limits, with implementations for both PyTorch and JAX. It makes dense passage retrieval training feasible on a single GPU and allows large-memory accelerators to be replaced with more cost-efficient, high-FLOP, low-memory hardware. The library supports mixed precision and distributed training, and offers functional and decorator-style tools for streamlined integration of the cache into existing training loops.
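The core idea can be illustrated with a minimal plain-PyTorch sketch (this is a conceptual example, not the library's exact API; names such as `encoder`, `contrastive_loss`, and `chunk_size` are illustrative assumptions): representations are first computed chunk by chunk without a graph, the loss is taken over the full batch to cache the per-representation gradients, and each chunk is then re-encoded with grad enabled so only one chunk's activations reside in memory at a time.

```python
import torch
import torch.nn.functional as F


def contrastive_loss(q_reps, p_reps, temperature=0.05):
    # In-batch negatives: the i-th query matches the i-th passage.
    scores = q_reps @ p_reps.t() / temperature
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)


def gradient_cache_step(encoder, queries, passages, optimizer, chunk_size=8):
    # Step 1: no-grad forward over chunks to collect full-batch representations.
    with torch.no_grad():
        q_reps = torch.cat([encoder(c) for c in queries.split(chunk_size)])
        p_reps = torch.cat([encoder(c) for c in passages.split(chunk_size)])

    # Step 2: compute the full-batch loss on the detached representations and
    # cache d(loss)/d(representation).
    q_reps, p_reps = q_reps.requires_grad_(), p_reps.requires_grad_()
    loss = contrastive_loss(q_reps, p_reps)
    loss.backward()
    q_grad_cache, p_grad_cache = q_reps.grad.clone(), p_reps.grad.clone()

    # Step 3: re-encode each chunk with grad enabled and backprop the cached
    # gradient, so only one chunk's activation graph lives in memory at a time.
    for data, cache in ((queries, q_grad_cache), (passages, p_grad_cache)):
        for chunk, grad_chunk in zip(data.split(chunk_size), cache.split(chunk_size)):
            reps = encoder(chunk)
            reps.backward(grad_chunk)

    optimizer.step()
    optimizer.zero_grad()
    return loss.detach()
```

Because the loss is computed on the concatenated representations, the gradients match those of a single large-batch update while peak memory is governed by `chunk_size` rather than the full batch size.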