llama_cpp-rs
llama_cpp-rs provides safe, high-level Rust bindings to the C++ library llama.cpp, letting you run GGUF-based large language models directly on the CPU with no machine-learning experience required. GPU backends such as CUDA, Vulkan, Metal, and hipBLAS can be enabled through Cargo feature flags, so the library adapts to a wide range of hardware. Loading a model and generating predictions takes only a few lines of code, and experimental features such as context memory-size prediction are also available. Contributions are welcome; the project emphasizes a clean user experience, and inference should be run with Cargo's `--release` profile, since unoptimized debug builds are substantially slower.
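
A minimal sketch of that workflow, based on the crate's documented examples (names and signatures such as `LlamaModel::load_from_file`, `create_session`, and `start_completing_with` may shift between versions, so treat this as an illustration rather than a pinned reference):

```rust
use std::io::{self, Write};

use llama_cpp::standard_sampler::StandardSampler;
use llama_cpp::{LlamaModel, LlamaParams, SessionParams};

fn main() {
    // Load a GGUF model from disk; `LlamaParams` controls options such as GPU offloading.
    let model = LlamaModel::load_from_file("model.gguf", LlamaParams::default())
        .expect("could not load model");

    // A session owns the inference context; the (large) weights stay shared in `model`.
    let mut ctx = model
        .create_session(SessionParams::default())
        .expect("failed to create session");

    // Feed the prompt into the context.
    ctx.advance_context("This is the story of a man named Stanley.")
        .expect("failed to advance context");

    // Generate up to 256 tokens, streaming each decoded piece to stdout.
    let completions = ctx
        .start_completing_with(StandardSampler::default(), 256)
        .expect("failed to start completion")
        .into_strings();

    for piece in completions {
        print!("{piece}");
        let _ = io::stdout().flush();
    }
    println!();
}
```

Because token generation is compute-bound, running this with `cargo run --release` rather than a debug build makes a dramatic difference in throughput.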