Introduction to HierarchicalKV
About HierarchicalKV
HierarchicalKV is a part of the NVIDIA Merlin framework, designed to meet the demanding requirements of recommendation systems (RecSys). It provides hierarchical key-value storage that uses both the high-bandwidth memory (HBM) on GPUs and host memory. It can also be used as a generic key-value storage library.
Benefits
In the field of machine learning, particularly when developing large recommender systems, engineers face several challenges:
- HBM on a single GPU is too small to hold large recommendation models that scale to terabytes.
- Communication performance is increasingly hard to improve as CPU clusters grow.
- Controlling consumption of limited HBM with custom strategies is complex.
- Most generic key-value libraries make poor use of HBM and host memory.
HierarchicalKV addresses these challenges by providing:
- Capability to train large RecSys models using both HBM and host memory simultaneously.
- Enhanced performance by bypassing CPUs and reducing communication workloads.
- Memory management strategies based on Least Recently Used (LRU) or other custom approaches.
- Operation at a high load factor, close to 1.0.
Key Concepts
HierarchicalKV introduces several innovative ideas to enhance the efficiency and flexibility of key-value storage:
- Local ordering of buckets
- Separate storage for keys and values
- Storage of all keys in HBM
- Built-in and customizable eviction strategies
Together, these ideas make NVIDIA GPUs better suited to training large models for search, recommendation, and advertising, and they ease common difficulties in building, evaluating, and maintaining sophisticated recommendation systems.
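To make the idea of separate key and value storage more concrete, here is a minimal host-side C++ sketch. It is a toy model under stated assumptions, not HierarchicalKV's implementation: `ToyTable`, its members, and the modulo "hash" are hypothetical, and plain `std::vector`s stand in for HBM and host memory. The point it illustrates is that a lookup scans only the compact key array and reads the value array only on a hit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy host-side model (hypothetical names; not the HierarchicalKV API).
struct ToyTable {
  std::size_t num_buckets;
  std::size_t bucket_size;
  std::size_t dim;
  std::vector<std::uint64_t> keys;  // one slot per key; stands in for keys kept in HBM
  std::vector<std::uint8_t> used;   // 1 if the slot currently holds a key
  std::vector<float> values;        // dim floats per slot, stored apart from the keys

  ToyTable(std::size_t nb, std::size_t bs, std::size_t d)
      : num_buckets(nb), bucket_size(bs), dim(d),
        keys(nb * bs, 0), used(nb * bs, 0), values(nb * bs * d, 0.0f) {
    assert(nb > 0 && bs > 0);
  }

  // Toy hash: a real table would mix the key bits before reducing them.
  std::size_t first_slot(std::uint64_t key) const {
    return static_cast<std::size_t>(key % num_buckets) * bucket_size;
  }

  // Stores `dim` floats for `key`. Returns false if the key's bucket is full.
  bool insert(std::uint64_t key, const float* v) {
    const std::size_t base = first_slot(key);
    const std::size_t none = base + bucket_size;  // sentinel: no slot chosen yet
    std::size_t target = none;
    for (std::size_t s = base; s < base + bucket_size; ++s) {
      if (used[s] && keys[s] == key) { target = s; break; }  // overwrite existing key
      if (!used[s] && target == none) target = s;            // remember first free slot
    }
    if (target == none) return false;
    keys[target] = key;
    used[target] = 1;
    for (std::size_t i = 0; i < dim; ++i) values[target * dim + i] = v[i];
    return true;
  }

  // Scans only the key array; the value array is read only on a hit.
  const float* find(std::uint64_t key) const {
    const std::size_t base = first_slot(key);
    for (std::size_t s = base; s < base + bucket_size; ++s) {
      if (used[s] && keys[s] == key) return &values[s * dim];
    }
    return nullptr;
  }
};
```

Because keys and values live in different arrays, the two could in principle be placed in different memory tiers without changing the lookup logic, which is the property the list above relies on.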
API Overview
HierarchicalKV includes several structures and classes. The most important are:
- HashTable: Handles storage and lookup of key-value pairs.
- EvictStrategy: Manages intelligent data eviction techniques to optimize storage.
- HashTableOptions: Offers configuration settings for the hash table.
For full details, including other functionality and features, see the API documentation.
Eviction Strategies
In HierarchicalKV, each key has a "score" that indicates its importance. Keys are evicted only when there is no free space left for an insertion, and the keys with the lowest scores are evicted first. Each strategy defines how the score is set:
- Lru: The score is taken from the device clock, so the least recently used keys are evicted first.
- Lfu: The score is a frequency count, incremented by an amount the caller passes as a parameter.
- EpochLru and EpochLfu: Combine a global epoch with the clock (EpochLru) or the frequency (EpochLfu).
- Customized: The score is supplied entirely by the caller.
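The score-based eviction described above can be sketched on the host in a few lines of C++. This is a simplified model, not the library's code: `ToyBucket` and `make_epoch_score` are hypothetical names, the bucket always replaces its lowest-score entry when full (the real library may treat some cases differently), and the epoch layout (epoch in the high 32 bits, clock or frequency in the low 32 bits) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of one fixed-size bucket (hypothetical; not the HierarchicalKV API).
struct ToyBucket {
  std::size_t capacity;
  std::vector<std::uint64_t> keys;
  std::vector<std::uint64_t> scores;  // scores[i] belongs to keys[i]

  explicit ToyBucket(std::size_t cap) : capacity(cap) { assert(cap > 0); }

  // Inserts `key` or refreshes its score. When the bucket is full, the entry
  // with the lowest score is replaced; in that case the function returns true
  // and writes the evicted key to `evicted` (if non-null).
  bool upsert(std::uint64_t key, std::uint64_t score, std::uint64_t* evicted) {
    for (std::size_t i = 0; i < keys.size(); ++i) {
      if (keys[i] == key) { scores[i] = score; return false; }
    }
    if (keys.size() < capacity) {
      keys.push_back(key);
      scores.push_back(score);
      return false;
    }
    std::size_t victim = 0;
    for (std::size_t i = 1; i < keys.size(); ++i) {
      if (scores[i] < scores[victim]) victim = i;
    }
    if (evicted != nullptr) *evicted = keys[victim];
    keys[victim] = key;
    scores[victim] = score;
    return true;
  }

  bool contains(std::uint64_t key) const {
    for (std::size_t i = 0; i < keys.size(); ++i) {
      if (keys[i] == key) return true;
    }
    return false;
  }
};

// Assumed layout for an epoch-style score: the epoch occupies the high 32 bits
// and a clock- or frequency-derived value occupies the low 32 bits.
inline std::uint64_t make_epoch_score(std::uint32_t epoch, std::uint32_t low) {
  return (static_cast<std::uint64_t>(epoch) << 32) | low;
}
```

With this layout, any score from a newer epoch compares greater than every score from an older one, so entries from older epochs are evicted first regardless of their clock or frequency part.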
Configuration Settings
HierarchicalKV offers various configuration options:
- Initial and maximum capacities (for example init_capacity and max_capacity): Control the storage limits.
- Memory utilization (for example max_hbm_for_vectors): Specifies how much HBM and host memory is used for key-value pairs.
- Dimensionality of value vectors (dim) and bucket size: Tailor the table to specific needs.
Unless a specific use case requires a change, keep the default values, which are chosen for good performance.
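When choosing capacities and memory limits, it can help to work out how many bytes the value vectors need. The sketch below does that arithmetic with the numbers used in this document's usage example (16 Mi keys, 16 floats per value, a 16 GB HBM limit). The helper names are hypothetical and not part of HierarchicalKV. Treating "GB" as 2^30 bytes and placing values beyond the HBM limit in host memory are both assumptions.

```cpp
#include <cstdint>

// Hypothetical sizing helpers; none of these are HierarchicalKV functions.
constexpr std::uint64_t gib(std::uint64_t n) { return n << 30; }

// Bytes needed to hold every value vector when the table is at full capacity.
constexpr std::uint64_t value_bytes(std::uint64_t max_capacity,
                                    std::uint64_t dim,
                                    std::uint64_t elem_size) {
  return max_capacity * dim * elem_size;
}

// Portion of the values that fits within the HBM budget.
constexpr std::uint64_t hbm_bytes(std::uint64_t total, std::uint64_t hbm_budget) {
  return total < hbm_budget ? total : hbm_budget;
}

// Remainder, assumed here to be placed in host memory.
constexpr std::uint64_t host_bytes(std::uint64_t total, std::uint64_t hbm_budget) {
  return total - hbm_bytes(total, hbm_budget);
}

// 16 Mi keys * 16 floats * 4 bytes = 1 GiB of values.
static_assert(value_bytes(16ULL * 1024 * 1024, 16, 4) == gib(1),
              "values need exactly 1 GiB");
// A 16 GiB HBM budget holds all of it, so nothing is left for host memory.
static_assert(host_bytes(gib(1), gib(16)) == 0,
              "no values are placed in host memory");
```

Under the same assumptions, a table 64 times larger would need 64 GiB for values, of which 48 GiB would not fit in a 16 GiB HBM budget.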
Example Usage
Below is a simplified example demonstrating how to set up and use HierarchicalKV for custom machine learning tasks:
#include <cstdint>
#include <memory>

#include "merlin_hashtable.cuh"

using TableOptions = nv::merlin::HashTableOptions;
using EvictStrategy = nv::merlin::EvictStrategy;

int main(int argc, char *argv[])
{
  using K = uint64_t;  // key type
  using V = float;     // element type of each value vector
  using S = uint64_t;  // score type

  // Define table with LRU eviction strategy.
  using HKVTable = nv::merlin::HashTable<K, V, S, EvictStrategy::kLru>;
  std::unique_ptr<HKVTable> table = std::make_unique<HKVTable>();

  // Configure options.
  TableOptions options;
  options.init_capacity = 16 * 1024 * 1024;
  options.max_capacity = options.init_capacity;
  options.dim = 16;
  options.max_hbm_for_vectors = nv::merlin::GB(16);

  // Initialize table resources.
  table->init(options);

  // Use table for various operations.

  return 0;
}
Usage Restrictions
Certain restrictions apply when using HierarchicalKV:
- The key_type should be either int64_t or uint64_t.
- The score_type must be uint64_t.
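If you wrap the table in your own templates, one way to surface these restrictions early is a compile-time check. The trait below is a hypothetical helper written for illustration, not something HierarchicalKV provides. It simply encodes the two rules listed above.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical compile-time check mirroring the restrictions listed above.
template <typename K, typename S>
struct MeetsTypeRestrictions {
  static constexpr bool value =
      (std::is_same<K, std::int64_t>::value ||
       std::is_same<K, std::uint64_t>::value) &&
      std::is_same<S, std::uint64_t>::value;
};

// The key and score types used in the usage example pass the check.
static_assert(MeetsTypeRestrictions<std::uint64_t, std::uint64_t>::value,
              "key_type must be int64_t or uint64_t; score_type must be uint64_t");
```

A failing combination, such as a 32-bit key type, then produces a readable error at compile time instead of a problem deeper inside template code.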
Building the Project
HierarchicalKV primarily functions as a header-only library but does provide binaries for benchmarking and testing. To build it, you'll need a compatible environment that supports recent versions of CUDA and GCC, along with Bazel or CMake for building.
Support and Contributions
HierarchicalKV is maintained by the NVIDIA Merlin Team and welcomes contributions from the public. For support and more information on contributing, users can refer to the issues page.
Conclusion
HierarchicalKV showcases the strength of NVIDIA's approach to large-scale data handling, particularly in dynamic fields like recommendation systems. By addressing memory constraints and performance hurdles, it facilitates the development and scaling of complex machine learning models efficiently.