ppl.llm.kernel.cuda

Introduction to `ppl.llm.kernel.cuda`

Overview

ppl.llm.kernel.cuda is a component of the PPL.LLM system, an ecosystem for running large language models on NVIDIA GPUs through CUDA. It is aimed at users who want GPU acceleration, particularly on NVIDIA's Ampere and Hopper architectures.

Newcomers to this project should first read the system's general overview, which provides broader context for its applications and uses.

Purpose

ppl.llm.kernel.cuda is the foundational library of primitive CUDA kernels for the ppl.nn.llm project. Running these kernels on the GPU provides the efficient computation that deep learning workloads, with their substantial processing demands, require.

Prerequisites

To integrate and use ppl.llm.kernel.cuda, several system requirements need to be met:

The operating system must be Linux, running on an x86_64 or arm64 CPU.
GCC version 9.4.0 or higher is required.
CMake version 3.18 or higher is needed for building the project.
Git version 2.7.0 or higher is necessary for version control and cloning repositories.
The CUDA Toolkit, version 11.4 or higher, is essential for enabling CUDA functionalities, with version 11.6 being recommended for best performance.
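
The version requirements above can be checked before building. The following is a minimal sketch and not part of the project: the helper names `version_ge` and `check_tool` are hypothetical, and it assumes a `sort` that supports `-V` (as in GNU coreutils) and that `nvcc` is on the `PATH` when the CUDA Toolkit is installed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 >= version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: check_tool NAME MINIMUM VERSION_COMMAND...
# Runs the version command, takes the first dotted number it prints,
# and reports whether it meets the minimum.
check_tool() {
    name=$1; min=$2; shift 2
    found=$("$@" 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1) || found=""
    if [ -z "$found" ]; then
        echo "$name: not found (need >= $min)"
    elif version_ge "$found" "$min"; then
        echo "$name: $found (ok, need >= $min)"
    else
        echo "$name: $found (too old, need >= $min)"
    fi
}

# Minimum versions taken from the prerequisites list above.
check_tool gcc   9.4.0 gcc -dumpfullversion
check_tool cmake 3.18  cmake --version
check_tool git   2.7.0 git --version
check_tool nvcc  11.4  nvcc --version
```

The script only reports what it finds and never aborts, so it can be run on a machine where some of the tools are still missing.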

Getting Started

To start working with ppl.llm.kernel.cuda, follow these steps:

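Install Prerequisites: On Debian or Ubuntu systems, the build tools can be installed with the command below, run as root or through sudo. The CUDA Toolkit is not included in these packages and must be installed separately.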
```
apt-get install build-essential cmake git
```
Clone the Source Code: To obtain the project files, clone the repository using:
```
git clone https://github.com/openppl-public/ppl.llm.kernel.cuda.git
```
Build from Source: After acquiring the source code, change into the cloned directory and compile the project with the following command:
```
./build.sh -DPPLNN_CUDA_ENABLE_NCCL=ON -DPPLNN_ENABLE_CUDA_JIT=OFF -DPPLNN_CUDA_ARCHITECTURES="'80;86;87'" -DPPLCOMMON_CUDA_ARCHITECTURES="'80;86;87'"
```
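This command enables NCCL support (PPLNN_CUDA_ENABLE_NCCL=ON), disables CUDA JIT compilation (PPLNN_ENABLE_CUDA_JIT=OFF), and builds for CUDA architectures 80, 86, and 87, which correspond to the Ampere-generation compute capabilities 8.0, 8.6, and 8.7.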
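
The architecture list in the build command can also be derived from the GPUs in the machine. The sketch below is an illustration only: `caps_to_archs` is a hypothetical helper, the `compute_cap` query field of `nvidia-smi` is assumed to be available (older drivers may not provide it), and whether the kernels build for architectures outside the list shown above is not confirmed here. The script prints the resulting command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: reads compute capabilities (one per line, e.g. "8.6")
# on stdin and prints a deduplicated, sorted list such as "80;86".
caps_to_archs() {
    tr -d '. ' | sort -un | paste -sd ';' -
}

# Assumption: nvidia-smi supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    archs=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | caps_to_archs) || archs=""
else
    archs=""
fi

# Fall back to the list from the build command above when detection
# fails or returns anything other than digits and semicolons.
case $archs in
    ''|*[!0-9\;]*) archs='80;86;87' ;;
esac

# Print the build command with the chosen architectures; nothing is executed.
printf '%s\n' "./build.sh -DPPLNN_CUDA_ENABLE_NCCL=ON -DPPLNN_ENABLE_CUDA_JIT=OFF -DPPLNN_CUDA_ARCHITECTURES=\"'$archs'\" -DPPLCOMMON_CUDA_ARCHITECTURES=\"'$archs'\""
```

Printing the command first makes it possible to review the architecture list before starting a long build.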

License

ppl.llm.kernel.cuda is released under the Apache License, Version 2.0, which permits use, modification, and distribution of the project subject to the terms of the license.