CogDL - Optimization Toolkit for Graph Neural Network Training

Introduction to CogDL

CogDL is a graph deep learning toolkit that lets researchers and developers train, evaluate, and compare baseline or custom models for tasks such as node classification and graph classification. Its design emphasizes efficiency, ease of use, and extensibility.

Key Features of CogDL

Efficiency

CogDL targets the training speed and resource usage of Graph Neural Networks (GNNs). It relies on well-tuned operators to speed up training and reduce GPU memory usage.

Ease of Use

CogDL provides simple APIs for running experiments with its built-in models and datasets and for tuning their parameters.

Extensibility

CogDL's framework is flexible enough that users can apply GNN models to new scenarios, which makes it suitable for a wide range of research applications.

Recent Updates

WWW 2023 Acceptance: The CogDL paper was accepted at the WWW 2023 conference. The team has released a new version (v0.6) that includes more examples of graph self-supervised learning, such as GraphMAE, GraphMAE2, and BGRL.
GNN Course: The CogDL team offers a free Graph Neural Network (GNN) course for learners who want to study the topic in depth.
Version Updates: Recent releases add mixed-precision training, bug fixes, and expanded tutorial content.

Getting Started with CogDL

System Requirements

Python 3.7 or higher
PyTorch 1.7.1 or higher
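
A quick way to check these minimums before installing is sketched below. The helper names `parse_version`, `meets_minimum`, and `python_ok` are hypothetical and not part of CogDL or PyTorch. The sketch compares versions as integer tuples, since plain string comparison would rank "1.13.0" below "1.7.1".

```python
import re
import sys

MIN_PYTHON = (3, 7)
MIN_TORCH = (1, 7, 1)


def parse_version(text):
    """Turn a version string such as '1.7.1+cu110' into (1, 7, 1)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    # A missing patch component (e.g. '2.0') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())


def meets_minimum(version_text, minimum):
    """Return True if the parsed version is at least `minimum`."""
    return parse_version(version_text) >= minimum


def python_ok():
    """Return True if the running interpreter satisfies MIN_PYTHON."""
    return sys.version_info[:2] >= MIN_PYTHON
```

With PyTorch installed, `meets_minimum(torch.__version__, MIN_TORCH)` runs the same check against the local installation.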

Installation

Install PyTorch first, following the instructions on its GitHub page. Then install CogDL with pip:

pip install cogdl

Or install the latest version from source:

pip install git+https://github.com/thudm/cogdl.git

Usage

CogDL experiments can be run through its Python API. A basic example trains a Graph Convolutional Network (GCN) on the Cora dataset:

from cogdl import experiment
experiment(dataset="cora", model="gcn")

More complex configurations can adjust hyperparameters, run several models in one call, or perform an automated hyperparameter search.
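
The sketch below illustrates those three cases. The keyword arguments (`hidden_size`, `epochs`, list-valued `model` and `seed`, and `search_space`) are assumptions based on the project's README examples and may differ between versions. The `trial` object is assumed to be Optuna-style, and all values are placeholders.

```python
def search_space(trial):
    """Candidate values for an automated search (Optuna-style `trial` assumed)."""
    return {
        "lr": trial.suggest_categorical("lr", [1e-3, 5e-3, 1e-2]),
        "hidden_size": trial.suggest_categorical("hidden_size", [32, 64, 128]),
    }


def run_examples():
    # Imported lazily so this file can be loaded without CogDL installed.
    from cogdl import experiment

    # Adjust hyperparameters of a single run.
    experiment(dataset="cora", model="gcn", hidden_size=32, epochs=200)

    # Run several models, each with several seeds.
    experiment(dataset="cora", model=["gcn", "gat"], seed=[1, 2])

    # Search over the space defined above (assumed keyword: search_space).
    experiment(dataset="cora", model="gcn", seed=[1, 2], search_space=search_space)
```

Calling `run_examples()` starts the three experiments in sequence; nothing runs on import.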

Command-Line Interface

A command-line script is also available. For example, the following command, run from the root of a source checkout, trains GCN and GAT on the Cora dataset with five random seeds (0 to 4):

python scripts/train.py --dataset cora --model gcn gat --seed 0 1 2 3 4
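
Assuming the command trains every listed model once per seed, it stands for a small grid of runs. The sketch below only enumerates that grid. It does not call CogDL, and `expand_runs` is a hypothetical helper.

```python
from itertools import product


def expand_runs(datasets, models, seeds):
    """List every (dataset, model, seed) combination implied by the flags."""
    return [
        {"dataset": d, "model": m, "seed": s}
        for d, m, s in product(datasets, models, seeds)
    ]


runs = expand_runs(["cora"], ["gcn", "gat"], range(5))
print(len(runs))  # 2 models x 5 seeds = 10 runs
```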

Community and Contribution

CogDL welcomes contributions from developers who want to implement new algorithms. Contributors are encouraged to open an issue on GitHub before submitting a pull request. The contribution guidelines use pre-commit hooks to keep code format and style consistent.

Fast GNN Training

CogDL includes fast sparse matrix multiplication operators that accelerate GNN training on GPUs.
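
For context, the core operation in many GNN layers is multiplying a sparse adjacency matrix by a dense feature matrix. The pure-Python sketch below shows what such an operator computes for a matrix stored in CSR form. It is a readability aid under that assumption, not CogDL's GPU kernel, and `spmm_csr` is a hypothetical name.

```python
def spmm_csr(indptr, indices, data, dense):
    """Multiply a CSR sparse matrix by a dense matrix given as a list of rows."""
    n_cols = len(dense[0])
    out = []
    for row in range(len(indptr) - 1):
        acc = [0.0] * n_cols
        # Non-zeros of this row live in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            weight, neighbor = data[k], dense[indices[k]]
            for j in range(n_cols):
                acc[j] += weight * neighbor[j]
        out.append(acc)
    return out


# Path graph 0 - 1 - 2: each node sums its neighbors' feature vectors.
indptr, indices, data = [0, 1, 3, 4], [1, 0, 2, 1], [1.0, 1.0, 1.0, 1.0]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aggregated = spmm_csr(indptr, indices, data, features)
```

Only the stored non-zeros are visited, so the cost scales with the number of edges rather than with the square of the number of nodes.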

Parallel Experiments

Users can run experiments in parallel across multiple GPUs, which shortens sweeps that cover many models.
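
One simple way to spread a set of experiments over GPUs is round-robin assignment, sketched below. This illustrates the idea only and is not CogDL's scheduler; `assign_round_robin` and the device ids are hypothetical.

```python
from itertools import cycle


def assign_round_robin(runs, device_ids):
    """Map each device id to the runs it should execute, in order."""
    if not device_ids:
        raise ValueError("at least one device id is required")
    plan = {device: [] for device in device_ids}
    for run, device in zip(runs, cycle(device_ids)):
        plan[device].append(run)
    return plan


runs = [(model, seed) for model in ("gcn", "gat") for seed in range(3)]
plan = assign_round_robin(runs, device_ids=[0, 1])
```

Each device's list could then be executed in its own process, for example with `CUDA_VISIBLE_DEVICES` set to that device id.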

CogDL's Collaborative Development

CogDL is developed and maintained by Tsinghua University, ZJU, DAMO Academy, and ZHIPU.AI. The core development team can be reached at [email protected].

Citing CogDL

Researchers who find CogDL useful are encouraged to cite the paper as follows:

@inproceedings{cen2023cogdl,
    title={CogDL: A Comprehensive Library for Graph Deep Learning},
    author={Yukuo Cen and others},
    booktitle={Proceedings of the ACM Web Conference 2023 (WWW'23)},
    year={2023}
}
