Introduction to CogDL
CogDL is a toolkit for graph deep learning that lets researchers and developers train and evaluate baseline or custom models for tasks such as node classification and graph classification. It emphasizes efficiency, ease of use, and extensibility.
Key Features of CogDL
Efficiency
CogDL focuses on the speed and resource usage of Graph Neural Networks (GNNs). Its well-tuned operators speed up training and reduce GPU memory usage.
Ease of Use
CogDL provides user-friendly APIs for running experiments with existing models and datasets and for tuning their parameters.
Extensibility
CogDL has a flexible framework that lets users apply GNN models to a variety of scenarios, which makes it suitable for a wide range of research applications.
Recent Updates
- WWW 2023 Acceptance: The CogDL paper was accepted at the WWW 2023 conference. The team has released a new version (v0.6) that includes advanced examples of graph self-supervised learning like GraphMAE, GraphMAE2, and BGRL.
- GNN Course: A free Graph Neural Network (GNN) course is available through the CogDL team, catering to learners interested in in-depth study.
- Version Updates: Newer versions bring features such as mixed-precision training, bug fixes, and enhanced tutorial content, thus continuously improving the toolkit.
Getting Started with CogDL
System Requirements
- Python 3.7 or higher
- PyTorch 1.7.1 or higher
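The two requirements above can be checked before installing CogDL. The following is a minimal sketch, not part of CogDL: the helper names parse_version and check_environment are hypothetical, and the version parsing assumes PyTorch version strings of the form "1.7.1" or "1.7.1+cu110".

```python
import re
import sys

MIN_PYTHON = (3, 7)
MIN_TORCH = (1, 7, 1)


def parse_version(text):
    """Turn a string such as '1.7.1+cu110' into (1, 7, 1).

    Anything after '+' (the local build tag) is ignored.
    """
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.7 or higher is required")
    try:
        import torch  # imported lazily so a missing install is reported, not raised
    except ImportError:
        problems.append("PyTorch is not installed")
    else:
        if parse_version(torch.__version__) < MIN_TORCH:
            problems.append("PyTorch 1.7.1 or higher is required")
    return problems
```

Calling check_environment() and printing the result gives a quick report of what, if anything, needs upgrading.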
Installation
Install PyTorch as per the instructions on its GitHub page. Then, install CogDL using the following pip command:
pip install cogdl
Or, for the source version:
pip install git+https://github.com/thudm/cogdl.git
Usage
CogDL runs experiments through its Python API. For example, the following trains a Graph Convolutional Network (GCN) model on the Cora dataset:
from cogdl import experiment
experiment(dataset="cora", model="gcn")
More complex configurations can include adjusting parameters, running multiple models, or performing hyperparameter optimization through automated searches.
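The three kinds of configuration mentioned above might look like the sketch below. It is written from memory of the CogDL README rather than from this text, so treat the keyword names (hidden_size, epochs, seed, search_space) and the Optuna-style trial object as assumptions to verify against the installed version. The helpers build_configs and run_all are hypothetical names used only for illustration.

```python
def search_space(trial):
    """Hypothetical search space; `trial` is assumed to be an Optuna-style trial."""
    return {
        "lr": trial.suggest_categorical("lr", [1e-3, 5e-3, 1e-2]),
        "hidden_size": trial.suggest_categorical("hidden_size", [32, 64, 128]),
    }


def build_configs():
    """Keyword arguments for three experiment() calls (names are assumptions)."""
    return [
        # 1. Adjust hyperparameters of a single model.
        {"dataset": "cora", "model": "gcn", "hidden_size": 32, "epochs": 200},
        # 2. Run several models over several seeds.
        {"dataset": "cora", "model": ["gcn", "gat"], "seed": [1, 2]},
        # 3. Automated hyperparameter search over the space defined above.
        {"dataset": "cora", "model": "gcn", "seed": [1, 2], "search_space": search_space},
    ]


def run_all(configs):
    """Run each configuration; requires CogDL to be installed."""
    from cogdl import experiment  # imported here so the sketch loads without CogDL

    return [experiment(**config) for config in configs]
```

Calling run_all(build_configs()) would launch the three experiments in sequence. Keeping the configurations as plain dictionaries makes them easy to inspect or log before any training starts.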
Command-Line Interface
A command-line utility is available for users who prefer executing scripts directly. For example, you can run GCN and GAT models on the Cora dataset using the command:
python scripts/train.py --dataset cora --model gcn gat --seed 0 1 2 3 4
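To see how many training runs the command above implies, here is a small sketch that assumes the script treats multi-valued flags as a Cartesian product of datasets, models, and seeds. That reading follows from the command itself and is not taken from the script's source; expand_runs is a hypothetical helper.

```python
from itertools import product


def expand_runs(datasets, models, seeds):
    """List every (dataset, model, seed) combination as a dict."""
    return [
        {"dataset": d, "model": m, "seed": s}
        for d, m, s in product(datasets, models, seeds)
    ]


runs = expand_runs(["cora"], ["gcn", "gat"], [0, 1, 2, 3, 4])
print(len(runs))  # 10: one dataset x two models x five seeds
print(runs[0])    # {'dataset': 'cora', 'model': 'gcn', 'seed': 0}
```

Under this assumption, each model is trained once per seed, so results can be averaged over the five seeds.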
Community and Contribution
CogDL invites contributions from developers interested in implementing algorithms. Contributors are encouraged to open an issue on GitHub before making a pull request. Detailed guidelines ensure code quality through pre-commit hooks that maintain format and style consistency.
Fast GNN Training
For optimized operations, CogDL includes fast sparse matrix multiplication operators that accelerate GNN model training on GPUs.
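For readers unfamiliar with the operation, the sketch below shows what a sparse-dense matrix multiplication computes in a GNN layer: each node's output row is a weighted sum of its neighbors' feature rows. This is a plain-Python reference on the CPU using the CSR layout, not CogDL's GPU operator, and the function name spmm_csr is hypothetical.

```python
def spmm_csr(indptr, indices, values, dense):
    """Multiply a CSR sparse matrix by a dense matrix given as a list of rows.

    indptr[r]:indptr[r + 1] is the slice of `indices` and `values` that
    holds the non-zero columns and weights of row r.
    """
    num_cols = len(dense[0])
    out = []
    for row in range(len(indptr) - 1):
        acc = [0.0] * num_cols
        for k in range(indptr[row], indptr[row + 1]):
            weight, neighbor = values[k], indices[k]
            for j in range(num_cols):
                acc[j] += weight * dense[neighbor][j]
        out.append(acc)
    return out


# Path graph 0 - 1 - 2, adjacency matrix without self-loops.
indptr = [0, 1, 3, 4]
indices = [1, 0, 2, 1]
values = [1.0, 1.0, 1.0, 1.0]
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

aggregated = spmm_csr(indptr, indices, values, features)
print(aggregated)  # [[0.0, 1.0], [3.0, 2.0], [0.0, 1.0]]
```

Only the non-zero entries are visited, which is why a tuned GPU version of this loop has a large effect on GNN training time for sparse graphs.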
Parallel Experiments
Users can execute simultaneous experiments across multiple GPUs to leverage computational power, optimizing the workflow for experiments with diverse models.
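One simple way to spread a batch of runs over several GPUs is round-robin assignment, sketched below. The text does not describe how CogDL schedules runs internally, so this is a generic illustration under that assumption; assign_round_robin is a hypothetical helper, and the device IDs are plain integers.

```python
def assign_round_robin(runs, devices):
    """Map each device ID to the runs it should execute, in order."""
    if not devices:
        raise ValueError("at least one device is required")
    schedule = {device: [] for device in devices}
    for i, run in enumerate(runs):
        schedule[devices[i % len(devices)]].append(run)
    return schedule


# Six runs: two models, three seeds each.
runs = [(model, seed) for model in ("gcn", "gat") for seed in (0, 1, 2)]
schedule = assign_round_robin(runs, devices=[0, 1])
print(schedule[0])  # [('gcn', 0), ('gcn', 2), ('gat', 1)]
print(schedule[1])  # [('gcn', 1), ('gat', 0), ('gat', 2)]
```

Round-robin balances the number of runs per device but not their duration; a work queue that hands the next run to whichever GPU becomes free would suit runs of very different lengths better.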
CogDL's Collaborative Development
Developed by a collaboration of institutions including Tsinghua University, ZJU, DAMO Academy, and ZHIPU.AI, the project is actively maintained by a dedicated team accessible via [email protected].
Citing CogDL
If CogDL is useful for your research, please cite the CogDL paper as follows:
@inproceedings{cen2023cogdl,
title={CogDL: A Comprehensive Library for Graph Deep Learning},
author={Yukuo Cen et al.},
booktitle={Proceedings of the ACM Web Conference 2023 (WWW'23)},
year={2023}
}