Awesome Graph Self-Supervised Learning
This project is a curated collection of resources on self-supervised graph representation learning, inspired by influential projects such as "awesome-deep-vision", "awesome-adversarial-machine-learning", and "awesome-deep-learning-papers". Its goal is to offer a single point of reference for anyone exploring self-supervised learning (SSL) on graph data.
Why Self-Supervised Learning?
Self-supervised learning is gaining considerable traction in the AI community because it can learn useful representations without manual annotations. Several prominent researchers have emphasized its importance:
- Jitendra Malik: "Supervision is the opium of the AI researcher."
- Alyosha Efros: "The AI revolution will not be supervised."
- Yann LeCun: Likens intelligence to a cake in which self-supervised learning is the bulk of the cake and supervised learning is merely the icing.
Overview
The project extends the notion of self-supervised learning beyond its typical domains of computer vision and natural language processing to graph data. Self-supervised graph representation learning methods are categorized into three distinct types: contrastive, generative, and predictive.
- Contrastive Learning: Learns by comparing representations of different views of the same data, typically produced by augmentation, pulling views of the same object together and pushing others apart. It is like comparing two different views or angles of the same scene to understand it better.
- Generative Learning: Focuses on recovering intrinsic properties of the data itself, using node attributes and graph structure as self-supervision signals for reconstruction tasks.
- Predictive Learning: Generates pseudo-labels from the graph itself, for example through statistical or structural analysis, and designs tasks that predict these labels, so the model learns how the data relates to its self-generated labels (see the sketch after this list).
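As a concrete illustration of the predictive category, below is a minimal sketch in plain PyTorch (dense adjacency, toy random graph, all hyperparameters illustrative) in which pseudo-labels are derived from node degrees and a small GCN is trained to predict them. It is not the method of any specific paper in the list, only the general recipe.

```python
# Minimal sketch of predictive self-supervised learning on a graph:
# pseudo-labels come from a simple structural statistic (node degree),
# and a small GCN is trained to predict them. All names and
# hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_adj(adj: torch.Tensor) -> torch.Tensor:
    """Symmetrically normalize a dense adjacency matrix with self-loops."""
    adj = adj + torch.eye(adj.size(0))
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)

class TinyGCN(nn.Module):
    """Two-layer GCN operating on a dense normalized adjacency."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x, adj_norm):
        h = F.relu(adj_norm @ self.lin1(x))
        return adj_norm @ self.lin2(h)

# Toy graph: random features and a random symmetric adjacency.
num_nodes, feat_dim, num_buckets = 100, 16, 4
x = torch.randn(num_nodes, feat_dim)
adj = (torch.rand(num_nodes, num_nodes) < 0.05).float()
adj = ((adj + adj.t()) > 0).float()
adj.fill_diagonal_(0)

# Pseudo-labels: bucketize node degrees into quantile-based classes.
degrees = adj.sum(dim=1)
bounds = torch.quantile(degrees, torch.tensor([0.25, 0.5, 0.75]))
pseudo_labels = torch.bucketize(degrees, bounds)

model = TinyGCN(feat_dim, 32, num_buckets)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
adj_norm = normalize_adj(adj)
for epoch in range(50):
    opt.zero_grad()
    logits = model(x, adj_norm)
    loss = F.cross_entropy(logits, pseudo_labels)  # predict self-generated labels
    loss.backward()
    opt.step()
```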
Training Strategies
There are three primary strategies for training models using these SSL techniques:
- Pre-training and Fine-tuning (P&F): The encoder is first pre-trained on pretext tasks with unlabeled data; the learned parameters then initialize the model, which is fine-tuned on the labeled downstream task.
- Joint Learning (JL): A self-supervised task and the supervised downstream task are optimized together, typically as a weighted sum of their losses.
- Unsupervised Representation Learning (URL): The encoder is trained only on unlabeled data and then frozen; its fixed representations are fed to a separate model, such as a linear classifier, trained on the labeled downstream task. A minimal sketch of all three strategies follows this list.
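The following sketch contrasts the three strategies using a plain MLP encoder, synthetic tensors, and a placeholder feature-reconstruction pretext loss; everything here (names, loss weight, epoch counts) is an illustrative assumption rather than a prescribed recipe, and any SSL objective from the sections below could be plugged in.

```python
# Hedged sketch of the three training strategies with synthetic data.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim, hid_dim, num_classes, num_nodes = 16, 32, 7, 200
x = torch.randn(num_nodes, feat_dim)             # stand-in for node features
y = torch.randint(0, num_classes, (num_nodes,))  # downstream labels

def make_encoder():
    return nn.Sequential(nn.Linear(feat_dim, hid_dim), nn.ReLU())

def ssl_loss(encoder, decoder, x):
    # Placeholder pretext task: reconstruct the input features.
    return F.mse_loss(decoder(encoder(x)), x)

def sup_loss(encoder, head, x, y):
    return F.cross_entropy(head(encoder(x)), y)

# 1) Pre-training & Fine-tuning (P&F): pretrain on the pretext task,
#    then continue training the *same* encoder on the labeled task.
enc, dec, head = make_encoder(), nn.Linear(hid_dim, feat_dim), nn.Linear(hid_dim, num_classes)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
for _ in range(100):
    opt.zero_grad(); ssl_loss(enc, dec, x).backward(); opt.step()
opt = torch.optim.Adam(list(enc.parameters()) + list(head.parameters()), lr=1e-3)
for _ in range(100):
    opt.zero_grad(); sup_loss(enc, head, x, y).backward(); opt.step()

# 2) Joint Learning (JL): optimize a weighted sum of both losses together.
enc, dec, head = make_encoder(), nn.Linear(hid_dim, feat_dim), nn.Linear(hid_dim, num_classes)
params = list(enc.parameters()) + list(dec.parameters()) + list(head.parameters())
opt = torch.optim.Adam(params, lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = sup_loss(enc, head, x, y) + 0.5 * ssl_loss(enc, dec, x)  # 0.5 is an arbitrary weight
    loss.backward(); opt.step()

# 3) Unsupervised Representation Learning (URL): train the encoder with the
#    pretext task only, freeze it, and fit a separate classifier on top.
enc, dec, head = make_encoder(), nn.Linear(hid_dim, feat_dim), nn.Linear(hid_dim, num_classes)
opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
for _ in range(100):
    opt.zero_grad(); ssl_loss(enc, dec, x).backward(); opt.step()
with torch.no_grad():
    z = enc(x)                                   # frozen representations
opt = torch.optim.Adam(head.parameters(), lr=1e-2)
for _ in range(100):
    opt.zero_grad(); F.cross_entropy(head(z), y).backward(); opt.step()
```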
Contrastive Learning
Contrastive methods operate at three information levels: local (node level), contextual (subgraph level), and global (whole-graph level). They contrast representations in two ways:
- Same-Scale Contrasting: Compares within the same level, like node-to-node.
- Cross-Scale Contrasting: Compares across different levels, like node-to-subgraph.
The project further catalogs methods for each pairing, including Global-Global, Context-Context, Local-Local, Local-Global, Local-Context, and Context-Global contrasting, with representative algorithms such as GraphCL, IGSD, and DGI; a generic same-scale sketch follows.
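As an example of same-scale (node-to-node) contrasting, here is a hedged sketch in the spirit of GraphCL/GRACE-style objectives: two randomly augmented views of the same toy graph are encoded by a shared GCN, and an InfoNCE loss treats the two views of each node as a positive pair. The augmentations, temperature, and encoder are illustrative choices rather than the exact settings of any listed method; a cross-scale method such as DGI would instead contrast node embeddings against a whole-graph summary.

```python
# Hedged sketch of same-scale (node-to-node) contrastive learning.
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_adj(adj):
    adj = adj + torch.eye(adj.size(0))
    d = adj.sum(1).pow(-0.5)
    return d.unsqueeze(1) * adj * d.unsqueeze(0)

class GCNEncoder(nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin1, self.lin2 = nn.Linear(in_dim, hid_dim), nn.Linear(hid_dim, hid_dim)
    def forward(self, x, adj_norm):
        h = F.relu(adj_norm @ self.lin1(x))
        return adj_norm @ self.lin2(h)

def augment(x, adj, drop_feat=0.2, drop_edge=0.2):
    """Random feature masking + random edge dropping."""
    x_aug = x * (torch.rand_like(x) > drop_feat).float()
    mask = (torch.rand_like(adj) > drop_edge).float()
    adj_aug = adj * mask * mask.t()          # keep the adjacency symmetric
    return x_aug, normalize_adj(adj_aug)

def info_nce(z1, z2, tau=0.5):
    """Node-level InfoNCE: the two views of the same node are positives."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / tau                  # [N, N] cross-view similarities
    targets = torch.arange(z1.size(0))
    return F.cross_entropy(sim, targets)

# Toy graph.
num_nodes, feat_dim = 100, 16
x = torch.randn(num_nodes, feat_dim)
adj = (torch.rand(num_nodes, num_nodes) < 0.05).float()
adj = ((adj + adj.t()) > 0).float(); adj.fill_diagonal_(0)

enc = GCNEncoder(feat_dim, 32)
opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
for epoch in range(50):
    (x1, a1), (x2, a2) = augment(x, adj), augment(x, adj)
    z1, z2 = enc(x1, a1), enc(x2, a2)
    loss = 0.5 * (info_nce(z1, z2) + info_nce(z2, z1))   # symmetric contrastive loss
    opt.zero_grad(); loss.backward(); opt.step()
```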
Generative Learning
Focused on reconstructing data or inferring its original state, generative learning includes methods such as:
- Graph Autoencoding: Methods such as CDNMF and GraphMAE that learn by reconstructing node attributes and/or graph structure from possibly masked or corrupted inputs.
- Graph Completion: A pretext task, studied in work on when self-supervision helps graph convolutional networks, in which the features of target nodes are masked and then predicted from neighboring nodes and the graph structure; a hedged autoencoding sketch follows this list.
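To make the reconstruction idea concrete, the sketch below follows the generic graph-autoencoder recipe: a GCN encoder produces node embeddings and an inner-product decoder reconstructs the adjacency matrix under a weighted binary cross-entropy. It is a generic illustration rather than the exact formulation of CDNMF or GraphMAE (GraphMAE, for instance, reconstructs masked node features instead of edges).

```python
# Hedged sketch of generative (reconstruction-based) graph SSL:
# GCN encoder + inner-product decoder over the adjacency matrix.
import torch
import torch.nn as nn
import torch.nn.functional as F

def normalize_adj(adj):
    adj = adj + torch.eye(adj.size(0))
    d = adj.sum(1).pow(-0.5)
    return d.unsqueeze(1) * adj * d.unsqueeze(0)

class GCNEncoder(nn.Module):
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.lin1, self.lin2 = nn.Linear(in_dim, hid_dim), nn.Linear(hid_dim, out_dim)
    def forward(self, x, adj_norm):
        h = F.relu(adj_norm @ self.lin1(x))
        return adj_norm @ self.lin2(h)

# Toy graph.
num_nodes, feat_dim = 100, 16
x = torch.randn(num_nodes, feat_dim)
adj = (torch.rand(num_nodes, num_nodes) < 0.05).float()
adj = ((adj + adj.t()) > 0).float(); adj.fill_diagonal_(0)
adj_norm = normalize_adj(adj)

enc = GCNEncoder(feat_dim, 32, 16)
opt = torch.optim.Adam(enc.parameters(), lr=1e-2)
# Up-weight the (sparse) positive edges so the loss is not dominated by non-edges.
pos_weight = (adj.numel() - adj.sum()) / adj.sum()
for epoch in range(100):
    z = enc(x, adj_norm)
    logits = z @ z.t()                        # inner-product decoder
    loss = F.binary_cross_entropy_with_logits(logits, adj, pos_weight=pos_weight)
    opt.zero_grad(); loss.backward(); opt.step()
```

In practice the learned embeddings z would then be evaluated on a downstream task under one of the training strategies described above.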
With a comprehensive list of tools, techniques, and strategies, this project aims to be a valuable resource for researchers and practitioners looking to explore self-supervised approaches in graph representation learning. By providing categorized resources, it aims to foster innovation and understanding in the evolving field of AI-driven graph analysis.