Introduction to PyGOD: A Graph Outlier Detection Library
PyGOD is a Python library for graph outlier detection, the task of identifying anomalies or unusual patterns in graph-structured data. The task matters in many domains, including the detection of suspicious activity in social networks and security systems (Dou et al., 2020; Cai et al., 2021).
Features and Capabilities
PyGOD is built on top of PyTorch and its graph learning extension, PyTorch Geometric (PyG), so developers already familiar with these frameworks can use it with little extra setup. Its API design is inspired by PyOD, a widely used Python outlier detection library.
Here's why PyGOD stands out:
- Unified APIs and Documentation: It offers a consistent API across multiple detection algorithms, supported by detailed documentation and interactive examples.
- Wide Range of Detectors: The library includes over 10 graph outlier detection algorithms, addressing various anomaly detection needs.
- Multilevel Analysis: Users can conduct anomaly detection at multiple levels, such as node, edge, and graph-level tasks.
- Scalability: PyGOD can process large graphs through mini-batch training and sampling, so it is not limited to graphs that fit in memory all at once.
- Integration with PyG Data Objects: This feature streamlines data processing, making the library fully compatible with PyG data structures.
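The unified-API idea from the list above can be illustrated with a toy sketch in plain Python. Everything here is hypothetical: `ToyDetector`, `DegreeDetector`, and `IsolationDetector` are not PyGOD classes, and they take a plain edge list where PyGOD detectors take a PyG data object. The only thing the sketch borrows is the convention that `fit` stores one score per node in `decision_score_`.

```python
class ToyDetector:
    """Hypothetical base class: fit() stores one score per node in decision_score_."""

    def fit(self, edges, num_nodes):
        self.decision_score_ = self._score(edges, num_nodes)
        return self

    def _score(self, edges, num_nodes):
        raise NotImplementedError


class DegreeDetector(ToyDetector):
    """Scores a node by its degree (a higher degree counts as more unusual here)."""

    def _score(self, edges, num_nodes):
        degree = [0] * num_nodes
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
        return [float(d) for d in degree]


class IsolationDetector(ToyDetector):
    """Scores a node 1.0 if it has no edges at all, else 0.0."""

    def _score(self, edges, num_nodes):
        touched = {node for edge in edges for node in edge}
        return [0.0 if node in touched else 1.0 for node in range(num_nodes)]


# A 5-node undirected graph: node 0 is a hub, node 4 is isolated.
edges = [(0, 1), (0, 2), (0, 3)]

# Both detectors share fit() and decision_score_, so they can be swapped in a loop.
results = {}
for detector in (DegreeDetector(), IsolationDetector()):
    results[type(detector).__name__] = detector.fit(edges, num_nodes=5).decision_score_
```

Because the calling code never depends on which detector it holds, comparing several algorithms on the same graph reduces to iterating over a list of detector objects.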
Getting Started with PyGOD
Getting started takes only a few lines of code. Below is a simple example that trains the DOMINANT detector, reads the outlier scores for the training data, and predicts on test data:
from pygod.detector import DOMINANT

# train_data and test_data are PyG data objects prepared beforehand
model = DOMINANT(num_layers=4, epoch=20)  # Configure the detector
model.fit(train_data)  # Train on a PyG data object
# Outlier scores for the training data
score = model.decision_score_
# Optionally, predict labels and scores on test data
pred, score = model.predict(test_data, return_score=True)
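Outlier scores are continuous, while predicted labels are binary. PyOD-style detectors commonly bridge the two by flagging the highest-scoring fraction of points. The sketch below shows that idea in plain Python under stated assumptions: `scores_to_labels` is a hypothetical helper, the `contamination` fraction and the tie handling are illustrative choices, and none of it is taken from PyGOD's internals.

```python
import math


def scores_to_labels(scores, contamination=0.1):
    """Flag roughly the top `contamination` fraction of scores as outliers (label 1)."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0.0 < contamination <= 0.5:
        raise ValueError("contamination must be in (0, 0.5]")
    # Number of points to flag, at least one.
    k = max(1, math.ceil(contamination * len(scores)))
    # The k-th largest score becomes the threshold; ties at the threshold are all flagged.
    threshold = sorted(scores, reverse=True)[k - 1]
    labels = [1 if s >= threshold else 0 for s in scores]
    return labels, threshold


# Ten toy scores; only the third one stands out.
scores = [0.12, 0.10, 0.95, 0.11, 0.13, 0.09, 0.14, 0.10, 0.12, 0.11]
labels, threshold = scores_to_labels(scores, contamination=0.1)
```

With ten scores and a contamination of 0.1, exactly one point is flagged, so `labels` marks only the node with score 0.95.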
Installation and Dependencies
Before installing PyGOD, it's necessary to have PyTorch (version 2.0.0 or later) and PyTorch Geometric (version 2.3.0 or later) installed. These are not included with PyGOD, so users must set them up separately. Once ready, PyGOD can be easily installed via pip:
pip install pygod # Install PyGOD
pip install --upgrade pygod # Update to the latest version if needed
Required Dependencies:
- Python 3.8 or later
- NumPy 1.24.3 or later
- Scikit-learn 1.2.2 or later
- SciPy 1.10.1 or later
- NetworkX 3.1 or later
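Before installing, it can help to check the environment against the versions listed above. The following is a minimal sketch using only the standard library. The helper names (`parse_version`, `meets_minimum`, `check_environment`) are hypothetical, the distribution names are assumed to be the usual PyPI names, the listed versions are treated as minimums, and the simplified parser ignores pre-release markers.

```python
from importlib import metadata

# Versions from the list above, treated as minimums; names are assumed PyPI names.
MINIMUMS = {
    "torch": "2.0.0",
    "torch-geometric": "2.3.0",
    "numpy": "1.24.3",
    "scikit-learn": "1.2.2",
    "scipy": "1.10.1",
    "networkx": "3.1",
}


def parse_version(text):
    """Turn '2.3.0+cu118' or '3.1' into a tuple of ints, dropping non-numeric suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that '1.10.1' counts as newer than '1.2.2'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '3.1' equals '3.1.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Return {package: status}, where status is 'ok', 'too old (x)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else f"too old ({installed})"
    return report


if __name__ == "__main__":
    for package, status in check_environment().items():
        print(f"{package}: {status}")
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.10.1" sorts before "1.2.2", which would wrongly reject a recent SciPy.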
Algorithms Included in PyGOD
PyGOD integrates a variety of algorithms, each suitable for different aspects of anomaly detection in graphs. Some notable ones include SCAN (2007), DOMINANT (2019), and the more recent GUIDE (2021) and CONAD (2022). These algorithms leverage various methods such as clustering, graph neural networks (GNNs), and autoencoders to identify anomalies effectively.
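To give a feel for how autoencoder-style detectors produce scores, here is a toy illustration of reconstruction-error scoring: a node that a model reconstructs poorly receives a high score. This is a sketch under stated assumptions, not the implementation of DOMINANT or any other PyGOD detector. The function `reconstruction_scores`, the weighting `alpha`, and the hand-written "reconstructed" values are all hypothetical.

```python
import math


def reconstruction_scores(adj, adj_hat, feats, feats_hat, alpha=0.5):
    """Per-node score: weighted sum of structure and attribute reconstruction errors."""

    def l2(row, row_hat):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(row, row_hat)))

    return [
        alpha * l2(adj[i], adj_hat[i]) + (1 - alpha) * l2(feats[i], feats_hat[i])
        for i in range(len(adj))
    ]


# Toy 3-node graph: adjacency rows and 2-dimensional node attributes.
adj = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
feats = [[1.0, 0.0], [1.0, 0.0], [5.0, 5.0]]

# Stand-ins for autoencoder output: nodes 0 and 1 are reconstructed exactly,
# while the attributes of node 2 are reconstructed poorly.
adj_hat = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
feats_hat = [[1.0, 0.0], [1.0, 0.0], [2.0, 1.0]]

scores = reconstruction_scores(adj, adj_hat, feats, feats_hat)
```

In a real detector the reconstructed matrices would come from a trained graph neural network rather than being written by hand; only the scoring step is shown here.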
Contribution and Support
PyGOD is a collaborative effort by researchers from institutions like UIC, IIT, BUAA, ASU, and CMU, fostering a community-driven development model. Developers and researchers interested in contributing can refer to the contribution guide. For any questions or support, the team can be contacted via the project’s GitHub page or email.
In summary, PyGOD offers a robust, scalable solution for graph-based anomaly detection, providing tools that are both powerful and easy to use. Its commitment to openness and collaboration makes it a valuable resource for the research and development community.