Overview of TypeDB-ML
TypeDB-ML is a tool for running graph algorithms and machine learning on data stored in TypeDB, a strongly-typed database system. Formerly known as KGLIB, TypeDB-ML supports two prominent graph and machine learning libraries: NetworkX and PyTorch Geometric (PyG). Note, however, that this repository is no longer actively supported and will be closed by the end of 2023.
Integrations and Features
NetworkX Integration
TypeDB-ML's integration with NetworkX allows users to perform a series of complex algorithms on graph data that is exported from TypeDB. This library provides extensive algorithmic functionalities, which can enrich the manipulation and analysis of graph data. Some notable features include:
- Graph Structure Declaration: Users can declare the graph structure they want to explore and use optional sampling functions for refining queries.
- Graph Building: The tool enables querying a TypeDB instance and combining multiple results from various queries to construct a comprehensive graph using the build_graph_from_queries method.
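The idea of combining several query results into one graph can be sketched with plain NetworkX. This is only an illustration under stated assumptions: it does not call build_graph_from_queries or any other TypeDB-ML function, the helper graph_from_result is hypothetical, and the hard-coded triples stand in for results that would normally come from queries against a TypeDB instance.

```python
import networkx as nx


def graph_from_result(triples):
    """Build a directed graph from (source, relation_type, target) triples.

    Hypothetical helper: the triples are a stand-in for one query's results.
    """
    graph = nx.DiGraph()
    for source, relation_type, target in triples:
        graph.add_edge(source, target, type=relation_type)
    return graph


# Made-up results of two separate queries (assumed data, not real output).
employment_result = [("alice", "employment", "acme"), ("bob", "employment", "acme")]
friendship_result = [("alice", "friendship", "bob")]

# Merge the per-query graphs into one graph; shared nodes are unified by name.
combined = nx.compose_all(
    [graph_from_result(employment_result), graph_from_result(friendship_result)]
)

# Any NetworkX algorithm can now run on the merged graph.
reachable_from_alice = nx.descendants(combined, "alice")
```

Because nodes are matched by identifier, an entity returned by more than one query appears once in the merged graph, which is what allows algorithms to traverse across the results of different queries.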
PyTorch Geometric (PyG) Integration
TypeDB-ML leverages PyG for advanced machine learning applications on graph data. It empowers users to construct Graph Neural Networks (GNNs), which are useful for tasks like link prediction. Key features include:
- Data Handling: The DataSet object helps in efficiently loading graphs from TypeDB and converting them into PyG Data objects.
- Heterogeneous Data Management: Recognizing the type-specific nature of data in TypeDB, TypeDB-ML assists in converting data into PyG HeteroData objects, maintaining node order integrity.
- Feature Encoding: It provides a FeatureEncoder to manage encoding processes, supporting both continuous and categorical data to match the types and attributes found in TypeDB datasets.
- Example Projects: A full example demonstrating link prediction is available to guide users through building GNNs on TypeDB data.
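The two ideas in the list above, encoding typed attributes and keeping node order stable per type, can be sketched without any dependencies. This is a minimal illustration, not the FeatureEncoder or DataSet API: every function name, the min-max scaling, the one-hot scheme, and the sample data are assumptions made for the example. The only PyG convention relied on is that edges are stored as two parallel rows of source and target node indices.

```python
from typing import Dict, List, Sequence, Tuple


def encode_continuous(value: float, min_value: float, max_value: float) -> float:
    """Min-max scale a continuous attribute into [0, 1] (assumed scheme)."""
    if max_value == min_value:
        return 0.0
    return (value - min_value) / (max_value - min_value)


def encode_categorical(value: str, categories: Sequence[str]) -> List[float]:
    """One-hot encode a categorical attribute (assumed scheme)."""
    return [1.0 if value == category else 0.0 for category in categories]


def index_nodes(nodes_by_type: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Give each node a position within its own type, preserving input order."""
    return {
        node_type: {node_id: position for position, node_id in enumerate(node_ids)}
        for node_type, node_ids in nodes_by_type.items()
    }


def build_edge_index(
    edges: Sequence[Tuple[str, str]],
    source_index: Dict[str, int],
    target_index: Dict[str, int],
) -> List[List[int]]:
    """Return [source_positions, target_positions] for one edge type."""
    return [
        [source_index[source] for source, _ in edges],
        [target_index[target] for _, target in edges],
    ]


# Made-up data standing in for entities and attributes read from TypeDB.
nodes_by_type = {"person": ["p1", "p2"], "company": ["c1"]}
ages = {"p1": 30, "p2": 50}
sectors = {"c1": "finance"}
employment_edges = [("p1", "c1"), ("p2", "c1")]

positions = index_nodes(nodes_by_type)

# Feature rows are emitted in the same order as the node positions.
person_features = [[encode_continuous(ages[p], 0, 100)] for p in nodes_by_type["person"]]
company_features = [
    encode_categorical(sectors[c], ["tech", "finance"]) for c in nodes_by_type["company"]
]
employment_index = build_edge_index(
    employment_edges, positions["person"], positions["company"]
)
```

Keeping one index per node type matters because row i of a type's feature matrix must describe the same node that position i refers to in every edge list touching that type.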
Additional Features
For those exploring machine learning with graph data, TypeDB-ML offers additional examples, such as using TensorBoard for visualizing PyG HeteroData.
Resources and Installation
To better understand TypeDB-ML's inception and applications, users can explore resources like "Strongly Typed Data for Machine Learning" and "How Can We Complete a Knowledge Graph?" on YouTube.
Installation Quickstart
To start using TypeDB-ML, users need Python version 3.7 or higher. They should download the requirements.txt from the repository and install the necessary packages via pip. Users must have TypeDB 2.11.1 and the compatible typedb-client-python installed to run TypeDB-ML smoothly.
Running Examples and Development
By exploring the included PyG heterogeneous link prediction example, users can see TypeDB-ML in action. Community engagement is encouraged via Vaticle's Discord and Discussion Forum for ongoing development discussions.
Building from Source
While most users will install TypeDB-ML using pip, developers interested in modifying the library can clone the repository and use Bazel to build from source and run all tests. This option requires Python 3.7+ and familiarity with Bazel's build and test environments.
Despite its powerful features, users should be aware of the project's forthcoming closure and plan their adoption of its technology accordingly.