ToG - Enhancing Large Language Model Integration with Deep Reasoning on Knowledge Graphs

Introduction to the ToG Project

ToG, or Think-on-Graph, is a project that couples large language models with knowledge graphs so that the model can reason over structured facts when solving problems and answering questions. The project's paper was accepted at the International Conference on Learning Representations (ICLR) 2024. The project's repository has since moved to a new location on GitHub.

Project Overview

The core of ToG is described in the paper titled "Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph". The work aims to bridge the gap between language models and knowledge graphs so that reasoning becomes more accurate and reliable. To do this, the project draws on knowledge graphs such as Freebase and Wikidata to ground the language model's reasoning in explicit facts.

How ToG Works

ToG follows a structured pipeline, which is illustrated by the diagrams on the project page. The workflow combines pruning, reasoning, and generating steps: candidate information drawn from the knowledge graph is pruned, the model reasons over what remains, and an answer to the query is generated from the retained data.
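The pruning, reasoning, and generating steps can be pictured with a small toy loop. The sketch below is not the project's implementation: the graph, the function names, and the keyword-based scoring are all assumptions made for illustration, and the two stub functions stand in for judgements that a language model would make in a real system.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Toy knowledge graph (made up for illustration): entity -> outgoing (relation, tail) edges.
Graph = Dict[str, List[Tuple[str, str]]]
Path = List[Tuple[str, str, str]]  # list of (head, relation, tail) triples

TOY_KG: Graph = {
    "Canberra": [("capital_of", "Australia"), ("located_in", "ACT")],
    "Australia": [("head_of_government", "Prime Minister"), ("continent", "Oceania")],
    "ACT": [("part_of", "Australia")],
}


def keyword_score(question: str, path: Path) -> float:
    """Hypothetical stand-in for an LLM relevance score:
    counts relation words on the path that also occur in the question."""
    words = set(question.lower().replace("?", "").split())
    return float(sum(1 for _, rel, _ in path for tok in rel.split("_") if tok in words))


def looks_sufficient(question: str, paths: List[Path]) -> bool:
    """Hypothetical stand-in for the reasoning step: stop once the best path matches two keywords."""
    return keyword_score(question, paths[0]) >= 2


def explore(
    question: str,
    start: str,
    kg: Graph,
    score: Callable[[str, Path], float] = keyword_score,
    sufficient: Callable[[str, List[Path]], bool] = looks_sufficient,
    width: int = 2,
    max_depth: int = 3,
) -> Tuple[Optional[str], Path]:
    beams: List[Path] = [[]]  # start with a single empty path rooted at `start`
    for _ in range(max_depth):
        # Extend every kept path by one edge.
        candidates: List[Path] = []
        for path in beams:
            head = path[-1][2] if path else start
            for rel, tail in kg.get(head, []):
                candidates.append(path + [(head, rel, tail)])
        if not candidates:
            break
        # Pruning: keep only the `width` highest-scoring paths.
        beams = sorted(candidates, key=lambda p: score(question, p), reverse=True)[:width]
        # Reasoning: decide whether the kept paths are enough to answer.
        if sufficient(question, beams):
            break
    # Generating: here the answer is simply the tail entity of the best path.
    best = beams[0]
    return (best[-1][2] if best else None), best


answer, path = explore("Which continent contains the country whose capital is Canberra?", "Canberra", TOY_KG)
print(answer, path)
```

Swapping `keyword_score` and `looks_sufficient` for calls to a language model would be the natural next step; the loop structure itself stays the same.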

Structure of the Project

The ToG repository is organized into the following files and directories:

Requirements: The requirements.txt file lists all the libraries needed to run ToG.
Data: The data/ folder contains the datasets used for evaluation.
Chain of Thought (CoT): The CoT/ directory holds the chain-of-thought methods.
Evaluation: The eval/ folder includes scripts for evaluating the model's performance.
Knowledge Graphs: Directories named Freebase/ and Wikidata/ provide the environment settings for integrating these knowledge sources.
Tools and Source Code: The tools/ and ToG/ folders contain the common tools and the source code of ToG, respectively.

Getting Started with ToG

Before using ToG, users need to install either Freebase or Wikidata locally. Installation instructions are provided in the README.md file of the corresponding directory (Freebase/ or Wikidata/). The Python libraries required to run ToG are listed in requirements.txt.
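Once a knowledge graph is installed, it can help to check that it answers queries before running ToG. The sketch below assumes the graph is exposed through a standard SPARQL endpoint, which this overview does not state; the endpoint URL, the function names, and the example entity IRI are illustrative assumptions, and the README.md files in Freebase/ and Wikidata/ remain the reference for the actual setup.

```python
import json
import urllib.parse
import urllib.request
from typing import List


def build_relation_query(entity_iri: str, limit: int = 20) -> str:
    """Build a SPARQL query listing the distinct outgoing relations of one entity."""
    if any(ch in entity_iri for ch in '<>" '):
        raise ValueError("entity_iri must be a plain IRI without brackets, quotes, or spaces")
    return (
        "SELECT DISTINCT ?relation WHERE {\n"
        f"  <{entity_iri}> ?relation ?tail .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def run_query(endpoint: str, query: str) -> List[str]:
    """Send the query over HTTP GET and return the relation IRIs.
    Assumes the endpoint returns the standard SPARQL JSON results format."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(url, headers={"Accept": "application/sparql-results+json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        data = json.load(response)
    return [row["relation"]["value"] for row in data["results"]["bindings"]]


if __name__ == "__main__":
    # Example entity IRI in Wikidata's naming scheme; replace with an entity from the installed graph.
    query = build_relation_query("http://www.wikidata.org/entity/Q42", limit=5)
    print(query)
    # Hypothetical local endpoint; uncomment and adjust to match the actual installation.
    # print(run_query("http://localhost:8890/sparql", query))
```

If the query returns a non-empty list of relations, the graph is reachable and populated for that entity.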

How to Run ToG

Detailed instructions for running ToG are provided in the README.md file within the ToG/ directory. Following them is enough to run the system on a local machine.

Evaluation Process

To evaluate the outputs of ToG, first convert the results from .jsonl format to .json using the script in the tools/ directory. The converted file can then be scored with the scripts in the eval/ folder.
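For readers unfamiliar with the two formats, the conversion amounts to reading one JSON object per line and writing them out as a single JSON array. The function below is a minimal sketch of that idea, not the script shipped in the tools directory; the function name and file names are assumptions.

```python
import json


def jsonl_to_json(src_path: str, dst_path: str) -> int:
    """Read one JSON object per line from src_path and write them as a JSON array to dst_path.
    Returns the number of records converted."""
    records = []
    with open(src_path, encoding="utf-8") as src:
        for line in src:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(records, dst, ensure_ascii=False, indent=2)
    return len(records)


if __name__ == "__main__":
    # Hypothetical file names; substitute the actual output file produced by a ToG run.
    count = jsonl_to_json("results.jsonl", "results.json")
    print(f"converted {count} records")
```

The array form is convenient for evaluation scripts because the whole result set can be loaded with a single `json.load` call.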

Citation

To cite this work, use the following BibTeX entry:

@misc{sun2023thinkongraph,
      title={Think-on-Graph: Deep and Responsible Reasoning of Large Language Model with Knowledge Graph}, 
      author={Jiashuo Sun and Chengjin Xu and Lumingyuan Tang and Saizhuo Wang and Chen Lin and Yeyun Gong and Heung-Yeung Shum and Jian Guo},
      year={2023},
      eprint={2307.07697},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Experiments and Applications

The figures on the GitHub page present the experimental results and example applications of ToG.

Licensing and Responsibility

The ToG project is released under the Apache 2.0 license. The project states that it assumes no legal responsibility for the outputs generated by the model or for any damages resulting from the use of the resources provided.

By letting a large language model reason directly over knowledge graphs, ToG offers a way to make the model's answers more accurate and reliable.