Creative Leap-of-Thought (CLoT)
Creative Leap-of-Thought (CLoT) is a project that explores Leap-of-Thought (LoT), a creative reasoning ability in large language models (LLMs). It accompanies a paper accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024 (CVPR 2024).
Exploring Leap-of-Thought
The main objective of CLoT is to study the LoT ability in multimodal LLMs. Most reasoning methods for LLMs, such as Chain-of-Thought, focus on linear, step-by-step thinking. Leap-of-Thought instead asks models to think outside those conventional bounds, adding a creative dimension to their reasoning.
The project uses Oogiri (大喜利), a humor generation game, to assess and develop Leap-of-Thought capabilities. In Oogiri, participants give unexpected, humorous responses to multimodal inputs. This makes it a natural setting for exercising the LoT ability: models must produce inventive, entertaining outputs in Image-to-Text, Text-to-Text, or Image-Text-to-Text scenarios.
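The three settings differ only in which inputs are present, so one way to picture them is a small record with an optional image and an optional text field. The following is a minimal Python sketch under that assumption; `OogiriSample` and `task_type` are hypothetical names chosen for illustration and are not taken from the CLoT codebase or dataset schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OogiriSample:
    """Hypothetical container for one Oogiri prompt (not the CLoT dataset schema)."""
    image_path: Optional[str] = None  # path to the prompt image, if any
    text: Optional[str] = None        # prompt text, if any


def task_type(sample: OogiriSample) -> str:
    """Name the Oogiri setting implied by the modalities that are present."""
    has_image = bool(sample.image_path)
    has_text = bool(sample.text)
    if has_image and has_text:
        return "Image-Text-to-Text"
    if has_image:
        return "Image-to-Text"
    if has_text:
        return "Text-to-Text"
    raise ValueError("an Oogiri sample needs an image, a text prompt, or both")
```

For example, `task_type(OogiriSample(image_path="cat.png"))` names the Image-to-Text setting, while a sample with neither field set is rejected.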
Launching CLoT Quickly
CLoT provides a simple entry point for experimentation: a script that runs zero-shot inference. Install the required packages, then run the script:
pip install -r requirements.txt
python inference.py
Gradio Web Interface
The project also provides a Gradio web interface for those who prefer to interact with the model in a browser. Start it with:
python gradio_demo.py
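To give a rough idea of how an image-plus-text demo can be wired up in Gradio, here is a minimal sketch. It is not the content of `gradio_demo.py`: `describe_inputs` is a hypothetical placeholder standing in for the model call, and `build_demo` is an illustrative helper that assumes the `gradio` package is installed.

```python
def describe_inputs(image, text):
    """Hypothetical placeholder for the model call; it only reports what it received."""
    parts = []
    if image is not None:
        parts.append("image")
    if text:
        parts.append("text")
    if not parts:
        return "Please provide an image, a text prompt, or both."
    return "Received " + " and ".join(parts) + "; a real demo would return the model's response here."


def build_demo():
    """Build a Gradio interface with an image input, a text input, and a text output."""
    import gradio as gr  # imported lazily so the placeholder above works without gradio

    return gr.Interface(
        fn=describe_inputs,
        inputs=[
            gr.Image(type="pil", label="Image (optional)"),
            gr.Textbox(label="Text (optional)"),
        ],
        outputs=gr.Textbox(label="Response"),
        title="Oogiri demo (sketch)",
    )
```

Calling `build_demo().launch()` would start a local web server for this sketch; in a real demo the placeholder function would be replaced by a call to the model.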
Recent Developments
CLoT has seen significant updates:
- April 13, 2024: A dataset and a checkpoint have been released on ModelScope, providing resources for those who can't access Hugging Face.
- March 16, 2024: The same dataset and checkpoint were made available on Hugging Face.
- December 6, 2023: The project launched its official project page, featuring more detailed information and examples.
- December 5, 2023: The project's foundational paper was released on arXiv.
Planned Enhancements
CLoT's development checklist records several completed items, including the project webpage, the preprint release, and the public release of the dataset and code.
Citation
To cite this work, use the following BibTeX entry:
@misc{zhong2023clot,
title={Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation},
author={Zhong, Shanshan and Huang, Zhongzhan and Gao, Shanghua and Wen, Wushao and Lin, Liang and Zitnik, Marinka and Zhou, Pan},
journal={arXiv preprint arXiv:2312.02439},
year={2023}
}
Overall, CLoT explores how machines can think creatively and humorously, extending LLM capabilities beyond traditional linear reasoning.