Toolformer - Pytorch (wip)
Toolformer-Pytorch is an open-source PyTorch implementation of the MetaAI research paper "Toolformer: Language Models Can Teach Themselves to Use Tools," which aims to enhance the capabilities of language models by enabling them to use external tools effectively. The main appeal of the project lies in its ability to integrate API calls directly into the output of a transformer model, enhancing the model's functionality and utility in real-world applications.
Appreciation
Several organizations and individuals have provided invaluable support to the Toolformer-Pytorch project. Stability.ai generously sponsors the work, enabling the development and open-source release of cutting-edge AI research. Enrico is acknowledged for the initial contributions on GitHub that helped kickstart the project's development. Additionally, ChatGPT was used to write the complex regular expressions that parse the functions and parameters of API calls, a task the developers found challenging to do by hand.
Installation
Toolformer can be easily installed using pip:
$ pip install toolformer-pytorch
Usage
The core feature of Toolformer is teaching a language model to fetch contextual information it cannot reliably produce on its own, such as the current date and time, through API calls. For instance, Toolformer allows developers to teach a model to use a Calendar tool to improve its output text. This is achieved by defining a prompt that instructs the model on when and how to insert API calls to gather the necessary information.
Here's an example of how Toolformer-Pytorch is utilized:
import torch
from toolformer_pytorch import Toolformer, PaLM

# a simple tool: a function that returns today's date as a string
def Calendar():
    import datetime
    from calendar import day_name, month_name
    now = datetime.datetime.now()
    return f'Today is {day_name[now.weekday()]}, {month_name[now.month]} {now.day}, {now.year}.'
prompt = """
Your task is to add calls to a Calendar API to a piece of text.
The API calls should help you get information required to complete the text.
You can call the API by writing "[Calendar()]"
Here are some examples of API calls:
Input: Today is the first Friday of the year.
Output: Today is the first [Calendar()] Friday of the year.
Input: The president of the United States is Joe Biden.
Output: The president of the United States is [Calendar()] Joe Biden.
Input: [input]
Output:
"""
data = [
    "The store is never open on the weekend, so today it is closed.",
    "The number of days from now until Christmas is 30",
    "The current day of the week is Wednesday."
]
# a small PaLM transformer used as the language model
model = PaLM(
    dim = 512,
    depth = 2,
    heads = 8,
    dim_head = 64
).cuda()
toolformer = Toolformer(
    model = model,
    model_seq_len = 256,
    teach_tool_prompt = prompt,
    tool_id = 'Calendar',
    tool = Calendar,
    finetune = True
)
filtered_stats = toolformer(data)
response = toolformer.sample_model_with_api_calls("How many days until the next new years?")
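In broad terms, calling toolformer(data) prompts the model to insert [Calendar()] calls into each example, executes those calls, filters out insertions that do not help (using the fitness score described below), and fine-tunes the model on the surviving examples. Once fine-tuning completes, sample_model_with_api_calls generates text and, whenever the model emits a bracketed Calendar call, invokes the tool and feeds its result back into the generation.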
Core Concept
The key innovation of the underlying paper is a 'fitness score' that evaluates how useful each sampled API call actually is: a call is kept only if conditioning on the call and its result reduces the perplexity of the text that follows it. This score is used to filter the sampled outputs before fine-tuning, so the language model learns to place API calls only where they genuinely help.
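The following is a minimal, simplified sketch of that filtering criterion, not the repository's actual implementation: the names weighted_loss, keep_api_call, decay, and threshold are illustrative, and the model is assumed to return logits of shape (batch, seq, vocab) for LongTensor token ids.

import torch
import torch.nn.functional as F

def weighted_loss(model, prefix_ids, target_ids, decay = 0.9):
    # cross entropy over the continuation tokens, weighted so that tokens
    # closer to the candidate API call position count more
    logits = model(torch.cat((prefix_ids, target_ids), dim = -1))
    logits = logits[:, prefix_ids.shape[-1] - 1 : -1]   # predictions for the continuation positions
    losses = F.cross_entropy(logits.transpose(1, 2), target_ids, reduction = 'none')
    weights = decay ** torch.arange(target_ids.shape[-1], device = losses.device)
    return (losses * weights).sum(dim = -1) / weights.sum()

def keep_api_call(model, plain_prefix, prefix_with_call_and_result, continuation, threshold = 1.0):
    # keep the API call only if conditioning on the call and its result
    # lowers the loss on the continuation by at least `threshold`
    loss_without = weighted_loss(model, plain_prefix, continuation)
    loss_with = weighted_loss(model, prefix_with_call_and_result, continuation)
    return (loss_without - loss_with) >= threshold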
Advanced Utilities
Toolformer-Pytorch also exposes lower-level utilities such as the invoke_tools function, which executes the registered functions referenced by bracketed API calls inside a piece of generated text. For instance, developers can define increment and decrement functions and apply them to the numbers embedded in a string:
from toolformer_pytorch import invoke_tools
def inc(i):
    return i + 1

def dec(i):
    return i - 1

# map tool names, as they appear in the text, to the functions to run
function_registry = dict(
    inc = inc,
    dec = dec
)
text = 'make the following api calls: [inc(1)] and [dec(2)] and [ignored(3)]'
invoke_tools(function_registry, text)
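invoke_tools looks up each bracketed call in the registry, runs the matching function on its parsed arguments, and splices the result back into the string following the paper's [tool(args) → result] convention; calls without a registered function, such as ignored(3) above, are left untouched.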
Future Developments
The project outlines several future goals, which include:
- Enhancing the tool's ability to handle errors in function names and parameters.
- Improving statistical reporting before final fine-tuning.
- Completing end-to-end training within Toolformer.
- Expanding compatibility to include more language models such as GPT-J.
- Developing datasets and evaluations for additional applications.
Citations
Toolformer-Pytorch builds on previous research efforts, most directly the Toolformer paper from MetaAI, and the project cites that work along with related foundational research on language models and their implementations.
Toolformer-Pytorch offers significant potential for expanding the application of language models in real-world scenarios by empowering them with tool-using capabilities. This advancement represents a pivotal step toward more intelligent and adaptive AI systems.