FireAct: Toward Language Agent Fine-tuning
FireAct is a project on fine-tuning language agents, accompanying the paper *FireAct: Toward Language Agent Fine-tuning* by Baian Chen and collaborators. The repository provides the prompts, demo code, and fine-tuning data used in the paper for language model development and experimentation.
Overview
The FireAct project is structured to facilitate the development and fine-tuning of language agents. Key components include:
- Tools: Defined in the `tools/` directory, these implement the functionalities available to agents.
- Tasks: Outlined in `tasks/`, these are the objectives or questions to be addressed by the language model.
- Data Collection & Experimentation: Managed through `generation.py`, this step gathers data and runs experiments. The outcomes are stored in `trajs/`.
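The layout implied by these components can be sketched as follows (an illustrative sketch based on the directories named in this document; the actual repository may contain additional files):

```
FireAct/
├── tools/           # tool implementations used by agents
├── tasks/           # task definitions and target questions
├── prompts/         # prompts for data generation and experiments
├── data/            # datasets and generated fine-tuning data
├── trajs/           # trajectories produced by generation.py
├── finetune/        # fine-tuning code (e.g., llama_lora)
├── generation.py    # data collection and experiment driver
└── requirements.txt
```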
Data & Prompts
Data management and prompt setup are central to the FireAct project. The `data/` directory contains datasets used to generate training data, with examples formatted for both Alpaca and GPT models. The `prompts/` directory holds the prompts used to generate training data and to run experiments.
Setup
To get started with FireAct, obtain API keys for OpenAI and a SERP (search) service and export them as environment variables. Setting up the project involves:
- Creating a virtual environment using Conda.
- Cloning the repository from GitHub.
- Installing the dependencies listed in `requirements.txt`.
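The steps above can be sketched as a shell session. The environment name, Python version, repository URL placeholder, and environment-variable names below are assumptions based on common conventions; check the repository README for the exact values.

```shell
# Create and activate a Conda virtual environment
# (name and Python version are illustrative)
conda create -n fireact python=3.9 -y
conda activate fireact

# Clone the repository and install its dependencies
git clone https://github.com/<YOUR_GITHUB_ORG>/FireAct.git
cd FireAct
pip install -r requirements.txt

# Export API keys for OpenAI and the SERP service
# (variable names assumed; confirm against the repository README)
export OPENAI_API_KEY=<YOUR_OPENAI_KEY>
export SERPAPI_API_KEY=<YOUR_SERP_KEY>
```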
Running the Demo
Data Generation
Users can run data generation with:

```shell
python generation.py --task hotpotqa --backend gpt-4 --promptpath default --evaluate --random --task_split val --temperature 0 --task_end_index 5
```

To collect a substantial number of samples, set a higher `--task_end_index`. The resulting data must then be converted into a format suitable for training.
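As a rough sketch of that conversion step, the snippet below maps a hypothetical trajectory onto a record in the standard Alpaca instruction format. The trajectory fields and the helper function are illustrative assumptions; the repository's actual conversion scripts and schema may differ.

```python
import json

# A hypothetical trajectory, standing in for one entry saved under trajs/
# (these field names are illustrative, not the repo's exact schema).
trajectory = {
    "question": "What is the capital of France?",
    "trace": "Thought: I should search for France.\n"
             "Action: search[France]\n"
             "Observation: France's capital is Paris.\n"
             "Action: finish[Paris]",
}

def to_alpaca_record(traj):
    """Map one trajectory onto the standard Alpaca fields:
    instruction / input / output."""
    return {
        "instruction": traj["question"],
        "input": "",
        "output": traj["trace"],
    }

record = to_alpaca_record(trajectory)
print(json.dumps(record, indent=2))
```

A list of such records, dumped as JSON, matches the shape expected by Alpaca-style fine-tuning scripts.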
Supervised Fine-tuning
The fine-tuning process for language models is illustrated with:

```shell
cd finetune/llama_lora
python finetune.py --base_model meta-llama/Llama-2-13b-chat-hf --data_path ../../data/finetune/alpaca_format/hotpotqa.json --micro_batch_size 8 --num_epochs 30 --output_dir ../../models/lora/fireact-llama-2-13b --val_set_size 0.01 --cutoff_len 512
```
Inference
Inference can be run using the following examples:

- For FireAct Llama:

```shell
python generation.py --task hotpotqa --backend llama --evaluate --random --task_split dev --task_end_index 5 --modelpath meta-llama/Llama-2-7b-chat --add_lora --alpaca_format --peftpath forestai/fireact_llama_2_7b_lora
```

- For FireAct GPT:

```shell
python generation.py --task hotpotqa --backend ft:gpt-3.5-turbo-0613:<YOUR_MODEL> --evaluate --random --task_split dev --temperature 0 --chatgpt_format --task_end_index 5
```
Model Zoo
FireAct offers a variety of multitask models based on the Llama family, available on Hugging Face:
- Llama2-7B: Available as a LoRA fine-tuned model and full model.
- Llama2-13B: Also available as a LoRA fine-tuned model.
- CodeLlama series (7B, 13B, 34B): All fine-tuned using the LoRA method.
Together these span multiple model sizes and both LoRA and full fine-tuning, suitable for a range of language-agent tasks.
References
FireAct builds on prior codebases, including ReAct, Stanford Alpaca, and open-source LoRA implementations, which underpin its fine-tuning pipeline.