Agent-FLAN: Enhancing Language Models for Agent Tasks
Introduction
Agent-FLAN is a project focused on improving the ability of large language models (LLMs) to function effectively as agents. Although open-sourced LLMs excel at a wide range of natural language processing (NLP) tasks, their performance as agents still lags behind API-based models. The challenge is to integrate agent abilities into general-purpose LLMs. The project identifies several key insights: agent training data is complex and differs significantly from the models' original pre-training distribution; LLMs learn different agent tasks at different speeds; and current tuning methods introduce side effects such as hallucination. Agent-FLAN addresses these issues by redesigning the training data and process, enabling Llama2-7B, the model underpinning this project, to outperform prior methods by 3.5% across a range of agent evaluations. It also reduces hallucination and improves agent capabilities while preserving the models' general proficiency.
What's New
- March 21, 2024: The project paper was published on arXiv.
- March 20, 2024: The dataset and model checkpoint for Agent-FLAN were released.
Agent-FLAN Series
The Agent-FLAN models are fine-tuned on the AgentInstruct and ToolBench datasets via a carefully designed data generation pipeline. This equips the models with strong capabilities across a wide range of agent tasks and tool-use operations.
Model and Dataset Availability
Agent-FLAN is available on platforms such as HuggingFace and OpenXLab. Training uses a mix of the AgentInstruct, ToolBench, and ShareGPT datasets, formatted in the multi-turn conversation style of the Llama-2-chat models. This flexible, structured format makes training LLMs for agent tasks more efficient; a hypothetical record in this style is sketched below.
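To make the conversational format concrete, here is a minimal sketch of what a multi-turn training record in a Llama-2-chat style might look like. The field names (`conversation`, `system`, `input`, `output`) and the tool-call content are illustrative assumptions, not the dataset's confirmed schema:

```python
# Hypothetical multi-turn training record in a Llama-2-chat style format.
# Field names and contents are illustrative assumptions, not the actual schema.
record = {
    "conversation": [
        {
            "system": "You are an agent that can invoke external tools.",
            "input": "What is the weather in Paris today?",
            "output": (
                "Thought: I should call the weather tool.\n"
                'Action: get_weather\nAction Input: {"city": "Paris"}'
            ),
        },
        {
            "input": "Observation: 18 C, partly cloudy.",
            "output": "It is currently 18 C and partly cloudy in Paris.",
        },
    ]
}
print(record["conversation"][0]["output"])
```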
Model Resources
- Agent-FLAN-7B: Available on HuggingFace and OpenXLab; a minimal loading sketch is shown below.
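The snippet below sketches how the checkpoint could be loaded with the HuggingFace transformers library. The repository ID `internlm/Agent-FLAN-7b` is an assumption for illustration; consult the actual model card for the correct ID:

```python
# Minimal sketch: load Agent-FLAN-7B with HuggingFace transformers and
# run a single generation. Requires `pip install transformers accelerate`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "internlm/Agent-FLAN-7b"  # assumed repo ID; verify on the model card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "You are an agent that can use tools. List the steps to book a flight."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```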
Dataset Resources
- Agent-FLAN Dataset: Available on HuggingFace.
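A minimal sketch for pulling the dataset with the HuggingFace datasets library follows; the dataset ID `internlm/Agent-FLAN` and the `train` split name are assumptions, so verify them on the dataset card:

```python
# Minimal sketch: load the Agent-FLAN dataset from the HuggingFace Hub.
# Requires `pip install datasets`.
from datasets import load_dataset

dataset = load_dataset("internlm/Agent-FLAN")  # assumed dataset ID
print(dataset)               # inspect the available splits and features
print(dataset["train"][0])   # peek at one raw record (split name assumed)
```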
Detailed Results
Agent-FLAN significantly outperforms previous agent-tuning approaches in evaluations on both held-in (seen during training) and held-out (novel) tasks. Scores are normalized against GPT-4's performance for a clearer comparison, marking a notable step toward closing the gap between open-source and API-based models; an illustrative normalization sketch follows.
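For intuition, normalizing against GPT-4 amounts to reporting each score as a percentage of GPT-4's score on the same task. The numbers below are made up for illustration and are not results from the paper:

```python
# Illustrative GPT-4 normalization: report each raw task score as a
# percentage of GPT-4's score on that task. All numbers are hypothetical.
raw_scores = {"held_in_task": 62.0, "held_out_task": 48.5}
gpt4_scores = {"held_in_task": 80.0, "held_out_task": 70.0}

normalized = {
    task: 100.0 * score / gpt4_scores[task]
    for task, score in raw_scores.items()
}
print(normalized)  # {'held_in_task': 77.5, 'held_out_task': 69.28571428571429}
```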
Acknowledgements
Agent-FLAN builds on the work of projects such as Lagent and T-Eval, whose contributions underpin its development.
Citation
If you use Agent-FLAN in your research, please cite the project's arXiv paper to credit its developers and contributors.
License
Agent-FLAN is released under the Apache 2.0 license, supporting open access and further innovation in LLM agent capabilities.