Introduction to CodeAct Project
The CodeAct project introduces an approach that enhances large language model (LLM) agents by using executable Python code as their action space, known as CodeAct. Rather than emitting constrained text or JSON tool calls, a CodeAct agent writes code, unifying its actions in a single, cohesive framework. A Python interpreter executes each action, and the agent can adapt and revise subsequent actions based on real-time observations, such as code execution results and error messages.
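The execute-and-observe loop described above can be sketched in a few lines. This is a minimal illustration, not the project's actual implementation: the function name is hypothetical, and a real deployment would run the code in a sandbox rather than in-process.

```python
import contextlib
import io


def execute_code_action(code: str, namespace: dict) -> str:
    """Run a model-emitted code action and return its output as an observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # a real system would sandbox this call
    except Exception as exc:
        # Errors are returned as observations so the agent can self-debug.
        return f"Error: {exc}"
    return buffer.getvalue()


# The agent loop alternates model turns with interpreter observations;
# the namespace persists so later actions can reuse earlier variables.
namespace = {}
observation = execute_code_action("x = 2 + 3\nprint(x)", namespace)
```

Feeding the observation back into the conversation is what lets the agent revise a failed action in its next turn.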
Latest Developments
The team announced several updates in 2024:
- April 10, 2024: CodeActAgent Mistral released on the ollama platform, making it officially available there.
- March 11, 2024: Added support for running CodeActAgent on laptops with llama.cpp, tested on macOS.
- February 2, 2024: The official release of CodeAct, marking a significant milestone in the project's journey.
Why Choose CodeAct?
Extensive research involving 17 LLMs demonstrated that CodeAct consistently outperforms traditional methods such as Text and JSON, achieving up to a 20% higher success rate on certain benchmarks. Results from API-Bank and the new M³ToolEval benchmark highlight its superior performance.
Figure: Comparison of CodeAct against Text- and JSON-based action formats.
CodeActInstruct Dataset
The CodeActInstruct dataset is a pivotal part of the project, consisting of 7,000 multi-turn interactions specifically tailored to enhance CodeAct. This dataset is publicly accessible for further research and development via the Hugging Face platform.
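To give a feel for the data, here is a sketch of one multi-turn interaction and a small helper over it. The record shape and the `<execute>` action markup are illustrative assumptions; consult the dataset card on Hugging Face for the actual schema.

```python
# Hypothetical shape of one CodeActInstruct interaction (illustrative only).
record = {
    "conversations": [
        {"role": "user", "content": "Compute the mean of [1, 2, 3]."},
        {"role": "assistant", "content": "<execute>print(sum([1, 2, 3]) / 3)</execute>"},
        {"role": "user", "content": "Observation: 2.0"},
        {"role": "assistant", "content": "The mean is 2.0."},
    ]
}


def count_code_actions(record: dict) -> int:
    """Count assistant turns that contain an executable code action."""
    return sum(
        1
        for turn in record["conversations"]
        if turn["role"] == "assistant" and "<execute>" in turn["content"]
    )
```

Multi-turn records like this interleave code actions with interpreter observations, which is what the 7,000 interactions are designed to teach.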
Figure: Overview of the dataset collected for CodeAct.
CodeActAgent Models
The CodeActAgent, trained on a mixture of the CodeActInstruct dataset and general dialogues, demonstrates strong capability on out-of-domain agent tasks while maintaining solid general performance. Two main variants are available:
- CodeActAgent-Mistral-7b-v0.1: Recommended for its use of the Mistral-7b-v0.1 model with an extended 32k context window.
- CodeActAgent-Llama-7b: Utilizes the Llama-2-7b model with a 4k context window.
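The two variants differ mainly in context window, which matters for long multi-turn sessions. The sketch below encodes the sizes stated above; the reservation parameter and function name are illustrative assumptions.

```python
# Context-window sizes for the two released variants (32k and 4k, as stated above).
CONTEXT_WINDOWS = {
    "CodeActAgent-Mistral-7b-v0.1": 32 * 1024,
    "CodeActAgent-Llama-7b": 4 * 1024,
}


def max_prompt_tokens(model_name: str, reserved_for_output: int = 512) -> int:
    """Tokens available for the prompt after reserving room for generation."""
    return CONTEXT_WINDOWS[model_name] - reserved_for_output
```

With a 4k window, long interaction histories must be truncated or summarized much sooner, which is why the Mistral variant is the recommended choice.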
Figure: Evaluation results for the different variants of CodeActAgent.
Implementing CodeActAgent
The CodeActAgent system consists of several components that users can combine when building their own applications:
- LLM Serving: Exposes the model behind an OpenAI-compatible API; it can be run with vLLM or similar serving software.
- User Interaction Interface: Options include a Chat-UI with MongoDB for historical data logging or a straightforward Python script for quick interactions.
- Code Execution Engine: This tool allows code execution requests to be processed during chat sessions, leveraging a Docker container setup.
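A minimal stand-in for the code execution engine can be sketched with a subprocess and a timeout. This is an assumption-laden simplification: the actual engine runs requests inside a Docker container for isolation, which a bare subprocess does not provide.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout: float = 5.0) -> str:
    """Execute a code snippet in a separate interpreter process.

    A subprocess with a timeout is only a minimal stand-in for the
    Docker-based isolation used by the real execution engine.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "Error: execution timed out"
    if result.returncode != 0:
        return f"Error: {result.stderr.strip()}"
    return result.stdout
```

Running each request in a fresh process caps runtime and keeps a crashing snippet from taking down the chat service, mirroring (weakly) the isolation the container setup provides.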
Users with access to a Kubernetes cluster can automate the deployment of all these components efficiently. For those looking to get started without Kubernetes, Docker provides a streamlined alternative.
Reproduce Experiments
For enthusiasts and researchers interested in reproducing the experiments conducted during the development of CodeAct, detailed instructions for data generation, model training, and evaluation are available.
By harnessing the power of executable code in language models, the CodeAct project sets a new standard for LLM capabilities, providing robust solutions for a wide array of tasks and applications.