Introduction to CodeAct Project
The CodeAct project introduces an approach that enhances large language model (LLM) agents by using executable Python code as their action space, known as CodeAct. Rather than emitting constrained text or JSON tool calls, a CodeAct agent writes code, unifying its actions in a single, cohesive framework. A Python interpreter executes each action, and the agent can adapt and revise subsequent actions based on real-time observations, such as code execution results and error messages.
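The execute-and-observe loop described above can be sketched in a few lines. This is a minimal illustration, not the project's actual implementation: the function name is hypothetical, and a real deployment would run the code in a sandbox rather than in-process.

```python
import contextlib
import io


def execute_code_action(code: str, namespace: dict) -> str:
    """Run a model-emitted code action and return its output as an observation."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)  # a real system would sandbox this call
    except Exception as exc:
        # Errors are returned as observations so the agent can self-debug.
        return f"Error: {exc}"
    return buffer.getvalue()


# The agent loop alternates model turns with interpreter observations;
# the namespace persists so later actions can reuse earlier variables.
namespace = {}
observation = execute_code_action("x = 2 + 3\nprint(x)", namespace)
```

Feeding the observation back into the conversation is what lets the agent revise a failed action in its next turn.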
Latest Developments
The team announced several updates in 2024:
- April 10, 2024: CodeActAgent Mistral released on the ollama platform, making it officially available there.
- March 11, 2024: Added support for running CodeActAgent on laptops with llama.cpp, tested on macOS.
- February 2, 2024: The official release of CodeAct, marking a significant milestone in the project's journey.
Why Choose CodeAct?
Extensive research involving 17 LLMs demonstrated that CodeAct consistently outperforms traditional methods such as Text and JSON, achieving up to a 20% higher success rate on certain benchmarks. Results from API-Bank and the new M³ToolEval benchmark highlight its superior performance.
Figure: Comparison of CodeAct against Text- and JSON-based action formats.
CodeActInstruct Dataset
The CodeActInstruct dataset is a pivotal part of the project, consisting of 7,000 multi-turn interactions specifically tailored to enhance CodeAct. This dataset is publicly accessible for further research and development via the Hugging Face platform.
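To give a feel for the data, here is a sketch of one multi-turn interaction and a small helper over it. The record shape and the `<execute>` action markup are illustrative assumptions; consult the dataset card on Hugging Face for the actual schema.

```python
# Hypothetical shape of one CodeActInstruct interaction (illustrative only).
record = {
    "conversations": [
        {"role": "user", "content": "Compute the mean of [1, 2, 3]."},
        {"role": "assistant", "content": "<execute>print(sum([1, 2, 3]) / 3)</execute>"},
        {"role": "user", "content": "Observation: 2.0"},
        {"role": "assistant", "content": "The mean is 2.0."},
    ]
}


def count_code_actions(record: dict) -> int:
    """Count assistant turns that contain an executable code action."""
    return sum(
        1
        for turn in record["conversations"]
        if turn["role"] == "assistant" and "<execute>" in turn["content"]
    )
```

Multi-turn records like this interleave code actions with interpreter observations, which is what the 7,000 interactions are designed to teach.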
Figure: Overview of the dataset collected for CodeAct.
CodeActAgent Models
The CodeActAgent, trained on a mixture of the CodeActInstruct dataset and general dialogues, demonstrates strong capability on out-of-domain agent tasks while maintaining solid general performance. Two main variants are available:
- CodeActAgent-Mistral-7b-v0.1: Recommended for its use of the Mistral-7b-v0.1 model with an extended 32k context window.
- CodeActAgent-Llama-7b: Utilizes the Llama-2-7b model with a 4k context window.
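The two variants differ mainly in context window, which matters for long multi-turn sessions. The sketch below encodes the sizes stated above; the reservation parameter and function name are illustrative assumptions.

```python
# Context-window sizes for the two released variants (32k and 4k, as stated above).
CONTEXT_WINDOWS = {
    "CodeActAgent-Mistral-7b-v0.1": 32 * 1024,
    "CodeActAgent-Llama-7b": 4 * 1024,
}


def max_prompt_tokens(model_name: str, reserved_for_output: int = 512) -> int:
    """Tokens available for the prompt after reserving room for generation."""
    return CONTEXT_WINDOWS[model_name] - reserved_for_output
```

With a 4k window, long interaction histories must be truncated or summarized much sooner, which is why the Mistral variant is the recommended choice.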
Figure: Evaluation results for the different variants of CodeActAgent.
Implementing CodeActAgent
The CodeActAgent system consists of several components that users can combine when building their own applications:
- LLM Serving: Exposes the model behind an OpenAI-compatible API; it can be run with vLLM or similar serving software.
- User Interaction Interface: Options include a Chat-UI with MongoDB for historical data logging or a straightforward Python script for quick interactions.
- Code Execution Engine: This tool allows code execution requests to be processed during chat sessions, leveraging a Docker container setup.
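A minimal stand-in for the code execution engine can be sketched with a subprocess and a timeout. This is an assumption-laden simplification: the actual engine runs requests inside a Docker container for isolation, which a bare subprocess does not provide.

```python
import subprocess
import sys


def run_in_subprocess(code: str, timeout: float = 5.0) -> str:
    """Execute a code snippet in a separate interpreter process.

    A subprocess with a timeout is only a minimal stand-in for the
    Docker-based isolation used by the real execution engine.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "Error: execution timed out"
    if result.returncode != 0:
        return f"Error: {result.stderr.strip()}"
    return result.stdout
```

Running each request in a fresh process caps runtime and keeps a crashing snippet from taking down the chat service, mirroring (weakly) the isolation the container setup provides.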
Users with access to a Kubernetes cluster can automate the deployment of all these components efficiently. For those looking to get started without Kubernetes, Docker provides a streamlined alternative.
Reproduce Experiments
For enthusiasts and researchers interested in reproducing the experiments conducted during the development of CodeAct, detailed instructions for data generation, model training, and evaluation are available.
By harnessing the power of executable code in language models, the CodeAct project sets a new standard for LLM capabilities, providing robust solutions for a wide array of tasks and applications.