Qwen-Agent - Improving LLM Applications with Instruction Following and Tool Usage

Introduction to Qwen-Agent

Qwen-Agent is a framework designed to develop large language model (LLM) applications with a focus on instruction following, the use of tools, planning, and memory capabilities. It is built around the features of Qwen and includes various example applications like Browser Assistant, Code Interpreter, and Custom Assistant.

Recent News

On September 18, 2024, a new feature called Qwen2.5-Math Demo was introduced. This demo showcases the tool-integrated reasoning capabilities of Qwen2.5-Math. It's important to note that the Python executor included is not sandboxed and is intended only for local testing purposes, not for use in production environments.

Getting Started

Installation

To install Qwen-Agent, users have the option of downloading the stable version from PyPI. For those interested in additional features, such as GUI support, code interpretation, or using Qwen2.5-Math for tool-integrated reasoning, they can include optional requirements. Alternatively, users can obtain the latest development version directly from the source by cloning the repository.

Model Service Preparation

Users can either take advantage of the model service offered by Alibaba Cloud's DashScope or deploy their model service using open-source Qwen models. Using DashScope requires setting an environment variable with the user's unique API key. Alternatively, for those preferring a customized setup, instructions are available for deploying an OpenAI-compatible API service.

Developing Your Own Agent

Qwen-Agent provides multiple components for users to create their agents. There are atomic components like LLMs, which inherit from the BaseChatModel, and tools, which derive from BaseTool. Users can combine these into higher-level components, such as agents derived from the Agent class.

Below is a step-by-step guide for creating an agent that can read PDF files and utilize tools, along with adding a custom tool for image generation.

Adding a Custom Tool - Implementing a tool named my_image_gen allows for image generation based on textual descriptions.
Configuring the LLM - Users can opt for the model service from DashScope or another OpenAI-compatible model service.
Creating an Agent - An example of using the Assistant agent, which can read files and use tools, is demonstrated via a script.
Running the Agent - Once set up, the agent can function as a chatbot, processing user queries and executing specified operations.

Example Applications and Features

Qwen-Agent includes built-in agent implementations like the Assistant class, but users can also develop their agents by inheriting from the Agent class. The project’s examples directory offers additional usage scenarios.

FAQs

Agent Functionality: Qwen-Agent supports function calling through LLM classes and some Agent classes, which enhance tool usage.

Handling Long Documents: To manage question answering over extremely long documents, Qwen-Agent has released advanced solutions that outperform native models on certain benchmarks.

BrowserQwen: A Special Application

BrowserQwen is a browser assistant that is based on Qwen-Agent. Users seeking further details can access its dedicated documentation.

Disclaimer

The code interpreter within Qwen-Agent is not sandboxed, meaning it operates in the user's local environment. Users are cautioned against running risky tasks and advised against using the interpreter in production situations.