Introduction to Qwen-Agent
Qwen-Agent is a framework for developing large language model (LLM) applications that build on Qwen's instruction following, tool use, planning, and memory capabilities. It includes example applications such as Browser Assistant, Code Interpreter, and Custom Assistant.
Recent News
On September 18, 2024, the Qwen2.5-Math Demo was added to showcase the tool-integrated reasoning capabilities of Qwen2.5-Math. Note that its Python executor is not sandboxed; it is intended for local testing only, not for production use.
Getting Started
Installation
The stable version of Qwen-Agent can be installed from PyPI, where the package is named qwen-agent. Optional requirements can be included at install time for additional features such as GUI support, code interpretation, or tool-integrated reasoning with Qwen2.5-Math. Alternatively, the latest development version can be installed from source by cloning the repository.
Model Service Preparation
Users can either use the model service offered by Alibaba Cloud's DashScope or deploy their own model service using open-source Qwen models. Using DashScope requires setting the DASHSCOPE_API_KEY environment variable to the user's API key. For a self-hosted setup, instructions are available for deploying an OpenAI-compatible API service.
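Both options come down to a small configuration dictionary that is later handed to an LLM or agent. The following is a minimal sketch, assuming the dictionary keys used in the project's README (model, model_server, api_key, generate_cfg). The helper name build_llm_cfg, the model names, and the local URL are illustrative placeholders, not part of the library.

```python
import os
from typing import Optional


def build_llm_cfg(openai_compatible_url: Optional[str] = None) -> dict:
    """Return an LLM config dict for DashScope or an OpenAI-compatible server.

    Hypothetical helper for illustration; only the dict keys are meant to
    mirror what Qwen-Agent reads.
    """
    if openai_compatible_url:
        # Self-hosted service (for example vLLM or Ollama) exposing an
        # OpenAI-compatible API.
        return {
            'model': 'Qwen2.5-7B-Instruct',  # placeholder: the name your server exposes
            'model_server': openai_compatible_url,  # base URL of the service
            'api_key': 'EMPTY',  # local servers typically ignore the key
        }

    # DashScope: pass the key explicitly only if the environment provides one.
    cfg = {
        'model': 'qwen-max',  # placeholder model name
        'model_server': 'dashscope',
        'generate_cfg': {'top_p': 0.8},  # optional generation settings
    }
    api_key = os.getenv('DASHSCOPE_API_KEY')
    if api_key:
        cfg['api_key'] = api_key
    return cfg
```

Calling build_llm_cfg() with no argument selects DashScope; passing a base URL such as http://localhost:8000/v1 selects the self-hosted service instead.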
Developing Your Own Agent
Qwen-Agent provides multiple components for users to build their own agents. There are atomic components such as LLMs, which inherit from the BaseChatModel class, and tools, which derive from BaseTool. Users can combine these into higher-level components, such as agents derived from the Agent class.
Below is a step-by-step guide for creating an agent that can read PDF files and utilize tools, along with adding a custom tool for image generation.
- Adding a Custom Tool: Implementing a tool named my_image_gen allows for image generation based on textual descriptions.
- Configuring the LLM: Users can opt for the model service from DashScope or another OpenAI-compatible model service.
- Creating an Agent: An example of using the Assistant agent, which can read files and use tools, is demonstrated via a script.
- Running the Agent: Once set up, the agent can function as a chatbot, processing user queries and executing specified operations.
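The four steps above can be sketched in one short script. This is a hedged outline modeled on the project's README example, not a definitive implementation: the image URL pattern, the system message, and the helper names image_url_for, build_assistant, and chat are assumptions made for illustration. The qwen_agent imports are deferred into build_assistant so that the tool logic can be tried without the package installed.

```python
import json
import urllib.parse


def image_url_for(params: str) -> str:
    """Core logic of the custom tool: JSON arguments in, JSON result out."""
    prompt = json.loads(params)['prompt']
    encoded = urllib.parse.quote(prompt)
    # Assumed URL pattern of a public image generation service.
    return json.dumps({'image_url': f'https://image.pollinations.ai/prompt/{encoded}'})


def build_assistant(llm_cfg: dict, files: list):
    """Register my_image_gen and create an Assistant. Call once per process."""
    # Deferred imports: these require qwen-agent to be installed.
    from qwen_agent.agents import Assistant
    from qwen_agent.tools.base import BaseTool, register_tool

    @register_tool('my_image_gen')
    class MyImageGen(BaseTool):
        # The description and parameters tell the model when and how to call the tool.
        description = 'Image generation service: takes a text description and returns an image URL.'
        parameters = [{
            'name': 'prompt',
            'type': 'string',
            'description': 'Detailed description of the desired image, in English',
            'required': True,
        }]

        def call(self, params: str, **kwargs) -> str:
            return image_url_for(params)

    return Assistant(
        llm=llm_cfg,
        system_message='Draw the requested image first, then process it with code.',
        function_list=['my_image_gen', 'code_interpreter'],
        files=files,  # documents the agent may read, such as a PDF
    )


def chat(bot) -> None:
    """Run the agent as a simple command-line chatbot."""
    messages = []  # full chat history
    while True:
        query = input('user query: ')
        messages.append({'role': 'user', 'content': query})
        response = []
        for response in bot.run(messages=messages):
            pass  # streaming: each iteration holds the response so far
        print(response)
        messages.extend(response)  # keep the reply in the history
```

A session would start with something like chat(build_assistant(llm_cfg, ['./doc.pdf'])), where llm_cfg is the LLM configuration dictionary and the PDF path is a placeholder. Because the built-in code_interpreter tool runs code locally without a sandbox, this is suitable for local experiments only.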
Example Applications and Features
Qwen-Agent includes built-in agent implementations such as the Assistant class, but users can also develop their own agents by inheriting from the Agent class. The project's examples directory offers additional usage scenarios.
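As an illustration of inheriting from the Agent class, the sketch below defines an agent whose only custom behavior is trimming the chat history before calling the model. It assumes that subclasses implement a _run method and can delegate to the base class's _call_llm helper, as in the project's agent documentation. The names trim_history, make_short_memory_agent, and ShortMemoryAgent are hypothetical.

```python
from typing import Iterator, List


def trim_history(messages: List, keep_last: int = 6) -> List:
    """Keep all system messages plus the last `keep_last` other messages."""
    system = [m for m in messages if m['role'] == 'system']
    others = [m for m in messages if m['role'] != 'system']
    recent = others[-keep_last:] if keep_last > 0 else []
    return system + recent


def make_short_memory_agent(llm_cfg: dict, keep_last: int = 6):
    """Build an agent that only shows the model the most recent messages."""
    from qwen_agent import Agent  # deferred: requires qwen-agent to be installed

    class ShortMemoryAgent(Agent):

        def _run(self, messages, lang: str = 'en', **kwargs) -> Iterator[List]:
            # Custom workflow step: shrink the history, then delegate to the LLM.
            return self._call_llm(messages=trim_history(messages, keep_last))

    return ShortMemoryAgent(llm=llm_cfg)
```

The resulting object is used like the built-in agents: iterate over agent.run(messages=...) to receive the streamed response.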
FAQs
Agent Functionality: Qwen-Agent supports function calling through LLM classes and some Agent classes, which enhance tool usage.
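A rough sketch of function calling at the LLM level follows. It assumes the get_chat_model factory and the functions argument of chat, as used in the project's function calling example. The weather function, its canned data, and the helper names run_function_call and ask are hypothetical.

```python
import json


def get_current_weather(location: str, unit: str = 'celsius') -> str:
    """Hypothetical local function the model may ask for; returns canned data."""
    return json.dumps({'location': location, 'temperature': '22', 'unit': unit})


# JSON-schema style description that is passed to the model.
FUNCTIONS = [{
    'name': 'get_current_weather',
    'description': 'Get the current weather in a given location',
    'parameters': {
        'type': 'object',
        'properties': {
            'location': {'type': 'string', 'description': 'City name, e.g. San Francisco'},
            'unit': {'type': 'string', 'enum': ['celsius', 'fahrenheit']},
        },
        'required': ['location'],
    },
}]

AVAILABLE = {'get_current_weather': get_current_weather}


def run_function_call(assistant_message: dict):
    """Execute the call requested by the model; return a 'function' message or None."""
    call = assistant_message.get('function_call')
    if not call:
        return None
    args = json.loads(call['arguments'])  # arguments arrive as a JSON string
    result = AVAILABLE[call['name']](**args)
    return {'role': 'function', 'name': call['name'], 'content': result}


def ask(llm_cfg: dict, question: str) -> list:
    """One round trip: the model requests a function, we run it, the model answers."""
    from qwen_agent.llm import get_chat_model  # deferred: requires qwen-agent

    llm = get_chat_model(llm_cfg)
    messages = [{'role': 'user', 'content': question}]
    responses = []
    for responses in llm.chat(messages=messages, functions=FUNCTIONS, stream=True):
        pass  # keep only the final streamed state
    messages.extend(responses)

    tool_message = run_function_call(messages[-1])
    if tool_message:
        messages.append(tool_message)
        responses = []
        for responses in llm.chat(messages=messages, functions=FUNCTIONS, stream=True):
            pass
        messages.extend(responses)
    return messages
```

Agents that are built on function calling automate this loop, which is why registering a tool with an agent is usually enough in practice.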
Handling Long Documents: For question answering over extremely long documents, the project has released solutions that outperform native long-context models on certain benchmarks.
BrowserQwen: A Special Application
BrowserQwen is a browser assistant that is based on Qwen-Agent. Users seeking further details can access its dedicated documentation.
Disclaimer
The code interpreter within Qwen-Agent is not sandboxed, meaning it operates in the user's local environment. Users are cautioned against running risky tasks and advised against using the interpreter in production situations.