Reflexion: Language Agents with Verbal Reinforcement Learning
Reflexion is an innovative project that explores how language agents can improve their problem-solving abilities through verbal reinforcement learning. Presented by a team of researchers including Noah Shinn, Federico Cassano, Edward Berman, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao at NeurIPS 2023, this project provides insights into the mechanisms by which AI systems can learn from their previous attempts using language as a guide.
Project Overview
Reflexion focuses on two primary types of tasks: reasoning and decision-making. It employs language agents equipped with different strategies that allow them to reflect, assess, and enhance their previous performances.
Reasoning Tasks (HotPotQA)
In the reasoning segment of the project, the agents tackle questions from the HotPotQA distractor dataset. Each experiment uses a random sample of 100 questions, with agents of various types employing different reflexion strategies. Here's how the process works:
- Agent Types: There are three main types of agents, each with unique characteristics:
  - `ReAct`: a reactive agent that interleaves reasoning and actions.
  - `CoT_context`: a chain-of-thought agent given supporting context about the question.
  - `CoT_no_context`: a chain-of-thought agent operating without any supporting context.
- Reflexion Strategies: Agents use different strategies to reflect on their attempts:
  - `ReflexionStrategy.NONE`: the agent receives no feedback about its last attempt.
  - `ReflexionStrategy.LAST_ATTEMPT`: the reasoning trace from the last attempt is provided as context.
  - `ReflexionStrategy.REFLEXION`: the agent reflects on its last attempt and uses this reflection as context.
  - `ReflexionStrategy.LAST_ATTEMPT_AND_REFLEXION`: both the reasoning trace and the self-reflection are provided for enhanced understanding.
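The strategies above can be sketched as a small enum plus a context-assembly helper. This is a minimal illustration, not the repository's actual implementation; the member values and the `build_context` function are assumptions for clarity.

```python
from enum import Enum


class ReflexionStrategy(Enum):
    # Names mirror the strategies described above; the repo's own
    # definitions may differ in detail.
    NONE = "none"
    LAST_ATTEMPT = "last_attempt"
    REFLEXION = "reflexion"
    LAST_ATTEMPT_AND_REFLEXION = "last_attempt_and_reflexion"


def build_context(strategy, last_trace, reflection):
    """Assemble the context the agent sees on its next attempt."""
    if strategy is ReflexionStrategy.NONE:
        return ""  # no feedback from the previous attempt
    if strategy is ReflexionStrategy.LAST_ATTEMPT:
        return last_trace  # raw reasoning trace only
    if strategy is ReflexionStrategy.REFLEXION:
        return reflection  # self-reflection only
    # LAST_ATTEMPT_AND_REFLEXION: combine both sources of feedback
    return last_trace + "\n" + reflection
```

On each new trial, the runner would call `build_context` with the previous trial's trace and reflection and prepend the result to the agent's prompt.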
Decision-Making Tasks (AlfWorld)
For the decision-making tasks, agents operate in a simulated environment. This part of Reflexion is designed to test how agents learn through iterative interaction with their environment. Setup involves defining several run parameters, such as the number of learning steps and task-environment pairs. Users can also choose to use persistent memory storage for self-reflections, which can be turned off for baseline runs.
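The trial loop described above can be sketched as follows. This is a toy sketch under stated assumptions: `attempt` stands in for a real LLM-driven episode in the environment, and the parameter names (`num_trials`, `use_memory`) are illustrative rather than the repo's exact flags. It does show the key mechanism: solved environments are skipped, and failed attempts append a self-reflection to persistent memory unless memory is disabled for a baseline run.

```python
def attempt(env, reflections):
    """Placeholder for one agent episode; returns (success, reflection).

    This toy environment "succeeds" once enough reflections have
    accumulated, standing in for a real agent rollout."""
    success = len(reflections) >= env["difficulty"]
    reflection = f"failed with {len(reflections)} prior reflections"
    return success, reflection


def run_experiment(envs, num_trials, use_memory=True):
    # memory maps each environment to its accumulated self-reflections
    memory = {i: [] for i in range(len(envs))}
    solved = set()
    for _ in range(num_trials):
        for i, env in enumerate(envs):
            if i in solved:
                continue  # already-solved pairs are not retried
            reflections = memory[i] if use_memory else []
            ok, reflection = attempt(env, reflections)
            if ok:
                solved.add(i)
            elif use_memory:
                memory[i].append(reflection)  # learn for the next trial
    return solved


envs = [{"difficulty": 0}, {"difficulty": 2}]
# With memory on, both toy environments are solved within 3 trials;
# with memory off, the harder one never accumulates reflections.
print(run_experiment(envs, num_trials=3))
print(run_experiment(envs, num_trials=3, use_memory=False))
```

Turning `use_memory` off reproduces the no-reflection baseline mentioned above: the agent retries each task from scratch every trial.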
Important Notes
Due to resource constraints, re-running these experiments may be impractical for individual developers. The project therefore provides logged results from previous runs for those who want to explore the outcomes without heavy computational requirements. This data is available in designated directories for the reasoning, decision-making, and additional programming tasks.
Additional Resources
Reflexion is thoroughly documented, with links to the codebase and further reading available:
- Explore the original code here
- Read more insights on the related blog post here
- For further questions, contact Noah Shinn via email at [email protected].
This project exemplifies how language can serve as a tool for AI learning, providing a new paradigm for enhancing machine intelligence through self-assessment and reflection.