
#NetHack

Explore how Motif uses a Large Language Model (LLM) to define reward functions for training AI agents in NetHack. The method has three phases: dataset annotation, reward training, and reinforcement learning. Together, these phases turn the LLM's preferences into intrinsic motivation for the agent. Discover intuitive, human-aligned agent behaviors that can be steered with customizable prompts, and gain insight into how Motif derives intrinsic rewards from feedback in reinforcement learning.
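The three phases can be illustrated with a toy sketch. This is not Motif's actual implementation; it is a minimal example under stated assumptions. The `annotate` function is a hypothetical stand-in for the LLM, the two-feature observations are invented for illustration, the linear reward model with a Bradley-Terry (logistic) preference loss is one common way to learn from pairwise preferences, and `shaped_reward` with its `scale` parameter is an assumed way of adding the learned intrinsic reward to the environment reward.

```python
import math
import random

random.seed(0)

# Phase 1 (dataset annotation). Hypothetical stand-in for the LLM annotator:
# it prefers the observation whose first feature ("progress") is larger.
def annotate(obs_a, obs_b):
    """Return 0 if obs_a is preferred, 1 if obs_b is preferred."""
    return 0 if obs_a[0] >= obs_b[0] else 1

def random_obs():
    # Toy two-feature observation: [progress, irrelevant noise].
    return [random.random(), random.random()]

pairs = [(random_obs(), random_obs()) for _ in range(200)]
labels = [annotate(a, b) for a, b in pairs]

# Phase 2 (reward training). Assumed linear reward model, fitted with a
# Bradley-Terry (logistic) preference loss using plain SGD.
def reward(weights, obs):
    return sum(w * x for w, x in zip(weights, obs))

def train_reward(pairs, labels, dim=2, lr=0.5, epochs=50):
    weights = [0.0] * dim
    for _ in range(epochs):
        for (a, b), y in zip(pairs, labels):
            diff = reward(weights, b) - reward(weights, a)
            p_b = 1.0 / (1.0 + math.exp(-diff))  # modeled P(b is preferred)
            for i in range(dim):
                # Gradient of the cross-entropy loss w.r.t. weights[i].
                weights[i] -= lr * (p_b - y) * (b[i] - a[i])
    return weights

weights = train_reward(pairs, labels)

# Phase 3 (reinforcement learning). Assumed combination rule: the learned
# reward is added to the environment reward as an intrinsic bonus, and any
# RL algorithm would then optimize the sum.
def shaped_reward(env_reward, obs, weights, scale=0.1):
    return env_reward + scale * reward(weights, obs)
```

In this sketch, changing the rule inside `annotate` plays the role of changing the prompt: the learned `weights`, and therefore the behavior an agent is rewarded for, follow whatever the annotator prefers.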
