Motif
Explore how Motif uses a Large Language Model (LLM) to define reward functions for training AI agents in NetHack. The method has three phases: dataset annotation, reward training, and reinforcement learning. Together they turn the LLM's preferences into an intrinsic reward that motivates the agent. The resulting behaviors are intuitive and human-aligned, and they can be steered by customizing the prompt given to the LLM. Motif shows how feedback from an LLM can supply intrinsic rewards in reinforcement learning.
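The three phases described above can be illustrated with a toy sketch. It is not Motif's actual implementation: the captions are made up, `annotate_pair` is a hypothetical keyword heuristic standing in for the LLM annotator, the reward model is a small linear model fitted with a Bradley-Terry style preference loss, and the way the learned reward is added to the environment reward (the `beta` weight) is an assumption chosen for illustration.

```python
import math

# Toy captions standing in for NetHack messages (made up for illustration).
CAPTIONS = [
    "You see here 5 gold pieces.",
    "It's a solid stone wall.",
    "You kill the newt!",
    "You climb down the stairs.",
    "Nothing happens.",
]
VOCAB = ("gold", "stairs", "kill", "wall", "nothing")


def annotate_pair(caption_a, caption_b):
    """Phase 1 stand-in: return 0 if a is preferred, 1 if b, None if tied.

    ASSUMPTION: a keyword heuristic replaces the real LLM call.
    """
    def score(caption):
        text = caption.lower()
        return sum(word in text for word in ("gold", "stairs", "kill"))

    score_a, score_b = score(caption_a), score(caption_b)
    if score_a == score_b:
        return None
    return 0 if score_a > score_b else 1


def featurize(caption):
    """Bag-of-words features over the tiny hypothetical vocabulary."""
    text = caption.lower()
    return [1.0 if word in text else 0.0 for word in VOCAB]


def reward(weights, caption):
    """Linear reward model: dot product of weights and caption features."""
    return sum(w * x for w, x in zip(weights, featurize(caption)))


def train_reward_model(pairs, epochs=200, lr=0.5):
    """Phase 2: fit the reward so preferred captions score higher."""
    weights = [0.0] * len(VOCAB)
    for _ in range(epochs):
        for cap_a, cap_b, label in pairs:
            feat_a, feat_b = featurize(cap_a), featurize(cap_b)
            # Bradley-Terry: P(b preferred) = sigmoid(r(b) - r(a)).
            diff = reward(weights, cap_b) - reward(weights, cap_a)
            prob_b = 1.0 / (1.0 + math.exp(-diff))
            # Gradient of the cross-entropy loss with respect to diff.
            grad = prob_b - label
            weights = [
                w - lr * grad * (fb - fa)
                for w, fa, fb in zip(weights, feat_a, feat_b)
            ]
    return weights


def shaped_reward(env_reward, caption, weights, beta=0.1):
    """Phase 3: the RL agent would be trained on this combined signal.

    ASSUMPTION: a simple weighted sum of environment and intrinsic reward.
    """
    return env_reward + beta * reward(weights, caption)


# Phase 1: annotate every pair of captions, dropping ties.
pairs = []
for i, cap_a in enumerate(CAPTIONS):
    for cap_b in CAPTIONS[i + 1:]:
        label = annotate_pair(cap_a, cap_b)
        if label is not None:
            pairs.append((cap_a, cap_b, label))

# Phase 2: distill the preferences into a reward model.
weights = train_reward_model(pairs)
```

In a real setup the heuristic would be replaced by prompting an LLM with the two captions, and `shaped_reward` would be computed at each environment step inside a standard reinforcement learning loop. Changing the annotator's preferences (here, the keyword list) changes which events the learned reward favors, which mirrors how a customized prompt steers the agent.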