Instruct2Act
Instruct2Act uses large language models to translate multimodal instructions into sequences of robot actions. It integrates perception, planning, and execution, relying on foundation models such as SAM and CLIP, and performs well across a range of tasks in zero-shot settings.
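
The perception, planning, and execution stages described above can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not Instruct2Act's actual code or API: every name here (`Detection`, `Action`, `perceive`, `plan`, `execute`, `run_pipeline`) is hypothetical, the perception step reads labels from a dictionary instead of calling SAM or CLIP, and the planner is a simple keyword match standing in for a large language model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Point = Tuple[float, float]


@dataclass
class Detection:
    label: str     # object name (hypothetical stand-in for a classifier's output)
    center: Point  # object position in the workspace


@dataclass
class Action:
    name: str      # "pick" or "place"
    target: Point  # where the action is applied


def perceive(scene: Dict[str, Point]) -> List[Detection]:
    """Stand-in for perception. A real system might segment the image and
    then label each region; here the scene dictionary already holds labels."""
    return [Detection(label, center) for label, center in sorted(scene.items())]


def plan(instruction: str, detections: List[Detection]) -> List[Action]:
    """Stand-in for the language-model planner. It only handles instructions
    that mention exactly two known objects, source first and destination second."""
    by_label = {d.label: d for d in detections}
    mentioned = [w for w in instruction.lower().split() if w in by_label]
    if len(mentioned) != 2:
        raise ValueError("expected exactly two known objects in the instruction")
    source, destination = mentioned
    return [
        Action("pick", by_label[source].center),
        Action("place", by_label[destination].center),
    ]


def execute(actions: List[Action]) -> List[str]:
    """Stand-in for execution: returns a log line per action instead of
    commanding a robot."""
    return [f"{a.name} at {a.target}" for a in actions]


def run_pipeline(instruction: str, scene: Dict[str, Point]) -> List[str]:
    """Chain the three stages: perceive the scene, plan actions, execute them."""
    return execute(plan(instruction, perceive(scene)))
```

For example, `run_pipeline("put the block in the bowl", {"block": (0.1, 0.2), "bowl": (0.5, 0.6)})` yields a pick at the block's position followed by a place at the bowl's position. Keeping the three stages as separate functions means any one of them could be swapped for a learned model without touching the others.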