A Human-in-the-loop Approach to Robot Action Replanning through LLM Common-Sense Reasoning
This work was supported by the European Union Horizon Project TORNADO (GA 101189557).
The authors are with Human-Robot Interfaces and Interaction (HRII) Laboratory, Istituto Italiano di Tecnologia, Genoa, Italy.
Elena Merlo is also with the Dept. of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genoa, Genoa, Italy.
Robotics holds enormous potential to transform both daily life and industry, but adoption has been slowed by the lack of accessible tools for non-experts.
Traditional programming through demonstrations can be powerful, yet vision-only learning often struggles: it can misinterpret scenes, hallucinate actions, or fail to adapt when conditions change.
Our new work introduces a human-in-the-loop framework where people stay firmly in control of robot decision-making.
Instead of blindly executing a vision-derived plan, the robot’s actions are reviewed, refined, and validated by the human before execution begins.
How it works:
· The robot observes a single RGB video and generates an initial execution plan.
· Humans provide natural language instructions — clarifying goals, priorities, or potential pitfalls.
· A Large Language Model (LLM) uses common-sense reasoning to integrate this feedback, correcting errors and adjusting the plan.
· Crucially: the human has the final word, ensuring the plan is safe, accurate, and aligned with intent before the robot acts.
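The loop above can be sketched as a simple review cycle: propose, collect feedback, refine, and execute only once the human approves. This is a minimal illustrative sketch, not the paper's actual implementation; every name here (`propose_plan`, `refine_plan`, `human_in_the_loop`) is a hypothetical placeholder, and the vision and LLM stages are stubbed with toy logic.

```python
# Minimal sketch of the human-in-the-loop replanning cycle described above.
# All function names are illustrative; the vision and LLM stages are stubs.

def propose_plan(video):
    """Stub for the vision stage: derive an initial action plan from one RGB video."""
    return ["pick(cup)", "place(cup, sink)"]

def refine_plan(plan, feedback):
    """Stub for the LLM stage: apply natural-language feedback to the plan.
    A real system would prompt an LLM; here we handle one toy correction."""
    if "shelf" in feedback:
        return [step.replace("sink", "shelf") for step in plan]
    return plan

def human_in_the_loop(video, get_feedback, approve):
    """Iterate until the human validates the plan; only then is it returned
    for execution — the human has the final word."""
    plan = propose_plan(video)
    while not approve(plan):
        plan = refine_plan(plan, get_feedback(plan))
    return plan

# Example: the human rejects the initial plan once, then approves the fix.
reviews = {"count": 0}

def approve(plan):
    reviews["count"] += 1
    return "place(cup, shelf)" in plan

final = human_in_the_loop(
    "demo.mp4",
    get_feedback=lambda p: "put the cup on the shelf instead",
    approve=approve,
)
print(final)  # ['pick(cup)', 'place(cup, shelf)']
```

The key design point the sketch captures is that execution sits outside the loop: the robot never acts on an unreviewed plan, and each round of plain-language feedback replaces a full new demonstration.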
Why this matters:
· Keeps humans in charge of the process, not sidelined by automation.
· Reduces the need for repeated demonstrations — feedback is as simple as giving instructions in plain language.
· Increases robustness by catching and fixing errors before they happen.
· Moves us closer to truly accessible, trustworthy robotics for non-experts.
Read More & Watch the Video!
⇒ Full Paper on IEEE Xplore: read here