Catalogue Search | MBRL

Modeling Self-Awareness in Embodied Task Planning with LLM-Driven Heuristics

by Argenziano, Francesco , Nardi, Daniele , Brienza, Michele in Ablation , Analysis , Collaboration

2026

Task planning for robots in real-life settings is inherently complex due to challenges such as identifying grounded sequences of actions to achieve a goal, bridging the gap between high-level planning and low-level execution, and addressing the computational constraints of robotic hardware. Additionally, an essential, but frequently underestimated, aspect is the robot’s ability to recognize its own limitations and effectively delegate tasks beyond its abilities to human collaborators. Recent advancements in integrating Large Language Models (LLMs) into robotics have increased the likelihood of generating unfeasible plans, making this capability even more critical for any robot that needs to interact with a real environment. To address these challenges, this paper presents a framework that integrates open-vocabulary online grounding, planning, and embodiment, emphasizing self-awareness and self-analysis. The proposed self-awareness model enables robots to assess task complexity and map task requirements to their own capabilities. This can be used in several ways, including strategically requesting human assistance when needed. By leveraging pre-trained foundation models, our framework supports a multi-role mechanism that enhances adaptability and ensures effective collaboration between robots and humans in real-world scenarios. The experiments were conducted using a TIAGo robot and evaluated the framework’s ability to identify and delegate challenging tasks on a set of nine real-life scenarios spanning different complexity levels. The results show a significant improvement in task success rate, achieving 89.1% compared to 37.0% for the baseline, demonstrating the effectiveness of incorporating self-awareness into the planning process. The code and the additional materials have been publicly released on the project website.

Journal Article

Text2Motion: from natural language instructions to feasible plans

by Migimatsu, Toki , Lin, Kevin , Bohg, Jeannette in Feasibility , Language instruction , Large language models

2023

We propose Text2Motion, a language-based planning framework enabling robots to solve sequential manipulation tasks that require long-horizon reasoning. Given a natural language instruction, our framework constructs both a task- and motion-level plan that is verified to reach inferred symbolic goals. Text2Motion uses feasibility heuristics encoded in Q-functions of a library of skills to guide task planning with Large Language Models. Whereas previous language-based planners only consider the feasibility of individual skills, Text2Motion actively resolves geometric dependencies spanning skill sequences by performing geometric feasibility planning during its search. We evaluate our method on a suite of problems that require long-horizon reasoning, interpretation of abstract goals, and handling of partial affordance perception. Our experiments show that Text2Motion can solve these challenging problems with a success rate of 82%, while prior state-of-the-art language-based planning methods only achieve 13%. Text2Motion thus provides promising generalization characteristics to semantically diverse sequential manipulation tasks with geometric dependencies between skills. Qualitative results are made available at https://sites.google.com/stanford.edu/text2motion.

Journal Article

Integrating action knowledge and LLMs for task planning and situation handling in open worlds

by Ding, Yan , Amiri, Saeid , Yang, Hao in Knowledge , Large language models , Robots

2023

Task planning systems have been developed to help robots use human knowledge (about actions) to complete long-horizon tasks. Most of them have been developed for “closed worlds” while assuming the robot is provided with complete world knowledge. However, the real world is generally open, and the robots frequently encounter unforeseen situations that can potentially break the planner’s completeness. Could we leverage the recent advances in pre-trained Large Language Models (LLMs) to enable classical planning systems to deal with novel situations? This paper introduces a novel framework, called COWP, for open-world task planning and situation handling. COWP dynamically augments the robot’s action knowledge, including the preconditions and effects of actions, with task-oriented commonsense knowledge. COWP embraces the openness from LLMs, and is grounded to specific domains via action knowledge. For systematic evaluations, we collected a dataset that includes 1085 execution-time situations. Each situation corresponds to a state instance wherein a robot is potentially unable to complete a task using a solution that normally works. Experimental results show that our approach outperforms competitive baselines from the literature in the success rate of service tasks. Additionally, we have demonstrated COWP using a mobile manipulator. Supplementary materials are available at: https://cowplanning.github.io/

Journal Article

Robotic Task Planning Using a Backchaining Theorem Prover for Multiplicative Exponential First-Order Linear Logic

by Saranli, Uluc , Kortik, Sitar in Analysis , Artificial Intelligence , Control

2019

In this paper, we propose an exponential multiplicative fragment of linear logic, which we call Linear Planning Logic (LPL), to encode and solve planning problems efficiently in STRIPS domains. Linear logic is a resource-aware logic that treats resources as single-use assumptions, thereby enabling encoding of and reasoning about domains with dynamic state. One of the most important examples of dynamic state domains is robotic task planning, since informational or physical states of a robot include non-monotonic characteristics. Our novel theorem prover uses the backchaining method, which is suitable for logic languages like Lolli and Prolog. Additionally, we extend LPL to be able to encode non-atomic conclusions in program formulae. Following the introduction of the language, our theorem prover and its implementation, we present associated algorithmic properties through small but informative examples. Subsequently, we also present a navigation domain using the hexapod robot RHex to show LPL’s operation on a real robotic planning problem. Finally, we provide comparisons of LPL with two existing linear logic theorem provers, llprover and linTAP. We show that LPL outperforms these theorem provers for planning domains.

Journal Article

Embodied intelligence in manufacturing: leveraging large language models for autonomous industrial robotics

by Lu, Wen Feng , Liu, Xuan , Li, Bingbing in Automation , Business and Management , Control

2025

This paper delves into the potential of Large Language Model (LLM) agents for industrial robotics, with an emphasis on autonomous design, decision-making, and task execution within manufacturing contexts. We propose a comprehensive framework that includes three core components: (1) matches manufacturing tasks with process parameters, emphasizing the challenges in LLM agents’ understanding of human-imposed constraints; (2) autonomously designs tool paths, highlighting the LLM agents’ proficiency in planar tasks and challenges in 3D spatial tasks; and (3) integrates embodied intelligence within industrial robotics simulations, showcasing the adaptability of LLM agents like GPT-4. Our experimental results underscore the distinctive performance of the GPT-4 agent, especially in Component 3, where it is outstanding in task planning and achieved a success rate of 81.88% across 10 samples in task completion. In conclusion, our study accentuates the transformative potential of LLM agents in industrial robotics and suggests specific avenues, such as visual semantic control and real-time feedback loops, for their enhancement.

Journal Article

ProgPrompt: program generation for situated robot task planning using large language models

by Blukis, Valts , Fox, Dieter , Singh, Ishika in Ablation , Free form , Large language models

2023

Task planning can require defining myriad domain knowledge about the world in which a robot needs to act. To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even generate action sequences directly, given an instruction in natural language with no additional domain information. However, such methods either require enumerating all possible next steps for scoring, or generate free-form text that may contain actions not possible on a given robot in its current context. We present a programmatic LLM prompt structure that enables plan generation functional across situated environments, robot capabilities, and tasks. Our key insight is to prompt the LLM with program-like specifications of the available actions and objects in an environment, as well as with example programs that can be executed. We make concrete recommendations about prompt structure and generation constraints through ablation experiments, demonstrate state of the art success rates in VirtualHome household tasks, and deploy our method on a physical robot arm for tabletop tasks. Website and code at progprompt.github.io

Journal Article

Multi-Robot Coordination Analysis, Taxonomy, Challenges and Future Scope

by Verma, Janardan Kumar , Ranga, Virender in Artificial Intelligence , Control , Coordination

2021

Recently, Multi-Robot Systems (MRS) have attained considerable recognition because of their efficiency and applicability in different types of real-life applications. This paper provides a comprehensive research study on MRS coordination, starting with the basic terminology, categorization, and application domains, and finally giving a summary and insights on the proposed coordination approaches for each application domain. We have done an extensive study on recent contributions in this research area in order to identify the strengths, limitations, and open research issues, and have also highlighted the scope for future research. Further, we have examined a series of MRS state-of-the-art parameters that affect MRS coordination and, thus, the efficiency of MRS, like communication mechanism, planning strategy, control architecture, scalability, and decision-making. We have proposed a new taxonomy to classify various coordination approaches of MRS based on six broad dimensions. We have also analyzed how coordination can be achieved and improved in two fundamental problems, i.e., multi-robot motion planning and task planning, and in various application domains of MRS such as exploration, object transport, target tracking, etc.

Journal Article

Affordance-informed Robotic Manipulation via Intelligent Action Library

by Zhiyang, L , Ruiteng, Z , Zhengshen, Z in Human-robot interaction , Robot arms , Semantic segmentation

2024

In the realm of conventional affordance detection, the primary objective is to provide insights into the potential uses of objects. However, a significant limitation remains as these conventional methods merely treat affordance detection as a semantic segmentation task, disregarding the crucial aspect of interpreting affordances for actions that can be performed by a manipulator. To address this critical gap, we present a novel pipeline incorporating the Intelligent Action Library (IAL) concept. This framework enables affordance interpretation for various manipulation tasks, allowing robots to be taught and guided on how to execute specific actions based on the detected affordances and human-robot interaction. Through real-world experiments, we have demonstrated the ingenuity and dependability of our pipeline, effectively bridging the gap between affordance detection and manipulation task planning and execution. The integration of IAL facilitates a seamless connection between understanding affordances and empowering robots to perform tasks with precision and efficiency. The demo link is available to the public: https://youtu.be/_oBAer2Vl8k

Journal Article

Multimodal Agent AI: A Survey of Recent Advances and Future Directions

by Sun, Yu-Zhu , Zhang, Peng , Ma, Jian-Cong in Artificial Intelligence , Computer Science , Computers

2025

In recent years, multimodal agent AI (MAA) has emerged as a pivotal area of research, holding promise for transforming human-machine interaction. Agent AI systems, capable of perceiving and responding to inputs from multiple modalities (e.g., language, vision, audio), have demonstrated remarkable progress in understanding complex environments and executing intricate tasks. This survey comprehensively reviews the state-of-the-art developments in MAA and examines its fundamental concepts, key techniques, and applications across diverse domains. We first introduce the basics of agent AI and its multimodal interaction capabilities. We then delve into the core technologies that enable agents to perform task planning, decision-making, and multi-sensory fusion. Furthermore, we explore various applications of MAA in robotics, healthcare, gaming, and beyond. Additionally, we analyze the challenges and limitations of current systems and propose promising research directions for future improvements, including human-AI collaboration and improved online learning methods. By reviewing existing work and highlighting open questions, this survey aims to provide a comprehensive roadmap for researchers and practitioners in the field of MAA.

Journal Article

Research and development in agricultural robotics: A perspective of digital farming

by Pitonakova, Lenka , Weltzien, Cornelia , Ahmad, Desa in Agricultural development , Agricultural land , Agricultural research

2018

Digital farming is the practice of using modern technologies such as sensors, robotics, and data analysis to shift from tedious operations to continuously automated processes. This paper reviews some of the latest achievements in agricultural robotics, specifically those that are used for autonomous weed control, field scouting, and harvesting. Object identification, task planning algorithms, digitalization and optimization of sensors are highlighted as some of the challenges faced in the context of digital farming. The concepts of multi-robots, human-robot collaboration, and environment reconstruction from aerial images and ground-based sensors for the creation of virtual farms were highlighted as some of the gateways of digital farming. It was shown that one of the trends and research focuses in agricultural field robotics is towards building a swarm of small scale robots and drones that collaborate to optimize farming inputs and reveal denied or concealed information. For the case of robotic harvesting, an autonomous framework with several simple axis manipulators can be faster and more efficient than the currently adapted professional expensive manipulators. While robots are becoming inseparable parts of modern farms, our conclusion is that it is not realistic to expect an entirely automated farming system in the future.

Journal Article
