Catalogue Search | MBRL
Explore the vast range of titles available.
2,106 result(s) for "imitation learning"
Repetition-Based Approach for Task Adaptation in Imitation Learning
2022
Transfer learning is an effective approach for adapting an autonomous agent to a new target task by transferring knowledge learned from a previously learned source task. The major problem with traditional transfer learning is that it focuses only on optimizing learning performance on the target task. Thus, performance on the target task may improve at the cost of degraded performance on the source task, resulting in an agent that is not able to revisit the earlier task. Transfer learning methods are therefore still far from comparable with the learning capability of humans, who can perform well on both source and new target tasks. To address this limitation, a task adaptation method for imitation learning is proposed in this paper. Inspired by the idea of repetition learning in neuroscience, the proposed adaptation method enables the agent to repeatedly review the learned knowledge of the source task while learning the new knowledge of the target task. This ensures that the learning performance on the target task is high, while the deterioration of the learning performance on the source task is small. A comprehensive evaluation over several simulated tasks with varying difficulty levels shows that the proposed method can provide high and consistent performance on both source and target tasks, outperforming existing transfer learning methods.
Journal Article
Domain Adaptation for Imitation Learning Using Generative Adversarial Network
2021
Imitation learning is an effective approach for an autonomous agent to learn control policies when an explicit reward function is unavailable, using demonstrations provided by an expert. However, standard imitation learning methods assume that the agent and the expert's demonstrations share the same domain configuration. This assumption makes the learned policies difficult to apply in a distinct domain. The problem is formalized as domain adaptive imitation learning, which is the process of learning how to perform a task optimally in a learner domain, given demonstrations of the task in a distinct expert domain. We address the problem by proposing a model based on a Generative Adversarial Network. The model aims to learn both domain-shared and domain-specific features and uses them to find an optimal policy across domains. The experimental results show the effectiveness of our model on a number of tasks ranging from low-dimensional to complex high-dimensional ones.
Journal Article
Optimization Scheduling of Hydrogen-Coupled Electro-Heat-Gas Integrated Energy System Based on Generative Adversarial Imitation Learning
2025
Hydrogen energy is a crucial support for China’s low-carbon energy transition. With the large-scale integration of renewable energy, the combination of hydrogen and integrated energy systems has become one of the most promising directions of development. This paper proposes an optimized scheduling model for a hydrogen-coupled electro-heat-gas integrated energy system (HCEHG-IES) using generative adversarial imitation learning (GAIL). The model aims to enhance renewable-energy absorption, reduce carbon emissions, and improve grid-regulation flexibility. First, the optimal scheduling problem of HCEHG-IES under uncertainty is modeled as a Markov decision process (MDP). To overcome the limitations of conventional deep reinforcement learning algorithms—including long optimization time, slow convergence, and subjective reward design—this study augments the PPO algorithm by incorporating a discriminator network and expert data. The newly developed algorithm, termed GAIL, enables the agent to perform imitation learning from expert data. Based on this model, dynamic scheduling decisions are made in continuous state and action spaces, generating optimal energy-allocation and management schemes. Simulation results indicate that, compared with traditional reinforcement-learning algorithms, the proposed algorithm offers better economic performance. Guided by expert data, the agent avoids blind optimization, shortens the offline training time, and improves convergence performance. In the online phase, the algorithm enables flexible energy utilization, thereby promoting renewable-energy absorption and reducing carbon emissions.
Journal Article
Design of an Intelligent Vehicle Behavior Decision Algorithm Based on DGAIL
2023
With the development of AI, the intelligence level of vehicles is increasing. Structured roads, as common and important traffic scenes, are the most typical application scenarios for realizing autonomous driving. The driving behavior decision-making of intelligent vehicles has always been a controversial and difficult research topic. Currently, the mainstream decision-making methods, which are mainly rule-based, lack adaptability and generalization to the environment. To address the particularity of intelligent vehicle behavior decisions and the complexity of the environment, this thesis proposes an intelligent vehicle driving behavior decision method based on DQN generative adversarial imitation learning (DGAIL) in the structured road traffic environment, in which the DQN algorithm is used as the GAIL generator. The results show that the DGAIL method can preserve the design of the reward value function, ensure the effectiveness of training, and achieve safe and efficient driving on structured roads. The experimental results show that, compared with A3C, DQN and GAIL, the model based on DGAIL requires less average training time to achieve a 95% success rate in both the straight road scene and the merging road scene. These results suggest that the algorithm can effectively accelerate the selection of actions, reduce the randomness of actions during exploration, and improve the effect of the decision-making model.
Journal Article
Intrinsically Motivated Open-Ended Multi-Task Learning Using Transfer Learning to Discover Task Hierarchy
by Duminy, Nicolas; Zhu, Junshuai; Nguyen, Sao Mai
in Artificial Intelligence, Computer Science, continual learning
2021
In open-ended continuous environments, robots need to learn multiple parameterised control tasks in hierarchical reinforcement learning. We hypothesise that the most complex tasks can be learned more easily by transferring knowledge from simpler tasks, and faster by adapting the complexity of the actions to the task. We propose a task-oriented representation of complex actions, called procedures, to learn online task relationships and unbounded sequences of action primitives to control the different observables of the environment. Combining both goal-babbling with imitation learning, and active learning with transfer of knowledge based on intrinsic motivation, our algorithm self-organises its learning process. It chooses at any given time a task to focus on; and what, how, when and from whom to transfer knowledge. We show with a simulation and a real industrial robot arm, in cross-task and cross-learner transfer settings, that task composition is key to tackle highly complex tasks. Task decomposition is also efficiently transferred across different embodied learners and by active imitation, where the robot requests just a small amount of demonstrations and the adequate type of information. The robot learns and exploits task dependencies so as to learn tasks of every complexity.
Journal Article
Learning for a Robot: Deep Reinforcement Learning, Imitation Learning, Transfer Learning
by Hua, Jiang; Zeng, Liangcai; Ju, Zhaojie
in adaptive and robust control, deep reinforcement learning, dexterous manipulation
2021
Dexterous manipulation is an important part of realizing robot intelligence, but manipulators can currently perform only simple tasks such as sorting and packing in a structured environment. In view of this problem, this paper presents a state-of-the-art survey on intelligent robots capable of autonomous decision-making and learning. The paper first reviews the main achievements and research in robotics, which were mainly based on breakthroughs in automatic control and mechanical hardware. With the evolution of artificial intelligence, many studies have made further progress in adaptive and robust control. The survey reveals that the latest research in deep learning and reinforcement learning has paved the way for highly complex tasks to be performed by robots. Furthermore, deep reinforcement learning, imitation learning, and transfer learning in robot control are discussed in detail. Finally, major achievements based on these methods are summarized and analyzed thoroughly, and future research challenges are proposed.
Journal Article
Model-free reinforcement learning from expert demonstrations: a survey
2022
Reinforcement learning from expert demonstrations (RLED) is the intersection of imitation learning with reinforcement learning that seeks to take advantage of these two learning approaches. RLED uses demonstration trajectories to improve sample efficiency in high-dimensional spaces. RLED is a promising new approach to behavioral learning through demonstrations from an expert teacher. RLED considers two possible knowledge sources to guide the reinforcement learning process: prior knowledge and online knowledge. This survey focuses on novel methods for model-free reinforcement learning guided through demonstrations, commonly but not necessarily provided by humans. The methods are analyzed and classified according to the impact of the demonstrations. Challenges, applications, and promising approaches to improve these methods are also discussed.
Journal Article
Smart Industrial Robot Control Trends, Challenges and Opportunities within Manufacturing
2022
Industrial robots and associated control methods are continuously developing. With the recent progress in the field of artificial intelligence, new perspectives in industrial robot control strategies have emerged, and prospects towards cognitive robots have arisen. AI-based robotic systems are fast becoming one of the main areas of focus, as flexibility and deep understanding of complex manufacturing processes are becoming the key advantages for raising competitiveness. This review first sets out the significance of smart industrial robot control in manufacturing towards future factories by listing the needs and requirements and introducing the envisioned concept of smart industrial robots. Secondly, the current trends that are based on different learning strategies and methods are explored. Current computer-vision, deep reinforcement learning and imitation learning based robot control approaches and possible applications in manufacturing are investigated. Gaps, challenges, limitations and open issues are identified along the way.
Journal Article
Performance evaluation and improvement of deep Q network for lunar landing task
2024
Reinforcement learning is now being applied in an increasing variety of scenarios, the majority of which are based on the deep Q network (DQN) technology. However, the algorithm is heavily influenced by multiple factors. In this paper, we take the lunar lander as a case study to examine how various hyper-parameters affect the performance of the DQN algorithm, and on that basis we tune them to obtain a model with better performance. At present, the known DQN model achieves an average reward of 280+ over 100 test episodes, and the reward value of the model in this article can reach 290+. Its robustness is tested and verified by introducing additional uncertainty tests into the original problem. In addition, to speed up the training process, imitation learning is incorporated in our model, using a heuristic-function model guidance method to obtain demonstration data, which accelerates training and improves performance. Simulation results demonstrate the effectiveness of this method.
Journal Article