Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
68,105 result(s) for "reinforcement"
Reward associations do not explain transitive inference performance in monkeys
2018
The observation that monkeys appear to make transitive inferences has been taken as evidence of their ability to form and manipulate mental representations. However, alternative explanations have been proposed arguing that transitive inference (TI) performance is based on expected or experienced reward value. To test the contribution of reward value to monkeys’ behavior in TI paradigms, we performed two experiments in which we manipulated the amount of reward associated with each item in an ordered list. In these experiments, monkeys were presented with pairs of items drawn from the list, and rewards were delivered if subjects selected the item with the earlier list rank. When reward magnitude was biased to favor later list items, correct responding was reduced. However, monkeys eventually learned to make correct rule-based choices despite countervailing incentives. The results demonstrate that monkeys’ performance in TI paradigms is not driven solely by expected reward, but that they are able to make appropriate inferences in the face of discordant reward associations.
Journal Article
Challenges of real-world reinforcement learning: definitions, benchmarks and analysis
2021
Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the difficulties that must be addressed for RL to be commonly deployed in real-world systems. For each challenge, we define it formally in the context of a Markov Decision Process, analyze the effects of the challenge on state-of-the-art learning algorithms, and present some existing attempts at tackling it. We believe that an approach that addresses our set of proposed challenges would be readily deployable in a large number of real-world problems. Our proposed challenges are implemented in a suite of continuous control environments called realworldrl-suite, which we propose as an open-source benchmark.
Journal Article
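The abstract frames each challenge in the context of a Markov Decision Process. As a minimal, self-contained illustration of that formalism (a toy MDP with invented numbers, not the paper's benchmark code), value iteration over tabular transition and reward matrices might look like:

```python
import numpy as np

# A toy MDP: 3 states, 2 actions. P[s, a, s'] = transition probability,
# R[s, a] = immediate reward. All numbers are illustrative.
n_states, n_actions, gamma = 3, 2, 0.9
P = np.zeros((n_states, n_actions, n_states))
P[0, 0, 0], P[0, 0, 1] = 0.8, 0.2   # action 0 in state 0: mostly stay, sometimes reach state 1
P[0, 1, 2] = 1.0                    # action 1 in state 0: jump to state 2
P[1, :, 1] = 1.0                    # state 1 is absorbing and pays 1 per step
P[2, :, 2] = 1.0                    # state 2 is absorbing and pays nothing
R = np.array([[0.0, 0.0],
              [1.0, 1.0],
              [0.0, 0.0]])

# Value iteration: V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s,a,s') V(s') ]
V = np.zeros(n_states)
for _ in range(200):
    Q = R + gamma * (P @ V)         # shape (n_states, n_actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)           # greedy policy w.r.t. the converged values
```

With a discount of 0.9, the absorbing reward state is worth 1/(1-0.9) = 10, and the greedy policy in state 0 prefers the action that can reach it.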
A Systematic Study on Reinforcement Learning Based Applications
by Aljafari, Belqasem; Rajasekar, Elakkiya; Nikolovski, Srete
in Algorithms, Analysis, Clustering
2023
We have analyzed 127 publications for this review paper, which discuss applications of Reinforcement Learning (RL) in marketing, robotics, gaming, automated cars, natural language processing (NLP), internet of things security, recommendation systems, finance, and energy management. The optimization of energy use is critical in today’s environment. We mainly focus on the RL application for energy management. Traditional rule-based systems have a set of predefined rules. As a result, they may become rigid and unable to adjust to changing situations or unforeseen events. RL can overcome these drawbacks. RL learns by exploring the environment randomly and, based on experience, continues to expand its knowledge. Many researchers are working on RL-based energy management systems (EMS). RL is utilized in energy applications such as optimizing energy use in smart buildings, hybrid automobiles, smart grids, and managing renewable energy resources. RL-based energy management in renewable energy contributes to achieving net zero carbon emissions and a sustainable environment. In the context of energy management technology, RL can be utilized to optimize the regulation of energy systems, such as building heating, ventilation, and air conditioning (HVAC) systems, to reduce energy consumption while maintaining a comfortable atmosphere. EMS can be accomplished by teaching an RL agent to make judgments based on sensor data, such as temperature and occupancy, to modify the HVAC system settings. RL has proven beneficial in lowering energy usage in buildings and is an active research area in smart buildings. RL can be used to optimize energy management in hybrid electric vehicles (HEVs) by learning an optimal control policy to maximize battery life and fuel efficiency. RL has acquired a remarkable position in robotics, automated cars, and gaming applications. The majority of security-related applications operate in a simulated environment. RL-based recommender systems provide good recommendation accuracy and diversity. This article assists the novice in comprehending the foundations of reinforcement learning and its applications.
Journal Article
Grandmaster level in StarCraft II using multi-agent reinforcement learning
2019
Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multi-agent challenges. Over the course of a decade and numerous competitions [1–3], the strongest agents have simplified important aspects of the game, utilized superhuman capabilities, or employed hand-crafted sub-systems [4]. Despite these advantages, no previous agent has come close to matching the overall skill of top StarCraft players. We chose to address the challenge of StarCraft using general-purpose learning methods that are in principle applicable to other complex domains: a multi-agent reinforcement learning algorithm that uses data from both human and agent games within a diverse league of continually adapting strategies and counter-strategies, each represented by deep neural networks [5,6]. We evaluated our agent, AlphaStar, in the full game of StarCraft II, through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8% of officially ranked human players.
AlphaStar uses a multi-agent reinforcement learning algorithm and has reached Grandmaster level, ranking among the top 0.2% of human players for the real-time strategy game StarCraft II.
Journal Article
Reinforcement learning for bluff body active flow control in experiments and simulations
by Wang, Zhicheng; Triantafyllou, Michael S.; Karniadakis, George Em
in accelerated discovery, Active control, Algorithms
2020
We have demonstrated the effectiveness of reinforcement learning (RL) in bluff body flow control problems both in experiments and simulations by automatically discovering active control strategies for drag reduction in turbulent flow. Specifically, we aimed to maximize the power gain efficiency by properly selecting the rotational speed of two small cylinders, located parallel to and downstream of the main cylinder. By properly defining rewards and designing noise reduction techniques, and after an automatic sequence of tens of towing experiments, the RL agent was shown to discover a control strategy that is comparable to the optimal strategy found through lengthy systematically planned control experiments. Subsequently, these results were verified by simulations that enabled us to gain insight into the physical mechanisms of the drag reduction process. While RL has been used effectively previously in idealized computer flow simulation studies, this study demonstrates its effectiveness in experimental fluid mechanics and verifies it by simulations, potentially paving the way for efficient exploration of additional active flow control strategies in other complex fluid mechanics applications.
Journal Article
Learning team-based navigation: a review of deep reinforcement learning techniques for multi-agent pathfinding
by Younes, Younes Al; Chung, Jaehoon; Najjaran, Homayoun
in Agents, Algorithms, Artificial Intelligence
2024
Multi-agent pathfinding (MAPF) is a critical field in many large-scale robotic applications, often being the fundamental step in multi-agent systems. The increasing complexity of MAPF in complex and crowded environments, however, critically diminishes the effectiveness of existing solutions. In contrast to other studies that have either presented a general overview of the recent advancements in MAPF or extensively reviewed Deep Reinforcement Learning (DRL) within multi-agent system settings independently, our work presented in this review paper focuses on highlighting the integration of DRL-based approaches in MAPF. Moreover, we aim to bridge the current gap in evaluating MAPF solutions by addressing the lack of unified evaluation indicators and providing comprehensive clarification on these indicators. Finally, our paper discusses the potential of model-based DRL as a promising future direction and provides its required foundational understanding to address current challenges in MAPF. Our objective is to assist readers in gaining insight into the current research direction, providing unified indicators for comparing different MAPF algorithms and expanding their knowledge of model-based DRL to address the existing challenges in MAPF.
Journal Article
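The MAPF problem this review surveys can be made concrete with a tiny prioritized-planning sketch (a classic baseline, not a method from the paper): agents plan one after another with BFS over space–time, and each later agent treats the earlier agents' timed paths as reservations. The grid, starts, and goals are invented for illustration.

```python
from collections import deque

GRID = ["....",
        ".##.",
        "...."]                       # '#' marks a static obstacle
ROWS, COLS = len(GRID), len(GRID[0])
MOVES = [(0, 0), (0, 1), (0, -1), (1, 0), (-1, 0)]  # wait + 4 directions

def plan(start, goal, reserved):
    """BFS over (cell, time); `reserved` maps time -> cells held by earlier agents.
    Note: this sketch checks vertex conflicts only; edge (swap) conflicts are ignored."""
    queue = deque([(start, 0, [start])])
    seen = {(start, 0)}
    while queue:
        (r, c), t, path = queue.popleft()
        if (r, c) == goal:
            return path
        for dr, dc in MOVES:
            nr, nc, nt = r + dr, c + dc, t + 1
            if not (0 <= nr < ROWS and 0 <= nc < COLS) or GRID[nr][nc] == "#":
                continue
            if (nr, nc) in reserved.get(nt, set()) or ((nr, nc), nt) in seen:
                continue
            if nt <= 20:              # finite time horizon keeps the search bounded
                seen.add(((nr, nc), nt))
                queue.append(((nr, nc), nt, path + [(nr, nc)]))
    return None

# Agent 1 plans freely; agent 2 must avoid agent 1's timed cells.
path1 = plan((0, 0), (2, 3), {})
reserved = {t: {cell} for t, cell in enumerate(path1)}
path2 = plan((2, 0), (0, 3), reserved)
```

Prioritized planning is fast but incomplete (a bad priority order can make a solvable instance fail), which is one reason the learning-based approaches the review covers are of interest for crowded environments.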