369 result(s) for "multi-agent deep reinforcement learning"
Power Allocation and Energy Cooperation for UAV-Enabled MmWave Networks: A Multi-Agent Deep Reinforcement Learning Approach
Unmanned Aerial Vehicle (UAV)-assisted cellular networks over the millimeter-wave (mmWave) frequency band can meet the requirements of a high data rate and flexible coverage in next-generation communication networks. However, higher propagation loss and the use of a large number of antennas in mmWave networks give rise to high energy consumption, and UAVs are constrained by their low-capacity onboard batteries. Energy harvesting (EH) is a viable solution to reduce the energy cost of UAV-enabled mmWave networks. However, the random nature of renewable energy makes it challenging to maintain robust connectivity in UAV-assisted terrestrial cellular networks. Energy cooperation allows UAVs to transfer their excess energy to UAVs with depleted batteries. In this paper, we propose a power allocation algorithm based on energy harvesting and energy cooperation to maximize the throughput of a UAV-assisted mmWave cellular network. Since the channel state is uncertain and the amount of harvested energy can be treated as a stochastic process, we propose a multi-agent deep reinforcement learning (DRL) algorithm, Multi-Agent Deep Deterministic Policy Gradient (MADDPG), to solve the renewable energy resource allocation problem for throughput maximization. The simulation results show that the proposed algorithm outperforms the Random Power (RP), Maximal Power (MP), and value-based Deep Q-Learning (DQL) algorithms in terms of network throughput.
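The energy-cooperation idea in the abstract above (surplus UAVs donating to energy-deficient ones, with transfer losses) can be sketched as a simple redistribution rule. This is an illustrative toy, not the paper's actual scheme: the target battery level, transfer efficiency, and proportional-sharing rule are all assumptions.

```python
def cooperate_energy(batteries, target=50.0, efficiency=0.8):
    """Redistribute surplus battery energy from rich UAVs to poor UAVs.

    `target` is a hypothetical desired battery level; `efficiency` models
    energy lost in wireless transfer. Donors give proportionally to their
    surplus; receivers get proportionally to their deficit.
    """
    surplus = [max(b - target, 0.0) for b in batteries]
    deficit = [max(target - b, 0.0) for b in batteries]
    total_s, total_d = sum(surplus), sum(deficit)
    if total_s == 0.0 or total_d == 0.0:
        return list(batteries)  # nothing to share or nobody in need
    # Energy drawn from donors is capped by what receivers can absorb.
    sent = min(total_s, total_d / efficiency)
    received = sent * efficiency
    return [
        b - sent * (s / total_s) + received * (d / total_d)
        for b, s, d in zip(batteries, surplus, deficit)
    ]
```

For example, with batteries `[80, 20]` the first UAV sheds its 30-unit surplus and the second receives 24 units after the 20% transfer loss.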
Multi-agent reinforcement learning for cooperative lane changing of connected and autonomous vehicles in mixed traffic
Autonomous driving has attracted significant research interest in the past two decades, as it offers many potential benefits, including releasing drivers from exhausting driving and mitigating traffic congestion, among others. Despite promising progress, lane-changing remains a great challenge for autonomous vehicles (AVs), especially in mixed and dynamic traffic scenarios. Recently, reinforcement learning (RL) has been widely explored for lane-changing decision-making in AVs, with encouraging results demonstrated. However, the majority of those studies focus on a single-vehicle setting, and lane-changing in the context of multiple AVs coexisting with human-driven vehicles (HDVs) has received scarce attention. In this paper, we formulate the lane-changing decision-making of multiple AVs in a mixed-traffic highway environment as a multi-agent reinforcement learning (MARL) problem, where each AV makes lane-changing decisions based on the motions of both neighboring AVs and HDVs. Specifically, a multi-agent advantage actor-critic (MA2C) method is proposed with a novel local reward design and a parameter-sharing scheme. In particular, a multi-objective reward function is designed to incorporate fuel efficiency, driving comfort, and the safety of autonomous driving. A comprehensive experimental study shows that our proposed MARL framework consistently outperforms several state-of-the-art benchmarks in terms of efficiency, safety, and driver comfort.
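A multi-objective reward of the kind this abstract describes can be sketched as a weighted sum of fuel, comfort, and safety terms. The weights, feature names, and time-to-collision threshold below are hypothetical, chosen only to illustrate the structure:

```python
def lane_change_reward(fuel_rate, jerk, ttc,
                       w_fuel=0.3, w_comfort=0.3, w_safety=0.4,
                       ttc_threshold=3.0):
    """Toy multi-objective reward combining fuel efficiency, comfort, safety.

    fuel_rate: instantaneous fuel consumption (lower is better).
    jerk: rate of change of acceleration (smoother is better).
    ttc: time-to-collision in seconds to the nearest vehicle.
    """
    r_fuel = -fuel_rate          # penalize consumption
    r_comfort = -abs(jerk)       # penalize harsh maneuvers
    # Safety penalty only kicks in below the assumed TTC threshold.
    r_safety = 0.0 if ttc >= ttc_threshold else (ttc - ttc_threshold)
    return w_fuel * r_fuel + w_comfort * r_comfort + w_safety * r_safety
```

A safe, smooth state yields a mildly negative reward from fuel use alone, while a low time-to-collision dominates the signal, which is the usual rationale for weighting safety most heavily.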
Detecting Sensor Faults, Anomalies and Outliers in the Internet of Things: A Survey on the Challenges and Solutions
The Internet of Things (IoT) has gained significant recognition as a novel sensing paradigm for interacting with the physical world in this Industry 4.0 era. The IoT is used in many diverse applications that are part of our lives and is growing to become the global digital nervous system. It is quite evident that in the near future, hundreds of millions of individuals and businesses will have billions of smart sensors and advanced communication technologies, and these things will expand the boundaries of current systems. This will result in a potential change in the way we work, learn, innovate, live, and entertain. The heterogeneous smart sensors within the Internet of Things are indispensable parts that capture the raw data from the physical world as the first port of contact. Often the sensors within the IoT are deployed or installed in harsh environments. This inevitably means that the sensors are prone to failure, malfunction, rapid attrition, malicious attacks, theft, and tampering. All of these conditions cause the sensors within the IoT to produce unusual and erroneous readings, often known as outliers. Much of the current research has developed sensor outlier and fault detection models exclusively for Wireless Sensor Networks (WSNs), and adequate research has not yet been done in the context of the IoT. Because a wireless sensor network's operational framework differs greatly from that of the IoT, some of the existing models developed for WSNs cannot be used on the IoT for detecting outliers and faults. Sensor fault and outlier detection is crucial in the IoT to detect the high probability of erroneous readings or data corruption, thereby ensuring the quality of the data collected by sensors.
The data collected by sensors are initially pre-processed to be transformed into information, and when Artificial Intelligence (AI) and Machine Learning (ML) models are further used by the IoT, the information is further processed into applications and processes. Any faulty, erroneous, or corrupted sensor readings corrupt the trained models, thereby producing abnormal processes or outliers that are significantly distinct from a system's normal behavioural processes. In this paper, we present a comprehensive review of detecting sensor faults, anomalies, and outliers in the Internet of Things, along with the associated challenges. A comprehensive guideline for selecting an adequate outlier detection model for sensors in the IoT context for various applications is also discussed.
Multi-User Opportunistic Spectrum Access for Cognitive Radio Networks Based on Multi-Head Self-Attention and Multi-Agent Deep Reinforcement Learning
To address the low sum throughput of multi-user dynamic spectrum access operating in an opportunistic mode in cognitive radio networks, we propose a multi-user opportunistic spectrum access method based on multi-head self-attention and multi-agent deep reinforcement learning. First, an optimization model for joint channel selection and power control in multi-user systems is constructed on a centralized-training, decentralized-execution framework. In the training phase, the decision-making policy is optimized using global information, while in the execution phase, each agent makes decisions according to its own observations. Meanwhile, a multi-constraint dynamic proportional reward function is designed to guide the agent in selecting more rational actions by refining the constraints and dynamically adjusting the reward proportion. Furthermore, a multi-head self-attention mechanism is incorporated into the critic network to dynamically allocate attention weights to different users, thereby enhancing the ability of the network to estimate the joint action value. Finally, the proposed method is evaluated in terms of convergence, throughput, and dynamic performance. Simulation results demonstrate that the proposed method significantly improves the sum throughput of secondary users in opportunistic spectrum access.
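The per-user attention weighting this abstract mentions boils down to scaled dot-product attention over user feature vectors. Below is a minimal single-head sketch with toy list-based vectors; the paper's multi-head version stacks several such heads with learned projections, which are omitted here as assumptions:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(query, keys):
    """One attention head: normalized similarity of a query to each user's key.

    Scores are dot products scaled by sqrt(d), as in standard
    scaled dot-product attention.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    return softmax(scores)
```

Users whose keys align with the critic's query receive larger weights, which is how the critic can emphasize the users most relevant to estimating the joint action value.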
Multi agent reinforcement learning for online layout planning and scheduling in flexible assembly systems
Manufacturing systems are undergoing systematic change, facing the trade-off between the customer's needs and economic and ecological pressure. Assembly systems in particular must become more flexible due to many product generations and unpredictable material and demand fluctuations. As a solution, line-less mobile assembly systems implement flexible job routes through movable multi-purpose resources and flexible transportation systems. Moreover, a completely reactive, rearrangeable layout with mobile resources enables reconfigurations without interrupting production. A scheduling approach that can handle the complexity of dynamic events is necessary to plan job routes and control transportation in such an assembly system. Conventional approaches to this control task require exponentially rising computational capacity with increasing problem size. Therefore, the contribution of this work is an algorithm to dynamically solve the integrated problem of layout optimization and scheduling in line-less mobile assembly systems. The proposed multi-agent deep reinforcement learning algorithm uses proximal policy optimization and consists of an encoder and a decoder, allowing for various-sized system state descriptions. A simulation study shows that the proposed algorithm performs better than a random agent in 78% of the scenarios with respect to the makespan optimization objective. This allows for adaptive optimization of line-less mobile assembly systems that can face global challenges.
Multi-Agent Deep Reinforcement Learning for Collision-Free Posture Control of Multi-Manipulators in Shared Workspaces
In multi-manipulator systems operating within shared workspaces, achieving collision-free posture control is challenging due to high degrees of freedom and complex inter-manipulator interactions. Traditional motion planning methods often struggle with scalability and computational efficiency in such settings, motivating the need for learning-based approaches. This paper presents a multi-agent deep reinforcement learning (MADRL) framework for real-time collision-free posture control of multiple manipulators. The proposed method employs a line-segment representation of manipulator links, enabling efficient inter-link distance computation to guide cooperative collision avoidance. Using a centralized training and decentralized execution (CTDE) framework, the approach leverages global state information during training while enabling each manipulator to rely on local observations for real-time collision-free trajectory planning. By integrating efficient state representation with a scalable training paradigm, the proposed framework provides a principled foundation for addressing coordination challenges in dense industrial workspaces. The approach is implemented and validated in NVIDIA Isaac Sim across various overlapping workspace scenarios. Compared to conventional state representations, the proposed method achieves faster learning convergence and superior computational efficiency. In pick-and-place tasks, collaborative multi-manipulator control reduces task completion time by over 50% compared to single-manipulator operation, while maintaining high success rates (>83%) under dense workspace conditions. These results confirm the effectiveness and scalability of the proposed framework for real-time, collision-free multi-manipulator control.
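The inter-link distance computation that this abstract's line-segment representation enables is the classic minimum distance between two 3D segments. The sketch below follows the standard clamped closest-point formulation; it is a generic illustration, not the paper's implementation:

```python
def seg_seg_distance(p1, q1, p2, q2):
    """Minimum distance between 3D segments [p1,q1] and [p2,q2].

    Standard closest-point algorithm: solve the unconstrained closest
    points on the infinite lines, then clamp the parameters s, t to [0,1].
    """
    sub = lambda a, b: [x - y for x, y in zip(a, b)]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    clamp = lambda v: max(0.0, min(1.0, v))
    EPS = 1e-12
    d1, d2, r = sub(q1, p1), sub(q2, p2), sub(p1, p2)
    a, e, f = dot(d1, d1), dot(d2, d2), dot(d2, r)
    if a <= EPS and e <= EPS:            # both segments degenerate to points
        return dot(r, r) ** 0.5
    if a <= EPS:                         # first segment is a point
        s, t = 0.0, clamp(f / e)
    else:
        c = dot(d1, r)
        if e <= EPS:                     # second segment is a point
            s, t = clamp(-c / a), 0.0
        else:
            b = dot(d1, d2)
            denom = a * e - b * b        # zero when segments are parallel
            s = clamp((b * f - c * e) / denom) if denom > EPS else 0.0
            t = (b * s + f) / e
            if t < 0.0:
                s, t = clamp(-c / a), 0.0
            elif t > 1.0:
                s, t = clamp((b - c) / a), 1.0
    c1 = [p + s * d for p, d in zip(p1, d1)]
    c2 = [p + t * d for p, d in zip(p2, d2)]
    diff = sub(c1, c2)
    return dot(diff, diff) ** 0.5
```

Feeding such pairwise link distances into the reward or observation is one plausible way to penalize imminent collisions cheaply, since each link pair costs only a handful of dot products.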
Intelligent games meeting with multi-agent deep reinforcement learning: a comprehensive review
Recent years have witnessed great achievements in AI-driven intelligent games, such as AlphaStar defeating human experts, and numerous intelligent games have come into public view. Essentially, deep reinforcement learning (DRL), especially multi-agent DRL (MADRL), has empowered a variety of artificial intelligence fields, including intelligent games. However, there is a lack of systematic review of their correlations. This article provides a holistic picture of connecting intelligent games with MADRL from two perspectives: theoretical game concepts for MADRL, and MADRL for intelligent games. From the first perspective, the information structure and game environmental features for MADRL algorithms are summarized; from the second, the challenges in intelligent games are investigated and the existing MADRL solutions are correspondingly explored. Furthermore, the state-of-the-art (SOTA) MADRL algorithms for intelligent games are systematically categorized, especially from the perspective of credit assignment. Moreover, a comprehensive review of well-known benchmarks is conducted to facilitate the design and testing of MADRL-based intelligent games, and a general procedure for MADRL simulations is offered. Finally, the key challenges in integrating intelligent games with MADRL, and potential future research directions, are highlighted. This survey hopes to provide thoughtful insight into developing intelligent games with the assistance of MADRL solutions and algorithms.
Applications of Multi-Agent Deep Reinforcement Learning: Models and Algorithms
Recent advancements in deep reinforcement learning (DRL) have led to its application in multi-agent scenarios to solve complex real-world problems, such as network resource allocation and sharing, network routing, and traffic signal controls. Multi-agent DRL (MADRL) enables multiple agents to interact with each other and with their operating environment, and learn without the need for external critics (or teachers), thereby solving complex problems. Significant performance enhancements brought about by the use of MADRL have been reported in multi-agent domains; for instance, it has been shown to provide higher quality of service (QoS) in network resource allocation and sharing. This paper presents a survey of MADRL models that have been proposed for various kinds of multi-agent domains, in a taxonomic approach that highlights various aspects of MADRL models and applications, including objectives, characteristics, challenges, applications, and performance measures. Furthermore, we present open issues and future directions of MADRL.
A hybrid MARL clustering framework for real time open pit mine truck scheduling
This paper proposes an innovative approach that combines the QMIX algorithm (a multi-agent deep reinforcement learning algorithm, MADRL) with a Gaussian Mixture Model (GMM) algorithm for optimizing intelligent path planning and scheduling of mining trucks in open-pit mining environments. The focus of this method is twofold. Firstly, it achieves collaborative cooperation among multiple mining trucks using the QMIX algorithm. Secondly, it integrates the GMM algorithm with QMIX for modeling, predicting, classifying, and analyzing existing vehicle outcomes, to enhance the navigation capabilities of mining trucks within the environment. Through simulation experiments, the effectiveness of this combined algorithm was validated in improving vehicle operational efficiency, reducing non-working waiting time, and enhancing transportation efficiency. Moreover, this research compares the results of the algorithm with single-agent deep reinforcement learning algorithms, demonstrating the advantages of multi-agent algorithms in environments characterized by multi-agent collaboration. The mixed QMIX-GMM framework increasingly outperformed traditional approaches as the complexity of the mining environment grew. The study provides new technological insights for intelligent planning of mining trucks and offers significant reference value for the automation of multi-agent collaboration in other environments. A limitation is that the maximum fleet size considered in the study makes it suitable only for small- or mid-scale mines.
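QMIX's central idea, referenced in the abstract above, is that per-truck Q-values are combined by a mixing network whose weights are constrained to be non-negative, so the joint value is monotone in each agent's value (and each agent's greedy action stays consistent with the joint greedy action). A toy, fixed-weight illustration (real QMIX generates these weights from the global state via hypernetworks, which is omitted here):

```python
def qmix_mix(agent_qs, weights, bias=0.0):
    """Monotonic mixing of per-agent Q-values, QMIX-style.

    abs() enforces the non-negative weight constraint, so increasing any
    agent's Q-value can never decrease the mixed joint Q-value.
    """
    return sum(abs(w) * q for w, q in zip(weights, agent_qs)) + bias
```

Even with a raw negative weight like -0.25, the absolute value keeps the mixing monotone, which is exactly the property that lets each truck act greedily on its own Q-function at execution time.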
Dynamic Navigation and Area Assignment of Multiple USVs Based on Multi-Agent Deep Reinforcement Learning
The unmanned surface vehicle (USV) has attracted increasing attention because of its ability to perform complex maritime tasks autonomously in constrained environments. However, the level of autonomy of a single USV is still limited, especially when deployed in a dynamic environment to perform multiple tasks simultaneously. Thus, a multi-USV cooperative approach can be adopted to obtain the desired success rate in the presence of multi-mission objectives. In this paper, we propose a cooperative navigation approach that enables multiple USVs to automatically avoid dynamic obstacles and allocate target areas. Specifically, we propose a multi-agent deep reinforcement learning (MADRL) approach, i.e., the multi-agent deep deterministic policy gradient (MADDPG), to maximize the autonomy level by jointly optimizing the trajectories of the USVs together with obstacle avoidance and coordination, a complex optimization problem usually solved separately. In contrast to other works, we combine dynamic navigation and area assignment to design a task management system based on the MADDPG learning framework. Finally, experiments were carried out on the Gym platform to verify the effectiveness of the proposed method.