Catalogue Search | MBRL

A survey on large language model based autonomous agents

by FENG, Xueyang , ZHANG, Zeyu , YANG, Hao in Artificial intelligence , autonomous agent , Computer Science

2024

Autonomous agents have long been a research focus in academic and industry communities. Previous research often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and makes it hard for the agents to reach human-like decisions. Recently, through the acquisition of vast amounts of Web knowledge, large language models (LLMs) have shown potential in human-level intelligence, leading to a surge in research on LLM-based autonomous agents. In this paper, we present a comprehensive survey of these studies, delivering a systematic review of LLM-based autonomous agents from a holistic perspective. We first discuss the construction of LLM-based autonomous agents, proposing a unified framework that encompasses much of previous work. Then, we present an overview of the diverse applications of LLM-based autonomous agents in social science, natural science, and engineering. Finally, we delve into the evaluation strategies commonly used for LLM-based autonomous agents. Based on the previous studies, we also present several challenges and future directions in this field.

Journal Article

Causal Identification of Artificial Intelligence Effects on Enterprise Labor Structure via a Partially Linear Double Machine Learning Estimator: Evidence from High-Dimensional Panel Data

by Lee, Zne-Jung , Lin, Yankai , Li, Wenjie in AI adoption , Artificial intelligence , causal inference

2026

This study develops a semiparametric causal inference framework to quantify the effect of Artificial Intelligence (AI) adoption on enterprise labor structure under high-dimensional confounding. We employ the Double Machine Learning (DML) estimator, which combines Neyman orthogonality and cross-fitting to achieve reliable causal identification in settings where conventional regression methods are prone to bias from high-dimensional controls and nonlinear confounding. Nuisance functions are estimated using Lasso and Random Forests, enabling flexible modeling of complex relationships between control variables and outcomes. Using an unbalanced panel of Chinese A-share listed companies spanning 2006 to 2023, we identify a significant positive average treatment effect of AI adoption on the share of high-skilled labor (estimate: 0.118; 95% CI: [0.073, 0.163]), indicating that complementarity between AI and skilled workers dominates substitution at the firm level. Heterogeneity analysis reveals that the effect is stronger in manufacturing (0.183) than in services (0.071), and more pronounced in Eastern China (0.142) than in Central and Western regions (0.079). Quantile regression further shows that the complementarity effect intensifies at higher skill quantiles. A Panel Smooth Transition Regression (PSTR) model identifies a digitalization threshold beyond which AI–skill complementarity further strengthens. Mediation analysis confirms that productivity enhancement, digital transformation, and innovation activities together account for the majority of the total effect, with productivity improvement alone contributing approximately 34%. Placebo tests and propensity score weighting validate the robustness of our findings.
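
The cross-fitting and orthogonalization steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it runs on simulated data, and ordinary least squares stands in for the Lasso and Random Forest nuisance learners to keep the example dependency-free. The function names `simulate_plr` and `dml_plr` and all parameter values are hypothetical.

```python
import numpy as np


def simulate_plr(n=2000, theta=0.5, p=5, seed=0):
    """Toy partially linear data: y = theta*d + g(X) + e, d = m(X) + v (g, m linear here)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    d = X @ np.linspace(1.0, 0.2, p) + rng.normal(size=n)
    y = theta * d + X @ np.linspace(0.5, -0.5, p) + rng.normal(size=n)
    return y, d, X


def dml_plr(y, d, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta and its standard error."""
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    Xc = np.column_stack([np.ones(n), X])  # controls plus intercept
    y_res = np.empty(n)
    d_res = np.empty(n)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        # Nuisance fits use the other folds only (assumption: OLS replaces Lasso / RF).
        beta_y, *_ = np.linalg.lstsq(Xc[train], y[train], rcond=None)
        beta_d, *_ = np.linalg.lstsq(Xc[train], d[train], rcond=None)
        # Out-of-fold residuals of outcome and treatment.
        y_res[test_idx] = y[test_idx] - Xc[test_idx] @ beta_y
        d_res[test_idx] = d[test_idx] - Xc[test_idx] @ beta_d
    # Neyman-orthogonal moment: regress outcome residuals on treatment residuals.
    theta = (d_res @ y_res) / (d_res @ d_res)
    psi = d_res * (y_res - theta * d_res)
    se = np.sqrt(np.mean(psi**2) / n) / np.mean(d_res**2)
    return theta, se


if __name__ == "__main__":
    y, d, X = simulate_plr()
    theta_hat, se = dml_plr(y, d, X)
    print(f"theta_hat = {theta_hat:.3f} (se {se:.3f})")
```

Residualizing both the outcome and the treatment before the final regression is what makes the estimate insensitive, to first order, to errors in the nuisance fits. Evaluating each nuisance model only on data it was not trained on avoids overfitting bias.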

Journal Article

Leveraging LLM-based agents for social science research: insights from citation network simulations

by Ding, Bolin , Lin, Yankai , Ji, Jiarui in 4007/2801 , 4007/4009 , 4014/2801

2026

The emergence of Large Language Models (LLMs) demonstrates their potential to encapsulate the logic and patterns inherent in human behavior simulation by leveraging extensive web data pre-training. However, the boundaries of LLM capabilities in social simulation remain unclear. To further explore the social attributes of LLMs, we introduce the CiteAgent framework, designed to generate citation networks based on human-behavior simulation with LLM-based agents. CiteAgent successfully captures predominant phenomena in real-world citation networks, including power-law distribution, citational distortion, and shrinking diameter. Building on this realistic simulation, we establish two LLM-based research paradigms in social science: LLM-SE (LLM-based Survey Experiment) and LLM-LE (LLM-based Laboratory Experiment). These paradigms facilitate rigorous analyses of citation network phenomena, allowing us to validate and challenge existing theories. Additionally, we extend the research scope of traditional science of science studies through idealized social experiments, with the simulation experiment results providing valuable insights for real-world academic environments. Our work demonstrates the potential of LLMs for advancing science of science research in social science.

Journal Article

Effect of Fluorosilicone Rubber on Mechanical Properties, Dielectric Breakdown Strength and Hydrophobicity of Methyl Vinyl Silicone Rubber

by Wang, Zhaoyang , Lin, Yankai , Li, Zhanxu in Aging , Aluminum , Analysis

2023

Silicone rubber (SIR) is used in high-voltage insulators because of its insulating properties, and excellent hydrophobicity is very important in harsh outdoor environments. To enhance the hydrophobicity and low-temperature resistance of silicone rubber, methyl vinyl silicone rubber and fluorosilicone rubber (FSIR) blend composites with different ratios were prepared. The samples were characterized and analyzed using scanning electron microscopy, tensile testing, dynamic mechanical analysis and static contact angle testing. The results showed that SIR and FSIR were well compatible after blending. FSIR had a higher elastic modulus and reduced the tensile strength of the SIR/FSIR composites to some extent. The addition of a small amount of FSIR lowered the crystallization temperature from −30 to −45 °C, meaning that the low-temperature resistance was significantly improved. The breakdown strength of SIR/FSIR composites can still be maintained at a high level when a small amount of FSIR is added. The contact angle of the composites increased from 108.9 to 115.8° with increasing FSIR content, indicating enhanced hydrophobicity. When the samples were immersed in water for 96 h, hydrophobicity migration occurred. The static contact angle of the samples with lower FSIR content showed a weaker decreasing trend, indicating that their hydrophobicity was maintained at a high level.

Journal Article

Mechanical, Dielectric and Hydrophobic Properties of Phenyl Silicone Rubber and Methyl Vinyl Silicone Rubber Blend Composites

by Ao, Hui , Wang, Jian , Lin, Yankai in composite materials , dielectric materials , electric breakdown

2025

Silicone rubber (SIR) composite insulators are widely employed in electrical applications due to their exceptional chemical stability, low surface energy and superior electrical insulation properties. To enhance the hydrophobicity and low‐temperature resistance of SIR, blended composites with varying ratios of SIR and phenyl silicone rubber (PSIR) were fabricated. The study revealed that matrix‐filler network interactions between PSIR and fillers were weaker compared to those in SIR‐based systems. Increasing PSIR content led to reduced elongation at break in the composites, while tensile strength remained largely unchanged. Concurrently, the breakdown strength is inferior to that of pure PSIR composites. Notably, the blend of SIR and PSIR enhances both hydrophobicity and resistance to hydrophobicity migration. The composites exhibited a marginal increase in static contact angle, indicating enhanced hydrophobicity. This work provides a strategic approach for enhancing the performance of SIR composites suitable for applications in regions with high humidity and significant rainfall.

Journal Article

Lyapunov and Riccati Equations from a Positive System Perspective

by Lin, Yankai , Wu, Dongjun in Convergence , Linear systems , Observability (systems)

2025

This paper presents a new interpretation of the Lyapunov and Riccati equations from the perspective of positive system theory. We show it is possible to construct positive systems related to these equations, and then certain conclusions -- such as the existence and uniqueness of solutions -- can be drawn from positive systems theory. Specifically, under standard observability assumptions, a strictly positive linear system can be constructed for Lyapunov equations, leading to exponential convergence in Hilbert metric to the Perron-Frobenius vector -- closely related to the solution of the Lyapunov equation. For algebraic Riccati equations, homogeneous strictly positive systems can be constructed, which exhibit more complex dynamical behaviors. While the existence and uniqueness of the solution can still be proven, only asymptotic convergence can be obtained.

Paper

Online Convex Optimization Using Coordinate Descent Algorithms

by Shames, Iman , Lin, Yankai , Nešić, Dragan in Algorithms , Computational geometry , Convexity

2024

This paper considers the problem of online optimization where the objective function is time-varying. In particular, we extend coordinate descent type algorithms to the online case, where the objective function varies after a finite number of iterations of the algorithm. Instead of solving the problem exactly at each time step, we only apply a finite number of iterations at each time step. Commonly used notions of regret are used to measure the performance of the online algorithm. Moreover, coordinate descent algorithms with different updating rules are considered, including both deterministic and stochastic rules that are developed in the literature of classical offline optimization. A thorough regret analysis is given for each case. Finally, numerical simulations are provided to illustrate the theoretical results.
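
As a rough illustration of the setting this abstract describes, the sketch below applies a single coordinate update per time step to a time-varying quadratic objective and accumulates the regret against each step's minimizer. The quadratic loss, the one-update-per-step schedule, and the names `quad` and `online_coordinate_descent` are assumptions chosen for illustration, not details taken from the paper.

```python
import numpy as np


def quad(A, b, z):
    """Quadratic loss f(z) = 0.5 z^T A z - b^T z (A symmetric positive definite)."""
    return 0.5 * z @ A @ z - b @ z


def online_coordinate_descent(A, b_seq, x0, rule="cyclic", seed=0):
    """One exact coordinate minimization per time step on f_t(z) = quad(A, b_t, z).

    Returns the final iterate and the accumulated regret
    sum_t [f_t(x_t) - min_z f_t(z)].
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n = len(x)
    regret = 0.0
    for t, b in enumerate(b_seq):
        # The loss f_t is revealed after x_t has been chosen.
        x_star = np.linalg.solve(A, b)
        regret += quad(A, b, x) - quad(A, b, x_star)
        # Pick a coordinate: deterministic cyclic rule or uniform random rule.
        i = t % n if rule == "cyclic" else int(rng.integers(n))
        # Exact minimization of f_t along coordinate i, other coordinates fixed.
        x[i] = (b[i] - (A[i] @ x - A[i, i] * x[i])) / A[i, i]
    return x, regret


if __name__ == "__main__":
    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    # Slowly drifting linear term makes the objective time-varying.
    b_seq = [np.array([np.cos(0.01 * t), np.sin(0.01 * t)]) for t in range(500)]
    x, regret = online_coordinate_descent(A, b_seq, np.zeros(2))
    print(x, regret)
```

When the objective stops changing, the cyclic rule reduces to Gauss-Seidel iteration on `A x = b`, so the iterate converges to the fixed minimizer.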

Paper

DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards

by Lin, Yankai , Wu, Wei , Zhang, Kaiyi in Centroids , Discriminators , Effectiveness

2026

Reinforcement learning from verifiable rewards (RLVR) has emerged as a central technique for improving the reasoning capabilities of large language models. Despite its effectiveness, how response-level rewards translate into token-level probability changes remains poorly understood. We introduce a discriminator view of RLVR updates, showing that the policy-gradient update direction implicitly acts as a linear discriminator over token-gradient vectors and thereby determines which token probabilities are increased or decreased during learning. Under standard sequence-level RLVR, this discriminator is constructed from positive- and negative-side centroids formed by advantage-weighted averaging of token-gradient vectors. However, such centroid construction can be dominated by shared high-frequency patterns, such as formatting tokens, diluting sparse yet discriminative directions that better distinguish high-reward responses from low-reward ones. To address this limitation, we propose DelTA, a discriminative token credit assignment method that estimates token coefficients to amplify side-specific token-gradient directions and downweight shared or weakly discriminative ones. These coefficients reweight a self-normalized RLVR surrogate, making the effective side-wise centroids more contrastive and thereby reshaping the RLVR update direction. On seven mathematical benchmarks, DelTA outperforms the strongest same-scale baselines by 3.26 and 2.62 average points on Qwen3-8B-Base and Qwen3-14B-Base, respectively. Additional results on code generation, a different backbone, and out-of-domain evaluations further demonstrate the generalization ability of DelTA.

Paper

DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution

by Lin, Yankai , Ye, Xuyan , Fan, Shengda in Annotations , Artificial intelligence , Asymmetry

2026

Self-play with large language models has emerged as a promising paradigm for achieving self-improving artificial intelligence. However, existing self-play frameworks often suffer from optimization instability, due to (i) non-stationary objectives induced by solver-dependent reward feedback for the Questioner, and (ii) bootstrapping errors from self-generated pseudo-labels used to supervise the Solver. To mitigate these challenges, we introduce DARC (Decoupled Asymmetric Reasoning Curriculum), a two-stage framework that stabilizes the self-evolution process. First, we train the Questioner to synthesize difficulty-calibrated questions, conditioned on explicit difficulty levels and external corpora. Second, we train the Solver with an asymmetric self-distillation mechanism, where a document-augmented teacher generates high-quality pseudo-labels to supervise the student Solver that lacks document access. Empirical results demonstrate that DARC is model-agnostic, yielding an average improvement of 10.9 points across nine reasoning benchmarks and three backbone models. Moreover, DARC consistently outperforms all baselines and approaches the performance of fully supervised models without relying on human annotations. The code is available at https://github.com/RUCBM/DARC.

Paper

Modified Control Barrier Function for Quadratic Program Based Control Design via Sum-of-Squares Programming

by Lin, Yankai , Chong, Michelle S , Murguia, Carlos in Closed loops , Controllers , Functions (mathematics)

2025

We consider a nonlinear control affine system controlled by inputs generated by a quadratic program (QP) induced by a control barrier function (CBF). Specifically, we slightly modify the condition satisfied by CBFs and study how the modification can positively impact the closed-loop behavior of the system. We show that QP-based controllers designed using the modified CBF condition preserve the desired properties of QP-based controllers using standard CBF conditions. Furthermore, using the generalized S-procedure for polynomial functions, we formulate the design of the modified CBFs as a Sum-Of-Squares (SOS) program, which can be solved efficiently. Via a numerical example, the proposed CBF design is shown to have superior performance over the standard CBF widely used in existing literature.
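
For readers unfamiliar with QP-based CBF controllers, the following minimal sketch shows the standard construction only; the modified condition studied in the paper is not specified in this abstract and is not reproduced here. With a single constraint `Lfh + Lgh @ u + gamma * h >= 0`, the QP that minimizes `||u - u_nom||^2` has a closed-form solution, which the code uses instead of a QP solver. The scalar integrator example, the linear class-K function `gamma * h`, and all names are hypothetical.

```python
import numpy as np


def cbf_qp_filter(u_nom, Lfh, Lgh, h, gamma=1.0):
    """Closed-form solution of: min ||u - u_nom||^2  s.t.  Lfh + Lgh @ u + gamma*h >= 0."""
    slack = Lfh + gamma * h + Lgh @ u_nom
    if slack >= 0.0:
        return u_nom  # nominal input already satisfies the CBF condition
    denom = Lgh @ Lgh
    if denom == 0.0:
        raise ValueError("CBF constraint infeasible: Lgh is zero")
    # Project u_nom onto the boundary of the half-space defined by the constraint.
    return u_nom - (slack / denom) * Lgh


def simulate_integrator(x0=0.0, u_nom=2.0, gamma=1.0, dt=0.01, steps=1000):
    """Scalar system x' = u with safe set h(x) = 1 - x^2 >= 0 (forward Euler)."""
    x = x0
    h_min = 1.0 - x0**2
    for _ in range(steps):
        h = 1.0 - x**2
        # For x' = u: Lfh = 0 and Lgh = dh/dx = -2x.
        u = cbf_qp_filter(np.array([u_nom]), 0.0, np.array([-2.0 * x]), h, gamma)
        x = x + dt * u[0]
        h_min = min(h_min, 1.0 - x**2)
    return x, h_min


if __name__ == "__main__":
    x_final, h_min = simulate_integrator()
    print(x_final, h_min)
```

In this toy run the nominal input pushes the state toward the boundary `x = 1`; the filter leaves it unchanged while the constraint is inactive and scales it back as `h` approaches zero.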

Paper
