Catalogue Search | MBRL

An innovative study of art teaching strategies for preschool children based on big data analysis

by Zhao, Junchao , Tao, Zhengwei in 97C40 , Big Data , F values

2024

This paper examines the relationship between art teaching strategies and the development of innovative thinking in preschool children. Using a sample-control research design, two middle classes of a kindergarten were selected as the experimental and control classes. A pre-test of innovative thinking (drawing) showed no significant differences between the classes on five dimensions: originality, fluency, delicacy, title abstraction, and resistance to premature closure (contemplation). The children were then tested for changes on these five dimensions after the art teaching was carried out. The k-means algorithm from big data analysis was also used to help analyze the effect of art instruction on innovativeness in preschool children. In the comparison of between-group and within-group differences, the analysis revealed a significant difference between the control and experimental classes in originality (F = 6.65) and a borderline significant difference in title abstraction (F = 3.94). In the time dimension, the F values for fluency, originality, title abstraction, delicacy, and resistance to premature closure (contemplation) were 13.40, 17.84, 3.57, 21.04, and 14.60, respectively. Scores on all five of these dimensions were significantly higher in the post-test than in the pre-test, and the dimensions of innovativeness were significant for both time and class interactions. Thus, there is a correlation between art instruction and the development of innovative thinking in preschool children.

Journal Article

Modeling the connection between oil painting creation and university students’ inspiration in the context of “Internet+”

by Zhao, Junchao , Sahir, Safrizal bin , Tao, Zhengwei in 68T01 , Algorithms , Convolutional neural network

2024

In the Internet era, oil painting creation requires constant observation and learning before inspiration can arise, and the inspiration for oil painting creation is like a godsend. This paper constructs a fast Fourier transform convolutional model, based on the fast Fourier transform and a convolutional neural network, to study the connection between oil painting creation and university students’ inspiration in the context of “Internet+”. The model was evaluated algorithmically, and a dataset of oil paintings was built to investigate the influence of hue and brushwork technique on students’ inspiration. In the algorithm evaluation, the model requires 20.08G of floating-point computation compared with 34.76G for the traditional time-domain convolution algorithm, a reduction of nearly 15G that greatly improves the model’s running speed. In the hue examples, the percentages of blue-gray, pale white, and light blue oil paintings are 34.95%, 18.59%, and 10.73%, respectively. Different hues bring students different emotional expressions and induce their creative inspiration. In terms of brushwork, the average percentages of the six brushwork techniques are 16.53%, 22.21%, 15.68%, 11.73%, 14.69%, and 19.16%, respectively. The fast Fourier transform convolution model established against the background of “Internet+” can effectively analyze the different tones and brushstroke techniques in oil paintings, so that students can feel the emotion and vitality of the works while observing them, which promotes the generation of inspiration for oil painting creation among college students.

Journal Article
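
The abstract above contrasts time-domain convolution with convolution carried out through the fast Fourier transform. The following is a minimal one-dimensional sketch of that general idea, not the paper's model: it assumes plain Python lists as inputs, and the function names `fft`, `direct_convolve`, and `fft_convolve` are illustrative choices rather than anything taken from the paper.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def direct_convolve(signal, kernel):
    """Time-domain linear convolution: one multiply-add per (signal, kernel) pair."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def fft_convolve(signal, kernel):
    """Linear convolution via the convolution theorem: multiply the spectra."""
    out_len = len(signal) + len(kernel) - 1
    size = 1
    while size < out_len:  # zero-pad to a power of two to avoid wrap-around
        size *= 2
    a = fft([complex(x) for x in signal] + [0j] * (size - len(signal)))
    b = fft([complex(x) for x in kernel] + [0j] * (size - len(kernel)))
    product = [x * y for x, y in zip(a, b)]
    result = fft(product, inverse=True)
    # The inverse transform above is unnormalized, so divide by the size here.
    return [v.real / size for v in result[:out_len]]


if __name__ == "__main__":
    print(direct_convolve([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0]))
    print(fft_convolve([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0]))
```

Both functions return the same values up to floating-point rounding. The direct version scales with the product of the two lengths, while the FFT version scales with n log n, which is the usual source of savings for large kernels. For the figures quoted in the abstract, going from 34.76G to 20.08G is a reduction of 14.68G, roughly 42%.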

Learning to Evolve: Bayesian-Guided Continual Knowledge Graph Embedding

by He, Yuanpeng , Zhang, Yichi , Zhang, Xuan in Bayesian analysis , Clustering , Digital media

2026

As social media and the World Wide Web become hubs for information dissemination, effectively organizing and understanding the vast amounts of dynamically evolving Web content is crucial. Knowledge graphs (KGs) provide a powerful framework for structuring this information. However, the rapid emergence of new hot topics, user relationships, and events in social media renders traditional static knowledge graph embedding (KGE) models rapidly outdated. Continual Knowledge Graph Embedding (CKGE) aims to address this issue, but existing methods commonly suffer from catastrophic forgetting, whereby older, but still valuable, information is lost when learning new knowledge (such as new memes or trending events). This means the model cannot effectively learn the evolution of the data. We propose a novel CKGE framework, BAKE. Unlike existing methods, BAKE formulates CKGE as a sequential Bayesian inference problem and utilizes the Bayesian posterior update principle as a natural continual learning strategy. This principle is insensitive to data order and provides theoretical guarantees to preserve prior knowledge as much as possible. Specifically, we treat each batch of new data as a Bayesian update to the model's prior. By maintaining the posterior distribution, the model effectively preserves earlier knowledge even as it evolves over multiple snapshots. Furthermore, to constrain the evolution of knowledge across snapshots, we introduce a continual clustering method that maintains the compact cluster structure of entity embeddings through a regularization term, ensuring semantic consistency while allowing controlled adaptation to new knowledge. We conduct extensive experiments on multiple CKGE benchmarks, which demonstrate that BAKE achieves the top performance in the vast majority of cases compared to existing approaches.

Paper
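
The abstract above says that a Bayesian posterior update is insensitive to data order and treats each new batch as an update to the model's prior. A toy sketch of that principle follows, assuming a single scalar parameter with a Gaussian prior and Gaussian noise of known precision. This is a textbook conjugate update, not the BAKE model itself, and the names `GaussianBelief`, `bayes_update`, and `learn_sequentially` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GaussianBelief:
    mean: float
    precision: float  # 1 / variance


def bayes_update(prior, batch, noise_precision=1.0):
    """Conjugate Gaussian update: the posterior after one batch of observations."""
    post_precision = prior.precision + len(batch) * noise_precision
    post_mean = (
        prior.precision * prior.mean + noise_precision * sum(batch)
    ) / post_precision
    return GaussianBelief(post_mean, post_precision)


def learn_sequentially(prior, snapshots):
    """Use each posterior as the prior for the next snapshot."""
    belief = prior
    for batch in snapshots:
        belief = bayes_update(belief, batch)
    return belief


if __name__ == "__main__":
    prior = GaussianBelief(mean=0.0, precision=1.0)
    snapshots = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
    print(learn_sequentially(prior, snapshots))
    print(learn_sequentially(prior, list(reversed(snapshots))))
```

In this toy setting the final belief is the same whether the snapshots arrive in the original order, in reverse, or all at once, because the posterior depends only on the prior and on the count and sum of the observations.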

EvolSQL: Structure-Aware Evolution for Scalable Text-to-SQL Data Synthesis

by Zhou, Xiansheng , Gao, Jianling , Tao, Chongyang in Complexity , Datasets , Query languages

2026

Training effective Text-to-SQL models remains challenging due to the scarcity of high-quality, diverse, and structurally complex datasets. Existing methods either rely on limited human-annotated corpora, or synthesize datasets directly by simply prompting LLMs without explicit control over SQL structures, often resulting in limited structural diversity and complexity. To address this, we introduce EvolSQL, a structure-aware data synthesis framework that evolves SQL queries from seed data into richer and more semantically diverse forms. EvolSQL starts with an exploratory Query-SQL expansion to broaden question diversity and improve schema coverage, and then applies an adaptive directional evolution strategy using six atomic transformation operators derived from the SQL Abstract Syntax Tree to progressively increase query complexity across relational, predicate, aggregation, and nesting dimensions. An execution-grounded SQL refinement module and schema-aware deduplication further ensure the creation of high-quality, structurally diverse mapping pairs. Experimental results show that a 7B model fine-tuned on our data outperforms one trained on the much larger SynSQL dataset using only 1/18 of the data.

Paper

ODA: Observation-Driven Agent for integrating LLMs and Knowledge Graphs

by Li, Youdi , Sun, Lei , Arakawa, Hiroshi in Graphs , Knowledge representation , Large language models

2024

The integration of Large Language Models (LLMs) and knowledge graphs (KGs) has achieved remarkable success in various natural language processing tasks. However, existing methodologies that integrate LLMs and KGs often navigate the task-solving process solely based on the LLM's analysis of the question, overlooking the rich cognitive potential inherent in the vast knowledge encapsulated in KGs. To address this, we introduce Observation-Driven Agent (ODA), a novel AI agent framework tailored for tasks involving KGs. ODA incorporates KG reasoning abilities via global observation, which enhances reasoning capabilities through a cyclical paradigm of observation, action, and reflection. Confronting the exponential explosion of knowledge during observation, we innovatively design a recursive observation mechanism. Subsequently, we integrate the observed knowledge into the action and reflection modules. Through extensive experiments, ODA demonstrates state-of-the-art performance on several datasets, notably achieving accuracy improvements of 12.87% and 8.9%.

Paper

RAGShaper: Eliciting Sophisticated Agentic RAG Skills via Automated Data Synthesis

by Li, Bo , Zhang, Huanyao , Yan, Guochen in Annotations , Bridge failure , Cognition

2026

Agentic Retrieval-Augmented Generation (RAG) empowers large language models to autonomously plan and retrieve information for complex problem-solving. However, the development of robust agents is hindered by the scarcity of high-quality training data that reflects the noise and complexity of real-world retrieval environments. Conventional manual annotation is unscalable and often fails to capture the dynamic reasoning strategies required to handle retrieval failures. To bridge this gap, we introduce RAGShaper, a novel data synthesis framework designed to automate the construction of RAG tasks and robust agent trajectories. RAGShaper incorporates an InfoCurator to build dense information trees enriched with adversarial distractors spanning Perception and Cognition levels. Furthermore, we propose a constrained navigation strategy that forces a teacher agent to confront these distractors, thereby eliciting trajectories that explicitly demonstrate error correction and noise rejection. Comprehensive experiments confirm that models trained on our synthesized corpus significantly outperform existing baselines, exhibiting superior robustness in noise-intensive and complex retrieval tasks.

Paper

LLMs are Also Effective Embedding Models: An In-depth Overview

by Shen, Tao , Li, Zhen , Hua, Kai in Effectiveness , Efficiency , Embedding

2025

Large language models (LLMs) have revolutionized natural language processing by achieving state-of-the-art performance across various tasks. Recently, their effectiveness as embedding models has gained attention, marking a paradigm shift from traditional encoder-only models like ELMo and BERT to decoder-only, large-scale LLMs such as GPT, LLaMA, and Mistral. This survey provides an in-depth overview of this transition, beginning with foundational techniques before the LLM era, followed by LLM-based embedding models through two main strategies to derive embeddings from LLMs. 1) Direct prompting: We mainly discuss the prompt designs and the underlying rationale for deriving competitive embeddings. 2) Data-centric tuning: We cover extensive aspects that affect tuning an embedding model, including model architecture, training objectives, data constructions, etc. Upon the above, we also cover advanced methods for producing embeddings from longer texts, multilingual, code, cross-modal data, as well as reasoning-aware and other domain-specific scenarios. Furthermore, we discuss factors affecting choices of embedding models, such as performance/efficiency comparisons, dense vs sparse embeddings, pooling strategies, and scaling law. Lastly, the survey highlights the limitations and challenges in adapting LLMs for embeddings, including cross-task embedding quality, trade-offs between efficiency and accuracy, low-resource, long-context, data bias, robustness, etc. This survey serves as a valuable resource for researchers and practitioners by synthesizing current advancements, highlighting key challenges, and offering a comprehensive framework for future work aimed at enhancing the effectiveness and efficiency of LLMs as embedding models.

Paper

Rethinking Regularization Methods for Knowledge Graph Completion

by He, Yuanpeng , Li, Jiandong , Zhang, Xuan in Knowledge representation , Regularization , Regularization methods

2025

Knowledge graph completion (KGC) has attracted considerable attention in recent years because it is critical to improving the quality of knowledge graphs. Researchers have continuously explored various models. However, most previous efforts have neglected to take advantage of regularization from a deeper perspective, so these models have not been used to their full potential. This paper rethinks the application of regularization methods in KGC. Through extensive empirical studies on various KGC models, we find that carefully designed regularization not only alleviates overfitting and reduces variance but also enables these models to break through the upper bounds of their original performance. Furthermore, we introduce a novel sparse-regularization method (SPR) that embeds the concept of rank-based selective sparsity into the KGC regularizer. The core idea is to selectively penalize those components with significant features in the embedding vector, thus effectively ignoring many components that contribute little and may only represent noise. Various comparative experiments on multiple datasets and multiple models show that the SPR regularization method is better than other regularization methods and can enable KGC models to further break through their performance margins.

Paper
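
The abstract above describes penalizing only the significant components of an embedding vector and ignoring those that contribute little. The abstract does not give the formula, so the following is only a sketch of one plausible reading: components are ranked by absolute value and only the largest `keep_top` of them enter the penalty. The function name `selective_sparsity_penalty` and the parameters `keep_top` and `power` are assumptions made for illustration, not the paper's definitions.

```python
def selective_sparsity_penalty(embedding, keep_top, power=3):
    """Penalty over the keep_top largest-magnitude components only.

    Assumption: "rank-based" is read here as ranking components by |x|;
    the remaining components are left out of the regularizer entirely.
    """
    magnitudes = sorted((abs(x) for x in embedding), reverse=True)
    return sum(m ** power for m in magnitudes[:keep_top])


if __name__ == "__main__":
    vector = [0.1, -2.0, 0.05, 1.0]
    # Only the two largest-magnitude components (-2.0 and 1.0) are penalized.
    print(selective_sparsity_penalty(vector, keep_top=2, power=2))
    # Penalizing every component gives the ordinary dense penalty for comparison.
    print(selective_sparsity_penalty(vector, keep_top=len(vector), power=2))
```

Under this reading, small components such as 0.1 and 0.05 receive no gradient from the regularizer, which is one way to ignore components that may only represent noise.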

Enhancing LLM Generation with Knowledge Hypergraph for Evidence-Based Medicine

by Zhang, Ying , Zhao, Haiyan , Dou, Chengfeng in Evidence-based medicine , Graph theory , Graphs

2025

Evidence-based medicine (EBM) plays a crucial role in the application of large language models (LLMs) in healthcare, as it provides reliable support for medical decision-making processes. Although it benefits from current retrieval-augmented generation (RAG) technologies, it still faces two significant challenges: the collection of dispersed evidence and the efficient organization of this evidence to support the complex queries necessary for EBM. To tackle these issues, we propose using LLMs to gather scattered evidence from multiple sources and present a knowledge hypergraph-based evidence management model to integrate this evidence while capturing intricate relationships. Furthermore, to better support complex queries, we have developed an Importance-Driven Evidence Prioritization (IDEP) algorithm that utilizes the LLM to generate multiple evidence features, each with an associated importance score, which are then used to rank the evidence and produce the final retrieval results. Experimental results from six datasets demonstrate that our approach outperforms existing RAG techniques in application domains of interest to EBM, such as medical quizzing, hallucination detection, and decision support. Test sets and the constructed knowledge graph can be accessed at https://drive.google.com/file/d/1WJ9QTokK3MdkjEmwuFQxwH96j_Byawj_/view?usp=drive_link (https://drive.google.com/rag4ebm).

Paper

PROPHET: An Inferable Future Forecasting Benchmark with Causal Intervened Likelihood Estimation

by Li, Jia , Bai, Xiaoying , Li, Linyu in Artificial intelligence , Benchmarks , Cognition & reasoning

2026

Predicting future events based on news on the Web stands as one of the ultimate aspirations of artificial intelligence. Recent advances in large language model (LLM)-based systems have shown remarkable potential in forecasting future events, thereby garnering significant interest in the research community. Currently, several benchmarks have been established to evaluate forecasting capabilities by formalizing event prediction as a retrieval-augmented generation (RAG)-and-reasoning task. In these benchmarks, each prediction question is answered with relevant retrieved news articles downloaded from the Web. However, because there is no consideration of whether the questions can be supported by valid or sufficient rationales, some of the questions in these benchmarks may be inherently noninferable. To address this issue, we introduce a new benchmark, PROPHET, which comprises inferable forecasting questions paired with relevant news for retrieval. To ensure the inferability of the benchmark, we propose Causal Intervened Likelihood (CIL), a statistical measure that assesses inferability through causal inference. In constructing this benchmark, we first collected recent trend forecasting questions and then filtered the data using CIL, resulting in an inferable benchmark for future forecasting. Through extensive experiments, we first demonstrate the validity of CIL and conduct in-depth investigations into future forecasting with the aid of CIL. Subsequently, we evaluate several representative prediction methods on PROPHET. The overall results offer valuable insights for future directions on this task.

Paper
