580 result(s) for "Pre-trained model"
TraceGuard: Fine-Tuning Pre-Trained Model by Using Stego Images to Trace Its User
Currently, a significant number of pre-trained models are published online to provide services to users owing to the rapid maturation and popularization of machine learning as a service (MLaaS). Some malicious users illegally redeploy pre-trained models to earn money. However, most current methods focus on verifying the copyright of the model rather than tracing responsibility for the suspect model. In this study, TraceGuard is proposed, the first steganography-based framework for tracing a suspect self-supervised learning (SSL) pre-trained model, to ascertain which authorized user illegally released the suspect model or whether the suspect model is independent. Concretely, the framework contains an encoder-decoder pair and the SSL pre-trained model. Initially, the base pre-trained model is frozen, and the encoder and decoder are jointly learned so that the two modules can embed a secret key into a cover image and extract the secret key from the embedding output by the base pre-trained model. Subsequently, the base pre-trained model is fine-tuned using stego images to embed a fingerprint while the encoder and decoder are frozen. To ensure the effectiveness and robustness of the fingerprint and the utility of fingerprinted pre-trained models, three alternating steps of model stealing simulation, fine-tuning for uniqueness, and fine-tuning for utility are designed. Finally, the suspect pre-trained model is traced to its user by querying it with stego images. Experimental results demonstrate that TraceGuard can reliably trace suspect models and is robust against common fingerprint removal attacks such as fine-tuning, pruning, and model stealing. In the future, we will further improve robustness against model stealing attacks.
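As a rough illustration of the tracing pipeline described above, the sketch below (not the authors' code) shows how an encoder could hide a per-user key in a cover image, how a decoder could recover the key from the fingerprinted model's embedding, and how a suspect model might be matched to a user by bit accuracy. The names StegoEncoder, KeyDecoder, and trace_suspect, the key length, embedding size, module architectures, and decision threshold are all hypothetical choices, and suspect_model stands in for any SSL backbone mapping images to embeddings.

```python
import torch
import torch.nn as nn

KEY_BITS, EMB_DIM = 32, 128  # assumed key length and embedding size

class StegoEncoder(nn.Module):
    """Hides a binary key in a cover image (toy convolutional encoder)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3 + KEY_BITS, 3, kernel_size=3, padding=1)

    def forward(self, cover, key):
        # Broadcast the key over spatial positions and predict a stego image.
        b, _, h, w = cover.shape
        key_map = key.view(b, KEY_BITS, 1, 1).expand(b, KEY_BITS, h, w)
        return self.net(torch.cat([cover, key_map], dim=1))

class KeyDecoder(nn.Module):
    """Recovers the key from the pre-trained model's embedding of a stego image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Linear(EMB_DIM, KEY_BITS)

    def forward(self, embedding):
        return torch.sigmoid(self.net(embedding))

def trace_suspect(suspect_model, encoder, decoder, cover, user_keys, threshold=0.9):
    """Query the suspect model with per-user stego images and match decoded bits."""
    with torch.no_grad():
        for user, key in user_keys.items():           # key: (1, KEY_BITS) of 0/1 floats
            stego = encoder(cover, key)
            decoded = decoder(suspect_model(stego))    # suspect_model: image -> (1, EMB_DIM)
            bit_acc = ((decoded > 0.5).float() == key).float().mean().item()
            if bit_acc >= threshold:
                return user       # fingerprint matches this authorized user
    return None                   # no match: treated as an independent model
```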
KitWaSor: Pioneering pre‐trained model for kitchen waste sorting with an innovative million‐level benchmark dataset
Intelligent sorting is an important prerequisite for the full quantitative consumption and harmless disposal of kitchen waste. Existing object detection methods based on ImageNet pre-trained models are an effective means of sorting. However, owing to the significant domain gap between natural images and kitchen waste images, an ImageNet pre-trained model struggles to capture the diverse scales and dense distribution characteristic of kitchen waste, leading to poor generalisation. In this article, the authors propose the first pre-trained model for kitchen waste sorting, called KitWaSor, which combines contrastive learning (CL) and masked image modelling (MIM) through self-supervised learning (SSL). First, to address the issue of diverse scales, the authors propose a mixed masking strategy by introducing an incomplete masking branch alongside the original random masking branch; it prevents the complete loss of small-scale objects while avoiding excessive leakage of large-scale object pixels. Second, to address the issue of dense distribution, the authors introduce semantic consistency constraints on top of the mixed masking strategy; that is, object semantic reasoning is performed through semantic consistency constraints to compensate for the lack of contextual information. To train KitWaSor, the authors construct the first million-level kitchen waste dataset spanning seasonal and regional distributions, named KWD-Million. Extensive experiments show that KitWaSor achieves state-of-the-art (SOTA) performance on the two most relevant downstream tasks for kitchen waste sorting (i.e. image classification and object detection), demonstrating the effectiveness of the proposed KitWaSor.
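A minimal sketch of what such a two-branch mixed masking strategy could look like is given below. The masking ratios, patch count, and branch layout are illustrative assumptions, not KitWaSor's actual implementation.

```python
import torch

def random_mask(num_patches, mask_ratio, batch):
    """Boolean mask (True = masked) with int(num_patches * mask_ratio) patches masked per sample."""
    n_mask = int(num_patches * mask_ratio)
    scores = torch.rand(batch, num_patches)
    ranks = scores.argsort(dim=1).argsort(dim=1)   # per-sample random ranks 0..num_patches-1
    return ranks < n_mask

def mixed_masks(num_patches=196, batch=8):
    # Branch 1: aggressive random masking, as in standard masked image modelling.
    full_mask = random_mask(num_patches, 0.75, batch)
    # Branch 2: "incomplete" masking with a lower ratio (value assumed), so that
    # small objects keep some visible patches instead of disappearing entirely.
    partial_mask = random_mask(num_patches, 0.25, batch)
    return full_mask, partial_mask
```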
Transfer learning data adaptation using conflation of low‐level textural features
Adapting the target dataset for a pre-trained model is still challenging. These adaptation problems result from inadequate transfer of traits from the source dataset; this often leads to poor model performance and trial and error in selecting the best-performing pre-trained model. This paper introduces the conflation of source-domain low-level textural features extracted using the first layer of the pre-trained model. The extracted features are compared to the conflated low-level features of the target dataset to select a higher-quality target dataset for improved pre-trained model performance and adaptation. After comparing various probability distance metrics, Kullback-Leibler divergence is adopted to compare the samples from both domains. We experiment on three publicly available datasets and two ImageNet pre-trained models used in past studies for results comparison. The proposed approach yields two categories of target samples, with those having lower Kullback-Leibler values giving better accuracy, precision, and recall. The samples with lower Kullback-Leibler values give an accuracy margin of 0.22%–9.15%, leading to better model adaptation and an easier model selection process for the target transfer learning datasets and tasks. Adapting the target dataset for a pre-trained model is still challenging due to poor source knowledge transfer in the target domain. This paper introduces the conflation of low-level textural features in the source and target domains of the pre-trained model, allowing the selection of a higher-quality target dataset for improved pre-trained model performance and adaptation. The proposed approach uses target samples with lower Kullback-Leibler values, giving better accuracy, precision, and recall with an accuracy margin of 6.21% to 7.27%, leading to better model adaptation for target transfer learning datasets and tasks.
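The selection idea can be sketched as follows, assuming first-layer activations are summarised as histograms and compared with Kullback-Leibler divergence. The bin count, smoothing constant, activation range, divergence direction, and threshold are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def feature_histogram(activations, bins=64, value_range=(0.0, 1.0)):
    """Normalised histogram of first-layer activations (assumed scaled to [0, 1])."""
    hist, _ = np.histogram(activations, bins=bins, range=value_range)
    hist = hist.astype(np.float64) + 1e-8          # smoothing avoids log(0)
    return hist / hist.sum()

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return float(np.sum(p * np.log(p / q)))

def select_target_samples(source_activations, target_activations_per_sample, threshold):
    """Keep target samples whose low-level feature distribution stays close to the source."""
    source_hist = feature_histogram(source_activations)
    kept = []
    for i, acts in enumerate(target_activations_per_sample):
        if kl_divergence(feature_histogram(acts), source_hist) < threshold:
            kept.append(i)
    return kept
```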
Review of Deep Learning for Language Modeling
[Purpose/Significance] Deep learning for language modeling is one of the major methods and advanced technologies for enhancing the language intelligence of machines at present; it has become an indispensable technical means for the automatic processing and analysis of data resources and the intelligent mining of information and knowledge. However, there are still difficulties in using deep learning for language modeling for technology development and application services in the library and information science (LIS) field. Therefore, this study systematically reviews the research progress, technical principles, and development methods of deep learning for language modeling, with the aim of providing a reliable theoretical basis and feasible methodological paths for librarians and fellow practitioners to deeply understand and apply it. [Method/Process] The data used in this study were collected from the WOS core database, the CNKI literature database, the arXiv preprint repository, the GitHub open-source software hosting platform, and open resources on the Internet. Based on these data, this paper first systematically investigates the background, basic feature representation algorithms, and representative application development tools of deep learning for language modeling, reveals their evolution and technical principles, and analyzes the advantages, disadvantages, and applicability of each algorithm model and development tool. Second, an in-depth analysis of the challenging problems faced by the development and application of deep learning for language modeling was performed, and two strategic approaches to expanding its application capabilities were put forward. [Results/Conclusions] The important challenges faced by the application and development of deep learning for language modeling include the large number of parameters and the difficulty of tuning for accuracy, the reliance on large amounts of accurate training data, the difficulty of making changes, and intellectual property and information security issues. In the future, we will start from two aspects, specific domains and feature engineering, to expand and improve the application capabilities of deep learning for language modeling. Specifically, we focus on the collection and preparation of domain data, the selection of model architecture, the participation of domain experts, and optimization for specific tasks, in order to ensure that the model's data sources are more reliable and secure and its application results are more accurate and practical. Moreover, the strategic methods of feature engineering for expanding the application capabilities of deep learning for language modeling include selecting appropriate features, feature pre-processing, feature selection, and feature dimensionality reduction. These strategies can help improve the performance and efficiency of deep learning language models, making them more suitable for specific tasks or domains. To sum up, LIS institutions should leverage technologies related to deep learning for language modeling, guided by the needs of scientific research and social development and building on the advantages of existing literature data resources and knowledge services; they should carry out innovative professional or vertical-domain intelligent knowledge management and application services, and develop technology and systems with independent intellectual property rights, which is their path to long-term sustainable development.
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey
With the urgent demand for generalized deep models, many pre-trained big models have been proposed, such as bidirectional encoder representations from transformers (BERT), the vision transformer (ViT), and generative pre-trained transformers (GPT). Inspired by the success of these models in single domains (like computer vision and natural language processing), multi-modal pre-trained big models have also drawn more and more attention in recent years. In this work, we give a comprehensive survey of these models and hope this paper provides new insights and helps new researchers track the most cutting-edge work. Specifically, we first introduce the background of multi-modal pre-training by reviewing conventional deep learning and pre-training work in natural language processing, computer vision, and speech. Then, we introduce the task definition, key challenges, and advantages of multi-modal pre-training models (MM-PTMs), and discuss MM-PTMs with a focus on data, objectives, network architectures, and knowledge-enhanced pre-training. After that, we introduce the downstream tasks used for the validation of large-scale MM-PTMs, including generative, classification, and regression tasks. We also give visualizations and analysis of the model parameters and results on representative downstream tasks. Finally, we point out possible research directions for this topic that may benefit future work. In addition, we maintain a continuously updated paper list for large-scale pre-trained multi-modal big models: https://github.com/wangxiao5791509/MultiModal_BigModels_Survey.
Two‐Stage Early Exiting From Globality Towards Reliability
Early exiting has shown significant potential in accelerating the inference of pre-trained language models (PLMs) by allowing easy samples to exit from shallow layers. However, existing early exiting methods primarily rely on local information from individual samples to estimate prediction uncertainty for making exiting decisions, overlooking the global information provided by the sample population. This impacts the estimation of prediction uncertainty, compromising the reliability of exiting decisions. To remedy this, inspired by principal component analysis (PCA), the authors define a residual score to capture the deviation of features from the principal space of the sample population, providing a global perspective for estimating prediction uncertainty. Building on this, a two-stage exiting strategy is proposed that integrates global information from residual scores with local information from energy scores at both the decision and feature levels. This strategy incorporates three-way decisions to enable more reliable exiting decisions for boundary-region samples by delaying judgement. Extensive experiments on the GLUE benchmark validate that the method achieves an average speed-up ratio of 2.17× across all tasks with minimal performance degradation. Additionally, it surpasses the state-of-the-art E-LANG by 11% in model acceleration, along with a performance improvement of 0.6 points, demonstrating a better performance-efficiency trade-off.
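A toy version of the residual score is sketched below: the principal subspace is fitted on a population of hidden features, and a new feature is scored by the norm of its component outside that subspace. Combining it with an energy threshold is shown only schematically; the paper's actual two-stage, three-way decision rule is richer, and the function names and thresholds here are assumptions.

```python
import numpy as np

def fit_principal_space(features, k):
    """features: (N, D) hidden states from the sample population; keep the top-k directions."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:k]                            # (D,), (k, D)

def residual_score(x, mean, components):
    """Norm of x's component outside the principal subspace (a global uncertainty signal)."""
    centred = x - mean
    projection = components.T @ (components @ centred)
    return float(np.linalg.norm(centred - projection))

def should_exit(x, mean, components, energy, energy_thresh, residual_thresh):
    # Exit early only when the local (energy) and global (residual) signals both agree;
    # otherwise the sample is deferred to deeper layers (the "delayed judgement" case).
    return energy < energy_thresh and residual_score(x, mean, components) < residual_thresh
```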
Multi-Stage Prompt Tuning for Political Perspective Detection in Low-Resource Settings
Political perspective detection in news media (identifying political bias in news articles) is an essential but challenging low-resource task. Prompt-based learning (i.e., discrete prompting and prompt tuning) achieves promising results in low-resource scenarios by adapting a pre-trained model to handle new tasks. However, these approaches suffer performance degradation when the target task involves a textual domain (e.g., the political domain) different from the pre-training task (e.g., masked language modeling on a general corpus). In this paper, we develop a novel multi-stage prompt tuning framework for political perspective detection. Our method involves two sequential stages: a domain-specific prompt tuning stage and a task-specific prompt tuning stage. In the first stage, we tune domain-specific prompts on a masked political phrase prediction (MP3) task to adapt the language model to the political domain. In the second, task-specific prompt tuning stage, we tune only the task-specific prompts for downstream tasks, with the language model and domain-specific prompts frozen. The experimental results demonstrate that our method significantly outperforms fine-tuning (i.e., model tuning) methods and state-of-the-art prompt tuning methods on the SemEval-2019 Task 4: Hyperpartisan News Detection and AllSides datasets.
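The two-stage setup can be pictured schematically as below. The prompt lengths, hidden size, optimiser settings, and the with_prompts helper are assumptions for illustration; the frozen language model itself and the MP3 objective are omitted.

```python
import torch
import torch.nn as nn

HIDDEN, N_PROMPTS = 768, 10                      # assumed hidden size and prompt length
domain_prompts = nn.Parameter(torch.randn(N_PROMPTS, HIDDEN) * 0.02)
task_prompts = nn.Parameter(torch.randn(N_PROMPTS, HIDDEN) * 0.02)

def with_prompts(input_embeds, *prompt_sets):
    """Prepend soft prompts to a batch of token embeddings of shape (B, L, H)."""
    b = input_embeds.size(0)
    prompts = torch.cat([p.unsqueeze(0).expand(b, -1, -1) for p in prompt_sets], dim=1)
    return torch.cat([prompts, input_embeds], dim=1)

# Stage 1: only the domain prompts receive gradients while the language model stays
# frozen; they are tuned on a domain objective such as masked political phrase prediction.
stage1_optimizer = torch.optim.AdamW([domain_prompts], lr=1e-3)

# Stage 2: the domain prompts are frozen as well, and only the task prompts are tuned
# on the downstream perspective-detection task.
domain_prompts.requires_grad_(False)
stage2_optimizer = torch.optim.AdamW([task_prompts], lr=1e-3)
```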
Foundation and large language models: fundamentals, challenges, opportunities, and social impacts
Foundation and Large Language Models (FLLMs) are models trained on massive amounts of data with the intent of performing a variety of downstream tasks. FLLMs are very promising drivers for different domains, such as Natural Language Processing (NLP) and other AI-related applications. These models emerged as a result of the AI paradigm shift involving the use of pre-trained language models (PLMs) and extensive data to train transformer models. FLLMs have also demonstrated impressive proficiency in addressing a wide range of NLP applications, including language generation, summarization, comprehension, complex reasoning, and question answering, among others. In recent years, there has been unprecedented interest in FLLM-related research, driven by contributions from both academic institutions and industry players. Notably, the development of ChatGPT, a highly capable AI chatbot built around FLLM concepts, has garnered considerable interest from various segments of society. The technological advancement of large language models (LLMs) has had a significant influence on the broader artificial intelligence (AI) community, potentially transforming the processes involved in the development and use of AI systems. Our study provides a comprehensive survey of existing resources related to the development of FLLMs and addresses current concerns, challenges, and social impacts. Moreover, we emphasize the current research gaps and potential future directions in this emerging and promising field.
Bilingual phrase induction with local hard negative sampling
Bilingual lexicon induction focuses on learning word translation pairs, also known as bitexts, from monolingual corpora by establishing a mapping between the source and target embedding spaces. Despite recent advancements, bilingual lexicon induction is limited to inducing bitexts consisting of individual words, lacking the ability to handle semantics‐rich phrases. To bridge this gap and support downstream cross‐lingual tasks, it is practical to develop a method for bilingual phrase induction that extracts bilingual phrase pairs from monolingual corpora without relying on cross‐lingual knowledge. In this paper, the authors propose a novel phrase embedding training method based on the skip‐gram structure. Specifically, a local hard negative sampling strategy that utilises negative samples of central tokens in sliding windows to enhance phrase embedding learning is introduced. The proposed method achieves competitive or superior performance compared to baseline approaches, with exceptional results recorded for distant languages. Additionally, we develop a phrase representation learning method that leverages multilingual pre‐trained language models. These mPLMs‐based representations can be combined with the above‐mentioned static phrase embeddings to further improve the accuracy of the bilingual phrase induction task. We manually construct a dataset of bilingual phrase pairs and integrate it with MUSE to facilitate the bilingual phrase induction task.
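A simplified sketch of skip-gram training with locally drawn hard negatives is shown below. The exact way negatives are selected from central tokens of nearby sliding windows is an assumption about the method, and skipgram_loss and local_hard_negatives are illustrative helpers, not the authors' code.

```python
import torch
import torch.nn.functional as F

def skipgram_loss(center_vec, context_vecs, negative_vecs):
    """Negative-sampling skip-gram loss; center_vec: (D,), context/negative_vecs: (*, D)."""
    positive = F.logsigmoid(context_vecs @ center_vec).sum()
    negative = F.logsigmoid(-(negative_vecs @ center_vec)).sum()
    return -(positive + negative)

def local_hard_negatives(token_ids, position, window=5, k=5):
    """Candidate negatives drawn locally from the same sentence: tokens that act as
    central tokens of other sliding windows, excluding the current positive context."""
    context = set(range(max(0, position - window), position + window + 1))
    candidates = [t for i, t in enumerate(token_ids) if i not in context]
    return candidates[:k]
```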
Revisiting Class-Incremental Learning with Pre-Trained Models: Generalizability and Adaptivity are All You Need
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Contrary to traditional methods, PTMs possess generalizable embeddings, which can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transfer. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifiers of the PTM to prototype features, can beat the state of the art even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, the PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and adapted models for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, holding the advantages of the PTM's generalizability and the adapted model's adaptivity. (3) Additionally, considering that previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper within a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
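The SimpleCIL-style baseline can be sketched in a few lines: a frozen backbone extracts embeddings, each new class's classifier weight is set to the mean (prototype) of its training embeddings, and prediction picks the nearest prototype by cosine similarity. The PrototypeClassifier interface and the cosine scoring below are assumptions for illustration, not the repository's code.

```python
import torch
import torch.nn.functional as F

class PrototypeClassifier:
    def __init__(self):
        self.prototypes = {}                      # class id -> (D,) prototype vector

    def add_classes(self, backbone, images_by_class):
        """Incrementally register new classes without any gradient updates."""
        with torch.no_grad():
            for cls, images in images_by_class.items():
                feats = backbone(images)          # (N, D) embeddings from the frozen PTM
                self.prototypes[cls] = F.normalize(feats.mean(dim=0), dim=0)

    def predict(self, backbone, images):
        """Nearest-prototype classification by cosine similarity."""
        with torch.no_grad():
            feats = F.normalize(backbone(images), dim=1)                    # (B, D)
            classes = list(self.prototypes)
            weights = torch.stack([self.prototypes[c] for c in classes])    # (C, D)
            scores = feats @ weights.T                                      # cosine scores
            return [classes[i] for i in scores.argmax(dim=1).tolist()]
```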