Catalogue Search | MBRL

Studying word meaning evolution through incremental semantic shift detection

by Montanelli, Stefano; Tahmasebi, Nina; Ferrara, Alfio in Availability, Changes, Clustering

2025

The study of semantic shift, that is, of how words change meaning as a consequence of social practices, events and political circumstances, is relevant in Natural Language Processing, Linguistics, and Social Sciences. The increasing availability of large diachronic corpora and advances in computational semantics have accelerated the development of computational approaches to detecting such shifts. In this paper, we introduce a novel approach to tracing the evolution of word meaning over time. Our analysis focuses on gradual changes in word semantics and relies on an incremental approach to semantic shift detection (SSD) called What is Done is Done (WiDiD). WiDiD leverages scalable and evolutionary clustering of contextualised word embeddings to detect semantic shift and capture temporal transactions in word meanings. Existing approaches to SSD: (a) significantly simplify the semantic shift problem to cover change between two (or a few) time points, and (b) consider the existing corpora as static. We instead treat SSD as an organic process in which word meanings evolve across tens or even hundreds of time periods as the corpus is progressively made available. This results in an extremely demanding task that entails a multitude of intricate decisions. We demonstrate the applicability of this incremental approach on a diachronic corpus of Italian parliamentary speeches spanning eighteen distinct time periods. We also evaluate its performance on seven popular labelled benchmarks for SSD across multiple languages. Empirical results show that our approach performs comparably to state-of-the-art approaches and outperforms them for certain languages.

Journal Article


Sense through time: diachronic word sense annotations for word sense induction and Lexical Semantic Change Detection

by Bravo-Marquez, Felipe; Arefyev, Nikolay; Schlechtweg, Dominik in Annotations, Change detection, Changes

2025

There has been extensive work on human word sense annotation, i.e., manually labeling word uses in natural texts according to their senses. Such labels were primarily created for the tasks of Word Sense Disambiguation (WSD) and Word Sense Induction (WSI). However, almost all datasets annotated with word senses are synchronic datasets, i.e., contain texts created in a relatively short period of time and often do not provide the creation date of the texts. This ignores possible applications in diachronic-historic settings, where the aim is to induce or disambiguate historical word senses or changes in senses across time. To facilitate investigations into historical WSD and WSI and to establish connections with the task of Lexical Semantic Change Detection (LSCD), there is a crucial need for historical word sense-annotated data. Hence, we created a new reliable diachronic WSD/WSI dataset ‘DWUG DE Sense’. We describe the preparation and annotation and analyze central statistics. We then describe a thorough evaluation of different prediction systems for jointly solving both WSI and LSCD tasks. All our systems are based on a state-of-the-art architecture that combines Word-in-Context models and graph clustering techniques with different hyperparameter settings. Our findings reveal that using the WSI task as optimization criterion yields better results for both tasks even when the LSCD task is the focal point of optimization. This underscores that although both tasks are related, WSI seems to be more general and able to incorporate the LSCD task.

Journal Article


Lexico-Semantic Change in the Kazakh Language of the COVID Era

by Kazanbayeva, Ainagul; Samenova, Sveta; Tursunova, Markhaba in Change agents, Color, Communicable Diseases

2023

This article discusses lexical and semantic changes during the COVID-19 pandemic. We describe semantic shifts, new concepts, and neologisms associated with the pandemic based on the results of an associative survey. A total of 142 respondents voluntarily participated in our online survey. The term ‘coronavirus’ was taken as the stimulus word, and respondents were asked which colour and number they associate with it. The results show that the stimulus ‘coronavirus’ activates the colours red, green, black, blue and yellow, only very weakly evokes brown, white, gold, purple and colourless, and is frequently associated with the number 19. The results also indicate that during the COVID-19 pandemic, negative meanings of colour vocabulary were actualized (except for green, which began to symbolize safety), and that numbers and some new concepts with a nonpositive colouring appeared.

Journal Article


Improving semantic change analysis by combining word embeddings and word frequencies

by Böhm, Klemens; Englhardt, Adrian; Schäler, Martin in Algorithms, Analysis, Change detection

2020

Language is constantly evolving. As part of diachronic linguistics, semantic change analysis examines how the meanings of words evolve over time. Such semantic awareness is important for retrieving content from digital libraries. Recent research on semantic change analysis relying on word embeddings has yielded significant improvements over previous work. However, a recent but so far somewhat neglected observation is that the rate of semantic shift negatively correlates with word-usage frequency. In this article, we therefore propose SCAF, Semantic Change Analysis with Frequency. It abstracts from the concrete embeddings and includes word frequencies as an orthogonal feature. SCAF allows using different combinations of embedding type, optimization algorithm and alignment method. Additionally, we leverage existing approaches for time series analysis by using change detection methods to identify semantic shifts. In an evaluation with a realistic setup, SCAF achieves better detection rates than prior approaches: 95% instead of 51%. On the Google Books Ngram data set, our approach detects both known and previously unknown shifts for popular words.

Journal Article


Semantic change computation: A successive approach

by Chen, Xiaohe; Tang, Xuri; Qu, Weiguang in Categories, Coining, Computation

2016

The prevalence of creativity in the emergent online media language calls for more effective computational approaches to semantic change. Two divergent metaphysical understandings of the task are found: the juxtaposition-view of change and the succession-view of change. This paper argues that the succession-view better reflects the essence of semantic change and proposes a successive framework for automatic semantic change detection. The framework analyzes semantic change at both the word level and the individual-sense level inside a word by transforming the task into change pattern detection over time series data. At the word level, the framework models the word’s semantic change with an S-shaped model and successfully correlates change patterns with classical semantic change categories such as broadening, narrowing, new word coining, metaphorical change, and metonymic change. At the sense level, the framework measures the conventionality of individual senses and distinguishes categories of temporary word usage, basic sense, novel sense and disappearing sense, again with an S-shaped model. Experiments at both levels yield an increased precision rate compared with the baseline, supporting the succession-view of semantic change.

Journal Article


Natural Language Processing Reveals Vulnerable Mental Health Support Groups and Heightened Health Anxiety on Reddit During COVID-19: Observational Study

by Torous, John; Rumker, Laurie; Talkar, Tanya in Adolescent, Adult, Anxiety

2020

The COVID-19 pandemic is impacting mental health, but it is not clear how people with different types of mental health problems were differentially impacted as the initial wave of cases hit. The aim of this study is to leverage natural language processing (NLP) with the goal of characterizing changes in 15 of the world's largest mental health support groups (eg, r/schizophrenia, r/SuicideWatch, r/Depression) found on the website Reddit, along with 11 non-mental health groups (eg, r/PersonalFinance, r/conspiracy) during the initial stage of the pandemic. We created and released the Reddit Mental Health Dataset including posts from 826,961 unique users from 2018 to 2020. Using regression, we analyzed trends from 90 text-derived features such as sentiment analysis, personal pronouns, and semantic categories. Using supervised machine learning, we classified posts into their respective support groups and interpreted important features to understand how different problems manifest in language. We applied unsupervised methods such as topic modeling and unsupervised clustering to uncover concerns throughout Reddit before and during the pandemic. We found that the r/HealthAnxiety forum showed spikes in posts about COVID-19 early on in January, approximately 2 months before other support groups started posting about the pandemic. There were many features that significantly increased during COVID-19 for specific groups including the categories "economic stress," "isolation," and "home," while others such as "motion" significantly decreased. We found that support groups related to attention-deficit/hyperactivity disorder, eating disorders, and anxiety showed the most negative semantic change during the pandemic out of all mental health groups. Health anxiety emerged as a general theme across Reddit through independent supervised and unsupervised machine learning analyses.
For instance, we provide evidence that the concerns of a diverse set of individuals are converging in this unique moment of history; we discovered that the more users posted about COVID-19, the more linguistically similar (less distant) the mental health support groups became to r/HealthAnxiety (ρ=-0.96, P<.001). Using unsupervised clustering, we found the suicidality and loneliness clusters more than doubled in the number of posts during the pandemic. Specifically, the support groups for borderline personality disorder and posttraumatic stress disorder became significantly associated with the suicidality cluster. Furthermore, clusters surrounding self-harm and entertainment emerged. By using a broad set of NLP techniques and analyzing a baseline of prepandemic posts, we uncovered patterns of how specific mental health problems manifest in language, identified at-risk users, and revealed the distribution of concerns across Reddit, which could help provide better resources to its millions of users. We then demonstrated that textual analysis is sensitive to uncover mental health complaints as they appear in real time, identifying vulnerable groups and alarming themes during COVID-19, and thus may have utility during the ongoing pandemic and other world-changing events such as elections and protests.

Journal Article


Spatial-Temporal Semantic Perception Network for Remote Sensing Image Semantic Change Detection

by He, You; Zhang, Hanchao; Zhang, Ruiqian in Accuracy, Algorithms, Artificial intelligence

2023

Semantic change detection (SCD) is a challenging task in remote sensing, which aims to locate and identify changes between bi-temporal images, providing detailed “from-to” change information. This information is valuable for various remote sensing applications. Recent studies have shown that multi-task networks, with dual segmentation branches and a single change branch, are effective in SCD tasks. However, these networks primarily focus on extracting contextual information and ignore spatial details, resulting in missed or false detection of small targets and inaccurate boundaries. To address these limitations, this paper proposes a spatial-temporal semantic perception network (STSP-Net) for SCD. It effectively utilizes spatial detail information through the detail-aware path (DAP) and generates spatial-temporal semantic-perception features by combining them with deep contextual features. Meanwhile, the network enhances the representation of semantic features in spatial and temporal dimensions by leveraging a spatial attention fusion module (SAFM) and a temporal refinement detection module (TRDM). This augmentation results in improved sensitivity to details and adaptive performance balancing between semantic segmentation (SS) and change detection (CD). In addition, by incorporating the invariant consistency loss function (ICLoss), the proposed method constrains the consistency of land cover (LC) categories in invariant regions, thereby improving the accuracy and robustness of SCD. Comparative experimental results on three SCD datasets demonstrate the superiority of the proposed method, which outperforms other methods on various evaluation metrics. Sek improvements of 2.84%, 1.63%, and 0.78% are observed on the three datasets, respectively.

Journal Article


Tracking the dynamics of divergent thinking via semantic distance: Analytic methods and theoretical implications

by Hass, Richard W. in Adult, Averages, Behavioral Science and Psychology

2017

Divergent thinking has often been used as a proxy measure of creative thinking, but this practice lacks a foundation in modern cognitive psychological theory. This article addresses several issues with the classic divergent-thinking methodology and presents a new theoretical and methodological framework for cognitive divergent-thinking studies. A secondary analysis of a large dataset of divergent-thinking responses is presented. Latent semantic analysis was used to examine the potential changes in semantic distance between responses and the concept represented by the divergent-thinking prompt across successive response iterations. The results of linear growth modeling showed that although there is some linear increase in semantic distance across response iterations, participants high in fluid intelligence tended to give more distant initial responses than those with lower fluid intelligence. Additional analyses showed that the semantic distance of responses significantly predicted the average creativity rating given to the response, with significant variation in average levels of creativity across participants. Finally, semantic distance does not seem to be related to participants’ choices of their own most creative responses. Implications for cognitive theories of creativity are discussed, along with the limitations of the methodology and directions for future research.
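
The notion of semantic distance between a response and the prompt concept can be illustrated with a minimal sketch. It assumes that each text is already represented as a numeric vector and that distance is taken as one minus cosine similarity; the abstract does not state the exact measure used, and the three-dimensional vectors below are hypothetical stand-ins for latent semantic analysis vectors.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)


# Hypothetical vectors: one for the prompt concept, three for successive responses.
prompt = [1.0, 0.0, 0.0]
responses = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

# Distance of each successive response from the prompt concept.
distances = [cosine_distance(prompt, r) for r in responses]
```

In this toy example the distances grow from the first response to the third, which is the kind of trend a growth model over response iterations would pick up.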

Journal Article


SMNet: Symmetric Multi-Task Network for Semantic Change Detection in Remote Sensing Images Based on CNN and Transformer

by Niu, Yiting; Lu, Jun; Ding, Lei in Accuracy, Artificial intelligence, Artificial neural networks

2023

Deep learning has achieved great success in remote sensing image change detection (CD). However, most methods focus only on the changed regions of images and cannot accurately identify their detailed semantic categories. In addition, most CD methods using convolutional neural networks (CNN) have difficulty capturing sufficient global information from images. To address these issues, we propose a novel symmetric multi-task network (SMNet) that integrates global and local information for semantic change detection (SCD). Specifically, we employ a hybrid unit consisting of pre-activated residual blocks (PR) and transformation blocks (TB) to construct the (PRTB) backbone, which obtains more abundant semantic features with local and global information from bi-temporal images. To accurately capture fine-grained changes, the multi-content fusion module (MCFM) is introduced, which effectively enhances change features by distinguishing foreground and background information in complex scenes. In addition, multi-task prediction branches are adopted, and a multi-task loss function is used to jointly supervise model training to improve the performance of the network. Extensive experimental results on the challenging SECOND and Landsat-SCD datasets demonstrate that our SMNet obtains 71.95% and 85.65% mean Intersection over Union (mIoU), respectively. In addition, the proposed SMNet achieves 20.29% and 51.14% Separated Kappa coefficient (Sek) on the SECOND and Landsat-SCD datasets, respectively. These results demonstrate the effectiveness and superiority of the proposed method.
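
For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of the generic per-class definition: intersection over union for each class, averaged over the classes that occur. It is not the paper's evaluation code, the exact protocol used for SECOND and Landsat-SCD is not given in this record, and the labels below are hypothetical.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Average per-class IoU over classes present in the labels or predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels where both label and prediction are class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Pixels where either label or prediction is class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:  # skip classes that never occur
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps with three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

score = mean_iou(y_true, y_pred, num_classes=3)
print(f"mIoU = {score:.2f}")  # per-class IoUs are 1/3, 2/3 and 1/2, so this prints "mIoU = 0.50"
```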

Journal Article


Electrophysiological Evidence for Top-Down Lexical Influences on Early Speech Perception

by Getz, Laura M.; Toscano, Joseph C. in Acoustics, Ambiguity, Amplitude (Acoustics)

2019

An unresolved issue in speech perception concerns whether top-down linguistic information influences perceptual responses. We addressed this issue using the event-related-potential technique in two experiments that measured cross-modal sequential-semantic priming effects on the auditory N1, an index of acoustic-cue encoding. Participants heard auditory targets (e.g., “potatoes”) following associated visual primes (e.g., “MASHED”), neutral visual primes (e.g., “FACE”), or a visual mask (e.g., “XXXX”). Auditory targets began with voiced (/b/, /d/, /g/) or voiceless (/p/, /t/, /k/) stop consonants, an acoustic difference known to yield differences in N1 amplitude. In Experiment 1 (N = 21), semantic context modulated responses to upcoming targets, with smaller N1 amplitudes for semantic associates. In Experiment 2 (N = 29), semantic context changed how listeners encoded sounds: Ambiguous voice-onset times were encoded similarly to the voicing end point elicited by semantic associates. These results are consistent with an interactive model of spoken-word recognition that includes top-down effects on early perception.

Journal Article

