Catalogue Search | MBRL
50,618 result(s) for "topic modeling"
A survey on neural topic models: methods, applications, and challenges
by Wu, Xiaobao; Luu, Anh Tuan; Nguyen, Thong
in Algorithms, Application, Artificial Intelligence
2024
Topic models have been prevalent for decades to discover latent topics and infer topic proportions of documents in an unsupervised fashion. They have been widely used in various applications like text analysis and context recommendation. Recently, the rise of neural networks has facilitated the emergence of a new research field—neural topic models (NTMs). Different from conventional topic models, NTMs directly optimize parameters without requiring model-specific derivations. This endows NTMs with better scalability and flexibility, resulting in significant research attention and plentiful new methods and applications. In this paper, we present a comprehensive survey on neural topic models concerning methods, applications, and challenges. Specifically, we systematically organize current NTM methods according to their network structures and introduce the NTMs for various scenarios like short texts and cross-lingual documents. We also discuss a wide range of popular applications built on NTMs. Finally, we highlight the challenges confronted by NTMs to inspire future research.
Journal Article
Short text topic modelling approaches in the context of big data: taxonomy, survey, and analysis
2023
Social media platforms such as Twitter, Facebook, and Weibo are being increasingly embraced by individuals, groups, and organizations as a valuable source of information. This social-media-generated information comes in the form of tweets or posts and is normally characterized as short, voluminous, sparse, and low-density text. Since many real-world applications need semantic interpretation of such short texts, research in Short Text Topic Modeling (STTM) has recently gained a lot of interest to reveal unique and cohesive latent topics. This article examines the current state of the art in STTM algorithms. It presents a comprehensive survey and taxonomy of STTM algorithms. The article also includes a qualitative and quantitative study of the STTM algorithms, as well as analyses of the various strengths and drawbacks of STTM techniques. Moreover, a comparative analysis of the topic quality and performance of representative STTM models is presented. The performance evaluation is conducted on two real-world Twitter datasets, the Real-World Pandemic Twitter (RW-Pand-Twitter) dataset and the Real-world Cyberbullying Twitter (RW-CB-Twitter) dataset, in terms of several metrics such as topic coherence, purity, NMI, and accuracy. Finally, the open challenges and future research directions in this promising field are discussed to highlight the trends of research in STTM. The work presented in this paper is useful for researchers interested in learning state-of-the-art short text topic modelling and researchers focusing on developing new algorithms for short text topic modelling.
Journal Article
A Review of Best Practice Recommendations for Text Analysis in R (and a User-Friendly App)
by Banks, George C.; Ross, Roxanne L.; Wesslen, Ryan S.
in Behavioral Science and Psychology, Best practice, Business and Management
2018
In recent decades, the amount of text available for organizational science research has grown tremendously. Despite the availability of text and advances in text analysis methods, many of these techniques remain largely segmented by discipline. Moreover, there is an increasing number of open-source tools (R, Python) for text analysis, yet these tools are not easily taken advantage of by social science researchers who likely have limited programming knowledge and exposure to computational methods. In this article, we compare quantitative and qualitative text analysis methods used across social sciences. We describe basic terminology and the overlooked, but critically important, steps in pre-processing raw text (e.g., selection of stop words; stemming). Next, we provide an exploratory analysis of open-ended responses from a prototypical survey dataset using topic modeling with R. We provide a list of best practice recommendations for text analysis focused on (1) hypothesis and question formation, (2) design and data collection, (3) data pre-processing, and (4) topic modeling. We also discuss the creation of scale scores for more traditional correlation and regression analyses. All the data are available in an online repository for the interested reader to practice with, along with a reference list for additional reading, an R markdown file, and an open source interactive topic model tool (topicApp; see https://github.com/wesslen/topicApp, https://github.com/wesslen/text-analysis-org-science, https://dataverse.unc.edu/dataset.xhtml?persistentId=doi:10.15139/S3/R4W7ZS).
Journal Article
A novel multiple kernel fuzzy topic modeling technique for biomedical data
2022
Background
Text mining in the biomedical field has received much attention and is regarded as an important research area, since a lot of biomedical data is in text format. Topic modeling is one of the popular text mining methods used to discover hidden semantic structures, so-called topics. However, discovering topics from biomedical data is a challenging task due to its sparsity, redundancy, and unstructured format.
Methods
In this paper, we proposed a novel multiple kernel fuzzy topic modeling (MKFTM) technique using fusion probabilistic inverse document frequency and multiple kernel fuzzy c-means clustering algorithm for biomedical text mining. In detail, the proposed fusion probabilistic inverse document frequency method is used to estimate the weights of global terms while MKFTM generates frequencies of local and global terms with bag-of-words. In addition, the principal component analysis is applied to eliminate higher-order negative effects for term weights.
Results
Extensive experiments are conducted on six biomedical datasets. MKFTM achieved the highest classification accuracies of 99.04%, 99.62%, 99.69%, and 99.61% on the Muchmore Springer dataset and 94.10%, 89.45%, 92.91%, and 90.35% on the Ohsumed dataset. The CH index value of MKFTM is higher, which shows that its clustering performance is better than that of state-of-the-art topic models.
Conclusion
The results confirm that the proposed MKFTM approach handles the sparsity and redundancy problems in biomedical text documents very efficiently. MKFTM discovers semantically relevant topics with high accuracy for biomedical documents. It gives better results for classification and clustering in biomedical documents. MKFTM is a new approach to topic modeling, which has the flexibility to work with a variety of clustering methods.
Journal Article
Past, present and future of research in relationship marketing - a machine learning perspective
by Sharma, Anuj; Kumar, Satish; Das, Kallol
in Brand loyalty, Business to business commerce, Customer relationship management
2022
Purpose: This paper aims to take stock of research done in the domain of relationship marketing (RM). Additionally, this article aims to identify the potential areas of future research. Design/methodology/approach: The authors have used machine learning-based structural topic modelling using R-software to analyse the dataset of 1,905 RM articles published between 1978 and 2020. Findings: Structural topic modeling (STM) analysis led to identifying 14 topics, out of which 7 (viz. customer loyalty, customer relationship management systems, interfirm and network relationships, relationship selling, services and relationship management, consumer brand relationships and relationship marketing research) have shown a rising trend. The study also proposes a taxonomical framework to summarize RM research. Originality/value: This is the first comprehensive review of RM research spanning over more than four decades. The study’s insights would benefit future scholars of this field to plan/execute their research for greater publication success. Additionally, managers could use the practical implications for achieving better RM outcomes.
Journal Article
Development of a Japanese version of the Awe Experience Scale (AWE-S): A structural topic modeling approach version 2; peer review: 2 approved
2023
Background: Awe, a complex emotion, arises in response to perceptually and conceptually vast stimuli that transcend one's current frames of reference, and is associated with subjective psychological phenomena, such as a sense of self and consciousness. This study aimed to develop a Japanese version of the Awe Experience Scale (AWE-S), a widely used questionnaire that robustly measures the state of awe, and simultaneously investigated how the multiple facets of awe relate to the narrative representations of awe experiences. Methods: The Japanese AWE-S was created via back-translation, and its factor structure and validity were investigated through an online survey in Japan. Results: The results revealed that the Japanese AWE-S consisted of the same six factors as the original (i.e., time, self-loss, connectedness, vastness, physiological, and accommodation) and had sufficient internal consistency, test-retest reliability, construct validity, and also Japan-specific characteristics. The structural topic modeling generated seven potential topics of the descriptions of awe experiences, which were differently associated with each factor of the Japanese AWE-S. Conclusions: Our findings contribute to a deeper understanding of awe and reveal the constructs of awe in Japan through cross-cultural comparisons. Furthermore, this study provides conceptual and methodological implications regarding studies on awe.
Journal Article
Evaluation of unsupervised static topic models’ emergence detection ability
by Groth, Paul; Sitruk, Jonathan; Li, Xue
in Artificial Intelligence, Computational linguistics, Data Mining and Machine Learning
2025
Detecting emerging topics is crucial for understanding research trends, technological advancements, and shifts in public discourse. While unsupervised topic modeling techniques such as Latent Dirichlet allocation (LDA), BERTopic, and CoWords clustering are widely used for topic extraction, their ability to retrospectively detect emerging topics without relying on ground truth labels has not been systematically compared. This gap largely stems from the lack of a dedicated evaluation metric for measuring emergence detection. In this study, we introduce a quantitative evaluation metric to assess the effectiveness of topic models in detecting emerging topics. We evaluate three topic modeling approaches using both qualitative analysis and our proposed emergence detection metric. Our results indicate that, qualitatively, CoWords identifies emerging topics earlier than LDA and BERTopic. Quantitatively, our evaluation metric demonstrates that LDA achieves an average F1 score of 80.6% in emergence detection, outperforming BERTopic by 24.0%. These findings highlight the strengths and limitations of different topic models for emergence detection, while our proposed metric provides a robust framework for future benchmarking in this area.
Journal Article
Web content topic modeling using LDA and HTML tags
by Altarturi, Hamza H.M.; Saadoon, Muntadher; Anuar, Nor Badrul
in Analysis, Computational linguistics, Data mining
2023
An immense volume of digital documents exists online and offline with content that can offer useful information and insights. Utilizing topic modeling enhances the analysis and understanding of digital documents. Topic modeling discovers latent semantic structures or topics within a set of digital textual documents. The Internet of Things, Blockchain, recommender system, and search engine optimization applications use topic modeling to handle data mining tasks, such as classification and clustering. The usefulness of topic models depends on the quality of the resulting term patterns and topics. Topic coherence is the standard metric to measure the quality of topic models. Previous studies generally build topic models to work on conventional documents, and these models underperform when applied to web content data due to differences in the structure of conventional and HTML documents. Neglecting the unique structure of web content leads to missing otherwise coherent topics and, therefore, low topic quality. This study aims to propose an innovative topic model to learn coherent topics in web content data. We present the HTML Topic Model (HTM), a web content topic model that takes into consideration the HTML tags to understand the structure of web pages. We conducted two series of experiments to demonstrate the limitations of the existing topic models and examine the topic coherence of the HTM against the widely used Latent Dirichlet Allocation (LDA) model and its variants, namely the Correlated Topic Model, the Dirichlet Multinomial Regression, the Hierarchical Dirichlet Process, the Hierarchical Latent Dirichlet Allocation, the pseudo-document based Topic Model, and the Supervised Latent Dirichlet Allocation models. The first experiment demonstrates the limitations of the existing topic models when applied to web content data and, therefore, the essential need for a web content topic model. When applied to web data, the overall performance dropped an average of five times and, in some cases, up to approximately 20 times lower than when applied to conventional data. The second experiment then evaluates the effectiveness of the HTM model in discovering topics and term patterns of web content data. The HTM model achieved an overall 35% improvement in topic coherence compared to the LDA.
Journal Article
A Topic Modelling Analysis of Living Labs Research
2018
This study applies topic modelling analysis on a corpus of 86 publications in the Technology Innovation Management Review (TIM Review) to understand how the phenomenon of living labs has been approached in the recent innovation management literature. Although the analysis is performed on a corpus collected from only one journal, the TIM Review has published the largest number of special issues on living labs to date, thus it reflects the advancement of the area in the scholarly literature. According to the analysis, research approaches to living labs can be categorized under seven broad topics: 1) Design, 2) Ecosystem, 3) City, 4) University, 5) Innovation, 6) User, and 7) Living lab. Moreover, each topic includes a set of characteristic subtopics. A trend analysis suggests that the emphasis of research on living labs is moving away from a conceptual focus on what living labs are and who is involved in their ecosystems to practical applications of how to design and manage living labs, their processes, and participants, especially users, as key stakeholders and in novel application areas such as the urban city context.
Journal Article
Mapping metaverse industrial architecture using LDA and bibliometrics based on technology news framing
2024
Enabling the public to grasp the state and trends of emerging technologies facilitates technology adoption, socioeconomic investment, and business growth. Science journalism is crucial in connecting the scientific community with the public. This study focuses on the metaverse due to its rapid expansion. To validate knowledge extraction, we analyze relevant metaverse news from TechCrunch.com, covering 2020 to 2023, to build the domain knowledge schema. This study introduces a novel approach that combines Latent Dirichlet Allocation (LDA) topic modeling with bibliometrics as a computational intelligent method to discover topics and construct knowledge based on technology news framing. LDA is used to identify the topics in metaverse news, while a bibliometrics method, i.e., co-word networks analysis, clarifies term association strength and visualizes the findings. The study outlines the seven key elements highlighted in technology journalism, which shape public perception and expectations of new technologies. These elements offer insights into the factors influencing the development, diffusion, and adoption of new technologies. The extracted representative terms and enterprises from eight topics help map the knowledge architecture diagram for the domain. The research contributes to helping stakeholders systematically understand metaverse technology topics and demonstrate collaborative partnerships between enterprises.
Journal Article