Structural–Semantic Term Weighting for Interpretable Topic Modeling with Higher Coherence and Lower Token Overlap
by Konnikov, Evgenii; Yakob, Polina; Golikov, Gleb; Rodionov, Dmitriy
in Bibliometrics / Coherence / coherence value / Data mining / Embedding / Large language models / large sparse text corpora / Latent Dirichlet Allocation (LDA) / Linear algebra / Modelling / News / Qualitative analysis / Russian language / Semantics / structural–semantic term weighting / Subject specialists / topic modeling / Weighting
2026
Journal Article
Overview
Topic modeling of large news streams is widely used to reconstruct economic and political narratives, which requires coherent topics with low lexical overlap while remaining interpretable to domain experts. We propose TF-SYN-NER-Rel, a structural–semantic term weighting scheme that extends classical TF-IDF by integrating positional, syntactic, factual, and named-entity coefficients derived from morphosyntactic and dependency parses of Russian news texts. The method is embedded into a standard Latent Dirichlet Allocation (LDA) pipeline and evaluated on a large Russian-language news corpus from the online archive of Moskovsky Komsomolets (over 600,000 documents), with political, financial, and sports subsets obtained via dictionary-based expert labeling. For each subset, TF-SYN-NER-Rel is compared with standard TF-IDF under identical LDA settings, and topic quality is assessed using the C_v coherence metric. To assess robustness, we repeat model training across multiple random initializations and report aggregate coherence statistics. Quantitative results show that TF-SYN-NER-Rel improves coherence and yields smoother, more stable coherence curves across the number of topics. Qualitative analysis indicates reduced lexical overlap between topics and clearer separation of event-centered and institutional themes, especially in political and financial news. Overall, the proposed pipeline relies on CPU-based NLP tools and sparse linear algebra, providing a computationally lightweight and interpretable complement to embedding- and LLM-based topic modeling in large-scale news monitoring.
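The abstract does not give the TF-SYN-NER-Rel formula, so the following is only a minimal sketch of the general idea it describes: scaling each token occurrence by structural coefficients before the usual TF-IDF computation. The function `weighted_tfidf`, the `toy_coeff` rule, the `ENTITIES` set, and the specific multipliers (1.5 for a leading position, 2.0 for a named entity) are all hypothetical illustrations, not the authors' coefficients.

```python
import math
from collections import Counter

def weighted_tfidf(docs, coeff):
    """Toy coefficient-weighted TF-IDF.

    docs  : list of token lists
    coeff : function (token, position, doc_length) -> float multiplier
            applied to each token occurrence (hypothetical interface).
    Returns one {token: weight} dict per document.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    rows = []
    for tokens in docs:
        # Sum the coefficient over occurrences instead of counting them as 1.
        weighted_tf = Counter()
        for pos, tok in enumerate(tokens):
            weighted_tf[tok] += coeff(tok, pos, len(tokens))
        row = {}
        for tok, wtf in weighted_tf.items():
            # Smoothed IDF: ln((1 + N) / (1 + df)) + 1.
            idf = math.log((1 + n_docs) / (1 + df[tok])) + 1.0
            row[tok] = (wtf / len(tokens)) * idf
        rows.append(row)
    return rows

# Hypothetical stand-in for named-entity recognition output.
ENTITIES = {"bank"}

def toy_coeff(tok, pos, doc_len):
    """Assumed rule: boost the first token and any named entity."""
    w = 1.0
    if pos == 0:
        w *= 1.5  # positional boost (assumed value)
    if tok in ENTITIES:
        w *= 2.0  # named-entity boost (assumed value)
    return w

docs = [
    ["bank", "raises", "rate", "bank"],
    ["team", "wins", "match"],
    ["bank", "team", "sponsor"],
]

plain = weighted_tfidf(docs, lambda tok, pos, n: 1.0)  # classical TF-IDF
boosted = weighted_tfidf(docs, toy_coeff)              # coefficient-weighted
```

With a constant coefficient of 1.0 the function reduces to ordinary TF-IDF, so the two result sets can be compared term by term. The resulting sparse weights could then be passed to an LDA implementation in place of raw counts, which is the role the abstract assigns to the weighting scheme.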
Publisher
MDPI AG