Catalogue Search | MBRL

Large network community detection by fast label propagation

by Šubelj, Lovro , Traag, Vincent A. in 639/705/1041 , 639/705/1042 , 639/705/117

2023

Many networks exhibit some community structure. There exists a wide variety of approaches to detect communities in networks, each offering different interpretations and associated algorithms. For large networks, there is the additional requirement of speed. In this context, the so-called label propagation algorithm (LPA) was proposed, which runs in near-linear time. In partitions uncovered by LPA, each node is ensured to have most links to its assigned community. We here propose a fast variant of LPA (FLPA) that is based on processing a queue of nodes whose neighbourhood recently changed. We test FLPA exhaustively on benchmark networks and empirical networks, finding that it can run up to 700 times faster than LPA. In partitions found by FLPA, we prove that each node is again guaranteed to have most links to its assigned community. Our results show that FLPA is generally preferable to LPA.

Journal Article

Share this book

Add to My Shelf

Clustering Scientific Publications Based on Citation Relations: A Systematic Comparison of Different Methods

by van Eck, Nees Jan , Waltman, Ludo , Šubelj, Lovro in Algorithms , Bibliographic coupling , Bibliometrics

2016

Clustering methods are applied regularly in the bibliometric literature to identify research areas or scientific fields. These methods are for instance used to group publications into clusters based on their relations in a citation network. In the network science literature, many clustering methods, often referred to as graph partitioning or community detection techniques, have been developed. Focusing on the problem of clustering the publications in a citation network, we present a systematic comparison of the performance of a large number of these clustering methods. Using a number of different citation networks, some of them relatively small and others very large, we extensively study the statistical properties of the results provided by different methods. In addition, we also carry out an expert-based assessment of the results produced by different methods. The expert-based assessment focuses on publications in the field of scientometrics. Our findings seem to indicate that there is a trade-off between different properties that may be considered desirable for a good clustering of publications. Overall, map equation methods appear to perform best in our analysis, suggesting that these methods deserve more attention from the bibliometric community.

Journal Article

Share this book

Add to My Shelf

Survey on graph embeddings and their applications to machine learning problems on graphs

by Subelj, Lovro , Nikitinsky, Nikita , Makarov, Ilya in Algorithms , Artificial Intelligence , Classification

2021

Dealing with relational data always required significant computational resources, domain expertise and task-dependent feature engineering to incorporate structural information into a predictive model. Nowadays, a family of automated graph feature engineering techniques has been proposed in different streams of literature. So-called graph embeddings provide a powerful tool to construct vectorized feature spaces for graphs and their components, such as nodes, edges and subgraphs under preserving inner graph properties. Using the constructed feature spaces, many machine learning problems on graphs can be solved via standard frameworks suitable for vectorized feature representation. Our survey aims to describe the core concepts of graph embeddings and provide several taxonomies for their description. First, we start with the methodological approach and extract three types of graph embedding models based on matrix factorization, random-walks and deep learning approaches. Next, we describe how different types of networks impact the ability of models to incorporate structural and attributed data into a unified embedding. Going further, we perform a thorough evaluation of graph embedding applications to machine learning problems on graphs, among which are node classification, link prediction, clustering, visualization, compression, and a family of the whole graph embedding algorithms suitable for graph classification, similarity and alignment problems. Finally, we overview the existing applications of graph embeddings to computer science domains, formulate open problems and provide experiment results, explaining how different networks properties result in graph embeddings quality in the four classic machine learning problems on graphs, such as node classification, link prediction, clustering and graph visualization. As a result, our survey covers a new rapidly growing field of network feature engineering, presents an in-depth analysis of models based on network types, and overviews a wide range of applications to machine learning problems on graphs.

Journal Article

Share this book

Add to My Shelf

War pact model of shrinking networks

by Šubelj, Lovro , Naglić, Luka in Algorithms , Armed Conflicts , Biology and Life Sciences

2019

Many real systems can be described by a set of interacting entities forming a complex network. To some surprise, these have been shown to share a number of structural properties regardless of their type or origin. It is thus of vital importance to design simple and intuitive models that can explain their intrinsic structure and dynamics. These can, for instance, be used to study networks analytically or to construct networks not observed in real life. Most models proposed in the literature are of two types. A model can be either static, where edges are added between a fixed set of nodes according to some predefined rule, or evolving, where the number of nodes or edges increases over time. However, some real networks do not grow but rather shrink, meaning that the number of nodes or edges decreases over time. We here propose a simple model of shrinking networks called the war pact model. We show that networks generated in such a way exhibit common structural properties of real networks. Furthermore, compared to classical models, these resemble international trade, correlates of war, Bitcoin transactions and other networks more closely. Network shrinking may therefore represent a reasonable explanation of the evolution of some networks and greater emphasis should be put on such models in the future.

Journal Article

Share this book

Add to My Shelf

Network-based statistical comparison of citation topology of bibliographic databases

by Bajec, Marko , Šubelj, Lovro , Fiala, Dalibor in 639/705/1042 , 639/705/1046 , 639/766/25

2014

Modern bibliographic databases provide the basis for scientific research and its evaluation. While their content and structure differ substantially, there exist only informal notions on their reliability. Here we compare the topological consistency of citation networks extracted from six popular bibliographic databases including Web of Science , CiteSeer and arXiv.org . The networks are assessed through a rich set of local and global graph statistics. We first reveal statistically significant inconsistencies between some of the databases with respect to individual statistics. For example, the introduced field bow-tie decomposition of DBLP Computer Science Bibliography substantially differs from the rest due to the coverage of the database, while the citation information within arXiv.org is the most exhaustive. Finally, we compare the databases over multiple graph statistics using the critical difference diagram. The citation topology of DBLP Computer Science Bibliography is the least consistent with the rest, while, not surprisingly, Web of Science is significantly more reliable from the perspective of consistency. This work can serve either as a reference for scholars in bibliometrics and scientometrics or a scientific evaluation guideline for governments and research agencies.

Journal Article

Share this book

Add to My Shelf

Terrorist attacks sharpen the binary perception of “Us” vs. “Them”

by Yasseri, Taha , Krstićev, Danijela Boberić , Jović, Milan in 639/705/1042 , 639/766/259 , Humanities and Social Sciences

2023

Terrorist attacks not only harm citizens but also shift their attention, which has long-lasting impacts on public opinion and government policies. Yet measuring the changes in public attention beyond media coverage has been methodologically challenging. Here we approach this problem by starting from Wikipedia’s répertoire of 5.8 million articles and a sample of 15 recent terrorist attacks. We deploy a complex exclusion procedure to identify topics and themes that consistently received a significant increase in attention due to these incidents. Examining their contents reveals a clear picture: terrorist attacks foster establishing a sharp boundary between “Us” (the target society) and “Them” (the terrorist as the enemy). In the midst of this, one seeks to construct identities of both sides. This triggers curiosity to learn more about “Them” and soul-search for a clearer understanding of “Us”. This systematic analysis of public reactions to disruptive events could help mitigate their societal consequences.

Journal Article

Share this book

Add to My Shelf

Computing Well-Balanced Spanning Trees of Unweighted Networks

by Šubelj, Lovro in Algorithms , breadth-first search , Computation

2025

A spanning tree of a network or graph is a subgraph that connects all nodes with the minimum number or total weight of edges. Spanning trees are among the simplest yet most effective techniques for network simplification, sampling, and uncovering a network’s backbone or skeleton. Prim’s algorithm and Kruskal’s algorithm are well-known algorithms for computing a spanning tree of a weighted network, and are therefore also the default procedure for unweighted networks in the most popular network libraries. In this paper, we empirically evaluate the performance of these algorithms on unweighted networks and compare them with priority-first search algorithms. We show that the distances between the nodes are better preserved by a simpler algorithm based on breadth-first search. The spanning trees are also more compact and well-balanced, as measured by classical graph indices. We support our findings with experiments on synthetic graphs and over a thousand real networks, and demonstrate the practical applications of the computed spanning trees. We conclude that for preserving the structure of an unweighted network, the breadth-first search algorithm should be the preferred choice.

Journal Article

Share this book

Add to My Shelf

SkipCor: Skip-Mention Coreference Resolution Using Linear-Chain Conditional Random Fields

by Bajec, Marko , Šubelj, Lovro , Žitnik, Slavko in Algorithms , Artificial Intelligence , Chains

2014

Coreference resolution tries to identify all expressions (called mentions) in observed text that refer to the same entity. Beside entity extraction and relation extraction, it represents one of the three complementary tasks in Information Extraction. In this paper we describe a novel coreference resolution system SkipCor that reformulates the problem as a sequence labeling task. None of the existing supervised, unsupervised, pairwise or sequence-based models are similar to our approach, which only uses linear-chain conditional random fields and supports high scalability with fast model training and inference, and a straightforward parallelization. We evaluate the proposed system against the ACE 2004, CoNLL 2012 and SemEval 2010 benchmark datasets. SkipCor clearly outperforms two baseline systems that detect coreferentiality using the same features as SkipCor. The obtained results are at least comparable to the current state-of-the-art in coreference resolution.

Journal Article

Share this book

Add to My Shelf

Convexity in complex networks

by ŠUBELJ, LOVRO , MARC, TILEN in Center and periphery , Computer Science , Graphs

2018

Metric graph properties lie in the heart of the analysis of complex networks, while in this paper we study their convexity through mathematical definition of a convex subgraph. A subgraph is convex if every geodesic path between the nodes of the subgraph lies entirely within the subgraph. According to our perception of convexity, convex network is such in which every connected subset of nodes induces a convex subgraph. We show that convexity is an inherent property of many networks that is not present in a random graph. Most convex are spatial infrastructure networks and social collaboration graphs due to their tree-like or clique-like structure, whereas the food web is the only network studied that is truly non-convex. Core–periphery networks are regionally convex as they can be divided into a non-convex core surrounded by a convex periphery. Random graphs, however, are only locally convex meaning that any connected subgraph of size smaller than the average geodesic distance between the nodes is almost certainly convex. We present different measures of network convexity and discuss its applications in the study of networks.

Journal Article

Share this book

Add to My Shelf

Quantifying the Consistency of Scientific Databases

by Kastrin, Andrej , Bajec, Marko , Levnajić, Zoran in Algorithms , Bibliometrics , Citations

2015

Science is a social process with far-reaching impact on our modern society. In recent years, for the first time we are able to scientifically study the science itself. This is enabled by massive amounts of data on scientific publications that is increasingly becoming available. The data is contained in several databases such as Web of Science or PubMed, maintained by various public and private entities. Unfortunately, these databases are not always consistent, which considerably hinders this study. Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases. We found that identifying a single \"best\" database is far from easy. Nevertheless, our results indicate appreciable differences in mutual consistency of different databases, which we interpret as recipes for future bibliometric studies.

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter