Catalogue Search | MBRL
4,105 result(s) for "Information retrieval. Graph"
Relational retrieval using a combination of path-constrained random walks
2010
Scientific literature with rich metadata can be represented as a labeled directed graph. This graph representation enables a number of scientific tasks such as ad hoc retrieval or named entity recognition (NER) to be formulated as typed proximity queries in the graph. One popular proximity measure is called Random Walk with Restart (RWR), and much work has been done on the supervised learning of RWR measures by associating each edge label with a parameter. In this paper, we describe a novel learnable proximity measure which instead uses one weight per edge label sequence: proximity is defined by a weighted combination of simple “path experts”, each corresponding to following a particular sequence of labeled edges. Experiments on eight tasks in two subdomains of biology show that the new learning method significantly outperforms the RWR model (both trained and untrained). We also extend the method to support two additional types of experts to model intrinsic properties of entities: query-independent experts, which generalize the PageRank measure, and popular entity experts, which allow rankings to be adjusted for particular entities that are especially important.
Journal Article
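The "path expert" idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the edge labels (`cites`, `written_by`), and the weights are all hypothetical, and the weights are fixed by hand rather than learned. Each expert follows one sequence of edge labels, splitting probability uniformly among matching out-edges at every step, and the final score is the weighted sum over experts.

```python
from collections import defaultdict

def path_walk(edges, start, label_path):
    """Distribution over nodes reached from `start` by following `label_path`.

    `edges` maps (node, label) -> list of target nodes. At each step the
    walker moves uniformly at random along edges carrying the next label.
    """
    dist = {start: 1.0}
    for label in label_path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            targets = edges.get((node, label), [])
            for target in targets:
                nxt[target] += prob / len(targets)
        dist = dict(nxt)
    return dist

def combined_score(edges, start, weighted_paths):
    """Weighted combination of path experts: one weight per label sequence."""
    scores = defaultdict(float)
    for label_path, weight in weighted_paths:
        for node, prob in path_walk(edges, start, label_path).items():
            scores[node] += weight * prob
    return dict(scores)

# Hypothetical toy graph of papers and authors.
edges = {
    ("paper1", "cites"): ["paper2", "paper3"],
    ("paper1", "written_by"): ["carol"],
    ("paper2", "written_by"): ["alice"],
    ("paper3", "written_by"): ["alice", "bob"],
}
# Hand-picked weights; the paper learns these from training queries.
weighted_paths = [(("cites", "written_by"), 1.0), (("written_by",), 0.5)]
scores = combined_score(edges, "paper1", weighted_paths)
```

Here `alice` is reached through both cited papers, so she receives the largest score for the query node `paper1`.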
Controllability of complex networks
by Slotine, Jean-Jacques; Liu, Yang-Yu; Barabási, Albert-László
in 631/114/2408, 639/766/25, Algorithms
2011
The ultimate proof of our understanding of natural or technological systems is reflected in our ability to control them. Although control theory offers mathematical tools for steering engineered and natural systems towards a desired state, a framework to control complex self-organized systems is lacking. Here we develop analytical tools to study the controllability of an arbitrary complex directed network, identifying the set of driver nodes with time-dependent control that can guide the system’s entire dynamics. We apply these tools to several real networks, finding that the number of driver nodes is determined mainly by the network’s degree distribution. We show that sparse inhomogeneous networks, which emerge in many real complex systems, are the most difficult to control, but that dense and homogeneous networks can be controlled using a few driver nodes. Counterintuitively, we find that in both model and real systems the driver nodes tend to avoid the high-degree nodes.
How to control complex systems
Control theory can be used to steer engineered and natural systems towards a desired state, but a framework to control complex self-organized systems is lacking. Can such networks be controlled? Albert-László Barabási and colleagues tackle this question and arrive at precise mathematical answers that amount to 'yes, up to a point'. They develop analytical tools to study the controllability of an arbitrary complex directed network using both model and real systems, ranging from regulatory, neural and metabolic pathways in living organisms to food webs, cell-phone movements and social interactions. They identify the minimum set of driver nodes whose time-dependent control can guide the system's entire dynamics (http://go.nature.com/wd9Ek2). Surprisingly, these are not usually located at the network hubs.
Journal Article
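The abstract above does not say how driver nodes are identified. The sketch below assumes the maximum-matching characterization reported in the paper itself: a node is a driver node if no matched edge points to it, and a single driver suffices when the matching is perfect. The function names and the toy networks are hypothetical, and the matching routine is a plain augmenting-path search, not an optimized one.

```python
def max_matching(n, edges):
    """Maximum matching on the bipartite (out-copy, in-copy) view of a digraph.

    Returns (size, match_in), where match_in[v] == u means edge u -> v is
    matched and match_in[v] == -1 means no matched edge points to v.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
    match_in = [-1] * n

    def augment(u, seen):
        # Try to match u's out-copy, re-routing earlier matches if needed.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_in[v] == -1 or augment(match_in[v], seen):
                match_in[v] = u
                return True
        return False

    size = sum(augment(u, set()) for u in range(n))
    return size, match_in

def driver_nodes(n, edges):
    """Unmatched nodes; with a perfect matching, node 0 is an arbitrary pick."""
    _, match_in = max_matching(n, edges)
    unmatched = [v for v in range(n) if match_in[v] == -1]
    return unmatched or [0]

# Hypothetical examples: a directed chain and a directed star with hub 0.
chain = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]
chain_drivers = driver_nodes(4, chain)
star_drivers = driver_nodes(4, star)
```

The chain needs one driver node while the star needs three, which gives a small-scale feel for why degree heterogeneity makes a network harder to control.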
Spectral Sparsification of Graphs
by Teng, Shang-Hua; Spielman, Daniel A.
in Algebra, Algorithmics. Computability. Computer arithmetics, Algorithms
2011
We introduce a new notion of graph sparsification based on spectral similarity of graph Laplacians: spectral sparsification requires that the Laplacian quadratic form of the sparsifier approximate that of the original. This is equivalent to saying that the Laplacian of the sparsifier is a good preconditioner for the Laplacian of the original. We prove that every graph has a spectral sparsifier of nearly linear size. Moreover, we present an algorithm that produces spectral sparsifiers in time $O(m \log^{c} m)$, where $m$ is the number of edges in the original graph and $c$ is some absolute constant. This construction is a key component of a nearly linear time algorithm for solving linear equations in diagonally dominant matrices. Our sparsification algorithm makes use of a nearly linear time algorithm for graph partitioning that satisfies a strong guarantee: if the partition it outputs is very unbalanced, then the larger part is contained in a subgraph of high conductance.
Journal Article
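The Laplacian quadratic form that the abstract above refers to can be written as a sum over edges, x^T L x = Σ w_uv (x_u − x_v)^2. The sketch below computes it for a small graph and for a hypothetical reweighted subgraph. The `within_factor` helper and the factor `sigma` are illustrative assumptions: a real spectral sparsifier must satisfy the bound for every vector x, whereas this check covers a single test vector, and the candidate here is not produced by the paper's algorithm.

```python
def laplacian_quadratic_form(edges, x):
    """x^T L x for a weighted undirected graph given as (u, v, weight) triples."""
    return sum(w * (x[u] - x[v]) ** 2 for u, v, w in edges)

def within_factor(form_h, form_g, sigma):
    """True if form_g / sigma <= form_h <= sigma * form_g for this one vector."""
    return form_g / sigma <= form_h <= sigma * form_g

# Unit-weight triangle, and a hypothetical candidate that drops one edge
# and increases the weight of the remaining two.
graph = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
candidate = [(0, 1, 1.5), (1, 2, 1.5)]

x = [0.0, 1.0, 2.0]
q_graph = laplacian_quadratic_form(graph, x)          # 1 + 1 + 4
q_candidate = laplacian_quadratic_form(candidate, x)  # 1.5 + 1.5
ok = within_factor(q_candidate, q_graph, sigma=2.0)
```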
gStore: a graph-based SPARQL query engine
2014
We address efficient processing of SPARQL queries over RDF datasets. The proposed techniques, incorporated into the gStore system, handle, in a uniform and scalable manner, SPARQL queries with wildcards and aggregate operators over dynamic RDF datasets. Our approach is graph based. We store RDF data as a large graph and also represent a SPARQL query as a query graph. Thus, the query answering problem is converted into a subgraph matching problem. To achieve efficient and scalable query processing, we develop an index, together with effective pruning rules and efficient search algorithms. We propose techniques that use this infrastructure to answer aggregation queries. We also propose an effective maintenance algorithm to handle online updates over RDF repositories. Extensive experiments confirm the efficiency and effectiveness of our solutions.
Journal Article
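The reduction described in the abstract above, from answering a SPARQL query to matching a query graph against the RDF data graph, can be sketched with a naive matcher. This is an illustration only: it scans every triple for every pattern and uses none of gStore's index or pruning rules, and the triples, predicate names, and the `match_query` function are hypothetical. Terms beginning with `?` are treated as variables.

```python
def match_query(triples, query):
    """Return every variable binding under which all query patterns match."""

    def unify(pattern, triple, binding):
        # Extend `binding` so that `pattern` equals `triple`, or return None.
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if new.setdefault(term, value) != value:
                    return None
            elif term != value:
                return None
        return new

    results = [{}]
    for pattern in query:
        extended = []
        for binding in results:
            for triple in triples:
                candidate = unify(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        results = extended
    return results

# Hypothetical RDF-style data: (subject, predicate, object).
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
    ("carol", "worksAt", "acme"),
]
# Query graph: a two-step "knows" chain.
query = [("?x", "knows", "?y"), ("?y", "knows", "?z")]
answers = match_query(triples, query)
```

The cost of this approach grows quickly with the number of patterns and triples, which is the motivation for the indexing and pruning the abstract mentions.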
Popularity versus similarity in growing networks
by Papadopoulos, Fragkiskos; Krioukov, Dmitri; Kitsak, Maksim
in 631/114/2408, 639/766/25, Applied sciences
2012
A framework is developed in which new connections to a growing network optimize geometric trade-offs between popularity and similarity, instead of simply preferring popular nodes; this approach accurately describes the large-scale evolution of various networks.
Networks driven by the liked and alike
Preferential attachment is a mechanism that attempts to explain the emergence of scaling in growing networks. If new connections are preferentially established with more popular nodes in a network, then the network is scale-free. So, because 'popularity is attractive', does preferential attachment predict network evolution? This study shows that popularity is a strong force in shaping complex network structure and dynamics, but so too is similarity. The authors develop a model that increases the accuracy of network-evolution predictions by considering the trade-offs between popularity and similarity. The model accurately describes large-scale evolution of technological (Internet), social and metabolic networks, predicting the probability of new links with high precision.
The principle [1] that ‘popularity is attractive’ underlies preferential attachment [2], which is a common explanation for the emergence of scaling in growing networks. If new connections are made preferentially to more popular nodes, then the resulting distribution of the number of connections possessed by nodes follows power laws [3, 4], as observed in many real networks [5, 6]. Preferential attachment has been directly validated for some real networks (including the Internet [7, 8]), and can be a consequence of different underlying processes based on node fitness, ranking, optimization, random walks or duplication [9, 10, 11, 12, 13, 14, 15, 16]. Here we show that popularity is just one dimension of attractiveness; another dimension is similarity [17, 18, 19, 20, 21, 22, 23, 24]. We develop a framework in which new connections optimize certain trade-offs between popularity and similarity, instead of simply preferring popular nodes. The framework has a geometric interpretation in which popularity preference emerges from local optimization. As opposed to preferential attachment, our optimization framework accurately describes the large-scale evolution of technological (the Internet), social (trust relationships between people) and biological (Escherichia coli metabolic) networks, predicting the probability of new links with high precision. The framework that we have developed can thus be used for predicting new links in evolving networks, and provides a different perspective on preferential attachment as an emergent phenomenon.
Journal Article
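For context, the baseline that the abstract above argues against, plain preferential attachment, can be sketched in a few lines. This shows only the baseline and not the authors' popularity-versus-similarity model. The function name, the one-link-per-new-node choice, and the seed are assumptions made for illustration.

```python
import random

def preferential_attachment(n, seed=0):
    """Grow a network in which each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [0, 1]
    for new in range(2, n):
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints += [new, target]
    return edges

edges = preferential_attachment(50)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
```

Early nodes tend to accumulate the most links under this rule, which is the "popularity" dimension; the paper adds a similarity dimension that this sketch omits.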
Quantum speed-up for unsupervised learning
by Aïmeur, Esma; Gambs, Sébastien; Brassard, Gilles
in Algorithmics. Computability. Computer arithmetics, Algorithms, Applied sciences
2013
We show how the quantum paradigm can be used to speed up unsupervised learning algorithms. More precisely, we explain how it is possible to accelerate learning algorithms by quantizing some of their subroutines. Quantization refers to the process that partially or totally converts a classical algorithm to its quantum counterpart in order to improve performance. In particular, we give quantized versions of clustering via minimum spanning tree, divisive clustering and k-medians that are faster than their classical analogues. We also describe a distributed version of k-medians that allows the participants to save on the global communication cost of the protocol compared to the classical version. Finally, we design quantum algorithms for the construction of a neighbourhood graph, outlier detection as well as smart initialization of the cluster centres.
Journal Article
Energy-Based Geometric Multi-model Fitting
by Boykov, Yuri; Isack, Hossam
in Algorithmics. Computability. Computer arithmetics, Algorithms, Analysis
2012
Geometric model fitting is a typical chicken-&-egg problem: data points should be clustered based on geometric proximity to models whose unknown parameters must be estimated at the same time. Most existing methods, including generalizations of RANSAC, greedily search for models with most inliers (within a threshold) ignoring overall classification of points. We formulate geometric multi-model fitting as an optimal labeling problem with a global energy function balancing geometric errors and regularity of inlier clusters. Regularization based on spatial coherence (on some near-neighbor graph) and/or label costs is NP hard. Standard combinatorial algorithms with guaranteed approximation bounds (e.g. α-expansion) can minimize such regularization energies over a finite set of labels, but they are not directly applicable to a continuum of labels, e.g. in line fitting. Our proposed approach (PEaRL) combines model sampling from data points as in RANSAC with iterative re-estimation of inliers and models’ parameters based on a global regularization functional. This technique efficiently explores the continuum of labels in the context of energy minimization. In practice, PEaRL converges to a good quality local minimum of the energy automatically selecting a small number of models that best explain the whole data set. Our tests demonstrate that our energy-based approach significantly improves the current state of the art in geometric model fitting currently dominated by various greedy generalizations of RANSAC.
Journal Article
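The greedy baseline that the abstract above contrasts with PEaRL can be sketched as a single-model RANSAC line fit: sample two points, form the line through them, count points within a threshold, and keep the model with the most inliers. This is the baseline only, not PEaRL and not its energy function. The function name, threshold, iteration count, seed, and synthetic data are all assumptions.

```python
import random

def ransac_line(points, iterations=200, threshold=0.5, seed=0):
    """Return ((slope, intercept), inlier_count) for the best sampled line."""
    rng = random.Random(seed)
    best_model, best_count = None, -1
    for _ in range(iterations):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # vertical line; skipped in this simple y = m*x + b form
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        # Inliers are points whose vertical residual is within the threshold.
        count = sum(abs(y - (slope * x + intercept)) <= threshold
                    for x, y in points)
        if count > best_count:
            best_model, best_count = (slope, intercept), count
    return best_model, best_count

# Synthetic data: ten points exactly on y = 2x + 1 plus three outliers.
inliers = [(float(x), 2.0 * x + 1.0) for x in range(10)]
outliers = [(1.0, 10.0), (4.0, -5.0), (8.0, 30.0)]
model, count = ransac_line(inliers + outliers)
```

Because each candidate line is scored in isolation, this procedure has no notion of how points are shared among several models, which is the gap the abstract's global energy formulation is meant to address.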
Correlated network data publication via differential privacy
by Desai, Bipin C; Fung, Benjamin C. M; Chen, Rui
in Algorithms, Data analysis, Data correlation
2014
With the increasing prevalence of information networks, research on privacy-preserving network data publishing has received substantial attention recently. There are two streams of relevant research, targeting different privacy requirements. A large body of existing works focus on preventing node re-identification against adversaries with structural background knowledge, while some other studies aim to thwart edge disclosure. In general, the line of research on preventing edge disclosure is less fruitful, largely due to lack of a formal privacy model. The recent emergence of differential privacy has shown great promise for rigorous prevention of edge disclosure. Yet recent research indicates that differential privacy is vulnerable to data correlation, which hinders its application to network data that may be inherently correlated. In this paper, we show that differential privacy could be tuned to provide provable privacy guarantees even in the correlated setting by introducing an extra parameter, which measures the extent of correlation. We subsequently provide a holistic solution for non-interactive network data publication. First, we generate a private vertex labeling for a given network dataset to make the corresponding adjacency matrix form dense clusters. Next, we adaptively identify dense regions of the adjacency matrix by a data-dependent partitioning process. Finally, we reconstruct a noisy adjacency matrix by a novel use of the exponential mechanism. To our best knowledge, this is the first work providing a practical solution for publishing real-life network data via differential privacy. Extensive experiments demonstrate that our approach performs well on different types of real-life network datasets.
Journal Article
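The exponential mechanism named in the abstract above selects an output with probability proportional to exp(ε · utility / (2 · sensitivity)). The sketch below shows that standard mechanism in isolation. It does not reproduce the paper's adjacency-matrix reconstruction or its extra correlation parameter, and the function names, candidates, and utility values are hypothetical.

```python
import math
import random

def exponential_mechanism_probs(utilities, epsilon, sensitivity=1.0):
    """Selection probabilities proportional to exp(eps * u / (2 * sensitivity))."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(epsilon * (u - top) / (2.0 * sensitivity))
               for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def exponential_mechanism(candidates, utilities, epsilon, sensitivity=1.0,
                          rng=random):
    """Draw one candidate according to the exponential-mechanism distribution."""
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]

# Hypothetical candidates with utilities 0 and 2, epsilon = 1, sensitivity = 1:
# the second candidate is e times as likely as the first.
probs = exponential_mechanism_probs([0.0, 2.0], epsilon=1.0)
choice = exponential_mechanism(["sparse", "dense"], [0.0, 2.0], epsilon=1.0,
                               rng=random.Random(0))
```

A smaller ε flattens the distribution toward uniform (more privacy, less utility); a larger ε concentrates it on the highest-utility candidate.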
Event detection over twitter social media streams
2014
In recent years, microblogs have become an important source for reporting real-world events. A real-world occurrence reported in microblogs is also called a social event. Social events may hold critical materials that describe the situations during a crisis. In real applications, such as crisis management and decision making, monitoring the critical events over social streams will enable watch officers to analyze a whole situation that is a composite event, and make the right decision based on the detailed contexts such as what is happening, where an event is happening, and who is involved. Although there has been significant research effort on detecting a target event in social networks based on a single source, in a crisis, we often want to analyze the composite events contributed by different social users. So far, the problem of integrating ambiguous views from different users has not been well investigated. To address this issue, we propose a novel framework to detect composite social events over streams, which fully exploits the information of social data over multiple dimensions. Specifically, we first propose a graphical model called location-time constrained topic (LTT) to capture the content, time, and location of social messages. Using LTT, a social message is represented as a probability distribution over a set of topics by inference, and the similarity between two messages is measured by the distance between their distributions. Then, the events are identified by conducting efficient similarity joins over social media streams. To accelerate the similarity join, we also propose a variable dimensional extendible hash over social streams. We have conducted extensive experiments to demonstrate the high effectiveness and efficiency of the proposed approach.
Journal Article
Unsupervised Learning for Graph Matching
by Sukthankar, Rahul; Leordeanu, Marius; Hebert, Martial
in Algorithmics. Computability. Computer arithmetics, Algorithms, Analysis
2012
Graph matching is an essential problem in computer vision that has been successfully applied to 2D and 3D feature matching and object recognition. Despite its importance, little has been published on learning the parameters that control graph matching, even though learning has been shown to be vital for improving the matching rate. In this paper we show how to perform parameter learning in an unsupervised fashion, that is, when no correct correspondences between graphs are given during training. Our experiments reveal that unsupervised learning compares favorably to the supervised case, both in terms of efficiency and quality, while avoiding the tedious manual labeling of ground truth correspondences. We verify experimentally that our learning method can improve the performance of several state-of-the-art graph matching algorithms. We also show that a similar method can be successfully applied to parameter learning for graphical models and demonstrate its effectiveness empirically.
Journal Article