Catalogue Search | MBRL

Experimental Evaluation of Graph Databases: JanusGraph, Nebula Graph, Neo4j, and TigerGraph

by Sá, Filipe; Monteiro, Jéssica; Bernardino, Jorge in Analysis, benchmark, Engines

2023

NoSQL databases were created with the primary goal of addressing the shortcomings in the efficiency of relational databases, and can be of four types: document, column, key-value, and graph databases. Graph databases can store data and relationships efficiently, and have a flexible and easy-to-understand data schema. In this paper, we perform an experimental evaluation of the four most popular graph databases: JanusGraph, Nebula Graph, Neo4j, and TigerGraph. Database performance is evaluated using the Linked Data Benchmark Council’s Social Network Benchmark (LDBC SNB). In the experiments, we analyze the execution time of the queries, the loading time of the nodes and the RAM and CPU usage for each database. In our analysis, Neo4j was the graph database with the best performance across all metrics.

Journal Article

Knowledge Graphs: A Practical Review of the Research Landscape

by Kejriwal, Mayank in applications, data mining, Graphs

2022

Knowledge graphs (KGs) have rapidly emerged as an important area in AI over the last ten years. Building on a storied tradition of graphs in the AI community, a KG may be simply defined as a directed, labeled, multi-relational graph with some form of semantics. In part, this has been fueled by increased publication of structured datasets on the Web, and well-publicized successes of large-scale projects such as the Google Knowledge Graph and the Amazon Product Graph. However, another factor that is less discussed, but which has been equally instrumental in the success of KGs, is the cross-disciplinary nature of academic KG research. Arguably, because of the diversity of this research, a synthesis of how different KG research strands all tie together could serve a useful role in enabling more ‘moonshot’ research and large-scale collaborations. This review of the KG research landscape attempts to provide such a synthesis by first showing what the major strands of research are, and how those strands map to different communities, such as Natural Language Processing, Databases and Semantic Web. A unified framework is suggested in which to view the distinct, but overlapping, foci of KG research within these communities.

Journal Article

Evaluating regular path queries on compressed adjacency matrices

by Arroyuelo, Diego; Gómez-Brandón, Adrián; Navarro, Gonzalo in Algebra, Algorithms, Boolean

2025

Regular Path Queries (RPQs), which are essentially regular expressions to be matched against the labels of paths in labeled graphs, are at the core of graph database query languages like SPARQL and GQL. A way to solve RPQs is to translate them into a sequence of operations on the adjacency matrices of each label. We design and implement a Boolean algebra on sparse matrix representations and, as an application, use them to handle RPQs. Our baseline representation uses the same space and time as the previously most compact index for RPQs, outperforming it on the hardest types of queries, those where both RPQ endpoints are unspecified. Our more succinct structure, based on k²-trees, is 4 times smaller than any existing representation that handles RPQs. While slower, it still solves complex RPQs in a few seconds and slightly outperforms the smallest previous structure on the hardest RPQs. Our new sparse-matrix-based solutions dominate a good portion of the space/time tradeoff map, being outperformed only by representations that use much more space. They also implement an algebra of Boolean matrices that is of independent interest beyond solving RPQs.

Journal Article

Optimizing RPQs over a compact graph representation

by Arroyuelo, Diego; Gómez-Brandón, Adrián; Rojas-Ledesma, Javiel in Algorithms, Computer Science, Database Management

2024

We propose techniques to evaluate regular path queries (RPQs) over labeled graphs (e.g., RDF). We apply a bit-parallel simulation of a Glushkov automaton representing the query over a ring: a compact wavelet-tree-based index of the graph. To the best of our knowledge, our approach is the first to evaluate RPQs over a compact representation of such graphs, where we show the key advantages of using Glushkov automata in this setting. Our scheme obtains optimal time, in terms of alternation complexity, for traversing the product graph. We further introduce various optimizations, such as the ability to process several automaton states and graph nodes/labels simultaneously, and to estimate relevant selectivities. Experiments show that our approach uses 3–5× less space, and is over 5× faster, on average, than the next best state-of-the-art system for evaluating RPQs.

Journal Article

Functional querying in graph databases

by Pokorný, Jaroslav in Artificial Intelligence, Computational Intelligence, Computer Applications

2018

The paper focuses on functional querying in graph databases. We consider the labelled property graph model and also mention the graph model behind XML databases. Attention is devoted to functional modelling of graph databases at both the conceptual and data levels. The notions of graph conceptual schema and graph database schema are considered. The notion of a typed attribute is used as a basic structure at both the conceptual and database levels. As a formal approach to declarative graph database querying, a version of typed lambda calculus is used. This approach allows the use of the logic necessary for querying, as well as arithmetic and aggregation functions. Another advantage is the ability to deal with relations and graphs in one integrated environment.

Journal Article

Distance-based correlation analysis for graph databases

by Dudáš, A.; Lauko, J. in 62H20, 68P15, Correlation Analysis

2025

Big data is often characterized by its volume, velocity, and variety, properties which entail that the data contains values and relationships too complex to be stored using standard, relational, or document databases. Graph databases, commonly utilized for their capacity to model complex relationships between sets of objects, provide an effective framework for the processing and storing of such data. Afterwards, it is necessary to work with the data further: to analyse it using methods of descriptive statistics and statistical analysis, to visualize it using exploratory analysis techniques, and especially to use it to build analytical models for predictive and estimation purposes. The main objective of the presented study is the design and implementation of the predictive potential metric in graph databases, which is based on the structures found in the graph databases themselves. We focus on the examination of the correlation between the attribute values of individual database objects and the mutual distance of these objects in the defined graph space. The proposed metric is verified using standard prediction models built on a sizeable graph database.

Journal Article

Augmenting Orbital Debris Identification with Neo4j-Enabled Graph-Based Retrieval-Augmented Generation for Multimodal Large Language Models

by Roll, Daniel S.; Woo, Wai Lok; Kurt, Zeyneb in Artificial intelligence, Case studies, Comparative analysis

2025

This preliminary study covers the construction and application of a Graph-based Retrieval-Augmented Generation (GraphRAG) system integrating a multimodal LLM, Large Language and Vision Assistant (LLaVA), with graph database software (Neo4j) to enhance LLM output quality through structured knowledge retrieval. The system is aimed at the field of orbital debris detection and is proposed to support the current intelligent methods for such detection by introducing the beneficial properties of both LLMs and a corpus of external information. By constructing a dynamic knowledge graph from relevant research papers, context-aware retrieval is enabled, improving factual accuracy and minimizing hallucinations. The system extracts, summarizes, and embeds research papers into a Neo4j graph database, with API-powered LLM-generated relationships enriching interconnections. Querying this graph allows for contextual ranking of relevant documents, which are then provided as context to the LLM through prompt engineering during the inference process. A case study applying the technology to a synthetic image of orbital debris is discussed. Qualitative results indicate that the inclusion of GraphRAG and external information results in successful retrieval of information and reduced hallucinations. Further work to refine the system is necessary, as well as establishing benchmark tests to assess performance quantitatively. This approach offers a scalable and interpretable method for enhanced domain-specific knowledge retrieval, improving the quality of the LLM’s output when tasked with description-based activities.

Journal Article

MillenniumDB: An Open-Source Graph Database System

by Arroyuelo, Diego; Rojas, Carlos; Romero, Juan in Algorithms, Data base management systems, Data management

2023

In this systems paper, we present MillenniumDB: a novel graph database engine that is modular, persistent, and open source. MillenniumDB is based on a graph data model, which we call domain graphs, that provides a simple abstraction upon which a variety of popular graph models can be supported, thus providing a flexible data management engine for diverse types of knowledge graph. The engine itself is founded on a combination of tried and tested techniques from relational data management, state-of-the-art algorithms for worst-case-optimal joins, as well as graph-specific algorithms for evaluating path queries. In this paper, we present the main design principles underlying MillenniumDB, describing the abstract graph model and query semantics supported, the concrete data model and query syntax implemented, as well as the storage, indexing, query planning and query evaluation techniques used. We evaluate MillenniumDB over real-world data and queries from the Wikidata knowledge graph, where we find that it outperforms other popular persistent graph database engines (including both enterprise and open source alternatives) that support similar query features.

Journal Article

Graph Data Science with Python and Neo4j

by Eastridge, Timothy in Machine learning-Development

2024

Graph Data Science with Python and Neo4j is your ultimate guide to unleashing the potential of graph data science by blending Python's robust capabilities with Neo4j's innovative graph database technology. From fundamental concepts to advanced analytics and machine learning techniques, you'll learn how to leverage interconnected data to drive actionable insights. Beyond theory, this book focuses on practical application, providing you with the hands-on skills needed to tackle real-world challenges. You'll explore cutting-edge integrations with Large Language Models (LLMs) like ChatGPT to build advanced recommendation systems. With intuitive frameworks and interconnected data strategies, you'll elevate your analytical prowess. This book offers a straightforward approach to mastering graph data science. With detailed explanations, real-world examples, and a dedicated GitHub repository filled with code examples, this book is an indispensable resource for anyone seeking to enhance their data practices with graph technology. Join us on this transformative journey across various industries, and unlock new, actionable insights from your data.

eBook
