Catalogue Search | MBRL

Apache Spark 2.x machine learning cookbook : over 100 recipes to simplify machine learning model implementations with Spark

by Amirghodsi, Siamak, author , Rajendran, Meenakshi, author , Hall, Broderick, author in Spark (Electronic resource : Apache Software Foundation) , Machine learning.

Book

Predicting Systolic Blood Pressure in Real-Time Using Streaming Data and Deep Learning

by Sahal Radhya , Younis, Eman M , Ali, Abdelmgeid A in Artificial intelligence , Blood pressure , Data processing

2021

High systolic blood pressure causes many problems, including stroke, brain attack, and others. Therefore, examining blood pressure and discovering issues related to it at the right time can help prevent the occurrence of health problems. Nowadays, health-based data brings a new dimension to healthcare by exploiting patients' real-time data for the early detection of systolic blood pressure (SBP). Furthermore, technologies typically associated with smart and real-time data processing add value in the healthcare domain, including artificial intelligence, data analytic technologies, and stream processing technologies. Thus, this paper introduces a systolic blood pressure prediction system that can predict SBP in real-time and, therefore, can avoid health problems that may stem from sudden high blood pressure. The proposed system works through two components, namely, an offline model and an online prediction pipeline. The aim of the offline model module is to develop the model by investigating different deep learning models to achieve the smallest root mean square error. It has been developed using Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Bidirectional Long Short-Term Memory (BI-LSTM), and Gated Recurrent Units (GRU) models and the Medical Information Mart for Intensive Care (MIMIC II) SBP time-series dataset. The online prediction pipeline module uses Apache Kafka and Apache Spark to predict the near future of SBP in real-time using the best deep learning model and SBP streaming time-series data. The experimental results indicate that the BI-LSTM model has achieved the best performance using three hidden layers, and it is used to predict the near future of SBP in real-time.

Journal Article

Mastering machine learning with Spark 2.x : create scalable machine learning applications to power a modern data-driven business using Spark

by Tellez, Alex, author , Pumperla, Max, author , Malohlava, Michal, author in Spark (Electronic resource : Apache Software Foundation) , Machine learning.

Book

Real-Time DDoS Attack Detection System Using Big Data Approach

by Yasin, Awais , Hakeem, Owais , Babar, Hafiz Muhammad Aqeel in Accuracy , Algorithms , Artificial intelligence

2021

Currently, the Distributed Denial of Service (DDoS) attack has become rampant and shows up in various shapes and patterns; therefore, it is not easy to detect and solve with previous solutions. Classification algorithms have been used in many studies that aimed to detect and solve the DDoS attack. DDoS attacks are performed easily by using the weaknesses of networks and by generating requests for services for software. DDoS attacks are difficult to detect and mitigate in real time, but such a solution holds significant value because these attacks can cause major problems. This paper addresses the prediction of application layer DDoS attacks in real-time with different machine learning models. We applied two machine learning approaches, Random Forest (RF) and Multi-Layer Perceptron (MLP), through the Scikit ML library and the big data framework Spark ML library for the detection of Denial of Service (DoS) attacks. In addition to the detection of DoS attacks, we optimized the performance of the models by minimizing the prediction time as compared with other existing approaches using the big data framework (Spark ML). We achieved a mean accuracy of 99.5% of the models both with and without big data approaches. However, in training and testing time, the big data approach outperforms the non-big data approach because Spark performs computations in memory in a distributed manner. The minimum average training and testing time in minutes was 14.08 and 0.04, respectively. Using a big data tool (Apache Spark), the maximum intermediate training and testing time in minutes was 34.11 and 0.46, respectively, using a non-big data approach. We also achieved these results using the big data approach. We can detect an attack in real-time in a few milliseconds.

Journal Article

Graph algorithms : practical examples in Apache Spark and Neo4j

by Needham, Mark, author , Hodler, Amy E., author in Spark (Electronic resource : Apache Software Foundation) , Graph algorithms.

Book

A Digital Twin Decision Support System for the Urban Facility Management Process

by Calvio, Alessandro , Bujari, Armir , Corradi, Antonio in Apache Spark , Big Data , Data processing

2021

The ever increasing pace of IoT deployment is opening the door to concrete implementations of smart city applications, enabling the large-scale sensing and modeling of (near-)real-time digital replicas of physical processes and environments. This digital replica could serve as the basis of a decision support system, providing insights into possible optimizations of resources in a smart city scenario. In this article, we discuss an extension of a prior work, presenting a detailed proof-of-concept implementation of a Digital Twin solution for the Urban Facility Management (UFM) process. The Interactive Planning Platform for City District Adaptive Maintenance Operations (IPPODAMO) is a distributed geographical system, fed with and ingesting heterogeneous data sources originating from different urban data providers. The data are subject to continuous refinements and algorithmic processes, used to quantify and build synthetic indexes measuring the activity level inside an area of interest. IPPODAMO takes into account potential interference from other stakeholders in the urban environment, enabling the informed scheduling of operations, aimed at minimizing interference and the costs of operations.

Journal Article

Beginning Apache Spark 2 : with resilient distributed datasets, Spark SQL, structured streaming and Spark machine learning library

by Luu, Hien, author in Spark (Electronic resource : Apache Software Foundation) , Distributed databases.

Develop applications for the big data landscape with Spark and Hadoop. This book also explains the role of Spark in developing scalable machine learning and analytics applications with Cloud technologies. "Beginning Apache Spark 2" gives you an introduction to Apache Spark and shows you how to work with it.

Book

Elevating Smart Manufacturing with a Unified Predictive Maintenance Platform: The Synergy between Data Warehousing, Apache Spark, and Machine Learning

by Su, Naijing , Huang, Shifeng , Su, Chuanjun in Apache Spark , Artificial intelligence , data warehousing

2024

The transition to smart manufacturing introduces heightened complexity in regard to the machinery and equipment used within modern collaborative manufacturing landscapes, presenting significant risks associated with equipment failures. The core ambition of smart manufacturing is to elevate automation through the integration of state-of-the-art technologies, including artificial intelligence (AI), the Internet of Things (IoT), machine-to-machine (M2M) communication, cloud technology, and expansive big data analytics. This technological evolution underscores the necessity for advanced predictive maintenance strategies that proactively detect equipment anomalies before they escalate into costly downtime. Addressing this need, our research presents an end-to-end platform that merges the organizational capabilities of data warehousing with the computational efficiency of Apache Spark. This system adeptly manages voluminous time-series sensor data, leverages big data analytics for the seamless creation of machine learning models, and utilizes an Apache Spark-powered engine for the instantaneous processing of streaming data for fault detection. This comprehensive platform exemplifies a significant leap forward in smart manufacturing, offering a proactive maintenance model that enhances operational reliability and sustainability in the digital manufacturing era.

Journal Article

Mastering Spark for data science

by Morgan, Andrew, author in Spark (Electronic resource : Apache Software Foundation) , Data mining. , Machine learning.

"Master the techniques and sophisticated analytics used to construct Spark-based solutions that scale to deliver production-grade data science products."

Book

Developing a restaurant recommended system via the Vietnamese food image classification

by Nguyen, Anh Thai , Pham, Viet Hoang , Phan, Truong Ho-Viet

2024

A recommendation system recommends products and services to users based on their daily online searching habits. Recommender systems are applied in many fields such as job searching, health care, education, music, and tourism. However, few studies have combined computer vision and collaborative filtering to build a restaurant recommendation system in the tourism sector. In this study, we presented a solution to build a restaurant recommendation system through Vietnamese food image classification. First, we used ResNet-34, which is a variant of the convolutional neural network, to classify Vietnamese food images. Then, the system applied the alternating least squares technique in matrix factorization and Apache Spark in distributed computing to train on the restaurant location dataset. The output was a list of the most relevant restaurants, giving users many choices. The experimental datasets included the Vietnamese image and the restaurant location datasets that were collected from the kaggle.com and foody.vn websites. For image classification task evaluation, we compared ResNet-34 to variants of ResNet. For the restaurant recommendation task evaluation, we compared alternating least squares with k-nearest neighbor. The comparison results show that the proposed solution is better than traditional popular models.

Journal Article
