7,000 results for "big data processing technologies"
Big Data: Big Data Analysis, Issues and Challenges and Technologies
Data generated at an exponential rate has resulted in Big Data. This data has many characteristics and comes in structured, unstructured, and semi-structured formats. It contains valuable information for different types of stakeholders based on their needs; however, those needs cannot be met with traditional tools and techniques. Here, big data technologies play a crucial role in handling, storing, and processing this tremendous amount of data in real time. Big data analytics is used to extract meaningful information or patterns from voluminous data and can be further divided into four types: text analytics, audio analytics, video analytics, and social media analytics. Big data analytics, when carried out within a well-defined big data analysis process, plays a significant role in generating meaningful information from big data. The big data analysis process consists of data acquisition, data storage, data management, data analytics, and finally data visualization. However, it is not simple and brings many challenges that need to be resolved. This paper presents the issues and challenges related to big data, the prominent characteristics of big data, big data analytics, the big data analysis process, and the technologies used for processing massive data.
Application of Big Data Processing Technology in Human Resource Management Information System
As an important part of high-level management models, human resource management has attracted much attention in all fields. Human resource management faces large-scale data problems, and it is urgent to discover valuable knowledge in that data and provide decision-making support for enterprises developing talent training strategies. The purpose of this article is to analyze the application of big data processing technology in human resource management information systems. This paper applies a literature research method, a case research method, and a comparative research method to conduct a comprehensive analysis of one company's human resource management information system. In addition, taking that company's human resource data as an example, and after a large amount of data preprocessing work, a human resource data mining system based on association rules is designed and implemented. It applies an incremental association rule mining algorithm to the human resource data, visualizes the results, and ultimately mines valuable information. The experimental results show that with fewer than 100 client and server users, the performance of the system fully meets the needs of actual work.
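The abstract does not reproduce the paper's incremental mining algorithm, so as a hedged illustration only, the sketch below shows the two basic association-rule quantities (support and confidence) computed over toy HR records. All function names and data here are hypothetical, not taken from the paper.

```python
from itertools import combinations
from collections import Counter

def frequent_pairs(transactions, min_support):
    """Count co-occurring attribute pairs; keep those whose support >= min_support."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    has_a = [t for t in transactions if antecedent in t]
    return sum(consequent in t for t in has_a) / len(has_a)

# Toy HR records: each row is the set of attributes of one employee (hypothetical).
records = [
    {"engineering", "senior", "retained"},
    {"engineering", "junior", "retained"},
    {"sales", "senior", "left"},
    {"engineering", "senior", "retained"},
]

pairs = frequent_pairs(records, min_support=0.5)
# ("engineering", "retained") co-occurs in 3 of 4 records -> support 0.75
rule_conf = confidence(records, "engineering", "retained")  # 3/3 = 1.0
```

An incremental variant, as the paper's title suggests, would update these counts as new employee records arrive rather than rescanning the full dataset.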
Big Data
This chapter starts with an overview of two pieces of Big Data software that are particularly important: the Hadoop file system, which stores data on clusters, and the Spark cluster computing framework, which can process that data. It focuses on some of the fundamental concepts that underlie Big Data frameworks and cluster computing in general, including the famed MapReduce (MR) programming paradigm. Spark is the leading Big Data processing technology these days in the Hadoop ecosystem, having largely replaced traditional Hadoop MR. Spark is written in Scala and runs fastest when called from Scala, but the chapter introduces the Python API, which is called PySpark. The central data abstraction in PySpark is the "resilient distributed dataset" (RDD), a distributed collection of Python objects. MapReduce is the most popular programming paradigm for Big Data technologies, and the chapter provides several guidelines applicable to any MR framework, including Spark.
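To make the MR paradigm concrete without requiring a Spark installation, here is a minimal pure-Python sketch of the classic word-count job, split into the map, shuffle, and reduce phases that any MR framework provides:

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data processing", "big data analytics"]
counts = reduce_phase(shuffle(map_phase(lines)))
# counts["big"] == 2, counts["data"] == 2, counts["analytics"] == 1
```

In PySpark the same job is expressed on an RDD as `rdd.flatMap(str.split).map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)`, with the shuffle handled by the cluster.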
Big Data, Little Data, No Data
"Big Data" is on the covers of Science, Nature, the Economist, and Wired magazines, and on the front pages of the Wall Street Journal and the New York Times. But despite the media hyperbole, as Christine Borgman points out in this examination of data and scholarly research, having the right data is usually better than having more data; little data can be just as valuable as big data. In many cases, there are no data -- because relevant data don't exist, cannot be found, or are not available. Moreover, data sharing is difficult, incentives to do so are minimal, and data practices vary widely across disciplines. Borgman, an often-cited authority on scholarly communication, argues that data have no value or meaning in isolation; they exist within a knowledge infrastructure -- an ecology of people, practices, technologies, institutions, material objects, and relationships. After laying out the premises of her investigation -- six "provocations" meant to inspire discussion about the uses of data in scholarship -- Borgman offers case studies of data practices in the sciences, the social sciences, and the humanities, and then considers the implications of her findings for scholarly practice and research policy. To manage and exploit data over the long term, Borgman argues, requires massive investment in knowledge infrastructures; at stake is the future of scholarship.
‘Fit-for-purpose?’ – challenges and opportunities for applications of blockchain technology in the future of healthcare
Blockchain is a shared distributed digital ledger technology that can better facilitate data management, provenance, and security, and has the potential to transform healthcare. Importantly, blockchain represents a data architecture whose application goes far beyond Bitcoin, the cryptocurrency that relies on blockchain and has popularized the technology. In the health sector, blockchain is being aggressively explored by various stakeholders to optimize business processes, lower costs, improve patient outcomes, enhance compliance, and enable better use of healthcare-related data. However, critical to assessing whether blockchain can fulfill the hype of a technology characterized as 'revolutionary' and 'disruptive' is the need to ensure that blockchain design elements consider actual healthcare needs from the diverse perspectives of consumers, patients, providers, and regulators. In addition to answering the real needs of healthcare stakeholders, blockchain approaches must also be responsive to the unique challenges faced in healthcare compared to other sectors of the economy. In this sense, ensuring that a health blockchain is 'fit-for-purpose' is pivotal. This concept forms the basis for this article, where we share views from a multidisciplinary group of practitioners at the forefront of blockchain conceptualization, development, and deployment.
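The ledger property the article relies on, that altering a stored record is detectable because each block commits to the hash of its predecessor, can be sketched in a few lines. This is a generic hash-chain illustration under my own assumptions, not any specific health-blockchain design from the article:

```python
import hashlib
import json

def block_hash(block):
    """Deterministic SHA-256 hash of a block's contents."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, record):
    """Link a new record to the hash of the previous block (or zeros if first)."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"record": record, "prev_hash": prev})

chain = []
append_block(chain, {"patient": "anon-01", "event": "lab result filed"})
append_block(chain, {"patient": "anon-01", "event": "prescription issued"})

# Tampering with block 0 breaks the hash link stored in block 1.
valid = chain[1]["prev_hash"] == block_hash(chain[0])        # True before tampering
chain[0]["record"]["event"] = "altered"
still_valid = chain[1]["prev_hash"] == block_hash(chain[0])  # False after tampering
```

Real deployments add consensus, signatures, and access control on top of this basic tamper-evidence property.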
Urban Planning and Smart City Decision Management Empowered by Real-Time Data Processing Using Big Data Analytics
The Internet of Things (IoT), inspired by the tremendous growth of connected heterogeneous devices, has pioneered the notion of the smart city. Various components, i.e., smart transportation, smart community, smart healthcare, smart grid, etc., integrated within the smart city architecture, aim to enrich the quality of life (QoL) of urban citizens. However, real-time processing requirements and exponential data growth hinder smart city realization. Therefore, herein we propose a Big Data analytics (BDA)-embedded experimental architecture for smart cities. Two major aspects are served by the BDA-embedded smart city. Firstly, it facilitates exploitation of urban Big Data (UBD) in planning, designing, and maintaining smart cities. Secondly, it employs BDA to manage and process voluminous UBD to enhance the quality of urban services. The three tiers of the proposed architecture are responsible for data aggregation, real-time data management, and service provisioning. Moreover, offline and online data processing tasks are further expedited by integrating data normalizing and data filtering techniques into the proposed work. By analyzing authenticated datasets, we obtained the threshold values required for urban planning and city operation management. Performance metrics in terms of online and offline data processing for the proposed dual-node Hadoop cluster are obtained using the aforementioned authentic datasets. Throughput and processing time analysis performed against existing works demonstrates the performance superiority of the proposed work. Hence, we can claim the applicability and reliability of implementing the proposed BDA-embedded smart city architecture in the real world.
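The abstract mentions data normalizing, data filtering, and operational thresholds without giving formulas; a plausible minimal reading is min-max normalization followed by threshold filtering, sketched below on hypothetical traffic-sensor readings (the data and threshold are my assumptions, not the paper's):

```python
def min_max_normalize(values):
    """Scale readings into [0, 1]; assumes at least two distinct values."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def filter_by_threshold(values, threshold):
    """Keep only readings at or above an operational threshold."""
    return [v for v in values if v >= threshold]

# Hypothetical vehicles-per-minute counts from five road sensors.
traffic = [120, 300, 480, 60, 420]
normalized = min_max_normalize(traffic)       # 480 -> 1.0, 60 -> 0.0
congested = filter_by_threshold(normalized, threshold=0.5)
# three sensors exceed the congestion threshold
```

In the proposed architecture, steps like these would sit in the real-time data management tier, trimming the volume of urban Big Data before heavier analytics run.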
Data Ingestion with Python Cookbook
Deploy your data ingestion pipeline, orchestrate, and monitor efficiently to prevent loss of data and quality.

Key Features:
  • Harness best practices to create a Python and PySpark data ingestion pipeline
  • Seamlessly automate and orchestrate your data pipelines using Apache Airflow
  • Build a monitoring framework by integrating the concept of data observability into your pipelines

Book Description: Data Ingestion with Python Cookbook offers a practical approach to designing and implementing data ingestion pipelines. It presents real-world examples with the most widely recognized open source tools on the market to answer commonly asked questions and overcome challenges. You'll be introduced to designing and working with or without data schemas, as well as creating monitored pipelines with Airflow and data observability principles, all while following industry best practices. The book also addresses challenges associated with reading different data sources and data formats. As you progress through the book, you'll gain a broader understanding of error logging best practices, troubleshooting techniques, data orchestration, monitoring, and storing logs for further consultation. By the end of the book, you'll have a fully automated setup that lets you start ingesting and monitoring your data pipeline effortlessly, facilitating seamless integration with subsequent stages of the ETL process.

What you will learn:
  • Implement data observability using monitoring tools
  • Automate your data ingestion pipeline
  • Read analytical and partitioned data, whether schema or non-schema based
  • Debug and prevent data loss through efficient data monitoring and logging
  • Establish data access policies using a data governance framework
  • Construct a data orchestration framework to improve data quality

Who this book is for: This book is for data engineers and data enthusiasts seeking a comprehensive understanding of the data ingestion process using popular tools in the open source community. For more advanced learners, it takes on the theoretical pillars of data governance while providing practical examples of real-world scenarios commonly encountered by data engineers.
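The core pattern the book's blurb describes, ingest, validate, and emit observability metrics via logging, can be sketched in plain Python (Airflow would merely schedule a step like this; the function and field names below are hypothetical, not from the book):

```python
import csv
import io
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ingest")

def ingest(csv_text, required_fields):
    """Read CSV rows, reject records missing required fields, log quality counts."""
    good, rejected = [], 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        if all(row.get(field) for field in required_fields):
            good.append(row)
        else:
            rejected += 1
    # Data-observability signal: how much arrived vs. how much was dropped.
    log.info("ingested=%d rejected=%d", len(good), rejected)
    return good, rejected

raw = "id,name\n1,Alice\n2,\n3,Bob\n"
rows, rejected = ingest(raw, required_fields=["id", "name"])
# 2 valid records kept; the row with a missing name is rejected and counted
```

In an Airflow deployment, a task like this would run inside a DAG, with the logged counts scraped by a monitoring tool to alert on data loss.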
Artificial intelligence and big data analytics for supply chain resilience: a systematic literature review
Artificial Intelligence (AI) and Big Data Analytics (BDA) have the potential to significantly improve the resilience of supply chains and to facilitate more effective management of supply chain resources. Despite such potential benefits and the increasing popularity of AI and BDA in the context of supply chains, research to date is dispersed into research streams largely based on publication outlet. We curate and synthesise this dispersed knowledge by conducting a systematic literature review of AI and BDA research in supply chain resilience published in Chartered Association of Business Schools (CABS)-ranked journals between 2011 and 2021. The search strategy resulted in 522 studies, of which 23 were identified as primary papers relevant to this research. The findings advance knowledge by (i) assessing the current state of AI and BDA in the supply chain literature, (ii) identifying the phases of supply chain resilience (readiness, response, recovery, adaptability) that AI and BDA have been reported to improve, and (iii) synthesising the reported benefits of AI and BDA in the context of supply chain resilience.
Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: a survey
The combined impact of new computing resources and techniques with an increasing avalanche of large datasets is transforming many research areas and may lead to technological breakthroughs that can be used by billions of people. In recent years, Machine Learning and especially its subfield Deep Learning have seen impressive advances. Techniques developed within these two fields are now able to analyze and learn from huge numbers of real-world examples in disparate formats. While the number of Machine Learning algorithms is extensive and growing, their implementations through frameworks and libraries are also extensive and growing. Software development in this field is fast-paced, with a large amount of open-source software coming from academia, industry, start-ups, and wider open-source communities. This survey presents a comprehensive overview of the recent period, with comparisons as well as trends in the development and usage of cutting-edge Artificial Intelligence software. It also provides an overview of massive parallelism support that is capable of scaling computation effectively and efficiently in the era of Big Data.
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions
In the last few years, the deep learning (DL) computing paradigm has been deemed the gold standard in the machine learning (ML) community. Moreover, it has gradually become the most widely used computational approach in the field of ML, achieving outstanding results on several complex cognitive tasks, matching or even beating human performance. One of the benefits of DL is its ability to learn from massive amounts of data. The DL field has grown rapidly in the last few years and has been used extensively to successfully address a wide range of traditional applications. More importantly, DL has outperformed well-known ML techniques in many domains, e.g., cybersecurity, natural language processing, bioinformatics, robotics and control, and medical information processing, among many others. Although several works have reviewed the state of the art in DL, each of them tackles only one aspect of the field, which leads to an overall lack of knowledge about it. Therefore, in this contribution, we propose a more holistic approach in order to provide a more suitable starting point from which to develop a full understanding of DL. Specifically, this review attempts to provide a more comprehensive survey of the most important aspects of DL, including the enhancements recently added to the field. In particular, this paper outlines the importance of DL and presents the types of DL techniques and networks. It then presents convolutional neural networks (CNNs), the most utilized DL network type, and describes the development of CNN architectures together with their main features, starting with the AlexNet network and closing with the High-Resolution Network (HRNet). Finally, we present the challenges and suggested solutions to help researchers understand the existing research gaps, followed by a list of the major DL applications. Computational tools including FPGAs, GPUs, and CPUs are summarized, along with a description of their influence on DL. The paper ends with the evolution matrix, benchmark datasets, and a summary and conclusion.
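The building block shared by every CNN architecture the review surveys is the convolution (strictly, cross-correlation) of a small kernel over an image. As a framework-free illustration, and not code from the review, the sketch below applies a simple vertical-edge kernel to a tiny image:

```python
def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation: the core operation of a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product of the kernel with the image patch at (i, j).
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        output.append(row)
    return output

# Tiny image with a dark left half and bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Simple vertical-edge kernel: responds only where intensity changes left-to-right.
kernel = [
    [1, -1],
    [1, -1],
]
feature_map = conv2d(image, kernel)
# Non-zero responses appear only at the vertical boundary column
```

Architectures from AlexNet to HRNet stack thousands of such filters, learning the kernel values from data instead of hand-designing them.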