Catalogue Search | MBRL
Explore the vast range of titles available.
4,282 result(s) for "Cluster analysis Data processing."
Cluster Analysis for Corpus Linguistics
by Moisl, Hermann
in Cluster analysis, Cluster analysis -- Data processing, Computational linguistics
2015
The series Quantitative Linguistics publishes books on all aspects of quantitative methods and models in linguistics, text analysis and related research fields. Specifically, the scope of the series covers the whole spectrum of theoretical and empirical research, ultimately striving for an exact mathematical formulation and empirical testing of hypotheses: observation and description of linguistic data, application of methods and models, discussion of methodological and epistemological issues, modelling of language and text phenomena.
Instant MapReduce Patterns - Hadoop Essentials How-To
2013
Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. This is a Packt Instant How-to guide, which provides concise and clear recipes for getting started with Hadoop. This book is for big data enthusiasts and would-be Hadoop programmers. It is also meant for Java programmers who either have not worked with Hadoop at all, or who know Hadoop and MapReduce but are not sure how to deepen their understanding.
Instant MapReduce Patterns - Hadoop Essentials How-to
by Perera, Srinath
in Big data and Business intelligence, COM018000 COMPUTERS / Data Processing, COMPUTERS / Data Modeling & Design
2013
MapReduce is a technology that enables users to process large datasets, and Hadoop is an implementation of MapReduce. More and more data is becoming available, and it hides many insights that might hold the key to success or failure. With MapReduce, you can write code to process and analyze this data. Instant MapReduce Patterns – Hadoop Essentials How-to is a concise introduction to Hadoop and programming with MapReduce. It aims to get you started and give you an overall feel for programming with Hadoop, so that you will have a well-grounded foundation to understand and solve your MapReduce problems as needed. The book starts with the configuration of Hadoop before moving on to writing simple examples and discussing MapReduce programming patterns. We will start simply by installing Hadoop and writing a word count program. After that, we will deal with the seven styles of MapReduce programs: analytics, set operations, cross correlation, search, graph, joins, and clustering. For each case, you will learn the pattern and create a representative example program. The book also provides you with additional pointers to further enhance your Hadoop skills.
Recent Advances in Hybrid Metaheuristics for Data Clustering
by Sourav De, Sandip Dey, Siddhartha Bhattacharyya
in Cluster analysis, Cluster analysis-Data processing, Metaheuristics
2020
An authoritative guide to an in-depth analysis of various state-of-the-art data clustering approaches using a range of computational intelligence techniques
Recent Advances in Hybrid Metaheuristics for Data Clustering offers a guide to the fundamentals of various metaheuristics and their application to data clustering. Metaheuristics are designed to tackle complex clustering problems where classical clustering algorithms have failed to be either effective or efficient. The authors—noted experts on the topic—provide a text that can aid in the design and development of hybrid metaheuristics to be applied to data clustering.
The book includes performance analysis of the hybrid metaheuristics in relation to their conventional counterparts. In addition to providing a review of data clustering, the authors include in-depth analysis of different optimization algorithms. To enhance comprehension, the text offers a step-by-step guide to the build-up of hybrid metaheuristics. In addition, the book contains a range of real-life case studies and their applications. This important text:
* Includes performance analysis of the hybrid metaheuristics as related to their conventional counterparts
* Offers an in-depth analysis of a range of optimization algorithms
* Highlights a review of data clustering
* Contains a detailed overview of different standard metaheuristics in current use
* Presents a step-by-step guide to the build-up of hybrid metaheuristics
* Offers real-life case studies and applications
Written for researchers, students and academics in computer science, mathematics, and engineering, Recent Advances in Hybrid Metaheuristics for Data Clustering provides a text that explores the current data clustering approaches using a range of computational intelligence techniques.
Data Clustering in C++
by Gan, Guojun
in C++ (Computer program language), Cluster analysis, Cluster analysis -- Data processing
2011
Using object-oriented design and programming techniques in C++, this book explores the commonalities of all data clustering algorithms to create a flexible set of reusable classes that simplifies the implementation of any data clustering algorithm. The first part of the text reviews basic concepts of data clustering, the UML, object-oriented programming in C++, and design patterns. The second section develops the data clustering base classes. The third part implements several popular data clustering algorithms. Additional topics such as data pre-processing, data and cluster visualization, and cluster interpretation are briefly covered.
Big Data Analytics with Hadoop 3
by Alla, Sridhar
in Big data, Cluster analysis-Data processing, COM018000 COMPUTERS / Data Processing
2018
Apache Hadoop is the most popular platform for big data processing to build powerful analytics solutions. This book shows you how to do just that, with the help of practical examples. By the end of this book, you will be well versed in the analytical capabilities of the Hadoop ecosystem, using Apache Spark and Apache Flink to perform big data analytics.
Securing Hadoop
by Narayanan, Sudheesh
in Apache Hadoop, Big data and Business intelligence, COMPUTERS / Database Management / General
2013
This book is a step-by-step tutorial filled with practical examples, focusing mainly on the key security tools and implementation techniques of Hadoop security. This book is great for Hadoop practitioners (solution architects, Hadoop administrators, developers, and Hadoop project managers) who are looking to get a good grounding in what Kerberos is all about and who wish to learn how to implement end-to-end Hadoop security within an enterprise setup. It’s assumed that you will have some basic understanding of Hadoop as well as be familiar with some basic security concepts.
Activation likelihood estimation meta-analysis revisited
by Bzdok, Danilo; Eickhoff, Simon B.; Laird, Angela R.
in Algorithms, Brain - anatomy & histology, Brain research
2012
A widely used technique for coordinate-based meta-analysis of neuroimaging data is activation likelihood estimation (ALE), which determines the convergence of foci reported from different experiments. ALE analysis involves modelling these foci as probability distributions whose width is based on empirical estimates of the spatial uncertainty due to the between-subject and between-template variability of neuroimaging data. ALE results are assessed against a null-distribution of random spatial association between experiments, resulting in random-effects inference. In the present revision of this algorithm, we address two remaining drawbacks of the previous algorithm. First, the assessment of spatial association between experiments was based on a highly time-consuming permutation test, which nevertheless entailed the danger of underestimating the right tail of the null-distribution. In this report, we outline how this previous approach may be replaced by a faster and more precise analytical method. Second, the previously applied correction procedure, i.e. controlling the false discovery rate (FDR), is supplemented by new approaches for correcting the family-wise error rate and the cluster-level significance. The different alternatives for drawing inference on meta-analytic results are evaluated on an exemplary dataset on face perception as well as discussed with respect to their methodological limitations and advantages. In summary, we thus replaced the previous permutation algorithm with a faster and more rigorous analytical solution for the null-distribution and comprehensively address the issue of multiple-comparison corrections. The proposed revision of the ALE-algorithm should provide an improved tool for conducting coordinate-based meta-analyses on functional imaging data.
* The permutation procedure of ALE is replaced by a faster and more accurate approach.
* Family-wise error correction and cluster-level inference are introduced into ALE.
* The current and revised implementations of ALE yield comparable results.
Journal Article
OceanSODA-ETHZ: a global gridded data set of the surface ocean carbonate system for seasonal to decadal studies of ocean acidification
2021
Ocean acidification has profoundly altered the ocean's carbonate chemistry since preindustrial times, with potentially serious consequences for marine life. Yet, no long-term, global observation-based data set exists that allows us to study changes in ocean acidification for all carbonate system parameters over the last few decades. Here, we fill this gap and present a methodologically consistent global data set of all relevant surface ocean parameters, i.e., dissolved inorganic carbon (DIC), total alkalinity (TA), partial pressure of CO2 (pCO2), pH, and the saturation state with respect to mineral CaCO3 (Ω) at a monthly resolution over the period 1985 through 2018 at a spatial resolution of 1°×1°. This data set, named OceanSODA-ETHZ, was created by extrapolating in time and space the surface ocean observations of pCO2 (from the Surface Ocean CO2 Atlas, SOCAT) and total alkalinity (TA; from the Global Ocean Data Analysis Project, GLODAP) using the newly developed Geospatial Random Cluster Ensemble Regression (GRaCER) method (code available at https://doi.org/10.5281/zenodo.4455354, Gregor, 2021). This method is based on a two-step (cluster-regression) approach but extends it by considering an ensemble of such cluster regressions, leading to improved robustness. Surface ocean DIC, pH, and Ω were then computed from the globally mapped pCO2 and TA using the thermodynamic equations of the carbonate system. For the open ocean, the cluster-regression method estimates pCO2 and TA with global near-zero biases and root mean squared errors of 12 µatm and 13 µmol kg⁻¹, respectively. Taking into account also the measurement and representation errors, the total uncertainty increases to 14 µatm and 21 µmol kg⁻¹, respectively. We assess the fidelity of the computed parameters by comparing them to direct observations from GLODAP, finding surface ocean pH and DIC global biases of near zero, as well as root mean squared errors of 0.023 and 16 µmol kg⁻¹, respectively.
These uncertainties are very comparable to those expected by propagating the total uncertainty from pCO2 and TA through the thermodynamic computations, indicating a robust and conservative assessment of the uncertainties. We illustrate the potential of this new data set by analyzing the climatological mean seasonal cycles of the different parameters of the surface ocean carbonate system, highlighting their commonalities and differences. Further, this data set provides a novel constraint on the global- and basin-scale trends in ocean acidification for all parameters. Concretely, we find for the period 1990 through 2018 global mean trends of 8.6 ± 0.1 µmol kg⁻¹ per decade for DIC, −0.016 ± 0.000 per decade for pH, 16.5 ± 0.1 µatm per decade for pCO2, and −0.07 ± 0.00 per decade for Ω. The OceanSODA-ETHZ data can be downloaded from https://doi.org/10.25921/m5wx-ja34 (Gregor and Gruber, 2020).
Journal Article