Catalogue Search | MBRL

Cluster analysis for corpus linguistics

by Moisl, Hermann, 1949- author in Corpora (Linguistics) -- Data processing , Cluster analysis -- Data processing , Natural language processing (Computer science)

Book

Cluster Analysis for Corpus Linguistics

by Moisl, Hermann in Cluster analysis , Cluster analysis -- Data processing , Computational linguistics

2015

The series Quantitative Linguistics publishes books on all aspects of quantitative methods and models in linguistics, text analysis and related research fields. Specifically, the scope of the series covers the whole spectrum of theoretical and empirical research, ultimately striving for an exact mathematical formulation and empirical testing of hypotheses: observation and description of linguistic data, application of methods and models, discussion of methodological and epistemological issues, modelling of language and text phenomena.

eBook

Big Data Analytics with Hadoop 3

by Alla, Sridhar in Big data , Cluster analysis-Data processing , COMPUTERS / Data Science / General

2018, 2024

Apache Hadoop is a widely used platform for big data processing and for building powerful analytics solutions. This book shows you how to do just that, with the help of practical examples. By the end of the book, you will be well versed in the analytical capabilities of the Hadoop ecosystem and able to use Apache Spark and Apache Flink to perform big data analytics.

eBook

Instant MapReduce Patterns - Hadoop Essentials How-to

by Perera, Srinath

2013

In Detail: MapReduce is a technology that enables users to process large datasets, and Hadoop is an implementation of MapReduce. More and more data is becoming available, and it hides many insights that might hold the key to success or failure. MapReduce makes it possible to write code that processes and analyzes this data. Instant MapReduce Patterns - Hadoop Essentials How-to is a concise introduction to Hadoop and programming with MapReduce. It aims to get you started and give you an overall feel for programming with Hadoop, so that you have a well-grounded foundation for understanding and solving your MapReduce problems as needed. The book starts with the configuration of Hadoop before moving on to simple examples and a discussion of MapReduce programming patterns. We start by installing Hadoop and writing a word count program. After that, we deal with the seven styles of MapReduce programs: analytics, set operations, cross correlation, search, graph, joins, and clustering. For each case, you will learn the pattern and create a representative example program. The book also provides pointers for further enhancing your Hadoop skills.

Approach: Filled with practical, step-by-step instructions and clear explanations for the most important and useful tasks. This is a Packt Instant How-to guide, which provides concise and clear recipes for getting started with Hadoop.

Who this book is for: This book is for big data enthusiasts and would-be Hadoop programmers. It is also meant for Java programmers who either have not worked with Hadoop at all, or who know Hadoop and MapReduce but are not sure how to deepen their understanding.

eBook

Instant MapReduce Patterns - Hadoop Essentials How-to

by Perera, Srinath in Apache Hadoop , Big data and Business intelligence , Cluster analysis

2013

MapReduce is a technology that enables users to process large datasets, and Hadoop is an implementation of MapReduce. More and more data is becoming available, and it hides many insights that might hold the key to success or failure. MapReduce makes it possible to write code that processes and analyzes this data. Instant MapReduce Patterns - Hadoop Essentials How-to is a concise introduction to Hadoop and programming with MapReduce. It aims to get you started and give you an overall feel for programming with Hadoop, so that you have a well-grounded foundation for understanding and solving your MapReduce problems as needed. The book starts with the configuration of Hadoop before moving on to simple examples and a discussion of MapReduce programming patterns. We start by installing Hadoop and writing a word count program. After that, we deal with the seven styles of MapReduce programs: analytics, set operations, cross correlation, search, graph, joins, and clustering. For each case, you will learn the pattern and create a representative example program. The book also provides pointers for further enhancing your Hadoop skills.

eBook

Recent Advances in Hybrid Metaheuristics for Data Clustering

by Sourav De, Sandip Dey, Siddhartha Bhattacharyya in Cluster analysis , Cluster analysis-Data processing , Metaheuristics

2020

An authoritative guide to an in-depth analysis of various state-of-the-art data clustering approaches using a range of computational intelligence techniques.

Recent Advances in Hybrid Metaheuristics for Data Clustering offers a guide to the fundamentals of various metaheuristics and their application to data clustering. Metaheuristics are designed to tackle complex clustering problems where classical clustering algorithms have failed to be either effective or efficient. The authors, noted experts on the topic, provide a text that can aid in the design and development of hybrid metaheuristics to be applied to data clustering. The book includes performance analysis of the hybrid metaheuristics in relation to their conventional counterparts. In addition to providing a review of data clustering, the authors include in-depth analysis of different optimization algorithms. To enhance comprehension, the text offers a step-by-step guide to the build-up of hybrid metaheuristics. In addition, the book contains a range of real-life case studies and their applications.

This important text:

* Includes performance analysis of the hybrid metaheuristics as related to their conventional counterparts
* Offers an in-depth analysis of a range of optimization algorithms
* Highlights a review of data clustering
* Contains a detailed overview of different standard metaheuristics in current use
* Presents a step-by-step guide to the build-up of hybrid metaheuristics
* Offers real-life case studies and applications

Written for researchers, students and academics in computer science, mathematics, and engineering, Recent Advances in Hybrid Metaheuristics for Data Clustering provides a text that explores the current data clustering approaches using a range of computational intelligence techniques.

eBook

Data Clustering in C++

by Gan, Guojun in C++ (Computer program language) , Cluster analysis , Cluster analysis -- Data processing

2011

Using object-oriented design and programming techniques in C++, this book explores the commonalities of all data clustering algorithms to create a flexible set of reusable classes that simplifies the implementation of any data clustering algorithm. The first part of the text reviews basic concepts of data clustering, the UML, object-oriented programming in C++, and design patterns. The second section develops the data clustering base classes. The third part implements several popular data clustering algorithms. Additional topics such as data pre-processing, data and cluster visualization, and cluster interpretation are briefly covered.

eBook

Activation likelihood estimation meta-analysis revisited

by Bzdok, Danilo , Eickhoff, Simon B. , Laird, Angela R. in Algorithms , Brain - anatomy & histology , Brain research

2012

A widely used technique for coordinate-based meta-analysis of neuroimaging data is activation likelihood estimation (ALE), which determines the convergence of foci reported from different experiments. ALE analysis involves modelling these foci as probability distributions whose width is based on empirical estimates of the spatial uncertainty due to the between-subject and between-template variability of neuroimaging data. ALE results are assessed against a null-distribution of random spatial association between experiments, resulting in random-effects inference. In the present revision of this algorithm, we address two remaining drawbacks of the previous algorithm. First, the assessment of spatial association between experiments was based on a highly time-consuming permutation test, which nevertheless entailed the danger of underestimating the right tail of the null-distribution. In this report, we outline how this previous approach may be replaced by a faster and more precise analytical method. Second, the previously applied correction procedure, i.e. controlling the false discovery rate (FDR), is supplemented by new approaches for correcting the family-wise error rate and the cluster-level significance. The different alternatives for drawing inference on meta-analytic results are evaluated on an exemplary dataset on face perception as well as discussed with respect to their methodological limitations and advantages. In summary, we thus replaced the previous permutation algorithm with a faster and more rigorous analytical solution for the null-distribution and comprehensively address the issue of multiple-comparison corrections. The proposed revision of the ALE-algorithm should provide an improved tool for conducting coordinate-based meta-analyses on functional imaging data.

► The permutation procedure of ALE is replaced by a faster and more accurate approach.
► Family-wise error correction and cluster-level inference are introduced into ALE.
► The current and revised implementation of ALE yields comparable results.

Journal Article

Vibrational spectroscopic image analysis of biological material using multivariate curve resolution–alternating least squares (MCR-ALS)

by Tauler, Romà , de Juan, Anna , Felten, Judith in 631/114/1564 , 631/1647/527/1821 , 631/1647/527/2257

2015

Chemical compositional information can be extracted from Raman and infrared microscopic images by MCR-ALS. The algorithm finds the spectral profiles of compounds contributing to each image pixel and their relative concentrations. Raman and Fourier transform IR (FTIR) microspectroscopic images of biological material (tissue sections) contain detailed information about their chemical composition. The challenge lies in identifying changes in chemical composition, as well as locating and assigning these changes to different conditions (pathology, anatomy, environmental or genetic factors). Multivariate data analysis techniques are ideal for decrypting such information from the data. This protocol provides a user-friendly pipeline and graphical user interface (GUI) for data pre-processing and unmixing of pixel spectra into their contributing pure components by multivariate curve resolution–alternating least squares (MCR-ALS) analysis. The analysis considers the full spectral profile in order to identify the chemical compounds and to visualize their distribution across the sample to categorize chemically distinct areas. Results are rapidly achieved (usually <30–60 min per image), and they are easy to interpret and evaluate both in terms of chemistry and biology, making the method generally more powerful than principal component analysis (PCA) or heat maps of single-band intensities. In addition, chemical and biological evaluation of the results by means of reference matching and segmentation maps (based on k-means clustering) is possible.

Journal Article

Spatiotemporal clustering: a review

by Ahmad, Amir , Khan, Shehroz S , Ansari, Mohd Yousuf in Algorithms , Artificial intelligence , Classification

2020

The growth of spatiotemporal data repositories has opened up new challenges in the fields of spatiotemporal data analysis and data mining. Foremost among them is “spatiotemporal clustering,” a subfield of data mining that is becoming increasingly popular because of its applications in wide-ranging areas such as engineering, surveillance, transportation, environmental and seismology studies, and mobile data analysis. This paper presents a comprehensive review of spatiotemporal clustering approaches and their applications, as well as a brief tutorial on the taxonomy of data types and patterns in the spatiotemporal domain. Additionally, the data pre-processing techniques, access methods, cluster validation, space–time scan statistics, software tools, and datasets used by various spatiotemporal clustering algorithms are highlighted.

Journal Article
