Catalogue Search | MBRL

Reliable estimates of beta diversity with incomplete sampling

by Kocsis, Ádám T. , Kiessling, Wolfgang , Zuschin, Martin in beta diversity , Biodiversity , Bray‐Curtis dissimilarity

2018

Beta diversity, the compositional variation among communities or assemblages, is crucial to understanding the principles of diversity assembly. The mean pairwise proportional dissimilarity expresses overall heterogeneity of samples in a data set and is among the most widely used and most robust measures of beta diversity. Obtaining a complete list of taxa and their abundances requires substantial taxonomic expertise and is time consuming. In addition, the information is generally incomplete due to sampling biases. Based on the concept of the ecological significance of dominant taxa, we explore whether determining proportional dissimilarity can be simplified based on dominant species. Using simulations and six case studies, we assess the correlation between complete community compositional data and reduced subsets of a varying number of dominant species. We find that gross beta diversity is usually depicted accurately when only the 80th percentile or five of the most abundant species of each site is considered. In data sets with very high evenness, at least the 10 most abundant species should be included. Focusing on dominant species also maintains the rank-order of beta diversity among sites. Our new approach will allow ecologists and paleobiologists to produce a far greater amount of data on diversity patterns with less time and effort, supporting conservation studies and basic science.
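The reduction to dominant species described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes Bray-Curtis dissimilarity on raw abundance counts, and the helper names (`bray_curtis`, `keep_top_k`, `mean_pairwise_dissimilarity`) are hypothetical.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # two empty samples are treated as identical (assumption)
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def keep_top_k(site, k):
    """Zero out all but the k most abundant taxa of one site.

    Ties are broken by taxon index, which is an arbitrary choice.
    """
    ranked = sorted(range(len(site)), key=lambda i: (-site[i], i))
    keep = set(ranked[:k])
    return [v if i in keep else 0 for i, v in enumerate(site)]


def mean_pairwise_dissimilarity(sites):
    """Average Bray-Curtis dissimilarity over all pairs of sites."""
    pairs = list(combinations(sites, 2))
    return sum(bray_curtis(a, b) for a, b in pairs) / len(pairs)
```

Comparing `mean_pairwise_dissimilarity(sites)` with the same call on `[keep_top_k(s, 5) for s in sites]` gives a rough sense of how much information the five most abundant taxa per site retain.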

Journal Article

Learning-Based Dissimilarity for Clustering Categorical Data

by Medina-Pérez, Miguel Angel , Rivera Rios, Edgar Jacob , Lazo-Cortés, Manuel S. in categorical data , clustering , dissimilarity

2021

Comparing data objects is at the heart of machine learning. For continuous data, object dissimilarity is usually taken to be object distance; however, for categorical data, there is no universal agreement, for categories can be ordered in several different ways. Most existing category dissimilarity measures characterize the distance among the values an attribute may take using precisely the number of different values the attribute takes (the attribute space) and the frequency at which they occur. These kinds of measures overlook attribute interdependence, which may provide valuable information when capturing per-attribute object dissimilarity. In this paper, we introduce a novel object dissimilarity measure that we call Learning-Based Dissimilarity, for comparing categorical data. Our measure characterizes the distance between two categorical values of a given attribute in terms of how likely it is that such values are confused or not when all the dataset objects with the remaining attributes are used to predict them. To that end, we provide an algorithm that, given a target attribute, first learns a classification model in order to compute a confusion matrix for the attribute. Then, our method transforms the confusion matrix into a per-attribute dissimilarity measure. We have successfully tested our measure against 55 datasets gathered from the University of California, Irvine (UCI) Machine Learning Repository. Our results show that it surpasses, in terms of various performance indicators for data clustering, the most prominent distance relations put forward in the literature.
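The second step of the method, turning a confusion matrix into a per-attribute dissimilarity, might look like the sketch below. The abstract does not give the actual transformation, so the rule used here (one minus the symmetrized confusion rate) is purely an assumption for illustration, and the function name is hypothetical. The confusion matrix is taken as given; training the classifier is out of scope.

```python
def confusion_to_dissimilarity(confusion):
    """Map a square confusion matrix (rows = true value, columns = predicted
    value, entries = counts) to a symmetric dissimilarity matrix.

    Assumed rule, not taken from the paper: values that the classifier
    confuses often are considered close, so
    d(i, j) = 1 - mean(P(pred=j | true=i), P(pred=i | true=j)).
    """
    n = len(confusion)
    # Row-normalize counts into conditional confusion rates.
    rates = []
    for row in confusion:
        total = sum(row)
        rates.append([c / total if total else 0.0 for c in row])
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            mixed = 0.5 * (rates[i][j] + rates[j][i])
            d[i][j] = d[j][i] = 1.0 - mixed
    return d
```

With this rule, two categorical values that are never mistaken for each other sit at the maximum distance of 1, while frequently confused values move closer together.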

Journal Article

Comprehensive survey on hierarchical clustering algorithms and the recent developments

in Algorithms , Cluster analysis , Clustering

2023

Data clustering is a commonly used data processing technique in many fields; it divides objects into clusters according to some similarity measure between data points. Compared with partitioning clustering methods, which give a flat partition of the data, hierarchical clustering methods can give multiple consistent partitions of the same data at different levels without rerunning the clustering, so they can be used to better analyze the complex structure of the data. There are usually two kinds of hierarchical clustering methods: divisive and agglomerative. For divisive clustering, the key issues are how to select a cluster for the next splitting procedure according to dissimilarity and how to divide the selected cluster. For agglomerative hierarchical clustering, the key issue is the similarity measure that is used to select the two most similar clusters for the next merge. Although both types of methods produce a dendrogram of the data as output, the clustering results may be very different depending on the dissimilarity or similarity measure used, and different types of methods should be selected according to the type of data and the application scenario. In this work, we therefore comprehensively review various hierarchical clustering methods, especially the most recently developed ones. Because the similarity measure plays a crucial role in the hierarchical clustering process, we also review different types of similarity measures along with the hierarchical clustering methods. More specifically, different types of hierarchical clustering methods are reviewed from six aspects, and their advantages and drawbacks are analyzed. The application of some methods in real life is also discussed. Furthermore, we include some recent works combining deep learning techniques with hierarchical clustering, which are worth serious attention and may improve hierarchical clustering significantly in the future.
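The agglomerative procedure mentioned in this abstract, repeatedly merging the two most similar clusters, can be sketched with a naive single-linkage implementation. This is one of many possible linkage choices and is not taken from the survey; the function name and the merge-history format are assumptions made for illustration.

```python
def single_linkage(dist):
    """Naive agglomerative clustering on a symmetric distance matrix.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram encodes. Cluster distance is the
    minimum pairwise distance between members (single linkage).
    """
    clusters = {i: [i] for i in range(len(dist))}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        # Scan every pair of current clusters for the closest one.
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((sorted(clusters[a]), sorted(clusters[b]), d))
        # Merge cluster b into cluster a.
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges
```

Cutting the merge history at different distances yields the multiple consistent partitions that distinguish hierarchical from flat clustering. Swapping `min` for `max` in the inner expression would give complete linkage instead.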

Journal Article

A New Typology Design of Performance Metrics to Measure Errors in Machine Learning Regression Algorithms

by Botchkarev, Alexei in Algorithms , Analysis , Business metrics

2019

Aim/Purpose: The aim of this study was to analyze various performance metrics and approaches to their classification. The main goal of the study was to develop a new typology that will help to advance knowledge of metrics and facilitate their use in machine learning regression algorithms. Background: Performance metrics (error measures) are vital components of the evaluation frameworks in various fields. A performance metric can be defined as a logical and mathematical construct designed to measure how close the actual results are to what has been expected or predicted. A vast variety of performance metrics have been described in academic literature. The most commonly mentioned metrics in research studies are Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), etc. Knowledge about metrics properties needs to be systematized to simplify the design and use of the metrics. Methodology: A qualitative study was conducted to achieve the objectives of identifying related peer-reviewed research studies, literature reviews, critical thinking and inductive reasoning. Contribution: The main contribution of this paper is in ordering knowledge of performance metrics and enhancing understanding of their structure and properties by proposing a new typology, a generic primary metrics mathematical formula and a visualization chart. Findings: Based on the analysis of the structure of numerous performance metrics, we proposed a framework of metrics which includes four (4) categories: primary metrics, extended metrics, composite metrics, and hybrid sets of metrics. The paper identified three (3) key components (dimensions) that determine the structure and properties of primary metrics: method of determining point distance, method of normalization, and method of aggregation of point distances over a data set. For each component, implementation options have been identified. The suggested new typology has been shown to cover a total of over 40 commonly used primary metrics. Recommendations for Practitioners: Presented findings can be used to facilitate teaching performance metrics to university students and expedite metrics selection and implementation processes for practitioners. Recommendation for Researchers: By using the proposed typology, researchers can streamline development of new metrics with predetermined properties. Impact on Society: The outcomes of this study could be used for improving evaluation results in machine learning regression, forecasting and prognostics with direct or indirect positive impacts on innovation and productivity in a societal sense. Future Research: Future research is needed to examine the properties of the extended metrics, composite metrics, and hybrid sets of metrics. Empirical study of the metrics is needed using R Studio or Azure Machine Learning Studio, to find associations between the properties of primary metrics and their “numerical” behavior in a wide spectrum of data characteristics and business or research requirements.
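The three components named in the findings (point distance, normalization, aggregation) can be illustrated by composing familiar metrics from interchangeable parts. This is a sketch of the idea only; it does not reproduce the paper's generic formula, and `primary_metric` and its parameters are hypothetical names.

```python
import math


def mean(values):
    return sum(values) / len(values)


def primary_metric(actual, predicted, point_distance, aggregate, normalize=None):
    """Build an error metric from three pluggable components.

    point_distance(a, p) -> distance for one observation
    normalize(d, a)      -> optional scaling of that distance
    aggregate(distances) -> single number for the whole data set
    """
    distances = []
    for a, p in zip(actual, predicted):
        d = point_distance(a, p)
        if normalize is not None:
            d = normalize(d, a)
        distances.append(d)
    return aggregate(distances)


def mae(actual, predicted):
    # Absolute distance, no normalization, arithmetic mean.
    return primary_metric(actual, predicted, lambda a, p: abs(a - p), mean)


def rmse(actual, predicted):
    # Squared distance, no normalization, root of the mean.
    return primary_metric(actual, predicted, lambda a, p: (a - p) ** 2,
                          lambda ds: math.sqrt(mean(ds)))


def mape(actual, predicted):
    # Absolute distance normalized by the actual value (must be nonzero).
    return 100 * primary_metric(actual, predicted, lambda a, p: abs(a - p),
                                mean, normalize=lambda d, a: d / abs(a))
```

Changing any one component while holding the other two fixed produces a different member of the same family, which is the kind of systematic variation a typology of primary metrics is meant to organize.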

Journal Article

PARTIAL DISTANCE CORRELATION WITH METHODS FOR DISSIMILARITIES

by Rizzo, Maria L. , Székely, Gábor J. in 62Gxx , 62H15 , 62H20

2014

Distance covariance and distance correlation are scalar coefficients that characterize independence of random vectors in arbitrary dimension. Properties, extensions and applications of distance correlation have been discussed in the recent literature, but the problem of defining the partial distance correlation has remained an open question of considerable interest. The problem of partial distance correlation is more complex than partial correlation partly because the squared distance covariance is not an inner product in the usual linear space. For the definition of partial distance correlation, we introduce a new Hilbert space where the squared distance covariance is the inner product. We define the partial distance correlation statistics with the help of this Hilbert space, and develop and implement a test for zero partial distance correlation. Our intermediate results provide an unbiased estimator of squared distance covariance, and a neat solution to the problem of distance correlation for dissimilarities rather than distances.
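For orientation, the ordinary sample distance correlation that this work builds on can be computed from double-centered distance matrices. The sketch below handles one-dimensional samples only and uses the classical (biased) estimator, not the unbiased estimator or the partial distance correlation introduced in the paper; the function names are hypothetical.

```python
import math


def _double_centered(x):
    """Pairwise |x_i - x_j| matrix with row, column and grand means removed."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row = [sum(r) / n for r in d]  # equals the column means by symmetry
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Classical sample distance correlation of two equal-length 1-D samples."""
    n = len(x)
    a, b = _double_centered(x), _double_centered(y)

    def inner(u, v):
        # Mean of the elementwise product of two centered matrices.
        return sum(u[i][j] * v[i][j] for i in range(n) for j in range(n)) / n ** 2

    dcov2, dvar_x, dvar_y = inner(a, b), inner(a, a), inner(b, b)
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(dcov2, 0.0) / denom)
```

The `inner` helper shows why an inner-product structure matters here: squared distance covariance is an average of products of centered distances, and the paper's contribution is a space in which that quantity behaves as a genuine inner product.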

Journal Article

Community stability is related to animal diversity change

by Perez Rocha, Mariana , Sokol, Eric R. , Surasinghe, Thilina D. in anthropogenic activities , Anthropogenic factors , Aquatic ecosystems

2022

Understanding the drivers of community stability in times of increasing anthropogenic pressure is an urgent issue. Biodiversity is known to promote community stability, but studies of the biodiversity–stability relationship rarely consider the full complexity of biodiversity change. Furthermore, finding generalities that hold across taxonomic groups and spatial and temporal scales remains challenging because most investigations have narrow taxonomic, spatial, and temporal scopes. We used organismal data collected through the National Ecological Observatory Network (NEON) at sites across the contiguous United States to evaluate linkages between community stability and biodiversity change for four taxonomic groups: small mammals, ground beetles, fish, and freshwater macroinvertebrates. We defined community stability as constancy of aggregate species' abundance. We quantified change in biodiversity as (1) dissimilarity in community taxonomic and functional composition and species replacement and richness change components of that dissimilarity and (2) change in species' abundance distributions as captured by change in species rank, richness, and evenness. We found that community stability increased with species replacement and with contribution of species replacement to overall dissimilarity for all taxonomic groups, but declined with increasing change in species richness and evenness. This is consistent with the notion that temporal fluctuations in species abundance can help stabilize community properties. We also found that community stability was highest when change in community functional composition was either lower or higher than expected given reshuffling of each community's taxonomic composition. This suggests that long‐term community stability can result from fluctuations of functionally similar species in assemblages with high taxonomic reshuffling. 
On the contrary, the functional uniqueness of fluctuating species compensates for lower taxonomic reshuffling to drive stabilization of community properties. Our study provides an initial assessment of the relationship between community stability and biodiversity change and illustrates the utility of fine temporal resolution data collected across ecosystems and biomes to understand the general mechanisms underlying biodiversity–stability relationships.

Journal Article

Contrasting drivers of diversity in hosts and parasites across the tropical Andes

by McNew, Sabrina M. , Williamson, Jessie L. , Witt, Christopher C. in Biological Sciences , Ecology

2021

Geographic turnover in community composition is created and maintained by eco-evolutionary forces that limit the ranges of species. One such force may be antagonistic interactions among hosts and parasites, but its general importance is unknown. Understanding the processes that underpin turnover requires distinguishing the contributions of key abiotic and biotic drivers over a range of spatial and temporal scales. Here, we address these challenges using flexible, nonlinear models to identify the factors that underlie richness (alpha diversity) and turnover (beta diversity) patterns of interacting host and parasite communities in a global biodiversity hot spot. We sampled 18 communities in the Peruvian Andes, encompassing ∼1,350 bird species and ∼400 hemosporidian parasite lineages, and spanning broad ranges of elevation, climate, primary productivity, and species richness. Turnover in both parasite and host communities was most strongly predicted by variation in precipitation, but secondary predictors differed between parasites and hosts, and between contemporary and phylogenetic timescales. Host communities shaped parasite diversity patterns, but there was little evidence for reciprocal effects. The results for parasite communities contradicted the prevailing view that biotic interactions filter communities at local scales while environmental filtering and dispersal barriers shape regional communities. Rather, subtle differences in precipitation had strong, fine-scale effects on parasite turnover while host–community effects only manifested at broad scales. We used these models to map bird and parasite turnover onto the ecological gradients of the Andean landscape, illustrating beta-diversity hot spots and their mechanistic underpinnings.

Journal Article

Similar compositional turnover but distinct insular environmental and geographical drivers of native and exotic ants in two oceans

by Hui, Cang , Roura-Pascual, Núria , Latombe, Guillaume in Annual precipitation , Ants , Atlantic Ocean

2019

Aim: This study aims to quantify the patterns in compositional turnover of native and exotic ants on small islands in two oceans, and to explore whether such patterns are driven by similar environmental, geographical and potentially biotic variables. Location: Pacific and Atlantic islands. Time period: Present. Major taxa studied: Ants. Methods: We applied Multi‐Site Generalised Dissimilarity Modelling (MS‐GDM), which relates zeta diversity, the number of species shared by a given number of islands, to differences in environmental, geographical and biotic drivers. The use of zeta diversity enabled us to differentiate the contribution of rare species (shared by few islands) from that of widespread ones (shared by multiple islands) to compositional turnover. For completeness, we also related species richness of insular ants per island to the same set of explanatory variables using Generalised Additive Models (GAM). Results: Pacific and Atlantic islands have similar patterns of ant species turnover and richness, albeit partly driven by different variables. Native and exotic species turnover are mostly explained by the same set of variables in the Pacific (annual precipitation and distance to the nearest island), but not in the Atlantic (annual precipitation is a good predictor of native species turnover, but none of the variables considered in our study explained exotic species turnover). No signal of biotic interactions was detected at the insular community level. Main conclusions: Successful invasion strategies may depend on a combination of factors specific to the region in question. In the Pacific, milder environments and the absence of natives on certain islands enable exotic ants to select the same types of environment as native ants. In the harsher Atlantic Ocean, however, native ant species are likely to be well adapted to local environmental conditions, making it harder for exotics to become established. Exotic ant species, therefore, potentially rely on other attributes to establish, such as a combination of tolerance to a wide range of environmental conditions and human‐mediated colonization.
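Zeta diversity as defined in the methods, the number of species shared by a given number of islands, can be computed directly from presence lists. The sketch below averages the shared-species count over every combination of sites of a given order, which is an assumption about how the per-combination counts are summarized; it does not implement MS‐GDM itself, and the function name is hypothetical.

```python
from itertools import combinations


def zeta(sites, order):
    """Mean number of species shared by `order` sites.

    `sites` is a list of sets of species names. Order 1 reduces to mean
    species richness; higher orders count only increasingly widespread taxa.
    """
    combos = list(combinations(sites, order))
    shared = [len(set.intersection(*combo)) for combo in combos]
    return sum(shared) / len(combos)
```

Because rare species drop out of the intersection as the order grows, the decline of `zeta(sites, k)` with `k` separates the contribution of rare species from that of widespread ones, as the abstract describes.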

Journal Article

Temporal dimensions of taxonomic and functional fish beta diversity: scaling environmental drivers in tropical transitional ecosystems

by de Andrade-Tubino, Magda Fernandes , Araújo, Francisco Gerson , Franco, Taynara Pontes in Abundance , Aquatic habitats , Biodiversity

2023

As human-induced environmental changes reduce niche opportunities from global to local scales, permanent shifts in species traits alter ecosystem processes. Disentangling multiscale mechanisms driving temporal changes in coastal fish biodiversity is thus critical to conservation aims. From seasonal to multiyear periods, we investigated how local environmental variation and landscape features shape temporal beta diversity (abundance-based and trait-based dissimilarities, and their components of replacement and nestedness) per zones in bays and coastal lagoons, Southeastern Brazil. At larger temporal dimensions, unaccounted processes in individual systems influenced primarily taxonomic dissimilarity, whereas functional responses to variation in marine influence varied between types of system and zones. From 2 years to seasons, taxonomic and functional dissimilarities decreased with abundance-based species replacement, and both trait-based components, respectively. Replacement processes were primarily related to marine influence (transparency, salinity, pH, and tidal phase) at larger temporal dimensions, and habitat availability (mangroves and nearby estuaries) and complexity (forest cover and landscape urbanization) in seasons. Seasonal variations in tidal phase (lower) and pH (higher), respectively, promoted taxonomic (abundance gradients) and functional (fish trait loss) nestedness under shorter environmental gradients. Spatial and temporal balances between marine influence, habitat quality, and seascape connectivity drive temporal dissimilarities in coastal fish assemblages.

Journal Article

How many dimensions are needed to accurately assess functional diversity? A pragmatic approach for assessing the quality of functional spaces

by Brosse, Sébastien , Grenouillet, Gaël , Maire, Eva in Biodiversity and Ecology , case studies , data collection

2015

Aim: Functional diversity is a key facet of biodiversity that is increasingly being measured to quantify its changes following disturbance and to understand its effects on ecosystem functioning. Assessing the functional diversity of assemblages based on species traits requires the building of a functional space (dendrogram or multidimensional space) where indices will be computed. However, there is still no consensus on the best method for measuring the quality of functional spaces. Innovation: Here we propose a framework for evaluating the quality of a functional space (i.e. the extent to which it is a faithful representation of the initial functional trait values). Using simulated datasets, we analysed the influence of the number and type of functional traits used and of the number of species studied on the identity and quality of the best functional space. We also tested whether the quality of the functional space affects functional diversity patterns in local assemblages, using simulated datasets and a real case study. Main conclusions: The quality of functional space strongly varied between situations. Spaces having at least four dimensions had the highest quality, while functional dendrograms and two-dimensional functional spaces always had a low quality. Importantly, we showed that using a poor-quality functional space could lead to a biased assessment of functional diversity and false ecological conclusions. Therefore, we advise a pragmatic approach consisting of computing all the possible functional spaces and selecting the most parsimonious one.

Journal Article
