Catalogue Search | MBRL

Multi-Class Active Learning by Uncertainty Sampling with Diversity Maximization

by Ma, Zhigang; Hauptmann, Alexander G.; Nie, Feiping in Active learning, Algorithms, Analysis

2015

As a way to relieve the tedious work of manual annotation, active learning plays important roles in many applications of visual concept recognition. In typical active learning scenarios, the number of labelled data in the seed set is usually small. However, most existing active learning algorithms only exploit the labelled data, which often suffers from over-fitting due to the small number of labelled examples. Besides, while much progress has been made in binary class active learning, little research attention has been focused on multi-class active learning. In this paper, we propose a semi-supervised batch mode multi-class active learning algorithm for visual concept recognition. Our algorithm exploits the whole active pool to evaluate the uncertainty of the data. Considering that uncertain data are always similar to each other, we propose to make the selected data as diverse as possible, for which we explicitly impose a diversity constraint on the objective function. As a multi-class active learning algorithm, our algorithm is able to exploit uncertainty across multiple classes. An efficient algorithm is used to optimize the objective function. Extensive experiments on action recognition, object classification, scene recognition, and event detection demonstrate its advantages.

Journal Article
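
The abstract above combines two selection criteria, uncertainty and diversity, in a batch setting. The following is a minimal sketch of that general idea only, not the authors' semi-supervised objective or optimizer: it assumes class probabilities are already available for every pool item, uses predictive entropy as the uncertainty score, and greedily adds the item with the best sum of uncertainty and distance to the items already chosen. The names `entropy`, `select_batch`, and `diversity_weight` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a multi-class probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(features, class_probs, batch_size, diversity_weight=1.0):
    """Greedily pick pool indices that are uncertain and far from earlier picks.

    features: list of numeric tuples, one per pool item (assumed representation).
    class_probs: list of per-class probability lists, one per pool item.
    """
    uncertainty = [entropy(p) for p in class_probs]
    selected = []
    candidates = set(range(len(features)))

    while candidates and len(selected) < batch_size:

        def score(i):
            if not selected:
                return uncertainty[i]
            # Distance to the closest already-selected item rewards diversity.
            spread = min(math.dist(features[i], features[j]) for j in selected)
            return uncertainty[i] + diversity_weight * spread

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
class_probs = [
    [1 / 3, 1 / 3, 1 / 3],
    [0.34, 0.33, 0.33],
    [0.5, 0.3, 0.2],
    [0.9, 0.05, 0.05],
]
diverse_batch = select_batch(features, class_probs, batch_size=2)
uncertain_only = select_batch(features, class_probs, batch_size=2, diversity_weight=0.0)
```

With the diversity term switched off, the two most uncertain items are near-duplicates of each other; with it switched on, the second pick moves to a different region of the feature space, which is the behaviour the abstract motivates.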

Fair Max–Min Diversity Maximization in Streaming and Sliding-Window Models

by Mathioudakis, Michael; Li, Jia; Wang, Yanhao in Algorithms, Analysis, Approximation

2023

Diversity maximization is a fundamental problem with broad applications in data summarization, web search, and recommender systems. Given a set X of n elements, the problem asks for a subset S of k≪n elements with maximum diversity, as quantified by the dissimilarities among the elements in S. In this paper, we study diversity maximization with fairness constraints in streaming and sliding-window models. Specifically, we focus on the max–min diversity maximization problem, which selects a subset S that maximizes the minimum distance (dissimilarity) between any pair of distinct elements within it. Assuming that the set X is partitioned into m disjoint groups by a specific sensitive attribute, e.g., sex or race, ensuring fairness requires that the selected subset S contains ki elements from each group i∈[m]. Although diversity maximization has been extensively studied, existing algorithms for fair max–min diversity maximization are inefficient for data streams. To address the problem, we first design efficient approximation algorithms for this problem in the (insert-only) streaming model, where data arrive one element at a time, and a solution should be computed based on the elements observed in one pass. Furthermore, we propose approximation algorithms for this problem in the sliding-window model, where only the latest w elements in the stream are considered for computation to capture the recency of the data. Experimental results on real-world and synthetic datasets show that our algorithms provide solutions of comparable quality to the state-of-the-art offline algorithms while running several orders of magnitude faster in the streaming and sliding-window settings.

Journal Article
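
To make the max–min objective in the abstract above concrete, here is a minimal sketch of the classic greedy farthest-first heuristic for the unconstrained, offline version of the problem. It is not the streaming, sliding-window, or fairness-aware algorithm the paper proposes; the function names and the use of Euclidean distance are assumptions made for illustration.

```python
import math
from itertools import combinations


def min_pairwise_distance(points):
    """Max-min diversity of a set: the smallest distance between two distinct members."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))


def greedy_max_min(points, k):
    """Farthest-first greedy selection of k points.

    Starts from the first point, then repeatedly adds the point whose
    nearest already-selected neighbour is farthest away.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    selected = [points[0]]
    # nearest[i] = distance from points[i] to its closest selected point.
    nearest = [math.dist(p, points[0]) for p in points]

    while len(selected) < k:
        idx = max(range(len(points)), key=nearest.__getitem__)
        selected.append(points[idx])
        # Only the newly added point can shrink a nearest-distance entry.
        nearest = [min(d, math.dist(p, points[idx])) for d, p in zip(nearest, points)]
    return selected


points = [(0, 0), (1, 0), (10, 0), (0, 10), (10, 10)]
subset = greedy_max_min(points, k=3)
diversity = min_pairwise_distance(subset)
```

A fair variant would additionally have to respect the per-group quotas ki described in the abstract; this sketch ignores group membership entirely.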

Tell me something my friends do not know: diversity maximization in social networks

by Matakos, Antonis; Tu, Sijing; Gionis, Aristides in Algorithms, Bubbles, Digital media

2020

Social media have a great potential to improve information dissemination in our society, yet they have been held accountable for a number of undesirable effects, such as polarization and filter bubbles. It is thus important to understand these negative phenomena and develop methods to combat them. In this paper, we propose a novel approach to address the problem of breaking filter bubbles in social media. We do so by aiming to maximize the diversity of the information exposed to connected social-media users. We formulate the problem of maximizing the diversity of exposure as a quadratic-knapsack problem. We show that the proposed diversity-maximization problem is inapproximable, and thus, we resort to polynomial nonapproximable algorithms, inspired by solutions developed for the quadratic-knapsack problem, as well as scalable greedy heuristics. We complement our algorithms with instance-specific upper bounds, which are used to provide empirical approximation guarantees for the given problem instances. Our experimental evaluation shows that a proposed greedy algorithm followed by randomized local search is the algorithm of choice given its quality-vs.-efficiency trade-off.

Journal Article

Multi-objective genetic programming strategies for topic-based search with a focus on diversity and global recall

by Lorenzetti, Carlos M.; Baggio, Cecilia; Cecchini, Rocío L. in Algorithms

2023

Topic-based search systems retrieve items by contextualizing the information seeking process on a topic of interest to the user. A key issue in topic-based search of text resources is how to automatically generate multiple queries that reflect the topic of interest in such a way that precision, recall, and diversity are achieved. The problem of generating topic-based queries can be effectively addressed by Multi-Objective Evolutionary Algorithms, which have shown promising results. However, two common problems with such an approach are loss of diversity and low global recall when combining results from multiple queries. This work proposes a family of Multi-Objective Genetic Programming strategies based on objective functions that attempt to maximize precision and recall while minimizing the similarity among the retrieved results. To this end, we define three novel objective functions based on result set similarity and on the information theoretic notion of entropy. Extensive experiments allow us to conclude that while the proposed strategies significantly improve precision after a few generations, only some of them are able to maintain or improve global recall. A comparative analysis against previous strategies based on Multi-Objective Evolutionary Algorithms, indicates that the proposed approach is superior in terms of precision and global recall. Furthermore, when compared to query-term-selection methods based on existing state-of-the-art term-weighting schemes, the presented Multi-Objective Genetic Programming strategies demonstrate significantly higher levels of precision, recall, and F1-score, while maintaining competitive global recall. Finally, we identify the strengths and limitations of the strategies and conclude that the choice of objectives to be maximized or minimized should be guided by the application at hand.

Journal Article

Maximizing the Diversity of Ensemble Random Forests for Tree Genera Classification Using High Density LiDAR Data

by Sohn, Gunho; Remmel, Tarmo; Miller, John in Accuracy, Classification, Classifiers

2016

Recent research into improving the effectiveness of forest inventory management using airborne LiDAR data has focused on developing advanced theories in data analytics. Furthermore, supervised learning as a predictive model for classifying tree genera (and species, where possible) has been gaining popularity in order to minimize this labor-intensive task. However, bottlenecks remain that hinder the immediate adoption of supervised learning methods. With supervised classification, training samples are required for learning the parameters that govern the performance of a classifier, yet the selection of training data is often subjective and the quality of such samples is critically important. For LiDAR scanning in forest environments, the quantification of data quality is somewhat abstract, normally referring to some metric related to the completeness of individual tree crowns; however, this is not an issue that has received much attention in the literature. Intuitively, the choice of training samples having varying quality will affect classification accuracy. In this paper, a Diversity Index (DI) is proposed that characterizes the diversity of data quality (Qi) among selected training samples required for constructing a classification model of tree genera. The training sample is diversified in terms of data quality as opposed to the number of samples per class. The diversified training sample allows the classifier to better learn the positive and negative instances and therefore yields higher classification accuracy in discriminating the “unknown” class samples from the “known” samples. Our algorithm is implemented within the Random Forests base classifiers with six derived geometric features from LiDAR data. The training sample contains three tree genera (pine, poplar, and maple) and the validation samples contain four labels (pine, poplar, maple, and “unknown”). Classification accuracy improved from 72.8%, when training samples were selected randomly (with stratified sample size), to 93.8%, when samples were selected with additional criteria, and from 88.4% to 93.8% when an ensemble method was used.

Journal Article

Prioritizing phylogenetic diversity captures functional diversity unreliably

by Diaz, Sandra; Mazel, Florent; Grenyer, Richard in 631/158/670, 631/158/672, Animals

2018

In the face of the biodiversity crisis, it is argued that we should prioritize species in order to capture high functional diversity (FD). Because species traits often reflect shared evolutionary history, many researchers have assumed that maximizing phylogenetic diversity (PD) should indirectly capture FD, a hypothesis that we name the “phylogenetic gambit”. Here, we empirically test this gambit using data on ecologically relevant traits from >15,000 vertebrate species. Specifically, we estimate a measure of surrogacy of PD for FD. We find that maximizing PD results in an average gain of 18% of FD relative to random choice. However, this average gain obscures the fact that in over one-third of the comparisons, maximum PD sets contain less FD than randomly chosen sets of species. These results suggest that, while maximizing PD protection can help to protect FD, it represents a risky conservation strategy. An ongoing conservation question is if we can maintain functional diversity by optimizing for preservation of phylogenetic diversity. Here, Mazel et al. show that functional diversity increases with phylogenetic diversity in some clades but not others, and thus could be a risky conservation strategy.

Journal Article

Bad Greenwashing, Good Greenwashing

by Wu, Yue; Xie, Jinhong; Zhang, Kaifu in Bargaining, Budget constraint, Companies

2020

With the growing popularity of corporate social responsibility (CSR), critics point out that firms tend to focus on salient CSR activities while slacking off on the unobservable ones, using CSR as a marketing gimmick. Firms’ emphasis on observable aspects and negligence of the unobservable aspects are often labeled as greenwashing. This paper develops a game-theoretic model of CSR investment, in which consumers are socially minded, but they can observe only a subset of CSR initiatives. Two types of firms are considered: those that are driven solely by profit maximization and those that are socially responsible, motivated not only by profit, but also by a genuine concern for the social good. Our analysis examines how information transparency affects a firm’s strategies and the social welfare, and we identify both positive and negative aspects of greenwashing. First, low transparency incentivizes a profit-driven firm to engage in greenwashing through observable investment. Greenwashing prevents consumers from making informed purchase decisions but raises overall CSR spending. Second, sufficiently high transparency eliminates greenwashing and can motivate a socially responsible firm to make extra observable investment under the threat of greenwashing on the part of a profit-driven firm. However, when transparency further increases, this extra investment diminishes. In addition, our paper studies the impacts of firms’ budget constraint and consumers’ bargaining power: Raising the budget and increasing consumers’ bargaining power can both lead to an inferior social outcome.

Journal Article

Prioritizing phylogenetic diversity to protect functional diversity of reef corals

by Darling, Emily S.; Ng, Linus W. K.; Huang, Danwei in Biodiversity, biogeography, conservation prioritization

2022

Aim: The ecosystem functions and services of coral reefs are critical for coastal communities worldwide. Due to conservation resource limitation, species need to be prioritized to protect desirable properties of biodiversity, such as functional diversity (FD), which has been associated with greater ecosystem functioning but is difficult to quantify directly. Selecting species to maximize phylogenetic diversity (PD) has been shown to indirectly capture FD in certain other taxa but not corals. Here, we test this hypothesis, the “phylogenetic gambit”, on corals within global marine protected areas (MPAs). Location: Global coral reefs. Methods: Based on the global distributions of reef corals, a complete species‐level phylogeny and trait data, we compared the FD of coral assemblages within MPAs when selected to maximize PD versus FD for assemblages selected randomly. The relationships between PD and FD were also tested as predictors of surrogacy. We then used coral FD and PD to perform spatial prioritization of reefs for protection and assessed the congruence between the two approaches. Results: Selecting assemblages to maximize PD captured significantly more FD than a random subset of species for 83.1% of all selection scenarios across MPAs and would protect on average 18.7% more FD than random selection. Spatial prioritization analyses showed some mismatches between PD‐ and FD‐optimized planning units, particularly in the Tropical Western Atlantic, but the high degree of overlap between the optimizations for other reef regions lends further credence to the PD‐maximizing strategy in conserving coral FD. Main Conclusions: A PD‐maximizing strategy generally protects greater FD of coral assemblages relative to random selection of species, suggesting that the “phylogenetic gambit” is valid for reef corals. There are risks, however, and the mismatches between PD‐maximized and FD‐maximized MPA networks highlight specific shortcomings of the PD‐maximization approach. Nevertheless, in data‐deficient circumstances, maximizing PD may provide a viable alternative.

Journal Article

The role of livestock intensification and landscape structure in maintaining tropical biodiversity

by Alvarado, Fredy; Escobar, Federico; Arroyo-Rodríguez, Víctor in Agricultural landscapes, Agrochemicals, Beetles

2018

1. As tropical cattle ranching continues to expand, successful conservation will require an improved understanding of the relative impacts of different livestock systems and landscape structure on biodiversity. Here, we provide the first empirical and multi-scale assessment of the relative effects of livestock intensification and landscape structure on biodiversity in the threatened tropical dry forests of Mesoamerica. 2. We used a dataset of dung beetles (169,372 individuals from 33 species) collected from 20 1-km² landscapes, ranging from zero-yielding forest sites to high-yield cattle ranches and maize farms, to investigate the relative effects of livestock intensification (net cattle production; macrocyclic lactone use; annual dung production) and landscape structure (landscape composition and configuration) at multiple spatial scales on different attributes of dung beetle communities using a multi-model averaging approach. 3. Dung beetle species richness, biomass and composition were more strongly related to landscape structure than to livestock intensification. 4. Forest cover was the best predictor of dung beetle assemblages, being positively related to species diversity and biomass across multiple spatial scales. The use of macrocyclic lactones was strongly and negatively related to dung beetle communities at the local scale. 5. Synthesis and applications: Maximising forest protection through a "land sparing" strategy is likely to be the best strategy for reducing negative impacts of cattle farming on Neotropical dung beetle communities. However, increasing or maintaining yields while reducing agrochemical inputs will be important for conserving on-farm biodiversity and the ecosystem services that dung beetles provide in livestock-dominated landscapes.

Journal Article

Conservation prioritization can resolve the flagship species conundrum

by Beattie, Andrew; Grenyer, Richard; Harcourt, Robert in 631/158/672, 704/158/670, Animals

2020

Conservation strategies based on charismatic flagship species, such as tigers, lions, and elephants, successfully attract funding from individuals and corporate donors. However, critics of this species-focused approach argue it wastes resources and often does not benefit broader biodiversity. If true, then the best way of raising conservation funds excludes the best way of spending it. Here we show that this conundrum can be resolved, and that the flagship species approach does not impede cost-effective conservation. Through a tailored prioritization approach, we identify places containing flagship species while also maximizing global biodiversity representation (based on 19,616 terrestrial and freshwater species). We then compare these results to scenarios that only maximized biodiversity representation, and demonstrate that our flagship-based approach achieves 79−89% of our objective. This provides strong evidence that prudently selected flagships can both raise funds for conservation and help target where these resources are best spent to conserve biodiversity. Conservation actions focused on flagship species are effective at raising funds and awareness. Here, McGowan et al. show that prioritizing areas for conservation based on the presence of flagship species results in the selection of areas with ~ 79-89% of the total species that would be selected by maximizing biodiversity representation only.

Journal Article
