Catalogue Search | MBRL

Sample Selection Bias and Presence-Only Distribution Models: Implications for Background and Pseudo-Absence Data

by Phillips, Steven J. , Ferrier, Simon , Elith, Jane in Animals , Applied ecology , background data

2009

Most methods for modeling species distributions from occurrence records require additional data representing the range of environmental conditions in the modeled region. These data, called background or pseudo-absence data, are usually drawn at random from the entire region, whereas occurrence collection is often spatially biased toward easily accessed areas. Since the spatial bias generally results in environmental bias, the difference between occurrence collection and background sampling may lead to inaccurate models. To correct the estimation, we propose choosing background data with the same bias as occurrence data. We investigate theoretical and practical implications of this approach. Accurate information about spatial bias is usually lacking, so explicit biased sampling of background sites may not be possible. However, it is likely that an entire target group of species observed by similar methods will share similar bias. We therefore explore the use of all occurrences within a target group as biased background data. We compare model performance using target-group background and randomly sampled background on a comprehensive collection of data for 226 species from diverse regions of the world. We find that target-group background improves average performance for all the modeling methods we consider, with the choice of background data having as large an effect on predictive performance as the choice of modeling method. The performance improvement due to target-group background is greatest when there is strong bias in the target-group presence records. Our approach applies to regression-based modeling methods that have been adapted for use with occurrence data, such as generalized linear or additive models and boosted regression trees, and to Maxent, a probability density estimation method. 
We argue that increased awareness of the implications of spatial bias in surveys, and possible modeling remedies, will substantially improve predictions of species distributions.
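The contrast between the two kinds of background data described in this abstract can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the record layout `(species, site_id)`, the function names, and the choice to keep each site only once are all assumptions made for the example.

```python
import random


def presence_sites(records, species):
    """Sorted unique sites where the focal species was recorded."""
    return sorted({site for sp, site in records if sp == species})


def random_background(all_sites, n, seed=0):
    """Background drawn uniformly at random from the whole region (no bias)."""
    rng = random.Random(seed)
    return rng.sample(sorted(all_sites), n)


def target_group_background(records):
    """Background made of every site with a record for ANY species in the
    target group, so it inherits the survey bias of the occurrence data.
    Assumption: each site is kept once, regardless of how many records it has."""
    return sorted({site for _, site in records})


if __name__ == "__main__":
    # Hypothetical records: surveys cluster at sites s1 and s2.
    records = [("a", "s1"), ("b", "s2"), ("a", "s2"), ("c", "s5")]
    print(presence_sites(records, "a"))
    print(target_group_background(records))
```

A model for species `a` would then be fitted with `presence_sites(records, "a")` as occurrences and either of the two background sets; the abstract reports that the target-group choice improved average performance.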

Journal Article

RETRACTED: Refined Design of Prefabricated Buildings under the Background of Big Data

by Luo, Jinlian in Big Data Background , Innovative Research , Prefabricated Architecture

2021

In the rapid development of the information age, big data technology (BDT) has developed greatly and found a very wide range of applications. How to use BDT to collect, process, and analyze information, so as to quickly obtain the desired content, is a problem that many industries and fields have to face and solve. Since China’s construction industry entered the new century, its development has greatly accelerated, and the whole construction industry has grown rapidly, which has also promoted derivative industries such as the prefabricated construction industry. In order to explore how the prefabricated construction industry will develop and change under the background of BDT, we take two factories, A and B, of the prefabricated construction industry as the experimental research objects. Factory A applies BDT in its prefabricated construction, while factory B still operates according to the original method. The experimental data show that the total net profit of factory A is 7.334 million yuan, with a highest efficiency of 97%, while the total net profit of factory B is 5.686 million yuan, with a highest efficiency of 82%.

Journal Article

Insights into the area under the receiver operating characteristic curve (AUC) as a discrimination measure in species distribution modelling

by Jiménez-Valverde, Alberto in Animal and plant ecology , Animal, plant and microbial ecology , Applied ecology

2012

Aim: The area under the receiver operating characteristic (ROC) curve (AUC) is a widely used statistic for assessing the discriminatory capacity of species distribution models. Here, I used simulated data to examine the interdependence of the AUC and classical discrimination measures (sensitivity and specificity) derived for the application of a threshold. I shall further exemplify with simulated data the implications of using the AUC to evaluate potential versus realized distribution models. Innovation: After applying the threshold that makes sensitivity and specificity equal, a strong relationship between the AUC and these two measures was found. This result is corroborated with real data. On the other hand, the AUC penalizes the models that estimate potential distributions (the regions where the species could survive and reproduce due to the existence of suitable environmental conditions), and favours those that estimate realized distributions (the regions where the species actually lives). Main conclusions: Firstly, the independence of the AUC from the threshold selection may be irrelevant in practice. This result also emphasizes the fact that the AUC assumes nothing about the relative costs of errors of omission and commission. However, in most real situations this premise may not be optimal. Measures derived from a contingency table for different cost ratio scenarios, together with the ROC curve, may be more informative than reporting just a single AUC value. Secondly, the AUC is only truly informative when there are true instances of absence available and the objective is the estimation of the realized distribution. When the potential distribution is the goal of the research, the AUC is not an appropriate performance measure because the weight of commission errors is much lower than that of omission errors.
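The quantities this abstract relates to each other can be computed directly from two lists of model scores. The sketch below is not the author's simulation; it assumes true absences are available, uses hypothetical function names, and computes the AUC by pairwise comparison (the probability that a randomly chosen presence scores higher than a randomly chosen absence, with ties counted as one half).

```python
def auc(presence_scores, absence_scores):
    """AUC as the share of presence/absence pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def sensitivity_specificity(presence_scores, absence_scores, threshold):
    """Scores >= threshold are predicted present, the rest predicted absent."""
    sens = sum(p >= threshold for p in presence_scores) / len(presence_scores)
    spec = sum(a < threshold for a in absence_scores) / len(absence_scores)
    return sens, spec


def equal_sens_spec_threshold(presence_scores, absence_scores):
    """Observed score at which |sensitivity - specificity| is smallest."""
    candidates = sorted(set(presence_scores) | set(absence_scores))

    def gap(t):
        sens, spec = sensitivity_specificity(presence_scores, absence_scores, t)
        return abs(sens - spec)

    return min(candidates, key=gap)


if __name__ == "__main__":
    presences = [0.9, 0.8, 0.7, 0.4]   # hypothetical model scores
    absences = [0.6, 0.3, 0.2, 0.1]
    t = equal_sens_spec_threshold(presences, absences)
    print(auc(presences, absences))                           # 0.9375
    print(t, sensitivity_specificity(presences, absences, t))  # 0.6 (0.75, 0.75)
```

Note that the AUC itself uses no threshold, whereas sensitivity and specificity exist only once one is chosen; the abstract's point is that the two views are nonetheless strongly related at the threshold where sensitivity equals specificity.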

Journal Article

Improving the estimation of the Boyce index using statistical smoothing methods for evaluating species distribution models with presence‐only data

by White, Matt , Newell, Graeme , Liu, Canran in Background data , Biodiversity , continuous Boyce index

2025

Species distribution models (SDMs) underpin a wide range of decisions concerning biodiversity. Although SDMs can be built using presence‐only data, rigorous evaluation of these models remains challenging. One evaluation method is the Boyce index (BI), which uses the relative frequencies between presence sites and background sites within a series of bins or moving windows spanning the entire range of predicted values from the SDM. Obtaining accurate estimates of the BI using these methods relies upon having a large number of presences, which is often not feasible, particularly for rare or restricted species that are often the focus of modelling. Wider application of the BI requires a method that can accurately and reliably estimate the BI using small numbers of presence records. In this study, we investigated the effectiveness of five statistical smoothing methods (i.e. thin plate regression splines, cubic regression splines, B‐splines, P‐splines and adaptive smoothers) and the mean of these five methods (denoted as ‘mean’) to estimate the BI. We simulated 600 species with varying prevalence and built distribution models using random forest and Maxent methods. For training data, we used two levels for the number of presences (NPtrain: 20 and 500), along with 2 × NPtrain and 10000 random points (i.e. random background sites) for each modelling method. We used the number of presences at four levels (NPbi: 1000, 200, 50 and 10) to investigate its effect, together with 5000 random points to calculate the BI. Our results indicate that the BI estimates from the binning and moving window methods are severely affected by the decrease of NPbi, but all the estimates of the BI from smoothing‐based methods were almost always unbiased for realistic situations. Hence, we recommend these methods for estimating the BI for evaluating SDMs when verified absence data are unavailable.
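The binning variant of the Boyce index mentioned in this abstract can be sketched as follows. This is a baseline illustration only, not the smoothing-based estimators the study proposes. It assumes the conventional definition (Spearman rank correlation between bin position and the ratio of presence frequency to background frequency), equal-width bins over an assumed score range of 0 to 1, and hypothetical function names.

```python
def _ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def _pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def boyce_index_binned(presence_scores, background_scores,
                       n_bins=5, lo=0.0, hi=1.0):
    """Spearman correlation between bin midpoint and the P/E ratio per bin."""
    width = (hi - lo) / n_bins

    def bin_of(score):
        return min(int((score - lo) / width), n_bins - 1)

    p_counts = [0] * n_bins
    e_counts = [0] * n_bins
    for s in presence_scores:
        p_counts[bin_of(s)] += 1
    for s in background_scores:
        e_counts[bin_of(s)] += 1

    mids, ratios = [], []
    for b in range(n_bins):
        if e_counts[b] == 0:
            continue  # ratio undefined without background points in the bin
        p = p_counts[b] / len(presence_scores)
        e = e_counts[b] / len(background_scores)
        mids.append(lo + (b + 0.5) * width)
        ratios.append(p / e)
    # Spearman correlation = Pearson correlation of the ranks.
    return _pearson(_ranks(mids), _ranks(ratios))
```

With few presences, many bins hold zero or one presence and the ratios become noisy, which is the weakness of binning that the abstract describes.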

Journal Article

The Discrete Empirical Interpolation Method: Canonical Structure and Formulation in Weighted Inner Product Spaces

by Drmač, Zlatko , Saibaba, Arvind Krishna

2018

Journal Article

Plotting receiver operating characteristic and precision–recall curves from presence and background data

by Guo, Qinghua , Li, Wenkai in Accuracy , Aerial photography , area under the curve

2021

The receiver operating characteristic (ROC) and precision–recall (PR) plots have been widely used to evaluate the performance of species distribution models. Plotting the ROC/PR curves requires a traditional test set with both presence and absence data (namely PA approach), but species absence data are usually not available in reality. Plotting the ROC/PR curves from presence‐only data while treating background data as pseudo absence data (namely PO approach) may provide misleading results. In this study, we propose a new approach to calibrate the ROC/PR curves from presence and background data with user‐provided information on a constant c, namely PB approach. Here, c defines the probability that species occurrence is detected (labeled), and an estimate of c can also be derived from the PB‐based ROC/PR plots given that a model with good ability of discrimination is available. We used five virtual species and a real aerial photograph to test the effectiveness of the proposed PB‐based ROC/PR plots. Different models (or classifiers) were trained from presence and background data with various sample sizes. The ROC/PR curves plotted by PA approach were used to benchmark the curves plotted by PO and PB approaches. Experimental results show that the curves and areas under curves by PB approach are more similar to those by PA approach as compared with PO approach. The PB‐based ROC/PR plots also provide highly accurate estimations of c in our experiment. We conclude that the proposed PB‐based ROC/PR plots can provide valuable complements to the existing model assessment methods, and they also provide an additional way to estimate the constant c (or species prevalence) from presence and background data.

Journal Article

Repeated intratracheal instillation effects of commonly used vehicles in toxicity studies with mice

by Park, Se-Woong , Lim, Su-Jin , Park, Cheoljin in 631/1647 , 692/1537 , 704/4111

2024

Intratracheal instillation (ITI) is considered the most pragmatic approach for investigating the potential toxicities of various respiratory exposure materials. Various respiratory exposure materials, including nanomaterials, hazardous air pollutants, fine dust, and household biocides, have raised public health concerns because of limited toxicological information and increasing consumption. Hence, toxicity studies using ITI in laboratory animals are important to accurately assess the human risks associated with these respiratory-exposed materials. However, data to adequately support the study design of ITI toxicity studies, particularly those examining the effects of commonly used vehicles following repeated exposure are insufficient. Therefore, in this study, we examined the effects of 16 types of commonly used vehicles in toxicity studies following 14-day repeated ITI in mice. General health endpoints (mortality, clinical signs, and body weight) were monitored throughout the study period, and terminal endpoints (gross observation, lung weight, bronchoalveolar lavage fluid analysis, and lung histopathological examination) were assessed after terminal sacrifice. Saline and phosphate-buffered saline elicited the least response, whereas corn oil (50 µL) showed the most severe toxicity findings. In addition, several commonly used vehicles, including distilled water, sodium carboxymethyl cellulose, dimethyl sulfoxide, ethanol, Tween 20, and Tween 80, induced mild-to-severe toxicity in the respiratory system. Based on the results of this study, some commonly used vehicles in toxicity studies should be used with caution when the ITI exposure route is considered. These results provide important background information on the effects of vehicles in ITI toxicity studies along with valuable insights for designing toxicity studies using respiratory exposure materials.

Journal Article

New Possibilities of the Fourier Transformation: How to Describe an Arbitrary Frequency-Phase Modulated Signal?

by Litvinov, A. A. , Nigmatullin, R. R. , Osokin, S. I. in Algebra , Amplitudes , Analysis

2025

In this paper, the authors found a transformation that is valid for any arbitrary signal. This transformation is strictly periodical and, therefore, it allows the ordinary Fourier transformation to be applied to the fitting of the transformed signal. The most interesting application (in the authors’ opinion) is the fitting of frequency-phase modulated signals that are actually located inside the found transformation. This new transformation will be useful for analyzing the responses of different complex systems when a particular model is absent. As available data, we consider cosmic microwave background (CMB) data associated with the background temperature fluctuations near K. These electromagnetic (EM) fluctuations of the early Universe were measured over the wide frequency range 30–857 GHz. In this paper, we analyzed the measured data at 353 GHz corresponding to the taken zero pixels. Other details are described in the second section of the paper. The square matrix corresponding to the measured data contains 2047 lines and 2047 columns. If one considers each column as a frequency-phase modulated signal, then the amplitude-frequency response can be evaluated with the help of the Fourier transformation that has the period equals that is valid for any analyzed random signal. This “universal” behavior allows one to fit a wide set of random signals and compare them with each other in terms of their amplitude-frequency responses (AFR). In conclusion, these new possibilities of the traditional Fourier analysis will serve as a common tool in the armory of methods used by researchers in the data processing area.

Journal Article

Color and composition under big data technology: the art of visual communication in film art

by Yu, Shuyao in 78A10 , Big data background , Color and composition design

2024

In modern film art design, there are different kinds of art requirements to be involved. These artistic requirements embody our profound national culture in modern film art, making the organic combination of art and culture, which in turn achieves the promotion of art and provides more excellent traits for the development of modern film visual communication. At the same time, modern film design draws on traditional art from five aspects: composition, perspective, image, color and allegory, and applies its advantages and strengths completely to the process of modern film art design. Among them, the most important is the modern color and composition design. To illustrate the art of visual communication in film art cannot be separated from color and composition. In this paper, we analyze the dynamic changes of higher-order aberrations of near-eye and ortho-eye groups over time, find the turning point of the dynamic changes of higher-order aberrations, and compare the differences between the dynamic changes of higher-order aberrations of the two groups by interaction analysis. By analyzing the application of modern color and composition design in film, it is concluded that the art effect of modern film is inseparable from the design elements, and the color and composition influence caused by the product of the combination of design elements for human senses.

Journal Article

On the improvement, innovation and inheritance of stage makeup styling in opera under the background of big data

by Zhao, Yuqi in 62-07 , Big Data , Big data background

2024

Character styling design can clearly show the background of story characters and the characteristics of the times in the performance of stage plays. Integrating traditional culture with the art of stage plays is important for developing theatrical communication. In this paper, we analyze the factors that impact theatrical communication in the context of big data. Based on the original innovation diffusion model, it analyzes the limitations of its application, analyzes the innovation characteristics of theatrical stage makeup modeling from a qualitative perspective, finds that its diffusion characteristics do not conform to the prerequisite assumptions of the original innovation diffusion model, and confirms the improvement direction of the innovation diffusion model. Based on the analysis of audience data by the full data analysis method, the main influencing factors affecting the diffusion of opera heritage are identified, and their practical significance in the improved model is analyzed. The original innovation diffusion model is improved quantitatively, and an iterative diffusion model is established. Empirical analysis of the iterative diffusion model was conducted using the actual diffusion data of opera stage makeup styling. The research results show that the initial diffusion rates of the products are, in descending order, Cheese Superman, TikTok, Watermelon Video, and Punchbowl. Among them, the cumulative diffusion of TikTok is the highest at 14, and the diffusion rate of Watermelon Video is 0.68. It indicates that the above products effectively spread opera culture and highlight the charm of opera stage makeup styling.

Journal Article
