Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
91,015 result(s) for "Applied statistics"
Top 10 algorithms in data mining
by Motoda, Hiroshi; McLachlan, Geoffrey J.; Yu, Philip S.
in Algorithms; Computer engineering; Computer Science
2008
This paper presents the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) in December 2006: C4.5, k-Means, SVM, Apriori, EM, PageRank, AdaBoost, kNN, Naive Bayes, and CART. These top 10 algorithms are among the most influential data mining algorithms in the research community. With each algorithm, we provide a description of the algorithm, discuss the impact of the algorithm, and review current and further research on the algorithm. These 10 algorithms cover classification, clustering, statistical learning, association analysis, and link mining, which are all among the most important topics in data mining research and development.
Journal Article
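The abstract above names the ten algorithms without showing any of them. As a minimal illustration of one, here is a sketch of Lloyd's k-Means in plain Python; the naive first-k initialization and the toy interface are our own simplifications, not from the paper:

```python
def _sqdist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    # Naive init: first k points (real implementations prefer k-means++).
    centroids = list(points[:k])
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: _sqdist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for c, cluster in enumerate(clusters):
            if cluster:
                dim = len(cluster[0])
                new_centroids.append(tuple(
                    sum(p[d] for p in cluster) / len(cluster)
                    for d in range(dim)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return centroids
```

On two well-separated 2-D blobs, `kmeans(pts, 2)` should recover centroids near the blob means.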
Heteroscedastic BART via Multiplicative Regression Trees
by George, E. I.; McCulloch, R. E.; Chipman, H. A.
in Applied statistical inference; Applied statistics; Big data
2020
Bayesian additive regression trees (BART) has become increasingly popular as a flexible and scalable nonparametric regression approach for modern applied statistics problems. For the practitioner dealing with large and complex nonlinear response surfaces, its advantages include a matrix-free formulation and the lack of a requirement to prespecify a confining regression basis. Although flexible in fitting the mean, BART has been limited by its reliance on a constant variance error model. Alleviating this limitation, we propose HBART, a nonparametric heteroscedastic elaboration of BART. In BART, the mean function is modeled with a sum of trees, each of which determines an additive contribution to the mean. In HBART, the variance function is further modeled with a product of trees, each of which determines a multiplicative contribution to the variance. Like the mean model, this flexible, multidimensional variance model is entirely nonparametric with no need for the prespecification of a confining basis. Moreover, with this enhancement, HBART can provide insights into the potential relationships of the predictors with both the mean and the variance. Practical implementations of HBART with revealing new diagnostic plots are demonstrated with simulated and real data on used car prices and song year of release.
Supplementary materials for this article are available online.
Journal Article
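The sum-of-trees mean and product-of-trees variance described in the abstract can be sketched schematically. The stump functions below are invented for illustration only; real HBART learns many such trees from data via Bayesian backfitting MCMC:

```python
import math, random

# Toy "trees": each stump maps a scalar predictor x to a value.
mean_trees = [
    lambda x: 1.0 if x > 0.5 else -1.0,   # additive contributions
    lambda x: 0.5 if x > 0.2 else 0.0,    # to the mean
]
var_trees = [
    lambda x: 2.0 if x > 0.5 else 1.0,    # multiplicative contributions
    lambda x: 1.5 if x > 0.8 else 1.0,    # to the variance
]

def hbart_draw(x, rng):
    """One draw from the model: the mean is a SUM of tree outputs, the
    variance a PRODUCT of tree outputs (HBART's elaboration of BART)."""
    mu = sum(t(x) for t in mean_trees)
    sigma2 = math.prod(t(x) for t in var_trees)
    return mu + math.sqrt(sigma2) * rng.gauss(0.0, 1.0)

rng = random.Random(1)
draws = [hbart_draw(0.9, rng) for _ in range(20000)]
```

At x = 0.9 the stumps give mean 1.0 + 0.5 = 1.5 and variance 2.0 × 1.5 = 3.0, so the empirical mean and variance of `draws` should be close to those values.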
Electromagnetics, control and robotics : a problems & solutions approach
This book covers a variety of problems, and offers solutions to some, in: statistical state and parameter estimation in nonlinear stochastic dynamical systems, in both the classical and quantum scenarios; propagation of electromagnetic waves in a plasma as described by the Boltzmann kinetic transport equation; and classical and quantum general relativity. It will be of use to engineering undergraduate students interested in analysing the motion of robots subject to random perturbation, and also to research scientists working in quantum filtering.
Nonparametric Approach for Multiple Change Point Analysis of Multivariate Data
2014
Change point analysis has applications in a wide variety of fields. The general problem concerns the inference of a change in distribution for a set of time-ordered observations. Sequential detection is an online version in which new data are continually arriving and are analyzed adaptively. We are concerned with the related, but distinct, offline version, in which retrospective analysis of an entire sequence is performed. For a set of multivariate observations of arbitrary dimension, we consider nonparametric estimation of both the number of change points and the positions at which they occur. We do not make any assumptions regarding the nature of the change in distribution or any distribution assumptions beyond the existence of the αth absolute moment, for some α ∈ (0, 2). Estimation is based on hierarchical clustering and we propose both divisive and agglomerative algorithms. The divisive method is shown to provide consistent estimates of both the number and the location of change points under standard regularity assumptions. We compare the proposed approach with competing methods in a simulation study. Methods from cluster analysis are applied to assess performance and to allow simple comparisons of location estimates, even when the estimated number differs. We conclude with applications in genetics, finance, and spatio-temporal analysis. Supplementary materials for this article are available online.
Journal Article
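As a toy illustration of the divisive idea described above, the sketch below scores every candidate split of a univariate sequence by a scaled squared difference of segment means. This is a stand-in for the energy-distance statistic the paper actually uses; the full method is also multivariate, recurses on both halves, and stops via a permutation test:

```python
def best_split(xs):
    """Score every candidate split point of a univariate sequence by a
    scaled squared difference of segment means (a stand-in for the
    energy-distance statistic of the paper).  Returns (index, score)."""
    n = len(xs)
    best_i, best_score = None, float("-inf")
    for i in range(1, n):
        left, right = xs[:i], xs[i:]
        ml = sum(left) / len(left)
        mr = sum(right) / len(right)
        # Weight by segment sizes so boundary splits are not favored.
        score = (len(left) * len(right) / n) * (ml - mr) ** 2
        if score > best_score:
            best_i, best_score = i, score
    return best_i, best_score
```

On a sequence with a single level shift, the best split lands exactly at the shift.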
The end of average : how we succeed in a world that values sameness
"Weaving science, history, and his experiences as a high school dropout, Rose brings to life the untold story of how we came to embrace the scientifically flawed idea that averages can be used to understand individuals and offers a powerful alternative: the three principles of individuality"-- Provided by publisher.
The COSMIN checklist for assessing the methodological quality of studies on measurement properties of health status measurement instruments: an international Delphi study
by Knol, Dirk L.; Patrick, Donald L.; de Vet, Henrica C. W.
in Applied statistics; Biostatistics; Clinical medicine
2010
Background Aim of the COSMIN study (COnsensus-based Standards for the selection of health status Measurement INstruments) was to develop a consensus-based checklist to evaluate the methodological quality of studies on measurement properties. We present the COSMIN checklist and the agreement of the panel on the items of the checklist. Methods A four-round Delphi study was performed with international experts (psychologists, epidemiologists, statisticians and clinicians). Of the 91 invited experts, 57 agreed to participate (63%). Panel members were asked to rate their (dis)agreement with each proposal on a five-point scale. Consensus was considered to be reached when at least 67% of the panel members indicated 'agree' or 'strongly agree'. Results Consensus was reached on the inclusion of the following measurement properties: internal consistency, reliability, measurement error, content validity (including face validity), construct validity (including structural validity, hypotheses testing and cross-cultural validity), criterion validity, responsiveness, and interpretability. The latter was not considered a measurement property. The panel also reached consensus on how these properties should be assessed. Conclusions The resulting COSMIN checklist could be useful when selecting a measurement instrument, peer-reviewing a manuscript, designing or reporting a study on measurement properties, or for educational purposes.
Journal Article
Sequential Multi-Sensor Change-Point Detection
2013
We develop a mixture procedure to monitor parallel streams of data for a change-point that affects only a subset of them, without assuming a spatial structure relating the data streams to one another. Observations are assumed initially to be independent standard normal random variables. After a change-point the observations in a subset of the streams of data have nonzero mean values. The subset and the post-change means are unknown. The procedure we study uses stream-specific generalized likelihood ratio statistics, which are combined to form an overall detection statistic in a mixture model that hypothesizes an assumed fraction p0 of affected data streams. An analytic expression is obtained for the average run length (ARL) when there is no change, and is shown by simulations to be very accurate. Similarly, an approximation for the expected detection delay (EDD) after a change-point is also obtained. Numerical examples are given to compare the suggested procedure to other procedures for unstructured problems and in one case where the problem is assumed to have a well-defined geometric structure. Finally, we discuss sensitivity of the procedure to the assumed value of p0 and suggest a generalization.
Journal Article
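A minimal sketch of the mixture combination the abstract describes, assuming a one-sided Gaussian generalized likelihood ratio per stream over a fixed window; the paper additionally maximizes over candidate change times and calibrates thresholds via the ARL:

```python
import math

def stream_glr(obs):
    """One-sided Gaussian GLR for a positive post-change mean over the
    last n observations of a single stream: (max(S_n, 0))^2 / (2 n)."""
    n = len(obs)
    return max(sum(obs), 0.0) ** 2 / (2 * n)

def mixture_stat(streams, p0):
    """Combine per-stream GLRs under a hypothesized fraction p0 of
    affected streams: sum_k log(1 - p0 + p0 * exp(GLR_k))."""
    return sum(math.log(1.0 - p0 + p0 * math.exp(stream_glr(s)))
               for s in streams)
```

An unaffected stream contributes zero to the statistic, while a stream whose recent mean has shifted upward inflates it, so alarms are driven by the few affected streams even when p0 is small.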
A Guide to Teaching Data Science
2018
Demand for data science education is surging and traditional courses offered by statistics departments are not meeting the needs of those seeking training. This has led to a number of opinion pieces advocating for an update to the Statistics curriculum. The unifying recommendation is that computing should play a more prominent role. We strongly agree with this recommendation, but advocate the main priority is to bring applications to the forefront as proposed by Nolan and Speed in 1999. We also argue that the individuals tasked with developing data science courses should not only have statistical training, but also have experience analyzing data with the main objective of solving real-world problems. Here, we share a set of general principles and offer a detailed guide derived from our successful experience developing and teaching a graduate-level, introductory data science course centered entirely on case studies. We argue for the importance of statistical thinking, as defined by Wild and Pfannkuch in 1999 and describe how our approach teaches students three key skills needed to succeed in data science, which we refer to as creating, connecting, and computing. This guide can also be used for statisticians wanting to gain more practical knowledge about data science before embarking on teaching an introductory course. Supplementary materials for this article are available online.
Journal Article