Catalogue Search | MBRL

Feature sensitivity criterion-based sampling strategy from the Optimization based on Phylogram Analysis (Fs-OPA) and Cox regression applied to mental disorder datasets

by Shydeo Brandão Miyoshi, Newton; Cobre, Juliana; Cláudio Botazzo Delbem, Alexandre in Adult, Algorithms, Automation

2020

Digital datasets in several health care facilities, such as hospitals and prehospital services, have accumulated data from thousands of patients for more than a decade. In general, no local team has enough experts with the different skills required to analyze these datasets in their entirety. Integrating those abilities usually takes a relatively long time and is costly. Considering that scenario, this paper proposes a new Feature Sensitivity technique that can automatically deal with a large dataset. It uses a criterion-based sampling strategy from the Optimization based on Phylogram Analysis. Called FS-opa, the new approach seems suitable for dealing with any type of raw data from health centers and for handling their entire datasets. Moreover, FS-opa can find the principal features for the construction of inference models without depending on expert knowledge of the problem domain. The selected features can be combined with the usual statistical or machine learning methods to perform predictions. The new method can mine entire datasets from scratch. FS-opa was evaluated using a relatively large dataset from electronic health records of mental disorder prehospital services in Brazil. Cox's approach was integrated into FS-opa to generate survival analysis models of the length of stay (LOS) in hospitals, on the assumption that LOS is a relevant aspect that can improve estimates of hospital efficiency and of the quality of patient treatment. Since FS-opa can work with raw datasets, no knowledge from the problem domain was used to obtain the preliminary prediction models found. Results show that FS-opa succeeded in performing a feature sensitivity analysis using only the raw data available. In this way, FS-opa can find the principal features without the bias of an inference model, since the proposed method does not use one. Moreover, the experiments show that FS-opa can provide models with a useful trade-off between representativeness and parsimony. It can benefit further analyses by experts, since they can focus on the aspects that benefit problem modeling.
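
As a rough illustration of the Cox step mentioned in the abstract, the sketch below evaluates the Cox log partial likelihood for a single covariate, treating length of stay as the survival time and discharge as the event. It is not the authors' implementation and does not show the FS-opa feature selection. The function name, the toy data, and the single-covariate, no-ties setting are all assumptions made for illustration.

```python
import math


def cox_partial_loglik(beta, times, events, x):
    """Cox log partial likelihood for one covariate (assumes no tied event times).

    times  : observed length of stay for each patient (hypothetical data)
    events : 1 if the stay ended with the event (e.g. discharge), 0 if censored
    x      : value of one candidate feature for each patient
    """
    loglik = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored stays contribute only through the risk sets
        # Risk set: patients still in hospital at time times[i]
        risk = sum(math.exp(beta * x[j]) for j in range(n) if times[j] >= times[i])
        loglik += beta * x[i] - math.log(risk)
    return loglik


# Hypothetical toy data: four stays, one of them censored
times = [2, 3, 5, 7]
events = [1, 1, 0, 1]
feature = [1, 0, 1, 0]

ll_null = cox_partial_loglik(0.0, times, events, feature)
ll_alt = cox_partial_loglik(0.5, times, events, feature)
```

With `beta = 0` every patient has the same hazard, so each event contributes minus the log of its risk-set size. Comparing `ll_null` with the value at other `beta` values gives a sense of how much a feature, once selected, explains the ordering of the stays.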

Journal Article

Feature sensitivity criterion-based sampling strategy from the Optimization based on Phylogram Analysis

by Shydeo Brandão Miyoshi, Newton; Cobre, Juliana; Cláudio Botazzo Delbem, Alexandre in Big data, Data mining, Management

2020

Journal Article

A multiple time scale survival model with a cure fraction

by Cobre, Juliana; Louzada, Francisco in Bayesian analysis, Economics, Finance

2012

Many recent survival studies propose modeling data with a cure fraction, i.e., data in which part of the population is not susceptible to the event of interest. This event may occur more than once for the same individual (recurrent event). We then have a scenario of recurrent event data in the presence of a cure fraction, which may appear in various areas such as oncology, finance, and industry, among others. This paper proposes a multiple time scale survival model to analyze recurrent events using a cure fraction. The objective is to analyze, in terms of covariates and censoring, the efficiency of certain interventions intended to prevent the studied event from happening again. All estimates were obtained using a sampling-based approach, which allows prior information to be incorporated with lower computational effort. Simulations were done based on a clinical scenario in order to observe some frequentist properties of the estimation procedure in the presence of small and moderate sample sizes. An application to a well-known set of real mammary tumor data is provided.
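
For orientation, the idea of a cure fraction is often written in the standard mixture form below. This is the textbook formulation, not necessarily the multiple time scale model proposed in the paper. The symbols $p$ (cured proportion) and $S_0(t)$ (survival function of the susceptible individuals) are notation assumed here.

```latex
S_{\mathrm{pop}}(t) = p + (1 - p)\, S_0(t), \qquad 0 \le p \le 1,
\qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = p .
```

Because $S_0(t) \to 0$ as $t \to \infty$, the population survival curve levels off at $p$ instead of dropping to zero, which is the signature of a non-susceptible subgroup.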

Journal Article

A predictive Bayes factor approach to identify genes differentially expressed: An application to Escherichia coli bacterium data

by Cobre, Juliana; Saraiva, Erlandson F.; Louzada, Francisco in Data analysis, Datasets, Escherichia coli

2014

Identifying genes differentially expressed between a treatment and a control experimental condition is a common task for gene expression data analysts. Standard existing methods are the two-sample t-test, the regularized t-test (Cyber-T) and the Bayesian t-test. In this paper, we propose a Bayesian approach to identify differentially expressed genes based on the posterior probability of the difference, calculated via the Bayes factor. To calculate the Bayes factor, we use the predictive density constructed from the previously observed gene expression levels. We perform a simulation study with small sample sizes, which are usual in gene expression data analysis, to verify the performance of the proposed method and compare it with the standard ones. The results reveal a better performance of the proposed methodology in identifying differences in means and/or variances. The methodology is also illustrated on the Escherichia coli bacterium dataset.
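
The link between a Bayes factor and a posterior probability of difference can be sketched as follows. This shows only the generic conversion through prior odds; it does not reproduce the paper's predictive-density construction of the Bayes factor. The function name and the default prior probability of 0.5 are assumptions made for illustration.

```python
def posterior_prob_difference(bayes_factor, prior_prob=0.5):
    """Posterior probability that a gene is differentially expressed.

    bayes_factor : evidence for "difference" over "no difference" (must be >= 0)
    prior_prob   : prior probability of a difference (assumed 0.5 by default)
    """
    if bayes_factor < 0:
        raise ValueError("a Bayes factor cannot be negative")
    if not 0.0 < prior_prob < 1.0:
        raise ValueError("prior_prob must lie strictly between 0 and 1")
    # Posterior odds = Bayes factor * prior odds, then convert odds to probability
    weighted = bayes_factor * prior_prob
    return weighted / (weighted + (1.0 - prior_prob))


# A Bayes factor of 3 with equal prior odds gives a posterior probability of 0.75
p_diff = posterior_prob_difference(3.0)
```

A gene would then be flagged when `p_diff` exceeds a chosen threshold. The threshold itself is a modelling decision that the abstract does not specify.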

Journal Article
