Catalogue Search | MBRL

An evaluation of time series summary statistics as features for clinical prediction tasks

by Lu, Menglin; Chen, Jingfeng; Guo, Chonghui in Algorithms, Clinical medicine, Clinical prediction tasks

2020

Background: Clinical prediction tasks such as patient mortality, length of hospital stay, and disease diagnosis are highly important in critical care research. Existing studies of clinical prediction have mainly used a few simple summary statistics to summarize information from physiological time series. However, relying on so few statistics loses information. In addition, using only maximum and minimum statistics to represent patient features fails to provide an adequate explanation. Few studies have evaluated which summary statistics best represent physiological time series. Methods: In this paper, we summarize 14 statistics describing the characteristics of physiological time series, including the central tendency, dispersion tendency, and distribution shape. Then, we evaluate the use of summary statistics of physiological time series as features for three clinical prediction tasks. To find the combinations of statistics that yield the best performance under different tasks, we use a cross-validation-based genetic algorithm to approximate the optimal statistical combination. Results: In experiments using the EHRs of 6,927 patients, we obtained prediction results based on both single statistics and commonly used combinations of statistics under three clinical prediction tasks. Based on the results of an embedded cross-validation genetic algorithm, we obtained 25 optimal sets of statistical combinations and then tested their prediction results. By comparing the prediction performance of single statistics and commonly used combinations of statistics with quantitative analyses of the optimal statistical combinations, we found that some statistics play central roles in patient representation and that different prediction tasks have certain commonalities. Conclusion: Through an in-depth analysis of the results, we found many practical reference points that can provide guidance for subsequent related research. Statistics that indicate dispersion tendency, such as min, max, and range, are more suitable for length of stay prediction tasks, and they also provide information for short-term mortality prediction. Mean and quantiles that reflect the central tendency of physiological time series are more suitable for mortality and disease prediction. Skewness and kurtosis perform poorly when used separately for prediction but can be used as supplementary statistics to improve the overall prediction effect.

Journal Article
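
The idea of turning a physiological time series into summary-statistic features, as described in the abstract above, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `summary_features`, the choice of statistics (a subset of the kinds named in the abstract), the population-moment definitions of skewness and kurtosis, and the sample heart-rate values are all assumptions made for illustration.

```python
import statistics


def summary_features(series):
    """Return a dict of summary statistics for one time series (illustrative only)."""
    n = len(series)
    mean = statistics.fmean(series)
    # Population standard deviation, used to standardise the moments below.
    std = statistics.pstdev(series)
    # Quartiles with linear interpolation between the sample minimum and maximum.
    q1, median, q3 = statistics.quantiles(series, n=4, method="inclusive")
    if std > 0:
        skewness = sum((x - mean) ** 3 for x in series) / (n * std ** 3)
        # Excess kurtosis: 0 for a normal distribution.
        kurtosis = sum((x - mean) ** 4 for x in series) / (n * std ** 4) - 3.0
    else:
        # A constant series has no spread, so the shape statistics are set to 0.
        skewness, kurtosis = 0.0, 0.0
    return {
        "min": min(series),
        "max": max(series),
        "range": max(series) - min(series),
        "mean": mean,
        "std": std,
        "q1": q1,
        "median": median,
        "q3": q3,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Hypothetical heart-rate readings for one patient stay.
heart_rate = [72, 75, 71, 90, 88, 76, 74]
features = summary_features(heart_rate)
```

Each patient would then be represented by concatenating such dictionaries across the measured variables; which statistics to keep is exactly the selection problem the paper studies.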

Products Ranking Through Aspect-Based Sentiment Analysis of Online Heterogeneous Reviews

by Du, Zhonglian; Kou, Xinyue; Guo, Chonghui in Complexity, Consumers, Customers

2018

With the rapid growth of online shopping platforms, more and more customers share their shopping experiences and product reviews on the Internet. The large quantity and varied forms of online reviews make it difficult for potential consumers to summarize all the heterogeneous reviews for reference. This paper proposes a new ranking method through online reviews based on different aspects of the alternative products, which combines both objective and subjective sentiment values. Firstly, the weights of these aspects are determined with an LDA topic model to calculate the objective sentiment value of the product. During this process, the realistic meaning of each aspect is also summarized. Then, consumers’ personalized preferences are taken into consideration when calculating the total scores of alternative products. Meanwhile, the comparative superiority between every two products also contributes to their final scores. Therefore, a directed graph model is constructed and the final score of each product is computed by an improved PageRank algorithm. Finally, a case study is given to illustrate the feasibility and effectiveness of the proposed method. The result demonstrates that when only the objective sentiment values of the product are considered, the ranking obtained by the proposed method has a strong correlation with the actual sales orders. On the other hand, if consumers express subjective preferences towards a certain aspect, the final ranking is also consistent with the actual performance of alternative products. The method provides a new research idea for online customer review mining and personalized recommendation.

Journal Article
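
The graph-scoring step mentioned in the abstract above builds on PageRank. The sketch below shows only the standard PageRank power iteration on a small directed graph, not the improved variant the paper proposes, whose details are not given here. The function name `pagerank`, the damping factor of 0.85, the edge direction (an edge from one product to another meaning the second compares favourably to the first), and the three-product graph are assumptions for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Standard PageRank by power iteration.

    edges maps each node to the list of nodes it points to.
    """
    nodes = sorted(set(edges) | {v for targets in edges.values() for v in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[u] for u in nodes if not edges.get(u))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for u, targets in edges.items():
            if targets:
                # Each node splits its damped rank equally among its targets.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


# Hypothetical comparison graph: products A and C both point to B,
# meaning B was judged superior in both pairwise comparisons.
comparisons = {"A": ["B"], "C": ["B"], "B": []}
scores = pagerank(comparisons)
```

In this toy graph product B receives the highest score because both other products point to it.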

Big Data Analytics in Healthcare: Data-Driven Methods for Typical Treatment Pattern Mining

by Chen, Jingfeng; Guo, Chonghui in Big Data, Complexity, Data analysis

2019

A huge volume of digitized clinical data has been generated and accumulated rapidly since the widespread adoption of Electronic Medical Records (EMRs). These big data in healthcare hold the promise of propelling healthcare to evolve from a proficiency-based art to a data-driven science, from a reactive mode to a proactive mode, and from one-size-fits-all medicine to personalized medicine. This paper first discusses the research background of big data analytics in healthcare, the research framework of big data analytics in healthcare, the analysis of medical processes, and a literature summary of treatment pattern mining. Then the challenges for data-driven typical treatment pattern mining are highlighted, including the similarity measure between treatment records and typical treatment pattern extraction, evaluation and recommendation, when considering the rich temporal and heterogeneous medical information in EMRs. Furthermore, three categories of typical treatment patterns are mined from the doctor order content, duration, and sequence views respectively, which can provide a data-driven guideline to achieve the “5R” goal for rational drug use and clinical pathways.

Journal Article

Fusion of heterogeneous incomplete hesitant preference relations in group decision making

by Guo, Chonghui; Zhang, Zhen in Decision making, group decision making, hesitant fuzzy set

2016

In this paper, we focus on the fusion of heterogeneous incomplete hesitant preference relations (including hesitant fuzzy preference relations and hesitant multiplicative preference relations) under group decision making settings. First, some simple formulae are developed to derive a priority weight vector from an incomplete hesitant fuzzy preference relation or an incomplete hesitant multiplicative preference relation based on the logarithmic least squares method. Based on the priority weight vector, an induced fuzzy or multiplicative preference relation can be derived for an incomplete hesitant preference relation. Moreover, the consistency indices of hesitant fuzzy preference relations and hesitant multiplicative preference relations are defined. Afterwards, an approach to group decision making based on incomplete hesitant fuzzy preference relations and incomplete hesitant multiplicative preference relations is developed to deal with group decision making problems with multiple decision organizations. Finally, three examples are used to illustrate the proposed approach.

Journal Article
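
As background for the logarithmic least squares method named in the abstract above, the sketch below covers only the classical case of a complete, crisp multiplicative preference relation, where the logarithmic least squares solution is the normalized geometric mean of each row. It does not reproduce the paper's formulae for incomplete hesitant preference relations. The function name `llsm_weights` and the example matrix are assumptions for illustration.

```python
import math


def llsm_weights(matrix):
    """Priority weights from a complete multiplicative preference relation.

    matrix[i][j] estimates how many times alternative i is preferred to j,
    with matrix[j][i] == 1 / matrix[i][j] and ones on the diagonal.
    """
    n = len(matrix)
    # Row geometric means minimise the sum of squared log-deviations
    # between matrix[i][j] and w[i] / w[j].
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    # Normalise so that the weights sum to one.
    return [g / total for g in geo_means]


# Hypothetical, perfectly consistent relation built from weights 0.6, 0.3, 0.1.
relation = [
    [1.0, 2.0, 6.0],
    [1 / 2, 1.0, 3.0],
    [1 / 6, 1 / 3, 1.0],
]
weights = llsm_weights(relation)
```

For a perfectly consistent relation such as this one, the method recovers the underlying weights up to floating-point error; for inconsistent judgements it returns the best fit in the log-squared-error sense.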

Deriving priority weights from intuitionistic multiplicative preference relations under group decision-making settings

by Guo, Chonghui; Zhang, Zhen in Business and Management, Consistency, Decision analysis

2017

The intuitionistic multiplicative preference relation (IMPR), which takes into account both the ratio degree to which an alternative is preferred to another and the ratio degree to which an alternative is non-preferred to another, is a useful tool for decision makers to elicit their preference information using Saaty’s 1–9 scale. In this paper, we focus on group decision making with IMPRs. First, we analyze the flaws of the consistency definition of an IMPR in previous work and then propose a new definition to overcome the flaws. On this basis, a linear programming-based algorithm is developed to check and improve the consistency of an IMPR. Second, we discuss the relationships between an IMPR and a normalized intuitionistic multiplicative weight vector and develop two approaches to group decision making based on complete and incomplete IMPRs, respectively. Based on the proposed algorithm and approaches, a general framework for group decision making with IMPRs is proposed. Finally, some numerical examples are provided to demonstrate the proposed approaches. The results show that the proposed approaches can deal with group decision-making problems with IMPRs effectively.

Journal Article

Consistency-based algorithms to estimate missing elements for uncertain 2-tuple linguistic preference relations

by Guo, Chonghui; Zhang, Zhen in 2-tuple linguistic, additive consistency, Algorithms

2014

In actual decision making problems, decision makers may sometimes have difficulty providing all the preference information over alternatives through pairwise comparisons. In this paper, we focus on estimating missing elements for an incomplete uncertain 2-tuple linguistic preference relation. First, the additive consistency of an uncertain 2-tuple linguistic preference relation is defined. Based on the defined additive consistency, we define an acceptable incomplete uncertain 2-tuple linguistic preference relation and propose two new algorithms, an iterative algorithm and an optimization-based algorithm, to estimate the missing elements of an uncertain 2-tuple linguistic preference relation. Finally, some numerical examples are presented to illustrate the applicability of the two algorithms.

Journal Article

An Improved LDA Topic Modeling Method Based on Partition for Medium and Long Texts

by Wei, Wei; Lu, Menglin; Guo, Chonghui in Artificial Intelligence, Business and Management, Data mining

2021

Latent Dirichlet Allocation (LDA) is a topic model that represents a document as a distribution of multiple topics. It expresses each topic as a distribution of multiple words by mining semantic relationships hidden in text. However, traditional LDA ignores some of the semantic features hidden inside the document semantic structure of medium and long texts. Instead of using the original LDA to model the topic at the document level, it is better to refine the document into different semantic topic units. In this paper, we propose an improved LDA topic model based on partition (LDAP) for medium and long texts. LDAP not only preserves the benefits of the original LDA but also refines the modeled granularity from the document level to the semantic topic level, which is particularly suitable for the topic modeling of medium and long texts. Extensive classification experiments on the Fudan University corpus and the Sougou Lab corpus demonstrate that LDAP achieves better performance than other topic models, such as LDA, HDP, LSA and doc2vec.

Journal Article

Mining Typical Treatment Duration Patterns for Rational Drug Use from Electronic Medical Records

by Sun, Leilei; Guo, Chonghui; Lu, Menglin in Algorithms, Clustering, Complexity

2019

Rational drug use requires that patients receive medications for an adequate period of time. An adequate duration of medication not only improves the therapeutic effect of medicines but also reduces their side effects and adverse reactions. This paper proposes a data-driven method to mine typical treatment duration patterns for rational drug use from electronic medical records (EMRs). Firstly, a quintuple is defined to describe drug use duration statistics (DUDS) for each drug, and each treatment record is further represented as a DUDS vector (DUDSV). Next, a similarity measure method is adopted to compute the similarity between treatment records. Meanwhile, a clustering algorithm is used to cluster all patient treatment records to extract typical treatment duration patterns, including typical drug sets, effective drug use day sets, and the DUDSs of each typical drug. Then the extracted typical treatment duration patterns are evaluated and annotated based on patients’ demographic information, disease severity scores, treatment outcome and diagnostic information. Finally, experiments on a real-world EMR dataset indicate that the proposed approach can effectively mine typical treatment duration patterns from EMRs and recommend appropriate treatment regimens for patients based on their admission information.

Journal Article

A Simulation Research Towards Better Leverage of Sales Ranking

by Sun, Leilei; Zuo, Yuqian; Guo, Chonghui in Complexity, Consumers, Economic Theory/Quantitative Economics/Mathematical Methods

2021

As one of the most popular kinds of information in markets, the sales ranking has a great impact on consumer choice. However, the literature contains few discussions of how sales rankings should be provided to consumers. This paper aims to answer the following two questions: 1) to what extent does the sales ranking influence consumer choices; 2) when should the sales ranking be provided to consumers. To do so, this paper first constructs a sales ranking model and then provides detailed simulation experiments to demonstrate the model. The experimental results show that for markets where consumer preferences are dramatically different, such as music and movie markets, sales rankings do not have a significant influence on consumer choices and should not be provided to consumers until a large number of early independent consumer choices have been accumulated. But for markets in which consumer preferences are similar, such as markets for official supplies, sales rankings have more influence on consumer choices and should be provided to consumers earlier. Furthermore, an evolution strategy is proposed to ascertain the most suitable sales rankings (characterised by suitable influence strength and suitable release time) for some specified online markets. The comparison results show that the optimized sales rankings not only help consumers discover higher-quality products but also improve overall sales.

Journal Article

Non-unique cluster numbers determination methods based on stability in spectral clustering

by Borjigin, Sumuya; Guo, Chonghui in Algorithms, Analysis, Applied sciences

2013

Recently, a large amount of work has been devoted to the study of spectral clustering, a simple yet powerful method for finding structure in a data set using spectral properties of an associated pairwise similarity matrix. Most existing spectral clustering algorithms estimate only one cluster number or estimate non-unique cluster numbers based on the eigengap criterion. However, a data set does not always have a single cluster number, and the eigengap criterion lacks theoretical justification. In this paper, we propose non-unique cluster numbers determination methods based on stability in spectral clustering (NCNDBS). We first use the multiway normalized cut spectral clustering algorithm to cluster the data set for a candidate cluster number. Then the ratio between the multiway normalized cut criterion of the obtained clusters and the sum of the leading eigenvalues (sorted in descending order) of the stochastic transition matrix is chosen as a standard to decide whether the candidate is a reasonable cluster number. Finally, by varying the scaling parameter in the Gaussian function, we judge whether the reasonable cluster number is also a stable one. Through these three stages, we can determine non-unique cluster numbers of a data set. The Lumpability theorem of Meilă and Xu provides a theoretical basis for our methods. Illustrative experiments show that NCNDBS can successfully estimate non-unique cluster numbers of a data set.

Journal Article
