Catalogue Search | MBRL

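Data mining model for the detection of diseases in first-level medical care patients (original title: Modelo de minería de datos para la detección de enfermedades en pacientes de primer nivel de atención médica)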

by Méndez, Eric Ramos; Torres, Guillermo de los Santos; Moheno, Gerardo Arceo in Algorithms, Cluster analysis, Clustering

2024

Data mining is a tool that can currently be applied in different areas such as business, scientific research, and government security. In this sector, one of the most frequent problems is erroneous diagnosis, which results in patients not being given adequate treatment. [...] the objective of this research is the creation of a data mining model for the detection of diseases in first-level medical care patients. To this end, a data mining model was developed using the K-means clustering algorithm. K-means is an unsupervised clustering algorithm that is widely used to discover patterns in large unlabeled data sets. Analysis of the variables that make up the dataset created for this project, in particular Primary condition and Secondary condition, showed that the data can help in understanding patterns of medication use, compliance and possible side effects.

Journal Article

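Optimized method based on the K-Means algorithm as a tool for detecting source code plagiarism (original title: Método optimizado basado en algoritmo K-Means como herramienta en la detección de plagio de código fuente)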

by Callejas-Cuervo, Mauro; Mikusova, Miroslava; Duracik, Michal in Algorithms, Cluster analysis, Clustering

2020

Detecting plagiarism in the source code of projects developed by university students is even more difficult, given that countless new academic projects are generated year after year and few available tools can analyze large numbers of lines of code. [...] automatic detection requires many computational resources and takes a long time to execute. The results show that the algorithm was accelerated without compromising its original purpose, achieving a threefold speedup over the basic implementation without any optimization. Keywords: K-Means algorithm, source code, plagiarism, computational performance.

Journal Article

A heuristic algorithm for solving the minimum sum-of-squares clustering problems

by Bagirov, Adil M.; Ordin, Burak in Algorithms, Cluster analysis, Clustering

2015

Clustering is an important task in data mining. It can be formulated as a global optimization problem, which is challenging for existing global optimization techniques even on medium-sized data sets. Various heuristics have been developed to solve the clustering problem. The global k-means and modified global k-means algorithms are among the most efficient heuristics for solving the minimum sum-of-squares clustering problem. However, these algorithms are not always accurate in finding global or near-global solutions to the clustering problem. In this paper, we introduce a new algorithm to improve the accuracy of the modified global k-means algorithm in finding global solutions. We use an auxiliary cluster problem to generate a set of initial points and apply the k-means algorithm starting from these points to find the global solution to the clustering problem. Numerical results on 16 real-world data sets clearly demonstrate the superiority of the proposed algorithm over the global and modified global k-means algorithms in finding global solutions to clustering problems.

Journal Article

Identifying groups of digital library users through log file analysis (original title: Determinación de grupos de usuarios de bibliotecas digitales mediante el análisis de ficheros log)

by Martínez-Comeche, Juan Antonio in clustering, Algorithms, k-means algorithm

2017

This study analyzes how users carry out information search and retrieval tasks through queries in the Biblioteca Digital Hispánica, distinguishing groups of users according to their different information behavior. To this end, the log files collected by the server over one year are used and several clustering algorithms are compared. The k-means algorithm is found to be a clustering procedure well suited to analyzing large query log files in digital libraries. In the case of the Biblioteca Digital Hispánica, three groups of users are distinguished, and their distinctive information behavior is described.

Journal Article

Comprehensive classification assessment of GNSS observation data quality by fusing k-means and KNN algorithms

by Xie, Wei; Li, Mengyuan; Huang, Guanwen in Algorithms, Carrier density, Classification

2024

Observation data are the basis on which the global navigation satellite system (GNSS) provides positioning, navigation and timing (PNT) services, and observation quality directly determines the performance level of the PNT service. At present, the analysis of GNSS observation quality is partial and can only be based on the assessment of a single index; it is difficult to analyze GNSS observation quality comprehensively by fusing multiple indicators. To solve this problem, supervised and unsupervised machine learning algorithms are applied, and a new comprehensive classification method for GNSS observation quality based on the k-means clustering algorithm (k-means) and the K-nearest neighbor algorithm (KNN) is proposed. Four core index features of GNSS observations (data integrity rate, carrier-to-noise-density ratio (CNR), pseudorange multipath and the number of observations per slip) were selected to construct the sample dataset. The sample set was clustered without supervision using the k-means algorithm to obtain classification labels for GNSS observation quality. The KNN algorithm was then used to construct a comprehensive classification and evaluation model for GNSS observation quality. Data from 30 MGEX stations in the Asia–Pacific region in 2019 were selected for modeling analysis. The experimental results show that: (1) there is a strong correlation between pseudorange multipath, CNR and the number of observations per slip; (2) the average classification accuracy of the new model was over 90% under n-fold cross-validation; (3) the new model can effectively realize the automatic evaluation and classification of GNSS observation quality and easily distinguish better from worse station observations. These results provide a new approach to the automatic classification and assessment of GNSS observation quality.
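
The two-stage idea in this abstract (k-means produces the labels, KNN then classifies new samples against them) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the farthest-first seeding, the 2-D toy features and the choice of three neighbours are all assumptions made for illustration.

```python
import math
from collections import Counter


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from the chosen ones."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans_labels(points, k, max_iter=100):
    """Lloyd's k-means; returns one cluster index per point."""
    centroids = farthest_first_init(points, k)
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centroids[j]
            )
        if updated == centroids:
            break
        centroids = updated
    return labels


def knn_predict(train_points, train_labels, query, n_neighbors=3):
    """Majority vote among the n_neighbors training points nearest to `query`."""
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(query, train_points[i]))
    votes = Counter(train_labels[i] for i in order[:n_neighbors])
    return votes.most_common(1)[0][0]


# Toy 2-D feature vectors (made-up values, not GNSS data).
samples = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3),
           (5.0, 5.0), (5.1, 4.9), (4.8, 5.2), (5.2, 5.1)]
labels = kmeans_labels(samples, k=2)              # stage 1: unsupervised labels
print(labels)
print(knn_predict(samples, labels, (0.1, 0.1)))   # stage 2: classify a new sample
```

In practice the feature vectors would hold the four quality indicators named in the abstract, scaled to comparable ranges before distances are computed.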

Journal Article

Electricity load forecasting using clustering and ARIMA model for energy management in buildings

by Nepal, Bishnu; Yokoe, Aya; Yamaha, Motoi in Accuracy, ARIMA model, Artificial intelligence

2020

Understanding the energy consumption patterns of buildings and investing effort in energy load reduction are important for optimizing resources and conserving energy in buildings. In this research, we propose a forecasting method for the electricity load of university buildings using a hybrid model comprising a clustering technique and the autoregressive integrated moving average (ARIMA) model. The novel approach clusters the data of an entire year, including the forecasting day, using K‐means clustering and uses the result to forecast the electricity peak load of university buildings. The combination of clustering and the ARIMA model is shown to improve forecasting performance compared with using the ARIMA model alone. Forecasting the electricity peak load with appreciable accuracy several hours before peak hours can give the management authorities sufficient time to design strategies for peak load reduction. The method can also be used in demand response to reduce electricity bills by avoiding electricity usage during high-rate hours, and for energy conservation in buildings.

Journal Article

Asymmetric k-Means Clustering of the Asymmetric Self-Organizing Map

by Olszewski, Dominik in Algorithms, Artificial Intelligence, Asymmetry

2016

An asymmetric approach to clustering of the asymmetric self-organizing map is proposed. The clustering is performed using an improved asymmetric version of the well-known k-means algorithm. The improved asymmetric k-means algorithm is the second proposal of this paper. As a result, we obtain a two-stage, fully asymmetric data analysis technique. In this way, we maintain the methodological consistency of the two methods used, because both are formulated in asymmetric versions and, consequently, both properly adjust to asymmetric relationships in the analyzed data. The results of our experiments on real data confirm the effectiveness of the proposed approach.

Journal Article

Efficient algorithm for big data clustering on single machine

by Alguliyev, Rasim M.; Sukhostat, Lyudmila V.; Aliguliyev, Ramiz M. in Accelerometers, Algorithms, Big Data

2020

Big data analysis requires substantial computing power, which is not always available. It has therefore become necessary to develop new clustering algorithms capable of processing such data. This study proposes a new parallel clustering algorithm based on the k-means algorithm. It significantly reduces the exponential growth of computations. The proposed algorithm splits a dataset into batches while preserving the characteristics of the initial dataset and increasing the clustering speed. The idea is to define cluster centroids for each batch and then to cluster those centroids in turn. Each data point is then assigned to the cluster with the nearest resulting centroid. Experiments on real large datasets are conducted to evaluate the effectiveness of the proposed approach, which is compared with k-means and a modification of it. The experiments show that the proposed algorithm is a promising tool for clustering large datasets in comparison with the k-means algorithm.
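
The batch scheme outlined in this abstract (cluster each batch, cluster the pooled batch centroids, then assign every point to its nearest final centroid) can be sketched as follows. This is a sequential illustration under stated assumptions, not the authors' parallel algorithm: the strided batch split, the farthest-first seeding and all function names are hypothetical choices.

```python
import math


def farthest_first_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from the chosen ones."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans_centroids(points, k, max_iter=100):
    """Lloyd's k-means; returns the k centroids."""
    centroids = farthest_first_init(points, k)
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centroids[j]
            )
        if updated == centroids:
            break
        centroids = updated
    return centroids


def batch_kmeans(points, k, n_batches):
    """Cluster per batch, cluster the pooled centroids, then label all points."""
    # Strided split so each batch draws from the whole dataset (assumption).
    batches = [points[i::n_batches] for i in range(n_batches)]
    # Each batch could be handled by a separate worker; here it runs in a loop.
    pooled = [c for batch in batches for c in kmeans_centroids(batch, k)]
    final = kmeans_centroids(pooled, k)
    labels = [nearest(p, final) for p in points]
    return final, labels


# Toy 2-D data (made-up values).
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.0, 0.3),
        (5.0, 5.0), (5.1, 4.9), (4.8, 5.2), (5.2, 5.1)]
final, labels = batch_kmeans(data, k=2, n_batches=2)
print(final)
print(labels)
```

The second clustering pass only sees `k * n_batches` centroids, which is where the saving over running k-means on the full dataset would come from.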

Journal Article

Optimal probabilistic scenario‐based operation and scheduling of prosumer microgrids considering uncertainties of renewable energy sources

by Hashemi‐Dezaki, Hamed; Faraji, Jamal; Ketabi, Abbas in Accuracy, Algorithms, Alternative energy sources

2020

Uncertainties of renewable energy sources (RESs) such as wind turbine (WT) and photovoltaic (PV) units are among the considerable challenges that prosumer microgrids (PMGs) face in optimal day‐ahead operation. In this study, a new probabilistic scenario‐based method for the optimal scheduling and operation of PMGs is developed. Different scenarios are generated using Monte Carlo simulations (MCS), and the k‐means, k‐medoids, and differential evolution algorithms (DEA) are deployed to cluster the scenarios. A realistic commercial PMG in Iran is selected to apply the introduced method. The validity of the developed probabilistic optimization method for PMG operation is examined by comparing the results under various scenario reduction algorithms with those of MCS. Comparison of the obtained results with those of existing deterministic methods highlights the advantages of the presented method. Sensitivity analyses are also carried out to investigate the robustness of the developed method against an increase in the system uncertainty level. According to the test results, the k‐medoids algorithm performs best in comparison with the k‐means and DEA‐based clustering under various conditions. Highlights: a novel scenario‐based objective function (O.F) is proposed to optimize the operation costs of prosumers; the proposed method is compared with other available deterministic ones; different scenario reduction methods are compared; the scenario reduction‐based method is validated using MCS; the robustness of the proposed method against increasing uncertainty is investigated.

Journal Article

Evaluation of image processing technique in identifying rice blast disease in field conditions based on KNN algorithm improvement by K‐means

by Kozegar, Ehsan; Loni, Reyhaneh; Larijani, Mohammad Reza in Agricultural production, Agriculture, Algorithms

2019

Rice farming is affected by various economically significant diseases. One of these is blast, which is among the most important limiting factors in rice yield. The purpose of this study is the timely and rapid diagnosis of rice blast in field conditions based on image processing. To this end, color images were prepared using image processing, and a KNN algorithm improved by K‐means was used to classify the images in Lab color space to detect disease spots on rice leaves. Squared classification was based on Euclidean distance, and the Otsu method was used to perform automatic histogram-shape-based thresholding, reducing gray-level images to binary images. Finally, to determine the efficiency of the designed algorithm, sensitivity, specificity, and overall accuracy were examined. The classification results showed that the sensitivity and specificity of the designed algorithm were 92% and 91.7%, respectively, in determining the number of disease spots, and 96% and 95.65% in determining the quality of disease spots. The overall accuracy of the designed algorithm was 94%. Overall, the results showed that the method has great potential for the timely diagnosis of rice blast. Rice blast, caused by the fungus Magnaporthe oryzae, is generally considered the most important disease of rice worldwide because of its extensive distribution and destructiveness under favorable conditions. In severe cases, all leaves of a plant may become dry. There are chemical methods to prevent this fungal disease, but the important point is timely and accurate diagnosis. Given the significance of the topic, machine vision and image processing techniques, which today play a major role in precision agriculture, are very important to use.

Journal Article
