1,650 result(s) for "data‐mining tool"
Detection and classification of pneumonia using the Orange3 data mining tool
A chest X-ray can convey a lot about a patient's condition. However, it requires a specialized and skilled doctor to determine the type of lung disease with high accuracy. This is where deep learning (DL) and artificial intelligence (AI) techniques can help, accelerating the detection of lung diseases and classifying them with high precision, which saves time and effort for the patient and the doctor alike. This work proposes a machine learning (ML) and AI system that analyzes chest X-ray images and categorizes them into four cases: normal, viral pneumonia, bacterial pneumonia, and coronavirus disease 2019 (COVID-19). The system extracts Mel frequency cepstral coefficient (MFCC) features from a dataset of 4,800 chest X-ray images and uses these features to train four basic classifiers in the data mining tool Orange3: adaptive boosting (AdaBoost), decision trees (DTs), gradient boosting (GB), and random forest (RF). In testing and evaluation, the AdaBoost classifier performed best with an accuracy of 100%, followed by RF with an accuracy of 99.5%. GB and DTs followed with classification accuracies of 98.5% and 97.2%, respectively.
Heart disease classification using data mining tools and machine learning techniques
In the healthcare industry, data analysis can save lives by improving medical diagnosis. With the rapid development of software engineering, many data mining tools are now available to researchers for conducting studies and experiments. We therefore compare six common data mining tools (Orange, Weka, RapidMiner, Knime, Matlab, and Scikit-Learn) using six machine learning techniques (Logistic Regression, Support Vector Machine, K Nearest Neighbors, Artificial Neural Network, Naïve Bayes, and Random Forest) on a heart disease classification task. The dataset used in this study has 13 features, one target variable, and 303 instances, of which 139 suffer from cardiovascular disease and 164 are healthy subjects. Three performance measures were used to compare the techniques in each tool: accuracy, sensitivity, and specificity. The results showed that Matlab was the best performing tool, and Matlab’s Artificial Neural Network model was the best performing technique. We conclude by plotting the receiver operating characteristic curve for Matlab and by giving several recommendations on which tool to choose, taking into account the user's experience in the field of data mining.
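The three performance measures named in this abstract can be computed directly from confusion-matrix counts. The following is a minimal sketch, not the authors' code; the function name and the counts are hypothetical, chosen only so that the class totals match the 139 diseased and 164 healthy subjects mentioned above.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # share of all cases classified correctly
    sensitivity = tp / (tp + fn)      # true positive rate (diseased found)
    specificity = tn / (tn + fp)      # true negative rate (healthy cleared)
    return accuracy, sensitivity, specificity


# Hypothetical counts (NOT results from the paper):
# 139 diseased = 120 TP + 19 FN, 164 healthy = 150 TN + 14 FP.
acc, sens, spec = classification_metrics(tp=120, fn=19, tn=150, fp=14)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters for a dataset like this one, because accuracy alone does not show whether errors fall mainly on diseased or on healthy subjects.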
Crack detection based on mel-frequency cepstral coefficients features using multiple classifiers
Crack detection plays an essential role in evaluating the strength of structures. In recent years, machine learning and deep learning techniques combined with computer vision have emerged as a way to assess the strength of structures and detect cracks. This research aims to use machine learning (ML) to create a crack detection model based on a dataset of 2432 images of different surfaces, divided into two groups: 70% for training and 30% for testing. The Orange3 data mining tool was used to build the crack detection model: support vector machine (SVM), gradient boosting (GB), naive Bayes (NB), and artificial neural network (ANN) classifiers were trained and verified on three sets of features extracted using MATLAB, namely mel-frequency cepstral coefficients (MFCC), delta MFCC (DMFCC), and delta-delta MFCC (DDMFCC). The experimental results showed the superiority of SVM, with a classification accuracy of 100%, while NB reached 93.9%-99.9%, ANN 99.9%, and GB 99.8%.
Survey on Data Mining Techniques, Process and Algorithms
The term “Data Mining” refers to the extraction of patterns and knowledge from large amounts of raw data and is often defined as finding hidden information in a database. It entails analyzing patterns in large volumes of data using one or more software tools. Data mining involves effective data collection and warehousing as well as computer processing.
Second primary cancers: a retrospective analysis of real world data using the enhanced medical research engine ConSoRe in a French comprehensive cancer center
Background: Second primary cancers (SPC) account for 18% of all cancers. We used the enhanced medical/health data mining tool ConSoRe to search aggregated data, analyze electronic patient records (EPR), and better characterize patients with SPC. Methods: This retrospective cohort study used ConSoRe to identify EPRs from patients with SPC referred to the regional cancer center Leon Bérard from 1993 to 2017, and examined characteristics of patients with SPC, frequencies of first primary cancer (FPC) localization in the global population of patients with SPC, and time to SPC. Data set was extracted on January 1, 2018. Results: Among 296,530 EPRs, we identified 157,187 patients with FPC, including 13,002 (8%) patients with SPC. Between 2000 and 2010, the rate of SPC was 34%, and 52% of SPC were identified in the last years (2010–2017). In men, main cancers were head and neck cancer, lymphoma, and prostate carcinoma accounting for 15.6%, 12.8%, and 10.5% of FPC, while the three most common SPC were head and neck cancer (13.2%), lung cancer (11.8%) and lymphoma (9.2%). In women, breast cancers, lymphoma, and skin cancers accounted for 48.8%, 8%, and 5.1% of first cancers, and for 31.1%, 7% and 6% of SPC. Conclusion: The data mining tool ConSoRe contributes to access to real world data, and to better characterize patients with SPC. Expanding such approach to any comprehensive center will allow a global overview of the follow-up of patients with cancer, and help to improve long-term management and adapt surveillance.
Big Data Application and its Impact on Education
Big data is employed in widely different fields; here we study how education uses big data. We review the research literature on big data in education from 2010 to 2020, then review the process of big educational data mining, the tools, and the applications of big data in education. With the help of these applications, this paper explores ideas for improving the education process. Two methods are applied to validate the education process, and many parameters are discussed to complete the research.
Data mining tools -a case study for network intrusion detection
With the growth of data mining and machine learning approaches in recent years, many efforts have been made to generalize these fields so that researchers from any discipline can easily use them. One of the most important of these efforts is the development of data mining tools that hide the complexities from researchers so that they can achieve professional output at any level of knowledge. This paper reviews and compares data mining and machine learning tools, including WEKA, KNIME, Keel, Orange, Azure, IBM SPSS Modeler, R and Scikit-Learn, to show what approach each tool takes to the complexities and problems that arise in different scenarios of generalization of data mining and machine learning. In addition, for a more detailed review, this paper examines the challenge of network intrusion detection in two tools: Knime, with a graphical interface, and Scikit-Learn, with a coding environment.
Data envelopment analysis and data mining to efficiency estimation and evaluation
Purpose: This paper aims to assess the application of seven statistical and data mining techniques to second-stage data envelopment analysis (DEA) for bank performance. Design/methodology/approach: Different statistical and data mining techniques are applied to second-stage DEA for bank performance as part of an attempt to produce a powerful model for bank performance with effective predictive ability. The data mining tools considered are classification and regression trees (CART), conditional inference trees (CIT), random forest based on CART and CIT, bagging, artificial neural networks and their statistical counterpart, logistic regression. Findings: The results showed that random forests and bagging outperform other methods in terms of predictive power. Originality/value: This is the first study to assess the impact of environmental factors on banking performance in Middle East and North Africa countries.
Development of a Model Using Data Mining Technique to Test, Predict and Obtain Knowledge from the Academics Results of Information Technology Students
Due to the huge amount of data obtained from students’ academic results in most tertiary institutions such as colleges, polytechnics and universities, data mining has become one of the most effective tools for discovering vital knowledge from students’ datasets. The discovered knowledge can be productive in understanding numerous challenges in education and providing possible solutions to these challenges. The main objective of this research is to use the J48 decision tree algorithm to test, classify and predict the students’ dataset by identifying some important attributes and instances. The analysis was conducted on final year students’ academic results in C# programming from five universities, which were imported into the WEKA environment as a CSV dataset. These training datasets contained the scores obtained in the examinations, grade remarks, grades, gender, and department. The knowledge extracted for the prediction model will help both tutors and students to estimate future grade performance. Flow lines, J48 decision trees, confusion matrices and a program flowchart were generated from the students’ dataset. The kappa value obtained from the prediction in this research ranges from 0.9070 to 0.9582, which agrees closely with the standard for an ideal analysis of datasets.
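The kappa value reported in this abstract is presumably Cohen's kappa, the agreement statistic that WEKA prints alongside its confusion matrix. As a minimal sketch of how such a value is derived from a confusion matrix, assuming a square matrix with actual classes as rows and predicted classes as columns (the function name and the example matrix are hypothetical, not data from the paper):

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix (rows: actual, cols: predicted)."""
    size = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / n
    # Chance agreement: sum over classes of (row total * column total) / n^2.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(size)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical two-class confusion matrix (NOT data from the paper).
confusion = [[45, 5],
             [3, 47]]
kappa = cohen_kappa(confusion)
print(round(kappa, 4))
```

A kappa of 1 means perfect agreement between predicted and actual classes, while 0 means agreement no better than chance, so values above 0.9 indicate very strong agreement.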
Research on Industry Data Analytics on Processing Procedure of Named 3-4-8-2 Components Combination for the Application Identification in New Chain Convenience Store
With the rapid economic boom of Asian countries, the president of Country-A has made great efforts to reform in recent years. The prospect of economic development is promising, and business opportunities are emerging gradually, depicting a prosperous scene; accordingly, people’s livelihood consumption has also changed significantly. Urban and rural people originally shopped mainly at old, traditional grocery stores with poor sanitation, but as the economy has improved, the quality of consumption has also improved, and convenience stores are gradually replacing grocery stores. However, convenience store management involves performance, logistics, competition, and personnel costs. Whether a store can generate a net profit, and how a new store is evaluated and selected, are both key factors that significantly influence business performance. Therefore, this study uses an industry data analysis method to highlight a two-stage processing procedure named the 3-4-8-2 components combination. First, in the data preprocessing stage, this research considers 22 condition attributes and two types of decision factors, net profit and new store selection, and uses both attribute selection and data discretization through the analysis and prediction of data mining tools. Next, in the experiment execution stage, three well-known classifiers (Bayes net, logistic regression, and J48 decision tree) with good past performance and four models (without preprocessing, with attribute selection, with data discretization, and with attribute selection and data discretization) are used for eight different experiments through two data verification methods (percentage split and cross-validation).
Conclusively, three key results are identified from empirical analysis: (1) It is found that the prediction accuracy of the J48 decision tree classifier is relatively high and stable among the three classifiers in this study; at the same time, the J48 decision tree can yield comprehensible knowledge-based rules to instruct interested parties. (2) The results of this study show that the important attributes for the net profit decision attribute include the store type, POS number, and cashier number, while the important attributes for the new store selection include the store type and cashier number. (3) There is a difference in the selection of important attributes. Furthermore, four key valuable contributions are addressed from the empirical results, including academic contributions, enterprise contributions, application contributions, and management contributions. It is expected that the direction of store layout expansion can be found and identified through this study, but there are still many risks hidden behind the considerable business opportunities that need to be carefully managed.
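The two data verification methods named in this abstract, percentage split and cross-validation, differ only in how instances are divided between training and testing. The sketch below illustrates that difference with plain index lists. It is not the authors' procedure: the abstract does not state the split percentage or the number of folds, so the 70% fraction, the 10 folds, the dataset size of 100, and both function names are assumptions made for illustration.

```python
import random


def percentage_split(n_items, train_fraction, seed=0):
    """Shuffle indices once and cut them into a single train/test pair."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    cut = round(n_items * train_fraction)
    return indices[:cut], indices[cut:]


def k_fold_indices(n_items, k):
    """Yield (train, test) index lists so every item is tested exactly once."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


# Assumed values: 100 instances, 70% training split, 10 folds.
train, test = percentage_split(100, 0.7)
folds = list(k_fold_indices(100, 10))
print(len(train), len(test), len(folds))
```

A percentage split evaluates a model once on a single held-out portion, whereas cross-validation rotates the held-out portion so that each instance contributes to exactly one test fold, which usually gives a more stable accuracy estimate on small datasets.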