Catalogue Search | MBRL
25,708 result(s) for "Data transformation"
Towards data-driven culture in a Spanish automobile manufacturer: A case study
by Villuendas, Diego; Fernandez, Vicenc; Esteller-Cucala, Maria
in Automobile industry, Case studies, Company structure
2020
Purpose: Data-driven decision-making is a growing trend that many companies are now willing to adopt. However, the organizational transformation needed is not always as simple and logical as it may seem, and the comfort of old habits can dampen the change effort. The purpose of this study is to identify the potential problems that may arise in a real company's transformation from a traditional intuition-driven decision-making model to a data-driven model. Design/methodology/approach: To reach this goal, a single case study method was used. Initially, a literature review was conducted to analyze both the importance of the change to a data-driven culture and the process of organizational change. Then, a case study was carried out in a company in the automotive sector that included experimentation in the website design decision-making process. Findings: The case study found that all of the risks to the organizational change process most cited in the literature appeared in the project. However, even with advance warning of the potential dangers, the specific actions needed to prevent the damage were not trivial. Originality/value: The study presents in detail the application of an organizational change model in a company. Important insights can be extracted from this specific case of digitalization performed inside a traditional industrial company.
Journal Article
Meta-analysis accelerator: a comprehensive tool for statistical data conversion in systematic reviews with meta-analysis
by Abbas, Abdallah; Hefnawy, Mahmoud Tarek; Negida, Ahmed
in Accuracy, Data analysis, Data conversion
2024
Background
Systematic review with meta-analysis integrates findings from multiple studies, offering robust conclusions on treatment effects and guiding evidence-based medicine. However, the process is often hampered by challenges such as inconsistent data reporting, complex calculations, and time constraints. Researchers must convert various statistical measures into a common format, which can be error-prone and labor-intensive without the right tools.
Implementation
Meta-Analysis Accelerator was developed to address these challenges. The tool offers 21 different statistical conversions, including median & interquartile range (IQR) to mean & standard deviation (SD), standard error of the mean (SEM) to SD, and confidence interval (CI) to SD for one and two groups, among others. It is designed with an intuitive interface, ensuring that users can navigate the tool easily and perform conversions accurately and efficiently. The website structure includes a home page, conversion page, request a conversion feature, about page, articles page, and privacy policy page. This comprehensive design supports the tool’s primary goal of simplifying the meta-analysis process.
Results
Since its initial release in October 2023 as Meta Converter and subsequent renaming to Meta-Analysis Accelerator, the tool has gained widespread use globally. From March 2024 to May 2024, it received 12,236 visits from countries such as Egypt, France, Indonesia, and the USA, indicating its international appeal and utility. Approximately 46% of the visits were direct, reflecting its popularity and trust among users.
Conclusions
Meta-Analysis Accelerator significantly enhances the efficiency and accuracy of meta-analysis of systematic reviews by providing a reliable platform for statistical data conversion. Its comprehensive variety of conversions, user-friendly interface, and continuous improvements make it an indispensable resource for researchers. The tool’s ability to streamline data transformation ensures that researchers can focus more on data interpretation and less on manual calculations, thus advancing the quality and ease of conducting systematic reviews and meta-analyses.
Journal Article
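Two of the conversions named in the abstract above (SEM to SD, and CI to SD for one group) follow standard textbook formulas, which can be sketched as below. This is a minimal illustration, not the tool's own implementation: the function names `sd_from_sem` and `sd_from_ci` are hypothetical, and the CI conversion assumes a 95% interval around a single-group mean under a normal approximation (z = 1.96).

```python
import math


def sd_from_sem(sem: float, n: int) -> float:
    """Recover the standard deviation from the standard error of the mean.

    Uses SD = SEM * sqrt(n), where n is the sample size.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return sem * math.sqrt(n)


def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """Recover the standard deviation from a confidence interval of a mean.

    Assumes a symmetric interval built as mean +/- z * SEM, so that
    SEM = (upper - lower) / (2 * z) and SD = SEM * sqrt(n).
    The default z = 1.96 corresponds to a 95% interval (assumption).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if upper < lower:
        raise ValueError("upper must not be smaller than lower")
    sem = (upper - lower) / (2 * z)
    return sem * math.sqrt(n)


if __name__ == "__main__":
    print(sd_from_sem(0.5, 16))
    print(sd_from_ci(8.04, 11.96, 16))
```

For small samples, a t-distribution quantile is usually a better choice than 1.96; it can be passed in through the `z` parameter.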
Learning Gene Regulatory Networks from Next Generation Sequencing Data
2017
In recent years, next generation sequencing (NGS) has gradually replaced microarray as the major platform for measuring gene expression. Compared to microarray, NGS has many advantages, such as less noise and higher throughput. However, the discreteness of NGS data also challenges the existing statistical methodology. In particular, the literature still lacks an appropriate statistical method for reconstructing gene regulatory networks using NGS data. The existing local Poisson graphical model method is not consistent and can only infer certain local structures of the network. In this article, we propose a random effect model-based transformation to continuize NGS data; we then transform the continuized data to Gaussian via a semiparametric transformation and apply an equivalent partial correlation selection method to reconstruct gene regulatory networks. The proposed method is consistent. The numerical results indicate that the proposed method can lead to much more accurate inference of gene regulatory networks than the local Poisson graphical model and other existing methods. The proposed data-continuized transformation fills the theoretical gap in how to transform discrete data to continuous data and facilitates NGS data analysis. It also makes it feasible to integrate different types of data, such as microarray and RNA-seq data, in the reconstruction of gene regulatory networks.
Journal Article
Study on Preprocessing Method of TCM Prescription Data in Data Mining
2021
Traditional Chinese medicine (TCM) prescriptions have been developed over thousands of years. Their data forms are diverse, the content is discrete and often missing, and cultural and regional differences introduce many uncertainties, all of which make TCM prescriptions difficult to mine. Taking 3108 prescriptions for the treatment of typhoid fever as an example, this study focuses on the data cleaning and data transformation stages of data preprocessing. Combining multiple functions, it describes methods for cleaning unqualified prescription data, normalizing drug names, unifying doses, and structuring the data, so that the processed data can be mined effectively. This provides strong support for exploring the compatibility law of prescriptions and for the development of new drugs.
Journal Article
ImageGP 2 for enhanced data visualization and reproducible analysis in biomedical research
2024
ImageGP is an extensively utilized, open‐access platform for online data visualization and analysis. Over the past 7 years, it has catered to more than 700,000 usages globally, garnering substantial user feedback. The updated version, ImageGP 2 (available at https://www.bic.ac.cn/BIC), introduces a redesigned interface leveraging cutting‐edge web technologies to enhance functionality and user interaction. Key enhancements include the following: (i) Addition of modules for data format transformation, facilitating operations such as matrix merging, subsetting, and transformation between long and wide formats. (ii) Streamlined workflows with features like preparameter selection data validation and grouping of parameters with similar attributes. (iii) Expanded repertoire of visualization functions and analysis tools, including Weighted Gene Co‐Expression Network Analysis, differential gene expression analysis, and FASTA sequence processing. (iv) Personalized user space for uploading large data sets, tracking analysis history, and sharing reproducible analysis data, scripts, and results. (v) Enhanced user support through a simplified error debugging feature accessible with a single click. (vi) Introduction of an R package, ImageGP, enabling local data visualization and analysis. These updates position ImageGP 2 as a versatile tool serving both wet‐lab and dry‐lab researchers with expanded capabilities.
The infinity symbol illustrates the seamless workflow of ImageGP 2, encompassing essential functions such as data format transformation, data validation, and parameter combination. This process culminates in the generation of diverse visual outputs, including line, point, and bar plots. Key features include a personalized user center for managing large data sets, interactive visualizations, and streamlined error feedback mechanisms. Additionally, the introduction of the ImageGP R package enables local and batch analyses. Overall, the infinity symbol embodies the limitless potential for data analysis and visualization offered by ImageGP 2.
Highlights
Advanced user interface, expanded analytical capabilities, and seamless data handling.
New modules for data transformation and preparameter selection data validation.
Personalized user center, reproducible scripts, seamless error debugging, and the introduction of local analysis capabilities.
Journal Article
TWO-SAMPLE AND ANOVA TESTS FOR HIGH DIMENSIONAL MEANS
2019
This paper considers testing the equality of two high dimensional means. Two approaches are utilized to formulate L₂-type tests for better power performance when the two high dimensional mean vectors differ only in sparsely populated coordinates and the differences are faint. One is to conduct thresholding to remove the nonsignal bearing dimensions for variance reduction of the test statistics. The other is to transform the data via the precision matrix for signal enhancement. It is shown that the thresholding and data transformation lead to attractive detection boundaries for the tests. Furthermore, we demonstrate explicitly the effects of precision matrix estimation on the detection boundary for the test with thresholding and data transformation. Extension to multi-sample ANOVA tests is also investigated. Numerical studies are performed to confirm the theoretical findings and demonstrate the practical implementations.
Journal Article
Redefining Health Care Data Interoperability: Empirical Exploration of Large Language Models in Information Exchange
2024
Efficient data exchange and health care interoperability are impeded by medical records often being in nonstandardized or unstructured natural language format. Advanced language models, such as large language models (LLMs), may help overcome current challenges in information exchange.
This study aims to evaluate the capability of LLMs in transforming and transferring health care data to support interoperability.
Using data from the Medical Information Mart for Intensive Care III and UK Biobank, the study conducted 3 experiments. Experiment 1 assessed the accuracy of transforming structured laboratory results into unstructured format. Experiment 2 explored the conversion of diagnostic codes between the coding frameworks of the ICD-9-CM (International Classification of Diseases, Ninth Revision, Clinical Modification) and Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT), using a traditional mapping table and a text-based approach facilitated by the LLM ChatGPT. Experiment 3 focused on extracting targeted information from unstructured records that included comprehensive clinical information (discharge notes).
The text-based approach showed a high conversion accuracy in transforming laboratory results (experiment 1) and an enhanced consistency in diagnostic code conversion, particularly for frequently used diagnostic names, compared with the traditional mapping approach (experiment 2). In experiment 3, the LLM showed a positive predictive value of 87.2% in extracting generic drug names.
This study highlighted the potential role of LLMs in significantly improving health care data interoperability, demonstrated by their high accuracy and efficiency in data transformation and exchange. The LLMs hold vast potential for enhancing medical data exchange without complex standardization for medical terms and data structure.
Journal Article
A Data Transformation Methodology to Create Findable, Accessible, Interoperable, and Reusable Health Data: Software Design, Development, and Evaluation Study
by Carmona-Pírez, Jonás; Sinaci, A Anil; Martinez-Garcia, Alicia
in Computer programming, Criteria, Data
2023
Sharing health data is challenging because of several technical, ethical, and regulatory issues. The Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles have been conceptualized to enable data interoperability. Many studies provide implementation guidelines, assessment metrics, and software to achieve FAIR-compliant data, especially for health data sets. Health Level 7 (HL7) Fast Healthcare Interoperability Resources (FHIR) is a health data content modeling and exchange standard.
Our goal was to devise a new methodology to extract, transform, and load existing health data sets into HL7 FHIR repositories in line with FAIR principles, develop a Data Curation Tool to implement the methodology, and evaluate it on health data sets from 2 different but complementary institutions. We aimed to increase the level of compliance with FAIR principles of existing health data sets through standardization and facilitate health data sharing by eliminating the associated technical barriers.
Our approach automatically processes the capabilities of a given FHIR end point and directs the user while configuring mappings according to the rules enforced by FHIR profile definitions. Code system mappings can be configured for terminology translations through automatic use of FHIR resources. The validity of the created FHIR resources can be automatically checked, and the software does not allow invalid resources to be persisted. At each stage of our data transformation methodology, we used particular FHIR-based techniques so that the resulting data set could be evaluated as FAIR. We performed a data-centric evaluation of our methodology on health data sets from 2 different institutions.
Through an intuitive graphical user interface, users are prompted to configure the mappings into FHIR resource types with respect to the restrictions of selected profiles. Once the mappings are developed, our approach can syntactically and semantically transform existing health data sets into HL7 FHIR without loss of data utility according to our privacy-concerned criteria. In addition to the mapped resource types, behind the scenes, we create additional FHIR resources to satisfy several FAIR criteria. According to the data maturity indicators and evaluation methods of the FAIR Data Maturity Model, we achieved the maximum level (level 5) for being Findable, Accessible, and Interoperable and level 3 for being Reusable.
We developed and extensively evaluated our data transformation approach to unlock the value of existing health data residing in disparate data silos to make them available for sharing according to the FAIR principles. We showed that our method can successfully transform existing health data sets into HL7 FHIR without loss of data utility, and the result is FAIR in terms of the FAIR Data Maturity Model. We support institutional migration to HL7 FHIR, which not only leads to FAIR data sharing but also eases the integration with different research networks.
Journal Article
Size-Correction and Principal Components for Interspecific Comparative Studies
Phylogenetic methods for the analysis of species data are widely used in evolutionary studies. However, preliminary data transformations and data reduction procedures (such as a size-correction and principal components analysis, PCA) are often performed without first correcting for nonindependence among the observations for species. In the present short comment and attached R and MATLAB code, I provide an overview of statistically correct procedures for phylogenetic size-correction and PCA. I also show that ignoring phylogeny in preliminary transformations can result in significantly elevated variance and type I error in our statistical estimators, even if subsequent analysis of the transformed data is performed using phylogenetic methods. This means that ignoring phylogeny during preliminary data transformations can possibly lead to spurious results in phylogenetic statistical analyses of species data.
Journal Article
Enhanced framework embedded with data transformation and multi-objective feature selection algorithm for forecasting wind power
2025
The increasing global interest in utilizing wind turbines for power generation emphasizes the importance of accurate wind power forecasting in managing wind power. This paper proposes a framework that integrates a data transformation mechanism with a multi-objective non-dominated sorting genetic algorithm III (NSGA-III), coupled with a hybrid deep Recurrent Network (DRN) and Long Short-Term Memory (LSTM) architecture for modeling wind power. The feature selection algorithm, multi-objective NSGA-III, identifies the optimal subset of features from wind energy datasets. These selected features undergo a data transformation process before being input into the hybrid DRN-LSTM for wind power forecasting. A comparative study demonstrates the proposal's superior effectiveness and robustness compared to existing frameworks, with the proposal achieving 2.6593e−10 and 1.630e−05 in terms of MSE and RMSE, respectively, whereas the classical algorithm recorded 8.8814e−07 and 9.424e−04. The study's contributions lie in its integration of a data transformation mechanism and the notable enhancements in wind power forecasting accuracy. Furthermore, the study offers valuable insights to guide future research efforts.
Journal Article