Catalogue Search | MBRL
8,405 result(s) for "Data Note"
HUST bearing: a practical dataset for ball bearing fault diagnosis
2023
Objectives
The rapid growth of machine learning methods has increased the demand for data. For bearing fault diagnosis, data acquisition is time-consuming and procedurally complicated. Existing datasets focus on only one type of bearing, which limits real-world applications. Therefore, the objective of this work is to propose a diverse dataset for ball bearing fault diagnosis based on vibration.
Data description
In this work, we introduce a practical dataset named HUST bearing, which provides a large set of vibration data on different ball bearings. This dataset contains 99 raw vibration signals of 6 types of defects (inner crack, outer crack, ball crack, and their 2-combinations) on 5 types of bearing (6204, 6205, 6206, 6207, and 6208) at 3 working conditions (0 W, 200 W, and 400 W). Each vibration signal is sampled at a rate of 51,200 samples per second for 10 s. The data acquisition system is elaborately designed with high reliability.
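The sampling figures in this description imply a fixed signal length, which the short sketch below works out. It is only arithmetic on the numbers quoted above; the variable names are illustrative, and it assumes every one of the 99 signals has the full 10 s duration.

```python
# Figures quoted in the data description (names are illustrative).
SAMPLE_RATE_HZ = 51_200   # samples per second
DURATION_S = 10           # seconds per vibration signal
NUM_SIGNALS = 99          # raw vibration signals in the dataset

# Length of one signal, assuming the full 10 s is recorded.
samples_per_signal = SAMPLE_RATE_HZ * DURATION_S

# Total samples, assuming all 99 signals have the same length.
total_samples = samples_per_signal * NUM_SIGNALS

# Highest frequency representable at this sampling rate (Nyquist limit).
nyquist_hz = SAMPLE_RATE_HZ / 2

print(samples_per_signal, total_samples, nyquist_hz)
```

Under these assumptions each signal holds 512,000 samples, and spectral content up to 25.6 kHz can be represented.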
Journal Article
COVID19-CT-dataset: an open-access chest CT image repository of 1000+ patients with confirmed COVID-19 diagnosis
by Ataei Nakhaei, Saeedeh; Shakouri, Shokouh; Kiani, Behzad
in Analysis, Artificial Intelligence, Biomedical and Life Sciences
2021
Objectives
The ongoing coronavirus disease 2019 (COVID-19) pandemic has drastically impacted global health and the economy. Computed tomography (CT) is the prime imaging modality for diagnosis of lung infections in COVID-19 patients. Data-driven and artificial intelligence (AI)-powered solutions for automatic processing of CT images predominantly rely on large-scale, heterogeneous datasets. Owing to privacy and data availability issues, open-access and publicly available COVID-19 CT datasets are difficult to obtain, thus limiting the development of AI-enabled automatic diagnostic solutions. To tackle this problem, large CT image datasets encompassing diverse patterns of lung infections are in high demand.
Data description
In the present study, we provide an open-source repository containing 1000+ CT images of COVID-19 lung infections established by a team of board-certified radiologists. CT images were acquired from two main general university hospitals in Mashhad, Iran from March 2020 until January 2021. COVID-19 infections were ratified with matching tests including Reverse transcription polymerase chain reaction (RT-PCR) and accompanying clinical symptoms. All data are 16-bit grayscale images composed of 512 × 512 pixels and are stored in DICOM standard. Patient privacy is preserved by removing all patient-specific information from image headers. Subsequently, all images corresponding to each patient are compressed and stored in RAR format.
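The image format stated above fixes the size of the raw pixel data per slice. The sketch below computes it, assuming one 16-bit sample per pixel, no pixel-data compression, and ignoring DICOM header bytes; the names are illustrative and not taken from the repository.

```python
# Image geometry quoted in the data description (names are illustrative).
ROWS = 512
COLUMNS = 512
BITS_PER_PIXEL = 16  # 16-bit grayscale, assumed one sample per pixel

# Raw pixel payload of one slice, excluding header metadata.
bytes_per_image = ROWS * COLUMNS * BITS_PER_PIXEL // 8
mib_per_image = bytes_per_image / (1024 * 1024)

print(bytes_per_image, mib_per_image)
```

Each uncompressed slice therefore carries 524,288 bytes (0.5 MiB) of pixel data under these assumptions; the RAR archives will differ in size.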
Journal Article
Genomes to fields 2024 maize genotype by environment prediction competition
by Gore, Michael A.; Kaeppler, Shawn M.; Lopez-Cruz, Marco
in Agricultural commodities, Agricultural production, Biomedical and Life Sciences
2026
Objectives
The Genomes to Fields (G2F) 2024 Maize Genotype by Environment (GxE) Prediction Competition challenged participants to develop and submit their best-performing models to predict grain yield for the 2024 Maize GxE project field trials, using G2F data collected from 2014 to 2023 and other publicly available data.
Data description
The G2F Maize GxE Project is a collaborative effort, with all generated data made publicly available. The resource presented here includes the training and test datasets used for the G2F 2024 Maize GxE Prediction Competition. Specifically, data collected from 2014 to 2023 served as the training set to predict grain yield in the 2024 test set. The dataset comprises phenotypic, genotypic, soil, weather, and environmental covariate data, along with metadata describing environments (year-location combinations). It has been curated and lightly filtered for quality control and to ensure consistent naming across years. Competitors also had access to readme files that describe the structure and content of the datasets.
Journal Article
Genomes to Fields 2022 Maize genotype by Environment Prediction Competition
by Gore, Michael A.; Lopez-Cruz, Marco; Ertl, David
in Analysis, Biomedical and Life Sciences, Biomedicine
2023
Objectives
The Genomes to Fields (G2F) 2022 Maize Genotype by Environment (GxE) Prediction Competition aimed to develop models for predicting grain yield for the 2022 Maize GxE project field trials, leveraging the datasets previously generated by this project and other publicly available data.
Data description
This resource used data from the Maize GxE project within the G2F Initiative [1]. The dataset included phenotypic and genotypic data of the hybrids evaluated in 45 locations from 2014 to 2022, along with soil, weather, and environmental covariate data and metadata for all environments (combinations of year and location). Competitors also had access to ReadMe files that described all the files provided. The Maize GxE project is collaborative, and all the data generated become publicly available [2]. The dataset used in the 2022 Prediction Competition was curated and lightly filtered for quality and to ensure naming uniformity across years.
Journal Article
Maize genomes to fields (G2F): 2014–2017 field seasons: genotype, phenotype, climatic, soil, and inbred ear image datasets
by Singh, Maninder; Yeh, Cheng-Ting; Silverstein, Kevin
in Agricultural production, Biomedical and Life Sciences, Biomedicine
2020
Objectives
Advanced tools and resources are needed to efficiently and sustainably produce food for an increasing world population in the context of variable environmental conditions. The maize genomes to fields (G2F) initiative is a multi-institutional effort that seeks to approach this challenge by developing a flexible and distributed infrastructure addressing emerging problems. G2F has generated large-scale phenotypic, genotypic, and environmental datasets using publicly available inbred lines and hybrids evaluated through a network of collaborators that are part of the G2F’s genotype-by-environment (G × E) project. This report covers the public release of datasets for 2014–2017.
Data description
Datasets include inbred genotypic information; phenotypic, climatic, and soil measurements and metadata information for each testing location across years. For a subset of inbreds in 2014 and 2015, yield component phenotypes were quantified by image analysis. Data released are accompanied by README descriptions. For genotypic and phenotypic data, both raw data and a version without outliers are reported. For climatic data, a version calibrated to the nearest airport weather station and a version without outliers are reported. The 2014 and 2015 datasets are updated versions from the previously released files [1] while 2016 and 2017 datasets are newly available to the public.
Journal Article
2020-2021 field seasons of Maize GxE project within the Genomes to Fields Initiative
by Schnable, James; Gore, Michael A.; Aviles, Alejandro Castro
in Analysis, Biomedical and Life Sciences, Biomedicine
2023
Objectives
This release note describes the Maize GxE project datasets within the Genomes to Fields (G2F) Initiative. The Maize GxE project aims to understand genotype by environment (GxE) interactions and use the information collected to improve resource allocation efficiency and increase genotype predictability and stability, particularly in scenarios of variable environmental patterns. Hybrids and inbreds are evaluated across multiple environments and phenotypic, genotypic, environmental, and metadata information are made publicly available.
Data description
The datasets include phenotypic data of the hybrids and inbreds evaluated in 30 locations across the US and one location in Germany in 2020 and 2021; soil and climatic measurements and metadata information for all environments (combinations of year and location); and ReadMe and description files for each data type. A set of common hybrids is present in each environment to connect with previous evaluations. Each environment had a collaborator responsible for collecting and submitting the data. The GxE coordination team combined all the collected information and removed obviously erroneous data. Collaborators received the combined data to use, verify, and declare that the data generated in their own environments were accurate. Combined data are released to the public with minimal filtering to maintain fidelity to the original data.
Journal Article
An IMU-based dataset of falls, activities of daily living, and prayer movements (AybuFall)
by Kocaoğlu, Sıtkı; Tokgöz, Nazime
in Accelerometry, Accidental Falls, Activities of Daily Living
2026
Objectives
Publicly available datasets are essential for the development, evaluation, and benchmarking of fall detection and human activity recognition algorithms. Although numerous datasets include falls and activities of daily living (ADLs), prayer movements—despite exhibiting motion patterns that may resemble falls—remain largely underrepresented. The objective of this study is to present a publicly available IMU-based dataset that explicitly includes prayer movements alongside falls and ADLs, thereby addressing an important gap in existing datasets and supporting methodological research on activity classification and false-positive reduction.
Data description
The dataset comprises motion recordings of 11 types of fall movements, 13 types of activities of daily living (ADLs), and 5 types of prayer movements. Data were collected from 17 healthy young adult participants using two wearable IMU sensors placed on the forehead and forearm. Each activity was performed three times by each participant. Tri-axial accelerometer, gyroscope, and magnetometer signals were recorded at a sampling frequency of 200 Hz. All recordings were manually labeled by direct observation during data acquisition. The dataset is publicly available and systematically organized to support algorithm development, benchmarking, and reproducible research in fall detection and human activity recognition. Although data were collected from young adults, the dataset is intended as a controlled reference resource, and applicability to other populations requires further validation.
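The counts in this description give a nominal size for the collection. The sketch below multiplies them out, assuming every participant performed every activity exactly three times with no missing or discarded trials; the abstract does not confirm this, so the result is an upper-level estimate rather than the published file count. Variable names are illustrative.

```python
# Counts quoted in the data description (names are illustrative).
FALL_TYPES = 11
ADL_TYPES = 13
PRAYER_TYPES = 5
PARTICIPANTS = 17
REPETITIONS = 3
SENSORS = 2            # forehead and forearm IMUs
AXES = 3               # tri-axial signals
SIGNAL_KINDS = 3       # accelerometer, gyroscope, magnetometer

activity_types = FALL_TYPES + ADL_TYPES + PRAYER_TYPES

# Nominal number of trials, assuming no trial is missing.
nominal_trials = activity_types * PARTICIPANTS * REPETITIONS

# Signal channels recorded per trial across both sensors.
channels_per_trial = SENSORS * AXES * SIGNAL_KINDS

print(activity_types, nominal_trials, channels_per_trial)
```

Under these assumptions there are 29 activity types, 1,479 nominal trials, and 18 channels per trial sampled at 200 Hz.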
Journal Article
An accelerometer-based dataset for monitoring slag in steel manufacturing
by Dumond, Patrick; Ayres, Lucas Mantuan; de Souza Leite Cuadros, Marco Antonio
in Accelerometers, Accelerometry - methods, Algorithms
2025
Objectives
Slag detection in steel manufacturing is essential for ensuring high product quality and process efficiency. The purpose of the accelerometer-based data is to allow for accurate monitoring and differentiation between slag and molten metal flow. This is vital to prevent equipment damage, maintain steel quality, and enhance operational effectiveness. The data is collected specifically to support the development of machine learning models for real-time monitoring in the steel production process, addressing the critical need for precise slag detection.
Data description
The Steel Slag Flow Dataset (SSFD) offers a comprehensive set of data obtained from a triaxial accelerometer during various stages of steel production. By leveraging this dataset, researchers can effectively analyze and classify the flow of slag versus molten metal. The dataset allows for data-driven approaches so that machine learning researchers can optimize steel manufacturing processes, ensuring high-quality steel production and minimizing the risks associated with slag contamination. The SSFD provides a valuable resource for researchers seeking to enhance predictive maintenance and monitoring in industrial applications.
Journal Article
Blood transcriptome analysis of patients with uncomplicated bacterial infection and sepsis
by McLean, Anthony S.; Herwanto, Velma; Tang, Benjamin
in Analysis, Archives & records, Bacterial Infections
2021
Objectives
Hospitalized patients who presented within the last 24 h with a bacterial infection were recruited. Participants were assigned to sepsis and uncomplicated infection groups. In addition, healthy volunteers were recruited as controls. RNA was prepared from whole blood, depleted of beta-globin mRNA, and sequenced. This dataset represents a highly valuable resource to better understand the biology of sepsis and to identify biomarkers for severe sepsis in humans.
Data description
The data presented here consists of raw and processed transcriptome data obtained by next generation RNA sequencing from 105 peripheral blood samples from patients with uncomplicated infections, patients who developed sepsis, septic shock patients, and healthy controls. It is provided as raw sequenced reads and as normalized log2-transformed relative expression levels. This data will allow performing detailed analyses of gene expression changes between uncomplicated infections and sepsis patients, such as identification of differentially expressed genes, co-regulated modules as well as pathway activation studies.
Journal Article
Models and data of AMPlify: a deep learning tool for antimicrobial peptide prediction
by Warren, René L.; Li, Chenkai; Birol, Inanc
in Amino Acid Sequence, Animals, Anti-Bacterial Agents
2023
Objectives
Antibiotic resistance is a rising global threat to human health and is prompting researchers to seek effective alternatives to conventional antibiotics, which include antimicrobial peptides (AMPs). Recently, we have reported AMPlify, an attentive deep learning model for predicting AMPs in databases of peptide sequences. In our tests, AMPlify outperformed the state-of-the-art. We have illustrated its use on data describing the American bullfrog (Rana [Lithobates] catesbeiana) genome. Here we present the model files and training/test data sets we used in that study. The original model (the balanced model) was trained on a balanced set of AMP and non-AMP sequences curated from public databases. In this data note, we additionally provide a model trained on an imbalanced set, in which non-AMP sequences far outnumber AMP sequences. We note that the balanced and imbalanced models would serve different use cases, and both would serve the research community, facilitating the discovery and development of novel AMPs.
Data description
This data note provides two sets of models, as well as two AMP and four non-AMP sequence sets for training and testing the balanced and imbalanced models. Each model set includes five single sub-models that form an ensemble model. The first model set corresponds to the original model trained on a balanced training set that has been described in the original AMPlify manuscript, while the second model set was trained on an imbalanced training set.
Journal Article