Catalogue Search | MBRL

The evolution of raw data archiving and the growth of its importance in crystallography

by McMahon, Brian , Kroon-Batenburg, Loes M. J. , Helliwell, John R. in ground truth , raw data archive hardware , raw data measuring hardware

2024

The hardware for data archiving has expanded capacities for digital storage enormously in the past decade or more. The IUCr evaluated the costs and benefits of this within an official working group which advised that raw data archiving would allow ground truth reproducibility in published studies. Consultations of the IUCr's Commissions ensued via a newly constituted standing advisory committee, the Committee on Data. At all stages, the IUCr financed workshops to facilitate community discussions and possible methods of raw data archiving implementation. The recent launch of the IUCrData journal's Raw Data Letters is a milestone in the implementation of raw data archiving beyond the currently published studies: it includes diffraction patterns that have not been fully interpreted, if at all. The IUCr 75th Congress in Melbourne included a workshop on raw data reuse, discussing the successes and ongoing challenges of raw data reuse. This article charts the efforts of the IUCr to facilitate discussions and plans relating to raw data archiving and reuse within the various communities of crystallography, diffraction and scattering.

Journal Article

Investigating the Use of Pretrained Convolutional Neural Network on Cross-Subject and Cross-Dataset EEG Emotion Recognition

by Cimtay, Yucel , Ekmekcioglu, Erhan in Accuracy , Classification , convolutional neural network

2020

The electroencephalogram (EEG) has great attraction in emotion recognition studies due to its resistance to deceptive actions of humans. This is one of the most significant advantages of brain signals in comparison to visual or speech signals in the emotion recognition context. A major challenge in EEG-based emotion recognition is that EEG recordings exhibit varying distributions for different people as well as for the same person at different time instances. This nonstationary nature of EEG limits the accuracy of it when subject independency is the priority. The aim of this study is to increase the subject-independent recognition accuracy by exploiting pretrained state-of-the-art Convolutional Neural Network (CNN) architectures. Unlike similar studies that extract spectral band power features from the EEG readings, raw EEG data is used in our study after applying windowing, pre-adjustments and normalization. Removing manual feature extraction from the training system overcomes the risk of eliminating hidden features in the raw data and helps leverage the deep neural network’s power in uncovering unknown features. To improve the classification accuracy further, a median filter is used to eliminate the false detections along a prediction interval of emotions. This method yields a mean cross-subject accuracy of 86.56% and 78.34% on the Shanghai Jiao Tong University Emotion EEG Dataset (SEED) for two and three emotion classes, respectively. It also yields a mean cross-subject accuracy of 72.81% on the Database for Emotion Analysis using Physiological Signals (DEAP) and 81.8% on the Loughborough University Multimodal Emotion Dataset (LUMED) for two emotion classes. Furthermore, the recognition model that has been trained using the SEED dataset was tested with the DEAP dataset, which yields a mean prediction accuracy of 58.1% across all subjects and emotion classes. Results show that in terms of classification accuracy, the proposed approach is superior to, or on par with, the reference subject-independent EEG emotion recognition studies identified in literature and has limited complexity due to the elimination of the need for feature extraction.
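
The median-filter step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the window length, the label encoding, or the edge handling, so the function name `median_filter_labels`, the window of 3, and the 0/1 labels below are all assumptions made for illustration, not the authors' implementation.

```python
def median_filter_labels(labels, window=3):
    """Smooth a sequence of integer class predictions with a sliding median.

    Assumptions (not stated in the abstract): odd window length, and a
    truncated window at both ends of the sequence.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        neighbourhood = sorted(labels[lo:hi])
        # Middle element of the sorted neighbourhood (upper middle if even).
        smoothed.append(neighbourhood[len(neighbourhood) // 2])
    return smoothed


# Hypothetical per-window predictions: 1 = positive emotion, 0 = negative.
raw_predictions = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
smoothed_predictions = median_filter_labels(raw_predictions, window=3)
```

In this toy sequence the isolated predictions that disagree with both neighbours are replaced by the neighbouring class, which is the kind of false detection the abstract says the filter removes.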

Journal Article

The science is in the data

by McMahon, Brian , Kroon-Batenburg, Loes M. J. , Helliwell, John R. in Archives & records , crystallographic science case studies , Crystallography

2017

Understanding published research results should be through one's own eyes and include the opportunity to work with raw diffraction data to check the various decisions made in the analyses by the original authors. Today, preserving raw diffraction data is technically and organizationally viable at a growing number of data archives, both centralized and distributed, which are empowered to register data sets and obtain a preservation descriptor, typically a 'digital object identifier'. This introduces an important role of preserving raw data, namely understanding where we fail in or could improve our analyses. Individual science area case studies in crystallography are provided.

Journal Article

No raw data, no science: another possible source of the reproducibility crisis

by Miyakawa, Tsuyoshi in Biomedical and Life Sciences , Biomedicine , Brain research

2020

A reproducibility crisis is a situation where many scientific studies cannot be reproduced. Inappropriate practices of science, such as HARKing, p-hacking, and selective reporting of positive results, have been suggested as causes of irreproducibility. In this editorial, I propose that a lack of raw data or data fabrication is another possible cause of irreproducibility. As an Editor-in-Chief of Molecular Brain, I have handled 180 manuscripts since early 2017 and have made 41 editorial decisions categorized as “Revise before review,” requesting that the authors provide raw data. Surprisingly, among those 41 manuscripts, 21 were withdrawn without providing raw data, indicating that requiring raw data drove away more than half of the manuscripts. I rejected 19 out of the remaining 20 manuscripts because of insufficient raw data. Thus, more than 97% of the 41 manuscripts did not present the raw data supporting their results when requested by an editor, suggesting a possibility that the raw data did not exist from the beginning, at least in some portions of these cases. Considering that any scientific study should be based on raw data, and that data storage space should no longer be a challenge, journals, in principle, should try to have their authors publicize raw data in a public database or journal site upon the publication of the paper to increase reproducibility of the published results and to increase public trust in science.
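
The percentages quoted in the abstract follow directly from its counts. The short check below only restates that arithmetic; the variable names are illustrative.

```python
# Counts taken from the abstract.
requested_raw_data = 41   # "Revise before review" decisions
withdrawn = 21            # withdrawn without providing raw data
rejected = 19             # rejected for insufficient raw data

without_raw_data = withdrawn + rejected

withdrawn_pct = 100 * withdrawn / requested_raw_data
without_raw_data_pct = 100 * without_raw_data / requested_raw_data

print(f"withdrawn: {withdrawn_pct:.1f}%")
print(f"no adequate raw data: {without_raw_data_pct:.1f}%")
```

So 21 of 41 is just over half, and 40 of 41 is just over 97%, matching the "more than half" and "more than 97%" statements.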

Journal Article

False-positive results released by direct-to-consumer genetic tests highlight the importance of clinical confirmation testing for appropriate patient care

by Tippin Davis, Brigette , Gutierrez, Stephanie , Tandy-Connor, Stephany in Adult , Aged , Biomedical and Life Sciences

2018

There is increasing demand from the public for direct-to-consumer (DTC) genetic tests, and the US Food and Drug Administration limits the type of health-related claims DTC tests can market. Some DTC companies provide raw genotyping data to customers if requested, and these raw data may include variants occurring in genes recommended by the American College of Medical Genetics and Genomics to be reported as incidental/secondary findings. The purpose of this study was to review the outcome of requests for clinical confirmation of DTC results that were received by our laboratory and to analyze variant classification concordance. We identified 49 patient samples received for further testing that had previously identified genetic variants reported in DTC raw data. For each case identified, information pertaining to the outcome of clinical confirmation testing as well as classification of the DTC variant was collected and analyzed. Our analyses indicated that 40% of variants in a variety of genes reported in DTC raw data were false positives. In addition, some variants designated with the “increased risk” classification in DTC raw data or by a third-party interpretation service were classified as benign at Ambry Genetics as well as several other clinical laboratories, and are noted to be common variants in publicly available population frequency databases. Our results demonstrate the importance of confirming DTC raw data variants in a clinical laboratory that is well versed in both complex variant detection and classification.

Journal Article

Hong Kong UrbanNav: An Open-Source Multisensory Dataset for Benchmarking Urban Navigation Algorithms

by Ng, Hoi-Fung , Bai, Xiwei , Hsu, Li-Ta in Algorithms , Canyons , Datasets

2023

Accurate positioning in urban canyons remains a challenging problem. To facilitate the research and development of reliable and precise positioning methods using multiple sensors in urban canyons, we built a multisensory dataset, UrbanNav, collected in diverse, challenging urban scenarios in Hong Kong. The dataset provides multi-sensor data, including data from multi-frequency global navigation satellite system (GNSS) receivers, an inertial measurement unit (IMU), multiple light detection and ranging (lidar) units, and cameras. Meanwhile, the ground truth of the positioning (with centimeter-level accuracy) is postprocessed by commercial software from NovAtel using an integrated GNSS real-time kinematic and fiber optics gyroscope inertial system. In this paper, the sensor systems, spatial and temporal calibration, data formats, and scenario descriptions are presented in detail. Meanwhile, the benchmark performance of several existing positioning methods is provided as a baseline. Based on the evaluations, we conclude that GNSS can provide satisfactory results in a middle-class urban canyon if an appropriate receiver and algorithms are applied. Both visual and lidar odometry are satisfactory in deep urban canyons, whereas tunnels are still a major challenge. Multisensory integration with the aid of an IMU is a promising solution for achieving seamless positioning in cities. The dataset in its entirety can be found on GitHub at https://github.com/IPNL-POLYU/UrbanNavDataset.

Journal Article

Whole-genome sequencing offers additional but limited clinical utility compared with reanalysis of whole-exome sequencing

by Alfadhel, Majid , Qudsi, Ahmed Al , Alfares, Ahmed in Adult , Biomedical and Life Sciences , Biomedicine

2018

Purpose: Whole-exome sequencing (WES) and whole-genome sequencing (WGS) are used to diagnose genetic and inherited disorders. However, few studies comparing the detection rates of WES and WGS in clinical settings have been performed. Methods: Variant call format files were generated and raw data analysis was performed in cases in which the final molecular results showed discrepancies. We classified the possible explanations for the discrepancies into three categories: the time interval between the two tests, the technical limitations of WES, and the impact of the sequencing system type. Results: This cohort comprised 108 patients with negative array comparative genomic hybridization and negative or inconclusive WES results before WGS was performed. Ten (9%) patients had positive WGS results. However, after reanalysis the WGS hit rate decreased to 7% (7 cases). In four cases the variants were identified by WES but missed for different reasons. Only 3 cases (3%) were positive by WGS but completely unidentified by WES. Conclusion: In this study, we showed that 30% of the positive cases identified by WGS could be identified by reanalyzing the WES raw data, and WGS achieved a detection rate only 7% higher. Therefore, until the cost of WGS approximates that of WES, reanalyzing WES raw data is recommended before performing WGS.

Journal Article

Automatic Human Sleep Stage Scoring Using Deep Neural Networks

by Wichniak, Adam , Omlin, Ximena , Buhmann, Joachim in Agreements , Algorithms , Artificial intelligence

2018

The classification of sleep stages is the first and an important step in the quantitative analysis of polysomnographic recordings. Sleep stage scoring relies heavily on visual pattern recognition by a human expert and is time consuming and subjective. Thus, there is a need for automatic classification. In this work we developed machine learning algorithms for sleep classification: random forest (RF) classification based on features and artificial neural networks (ANNs) working both with features and raw data. We tested our methods in healthy subjects and in patients. Most algorithms yielded good results comparable to human interrater agreement. Our study revealed that deep neural networks (DNNs) working with raw data performed better than feature-based methods. We also demonstrated that taking the local temporal structure of sleep into account a priori is important. Our results demonstrate the utility of neural network architectures for the classification of sleep.

Journal Article

More Is Less: Signal Processing and the Data Deluge

by Baraniuk, Richard G in Algorithms , Automatic Data Processing , Cameras

2011

The data deluge is changing the operating environment of many sensing systems from data-poor to data-rich--so data-rich that we are in jeopardy of being overwhelmed. Managing and exploiting the data deluge require a reinvention of sensor system design and signal processing theory. The potential pay-offs are huge, as the resulting sensor systems will enable radically new information technologies and powerful new tools for scientific discovery.

Journal Article

Cytoplasmic Volume Modulates Spindle Size During Embryogenesis

by Heald, Rebecca , Vahey, Michael D. , Good, Matthew C. in Animals , Availability , Cell Division

2013

Rapid and reductive cell divisions during embryogenesis require that intracellular structures adapt to a wide range of cell sizes. The mitotic spindle presents a central example of this flexibility, scaling with the dimensions of the cell to mediate accurate chromosome segregation. To determine whether spindle size regulation is achieved through a developmental program or is intrinsically specified by cell size or shape, we developed a system to encapsulate cytoplasm from Xenopus eggs and embryos inside cell-like compartments of defined sizes. Spindle size was observed to shrink with decreasing compartment size, similar to what occurs during early embryogenesis, and this scaling trend depended on compartment volume rather than shape. Thus, the amount of cytoplasmic material provides a mechanism for regulating the size of intracellular structures.

Journal Article
