3,056 results for "Multimodal data"
Table Tennis Tutor: Forehand Strokes Classification Based on Multimodal Data and Neural Networks
Beginner table-tennis players require constant real-time feedback while learning the fundamental techniques. However, due to constraints such as the mentor's inability to be present at all times and the cost of sensors and equipment for sports training, beginners are unable to get the immediate real-time feedback they need during training. Sensors have been widely used to train beginners and novices in various skills, including psychomotor skills. Sensors enable the collection of multimodal data which can be utilised with machine learning to classify training mistakes, give feedback, and further improve learning outcomes. In this paper, we introduce the Table Tennis Tutor (T3), a multi-sensor system consisting of a smartphone with its built-in sensors for collecting motion data and a Microsoft Kinect for tracking body position. We focused on forehand stroke mistake detection. We collected a dataset recording an experienced table tennis player performing 260 short forehand strokes (correct) and mimicking 250 long forehand strokes (mistake). We analysed and annotated the multimodal data for training a recurrent neural network that classifies correct and incorrect strokes. To investigate the accuracy level of the aforementioned sensors, three combinations were validated in this study: smartphone sensors only, the Kinect only, and both devices combined. The results show that smartphone sensors alone perform worse than the Kinect, but achieve similar accuracy with better precision when combined with the Kinect. To further strengthen T3's potential for training, an expert interview session was held virtually with a table tennis coach to investigate the coach's perception of having a real-time feedback system to assist beginners during training sessions. The outcome of the interview shows positive expectations and provided further input that can inform future implementations of T3.
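A minimal sketch of the kind of recurrent classifier the abstract describes, assuming each stroke arrives as a fixed-length window of fused smartphone IMU and Kinect joint features. The feature dimension, hidden size, and window length are illustrative assumptions, not values from the paper.

```python
import torch
import torch.nn as nn

class StrokeClassifier(nn.Module):
    """GRU over a multimodal sensor window, followed by a 2-class head."""
    def __init__(self, n_features=24, hidden=64, n_classes=2):
        super().__init__()
        self.rnn = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):              # x: (batch, time, n_features)
        _, h = self.rnn(x)             # h: (1, batch, hidden) final hidden state
        return self.head(h[-1])        # logits: (batch, n_classes)

# Toy usage: 8 strokes, 100 time steps, 24 fused sensor channels.
model = StrokeClassifier()
logits = model(torch.randn(8, 100, 24))
pred = logits.argmax(dim=1)            # 0 = correct (short), 1 = mistake (long)
```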
SAIN: Search-And-INfer, a Mathematical and Computational Framework for Personalised Multimodal Data Modelling with Applications in Healthcare
Personalised modelling has become dominant in personalised medicine and precision health. It creates a computational model for an individual based on large repositories of existing personalised data, aiming to achieve the best possible personal diagnosis or prognosis and derive an informative explanation for it. Current methods still work on a single data modality or treat all modalities with the same method. The proposed method, SAIN (Search-And-INfer), offers better results and an informative explanation for classification and prediction tasks on a new multimodal object (sample) using a database of similar multimodal objects. The method is based on different distance measures suited to each data modality and introduces a new formula that aggregates all modalities into a single vector distance measure to find the objects closest to a new one, which are then used for probabilistic inference. This paper describes SAIN and applies it to two types of multimodal data, cardiovascular diagnosis and EEG time series, modelled by integrating modalities such as numbers, categories, images, and time series, and using a software implementation of SAIN.
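A rough sketch of the general retrieve-then-infer idea: a distance per modality computed with a measure suited to that modality, a weighted aggregation into one score, and class probabilities derived from the nearest neighbours. The particular distance functions and weights are illustrative assumptions, not the authors' formula.

```python
import numpy as np

def aggregate_distance(query, obj, weights):
    """Combine per-modality distances into a single score (weights are assumed, not SAIN's)."""
    d_num = np.linalg.norm(query["numeric"] - obj["numeric"])       # Euclidean for numeric features
    d_cat = np.mean(query["categorical"] != obj["categorical"])     # Hamming for categorical features
    d_ts = np.linalg.norm(query["timeseries"] - obj["timeseries"])  # placeholder time-series distance
    return weights[0] * d_num + weights[1] * d_cat + weights[2] * d_ts

def knn_class_probability(query, database, labels, weights, k=5):
    """labels: integer class labels aligned with the database objects."""
    dists = np.array([aggregate_distance(query, obj, weights) for obj in database])
    nearest = np.argsort(dists)[:k]
    return np.bincount(labels[nearest], minlength=2) / k            # class probabilities from the k closest objects
```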
Multimodal Data Fusion in Learning Analytics: A Systematic Review
Multimodal learning analytics (MMLA), which has become increasingly popular, can help provide an accurate understanding of learning processes. However, it is still unclear how multimodal data are integrated in MMLA. Following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, this paper systematically surveys 346 articles on MMLA published during the past three years. For this purpose, we first present a conceptual model for reviewing these articles along three dimensions: data types, learning indicators, and data fusion. Based on this model, we then answer the following questions: (1) what types of data and learning indicators are used in MMLA, and how do they relate to each other; and (2) how can the data fusion methods in MMLA be classified? Finally, we point out the key stages in data fusion and future research directions in MMLA. Our main findings from this review are: (a) the data in MMLA are classified into digital data, physical data, physiological data, psychometric data, and environment data; (b) the learning indicators are behavior, cognition, emotion, collaboration, and engagement; (c) the relationships between multimodal data and learning indicators are one-to-one, one-to-any, and many-to-one, and these complex relationships are the key for data fusion; (d) the main data fusion methods in MMLA are many-to-one, many-to-many, and multiple validations among multimodal data; and (e) multimodal data fusion can be characterized by the multimodality of data, the multi-dimensionality of indicators, and the diversity of methods.
Hierarchical feature representation and multimodal fusion with deep learning for AD/MCI diagnosis
For the last decade, it has been shown that neuroimaging can be a potential tool for the diagnosis of Alzheimer's Disease (AD) and its prodromal stage, Mild Cognitive Impairment (MCI), and that fusion of different modalities can further provide complementary information to enhance diagnostic accuracy. Here, we focus on the problems of both feature representation and fusion of multimodal information from Magnetic Resonance Imaging (MRI) and Positron Emission Tomography (PET). To the best of our knowledge, previous methods in the literature mostly used hand-crafted features such as cortical thickness or gray matter densities from MRI, or voxel intensities from PET, and then combined these multimodal features by simply concatenating them into a long vector or transforming them into a higher-dimensional kernel space. In this paper, we propose a novel method for a high-level latent and shared feature representation from neuroimaging modalities via deep learning. Specifically, we use a Deep Boltzmann Machine (DBM; note that DBM here is not related to Deformation Based Morphometry), a deep network with a restricted Boltzmann machine as a building block, to find a latent hierarchical feature representation from a 3D patch, and then devise a systematic method for a joint feature representation from the paired patches of MRI and PET with a multimodal DBM. To validate the effectiveness of the proposed method, we performed experiments on the ADNI dataset and compared with state-of-the-art methods. In three binary classification problems of AD vs. healthy Normal Control (NC), MCI vs. NC, and MCI converter vs. MCI non-converter, we obtained maximal accuracies of 95.35%, 85.67%, and 74.58%, respectively, outperforming the competing methods. By visual inspection of the trained model, we observed that the proposed method could hierarchically discover the complex latent patterns inherent in both MRI and PET.
• A novel method for a high-level latent feature representation from neuroimaging data
• A systematic method for joint feature representation of multimodal neuroimaging data
• Hierarchical patch-level information fusion via an ensemble classifier
• Maximal diagnostic accuracies of 93.52% (AD vs. NC), 85.19% (MCI vs. NC), and 74.58% (MCI converter vs. MCI non-converter)
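A hedged sketch of the building-block idea only: learning a latent representation from paired, flattened patch vectors with a single restricted Boltzmann machine, here via scikit-learn's BernoulliRBM. The paper's full model is a multimodal Deep Boltzmann Machine over MRI/PET patches; this one-layer toy does not reproduce it, and the patch size, component count, and random data are assumptions.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM

rng = np.random.default_rng(0)
mri_patches = rng.random((500, 11 * 11 * 11))   # toy stand-in for flattened 3D MRI patches in [0, 1]
pet_patches = rng.random((500, 11 * 11 * 11))   # toy stand-in for the paired PET patches

# Concatenate the paired patches and learn one shared latent layer.
joint = np.hstack([mri_patches, pet_patches])
rbm = BernoulliRBM(n_components=128, learning_rate=0.01, n_iter=20, random_state=0)
latent = rbm.fit_transform(joint)                # (500, 128) hidden-unit activations
```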
A review on multimodal medical image fusion: Compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics
Over the past two decades, medical imaging has been extensively applied to diagnose diseases. Medical experts continue to have difficulty diagnosing diseases with a single modality owing to the limited information it provides. Image fusion may be used to merge images of specific organs with diseases from a variety of medical imaging systems. Anatomical and physiological data may be included in multi-modality image fusion, making diagnosis simpler. It is a difficult challenge to find the best multimodal medical database, with fusion quality evaluation, for assessing recommended image fusion methods. As a result, this article provides a complete overview of multimodal medical image fusion methodologies, databases, and quality measurements. A compendious review of different medical imaging modalities and an evaluation of related multimodal databases, along with statistical results, is provided. The medical imaging modalities are organized based on radiation, visible-light imaging, microscopy, and multimodal imaging. Medical image acquisition is categorized into invasive or non-invasive techniques. The fusion techniques are classified into six main categories: frequency fusion, spatial fusion, decision-level fusion, deep learning, hybrid fusion, and sparse representation fusion. In addition, the associated diseases for each modality and fusion approach are presented. The fusion quality assessment metrics are also encapsulated in this article. This survey provides a baseline guideline for medical experts in this technical domain who may combine preoperative, intraoperative, and postoperative imaging, multi-sensor fusion for disease detection, etc. The advantages and drawbacks of the current literature are discussed, and future insights are provided accordingly.
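A toy illustration of the simplest category named in the review, spatial (pixel-level) fusion: a weighted average of two co-registered modality images. The other categories in the survey (frequency-domain, decision-level, deep learning, sparse representation) are far more elaborate, and the weight and random images here are arbitrary assumptions.

```python
import numpy as np

def weighted_spatial_fusion(img_a, img_b, alpha=0.5):
    """Pixel-wise weighted average of two co-registered, intensity-normalised images."""
    assert img_a.shape == img_b.shape, "images must already be registered to the same grid"
    return alpha * img_a + (1.0 - alpha) * img_b

mri = np.random.rand(256, 256)   # stand-in for a normalised MRI slice
pet = np.random.rand(256, 256)   # stand-in for a co-registered PET slice
fused = weighted_spatial_fusion(mri, pet, alpha=0.6)
```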
Multimodal Sensing for Depression Risk Detection: Integrating Audio, Video, and Text Data
Depression is a major psychological disorder with a growing impact worldwide. Traditional methods for detecting the risk of depression, predominantly reliant on psychiatric evaluations and self-assessment questionnaires, are often criticized for their inefficiency and lack of objectivity. Advancements in deep learning have paved the way for innovations in depression risk detection methods that fuse multimodal data. This paper introduces a novel framework, the Audio, Video, and Text Fusion-Three Branch Network (AVTF-TBN), designed to amalgamate auditory, visual, and textual cues for a comprehensive analysis of depression risk. Our approach encompasses three dedicated branches—Audio Branch, Video Branch, and Text Branch—each responsible for extracting salient features from the corresponding modality. These features are subsequently fused through a multimodal fusion (MMF) module, yielding a robust feature vector that feeds into a predictive modeling layer. To further our research, we devised an emotion elicitation paradigm based on two distinct tasks—reading and interviewing—implemented to gather a rich, sensor-based depression risk detection dataset. The sensory equipment, such as cameras, captures subtle facial expressions and vocal characteristics essential for our analysis. The research thoroughly investigates the data generated by varying emotional stimuli and evaluates the contribution of different tasks to emotion evocation. During the experiment, the AVTF-TBN model has the best performance when the data from the two tasks are simultaneously used for detection, where the F1 Score is 0.78, Precision is 0.76, and Recall is 0.81. Our experimental results confirm the validity of the paradigm and demonstrate the efficacy of the AVTF-TBN model in detecting depression risk, showcasing the crucial role of sensor-based data in mental health detection.
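A minimal sketch of a three-branch fusion architecture in the spirit of AVTF-TBN: one encoder per modality, concatenation as a simple stand-in for the paper's multimodal fusion (MMF) module, and a prediction head. All layer sizes and the audio/video/text feature dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ThreeBranchFusion(nn.Module):
    """Audio, video, and text branches fused by concatenation before a 2-class head."""
    def __init__(self, d_audio=40, d_video=128, d_text=768, d_hidden=64):
        super().__init__()
        self.audio_branch = nn.Sequential(nn.Linear(d_audio, d_hidden), nn.ReLU())
        self.video_branch = nn.Sequential(nn.Linear(d_video, d_hidden), nn.ReLU())
        self.text_branch = nn.Sequential(nn.Linear(d_text, d_hidden), nn.ReLU())
        self.head = nn.Linear(3 * d_hidden, 2)     # depression risk vs. no risk

    def forward(self, audio, video, text):
        fused = torch.cat([self.audio_branch(audio),
                           self.video_branch(video),
                           self.text_branch(text)], dim=-1)
        return self.head(fused)

# Toy usage with a batch of 4 pre-extracted feature vectors per modality.
model = ThreeBranchFusion()
logits = model(torch.randn(4, 40), torch.randn(4, 128), torch.randn(4, 768))
```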
Bayesian fusion and multimodal DCM for EEG and fMRI
This paper asks whether integrating multimodal EEG and fMRI data offers a better characterisation of functional brain architectures than either modality alone. This evaluation rests upon a dynamic causal model that generates both EEG and fMRI data from the same neuronal dynamics. We introduce the use of Bayesian fusion to provide informative (empirical) neuronal priors – derived from dynamic causal modelling (DCM) of EEG data – for subsequent DCM of fMRI data. To illustrate this procedure, we generated synthetic EEG and fMRI timeseries for a mismatch negativity (or auditory oddball) paradigm, using biologically plausible model parameters (i.e., posterior expectations from a DCM of empirical, open access, EEG data). Using model inversion, we found that Bayesian fusion provided a substantial improvement in marginal likelihood or model evidence, indicating a more efficient estimation of model parameters, in relation to inverting fMRI data alone. We quantified the benefits of multimodal fusion with the information gain pertaining to neuronal and haemodynamic parameters – as measured by the Kullback-Leibler divergence between their prior and posterior densities. Remarkably, this analysis suggested that EEG data can improve estimates of haemodynamic parameters; thereby furnishing proof-of-principle that Bayesian fusion of EEG and fMRI is necessary to resolve conditional dependencies between neuronal and haemodynamic estimators. These results suggest that Bayesian fusion may offer a useful approach that exploits the complementary temporal (EEG) and spatial (fMRI) precision of different data modalities. We envisage the procedure could be applied to any multimodal dataset that can be explained by a DCM with a common neuronal parameterisation.
• Multimodal DCM shows how the same neuronal activity causes multiple measurements.
• Bayesian fusion of EEG/fMRI resolves conditional dependencies between parameters.
• Information gain quantifies the added benefits of multimodal Bayesian fusion.
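A small sketch of the information-gain measure mentioned in the abstract: the Kullback-Leibler divergence between a Gaussian prior and a Gaussian posterior over a parameter vector. The toy means and covariances are arbitrary, and DCM itself (the neuronal and haemodynamic models, and their inversion) is not reproduced here.

```python
import numpy as np

def gaussian_kl(mu_post, cov_post, mu_prior, cov_prior):
    """KL( N(mu_post, cov_post) || N(mu_prior, cov_prior) ) in nats."""
    k = mu_prior.size
    inv_prior = np.linalg.inv(cov_prior)
    diff = mu_prior - mu_post
    return 0.5 * (np.trace(inv_prior @ cov_post)
                  + diff @ inv_prior @ diff
                  - k
                  + np.log(np.linalg.det(cov_prior) / np.linalg.det(cov_post)))

# Toy example: a 3-parameter model whose posterior has shrunk relative to the prior.
prior_mu, prior_cov = np.zeros(3), np.eye(3)
post_mu, post_cov = np.array([0.4, -0.2, 0.1]), 0.25 * np.eye(3)
info_gain = gaussian_kl(post_mu, post_cov, prior_mu, prior_cov)   # larger value = more information gained
```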
Detecting Emotions through Electrodermal Activity in Learning Contexts: A Systematic Review
There is a strong increase in the use of devices that measure physiological arousal through electrodermal activity (EDA). Although there is a long tradition of studying emotions during learning, researchers have only recently started to use EDA to measure emotions in the context of education and learning. This systematic review aimed to provide insight into how EDA is currently used in these settings. The review aimed to investigate the methodological aspects of EDA measures in educational research and synthesize existing empirical evidence on the relation of physiological arousal, as measured by EDA, with learning outcomes and learning processes. The methodological results pointed to considerable variation in the usage of EDA in educational research and indicated that few implicit standards exist. Results regarding learning revealed inconsistent associations between physiological arousal and learning outcomes, which seem mainly due to underlying methodological differences. Furthermore, EDA frequently fluctuated during different stages of the learning process. Compared to this unimodal approach, multimodal designs provide the potential to better understand these fluctuations at critical moments. Overall, this review signals a clear need for explicit guidelines and standards for EDA processing in educational research in order to build a more profound understanding of the role of physiological arousal during learning.
Isabl Platform, a digital biobank for processing multimodal patient data
Background: The widespread adoption of high-throughput technologies has democratized data generation. However, data processing in accordance with best practices remains challenging and the data capital often becomes siloed. This presents an opportunity to consolidate data assets into digital biobanks—ecosystems of readily accessible, structured, and annotated datasets that can be dynamically queried and analysed. Results: We present Isabl, a customizable plug-and-play platform for the processing of multimodal patient-centric data. Isabl's architecture consists of a relational database (Isabl DB), a command line client (Isabl CLI), a RESTful API (Isabl API) and a frontend web application (Isabl Web). Isabl supports automated deployment of user-validated pipelines across the entire data capital. A full audit trail is maintained to secure data provenance and governance and to ensure reproducibility of findings. Conclusions: As a digital biobank, Isabl supports continuous data utilization and automated meta-analyses at scale, and serves as a catalyst for research innovation, new discoveries, and clinical translation.
Phase-specific multimodal biomarkers enable explainable assessment of upper limb dysfunction in chronic stroke
Background: Objective and precise assessment of upper limb dysfunction post-stroke is critical for guiding rehabilitation. While promising, current methods using wearable sensors and machine learning (ML) often lack interpretability and neglect the underlying, phase-specific kinetic deficits (e.g., muscle forces and joint torques) within functional tasks. This study aimed to develop and validate an explainable assessment framework that leverages musculoskeletal kinetic modeling to extract phase-specific, multimodal (kinematic and kinetic) biomarkers to assess upper limb dysfunction in chronic stroke. Methods: Sixty-five adults with chronic stroke and 20 healthy controls performed a standardized hand-to-mouth (HTM) task. Stroke participants were allocated to a model-development cohort (n = 47) and an independent test cohort (n = 18). Using IMU and sEMG data, we employed musculoskeletal modeling to extract phase-specific kinematic (e.g., inter-joint coordination, trunk displacement) and kinetic (e.g., mechanical work, smoothness, co-contraction index) biomarkers from four task phases. A Lasso regression model was trained to predict FMA-UL scores, validated via 5-fold cross-validation and the independent test cohort. Explainable AI (SHAP) was used to identify key predictive features. Results: Compared with controls, patients showed phase-specific alterations including greater trunk displacement and reduced inter-joint coordination and mechanical work (all p < 0.05). The Lasso model achieved strong performance in internal validation (R2 = 0.932; MAE = 0.799) and generalized well to the independent test cohort (R2 = 0.881; MAE = 0.954). SHAP identified trunk displacement in phase 2 (TD_2), elbow–shoulder coordination in phase 3 (IC_elb_elv_3), and trunk displacement in phase 3 (TD_3) as dominant predictors; larger trunk displacement contributed negatively to predicted FMA-UL scores. Conclusion: Integrating phase-specific multimodal biomarkers with explainable ML yields an interpretable assessment of upper-limb dysfunction. By highlighting phase-specific kinetic and kinematic targets (e.g., trunk compensation and inter-joint coordination), the framework supports individualized, precision rehabilitation.
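A hedged sketch of the modelling step only: Lasso regression predicting a clinical score from phase-specific biomarker features, with 5-fold cross-validation. The data are synthetic, the feature count is an assumption, and scikit-learn's permutation importance is used as a simple stand-in for the paper's SHAP analysis.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import cross_val_score
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(47, 12))                      # 47 patients x 12 phase-specific biomarkers (toy data)
y = 30 + 5 * X[:, 0] - 3 * X[:, 3] + rng.normal(scale=1.0, size=47)   # synthetic FMA-UL-like score

model = LassoCV(cv=5).fit(X, y)                    # regularisation strength chosen by 5-fold CV
r2_scores = cross_val_score(model, X, y, cv=5, scoring="r2")

# Rank features by permutation importance (stand-in for SHAP attribution).
imp = permutation_importance(model, X, y, n_repeats=20, random_state=0)
top_features = np.argsort(imp.importances_mean)[::-1][:3]   # indices of the most influential biomarkers
```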