256 results for "Multimodal databases"
A review on multimodal medical image fusion: Compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics
Over the past two decades, medical imaging has been extensively applied to diagnose diseases. Medical experts continue to have difficulty diagnosing diseases with a single modality owing to the limited information it provides. Image fusion may be used to merge images of diseased organs from a variety of medical imaging systems. Multi-modality image fusion can incorporate both anatomical and physiological data, making diagnosis simpler. Finding the best multimodal medical database, together with fusion quality evaluation for assessing proposed image fusion methods, remains a difficult challenge. This article therefore provides a complete overview of multimodal medical image fusion methodologies, databases, and quality metrics. It offers a compendious review of different medical imaging modalities and an evaluation of the related multimodal databases, along with statistical results. The imaging modalities are organized by radiation, visible-light imaging, microscopy, and multimodal imaging, and image acquisition is categorized into invasive and non-invasive techniques. The fusion techniques are classified into six main categories: frequency fusion, spatial fusion, decision-level fusion, deep learning, hybrid fusion, and sparse representation fusion. In addition, the diseases associated with each modality and fusion approach are presented, and the fusion quality assessment metrics are summarized. The survey provides a baseline guideline for medical experts in this domain who may combine preoperative, intraoperative, and postoperative imaging, or apply multi-sensor fusion for disease detection. The advantages and drawbacks of the current literature are discussed, and future insights are provided accordingly.
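As a concrete illustration of the frequency-domain fusion category listed above, the sketch below fuses two co-registered grayscale images in the wavelet domain with PyWavelets, averaging the approximation band and taking the larger-magnitude detail coefficients. The rule, wavelet choice, and random stand-in images are assumptions for illustration, not the survey's recommended method.

```python
import numpy as np
import pywt  # PyWavelets

def wavelet_fusion(img_a, img_b, wavelet="db2", level=2):
    """Fuse two co-registered grayscale images in the wavelet domain.

    Approximation coefficients are averaged; detail coefficients keep the
    value with the larger magnitude. This is one common frequency-domain
    fusion rule, shown only as an illustration.
    """
    ca = pywt.wavedec2(img_a, wavelet, level=level)
    cb = pywt.wavedec2(img_b, wavelet, level=level)

    fused = [(ca[0] + cb[0]) / 2.0]  # low-frequency band: average
    for bands_a, bands_b in zip(ca[1:], cb[1:]):
        fused.append(tuple(np.where(np.abs(x) >= np.abs(y), x, y)
                           for x, y in zip(bands_a, bands_b)))
    return pywt.waverec2(fused, wavelet)

# Random stand-ins for, e.g., a co-registered CT and MRI slice.
ct, mri = np.random.rand(256, 256), np.random.rand(256, 256)
fused = wavelet_fusion(ct, mri)
```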
Assessment of Clinical Metadata on the Accuracy of Retinal Fundus Image Labels in Diabetic Retinopathy in Uganda: Case-Crossover Study Using the Multimodal Database of Retinal Images in Africa
Labeling color fundus photos (CFP) is an important step in the development of artificial intelligence screening algorithms for the detection of diabetic retinopathy (DR). Most studies use the International Classification of Diabetic Retinopathy (ICDR) to assign labels to CFP, plus the presence or absence of macular edema (ME). Images can be grouped as referable or nonreferable according to these classifications. There is little guidance in the literature about how to collect and use metadata as part of the CFP labeling process. This study aimed to improve the quality of the Multimodal Database of Retinal Images in Africa (MoDRIA) by determining whether the availability of metadata during the image labeling process influences the accuracy, sensitivity, and specificity of image labels. MoDRIA was developed as one of the inaugural research projects of the Mbarara University Data Science Research Hub, part of the Data Science for Health Discovery and Innovation in Africa (DS-I Africa) initiative. This is a crossover assessment with 2 groups and 2 phases. Each group had 10 randomly assigned labelers who provided an ICDR score and the presence or absence of ME for each of the 50 CFP in a test set, with and without metadata including blood pressure, visual acuity, glucose, and medical history. Sensitivity and specificity of referable retinopathy, based on ICDR scores and ME, were calculated using a 2-sided t test. The sensitivity and specificity for ICDR scores and ME with and without metadata were compared for each participant using the Wilcoxon signed-rank test. Statistical significance was set at P<.05. The sensitivity for identifying referable DR with metadata was 92.8% (95% CI 87.6-98.0) compared with 93.3% (95% CI 87.6-98.9) without metadata, and the specificity was 84.9% (95% CI 75.1-94.6) with metadata compared with 88.2% (95% CI 79.5-96.8) without metadata. The sensitivity for identifying the presence of ME was 64.3% (95% CI 57.6-71.0) with metadata compared with 63.1% (95% CI 53.4-73.0) without metadata, and the specificity was 86.5% (95% CI 81.4-91.5) with metadata compared with 87.7% (95% CI 83.9-91.5) without metadata. The sensitivity and specificity of the ICDR score and the presence or absence of ME were calculated for each labeler with and without metadata; no findings were statistically significant. The sensitivity and specificity scores for the detection of referable DR were slightly better without metadata, but the difference was not statistically significant. We cannot make definitive conclusions about the impact of metadata on the sensitivity and specificity of image labels in our study. Given the importance of metadata in clinical situations, we believe that metadata may benefit labeling quality. A more rigorous study to determine the sensitivity and specificity of CFP labels with and without metadata is recommended.
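The per-labeler comparison described above comes down to computing sensitivity and specificity against a reference grade and then applying a paired Wilcoxon signed-rank test. A minimal sketch with scipy is shown below; all numbers are invented placeholders, not MoDRIA data.

```python
import numpy as np
from scipy.stats import wilcoxon

def sens_spec(pred, truth):
    """Sensitivity and specificity of binary labels (1 = referable DR)."""
    pred, truth = np.asarray(pred), np.asarray(truth)
    tp = np.sum((pred == 1) & (truth == 1))
    tn = np.sum((pred == 0) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    fp = np.sum((pred == 1) & (truth == 0))
    return tp / (tp + fn), tn / (tn + fp)

# One labeler's referable/non-referable calls vs. the reference grade.
labels = [1, 1, 0, 1, 0, 0, 1, 0, 1, 0]
truth  = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]
print(sens_spec(labels, truth))          # (0.8, 0.8)

# Paired per-labeler sensitivities with and without metadata (placeholders).
sens_with    = [0.93, 0.90, 0.95, 0.88, 0.92, 0.94, 0.91, 0.93, 0.96, 0.89]
sens_without = [0.94, 0.91, 0.94, 0.90, 0.93, 0.93, 0.92, 0.94, 0.95, 0.90]
stat, p = wilcoxon(sens_with, sens_without)
print(f"Wilcoxon signed-rank: statistic={stat:.2f}, p={p:.3f}")
```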
An insight into multimodal databases for social signal processing: acquisition, efforts, and directions
The importance of context-aware computing in understanding social signals gave rise to a new emerging domain, called social signal processing (SSP). SSP depends heavily on the existence of comprehensive multimodal databases containing descriptors of social context and behaviors, such as the situational environment and the roles and gender of human participants. In recent papers, the SSP community has emphasized how current research lacks adequate data, largely because the acquisition and annotation of large multimodal datasets are time- and resource-consuming for researchers. This paper aims to collect the existing work in this scope and to deliver the key aspects and clear directions for managing multimodal behavior data. It reviews some of the existing databases, gives their important characteristics, and outlines the most important tools and methods used in capturing and managing social behavior signals. Summarizing the relevant findings, it also addresses the existing issues and proposes fundamental topics that need to be investigated in future research. Erratum DOI: 10.1007/s10462-014-9424-4
The USC CreativeIT database of multimodal dyadic interactions: from speech and full body motion capture to continuous emotional annotations
Improvised acting is a viable technique to study expressive human communication and to shed light on actors' creativity. The USC CreativeIT database provides a novel, freely available multimodal resource for the study of theatrical improvisation and rich, expressive human behavior (speech and body language) in dyadic interactions. The theoretical design of the database is based on the well-established improvisation technique of Active Analysis in order to provide naturally induced affective, expressive, goal-driven interactions. The database contains dyadic theatrical improvisations performed by 16 actors, providing detailed full-body motion capture data and audio data for each participant in an interaction. The carefully engineered data collection, the improvisation design that elicits natural emotions and expressive speech and body language, and the well-developed annotation processes provide a gateway to study and model various aspects of theatrical performance, expressive behavior, and human communication and interaction.
Human Action Recognition from Motion Capture Data based on Curve Matching
Human action recognition can be used in a wide variety of scenarios in many areas of human activity, such as medicine, public safety, gaming and entertainment, etc. In this paper, we focus on the problem of human action recognition based on data obtained using motion capture systems. To solve this problem, we use an approach based on transforming the original motion capture data into sequences of points in a lower-dimensional subspace and then classifying human actions by matching the trajectories of points in that subspace. In particular, to match the trajectories, we explore the Fréchet distance and the Dynamic Time Warping distance in both dependent and independent forms. To form the lower-dimensional subspace, we consider two well-proven approaches: the Principal Component Analysis technique and a supervised feature selection procedure. We compare the obtained results with alternative techniques using the open Berkeley Human Action Database.
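The pipeline outlined above can be sketched in a few lines: project per-frame motion-capture vectors into a low-dimensional subspace with PCA, then classify a query sequence by its dynamic-time-warping distance to labeled reference trajectories. The dependent (multivariate) DTW below and the toy data are illustrative assumptions; the paper's exact features, distances, and classifier may differ.

```python
import numpy as np
from sklearn.decomposition import PCA

def dtw_distance(a, b):
    """Dependent multivariate DTW between trajectories of shape (T, d)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify(query, references, labels, pca):
    """1-nearest-neighbour action label by DTW distance in PCA space."""
    q = pca.transform(query)
    dists = [dtw_distance(q, pca.transform(r)) for r in references]
    return labels[int(np.argmin(dists))]

# Toy data: sequences of 30 frames x 60 joint coordinates.
rng = np.random.default_rng(0)
refs = [rng.normal(size=(30, 60)) for _ in range(4)]
labels = ["walk", "jump", "sit", "wave"]
pca = PCA(n_components=3).fit(np.vstack(refs))
print(classify(rng.normal(size=(30, 60)), refs, labels, pca))
```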
Multimodal E-Textbook Development for the Course of Intercultural Communication of National Image
This study investigates e-textbook development for the course of intercultural communication of national image for English majors and learners, in the context of integrating ideological and curriculum education in the Chinese mainland. Within the framework of Fairclough's three-dimensional discourse analysis and glocalization in intercultural communication, the study proposes an e-textbook development workflow involving text design, discursive database construction, and social investigation, and explores unit design strategies for the course, paying special attention to integrating ideological elements properly into the intercultural communication studies in each unit. Following the authenticity principle and the presentation-practice-production (P-P-P) model, the study constructs an e-textbook system featuring unit designs whose contents and modules are both grounded in linguistic theories and oriented toward intercultural communication. This e-textbook will contribute to cultivating locally grounded, globally minded intercultural communicators of national image.
HiMotion: a new research resource for the study of behavior, cognition, and emotion
The HiMotion research project was designed to create a multimodal database and several support tools for the study of human behavior, cognition, and emotion in the context of computer-based tasks designed to elicit cognitive load and specialized affective responses. The database includes both human-computer interaction (HCI) and psychophysiological data, collected through an experimental setup that we devised for synchronized recording of keyboard, mouse, and central/peripheral nervous system measurements. Currently we provide a battery of five different cognitive tasks and a video bank for affective elicitation, together with a set of introductory and self-reporting screens. We have conducted two experiments: one involving a population of 27 subjects, which followed the cognitive tasks protocol, and another involving a population of 20 subjects, which followed the video bank visualization protocol. We provide an overview of several studies that have used the HiMotion database to test multiple hypotheses in the behavioral and affective domains, highlighting the usefulness of our contribution.
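The synchronized recording mentioned above ultimately amounts to aligning streams sampled at different rates onto a common clock. A small sketch with pandas merge_asof is given below, attaching the most recent physiological sample to each HCI event; the column names, signals, and sampling rates are assumptions for illustration, not the HiMotion schema.

```python
import numpy as np
import pandas as pd

# Hypothetical streams: HCI events at irregular times, ECG sampled at 100 Hz.
events = pd.DataFrame({
    "t": pd.to_timedelta([0.12, 0.57, 1.31, 2.04], unit="s"),
    "event": ["key_a", "mouse_click", "key_b", "key_enter"],
})
ecg = pd.DataFrame({
    "t": pd.to_timedelta(np.arange(0, 3, 0.01), unit="s"),
    "ecg": np.random.randn(300),
})

# Attach to each HCI event the most recent physiological sample.
aligned = pd.merge_asof(events.sort_values("t"), ecg.sort_values("t"),
                        on="t", direction="backward")
print(aligned)
```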
Multi-modal emotion recognition using EEG and speech signals
Automatic Emotion Recognition (AER) is critical for naturalistic Human–Machine Interaction (HMI). Emotions can be detected through both external behaviors (e.g., tone of voice) and internal physiological signals (e.g., the electroencephalogram, EEG). In this paper, we first constructed a multi-modal emotion database, named the Multi-modal Emotion Database with four modalities (MED4). MED4 consists of synchronously recorded signals of participants' EEG, photoplethysmography, speech, and facial images while they were influenced by video stimuli designed to induce happy, sad, angry, and neutral emotions. The experiment was performed with 32 participants in two environmental conditions: a research lab with natural noise and an anechoic chamber. Four baseline algorithms were developed to verify the database and the performance of AER methods: Identification-vector + Probabilistic Linear Discriminant Analysis (I-vector + PLDA), Temporal Convolutional Network (TCN), Extreme Learning Machine (ELM), and Multi-Layer Perceptron (MLP). Furthermore, two fusion strategies, on the feature level and the decision level respectively, were designed to utilize both external and internal information about human status. The results showed that EEG signals yield higher emotion recognition accuracy than speech signals (88.92% in the anechoic room and 89.70% in the naturally noisy room vs. 64.67% and 58.92%, respectively). Fusion strategies that combine speech and EEG signals improve overall recognition accuracy by 25.92% compared to speech alone and 1.67% compared to EEG alone in the anechoic room, and by 31.74% and 0.96% in the naturally noisy room. Fusion methods also enhance the robustness of AER in noisy environments. The MED4 database will be made publicly available to encourage researchers worldwide to develop and validate advanced methods for AER.
• A database, MED4, with multiple modalities and environments, was constructed under controlled conditions for four types of emotions.
• EEG data of variable length were tested and shown to be more efficient for emotion recognition.
• Compared to speech signals, EEG signals contain more emotional information and are more robust to noise.
• The decision-level fusion method WAVER outperforms the other methods in both accuracy and noise robustness for emotion recognition.
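As a concrete picture of the decision-level strategy above, the sketch below takes per-modality class probabilities from two independent classifiers and combines them with a weighted average before choosing the emotion with the highest fused score. The weights and probabilities are placeholders; this generic rule is not necessarily identical to the paper's WAVER method.

```python
import numpy as np

EMOTIONS = ["happy", "sad", "angry", "neutral"]

def decision_level_fusion(p_eeg, p_speech, w_eeg=0.7, w_speech=0.3):
    """Weighted average of per-modality class probabilities.

    The weights are illustrative; in practice they would be tuned on a
    validation split (EEG would typically receive more weight, given its
    higher single-modality accuracy reported above).
    """
    fused = w_eeg * np.asarray(p_eeg) + w_speech * np.asarray(p_speech)
    return EMOTIONS[int(np.argmax(fused))], fused

# Hypothetical classifier outputs for one trial.
p_eeg    = [0.10, 0.55, 0.25, 0.10]
p_speech = [0.20, 0.30, 0.40, 0.10]
print(decision_level_fusion(p_eeg, p_speech))   # ('sad', ...)
```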
Isabl Platform, a digital biobank for processing multimodal patient data
Background: The widespread adoption of high-throughput technologies has democratized data generation. However, processing data in accordance with best practices remains challenging, and the data capital often becomes siloed. This presents an opportunity to consolidate data assets into digital biobanks: ecosystems of readily accessible, structured, and annotated datasets that can be dynamically queried and analysed. Results: We present Isabl, a customizable plug-and-play platform for the processing of multimodal patient-centric data. Isabl's architecture consists of a relational database (Isabl DB), a command line client (Isabl CLI), a RESTful API (Isabl API), and a frontend web application (Isabl Web). Isabl supports automated deployment of user-validated pipelines across the entire data capital. A full audit trail is maintained to secure data provenance and governance and to ensure reproducibility of findings. Conclusions: As a digital biobank, Isabl supports continuous data utilization and automated meta-analyses at scale, and serves as a catalyst for research innovation, new discoveries, and clinical translation.
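The abstract describes a RESTful API in front of a relational database; the snippet below is only a generic illustration of how patient-centric records in such a platform might be queried over HTTP with the requests library. The host, endpoint, parameters, and field names are hypothetical and do not describe Isabl's actual API; the official documentation should be consulted for the real interface.

```python
import requests

# Hypothetical digital-biobank endpoint, for illustration only.
BASE_URL = "https://biobank.example.org/api/v1"

resp = requests.get(
    f"{BASE_URL}/analyses",
    params={"patient": "P-0001", "status": "SUCCEEDED"},
    headers={"Authorization": "Token <redacted>"},
    timeout=30,
)
resp.raise_for_status()
for analysis in resp.json().get("results", []):
    print(analysis.get("pk"), analysis.get("application"))
```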
Moving-Talker, Speaker-Independent Feature Study, and Baseline Results Using the CUAVE Multimodal Speech Corpus
Strides in computer technology and the search for deeper, more powerful techniques in signal processing have brought multimodal research to the forefront in recent years. Audio-visual speech processing has become an important part of this research because it holds great potential for overcoming certain problems of traditional audio-only methods. Difficulties due to background noise and multiple speakers in an application environment are significantly reduced by the additional information provided by visual features. This paper presents information on a new audio-visual database, a feature study on moving speakers, and baseline results for the whole speaker group. Although a few databases have been collected in this area, none has emerged as a standard for comparison, and efforts to date have often been limited, focusing on cropped video or stationary speakers. This paper seeks to introduce a challenging audio-visual database that is flexible and fairly comprehensive, yet easily available to researchers on one DVD. The Clemson University Audio-Visual Experiments (CUAVE) database is a speaker-independent corpus of both connected and continuous digit strings totaling over 7000 utterances. It contains a wide variety of speakers and is designed to meet several goals discussed in this paper. One of these goals is to allow testing of adverse conditions such as moving talkers and speaker pairs. A feature study of connected digit strings is also discussed, comparing stationary and moving talkers in a speaker-independent grouping. An image-processing-based contour technique, an image transform method, and a deformable template scheme are used in this comparison to obtain visual features. The paper also presents methods and results aimed at making these techniques more robust to speaker movement. Finally, initial baseline speaker-independent results are included using all speakers, and conclusions as well as suggested areas of research are given.
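Image-transform visual features of the kind mentioned above are commonly taken as the low-frequency 2-D DCT coefficients of a mouth region of interest and concatenated with acoustic features such as MFCCs. The sketch below shows that generic construction with scipy and librosa; the ROI size, coefficient count, and MFCC settings are assumptions, not the CUAVE baseline's exact features.

```python
import numpy as np
from scipy.fft import dctn
import librosa

def visual_features(mouth_roi, keep=6):
    """Low-frequency 2-D DCT coefficients of a grayscale mouth ROI."""
    coeffs = dctn(mouth_roi, norm="ortho")
    return coeffs[:keep, :keep].ravel()

def audio_features(wav, sr, n_mfcc=13):
    """Mean MFCC vector over the utterance."""
    mfcc = librosa.feature.mfcc(y=wav, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# Toy stand-ins: a 32x32 mouth crop and 1 s of audio at 16 kHz.
roi = np.random.rand(32, 32)
wav = np.random.randn(16000).astype(np.float32)
av_vector = np.concatenate([audio_features(wav, 16000), visual_features(roi)])
print(av_vector.shape)  # 13 MFCCs + 36 DCT coefficients = (49,)
```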