Catalogue Search | MBRL
256 result(s) for "multimodal database"
A review on multimodal medical image fusion: Compendious analysis of medical modalities, multimodal databases, fusion techniques and quality metrics
by Gandomi, Amir H.; Rehman, Eid; Azam, Muhammad Adeel
in Algorithms; Benchmarking; Business metrics
2022
Over the past two decades, medical imaging has been extensively applied to diagnose diseases. Medical experts continue to have difficulty diagnosing diseases with a single modality owing to the limited information it provides. Image fusion may be used to merge images of specific organs with diseases from a variety of medical imaging systems. Anatomical and physiological data may be included in multi-modality image fusion, making diagnosis simpler. It is a difficult challenge to find the best multimodal medical database with fusion quality evaluation for assessing recommended image fusion methods. As a result, this article provides a complete overview of multimodal medical image fusion methodologies, databases, and quality measurements.
In this article, a compendious review of different medical imaging modalities and an evaluation of related multimodal databases, along with statistical results, are provided. The medical imaging modalities are organized based on radiation, visible-light imaging, microscopy, and multimodal imaging.
Medical imaging acquisition is categorized into invasive or non-invasive techniques. The fusion techniques are classified into six main categories: frequency fusion, spatial fusion, decision-level fusion, deep learning, hybrid fusion, and sparse representation fusion. In addition, the diseases associated with each modality and fusion approach are presented. The fusion quality assessment metrics are also encapsulated in this article.
This survey provides a baseline guideline for medical experts in this technical domain who may combine preoperative, intraoperative, and postoperative imaging, apply multi-sensor fusion for disease detection, and address related tasks. The advantages and drawbacks of the current literature are discussed, and future insights are provided accordingly.
Journal Article
The USC CreativeIT database of multimodal dyadic interactions: from speech and full body motion capture to continuous emotional annotations
2016
Improvised acting is a viable technique to study expressive human communication and to shed light into actors' creativity. The USC CreativeIT database provides a novel, freely-available multimodal resource for the study of theatrical improvisation and rich expressive human behavior (speech and body language) in dyadic interactions. The theoretical design of the database is based on the well-established improvisation technique of Active Analysis in order to provide naturally induced affective and expressive, goal-driven interactions. This database contains dyadic theatrical improvisations performed by 16 actors, providing detailed full body motion capture data and audio data of each participant in an interaction. The carefully engineered data collection, the improvisation design to elicit natural emotions and expressive speech and body language, as well as the well-developed annotation processes provide a gateway to study and model various aspects of theatrical performance, expressive behaviors and human communication and interaction.
Journal Article
Human Action Recognition from Motion Capture Data based on Curve Matching
2025
Human action recognition can be used in a wide variety of scenarios in many areas of human activity, such as medicine, public safety, gaming and entertainment. In this paper, we focus on the problem of human action recognition based on data obtained using motion capture systems. To solve this problem, we use an approach based on the transition of the original motion capture data to sequences of points in a lower-dimensional subspace and the subsequent classification of human actions by matching the trajectories of points in that subspace. In particular, to match the trajectories, we explore the Fréchet distance and the Dynamic Time Warping distance in both dependent and independent forms. To form the lower-dimensional subspace, we consider two well-proven approaches: the Principal Component Analysis technique and a supervised feature selection procedure. We compare the obtained results with alternative techniques using the open Berkeley Human Action Database.
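The trajectory-matching step this abstract describes can be illustrated with a short sketch. Below is a minimal, dependency-free implementation of the (dependent) Dynamic Time Warping distance between two point trajectories; the trajectories are hypothetical toy data, not drawn from the Berkeley Human Action Database.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dtw(traj_a, traj_b):
    """Classic O(n*m) dependent DTW distance between two point sequences."""
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    # cost[i][j] = DTW distance between traj_a[:i] and traj_b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Two toy 2-D trajectories with identical shape, one time-shifted:
a = [(0, 0), (1, 1), (2, 2), (3, 3)]
b = [(0, 0), (0, 0), (1, 1), (2, 2), (3, 3)]
print(dtw(a, b))  # 0.0: DTW absorbs the time shift
```

Unlike a pointwise Euclidean comparison, DTW tolerates differences in execution speed between two performances of the same action, which is why it is a natural fit for motion-capture curves.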
Journal Article
HiMotion: a new research resource for the study of behavior, cognition, and emotion
2014
The HiMotion research project was designed to create a multimodal database and several support tools for the study of human behavior, cognition and emotion, in the context of computer-based tasks designed to elicit cognitive load and specialized affective responses. The database includes both human-computer interaction (HCI) and psychophysiological data, collected through an experimental setup that we devised for synchronized recording of keyboard, mouse, and central/peripheral nervous system measurements. Currently we provide a battery of five different cognitive tasks, and a video bank for affective elicitation, together with a set of introductory and self-reporting screens. We have conducted two experiments, one involving a population of 27 subjects, which followed the cognitive tasks protocol, and another involving a population of 20 subjects, which followed the video bank visualization protocol. We provide an overview of several studies that have used the HiMotion database to test multiple hypotheses in the behavioral and affective domains, highlighting the usefulness of our contribution.
Journal Article
Multimodal E-Textbook Development for the Course of Intercultural Communication of National Image
2022
This study investigates e-textbook development for the course of intercultural communication of national image for English majors and learners in the context of integrating ideological and curriculum education in the Chinese mainland. Under the framework of Fairclough's three-dimensional discourse analysis and glocalization in intercultural communication, the study proposes an e-textbook development workflow involving text design, discursive database construction, and social investigation, and explores the unit design strategies for the course, paying special attention to integrating ideological elements properly into intercultural communication studies in each unit. Following the authenticity principle and the presentation-practice-production (P-P-P) model, the study constructs an e-textbook system featuring unit design with contents and modules that are both grounded in linguistic theories and oriented toward intercultural communication. This e-textbook will contribute to the cultivation of locally grounded, globally minded intercultural communicators of national image.
Journal Article
Assessment of Clinical Metadata on the Accuracy of Retinal Fundus Image Labels in Diabetic Retinopathy in Uganda: Case-Crossover Study Using the Multimodal Database of Retinal Images in Africa
2024
Labeling color fundus photos (CFP) is an important step in the development of artificial intelligence screening algorithms for the detection of diabetic retinopathy (DR). Most studies use the International Classification of Diabetic Retinopathy (ICDR) to assign labels to CFP, plus the presence or absence of macular edema (ME). Images can be grouped as referable or nonreferable according to these classifications. There is little guidance in the literature about how to collect and use metadata as a part of the CFP labeling process.
This study aimed to improve the quality of the Multimodal Database of Retinal Images in Africa (MoDRIA) by determining whether the availability of metadata during the image labeling process influences the accuracy, sensitivity, and specificity of image labels. MoDRIA was developed as one of the inaugural research projects of the Mbarara University Data Science Research Hub, part of the Data Science for Health Discovery and Innovation in Africa (DS-I Africa) initiative.
This is a crossover assessment with 2 groups and 2 phases. Each group had 10 randomly assigned labelers who provided an ICDR score and the presence or absence of ME for each of the 50 CFP in a test set, with and without metadata including blood pressure, visual acuity, glucose, and medical history. Specificity and sensitivity of referable retinopathy were based on ICDR scores, and ME was calculated using a 2-sided t test. The comparison of sensitivity and specificity for ICDR scores and ME with and without metadata for each participant was calculated using the Wilcoxon signed rank test. Statistical significance was set at P<.05.
The sensitivity for identifying referable DR with metadata was 92.8% (95% CI 87.6-98.0) compared with 93.3% (95% CI 87.6-98.9) without metadata, and the specificity was 84.9% (95% CI 75.1-94.6) with metadata compared with 88.2% (95% CI 79.5-96.8) without metadata. The sensitivity for identifying the presence of ME was 64.3% (95% CI 57.6-71.0) with metadata, compared with 63.1% (95% CI 53.4-73.0) without metadata, and the specificity was 86.5% (95% CI 81.4-91.5) with metadata compared with 87.7% (95% CI 83.9-91.5) without metadata. The sensitivity and specificity of the ICDR score and the presence or absence of ME were calculated for each labeler with and without metadata. No findings were statistically significant.
The sensitivity and specificity scores for the detection of referable DR were slightly better without metadata, but the difference was not statistically significant. We cannot make definitive conclusions about the impact of metadata on the sensitivity and specificity of image labels in our study. Given the importance of metadata in clinical situations, we believe that metadata may benefit labeling quality. A more rigorous study to determine the sensitivity and specificity of CFP labels with and without metadata is recommended.
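The label-accuracy measures this study compares (sensitivity and specificity of referable-DR labels against a reference grading) reduce to simple confusion-matrix arithmetic. A minimal sketch with hypothetical boolean labels, not actual MoDRIA data:

```python
def sensitivity_specificity(labels, truth):
    """Sensitivity and specificity of binary labels vs. ground truth.

    True = referable, False = nonreferable (hypothetical encoding).
    """
    tp = sum(1 for l, t in zip(labels, truth) if l and t)
    tn = sum(1 for l, t in zip(labels, truth) if not l and not t)
    fp = sum(1 for l, t in zip(labels, truth) if l and not t)
    fn = sum(1 for l, t in zip(labels, truth) if not l and t)
    sens = tp / (tp + fn)  # true positive rate: referable cases caught
    spec = tn / (tn + fp)  # true negative rate: nonreferable cases passed
    return sens, spec

# Toy example: 8 images, one missed referable case, one false alarm.
truth  = [True, True, True, True, False, False, False, False]
labels = [True, True, True, False, False, False, False, True]
sens, spec = sensitivity_specificity(labels, truth)
print(sens, spec)  # 0.75 0.75
```

With one such (sensitivity, specificity) pair per labeler under each condition, the paired with/without-metadata comparison the study describes can then be run with a Wilcoxon signed rank test.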
Journal Article
WAUC: A Multi-Modal Database for Mental Workload Assessment Under Physical Activity
by Lafond, Daniel; Tiwari, Abhishek; Falk, Tiago H.
in ambulant subjects; Data collection; Datasets
2020
Assessment of mental workload is crucial for applications that require sustained attention and where conditions such as mental fatigue and drowsiness must be avoided. Previous work that attempted to devise objective methods to model mental workload was mainly based on neurological or physiological data collected while the participants performed tasks that did not involve physical activity. While such models may be useful for scenarios that involve static operators, they may not apply in real-world situations where operators are performing tasks under varying levels of physical activity, such as those faced by first responders, firefighters, and police officers. Here, we describe WAUC, a multimodal database of mental Workload Assessment Under physical aCtivity. The study involved 48 participants who performed the NASA Revised Multi-Attribute Task Battery II under three different activity level conditions. Physical activity was manipulated by changing the speed of a stationary bike or a treadmill. During data collection, six neural and physiological modalities were recorded, namely: electroencephalography, electrocardiography, breathing rate, skin temperature, galvanic skin response, and blood volume pulse, in addition to 3-axis accelerometry. Moreover, participants were asked to answer the NASA Task Load Index questionnaire after each experimental session, as well as rate their physical fatigue level on the Borg fatigue scale. In order to bring our experimental setup closer to real-world situations, all signals were monitored using wearable, off-the-shelf devices. In this paper, we describe the adopted experimental protocol, as well as validate the subjective, neural, and physiological data collected. The WAUC database, including the raw data and features, subjective ratings, and scripts to reproduce the experiments reported herein, will be made available at http://musaelab.ca/resources/.
Journal Article
Moving-Talker, Speaker-Independent Feature Study, and Baseline Results Using the CUAVE Multimodal Speech Corpus
by Gurbuz, Sabri; Gowdy, John N.; Tufekci, Zekeriya
in Audio data; audio-visual speech recognition; Background noise
2002
Strides in computer technology and the search for deeper, more powerful techniques in signal processing have brought multimodal research to the forefront in recent years. Audio-visual speech processing has become an important part of this research because it holds great potential for overcoming certain problems of traditional audio-only methods. Difficulties due to background noise and multiple speakers in an application environment are significantly reduced by the additional information provided by visual features. This paper presents information on a new audio-visual database, a feature study on moving speakers, and baseline results for the whole speaker group. Although a few databases have been collected in this area, none has emerged as a standard for comparison. Also, efforts to date have often been limited, focusing on cropped video or stationary speakers. This paper seeks to introduce a challenging audio-visual database that is flexible and fairly comprehensive, yet easily available to researchers on one DVD. The Clemson University Audio-Visual Experiments (CUAVE) database is a speaker-independent corpus of both connected and continuous digit strings totaling over 7000 utterances. It contains a wide variety of speakers and is designed to meet several goals discussed in this paper. One of these goals is to allow testing of adverse conditions such as moving talkers and speaker pairs. A feature study of connected digit strings is also discussed. It compares stationary and moving talkers in a speaker-independent grouping. An image-processing-based contour technique, an image transform method, and a deformable template scheme are used in this comparison to obtain visual features. This paper also presents methods and results in an attempt to make these techniques more robust to speaker movement. Finally, initial baseline speaker-independent results are included using all speakers, and conclusions as well as suggested areas of research are given.
Journal Article
An insight into multimodal databases for social signal processing: acquisition, efforts, and directions
2014
The importance of context-aware computing in understanding social signals gave rise to a new emerging domain called social signal processing (SSP). SSP depends heavily on the existence of comprehensive multimodal databases containing the descriptors of social context and behaviors, such as the situational environment and the roles and gender of human participants. In recent papers, the SSP community has emphasized that current research lacks adequate data, in large part because the acquisition and annotation of large multimodal datasets are time- and resource-consuming for researchers. This paper aims to collect the existing work in this scope and to deliver the key aspects and clear directions for managing multimodal behavior data. It reviews some of the existing databases, gives their important characteristics, and outlines the most important tools and methods used in capturing and managing social behavior signals. Summarizing the relevant findings, it also addresses the existing issues and proposes fundamental topics that need to be investigated in future research. Erratum DOI: 10.1007/s10462-014-9424-4
Journal Article
Project, toolkit, and database of neuroinformatics ecosystem: A summary of previous studies on “Frontiers in Neuroinformatics”
2022
In the field of neuroscience, the core of a cohort study project consists in the collection, analysis, and sharing of multi-modal data. Recent years have witnessed a host of efficient, high-quality toolkits published and employed to improve the quality of multi-modal data in cohort studies. In turn, gleaning answers to relevant questions from such a huge collection of papers is a time-consuming task for cohort researchers. As part of our efforts to tackle this problem, we propose a hierarchical neuroscience knowledge base that consists of projects/organizations, multi-modal databases, and toolkits, so as to facilitate researchers' answer-searching process. We first classified the topics of the articles in Frontiers in Neuroinformatics according to the multi-modal data life cycle, and from these articles extracted such information objects as projects/organizations, multi-modal databases, and toolkits. Then, we mapped these information objects into our proposed knowledge base framework. A Python-based query tool has also been developed in tandem for quicker access to the knowledge base (accessible at https://github.com/Romantic-Pumpkin/PDT_fninf). Finally, based on the constructed knowledge base, we discussed some key research issues and underlying trends in different stages of the multi-modal data life cycle.
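The hierarchy this abstract describes (projects/organizations, multi-modal databases, toolkits) can be sketched as a small queryable structure. The entries, field names, and query helper below are hypothetical illustrations and do not reflect the actual schema or API of the PDT_fninf tool.

```python
# Hypothetical three-level knowledge base: projects reference databases,
# databases reference toolkits. Entry contents are invented examples.
KB = {
    "projects": [
        {"name": "ExampleCohortProject", "databases": ["ExampleMRIDB"]},
    ],
    "databases": [
        {"name": "ExampleMRIDB", "modalities": ["MRI", "EEG"],
         "toolkits": ["ExampleQCKit"]},
    ],
    "toolkits": [
        {"name": "ExampleQCKit", "stage": "quality control"},
    ],
}

def query(category, **filters):
    """Return names of entries in a category matching every filter value.

    A filter matches if the value is contained in a list-valued field
    or equals a scalar-valued field.
    """
    hits = []
    for entry in KB.get(category, []):
        if all(value == entry.get(field) or value in entry.get(field, ())
               for field, value in filters.items()):
            hits.append(entry["name"])
    return hits

print(query("databases", modalities="MRI"))         # ['ExampleMRIDB']
print(query("toolkits", stage="quality control"))   # ['ExampleQCKit']
```

Organizing the extracted information objects by level like this is what lets a researcher move from a life-cycle stage question ("which toolkits cover quality control?") directly to concrete resources.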
Journal Article