51 result(s) for "annotated dataset"
On the Consistency of 360 Video Quality Assessment in Repeated Subjective Tests: A Pilot Study
Immersive media such as virtual reality, augmented reality, and 360° video have seen tremendous technological developments in recent years. Furthermore, advances in head-mounted displays (HMDs) offer users increased immersive experiences compared to conventional displays. To develop novel immersive media systems and services that satisfy the expectations of users, it is essential to conduct subjective tests revealing users’ perceived quality of immersive media. However, due to the new viewing dimensions provided by HMDs and the potential of interacting with the content, a wide range of subjective tests is required to understand the many aspects of user behavior in, and quality perception of, immersive media. The ground truth obtained by such subjective tests enables the development of optimized immersive media systems that fulfill the expectations of the users. This article focuses on the consistency of 360° video quality assessment to reveal whether users’ subjective quality assessment of such immersive visual stimuli changes fundamentally over time or remains consistent, with each user having their own behavior signature. A pilot study was conducted under pandemic conditions with participants given the task of rating the quality of 360° video stimuli on an HMD in standing and seated viewing. The choice of a pilot study is motivated by the high cognitive load that immersive media impose on participants and the need to keep the number of participants under pandemic conditions as low as possible. To gain insight into the consistency of the participants’ 360° video assessment over time, three sessions were held for each participant and each viewing condition, with long and short breaks between sessions. In particular, the opinion scores and head movements were recorded for each participant and each session in standing and seated viewing.
The statistical analysis of these data leads to the conjecture that the quality rating stays consistent throughout the sessions, with each participant having their own quality assessment signature. The head movements, which indicate the participants’ scene exploration during the quality assessment task, also remain consistent for each participant, according to their individual narrower or wider scene exploration signature. These findings are more pronounced for standing viewing than for seated viewing. This work supports the role of pilot studies as a useful approach for conducting pre-tests on immersive media quality under opportunity-limited conditions and for planning subsequent full subjective tests with a large panel of participants. The annotated RQA360 dataset containing the data recorded in the repeated subjective tests is made publicly available to the research community.
Performance Metrics for Multilabel Emotion Classification: Comparing Micro, Macro, and Weighted F1-Scores
This study compares various F1-score variants—micro, macro, and weighted—to assess their performance in evaluating text-based emotion classification. Lexicon distillation is employed using the multilabel emotion-annotated datasets XED and GoEmotions. The aim of this paper is to understand when each F1-score variant is better suited for evaluating text-based multilabel emotion classification. Unigram lexicons were derived from the annotated GoEmotions and XED datasets through a binary classification approach. The distilled lexicons were then applied to the GoEmotions and XED annotated datasets to calculate their emotional content, and the results were compared. The findings highlight the behavior of each F1-score variant under different class distributions, emphasizing the importance of appropriate metric selection for reliable model performance evaluation in imbalanced multilabel datasets. Additionally, this study investigates the effect of aggregating negative emotions into broader categories on said F1 metrics. The contribution of this study is to provide insights into how different F1-score variants could improve the reliability of multilabel emotion classifier evaluation, particularly in the context of the class imbalance present in the case of phishing emails.
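To make the three averaging schemes concrete, here is a minimal pure-Python sketch; the toy labels below are hypothetical and not taken from XED or GoEmotions:

```python
def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when all counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def f1_variants(y_true, y_pred):
    """Micro-, macro-, and weighted-averaged F1 for binary multilabel rows."""
    n_labels = len(y_true[0])
    per_label, supports = [], []
    tp_all = fp_all = fn_all = 0
    for j in range(n_labels):
        tp = sum(t[j] and p[j] for t, p in zip(y_true, y_pred))
        fp = sum(not t[j] and p[j] for t, p in zip(y_true, y_pred))
        fn = sum(t[j] and not p[j] for t, p in zip(y_true, y_pred))
        per_label.append(f1(tp, fp, fn))
        supports.append(sum(t[j] for t in y_true))
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    micro = f1(tp_all, fp_all, fn_all)                # pool counts, then F1
    macro = sum(per_label) / n_labels                 # unweighted label mean
    weighted = sum(f * s for f, s in zip(per_label, supports)) / sum(supports)
    return micro, macro, weighted

# Toy imbalanced multilabel data: label 0 is frequent, labels 1 and 2 are rare.
y_true = [[1, 0, 0], [1, 1, 0], [0, 0, 1], [1, 0, 0]]
y_pred = [[1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1]]
micro, macro, weighted = f1_variants(y_true, y_pred)
```

On this toy split, the rare and poorly predicted labels drag the macro score well below the micro score, which is exactly the imbalance sensitivity the study examines.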
Liver margin segmentation in abdominal CT images using U-Net and Detectron2: annotated dataset for deep learning models
The segmentation of liver margins in computed tomography (CT) images presents significant challenges due to the complex anatomical variability of the liver, with critical implications for medical diagnostics and treatment planning. In this study, we leverage a substantial dataset of over 4,200 abdominal CT images, meticulously annotated by expert radiologists from Taleghani Hospital in Kermanshah, Iran. Now made available to the research community, this dataset serves as a rich resource for enhancing and validating various neural network models. We employed two advanced deep neural network models, U-Net and Detectron2, for liver segmentation tasks. In terms of the Mask Intersection over Union (Mask IoU) metric, U-Net achieved a Mask IoU of 0.903, demonstrating high efficacy in simpler cases. In contrast, Detectron2 outperformed U-Net with a Mask IoU of 0.974, particularly excelling in accurately delineating liver boundaries in complex cases where the liver appears segmented into two distinct regions within the images. This highlights Detectron2’s advanced potential in handling anatomical variations that pose challenges for other models. Our findings not only provide a robust comparative analysis of these models but also establish a framework for further enhancements in medical imaging segmentation tasks. The initiative aims not just to refine liver margin detection but also to facilitate the development of automated systems for diagnosing liver diseases, with potential future applications extending these methodologies to other abdominal organs, potentially transforming the landscape of computational diagnostics in healthcare.
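The Mask IoU figures above are ratios of overlap to union between a predicted and a ground-truth mask; a generic sketch of the metric (not the authors’ evaluation code) looks like this:

```python
import numpy as np

def mask_iou(pred, gt):
    """Intersection-over-Union between two binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0  # two empty masks agree perfectly

# Toy 4x4 example: the prediction covers the ground truth plus one extra column.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[1:3, 1:3] = 1    # 4-pixel ground-truth square
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:4] = 1  # 6-pixel prediction; intersection is 4, union is 6
```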
Annotated Video Footage for Automated Identification and Counting of Fish in Unconstrained Seagrass Habitats
Automated monitoring using deep learning can reduce labor costs and increase efficiency, and has been shown to be equally or more accurate than humans at processing data (Torney et al., 2019; Ditria et al., 2020a). [...]the expansion of deep learning techniques in the last few years in marine science calls for higher volumes of training data than traditional machine learning methods. [...]there is a need for accessible, quality annotated datasets for deep learning models to further the progress of applying these techniques in ecology. The contributions of this dataset include: (1) a comprehensive dataset of ecologically important fish species that captures the complexity of backgrounds observed in unconstrained seagrass ecosystems to form a robust and flexible model; (2) a variety of modalities for rapid and flexible testing or comparison of different frameworks; and (3) unaltered imagery for investigation of possible data augmentation and performance enhancement using pre- and post-processing techniques. To continue the development of automated tools for fish monitoring, we report a dataset, “Annotated videos of luderick from estuaries in southeast Queensland, Australia”, which was used to train a deep learning algorithm for automated species identification and abundance counts presented in Ditria et al.
Annotated Dataset for Training Cloud Segmentation Neural Networks Using High-Resolution Satellite Remote Sensing Imagery
The integration of satellite data with deep learning has revolutionized various tasks in remote sensing, including classification, object detection, and semantic segmentation. Cloud segmentation in high-resolution satellite imagery is a critical application within this domain, yet progress in developing advanced algorithms for this task has been hindered by the scarcity of specialized datasets and annotation tools. This study addresses this challenge by introducing CloudLabel, a semi-automatic annotation technique leveraging region growing and morphological algorithms including flood fill, connected components, and guided filter. CloudLabel v1.0 streamlines the annotation process for high-resolution satellite images, thereby addressing the limitations of existing annotation platforms which are not specifically adapted to cloud segmentation, and enabling the efficient creation of high-quality cloud segmentation datasets. Notably, we have curated the Annotated Dataset for Training Cloud Segmentation (ADTCS) comprising 32,065 images (512 × 512) for cloud segmentation based on CloudLabel. The ADTCS dataset facilitates algorithmic advancement in cloud segmentation, characterized by uniform cloud coverage distribution and high image entropy (mainly 5–7). These features enable deep learning models to capture comprehensive cloud characteristics, enhancing recognition accuracy and reducing ground object misclassification. This contribution significantly advances remote sensing applications and cloud segmentation algorithms.
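As a toy illustration of the region-growing idea behind such semi-automatic annotation (a sketch of the general technique only, not CloudLabel’s actual implementation):

```python
from collections import deque

def region_grow(image, seed, tol):
    """Flood-fill region growing: collect 4-connected pixels whose intensity
    is within `tol` of the seed pixel's intensity."""
    rows, cols = len(image), len(image[0])
    base = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - base) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Bright "cloud" patch (values near 200) on a dark background (value 20).
img = [[20,  20,  20, 20],
       [20, 200, 210, 20],
       [20, 205, 198, 20],
       [20,  20,  20, 20]]
mask = region_grow(img, (1, 1), tol=15)  # grows over the four bright pixels
```

A real tool would follow this seed step with the morphological clean-up operations the abstract names (connected components, guided filtering) before exporting the mask.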
Mix-Lingual Relation Extraction: Dataset and a Training Approach
Relation extraction is a pivotal task within the field of natural language processing, boasting numerous real-world applications. Existing research predominantly centers on monolingual relation extraction or cross-lingual enhancement for relation extraction. However, there exists a notable gap in understanding relation extraction within mix-lingual (or code-switching) scenarios. In these scenarios, individuals blend content from different languages within sentences, generating mix-lingual content. The effectiveness of existing relation extraction models in such scenarios remains largely unexplored due to the absence of dedicated datasets. To address this gap, we introduce the Mix-Lingual Relation Extraction (MixRE) task and construct a human-annotated dataset MixRED to support this task. Additionally, we propose a hierarchical training approach for the mix-lingual scenario named Mix-Lingual Training (MixTrain), designed to enhance the performance of large language models (LLMs) when capturing relational dependencies from mix-lingual content spanning different semantic levels. Our experiments involve evaluating state-of-the-art supervised models and LLMs on the constructed dataset, with results indicating that MixTrain notably improves model performance. Moreover, we investigate the effectiveness of using mix-lingual content as a tool to transfer learned relational dependencies across different languages. Additionally, we delve into factors influencing model performance for both supervised models and LLMs in the novel MixRE task.
NewsSumm: The World’s Largest Human-Annotated Multi-Document News Summarization Dataset for Indian English
The rapid growth of digital journalism has heightened the need for reliable multi-document summarization (MDS) systems, particularly in underrepresented, low-resource, and culturally distinct contexts. However, current progress is hindered by a lack of large-scale, high-quality non-Western datasets. Existing benchmarks—such as CNN/DailyMail, XSum, and MultiNews—are limited by language, regional focus, or reliance on noisy, auto-generated summaries. We introduce NewsSumm, the largest human-annotated MDS dataset for Indian English, curated by over 14,000 expert annotators through the Suvidha Foundation. Spanning 36 Indian English newspapers from 2000 to 2025 and covering more than 20 topical categories, NewsSumm includes 317,498 articles paired with factually accurate, professionally written abstractive summaries. We detail its robust collection, annotation, and quality control pipelines, and present extensive statistical, linguistic, and temporal analyses that underscore its scale and diversity. To establish benchmarks, we evaluate PEGASUS, BART, and T5 models on NewsSumm, reporting aggregate and category-specific ROUGE scores, as well as factual consistency metrics. All NewsSumm dataset materials are openly released via Zenodo. NewsSumm offers a foundational resource for advancing research in summarization, factuality, timeline synthesis, and domain adaptation for Indian English and other low-resource language settings.
Deep Learning Methods for Ancient Arabic Handwritten Script Recognition: A Review of Challenges and Approaches
Recognition of ancient handwritten Arabic script (AHR) is made difficult by the fact that the script is written in cursive, spans different historical styles, and appears in manuscripts that are often damaged. In addition, ancient handwriting does not follow modern standards of spacing: words can be overly spaced out and diacritics overly complex, which makes the text extremely difficult to process. This irregularity causes ambiguity in character segmentation and word boundaries, increasing the error rate in automatic recognition systems. Even with modern advancements in deep learning through the use of CNNs, LSTMs, and hybrid models, AHR remains extremely complex and requires further exploration. Some recent models have achieved accuracy between 70% and 90% on modern Arabic datasets, but performance drops to 50%–75% when applied to ancient texts due to noise, script variation, and limited annotated data. This article consolidates the major issues and recent developments with regard to dataset constraints, preprocessing requirements, and machine learning methodologies. This review is based on the analysis of over 50 peer-reviewed papers published between 2016 and 2024. It also focuses on the importance of deep learning for image feature extraction by CNNs, sequential feature modeling by LSTMs, and hybrid combinations of both. For instance, CNN-LSTM architectures have shown promising results on historical scripts with limited training data. Given the scarcity of annotated data, the review concentrates on dataset augmentation and the creation of synthetic data. Techniques such as elastic distortions, GAN-generated samples, and noise injection are discussed as potential solutions. This work aims to improve the accuracy and scalability of AHR through analysis of existing techniques and identification of gaps for further research, to aid in the digitization and analysis of manuscripts and safeguard them as part of cultural heritage.
In particular, this review highlights the lack of standardized benchmarks and the need for multilingual ancient Arabic datasets to support reproducible research.
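Of the augmentation techniques the review lists, noise injection is the simplest to sketch; the following is a generic illustration (a hypothetical helper, not drawn from any reviewed paper):

```python
import random

def inject_noise(image, sigma=10.0, seed=None):
    """Additive Gaussian noise augmentation for a grayscale image given as
    rows of 0-255 intensities, clamped back to the valid range."""
    rng = random.Random(seed)
    return [[min(255, max(0, round(px + rng.gauss(0, sigma))))
             for px in row] for row in image]

# Augmenting a tiny 2x2 patch; a fixed seed makes the output reproducible.
img = [[100, 120], [140, 160]]
noisy = inject_noise(img, sigma=5.0, seed=42)
```

Each synthetic variant generated this way can be added to the training set so a recognizer sees more degradation patterns than the scarce annotated manuscripts alone provide.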
A Review of Generative Adversarial Networks for Computer Vision Tasks
In recent years, computer vision tasks have gained a lot of popularity, accompanied by the development of numerous powerful architectures consistently delivering outstanding results when applied to well-annotated datasets. However, acquiring a high-quality dataset remains a challenge, particularly in sensitive domains like medical imaging, where expense and ethical concerns compound the difficulty. Generative adversarial networks (GANs) offer a possible solution to artificially expand datasets, providing a practical resource for applications requiring large and diverse data. This work presents a thorough review and comparative analysis of the most promising GAN architectures. This review is intended to serve as a valuable reference for selecting the most suitable architecture for diverse projects, diminishing the challenges posed by limited and constrained datasets. Furthermore, we conducted practical experiments focusing on the augmentation of a medical dataset derived from a colonoscopy video. We also applied one of the GAN architectures outlined in our work to a dataset consisting of histopathology images. The goal was to illustrate how GANs can enhance and augment datasets, showcasing their potential to improve overall data quality. Through this research, we aim to contribute to the broader understanding and application of GANs in scenarios where dataset scarcity poses a significant obstacle, particularly in medical imaging applications.
A Computational Approach to Hand Pose Recognition in Early Modern Paintings
Hands represent an important aspect of pictorial narration but have rarely been addressed as an object of study in art history and digital humanities. Although hand gestures play a significant role in conveying emotions, narratives, and cultural symbolism in the context of visual art, a comprehensive terminology for the classification of depicted hand poses is still lacking. In this article, we present the process of creating a new annotated dataset of pictorial hand poses. The dataset is based on a collection of European early modern paintings, from which hands are extracted using human pose estimation (HPE) methods. The hand images are then manually annotated based on art historical categorization schemes. From this categorization, we introduce a new classification task and perform a series of experiments using different types of features, including our newly introduced 2D hand keypoint features, as well as existing neural network-based features. This classification task represents a new and complex challenge due to the subtle and contextually dependent differences between depicted hands. The presented computational approach to hand pose recognition in paintings represents an initial attempt to tackle this challenge, which could potentially advance the use of HPE methods on paintings, as well as foster new research on the understanding of hand gestures in art.