30 result(s) for "Thong, William"
Automatic Labeling of Vertebral Levels Using a Robust Template-Based Approach
Context. MRI of the spinal cord provides a variety of biomarkers sensitive to white matter integrity and neuronal function. Current processing methods are based on manual labeling of vertebral levels, which is time consuming and prone to user bias. Although several methods for automatic labeling have been published, they are not robust towards image contrast or towards susceptibility-related artifacts. Methods. Intervertebral disks are detected from the 3D analysis of the intensity profile along the spine. The robustness of the disk detection is improved by using a template of vertebral distance, which was generated from a training dataset. The developed method has been validated using T1- and T2-weighted contrasts in ten healthy subjects and one patient with spinal cord injury. Results. Accuracy of vertebral labeling was 100%. Mean absolute error was 2.1 ± 1.7 mm for T2-weighted images and 2.3 ± 1.6 mm for T1-weighted images. The vertebrae of the spinal cord injured patient were correctly labeled, despite the presence of artifacts caused by metallic implants. Discussion. We proposed a template-based method for robust labeling of vertebral levels along the whole spinal cord for T1- and T2-weighted contrasts. The method is freely available as part of the spinal cord toolbox.
Object Priors for Classifying and Localizing Unseen Actions
This work strives for the classification and localization of human actions in videos, without the need for any labeled video training examples. Where existing work relies on transferring global attribute or object information from seen to unseen action videos, we seek to classify and spatio-temporally localize unseen actions in videos from image-based object information only. We propose three spatial object priors, which encode local person and object detectors along with their spatial relations. On top of these, we introduce three semantic object priors, which extend semantic matching through word embeddings with three simple functions that tackle semantic ambiguity, object discrimination, and object naming. A video embedding combines the spatial and semantic object priors. It enables us to introduce a new video retrieval task that retrieves action tubes in video collections based on user-specified objects, spatial relations, and object size. Experimental evaluation on five action datasets shows the importance of spatial and semantic object priors for unseen actions. We find that persons and objects have preferred spatial relations that benefit unseen action localization, while using multiple languages and simple object filtering directly improves semantic matching, leading to state-of-the-art results for both unseen action classification and localization.
Three-dimensional morphology study of surgical adolescent idiopathic scoliosis patient from encoded geometric models
Purpose. The classification of three-dimensional (3D) spinal deformities remains an open question in adolescent idiopathic scoliosis. Recent studies have investigated pattern classification based on explicit clinical parameters. An emerging trend, however, seeks to simplify complex spine geometries and capture the predominant modes of variability of the deformation. The objective of this study is to perform a 3D characterization and morphology analysis of the thoracic and thoraco/lumbar scoliotic spines (cross-sectional study). The presence of subgroups within all Lenke types will be investigated by analyzing a simplified representation of the geometric 3D reconstruction of a patient’s spine, with the aim of establishing the basis for a new classification approach based on a machine learning algorithm. Methods. Three-dimensional reconstructions of coronal and sagittal standing radiographs of 663 patients, for a total of 915 visits, covering all types of deformities in adolescent idiopathic scoliosis (single, double and triple curves) and reviewed by the 3D Classification Committee of the Scoliosis Research Society, were analyzed using a machine learning algorithm based on stacked auto-encoders. The codes produced for each 3D reconstruction were then grouped together using an unsupervised clustering method. For each identified cluster, Cobb angle and orientation of the plane of maximum curvature in the thoracic and lumbar curves, axial rotation of the apical vertebrae, kyphosis (T4–T12), lordosis (L1–S1) and pelvic incidence were obtained. No assumptions were made regarding grouping tendencies in the data, nor was the number of clusters predefined. Results. Eleven groups were revealed from the 915 visits, wherein the location of the main curve, kyphosis and lordosis were the three major discriminating factors with slight overlap between groups. Two main groups emerge among the eleven different clusters of patients: a first with small thoracic deformities and large lumbar deformities, and a second with large thoracic deformities and small lumbar curvature. The main factor that allowed identifying eleven distinct subgroups within the surgical patients (major curves) from Lenke type-1 to type-6 curves was the location of the apical vertebra as identified by the planes of maximum curvature obtained in both thoracic and thoraco/lumbar segments. Both hypokyphotic and hyperkyphotic clusters were primarily composed of Lenke 1–4 curve type patients, while a hyperlordotic cluster was composed of Lenke 5 and 6 curve type patients. Conclusion. The stacked auto-encoder analysis technique helped to simplify the complex nature of 3D spine models, while preserving the intrinsic properties that are typically measured with explicit parameters derived from the 3D reconstruction.
Apprentissage de représentations pour la classification d'images biomédicales
The growing accessibility of medical imaging provides new clinical applications for patient care. New clinically relevant features can now be discovered to understand, describe and represent a disease. Traditional algorithms based on hand-engineered features usually fail in biomedical applications because they cannot capture the high variability in the data. Representation learning, often called deep learning, tackles this challenge by learning multiple levels of representation. The hypothesis of this master’s thesis is that representation learning for biomedical image classification will yield additional information for the physician's decision-making process. Therefore, the main objective is to assess the feasibility of representation learning for two different biomedical applications in order to learn clinically relevant structures within the data. First, an unsupervised learning algorithm extracts discriminant features to describe spine deformities that require a surgical intervention in patients with adolescent idiopathic scoliosis. The sub-objective is to propose an alternative to existing scoliosis classifications that only characterize spine deformities in 2D, whereas a scoliotic spine is often deformed in 3D. 915 spine reconstructions from 663 patients were collected. Stacked auto-encoders learn a hidden representation of these reconstructions. This low-dimensional representation disentangles the main factors of variation in the geometrical appearance of spinal deformities. Sub-groups are clustered with the k-means++ algorithm. Eleven statistically significant sub-groups are extracted to explain how the different deformations of a scoliotic spine are distributed. Secondly, a supervised learning algorithm extracts discriminant features in medical images. The sub-objective is to classify every voxel in the image in order to produce kidney segmentations. 79 contrast-enhanced CT scans from 63 patients with renal complications were collected. A convolutional network is trained on a patch-based training scheme. Simple modifications to the architecture of the network, without modifying the parameters, compute the kidney segmentations on the whole image in a small amount of time. Results show high scores on the metrics used to assess the segmentations. Dice scores are 94.35% for the left kidney and 93.07% for the right kidney. The results show new perspectives for the diseases addressed in this master’s thesis. Representation learning algorithms exhibit new opportunities for an application in other biomedical tasks as long as enough observations are available.
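For readers unfamiliar with the Dice score reported in this abstract, the following is a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), applied to two binary masks. The function name `dice_score`, the flat-list mask representation, and the toy masks are assumptions made for illustration; they are not taken from the thesis.

```python
def dice_score(pred, truth):
    """Dice coefficient between two binary masks given as flat sequences of 0/1.

    Hypothetical helper for illustration; not the thesis's evaluation code.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy masks (made up): 3 predicted voxels, 3 true voxels, 2 shared.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
score = dice_score(pred, truth)  # 2 * 2 / (3 + 3)
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground voxel.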
Beyond Skin Tone: A Multidimensional Measure of Apparent Skin Color
This paper strives to measure apparent skin color in computer vision, beyond a unidimensional scale on skin tone. In their seminal paper Gender Shades, Buolamwini and Gebru have shown how gender classification systems can be biased against women with darker skin tones. Subsequently, fairness researchers and practitioners have adopted the Fitzpatrick skin type classification as a common measure to assess skin color bias in computer vision systems. While effective, the Fitzpatrick scale only focuses on the skin tone ranging from light to dark. Towards a more comprehensive measure of skin color, we introduce the hue angle ranging from red to yellow. When applied to images, the hue dimension reveals additional biases related to skin color in both computer vision datasets and models. We then recommend multidimensional skin color scales, relying on both skin tone and hue, for fairness assessments.
Query by Activity Video in the Wild
This paper focuses on activity retrieval from a video query in an imbalanced scenario. In current query-by-activity-video literature, a common assumption is that all activities have sufficient labelled examples when learning an embedding. In practice, however, this assumption does not hold, as only a portion of activities have many examples, while other activities are described by only a few examples. In this paper, we propose a visual-semantic embedding network that explicitly deals with the imbalanced scenario for activity retrieval. Our network contains two novel modules. The visual alignment module performs a global alignment between the input video and fixed-sized visual bank representations for all activities. The semantic module performs an alignment between the input video and fixed-sized semantic activity representations. By matching videos with both visual and semantic activity representations that are of equal size over all activities, we no longer ignore infrequent activities during retrieval. Experiments on a new imbalanced activity retrieval benchmark show the effectiveness of our approach for all types of activities.
Augmented Datasheets for Speech Datasets and Ethical Decision-Making
Speech datasets are crucial for training Speech Language Technologies (SLT); however, the lack of diversity of the underlying training data can lead to serious limitations in building equitable and robust SLT products, especially along dimensions of language, accent, dialect, variety, and speech impairment - and the intersectionality of speech features with socioeconomic and demographic features. Furthermore, there is often a lack of oversight on the underlying training data - commonly built on massive web-crawling and/or publicly available speech - with regard to the ethics of such data collection. To encourage standardized documentation of such speech data components, we introduce an augmented datasheet for speech datasets, which can be used in addition to "Datasheets for Datasets". We then exemplify the importance of each question in our augmented datasheet based on in-depth literature reviews of speech data used in domains such as machine learning, linguistics, and health. Finally, we encourage practitioners - ranging from dataset creators to researchers - to use our augmented datasheet to better define the scope, properties, and limits of speech datasets, while also encouraging consideration of data-subject protection and user community empowerment. Ethical dataset creation is not a one-size-fits-all process, but dataset creators can use our augmented datasheet to reflexively consider the social context of related SLT applications and data sources in order to foster more inclusive SLT products downstream.
Feature and Label Embedding Spaces Matter in Addressing Image Classifier Bias
This paper strives to address image classifier bias, with a focus on both feature and label embedding spaces. Previous works have shown that spurious correlations from protected attributes, such as age, gender, or skin tone, can cause adverse decisions. To balance potential harms, there is a growing need to identify and mitigate image classifier bias. First, we identify in the feature space a bias direction. We compute class prototypes of each protected attribute value for every class, and reveal an existing subspace that captures the maximum variance of the bias. Second, we mitigate biases by mapping image inputs to label embedding spaces. Each value of the protected attribute has its projection head where classes are embedded through a latent vector representation rather than a common one-hot encoding. Once trained, we further reduce in the feature space the bias effect by removing its direction. Evaluation on biased image datasets, for multi-class, multi-label and binary classifications, shows the effectiveness of tackling both feature and label embedding spaces in improving the fairness of the classifier predictions, while preserving classification performance.
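The step of removing a bias direction from the feature space can be illustrated with a simple orthogonal projection. This is a minimal sketch under stated assumptions, not the paper's implementation: the direction here is simply the difference between two made-up group prototypes, whereas the abstract describes a subspace capturing the maximum variance of the bias. The names `remove_direction`, `proto_a`, and `proto_b` are hypothetical.

```python
import math


def remove_direction(feature, direction):
    """Project `feature` onto the subspace orthogonal to `direction`.

    Hypothetical helper for illustration; computes x - (x . u) u, where u is
    the unit vector along `direction`.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    # Component of the feature that lies along the bias direction.
    coeff = sum(f * u for f, u in zip(feature, unit))
    return [f - coeff * u for f, u in zip(feature, unit)]


# Assumption: a bias direction taken as the difference of two group prototypes.
proto_a = [2.0, 1.0]  # made-up prototype for one protected attribute value
proto_b = [0.0, 1.0]  # made-up prototype for another value
bias_direction = [a - b for a, b in zip(proto_a, proto_b)]  # [2.0, 0.0]

debiased = remove_direction([3.0, 4.0], bias_direction)  # [0.0, 4.0]
```

After the projection, the feature has no component along the chosen direction, while components orthogonal to it are left unchanged.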
Bias-Awareness for Zero-Shot Learning the Seen and Unseen
Generalized zero-shot learning recognizes inputs from both seen and unseen classes. Yet, existing methods tend to be biased towards the classes seen during training. In this paper, we strive to mitigate this bias. We propose a bias-aware learner to map inputs to a semantic embedding space for generalized zero-shot learning. During training, the model learns to regress to real-valued class prototypes in the embedding space with temperature scaling, while a margin-based bidirectional entropy term regularizes seen and unseen probabilities. Relying on a real-valued semantic embedding space provides a versatile approach, as the model can operate on different types of semantic information for both seen and unseen classes. Experiments are carried out on four benchmarks for generalized zero-shot learning and demonstrate the benefits of the proposed bias-aware classifier, both as a stand-alone method and in combination with generated features.
Ethical Considerations for Responsible Data Curation
Human-centric computer vision (HCCV) data curation practices often neglect privacy and bias concerns, leading to dataset retractions and unfair models. HCCV datasets constructed through nonconsensual web scraping lack crucial metadata for comprehensive fairness and robustness evaluations. Current remedies are post hoc, lack persuasive justification for adoption, or fail to provide proper contextualization for appropriate application. Our research focuses on proactive, domain-specific recommendations, covering purpose, privacy and consent, and diversity, for curating HCCV evaluation datasets, addressing privacy and bias concerns. We adopt an ante hoc reflective perspective, drawing from current practices, guidelines, dataset withdrawals, and audits, to inform our considerations and recommendations.