Catalogue Search | MBRL
Explore the vast range of titles available.
68 result(s) for "Schramowski, Patrick"
Large pre-trained language models contain human-like biases of what is right and wrong to do
by Kersting, Kristian; Rothkopf, Constantin A.; Turan, Cigdem
in 4007/4009; 639/705/117; Artificial intelligence
2022
Artificial writing is permeating our lives due to recent advances in large-scale, transformer-based language models (LMs) such as BERT, GPT-2 and GPT-3. Using them as pre-trained models and fine-tuning them for specific tasks, researchers have extended the state of the art for many natural language processing tasks and shown that they capture not only linguistic knowledge but also retain general knowledge implicitly present in the data. Unfortunately, LMs trained on unfiltered text corpora suffer from degenerated and biased behaviour. While this is well established, we show here that recent LMs also contain human-like biases of what is right and wrong to do, reflecting existing ethical and moral norms of society. We show that these norms can be captured geometrically by a ‘moral direction’ which can be computed, for example, by a PCA, in the embedding space. The computed ‘moral direction’ can rate the normativity (or non-normativity) of arbitrary phrases without explicitly training the LM for this task, reflecting social norms well. We demonstrate that computing the ‘moral direction’ can provide a path for attenuating or even preventing toxic degeneration in LMs, showcasing this capability on the RealToxicityPrompts testbed.
Large language models identify patterns in the relations between words and capture these relations in an embedding space. Schramowski and colleagues show that a direction in this space can be identified that separates ‘right’ and ‘wrong’ actions as judged by human survey participants.
Journal Article
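As a rough illustration of the geometric idea in the abstract above, the following sketch computes a "moral direction" as the first principal component of sentence embeddings of normative and non-normative phrases and scores new phrases by projection onto it. The encoder name and the example phrases are illustrative assumptions; the paper uses its own prompt templates and sentence encoder.

```python
# Minimal sketch: estimate a "moral direction" in a sentence-embedding space via PCA
# and use it to score new phrases. Model name and example phrases are illustrative.
import numpy as np
from sklearn.decomposition import PCA
from sentence_transformers import SentenceTransformer  # assumed available

model = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works in principle

# Toy phrases with clear normative polarity (not the paper's templates)
phrases = [
    "You should help people in need.",
    "You should be honest.",
    "You should greet your friends.",
    "You should kill people.",
    "You should steal money.",
    "You should lie to your parents.",
]
emb = model.encode(phrases)                 # (n_phrases, dim)

# The first principal component of these embeddings serves as the moral direction.
# Note: the sign of a PCA component is arbitrary; flip moral_dir if a known
# normative phrase scores negative.
pca = PCA(n_components=1)
pca.fit(emb)
moral_dir = pca.components_[0]              # (dim,)

def moral_score(sentence: str) -> float:
    """Project a phrase onto the moral direction; the sign indicates normativity."""
    v = model.encode([sentence])[0]
    return float(np.dot(v - pca.mean_, moral_dir))

print(moral_score("You should smile at strangers."))
print(moral_score("You should harm animals."))
```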
Making deep neural networks right for the right scientific reasons by interacting with their explanations
by Kersting, Kristian; Stammer, Wolfgang; Shao, Xiaoting
in 631/114/1305; 631/449; Active learning
2020
Deep neural networks have demonstrated excellent performance in many real-world applications. Unfortunately, they may show Clever Hans-like behaviour (making use of confounding factors within datasets) to achieve high performance. In this work we introduce the novel learning setting of explanatory interactive learning and illustrate its benefits on a plant phenotyping research task. Explanatory interactive learning adds the scientist to the training loop, where they interactively revise the original model by providing feedback on its explanations. Our experimental results demonstrate that explanatory interactive learning can help to avoid Clever Hans moments in machine learning and encourages (or discourages, if appropriate) trust in the underlying model.
Deep learning approaches can show excellent performance but still have limited practical use if they learn to predict based on confounding factors in a dataset, for instance text labels in the corner of images. By using an explanatory interactive learning approach, with a human expert in the loop during training, it becomes possible to avoid predictions based on confounding factors.
Journal Article
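The record above centres on explanatory interactive learning, where a human corrects the model by critiquing its explanations. A minimal sketch of one way such feedback can enter training is a loss term that penalizes input-gradient explanations inside regions the user marked as confounders. The function below assumes a PyTorch image classifier and a user-provided mask; it is an illustrative "right for the right reasons"-style penalty, not the paper's exact formulation.

```python
# Sketch: penalize explanations (input gradients) on user-marked confounding regions,
# in the spirit of explanatory interactive learning. Everything here is illustrative.
import torch
import torch.nn.functional as F

def xil_loss(model, x, y, forbidden_mask, lambda_expl=10.0):
    """
    x: (B, C, H, W) input batch, y: (B,) integer labels,
    forbidden_mask: (B, 1, H, W) with 1 where the user says "do not use this region".
    """
    x = x.clone().requires_grad_(True)
    logits = model(x)
    ce = F.cross_entropy(logits, y)

    # Input gradients of the true-class log-probabilities serve as a simple explanation.
    log_probs = F.log_softmax(logits, dim=1)
    selected = log_probs.gather(1, y.unsqueeze(1)).sum()
    (grads,) = torch.autograd.grad(selected, x, create_graph=True)

    # Penalize explanation mass that falls inside the forbidden regions.
    expl_penalty = (forbidden_mask * grads.pow(2)).sum() / x.shape[0]
    return ce + lambda_expl * expl_penalty
```

With `create_graph=True` the penalty itself is differentiable, so the optimizer can push the model's explanation away from the confounding regions during normal training.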
Extending Hyperspectral Imaging for Plant Phenotyping to the UV-Range
by Kersting, Kristian; Kuska, Matheus Thomas; Brugger, Anna
in Abiotic stress; Amino acids; Apoptosis
2019
Previous plant phenotyping studies have focused on the visible (VIS, 400–700 nm), near-infrared (NIR, 700–1000 nm) and short-wave infrared (SWIR, 1000–2500 nm) range. The ultraviolet range (UV, 200–380 nm) has not yet been used in plant phenotyping even though a number of plant molecules like flavones and phenol feature absorption maxima in this range. In this study an imaging UV line scanner in the range of 250–430 nm is introduced to investigate crop plants for plant phenotyping. Observing plants in the UV-range can provide information about important changes of plant substances. To record reliable and reproducible time series results, measurement conditions were defined that exclude phototoxic effects of UV-illumination in the plant tissue. The measurement quality of the UV-camera has been assessed by comparing it to a non-imaging UV-spectrometer by measuring six different plant-based substances. Given the findings of these preliminary studies, an experiment has been defined and performed monitoring the stress response of barley leaves to salt stress. The aim was to visualize the effects of abiotic stress within the UV-range to provide new insights into the stress response of plants. Our study demonstrated the first use of a hyperspectral sensor in the UV-range for stress detection in plant phenotyping.
Journal Article
Exploiting Cultural Biases via Homoglyphs in Text-to-Image Synthesis
2023
Models for text-to-image synthesis, such as DALL-E 2 and Stable Diffusion, have recently drawn a lot of interest from academia and the general public. These models are capable of producing high-quality images that depict a variety of concepts and styles when conditioned on textual descriptions. However, these models adopt cultural characteristics associated with specific Unicode scripts from their vast amount of training data, which may not be immediately apparent. We show that by simply inserting single non-Latin characters in the textual description, common models reflect cultural biases in their generated images. We analyze this behavior both qualitatively and quantitatively and identify a model’s text encoder as the root cause of the phenomenon. Such behavior can be interpreted as a model feature, offering users a simple way to customize the image generation and reflect their own cultural background. Yet, malicious users or service providers may also try to intentionally bias the image generation. One goal might be to create racist stereotypes by replacing Latin characters with similarly-looking characters from non-Latin scripts, so-called homoglyphs. To mitigate such unnoticed script attacks, we propose a novel homoglyph unlearning method to fine-tune a text encoder, making it robust against homoglyph manipulations.
Journal Article
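To make the homoglyph manipulation described above concrete, here is a minimal sketch that swaps Latin characters in a prompt for Cyrillic look-alikes and compares the CLIP text embeddings of the clean and manipulated prompts. The tiny homoglyph table, the model name and the prompt are illustrative assumptions rather than the paper's setup.

```python
# Sketch: replace Latin characters in a prompt with visually identical non-Latin
# homoglyphs and compare the resulting CLIP text embeddings.
import torch
from transformers import CLIPModel, CLIPProcessor

HOMOGLYPHS = {"a": "\u0430", "o": "\u043e", "e": "\u0435"}  # Cyrillic а, о, е

def inject_homoglyphs(text: str, char: str) -> str:
    """Swap every occurrence of one Latin character for its homoglyph."""
    return text.replace(char, HOMOGLYPHS[char])

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompt = "a photo of an actress"
variants = [prompt] + [inject_homoglyphs(prompt, c) for c in HOMOGLYPHS]

inputs = processor(text=variants, return_tensors="pt", padding=True)
with torch.no_grad():
    emb = model.get_text_features(**inputs)
emb = emb / emb.norm(dim=-1, keepdim=True)

# Cosine similarity of each manipulated prompt to the clean prompt; large drops
# indicate the text encoder treats the homoglyph prompt very differently.
for text, sim in zip(variants[1:], emb[1:] @ emb[0]):
    print(f"{sim.item():.3f}  {text!r}")
```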
Does CLIP Know My Face?
2024
With the rise of deep learning in various applications, privacy concerns around the protection of training data have become a critical area of research. Whereas prior studies have focused on privacy risks in single-modal models, we introduce a novel method to assess privacy for multi-modal models, specifically vision-language models like CLIP. The proposed Identity Inference Attack (IDIA) reveals whether an individual was included in the training data by querying the model with images of the same person. Letting the model choose from a wide variety of possible text labels, the model reveals whether it recognizes the person and, therefore, was used for training. Our large-scale experiments on CLIP demonstrate that individuals used for training can be identified with very high accuracy. We confirm that the model has learned to associate names with depicted individuals, implying the existence of sensitive information that can be extracted by adversaries. Our results highlight the need for stronger privacy protection in large-scale models and suggest that IDIAs can be used to prove the unauthorized use of data for training and to enforce privacy laws. This article appears in the AI & Society track.
Journal Article
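A minimal sketch of the identity-inference idea described above: query CLIP with several images of one person against a pool of candidate name prompts and count how often the true name wins. File paths, names, the prompt template and the decision rule are illustrative assumptions, not the paper's experimental setup.

```python
# Sketch of an identity-inference-style query against CLIP: the model picks a name
# for each image of a person; consistently correct picks suggest the person appeared
# in the training data. Paths, names, prompts and the decision rule are illustrative.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

candidate_names = ["Jane Doe", "John Smith", "Alex Example"]    # large pool in practice
prompts = [f"a photo of {name}" for name in candidate_names]
image_paths = ["person_1.jpg", "person_2.jpg", "person_3.jpg"]  # images of one person
true_name = "Jane Doe"

hits = 0
for path in image_paths:
    image = Image.open(path)
    inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_image  # (1, n_names)
    predicted = candidate_names[int(logits.argmax(dim=-1))]
    hits += predicted == true_name

# If the true name is chosen far more often than chance, the attack flags membership.
print(f"correct identifications: {hits}/{len(image_paths)}")
```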
A typology for exploring the mitigation of shortcut behaviour
by Kersting, Kristian; Stammer, Wolfgang; Friedrich, Felix
in 639/705; 639/705/117; Active learning
2023
As machine learning models become larger, and are increasingly trained on large and uncurated datasets in weakly supervised mode, it becomes important to establish mechanisms for inspecting, interacting with and revising models. These are necessary to mitigate shortcut learning effects and to guarantee that the model’s learned knowledge is aligned with human knowledge. Recently, several explanatory interactive machine learning methods have been developed for this purpose, but each has different motivations and methodological details. In this work, we provide a unification of various explanatory interactive machine learning methods into a single typology by establishing a common set of basic modules. We discuss benchmarks and other measures for evaluating the overall abilities of explanatory interactive machine learning methods. With this extensive toolbox, we systematically and quantitatively compare several explanatory interactive machine learning methods. In our evaluations, all methods are shown to improve machine learning models in terms of accuracy and explainability. However, we found remarkable differences in individual benchmark tasks, which reveal valuable application-relevant aspects for the integration of these benchmarks in the development of future methods.
Explanatory interactive machine learning methods have been developed to facilitate the learning process between the machine and the user. Friedrich et al. provide a unification of various explanatory interactive machine learning methods into a single typology, and present benchmarks for evaluating such methods.
Journal Article
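The typology above frames explanatory interactive machine learning methods as combinations of a few recurring modules. The sketch below only makes that shared loop explicit (select examples, explain predictions, obtain user feedback, revise the model); the callables are placeholders for illustration, not APIs or module names taken from the paper.

```python
# Sketch of a generic explanatory interactive learning loop built from exchangeable
# modules. All callables are placeholders; concrete methods differ in how each
# module is realized.
from typing import Any, Callable, Iterable

def xil_loop(model: Any,
             pool: Iterable,
             select: Callable,   # choose examples worth showing to the user
             explain: Callable,  # produce an explanation for the model's prediction
             obtain: Callable,   # collect the user's feedback on that explanation
             revise: Callable,   # update the model using prediction + feedback
             rounds: int = 10) -> Any:
    for _ in range(rounds):
        batch = select(model, pool)
        for example in batch:
            explanation = explain(model, example)
            feedback = obtain(example, explanation)   # e.g. "ignore this region"
            model = revise(model, example, feedback)
    return model
```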
Explanatory Interactive Machine Learning
by Rohde, Gernot; Kersting, Kristian; Stammer, Wolfgang
in Artificial intelligence; Explainable artificial intelligence; Machine learning
2023
The most promising standard machine learning methods can deliver highly accurate classification results, often outperforming standard white-box methods. However, it is hardly possible for humans to fully understand the rationale behind the black-box results, and thus, these powerful methods hamper the creation of new knowledge on the part of humans and the broader acceptance of this technology. Explainable Artificial Intelligence attempts to overcome this problem by making the results more interpretable, while Interactive Machine Learning integrates humans into the process of insight discovery. The paper builds on recent successes in combining these two cutting-edge technologies and proposes how Explanatory Interactive Machine Learning (XIL) is embedded in a generalizable Action Design Research (ADR) process – called XIL-ADR. This approach can be used to analyze data, inspect models, and iteratively improve them. The paper shows the application of this process using the diagnosis of viral pneumonia, e.g., Covid-19, as an illustrative example. By these means, the paper also illustrates how XIL-ADR can help identify shortcomings of standard machine learning projects, gain new insights on the part of the human user, and thereby can help to unlock the full potential of AI-based systems for organizations and research.
Journal Article
Machine Learning Assisted Pattern Matching: Insight into Oxide Electronic Device Performance by Phase Determination in 4D-STEM Datasets
by Zintler, Alexander; Eilhardt, Robert; Kersting, Kristian
in Four-dimensional Scanning Transmission Electron Microscopy (4D-STEM): New Experiments and Data Analyses for Determining Materials Functionality and Biological Structures; Learning algorithms; Machine learning
2020
Journal Article
Does CLIP Know My Face?
by Kersting, Kristian; Brack, Manuel; Hintersdorf, Dominik
in Data points; Deep learning; Inference
2024
With the rise of deep learning in various applications, privacy concerns around the protection of training data have become a critical area of research. Whereas prior studies have focused on privacy risks in single-modal models, we introduce a novel method to assess privacy for multi-modal models, specifically vision-language models like CLIP. The proposed Identity Inference Attack (IDIA) reveals whether an individual was included in the training data by querying the model with images of the same person. Letting the model choose from a wide variety of possible text labels, the model reveals whether it recognizes the person and, therefore, was used for training. Our large-scale experiments on CLIP demonstrate that individuals used for training can be identified with very high accuracy. We confirm that the model has learned to associate names with depicted individuals, implying the existence of sensitive information that can be extracted by adversaries. Our results highlight the need for stronger privacy protection in large-scale models and suggest that IDIAs can be used to prove the unauthorized use of data for training and to enforce privacy laws.
Inferring Offensiveness In Images From Natural Language Supervision
by Kersting, Kristian; Schramowski, Patrick
in Computer vision; Datasets; Natural language processing
2021
Probing or fine-tuning (large-scale) pre-trained models results in state-of-the-art performance for many NLP tasks and, more recently, even for computer vision tasks when combined with image data. Unfortunately, these approaches also entail severe risks. In particular, large image datasets automatically scraped from the web may contain derogatory terms as categories and offensive images, and may also underrepresent specific classes. Consequently, there is an urgent need to carefully document datasets and curate their content. Unfortunately, this process is tedious and error-prone. We show that pre-trained transformers themselves provide a methodology for the automated curation of large-scale vision datasets. Based on human-annotated examples and the implicit knowledge of a CLIP based model, we demonstrate that one can select relevant prompts for rating the offensiveness of an image. In addition to e.g. privacy violation and pornographic content previously identified in ImageNet, we demonstrate that our approach identifies further inappropriate and potentially offensive content.
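One simple way to realize the prompt-based rating sketched in the abstract above is zero-shot classification of an image against a small set of curated text prompts with CLIP. The prompts, decision threshold, model name and file path below are illustrative assumptions rather than the paper's exact choices.

```python
# Sketch: rate an image's offensiveness with CLIP zero-shot prompts, in the spirit of
# the approach above. Prompts, threshold, model name and path are illustrative.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

prompts = [
    "an offensive or disturbing image",
    "a harmless everyday image",
]

image = Image.open("candidate.jpg")  # image to be screened (placeholder path)
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)  # (1, 2)

offensive_prob = float(probs[0, 0])
print(f"flag for manual review: {offensive_prob > 0.5} (p={offensive_prob:.2f})")
```

In practice such scores would only pre-sort candidates for human curation; the threshold and prompt wording would need calibration on annotated examples.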