Catalogue Search | MBRL

Robust, Interpretable, and Usable Entity Representations

by Onoe, Yasumasa in Computer science , Information science , Information Technology

2023

Knowledge about real-world entities (e.g., people, organizations, etc.) is a key component to understand natural language text. Just as humans perceive and learn about those entities via books, large-scale language models (LMs) acquire entity knowledge from a massive amount of text during pretraining. Empirically, it is widely known that the prior knowledge about this world is one of the main drivers of LMs’ remarkable performance on downstream tasks.

However, in LMs, entity representations are distributed over a large number of parameters, making the entity representations difficult to interpret. Additionally, entity knowledge can involve complex reasoning, in addition to relational knowledge. This type of inference capability is not tested by existing knowledge probing benchmarks. Furthermore, LMs’ entity knowledge is frozen at the time of pretraining while the real world changes as time goes by. As LMs have become the foundation of modern NLP applications, ensuring that they possess accurate and current information about real-world entities is crucial for enhancing the performance of these NLP systems.

In this thesis, we explore entity representations from three different aspects. Interpretability: we first design interpretable entity representations where each entity is represented as a set of concepts (i.e., entity types) and their associated probabilities. Since these entity representations possess characteristics of both discrete and continuous features, these representations are human-readable and can be used in a continuous model. In addition, the model predictions can be post-hoc modified using a small number of rules to incorporate domain knowledge. As an alternative way to build interpretable entity representations, we study box embeddings which are expressive and have suitable geometry for representing hierarchical relationships in entity types. Boxes can nest, overlap, or be completely disjoint to capture subtypes, correlation, or disjunction relations, leading to better entity representations.

Complex reasoning: we create a new benchmark to measure LMs’ inference capabilities over entity knowledge combined with commonsense reasoning. This kind of inference requires implicit knowledge about entities that are not explicitly included in the input context, which makes this problem quite challenging. On this benchmark, we observe a large gap between LMs and human performance, indicating brittleness of current entity representations. Knowledge propagation: lastly, we propose a new task: entity knowledge propagation. This task evaluates how LMs’ parametric knowledge is updated when completely new entity knowledge is injected. We construct benchmark datasets for this task and evaluate existing knowledge editing methods. Surprisingly, the state-of-the-art knowledge editing techniques show little propagation of injected knowledge, underperforming a simple prompt baseline. Our results suggest that more effective model updating approaches are needed to properly inject new entity knowledge into LMs.

Our results indicate that the three aspects of entity representations remain challenging even for the state of the art models. We believe that our work serves as an exciting foundation for further research, aiming to create more robust, interpretable, and usable entity representations. In the last chapter of this thesis, we outline the potential directions for future work stemming from this Ph.D. dissertation.
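The abstract's remark that boxes can nest, overlap, or be disjoint can be illustrated with a small sketch. This is not the thesis's model: it assumes hard axis-aligned boxes in two dimensions, and the type names, coordinates, and the helper functions `volume`, `intersection`, and `conditional` are all hypothetical. The ratio Vol(a ∩ b) / Vol(b) is read here as a rough stand-in for P(a | b).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners (illustrative only)."""
    lo: Tuple[float, ...]
    hi: Tuple[float, ...]


def volume(b: Box) -> float:
    # Product of side lengths; an empty (inverted) side contributes zero.
    v = 1.0
    for l, h in zip(b.lo, b.hi):
        v *= max(h - l, 0.0)
    return v


def intersection(a: Box, b: Box) -> Box:
    # Largest lower corner and smallest upper corner; may be an empty box.
    lo = tuple(max(x, y) for x, y in zip(a.lo, b.lo))
    hi = tuple(min(x, y) for x, y in zip(a.hi, b.hi))
    return Box(lo, hi)


def conditional(a: Box, b: Box) -> float:
    """Vol(a ∩ b) / Vol(b), used as an assumed proxy for P(a | b)."""
    vb = volume(b)
    return volume(intersection(a, b)) / vb if vb > 0 else 0.0


# Hypothetical entity types with made-up coordinates.
person = Box((0.0, 0.0), (4.0, 4.0))
politician = Box((1.0, 1.0), (2.0, 2.0))   # nested inside person (subtype)
athlete = Box((1.5, 1.5), (3.5, 3.5))      # overlaps politician (correlation)
location = Box((5.0, 5.0), (6.0, 6.0))     # disjoint from person

print(conditional(person, politician))   # nested: 1.0
print(conditional(athlete, politician))  # partial overlap: 0.25
print(conditional(location, person))     # disjoint: 0.0
```

The asymmetry is the point of the geometry: `conditional(person, politician)` is 1.0 while `conditional(politician, person)` is much smaller, which is how a box can encode a subtype relation that a symmetric similarity score cannot.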

Dissertation

Interpretable Entity Representations through Large-Scale Typing

by Onoe, Yasumasa , Durrett, Greg in Domains , Natural language processing , Performance enhancement

2020

In standard methodology for natural language processing, entities in text are typically embedded in dense vector spaces with pre-trained models. The embeddings produced this way are effective when fed into downstream models, but they require end-task fine-tuning and are fundamentally difficult to interpret. In this paper, we present an approach to creating entity representations that are human readable and achieve high performance on entity-related tasks out of the box. Our representations are vectors whose values correspond to posterior probabilities over fine-grained entity types, indicating the confidence of a typing model's decision that the entity belongs to the corresponding type. We obtain these representations using a fine-grained entity typing model, trained either on supervised ultra-fine entity typing data (Choi et al. 2018) or distantly-supervised examples from Wikipedia. On entity probing tasks involving recognizing entity identity, our embeddings used in parameter-free downstream models achieve competitive performance with ELMo- and BERT-based embeddings in trained models. We also show that it is possible to reduce the size of our type set in a learning-based way for particular domains. Finally, we show that these embeddings can be post-hoc modified through a small number of rules to incorporate domain knowledge and improve performance.
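As a rough illustration of the idea in this abstract, the sketch below treats each mention as a vector of per-type probabilities and compares two mentions with a parameter-free similarity. The type inventory, the probability values, and the helper names `cosine` and `top_types` are made up for the example; the paper's actual typing model and type set are not reproduced here.

```python
import math
from typing import List, Sequence

# Hypothetical, tiny type inventory; the paper uses a far larger one.
TYPES = ["person", "politician", "athlete", "organization", "company"]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    # Parameter-free similarity between two type-probability vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def top_types(vec: Sequence[float], k: int = 2) -> List[str]:
    # Human-readable view: the k types with the highest probability.
    ranked = sorted(zip(vec, TYPES), reverse=True)
    return [t for _, t in ranked[:k]]


# Made-up outputs of an assumed typing model for three mentions.
mention_a = [0.98, 0.91, 0.03, 0.02, 0.01]
mention_b = [0.97, 0.88, 0.05, 0.01, 0.02]
mention_c = [0.02, 0.01, 0.01, 0.95, 0.90]

print(top_types(mention_a))
print(cosine(mention_a, mention_b) > cosine(mention_a, mention_c))
```

Because every dimension is named, the same vector serves both as a model input and as an explanation: a reader can inspect `top_types` to see why two mentions were judged similar.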

Paper

Jamp: Controlled Japanese Temporal Inference Dataset for Evaluating Generalization Capacity of Language Models

by Onoe, Yasumasa , Sugimoto, Tomoki , Yanaka, Hitomi in Annotations , Datasets , Inference

2023

Natural Language Inference (NLI) tasks involving temporal inference remain challenging for pre-trained language models (LMs). Although various datasets have been created for this task, they primarily focus on English and do not address the need for resources in other languages. It is unclear whether current LMs realize the generalization capacity for temporal inference across languages. In this paper, we present Jamp, a Japanese NLI benchmark focused on temporal inference. Our dataset includes a range of temporal inference patterns, which enables us to conduct fine-grained analysis. To begin the data annotation process, we create diverse inference templates based on the formal semantics test suites. We then automatically generate diverse NLI examples by using the Japanese case frame dictionary and well-designed templates while controlling the distribution of inference patterns and gold labels. We evaluate the generalization capacities of monolingual/multilingual LMs by splitting our dataset based on tense fragments (i.e., temporal inference patterns). Our findings demonstrate that LMs struggle with specific linguistic phenomena, such as habituality, indicating that there is potential for the development of more effective NLI models across languages.

Paper

Fine-Grained Entity Typing for Domain Independent Entity Linking

by Onoe, Yasumasa , Durrett, Greg in Datasets , Encyclopedias , Model testing

2020

Neural entity linking models are very powerful, but run the risk of overfitting to the domain they are trained in. For this problem, a domain is characterized not just by genre of text but even by factors as specific as the particular distribution of entities, as neural models tend to overfit by memorizing properties of frequent entities in a dataset. We tackle the problem of building robust entity linking models that generalize effectively and do not rely on labeled entity linking data with a specific entity distribution. Rather than predicting entities directly, our approach models fine-grained entity properties, which can help disambiguate between even closely related entities. We derive a large inventory of types (tens of thousands) from Wikipedia categories, and use hyperlinked mentions in Wikipedia to distantly label data and train an entity typing model. At test time, we classify a mention with this typing model and use soft type predictions to link the mention to the most similar candidate entity. We evaluate our entity linking system on the CoNLL-YAGO dataset (Hoffart et al., 2011) and show that our approach outperforms prior domain-independent entity linking systems. We also test our approach in a harder setting derived from the WikilinksNED dataset (Eshel et al., 2017) where all the mention-entity pairs are unseen during test time. Results indicate that our approach generalizes better than a state-of-the-art neural model on the dataset.

Paper

Learning to Denoise Distantly-Labeled Data for Entity Typing

by Onoe, Yasumasa , Durrett, Greg in Labels , Maintenance , Noise reduction

2019

Distantly-labeled data can be used to scale up training of statistical models, but it is typically noisy and that noise can vary with the distant labeling technique. In this work, we propose a two-stage procedure for handling this type of data: denoise it with a learned model, then train our final model on clean and denoised distant data with standard supervised training. Our denoising approach consists of two parts. First, a filtering function discards examples from the distantly labeled data that are wholly unusable. Second, a relabeling function repairs noisy labels for the retained examples. Each of these components is a model trained on synthetically-noised examples generated from a small manually-labeled set. We investigate this approach on the ultra-fine entity typing task of Choi et al. (2018). Our baseline model is an extension of their model with pre-trained ELMo representations, which already achieves state-of-the-art performance. Adding distant data that has been denoised with our learned models gives further performance gains over this base model, outperforming models trained on raw distant data or heuristically-denoised distant data.

Paper

Unblocking Fine-Grained Evaluation of Detailed Captions: An Explaining AutoRater and Critic-and-Revise Pipeline

by Szpektor, Idan , Bitton, Yonatan , Onoe, Yasumasa in Annotations , Benchmarks , Errors

2025

Large Vision-Language Models (VLMs) now generate highly detailed, paragraph-length image captions, yet evaluating their factual accuracy remains challenging. Current methods often miss fine-grained errors, being designed for shorter texts or lacking datasets with verified inaccuracies. We introduce DOCCI-Critique, a benchmark with 1,400 VLM-generated paragraph captions (100 images, 14 VLMs) featuring over 10,216 sentence-level human annotations of factual correctness and explanatory rationales for errors, all within paragraph context. Building on this, we develop VNLI-Critique, a model for automated sentence-level factuality classification and critique generation. We highlight three key applications: (1) VNLI-Critique demonstrates robust generalization, validated by state-of-the-art performance on the M-HalDetect benchmark and strong results in CHOCOLATE claim verification. (2) The VNLI-Critique driven AutoRater for DOCCI-Critique provides reliable VLM rankings, showing excellent alignment with human factuality judgments (e.g., 0.98 Spearman). (3) An innovative Critic-and-Revise pipeline, where critiques from VNLI-Critique guide LLM-based corrections, achieves substantial improvements in caption factuality (e.g., a 46% gain on DetailCaps-4870). Our work offers a crucial benchmark alongside practical tools, designed to significantly elevate the standards for fine-grained evaluation and foster the improvement of VLM image understanding. Project page: https://google.github.io/unblocking-detail-caption

Paper

Improving and Diagnosing Knowledge-Based Visual Question Answering via Entity Enhanced Knowledge Injection

by Garcia-Olano, Diego , Onoe, Yasumasa , Ghosh, Joydeep in Datasets , Knowledge , Knowledge representation

2021

Knowledge-Based Visual Question Answering (KBVQA) is a bi-modal task requiring external world knowledge in order to correctly answer a text question and associated image. Recent single-modality text work has shown that knowledge injection into pre-trained language models, specifically entity-enhanced knowledge graph embeddings, can improve performance on downstream entity-centric tasks. In this work, we empirically study how and whether such methods, applied in a bi-modal setting, can improve an existing VQA system's performance on the KBVQA task. We experiment with two large publicly available VQA datasets: (1) KVQA, which contains mostly rare Wikipedia entities, and (2) OKVQA, which is less entity-centric and more aligned with common sense reasoning. Both lack explicit entity spans, and we study the effect of different weakly supervised and manual methods for obtaining them. Additionally, we analyze how recently proposed bi-modal and single-modal attention explanations are affected by the incorporation of such entity-enhanced representations. Our results show substantially improved performance on the KBVQA task without the need for additional costly pre-training, and we provide insights for when entity knowledge injection helps improve a model's understanding. We provide code and enhanced datasets for reproducibility.

Paper

Cross-Lingual Fine-Grained Entity Typing

by Onoe, Yasumasa , Durrett, Greg , Selvaraj, Nila in Languages , Training

2021

The growth of cross-lingual pre-trained models has enabled NLP tools to rapidly generalize to new languages. While these models have been applied to tasks involving entities, their ability to explicitly predict typological features of these entities across languages has not been established. In this paper, we present a unified cross-lingual fine-grained entity typing model capable of handling over 100 languages and analyze this model's ability to generalize to languages and entities unseen during training. We train this model on cross-lingual training data collected from Wikipedia hyperlinks in multiple languages (training languages). During inference, our model takes an entity mention and context in a particular language (test language, possibly not in the training languages) and predicts fine-grained types for that entity. Generalizing to new languages and unseen entities are the fundamental challenges of this entity typing setup, so we focus our evaluation on these settings and compare against simple yet powerful string match baselines. Experimental results show that our approach outperforms the baselines on unseen languages such as Japanese, Tamil, Arabic, Serbian, and Persian. In addition, our approach substantially improves performance on unseen entities (even in unseen languages) over the baselines, and human evaluation shows a strong ability to predict relevant types in these settings.

Paper

Intermediate Entity-based Sparse Interpretable Representation Learning

by Wallace, Byron C , Garcia-Olano, Diego , Onoe, Yasumasa in Debugging , Performance prediction , Representation learning

2022

Interpretable entity representations (IERs) are sparse embeddings that are "human-readable" in that dimensions correspond to fine-grained entity types and values are predicted probabilities that a given entity is of the corresponding type. These methods perform well in zero-shot and low supervision settings. Compared to standard dense neural embeddings, such interpretable representations may permit analysis and debugging. However, while fine-tuning sparse, interpretable representations improves accuracy on downstream tasks, it destroys the semantics of the dimensions which were enforced in pre-training. Can we maintain the interpretable semantics afforded by IERs while improving predictive performance on downstream tasks? Toward this end, we propose Intermediate enTity-based Sparse Interpretable Representation Learning (ItsIRL). ItsIRL realizes improved performance over prior IERs on biomedical tasks, while maintaining "interpretability" generally and their ability to support model debugging specifically. The latter is enabled in part by the ability to perform "counterfactual" fine-grained entity type manipulation, which we explore in this work. Finally, we propose a method to construct entity type based class prototypes for revealing global semantic properties of classes learned by our model.

Paper

Entity Cloze By Date: What LMs Know About Unseen Entities

by Zhang, Michael J Q , Onoe, Yasumasa , Durrett, Greg in Automatic identification , Benchmarks , Datasets

2022

Language models (LMs) are typically trained once on a large-scale corpus and used for years without being updated. However, in a dynamic world, new entities constantly arise. We propose a framework to analyze what LMs can infer about new entities that did not exist when the LMs were pretrained. We derive a dataset of entities indexed by their origination date and paired with their English Wikipedia articles, from which we can find sentences about each entity. We evaluate LMs' perplexity on masked spans within these sentences. We show that models more informed about the entities, such as those with access to a textual definition of them, achieve lower perplexity on this benchmark. Our experimental results demonstrate that making inferences about new entities remains difficult for LMs. Given its wide coverage on entity knowledge and temporal indexing, our dataset can be used to evaluate LMs and techniques designed to modify or extend their knowledge. Our automatic data collection pipeline can be easily used to continually update our benchmark.

Paper
