Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
22 result(s) for "Einolghozati, Arash"
Sensing and Molecular Communication Using Synthetic Cells: Theory and Algorithms
2016
Molecular communication (MC) is a novel communication paradigm in which molecules are used to encode, transmit and decode information. MC is the primary method by which biological entities exchange information and hence cooperate with each other. MC is a promising paradigm for enabling communication between nano-bio machines, e.g., biosensors, with potential applications such as cancer and disease detection, smart drug delivery, and toxicity detection. The objective of this research is to establish the fundamentals of diffusion-based molecular communication and sensing via biological agents (e.g., synthetic bacteria) from a communication and information theory perspective, and to design algorithms for reliable communication and sensing systems. In the first part of the thesis, we develop models for the diffusion channel as well as the molecular sensing at the receiver and obtain the maximum achievable rate for such a communication system. Next, we study reliability in MC. We design practical nodes by employing synthetic bacteria as the basic element of a biologically compatible communication system and show how reliable nodes can be formed out of the collective behavior of a population of unreliable bio-agents. We model the probabilistic behavior of bacteria, obtain the node sensing capacity and propose a practical modulation scheme. To improve reliability, we also introduce relaying and error-detecting codes for MC. In the second part of the thesis, we study the molecular sensing problem with potential applications in disease detection. We establish the rate-distortion theory for molecular sensing and investigate how distortion can be minimized via an optimal quantizer. We also study sensor cell arrays in which sensing redundancy is achieved by using multiple sensors to measure several molecular inputs simultaneously. We study the interference in sensing molecular inputs and propose a probabilistic message-passing algorithm to solve pattern detection over the molecular inputs of interest.
Dissertation
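A minimal sketch of the "reliable node out of unreliable bio-agents" idea described in the abstract, under assumptions not taken from the thesis: each bacterium activates with probability c / (c + K) at concentration c, and a node decodes a binary concentration symbol by thresholding the fraction of activated bacteria. The constants and response model are illustrative only.

# Sketch: reliability of a "bio node" built from many noisy bacteria.
# Assumptions (not from the thesis): Michaelis-Menten-style activation
# probability p = c / (c + K), binary concentration modulation, and a
# simple threshold decision on the activated fraction.
import numpy as np

rng = np.random.default_rng(0)
K = 1.0                      # half-activation concentration (assumed)
c_low, c_high = 0.5, 2.0     # two transmitted concentration levels
threshold = 0.5 * (c_low / (c_low + K) + c_high / (c_high + K))

def node_error_rate(n_bacteria, trials=20000):
    """Empirical probability that the node decodes the wrong symbol."""
    errors = 0
    for _ in range(trials):
        c = rng.choice([c_low, c_high])
        p = c / (c + K)
        activated = rng.random(n_bacteria) < p   # each bacterium is noisy
        decoded_high = activated.mean() > threshold
        if decoded_high != (c == c_high):
            errors += 1
    return errors / trials

for n in (1, 10, 100, 1000):
    print(f"{n:5d} bacteria per node -> error rate {node_error_rate(n):.4f}")

Running this shows the decoding error shrinking as the population per node grows, which is the qualitative point the thesis makes about collective behavior.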
Small But Funny: A Feedback-Driven Approach to Humor Distillation
2024
The emergence of Large Language Models (LLMs) has brought to light promising language generation capabilities, particularly in performing tasks like complex reasoning and creative writing. Consequently, distillation through imitation of teacher responses has emerged as a popular technique to transfer knowledge from LLMs to more accessible Small Language Models (SLMs). While this works well for simpler tasks, there is a substantial performance gap on tasks requiring intricate language comprehension and creativity, such as humor generation. We hypothesize that this gap may stem from the fact that creative tasks might be hard to learn by imitation alone, and explore whether an approach involving supplementary guidance from the teacher could yield higher performance. To address this, we study the effect of assigning a dual role to the LLM: as a "teacher" generating data, as well as a "critic" evaluating the student's performance. Our experiments on humor generation reveal that the incorporation of feedback significantly narrows the performance gap between SLMs and their larger counterparts compared to merely relying on imitation. As a result, our research highlights the potential of using feedback as an additional dimension to data when transferring complex language abilities via distillation.
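A rough sketch of the dual-role loop the abstract describes, with every model call replaced by a placeholder: the actual prompts, models, scoring scale and feedback format used in the paper are not specified here, so teacher_generate, critic_score and student_generate are purely hypothetical stand-ins.

# Sketch of a feedback-driven distillation loop: the LLM acts as a
# "teacher" (data generator) and as a "critic" of the student's attempts.
# All three functions below are hypothetical placeholders for model calls.
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str
    target: str            # teacher's humorous response (imitation target)
    feedback: str = ""     # critic's comment when the student falls short

def teacher_generate(prompt):        # placeholder for an LLM call
    return f"[witty response to: {prompt}]"

def critic_score(prompt, response):  # placeholder: returns (score, comment)
    return 0.3, "punchline arrives too late"

def student_generate(prompt):        # placeholder for the distilled SLM
    return f"[student attempt at: {prompt}]"

def build_training_pool(prompts, score_threshold=0.5):
    pool = []
    for p in prompts:
        target = teacher_generate(p)               # teacher as data generator
        attempt = student_generate(p)              # current student output
        score, comment = critic_score(p, attempt)  # teacher as critic
        fb = comment if score < score_threshold else ""
        pool.append(Example(prompt=p, target=target, feedback=fb))
    return pool

if __name__ == "__main__":
    for ex in build_training_pool(["Write a joke about Mondays"]):
        print(ex)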
CoDi: Conversational Distillation for Grounded Question Answering
2024
Distilling conversational skills into Small Language Models (SLMs) with approximately 1 billion parameters presents significant challenges. Firstly, SLMs have limited capacity in their model parameters to learn extensive knowledge compared to larger models. Secondly, high-quality conversational datasets are often scarce, small, and domain-specific. Addressing these challenges, we introduce a novel data distillation framework named CoDi (short for Conversational Distillation, pronounced "Cody"), allowing us to synthesize large-scale, assistant-style datasets in a steerable and diverse manner. Specifically, while our framework is task-agnostic at its core, we explore and evaluate the potential of CoDi on the task of conversational grounded reasoning for question answering. This is a typical on-device scenario for specialist SLMs, allowing for open-domain model responses without requiring the model to "memorize" world knowledge in its limited weights. Our evaluations show that SLMs trained with CoDi-synthesized data achieve performance comparable to models trained on human-annotated data on standard metrics. Additionally, when using our framework to generate larger datasets from web data, our models surpass larger, instruction-tuned models in zero-shot conversational grounded reasoning tasks.
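To make the "grounded" part of the task concrete, here is a small sketch of how a grounded, assistant-style training example could be laid out so the small model answers from the provided passage rather than from memorized knowledge. The field names and prompt layout are illustrative assumptions, not CoDi's actual data format.

# Sketch: assembling a grounded conversational QA training example.
# The schema and textual layout are assumptions for illustration only.
from dataclasses import dataclass

@dataclass
class GroundedTurn:
    passage: str       # grounding document kept in the model input
    history: list      # prior (user, assistant) turns
    question: str
    answer: str        # in CoDi's setting, synthesized by a teacher model

def to_training_text(turn: GroundedTurn) -> str:
    dialog = "\n".join(f"User: {u}\nAssistant: {a}" for u, a in turn.history)
    return (
        f"Document:\n{turn.passage}\n\n"
        f"{dialog}\n"
        f"User: {turn.question}\n"
        f"Assistant: {turn.answer}"
    )

example = GroundedTurn(
    passage="The Burj Khalifa, completed in 2010, is 828 metres tall.",
    history=[("Which city is it in?", "It is in Dubai.")],
    question="How tall is it?",
    answer="It is 828 metres tall, according to the document.",
)
print(to_training_text(example))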
Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences
2024
Lack of factual correctness is an issue that still plagues state-of-the-art summarization systems, despite their impressive progress on generating seemingly fluent summaries. In this paper, we show that factual inconsistency can be caused by irrelevant parts of the input text, which act as confounders. To that end, we leverage information-theoretic measures of causal effects to quantify the amount of confounding and precisely characterize how it affects summarization performance. Based on insights derived from our theoretical results, we design a simple multi-task model to control such confounding by leveraging human-annotated relevant sentences when available. Crucially, we give a principled characterization of data distributions where such confounding can be large, thereby necessitating the use of human-annotated relevant sentences to generate factual summaries. Our approach improves faithfulness scores by 20% over strong baselines on AnswerSumm (Fabbri et al., 2021), a conversation summarization dataset where lack of faithfulness is a significant issue due to the subjective nature of the task. Our best method achieves the highest faithfulness score while also achieving state-of-the-art results on standard metrics like ROUGE and METEOR. We corroborate these improvements through human evaluation.
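A minimal sketch of the multi-task idea in the abstract: next to the summarization loss, a second head predicts which input sentences are relevant (using human annotations when available), so irrelevant, confounding sentences can be discounted. The toy GRU encoder, the tensor shapes and the mixing weight alpha are assumptions, not the paper's architecture.

# Sketch: summarization loss + sentence-relevance loss sharing one encoder.
# Shapes, the encoder and alpha are toy stand-ins, not the paper's model.
import torch
import torch.nn as nn

hidden = 64
encoder = nn.GRU(input_size=32, hidden_size=hidden, batch_first=True)
relevance_head = nn.Linear(hidden, 1)       # per-sentence relevant / irrelevant
summary_head = nn.Linear(hidden, 1000)      # toy stand-in for a decoder

sent_embs = torch.randn(2, 8, 32)                # 2 documents, 8 sentences each
relevant = torch.randint(0, 2, (2, 8)).float()   # human relevance labels
summary_tokens = torch.randint(0, 1000, (2, 8))  # toy summary targets

enc_out, _ = encoder(sent_embs)                  # (2, 8, hidden)

sum_loss = nn.functional.cross_entropy(
    summary_head(enc_out).reshape(-1, 1000), summary_tokens.reshape(-1))
rel_loss = nn.functional.binary_cross_entropy_with_logits(
    relevance_head(enc_out).squeeze(-1), relevant)

alpha = 0.5                                      # mixing weight (assumed)
loss = sum_loss + alpha * rel_loss
loss.backward()
print(float(sum_loss), float(rel_loss))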
Sound Natural: Content Rephrasing in Dialog Systems
by Diedrick, Keith; Gupta, Sonal; Einolghozati, Arash
in Autoregressive models; Queries; Regression analysis
2020
We introduce a new task of rephrasing for a more natural virtual assistant. Currently, virtual assistants work in the paradigm of intent-slot tagging, and the slot values are directly passed as-is to the execution engine. However, this setup fails in some scenarios, such as messaging, when the query given by the user needs to be changed before repeating it or sending it to another user. For example, for queries like 'ask my wife if she can pick up the kids' or 'remind me to take my pills', we need to rephrase the content to 'can you pick up the kids' and 'take your pills', respectively. In this paper, we study the problem of rephrasing with messaging as a use case and release a dataset of 3000 pairs of original and rephrased queries. We show that BART, a pre-trained transformer-based masked language model with auto-regressive decoding, is a strong baseline for the task, and show improvements by adding a copy-pointer and copy loss to it. We analyze different tradeoffs of BART-based and LSTM-based seq2seq models, and propose a distilled LSTM-based seq2seq model as the best practical model.
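A short sketch of the BART baseline mentioned in the abstract, using the Hugging Face transformers library. The checkpoint below is the generic facebook/bart-base model, not the paper's fine-tuned rephraser (and it omits the copy-pointer), so the output is only illustrative; in practice the model would first be fine-tuned on the 3000 query/rephrase pairs.

# Sketch: generating a rephrase with a BART seq2seq model (untuned here).
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

query = "ask my wife if she can pick up the kids"
inputs = tokenizer(query, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=32, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))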
A Study on the Efficiency and Generalization of Light Hybrid Retrievers
2023
Hybrid retrievers can take advantage of both sparse and dense retrievers. Previous hybrid retrievers leverage indexing-heavy dense retrievers. In this work, we study the question: "Is it possible to reduce the indexing memory of hybrid retrievers without sacrificing performance?" Driven by this question, we leverage an indexing-efficient dense retriever (i.e., DrBoost) and introduce a LITE retriever that further reduces the memory of DrBoost. LITE is jointly trained on contrastive learning and knowledge distillation from DrBoost. Then, we integrate BM25, a sparse retriever, with either LITE or DrBoost to form light hybrid retrievers. Our Hybrid-LITE retriever saves 13X memory while maintaining 98.0% of the performance of the hybrid retriever of BM25 and DPR. In addition, we study the generalization capacity of our light hybrid retrievers on an out-of-domain dataset and a set of adversarial attack datasets. Experiments show that light hybrid retrievers achieve better generalization performance than individual sparse and dense retrievers. Nevertheless, our analysis shows that there is ample room to improve the robustness of retrievers, suggesting a new research direction.
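To make the sparse-plus-dense idea concrete, here is a generic score-level fusion sketch: BM25-style scores and dense inner-product scores are min-max normalized and mixed with a weight. This is a common fusion scheme used only for illustration; the paper's LITE/DrBoost training and its exact combination rule are not reproduced here.

# Sketch: fusing sparse and dense retrieval scores for a hybrid ranking.
import numpy as np

def minmax(x):
    x = np.asarray(x, dtype=float)
    spread = x.max() - x.min()
    return (x - x.min()) / spread if spread > 0 else np.zeros_like(x)

def hybrid_rank(sparse_scores, dense_scores, weight=0.5):
    """Return document indices ranked by the fused score."""
    fused = weight * minmax(sparse_scores) + (1 - weight) * minmax(dense_scores)
    return np.argsort(-fused)

# toy scores for 5 documents
sparse = [12.1, 3.4, 0.0, 7.8, 5.5]      # e.g. BM25
dense = [0.62, 0.71, 0.15, 0.58, 0.90]   # e.g. inner products from a dense retriever
print(hybrid_rank(sparse, dense, weight=0.4))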
Design and Analysis of Wireless Communication Systems Using Diffusion-Based Molecular Communication Among Bacteria
by Einolghozati, Arash; Sardari, Mohsen; Fekri, Faramarz
in Bacteria; Coding; Communications systems
2014
The design of biologically inspired wireless communication systems using bacteria as the basic element of the system is initially motivated by a phenomenon called Quorum Sensing. Due to the high randomness in the individual behavior of a bacterium, reliable communication between two bacteria is almost impossible. Therefore, we have recently proposed that a population of bacteria in a cluster be considered as a bio node in the network capable of molecular transmission and reception. This proposition enables us to form a reliable bio node out of many unreliable bacteria. In this paper, we study the communication between two nodes in such a network, where information is encoded in the concentration of molecules by the transmitter. The molecules produced by the bacteria in the transmitter node propagate through the diffusion channel. Then, the concentration of molecules is sensed by the bacteria population in the receiver node, which decodes the information and outputs light or fluorescence as a result. The uncertainty in the communication is caused by all three components of communication, i.e., transmission, propagation and reception. We study the theoretical limits of the information transfer rate in the presence of such uncertainties. Finally, we consider M-ary signaling schemes and study their achievable rates and corresponding error probabilities.
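A small simulation sketch of the M-ary concentration signaling mentioned at the end of the abstract. The diffusion channel is collapsed into a fixed attenuation and reception into a Poisson molecule count, which is a deliberate simplification of the models analysed in the paper; the concentration levels and attenuation are assumed values used only to illustrate symbol-error estimation.

# Sketch: symbol error rate of M-ary concentration signaling with a
# Poisson reception model (a simplification of the paper's channel).
import numpy as np

rng = np.random.default_rng(1)
M = 4
levels = np.arange(1, M + 1) * 50.0      # transmitted concentrations (assumed)
attenuation = 0.2                        # diffusion loss at the receiver (assumed)
expected = levels * attenuation          # mean received molecule count per symbol

def symbol_error_rate(trials=50000):
    tx = rng.integers(0, M, size=trials)
    counts = rng.poisson(expected[tx])   # noisy reception
    rx = np.argmin(np.abs(counts[:, None] - expected[None, :]), axis=1)
    return np.mean(rx != tx)

print(f"estimated symbol error rate (M={M}): {symbol_error_rate():.4f}")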
Likelihood Ratios and Generative Classifiers for Unsupervised Out-of-Domain Detection In Task Oriented Dialog
2019
The task of identifying out-of-domain (OOD) input examples directly at test time has seen renewed interest recently due to increased real-world deployment of models. In this work, we focus on OOD detection for natural language sentence inputs to task-based dialog systems. Our findings are three-fold. First, we curate and release ROSTD (Real Out-of-Domain Sentences From Task-oriented Dialog), a dataset of 4K OOD examples for the publicly available dataset from Schuster et al. (2019). In contrast to existing settings, which synthesize OOD examples by holding out a subset of classes, our examples were authored by annotators with a priori instructions to be out-of-domain with respect to the sentences in an existing dataset. Second, we explore likelihood-ratio-based approaches as an alternative to currently prevalent paradigms. Specifically, we reformulate and apply these approaches to natural language inputs. We find that they match or outperform the prevalent paradigms on all datasets, with larger improvements on non-artificial OOD benchmarks such as our dataset. Our ablations validate that specifically using likelihood ratios, rather than plain likelihood, is necessary to discriminate well between OOD and in-domain data. Third, we propose learning a generative classifier and computing a marginal likelihood (ratio) for OOD detection. This allows us to use a principled likelihood while at the same time exploiting training-time labels. We find that this approach outperforms both simple likelihood (ratio) based approaches and other prior approaches. To our knowledge, we are the first to investigate the use of generative classifiers for OOD detection at test time.
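A toy sketch of the likelihood-ratio idea: score a sentence by log p_in(x) minus log p_background(x). The paper uses neural language models and a generative classifier; the add-one-smoothed unigram models below are stand-ins, included only to show why the ratio, rather than the raw likelihood, is the quantity being thresholded.

# Sketch: likelihood-ratio OOD scoring with tiny unigram language models.
import math
from collections import Counter

def unigram_logprob(sentence, counts, vocab_size):
    total = sum(counts.values())
    return sum(
        math.log((counts[w] + 1) / (total + vocab_size))
        for w in sentence.lower().split()
    )

in_domain = "set an alarm for seven play some jazz remind me to call mom".split()
background = "the weather stock market history of rome recipe for pasta".split()
in_counts, bg_counts = Counter(in_domain), Counter(background)
vocab = len(set(in_domain) | set(background))

def ood_score(sentence):
    # higher means more in-domain under the ratio test
    return (unigram_logprob(sentence, in_counts, vocab)
            - unigram_logprob(sentence, bg_counts, vocab))

print(ood_score("remind me to call mom"))        # in-domain-looking query
print(ood_score("history of the roman empire"))  # OOD-looking query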
Improving Robustness of Task Oriented Dialog Systems
2019
Task-oriented language understanding in dialog systems is often modeled using intents (the task of a query) and slots (parameters for that task). Intent detection and slot tagging are, in turn, modeled using sentence classification and word tagging techniques, respectively. Similar to the adversarial attack problems with computer vision models discussed in existing literature, these intent-slot tagging models are often over-sensitive to small variations in input, predicting different and often incorrect labels when small changes are made to a query, thus reducing their accuracy and reliability. However, evaluating a model's robustness to these changes is harder for language, since words are discrete and an automated change (e.g., adding 'noise') to a query sometimes changes the meaning, and thus the labels, of a query. In this paper, we first describe how to create an adversarial test set to measure the robustness of these models. Furthermore, we introduce and adapt adversarial training methods as well as data augmentation using back-translation to mitigate these issues. Our experiments show that both techniques improve the robustness of the system substantially and can be combined to yield the best results.
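A small sketch of the kind of label-preserving perturbation that could be used to probe an intent/slot model's stability. The three operations below (word drop, neighbouring-word swap, filler insertion) are generic illustrations; the paper's adversarial test set was built with care that the perturbed queries keep their original labels, which random edits do not guarantee.

# Sketch: generating small perturbations of a query for robustness probing.
import random

random.seed(0)
FILLERS = ["please", "uh", "like"]

def perturb(query, n_variants=3):
    variants = []
    for _ in range(n_variants):
        words = query.split()
        op = random.choice(["drop", "swap", "filler"])
        if op == "drop" and len(words) > 2:
            words.pop(random.randrange(len(words)))
        elif op == "swap" and len(words) > 2:
            i = random.randrange(len(words) - 1)
            words[i], words[i + 1] = words[i + 1], words[i]
        else:
            words.insert(random.randrange(len(words) + 1), random.choice(FILLERS))
        variants.append(" ".join(words))
    return variants

for v in perturb("set an alarm for seven am tomorrow"):
    print(v)   # a robust model should predict the same intent and slots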
soc2seq: Social Embedding meets Conversation Model
by Bhatia, Parminder; Gavalda, Marsal; Einolghozati, Arash
in Applications programs; Mobile computing
2017
While liking or upvoting a post on a mobile app is easy to do, replying with a written note is much more difficult, due both to the cognitive load of coming up with a meaningful response and to the mechanics of entering the text. Here we present a novel textual reply generation model that goes beyond current auto-reply and predictive text entry models by taking into account the content preferences of the user, the idiosyncrasies of their conversational style, and even the structure of their social graph. Specifically, we have developed two types of models for personalized user interactions: a content-based conversation model, which makes use of location together with user information, and a social-graph-based conversation model, which combines content-based conversation models with social graphs.
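A rough sketch of the conditioning idea behind such a personalized reply model: the reply decoder is initialized not only from the encoded post but also from a user/content embedding and a social-graph embedding. The dimensions and the concatenation scheme are assumptions for illustration, not the soc2seq architecture itself.

# Sketch: combining post, user and social-graph representations to seed a
# reply decoder. Dimensions and wiring are illustrative assumptions.
import torch
import torch.nn as nn

d_text, d_user, d_graph, d_dec = 64, 16, 16, 96

post_encoder = nn.GRU(input_size=32, hidden_size=d_text, batch_first=True)
decoder_init = nn.Linear(d_text + d_user + d_graph, d_dec)

post_tokens = torch.randn(1, 10, 32)   # encoded post (toy embeddings)
user_emb = torch.randn(1, d_user)      # content/location-based user vector
graph_emb = torch.randn(1, d_graph)    # social-graph embedding

_, h = post_encoder(post_tokens)       # h: (1, 1, d_text)
context = torch.cat([h.squeeze(0), user_emb, graph_emb], dim=-1)
decoder_state = torch.tanh(decoder_init(context))  # initial state for a reply decoder
print(decoder_state.shape)             # torch.Size([1, 96])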