Catalogue Search | MBRL
Explore the vast range of titles available.
142 result(s) for "Park, Seong-Bae"
Low-Resourced Alphabet-Level Pivot-Based Neural Machine Translation for Translating Korean Dialects
2025
Developing a machine translator from a Korean dialect to a foreign language presents significant challenges due to the lack of a parallel corpus for direct dialect translation. To solve this issue, this paper proposes a pivot-based machine translation model that consists of two sub-translators. The first sub-translator is a sequence-to-sequence model with minGRU as an encoder and GRU as a decoder. It normalizes a dialect sentence into a standard sentence and employs alphabet-level tokenization. The second sub-translator is a legacy translator, such as an off-the-shelf neural machine translator or an LLM, which translates the normalized standard sentence into a foreign sentence. The effectiveness of the alphabet-level tokenization and the minGRU encoder for the normalization model is demonstrated through empirical analysis. Alphabet-level tokenization proves more effective for Korean dialect normalization than widely used sub-word tokenizations. The minGRU encoder performs comparably to a GRU encoder while being faster and more effective at managing longer token sequences. The pivot-based translation method is also validated through a broad range of experiments, and its effectiveness in translating Korean dialects to English, Chinese, and Japanese is demonstrated empirically.
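The alphabet-level tokenization mentioned above decomposes each Hangul syllable into its constituent jamo, which is plain Unicode arithmetic. The following Python sketch shows that decomposition together with the two-stage pivot pipeline; the two translator functions are hypothetical placeholders, not the paper's code.

```python
# Minimal sketch of the pivot-based pipeline described in the abstract.
# The jamo decomposition is standard Unicode arithmetic; normalize_dialect
# and legacy_translate are hypothetical placeholders.

CHO = [chr(0x1100 + i) for i in range(19)]          # leading consonants
JUNG = [chr(0x1161 + i) for i in range(21)]         # vowels
JONG = [""] + [chr(0x11A8 + i) for i in range(27)]  # trailing consonants

def to_jamo(text: str) -> list[str]:
    """Alphabet-level tokenization: split Hangul syllables into jamo."""
    tokens = []
    for ch in text:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3:        # precomposed Hangul syllable
            idx = code - 0xAC00             # syllable = 588*L + 28*V + T
            tokens.append(CHO[idx // 588])
            tokens.append(JUNG[(idx % 588) // 28])
            if idx % 28:                    # T = 0 means no trailing jamo
                tokens.append(JONG[idx % 28])
        else:
            tokens.append(ch)
    return tokens

def pivot_translate(dialect_sentence: str, target_lang: str) -> str:
    # Stage 1: the seq2seq normalizer (minGRU encoder, GRU decoder in the
    # paper) maps the dialect sentence to a standard Korean pivot sentence.
    standard = normalize_dialect(to_jamo(dialect_sentence))  # hypothetical
    # Stage 2: any off-the-shelf translator handles standard Korean.
    return legacy_translate(standard, target_lang)           # hypothetical
```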
Journal Article
Aspect-Based Sentiment Analysis Using Aspect Map
by Park, Seyoung; Park, Seong-Bae; Noh, Yunseok
in aspect-based sentiment analysis; class activation mapping; Classification
2019
Aspect-based sentiment analysis (ABSA) is the task of classifying the sentiment of a specific aspect in a text. Because a single text usually has multiple aspects that are expressed independently, ABSA is a crucial task for in-depth opinion mining. A key point in solving ABSA is to align sentiment expressions with their proper target aspect in a text. Thus, many recent neural models have applied attention mechanisms to learning this alignment. However, it is problematic to depend solely on attention mechanisms, because most sentiment expressions, such as "nice" and "bad", are too general to be aligned with a proper aspect even through an attention mechanism. To solve this problem, this paper proposes a novel convolutional neural network (CNN)-based aspect-level sentiment classification model consisting of two CNNs. Because sentiment expressions relevant to an aspect usually appear near that aspect's expressions, the proposed model first finds the aspect expressions for a given aspect and then focuses on the sentiment expressions around them to determine the final sentiment of the aspect. The first CNN extracts the positional information of aspect expressions for a target aspect and expresses this information as an aspect map. Even when no data are annotated with direct relations between aspects and their expressions, the aspect map can be obtained effectively by learning it in a weakly supervised manner. The second CNN then classifies the sentiment of the target aspect in a text using the aspect map. The proposed model is evaluated on the SemEval 2016 Task 5 dataset and compared with several baseline models. According to the experimental results, the proposed model not only outperforms the baseline models but also shows state-of-the-art performance on the dataset.
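As a rough illustration of the two-CNN design, here is a hedged PyTorch sketch: the first CNN yields a class-activation-style aspect map over token positions, and the second CNN consumes that map as an extra input channel. All layer sizes are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class AspectMapABSA(nn.Module):
    """Sketch of the two-CNN design in the abstract: CNN-1 locates aspect
    expressions (the aspect map), CNN-2 classifies sentiment with it.
    Layer sizes are illustrative assumptions."""

    def __init__(self, vocab_size, emb_dim=300, n_aspects=12, n_polarities=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # CNN-1: trained with weak supervision from aspect labels; its
        # class activation map over positions becomes the aspect map.
        self.aspect_conv = nn.Conv1d(emb_dim, 128, kernel_size=3, padding=1)
        self.aspect_cls = nn.Linear(128, n_aspects)
        # CNN-2: sentiment classifier conditioned on the aspect map.
        self.senti_conv = nn.Conv1d(emb_dim + 1, 128, kernel_size=3, padding=1)
        self.senti_cls = nn.Linear(128, n_polarities)

    def forward(self, tokens, aspect_id):
        """tokens: (B, T) token ids; aspect_id: int index of target aspect."""
        x = self.embed(tokens).transpose(1, 2)          # (B, emb, T)
        feat = torch.relu(self.aspect_conv(x))          # (B, 128, T)
        w = self.aspect_cls.weight[aspect_id]           # class weights (128,)
        cam = torch.einsum("bct,c->bt", feat, w)        # aspect map (B, T)
        cam = torch.softmax(cam, dim=-1).unsqueeze(1)   # (B, 1, T)
        # Append the map as a channel so CNN-2 focuses near aspect words.
        y = torch.relu(self.senti_conv(torch.cat([x, cam], dim=1)))
        return self.senti_cls(y.max(dim=-1).values)     # (B, n_polarities)
```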
Journal Article
A Multitask-Based Neural Machine Translation Model with Part-of-Speech Tags Integration for Arabic Dialects
by Baniata, Laith H.; Park, Seyoung; Park, Seong-Bae
in Accuracy; Arabic dialects; Arabic language
2018
Statistical machine translation for the Arabic language integrates external linguistic resources such as part-of-speech tags. The current research presents a Bidirectional Long Short-Term Memory (Bi-LSTM)-Conditional Random Fields (CRF) segment-level Arabic dialect POS tagger model, which is integrated into a multitask Neural Machine Translation (NMT) model. The proposed solution for NMT is based on the recently introduced recurrent neural network encoder-decoder NMT model. The study proposes and develops a unified multitask NMT model that shares an encoder between two tasks: translating Arabic Dialect (AD) to Modern Standard Arabic (MSA), and segment-level POS tagging. A shared layer and an invariant layer are shared between the translation tasks. By training the translation and POS tagging tasks alternately, the proposed model can leverage the characteristic information of each task and improve the translation quality from Arabic dialects to Modern Standard Arabic. Experiments are conducted on Levantine Arabic (LA) to MSA and Maghrebi Arabic (MA) to MSA translation tasks. Segment-level part-of-speech tags for Arabic dialects were also exploited as an additional linguistic resource. The experiments suggest that both translation quality and POS tagger performance improve with the multitask learning approach.
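The alternating multitask scheme can be sketched as follows: a shared encoder feeds either a translation head or a POS-tagging head on alternating steps. The sizes and the synthetic batches below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

# Minimal sketch of alternating multitask training with a shared encoder,
# per the abstract. All sizes and the synthetic batches are illustrative.

EMB, HID, MSA_VOCAB, N_TAGS = 256, 256, 32000, 24
shared_encoder = nn.GRU(EMB, HID, batch_first=True)
translation_head = nn.Linear(HID, MSA_VOCAB)    # stand-in for the decoder
pos_head = nn.Linear(HID, N_TAGS)               # segment-level POS tagger

opt = torch.optim.Adam([*shared_encoder.parameters(),
                        *translation_head.parameters(),
                        *pos_head.parameters()], lr=1e-3)
ce = nn.CrossEntropyLoss()

for step in range(100):
    translate = step % 2 == 0                   # alternate the two tasks
    head, n_out = (translation_head, MSA_VOCAB) if translate else (pos_head, N_TAGS)
    src = torch.randn(8, 20, EMB)               # synthetic AD segment encoding
    tgt = torch.randint(0, n_out, (8, 20))      # synthetic MSA tokens / tags
    hidden, _ = shared_encoder(src)             # shared representation
    loss = ce(head(hidden).reshape(-1, n_out), tgt.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()                                  # both tasks shape the encoder
```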
Journal Article
Automated extraction of biomarker information from pathology reports
2018
Background
Pathology reports are written in free-text form, which precludes efficient data gathering. We aimed to overcome this limitation and design an automated system for extracting biomarker profiles from accumulated pathology reports.
Methods
We designed a new data model for representing biomarker knowledge. The automated system parses immunohistochemistry reports based on a “slide paragraph” unit defined as a set of immunohistochemistry findings obtained for the same tissue slide. Pathology reports are parsed using context-free grammar for immunohistochemistry, and using a tree-like structure for surgical pathology. The performance of the approach was validated on manually annotated pathology reports of 100 randomly selected patients managed at Seoul National University Hospital.
Results
High F-scores were obtained for parsing biomarker names and corresponding test results (0.999 and 0.998, respectively) from the immunohistochemistry reports, compared to relatively poor performance for parsing surgical pathology findings. However, applying the proposed approach to our single-center dataset revealed information on 221 unique biomarkers, a richer result than biomarker profiles obtained from the published literature. Owing to the data representation model, the proposed approach can associate biomarker profiles extracted from an immunohistochemistry report with corresponding pathology findings listed in one or more surgical pathology reports. Term variations are resolved by normalization to corresponding preferred terms, determined by expanded dictionary look-up and text similarity-based search (sketched after the Conclusions below).
Conclusions
Our proposed approach for biomarker data extraction addresses key limitations regarding data representation and can handle reports prepared in the clinical setting, which often contain incomplete sentences, typographical errors, and inconsistent formatting.
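One concrete detail from the Results is the term-normalization step: variant biomarker names are mapped to preferred terms first by dictionary look-up and then by text-similarity search. A minimal Python sketch, with an illustrative dictionary:

```python
import difflib

# Sketch of the term normalization described in the Results: dictionary
# look-up first, then text-similarity search. Dictionary entries are
# illustrative, not the system's actual vocabulary.

PREFERRED = {"her2": "HER2", "c-erbb-2": "HER2", "er": "ER",
             "estrogen receptor": "ER", "ki-67": "Ki-67", "mib-1": "Ki-67"}

def normalize_biomarker(raw: str, cutoff: float = 0.8) -> str | None:
    key = raw.strip().lower()
    if key in PREFERRED:                        # expanded dictionary look-up
        return PREFERRED[key]
    # Fall back to text-similarity search over the known variants.
    match = difflib.get_close_matches(key, list(PREFERRED), n=1, cutoff=cutoff)
    return PREFERRED[match[0]] if match else None

print(normalize_biomarker("Ki67"))              # -> 'Ki-67' via similarity
```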
Journal Article
Strong Influence of Responses in Training Dialogue Response Generator
2021
The sequence-to-sequence model is widely used for dialogue response generators, but it tends to generate safe responses for most input queries. Since safe responses are unattractive and boring, a number of efforts have been made to make generators produce diverse responses, but generating diverse responses remains an open problem. As a solution, this paper proposes a novel response generator, Response Generator with Response Weight (RGRW). The proposed generator is a transformer-based sequence-to-sequence model whose encoder is a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model and whose decoder is a variant of GPT-2 (Generative Pre-trained Transformer 2). Since attention on the response is not reflected sufficiently in a transformer-based sequence-to-sequence model, the proposed generator enhances the influence of a response through the response weight, which determines the importance of each token in a query with respect to the response. The decoder then processes the response weight together with the query encoding to generate a diverse response. The effectiveness of RGRW is demonstrated by showing that it generates more diverse and informative responses than the baseline response generator by focusing more on the tokens that are important for generating the response. The proposed model also outperforms the Commonsense Knowledge-Aware Dialogue generation model (ConKADI), a state-of-the-art model.
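One plausible reading of the response weight is a per-token relevance score of the query with respect to the gold response, used to rescale the encoder outputs before decoding. The weighting rule below is an illustrative guess, not the paper's exact formulation.

```python
import torch

# Hedged sketch of the response-weight idea: score each query token by its
# strongest alignment to the response, then rescale the BERT encodings.

def response_weight(query_enc, response_enc):
    """query_enc: (B, Tq, d) BERT encodings; response_enc: (B, Tr, d)."""
    sim = torch.einsum("bqd,brd->bqr", query_enc, response_enc)
    weight = torch.softmax(sim.max(dim=-1).values, dim=-1)   # (B, Tq)
    return query_enc * weight.unsqueeze(-1)     # re-weighted query encoding

q = torch.randn(2, 16, 768)     # query encoded by BERT (d = 768)
r = torch.randn(2, 12, 768)     # gold response encoding, training time only
memory = response_weight(q, r)  # fed to the GPT-2-style decoder
```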
Journal Article
Question Difficulty Estimation Based on Attention Model for Question Answering
2021
This paper addresses question difficulty estimation, the goal of which is to estimate the difficulty level of a given question in question-answering (QA) tasks. Since a question in such tasks is composed of a questionary sentence and a set of information components, such as a description and candidate answers, it is important to model the relationships among the information components to estimate the difficulty level of the question. However, existing approaches model only simple relationships, such as that between a questionary sentence and a description, and such simple relationships are insufficient to predict the difficulty level accurately. Therefore, this paper proposes an attention-based model that considers the complicated relationships among the information components. The proposed model first represents bi-directional relationships between the questionary sentence and each information component using dual multi-head co-attention, since the questionary sentence is the key factor in QA questions and it both affects and is affected by the information components. The proposed model then captures the inter-information relationship over the bi-directional representations through a self-attention model. The inter-information relationship helps accurately predict the difficulty of questions that require reasoning over multiple kinds of information components. Experimental results on three well-known real-world QA data sets show that the proposed model outperforms previous state-of-the-art and pre-trained language model baselines. The proposed model is also shown to be robust to an increasing number of information components.
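The described attention stack, bi-directional co-attention between the questionary sentence and each component followed by self-attention, can be sketched with standard PyTorch modules; the dimensions, head counts, and mean pooling are assumptions.

```python
import torch
import torch.nn as nn

# Sketch of the attention stack in the abstract. Dimensions, head counts,
# and the pooling choice are illustrative assumptions.

d, heads = 256, 4
co_attn = nn.MultiheadAttention(d, heads, batch_first=True)
self_attn = nn.MultiheadAttention(d, heads, batch_first=True)
difficulty_head = nn.Linear(d, 1)

def estimate_difficulty(question, components):
    """question: (B, Tq, d); components: list of (B, Tc, d) tensors."""
    fused = []
    for comp in components:
        q2c, _ = co_attn(question, comp, comp)      # question attends to component
        c2q, _ = co_attn(comp, question, question)  # component attends back
        fused.append(torch.cat([q2c, c2q], dim=1))  # bi-directional view
    joint = torch.cat(fused, dim=1)
    joint, _ = self_attn(joint, joint, joint)       # inter-information relations
    return difficulty_head(joint.mean(dim=1))       # pooled difficulty score

q = torch.randn(2, 20, d)
comps = [torch.randn(2, 30, d), torch.randn(2, 10, d)]  # description, answers
print(estimate_difficulty(q, comps).shape)          # torch.Size([2, 1])
```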
Journal Article
An Approach to Knowledge Base Completion by a Committee-Based Knowledge Graph Embedding
2020
Knowledge bases such as Freebase, YAGO, DBpedia, and NELL contain a large number of facts involving various entities and relations. Since they store many facts, they are regarded as core resources for many natural language processing tasks. Nevertheless, they are normally incomplete and have many missing facts, which keeps them from being used in diverse applications in spite of their usefulness. Completing knowledge bases is therefore an important task. Knowledge graph embedding is one of the promising approaches to completing a knowledge base, and many variants of knowledge graph embedding have been proposed. It maps all entities and relations in a knowledge base onto a low-dimensional vector space, and candidate facts that are plausible in the space are determined to be missing facts. However, no single knowledge graph embedding is sufficient to complete a knowledge base. As a solution to this problem, this paper defines knowledge base completion as a ranking task and proposes a committee-based knowledge graph embedding model for improving the performance of knowledge base completion. Since each knowledge graph embedding has its own idiosyncrasy, we form a committee of various knowledge graph embeddings to reflect various perspectives. After ranking all candidate facts according to their plausibility as computed by the committee, the top-k facts are chosen as missing facts. Experimental results on two data sets show that the proposed model achieves higher performance than any single knowledge graph embedding and performs robustly regardless of k. These results demonstrate that the proposed model considers various perspectives in measuring the plausibility of candidate facts.
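A minimal sketch of the committee idea, assuming each member embedding exposes a plausibility scorer and that per-model scores are min-max normalized before averaging; the aggregation rule is an assumption, not necessarily the paper's.

```python
import numpy as np

def committee_rank(candidates, scorers, k=10):
    """candidates: list of (head, relation, tail) triples; scorers: one
    plausibility function per committee member (higher = more plausible)."""
    votes = []
    for score in scorers:
        s = np.array([score(h, r, t) for h, r, t in candidates], dtype=float)
        # Min-max normalize so no single embedding dominates the committee.
        s = (s - s.min()) / (s.max() - s.min() + 1e-9)
        votes.append(s)
    plausibility = np.mean(votes, axis=0)           # committee consensus
    order = np.argsort(-plausibility)               # best candidates first
    return [candidates[i] for i in order[:k]]       # top-k missing facts
```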
Journal Article
A Topical Category-Aware Neural Text Summarizer
by Kim, So-Eon; Park, Seong-Bae; Kaibalina, Nazira
in Art galleries & museums; attention mechanism; class activation map
2020
The advent of the sequence-to-sequence model and the attention mechanism has increased the comprehension and readability of automatically generated summaries. However, most previous studies on text summarization have focused on generating or extracting sentences only from the original text, even though every text has a latent topic category. That is, even though a topic category can help improve summarization quality, no efforts have been made to utilize such information in text summarization. Therefore, this paper proposes a novel topical category-aware neural text summarizer, differentiated from legacy neural summarizers in that it reflects the topic category of the original text in the generated summary. The proposed summarizer adopts the class activation map (CAM) as the topical influence of the words in the original text. Since the CAM highlights the words relevant to a specific category, it allows the attention mechanism to be influenced by the topic category. As a result, the proposed neural summarizer reflects both the topical and the content information of a text in a summary by combining the attention mechanism and the CAM. Experiments on The New York Times Annotated Corpus show that the proposed model outperforms the legacy attention-based sequence-to-sequence model, which proves that it is effective at reflecting a topic category in automatic summarization.
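One way the CAM could steer the attention mechanism is as an additive bias on the attention logits, sketched below; the mixing rule and all dimensions are assumptions.

```python
import torch

# Hedged sketch: a topic classifier's CAM scores each source word, and the
# score is added to the decoder's attention logits before normalization.

def cam_biased_attention(dec_state, enc_outputs, topic_feat, topic_weights):
    """dec_state: (B, d); enc_outputs: (B, T, d); topic_feat: (B, T, f)
    classifier features; topic_weights: (f,) weights of the predicted topic."""
    cam = torch.einsum("btf,f->bt", topic_feat, topic_weights)  # topical score
    attn = torch.einsum("bd,btd->bt", dec_state, enc_outputs)   # content score
    mix = torch.softmax(attn + cam, dim=-1)     # topically biased attention
    return torch.einsum("bt,btd->bd", mix, enc_outputs)         # context vector

ctx = cam_biased_attention(torch.randn(2, 256), torch.randn(2, 40, 256),
                           torch.randn(2, 40, 64), torch.randn(64))
```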
Journal Article
Enriching Knowledge Base by Parse Tree Pattern and Semantic Filter
2020
This paper proposes a simple knowledge base enrichment method based on parse tree patterns with a semantic filter. Parse tree patterns are superior to the lexical patterns commonly used in many previous studies in that they can manage long-distance dependencies among words. In addition, the proposed semantic filter, a combination of WordNet-based similarity and word embedding similarity, removes parse tree patterns that are semantically irrelevant to the meaning of a target relation. According to our experiments using the DBpedia ontology and Wikipedia corpus, the average accuracy of the top 100 parse tree patterns for ten relations is 68%, which is 16% higher than that of lexical patterns, and the average accuracy of the newly extracted triples is 60.1%. These results demonstrate that the proposed method produces more relevant patterns for the relations of seed knowledge, and thus the patterns generate more accurate triples.
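A minimal sketch of such a semantic filter, combining NLTK's WordNet path similarity with word-embedding cosine similarity; the equal weighting and the threshold are illustrative assumptions.

```python
from itertools import product
import numpy as np
from nltk.corpus import wordnet as wn  # requires the WordNet corpus

# Sketch of the semantic filter: keep a pattern only if its head word is
# close enough to the target relation under a 50/50 mix of WordNet and
# embedding similarity. Mixing weights and threshold are assumptions.

def wordnet_sim(w1, w2):
    """Best path similarity over all synset pairs (0 if none found)."""
    scores = [s1.path_similarity(s2) or 0.0
              for s1, s2 in product(wn.synsets(w1), wn.synsets(w2))]
    return max(scores, default=0.0)

def cosine(v1, v2):
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

def keep_pattern(pattern_word, relation_word, emb, threshold=0.4):
    """emb: mapping from word to embedding vector (e.g., word2vec)."""
    score = (0.5 * wordnet_sim(pattern_word, relation_word)
             + 0.5 * cosine(emb[pattern_word], emb[relation_word]))
    return score >= threshold   # drop semantically irrelevant patterns
```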
Journal Article
Learning Translation-Based Knowledge Graph Embeddings by N-Pair Translation Loss
by Park, Seong-Bae; Kim, A-Yeong; Song, Hyun-Je
in Artificial intelligence; Design and construction; Knowledge
2020
Translation-based knowledge graph embeddings learn vector representations of entities and relations by treating relations as translation operators over the entities in an embedding space. Since the translation is represented through a score function, translation-based embeddings are generally trained by minimizing a margin-based ranking loss, which encourages a low score for positive triples and a high score for negative triples. However, this type of embedding suffers from slow convergence and poor local optima because the loss considers only one pair of a positive and a negative triple at each update of the learning parameters. Therefore, this paper proposes the N-pair translation loss, which considers multiple negative triples at each update. The N-pair translation loss employs multiple negative triples along with one positive triple and allows the positive triple to be compared against the multiple negative triples at every parameter update. As a result, better vector representations can be obtained rapidly. Experimental results on link prediction prove that the proposed loss helps the model converge quickly toward good optima at the early stage of training.
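One plausible formulation of an N-pair translation loss for a TransE-style distance score is sketched below; the exact form is an assumption, not the paper's equation.

```python
import torch

# Hedged sketch: one positive triple is compared against N negatives per
# update; the log-sum-exp term couples all N negatives in a single gradient.

def transe_score(h, r, t):
    """Distance-based score: lower means more plausible."""
    return torch.norm(h + r - t, p=2, dim=-1)

def n_pair_translation_loss(pos, negs):
    """pos: tensors (h, r, t), each (B, d); negs: each (B, N, d)."""
    d_pos = transe_score(*pos)                  # (B,)
    d_neg = transe_score(*negs)                 # (B, N)
    # Penalize whenever the positive is not closer than each negative.
    return torch.log1p(torch.exp(d_pos.unsqueeze(1) - d_neg).sum(dim=1)).mean()

B, N, d = 8, 16, 50
pos = tuple(torch.randn(B, d) for _ in range(3))
negs = tuple(torch.randn(B, N, d) for _ in range(3))
print(n_pair_translation_loss(pos, negs))
```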
Journal Article