451 results for "Chinese language Data processing."
The Chinese computer : a global history of the information age
\"Exploration of the largely unknown history of Chinese-language computing systems, accessible to an audience unfamiliar with the Chinese language or the technical workings of personal computers\"-- Provided by publisher.
HowNet and the computation of meaning
It is widely acknowledged that natural language processing, as an indispensable means for information technology, requires the strong support of world knowledge as well as linguistic knowledge. This book is a theoretical exploration into the extra-linguistic knowledge needed for natural language processing and a panoramic description of HowNet as a case study.
Automatic noun phrase extraction from full Chinese text
In this thesis, a new statistics-based partial parser, CNPext, for the extraction of maximal-length noun phrases in Chinese is presented. Given Chinese running text as input, the CNPext system performs (1) noun phrase boundary determination and (2) ambiguity resolution for relative clause and prepositional phrase modifiers. The noun phrase extraction module consists of two stages: it first finds all boundary candidates, and then pairs the opening and ending candidates to form the final noun phrases. Our system is superior to other noun phrase extraction systems in that it can resolve structural ambiguities, a problem faced by many natural language processing systems; others simply fail because they cannot handle the ambiguities introduced by relative clause and prepositional phrase modifiers. However, our experiments showed that purely statistics-based approaches with part-of-speech tags are not adequate for this purpose; semantic information at a higher level is needed. Our proposed algorithm uses the semantic class relation between a verb-noun (or preposition-noun) pair, derived from the standard Chinese thesaurus, to determine which phrase structure is more semantically acceptable. Our work is the first comprehensive attempt at automatic Chinese noun phrase extraction. It not only proposes an effective way to extract noun phrases automatically from large running texts but also gives an impetus to work in similar areas, e.g., verb phrase extraction. Exploring effective methods for a complete noun phrase extraction system for Chinese is a challenging exercise. We hope this project provides some insight, if not complete solutions, and enables the development of advanced, practical Chinese information processing systems to handle the ever-growing volume of information.
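The two-stage idea the abstract describes (propose boundary candidates, then pair openers with closers into maximal-length spans) can be sketched in a few lines. The POS tag set and the pairing rule below are illustrative assumptions, not CNPext's actual statistical model:

```python
def maximal_nps(tags):
    """Maximal contiguous runs of NP-eligible POS tags that end on a noun.

    Toy stand-in for the candidate-finding + pairing stages: a run of
    NP-eligible tags opens a span, and the span is closed at the last
    noun before a non-NP tag.
    """
    NP_TAGS = {"DET", "ADJ", "NOUN"}   # assumed NP-eligible tags
    spans, start = [], None
    for i, tag in enumerate(tags + ["EOS"]):   # sentinel flushes last run
        if tag in NP_TAGS:
            if start is None:
                start = i
        elif start is not None:
            end = i - 1
            while end >= start and tags[end] != "NOUN":
                end -= 1                       # trim so the span ends on a noun
            if end >= start:
                spans.append((start, end))
            start = None
    return spans

tags = ["DET", "ADJ", "NOUN", "VERB", "NOUN", "NOUN"]
print(maximal_nps(tags))  # [(0, 2), (4, 5)]
```

Note that this toy version cannot resolve the relative-clause and prepositional-phrase attachment ambiguities the thesis targets; that is exactly where its thesaurus-derived semantic class relations come in.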
On the fractal patterns of language structures
Natural Language Processing (NLP) uses Artificial Intelligence algorithms to extract meaningful information from unstructured texts, i.e., content that lacks metadata and cannot easily be indexed or mapped onto standard database fields. It has several applications, from sentiment analysis and text summarization to automatic language translation. In this work, we use NLP to identify similar structural linguistic patterns across several different languages. We apply the word2vec algorithm, which creates a vector representation of words in a multidimensional space that preserves the meaning relationships between them. From a large corpus we built this vector representation in a 100-dimensional space for English, Portuguese, German, Spanish, Russian, French, Chinese, Japanese, Korean, Italian, Arabic, Hebrew, Basque, Dutch, Swedish, Finnish, and Estonian. Then, we calculated the fractal dimensions of the structure that represents each language. The structures are multifractals with two distinct dimensions, which we use, together with the token-to-dictionary size ratio of each language, to represent the languages in a three-dimensional space. Finally, analyzing the distances among languages in this space, we conclude that closeness there tends to correlate with distance in the phylogenetic tree that depicts the lines of evolutionary descent of the languages from a common ancestor.
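The fractal-dimension step can be illustrated with a standard correlation-dimension (Grassberger-Procaccia) estimate. The 100-dimensional word2vec cloud is replaced here by a synthetic filled 2-D cloud (so the expected dimension is about 2); the radius range and sample size are illustrative choices, not the paper's settings:

```python
import numpy as np

def correlation_dimension(points, radii):
    """Grassberger-Procaccia estimate: slope of log C(r) vs log r,
    where C(r) is the fraction of point pairs closer than r."""
    n = len(points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    dists = d[np.triu_indices(n, k=1)]          # unique pairwise distances
    counts = np.array([(dists < r).mean() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

rng = np.random.default_rng(0)
cloud = rng.uniform(size=(800, 2))              # filled 2-D square
radii = np.geomspace(0.05, 0.2, 6)
print(correlation_dimension(cloud, radii))      # slope near 2 for a filled plane
```

For a genuinely fractal set (or, per the paper, a multifractal embedding cloud), the log-log slope departs from the ambient dimension, which is the quantity being compared across languages.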
Using Interactive Virtual Reality Tools in an Advanced Chinese Language Class: a Case Study
This case study explored college students’ use of interactive virtual reality tools (Google Cardboard and Expeditions) for learning Chinese as a foreign language. Specifically, the purpose of the study was to probe into students’ perceived benefits and challenges of using VR tools for Chinese language and culture learning. Twelve students were paired and role-played as virtual tour guides for six locations throughout a semester. Every two weeks, each dyad studied a particular Chinese tourist attraction or location and presented orally in Chinese as virtual tour guides by using the VR tools. Data collection included class observations of all presentations by each dyad, 24 reflections (two per participant, after the first and fifth presentations), and individual follow-up interviews. The study indicated that the real-life view VR tools offered an authentic context for Chinese language learning, sparked interest in the virtually presented locales, and encouraged students to further explore the target culture.
Prosody Dominates Over Semantics in Emotion Word Processing: Evidence From Cross-Channel and Cross-Modal Stroop Effects
Purpose: Emotional speech communication involves multisensory integration of linguistic (e.g., semantic content) and paralinguistic (e.g., prosody and facial expressions) messages. Previous studies on linguistic versus paralinguistic salience effects in emotional speech processing have produced inconsistent findings. In this study, we investigated the relative perceptual saliency of emotion cues in cross-channel auditory alone task (i.e., semantics-prosody Stroop task) and cross-modal audiovisual task (i.e., semantics-prosody-face Stroop task). Method: Thirty normal Chinese adults participated in two Stroop experiments with spoken emotion adjectives in Mandarin Chinese. Experiment 1 manipulated auditory pairing of emotional prosody (happy or sad) and lexical semantic content in congruent and incongruent conditions. Experiment 2 extended the protocol to cross-modal integration by introducing visual facial expression during auditory stimulus presentation. Participants were asked to judge emotional information for each test trial according to the instruction of selective attention. Results: Accuracy and reaction time data indicated that, despite an increase in cognitive demand and task complexity in Experiment 2, prosody was consistently more salient than semantic content for emotion word processing and did not take precedence over facial expression. While congruent stimuli enhanced performance in both experiments, the facilitatory effect was smaller in Experiment 2. Conclusion: Together, the results demonstrate the salient role of paralinguistic prosodic cues in emotion word processing and congruence facilitation effect in multisensory integration. Our study contributes tonal language data on how linguistic and paralinguistic messages converge in multisensory speech processing and lays a foundation for further exploring the brain mechanisms of cross-channel/modal emotion integration with potential clinical applications.
Unlocking the Secrets Behind Advanced Artificial Intelligence Language Models in Deidentifying Chinese-English Mixed Clinical Text: Development and Validation Study
The widespread use of electronic health records in the clinical and biomedical fields makes the removal of protected health information (PHI) essential to maintain privacy. However, a significant portion of information is recorded in unstructured textual forms, posing a challenge for deidentification. In multilingual countries, medical records could be written in a mixture of more than one language, referred to as code mixing. Most current clinical natural language processing techniques are designed for monolingual text, and there is a need to address the deidentification of code-mixed text. The aim of this study was to investigate the effectiveness and underlying mechanism of fine-tuned pretrained language models (PLMs) in identifying PHI in the code-mixed context. Additionally, we aimed to evaluate the potential of prompting large language models (LLMs) for recognizing PHI in a zero-shot manner. We compiled the first clinical code-mixed deidentification data set consisting of text written in Chinese and English. We explored the effectiveness of fine-tuned PLMs for recognizing PHI in code-mixed content, with a focus on whether PLMs exploit naming regularity and mention coverage to achieve superior performance, by probing the developed models' outputs to examine their decision-making process. Furthermore, we investigated the potential of prompt-based in-context learning of LLMs for recognizing PHI in code-mixed text. The developed methods were evaluated on a code-mixed deidentification corpus of 1700 discharge summaries. We observed that different PHI types had preferences in their occurrences within the different types of language-mixed sentences, and PLMs could effectively recognize PHI by exploiting the learned name regularity. However, the models may exhibit suboptimal results when regularity is weak or mentions contain unknown words that the representations cannot generate well. 
We also found that the availability of code-mixed training instances is essential for the model's performance. Furthermore, the LLM-based deidentification method was a feasible and appealing approach that can be controlled and enhanced through natural language prompts. The study contributes to understanding the underlying mechanism of PLMs in addressing the deidentification process in the code-mixed context and highlights the significance of incorporating code-mixed training instances into the model training phase. To support the advancement of research, we created a manipulated subset of the resynthesized data set available for research purposes. Based on the compiled data set, we found that the LLM-based deidentification method is a feasible approach, but carefully crafted prompts are essential to avoid unwanted output. However, the use of such methods in the hospital setting requires careful consideration of data security and privacy concerns. Further research could explore the augmentation of PLMs and LLMs with external knowledge to improve their strength in recognizing rare PHI.
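The study's point that an LLM-based deidentifier "can be controlled and enhanced through natural language prompts" can be made concrete with a prompt-construction sketch. The instruction wording, PHI tag set, and output format below are assumptions for illustration; the study's actual prompts are not reproduced here, and no model call is made:

```python
# Hypothetical PHI categories for the sketch; the study uses its own tag set.
PHI_TAGS = ["NAME", "DATE", "ID", "HOSPITAL", "LOCATION"]

def build_deid_prompt(text):
    """Assemble a zero-shot deidentification prompt for a code-mixed note."""
    tags = ", ".join(PHI_TAGS)
    return (
        "You are a clinical de-identification assistant.\n"
        f"Replace every protected health information span ({tags}) "
        "in the note below with its tag in square brackets. The note may "
        "mix Chinese and English; tag PHI in either language. Return only "
        "the rewritten note.\n\n"
        f"Note:\n{text}"
    )

print(build_deid_prompt("病人 Wang Xiaoming 於 2023-01-05 入院。"))
```

As the abstract cautions, prompts like this must be crafted carefully to avoid unwanted output, and sending real notes to an external model raises the data-security concerns the authors flag.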
ECBTNet: English-Foreign Chinese intelligent translation via multi-subspace attention and hyperbolic tangent LSTM
The translation and sharing of languages around the world has become a necessary precondition for the movement of people. Teaching Chinese as a foreign language (TCFL) serves the international function of spreading national culture, so how to translate Chinese as a foreign language into English has become an important task. Machine translation has moved beyond the realm of theory to practical use as a result of advances in computing, and deep learning, a prominent and relatively young subfield of machine learning, has shown promising results in a variety of fields. This paper aims to develop a TCFL-oriented English-Chinese neural machine translation model. First, it proposes a hyperbolic tangent long short-term memory network (HTLSTM), which integrates future and historical information to extract richer contextual semantic information. Second, it proposes a multi-subspace attention mechanism (MSATT) that integrates multiple attention calculation functions. Third, it combines HTLSTM with MSATT to construct an English-Chinese bilingual neural translation model called ECBTNet. The multi-subspace attention maps the hidden state of the HTLSTM to multiple subspaces and applies different attention calculation functions in different subspaces when computing attention scores, extracting omnidirectional contextual features and yielding accurate attention results. Finally, a systematic experiment is carried out, and the experimental data verify the feasibility of applying ECBTNet to English-Chinese translation in TCFL.
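The multi-subspace attention idea (project the hidden states into several subspaces and use a different score function in each, then concatenate the per-subspace contexts) can be sketched in numpy. The dimensions, the two score functions, and the random projections are assumptions for illustration; ECBTNet's actual projections and score functions are learned:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def subspace_attention(H, query, n_sub=2, seed=0):
    """H: (T, d) hidden states, query: (d,). Returns concatenated contexts."""
    rng = np.random.default_rng(seed)
    T, d = H.shape
    ds = d // n_sub
    score_fns = [
        lambda h, q: h @ q,                        # plain dot product
        lambda h, q: (h @ q) / np.sqrt(len(q)),    # scaled dot product
    ]
    contexts = []
    for k in range(n_sub):
        W = rng.standard_normal((d, ds)) / np.sqrt(d)   # subspace projection
        Hk, qk = H @ W, query @ W
        scores = np.array([score_fns[k % 2](h, qk) for h in Hk])
        a = softmax(scores)                        # attention weights in subspace k
        contexts.append(a @ Hk)                    # (ds,) weighted context
    return np.concatenate(contexts)

H = np.random.default_rng(1).standard_normal((5, 8))    # 5 steps, d=8
ctx = subspace_attention(H, query=H[-1])
print(ctx.shape)  # (8,)
```

The concatenated vector plays the role the paper assigns to the combined subspace contexts: each subspace contributes features scored by its own attention function before they are merged.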