Catalogue Search | MBRL

Zehazki : gaztelania-euskara hiztegia = diccionario castellano-euskera

by Sarasola, Ibon author in Basque language Dictionaries Spanish , Spanish language Dictionaries Basque , Basque language Texts

Standard Basque : a progressive grammar

by Coene, Armand de , Rijk, Rudolf P. G. de in Basque language , Basque language -- Grammar , Basque language -- Textbooks for foreign speakers -- English

2008, 2007

The first modern pedagogically oriented reference to the grammar of standard Basque (Euskara Batua), in two volumes: Part 1 presents detailed grammar lessons, Part 2 glosses and supplementary materials.

eBook

Simplicissimus

by Telleria, Patxo author in Theater , Basque language Texts

Assessment of the E3C corpus for the recognition of disorders in clinical texts

by Lavelli, Alberto , Toti, Daniele , Zanoli, Roberto in Abdominal pain , Acknowledgment , Annotations

2024

Disorder named entity recognition (DNER) is a fundamental task of biomedical natural language processing, which has attracted plenty of attention. This task consists in extracting named entities of disorders such as diseases, symptoms, and pathological functions from unstructured text. The European Clinical Case Corpus (E3C) is a freely available multilingual corpus (English, French, Italian, Spanish, and Basque) of semantically annotated clinical case texts. The entities of type disorder in the clinical cases are annotated at both mention and concept level. At mention level, the annotation identifies the entity text spans, for example, abdominal pain. At concept level, the entity text spans are associated with their concept identifiers in the Unified Medical Language System, for example, C0000737. This corpus can be exploited as a benchmark for training and assessing information extraction systems. Within the context of the present work, multiple experiments have been conducted in order to test the appropriateness of the mention-level annotation of the E3C corpus for training DNER models. In these experiments, traditional machine learning models like conditional random fields and more recent multilingual pre-trained models based on deep learning were compared with standard baselines. With regard to the multilingual pre-trained models, they were fine-tuned (i) on each language of the corpus to test per-language performance, (ii) on all languages to test multilingual learning, and (iii) on all languages except the target language to test cross-lingual transfer learning. Results show the appropriateness of the E3C corpus for training a system capable of mining disorder entities from clinical case texts. Researchers can use these results as the baselines for this corpus to compare their own models. The implemented models have been made available through the European Language Grid platform for quick and easy access.

Journal Article

Joemak eta polasak

by Astiz, Íñigo 1985- author , Mutuberria, Maite 1985- illustrator in Children's poetry, Basque 21st century , Basque language Texts

The corpus of Basque simplified texts (CBST)

by Gonzalez-Dios, Itziar , Aranzabe, María Jesús , Díaz de Ilarraza, Arantza in Annotations , Basque language , Basque people

2018

In this paper we present the corpus of Basque simplified texts. This corpus compiles 227 original sentences from the science popularisation domain and two simplified versions of each sentence. The simplified versions have been created following different approaches: the structural, by a court translator who considers easy-to-read guidelines, and the intuitive, by a teacher based on her experience. The aim of this corpus is to make a comparative analysis of simplified text. To that end, we also present the annotation scheme we have created to annotate the corpus. The annotation scheme is divided into eight macro-operations: delete, merge, split, transformation, insert, reordering, no operation and other. These macro-operations can be classified into different operations. We also relate our work and results to other languages. This corpus will be used to corroborate the decisions taken and to improve the design of the automatic text simplification system for Basque.

Journal Article

Etxeak eta hilobiak

by Atxaga, Bernardo author in Basque fiction 21st century , Basque literature 21st century , Basque language Texts

Exploring Automatic Readability Assessment for Science Documents within a Multilingual Educational Context

by Aranberri, Nora , Aldabe, Itziar , Uçar, Suna-Şeyma in Academic achievement , Accuracy , Active Learning

2024

Current student-centred, multilingual, active teaching methodologies require that teachers have continuous access to texts that are adequate in terms of topic and language competence. However, the task of finding appropriate materials is arduous and time consuming for teachers. To build on automatic readability assessment research that could help to assist teachers, we explore the performance of natural language processing approaches when dealing with educational science documents for secondary education. Currently, readability assessment is mainly explored in English. In this work we extend our research to Basque and Spanish together with English by compiling context-specific corpora and then testing the performance of feature-based machine-learning and deep learning models. Based on the evaluation of our results, we find that our models do not generalize well although deep learning models obtain better accuracy and F1 in all configurations. Further research in this area is still necessary to determine reliable characteristics of training corpora and model parameters to ensure generalizability.

Journal Article

Itsas bizimina

by Otxoteko, Pello, 1970- author in Basque poetry 21st century , Basque language Texts , Basque literature 21st century

Measuring language distance for historical texts in Basque

by Estarrona, Ainara , Padilla-Moyano, Manuel , Etxeberria, Izaskun in Basque language , Basque people , Corpus linguistics

2023

Measuring distance between languages, dialects and language varieties, both synchronically and diachronically, is a topic of growing interest in NLP. Based on our Syntactically Annotated Historical COrpus in BAsque (SAHCOBA) and previous work in perplexity-based language distance proposed by Gamallo, Pichel and Alegria (2017, 2020), we have compared historical corpora with current texts in the standard variety and calculated the language distances between them. As the standard Basque is based on the central dialects, the starting hypothesis is that the oldest texts and the dialects on the extremes will be the most distant. The results obtained have largely confirmed the thesis of traditional dialectology: peripheral dialects show a strong idiosyncrasy and are more distant from the rest.

Journal Article
