Catalogue Search | MBRL
Search Results
Explore the vast range of titles available.
92,024 result(s) for "natural language processing"
Review of natural language processing techniques for characterizing positive energy districts
by Han, Mengjie; Zhang, Xingxing; Shah, Juveria
in Emissions; Modelling; Natural language processing
2023
The concept of Positive Energy Districts (PEDs) has emerged as a crucial aspect of endeavours aimed at accelerating the transition to zero carbon emissions and climate-neutral living spaces. The focus of research has shifted from energy-efficient individual buildings to entire districts, where the objective is to achieve a positive energy balance over a specific timeframe. The consensus on the conceptualization of a PED is still evolving, and a standardized checklist for identifying and evaluating its constituent elements has yet to be established. This study aims to develop a methodology for characterizing PEDs by leveraging natural language processing (NLP) techniques to model, extract, and map these elements. Furthermore, a review of state-of-the-art research papers is conducted to ascertain their contribution to assessing the effectiveness of NLP models. The findings indicate that NLP holds significant potential in modelling the majority of the identified elements across various domains. To establish a systematic framework for AI modelling, it is crucial to adopt approaches that integrate established and innovative techniques for PED characterization. Such an approach would enable a comprehensive and effective implementation of NLP within the context of PEDs, facilitating the creation of sustainable and resilient urban environments.
Journal Article
Introduction to natural language processing
\"The book provides a technical perspective on the most contemporary data-driven approaches, focusing on techniques from supervised and unsupervised machine learning. It also includes background in the salient linguistic issues, as well as computational representations and algorithms. The first section of the book explores what can be with individual words. The second section concerns structured representations such as sequences, trees, and graphs. The third section highlights different approaches to the representation and analysis of linguistic meaning. The final section describes three of the most transformative applications of natural language processing: information extraction, machine translation, and text generation. The book describes the technical foundations of the field, including the most relevant machine learning techniques, algorithms, and linguistic representations. From these foundations, it extends to contemporary research in areas such as deep learning. Each chapter contains exercises that include paper-and-pencil analysis of the computational algorithms and linguistic issues, as well as software implementations\"-- Provided by publisher.
The 2019 n2c2/OHNLP Track on Clinical Semantic Textual Similarity: Overview
2020
Semantic textual similarity is a common task in the general English domain to assess the degree to which the underlying semantics of 2 text segments are equivalent to each other. Clinical Semantic Textual Similarity (ClinicalSTS) is the semantic textual similarity task in the clinical domain that attempts to measure the degree of semantic equivalence between 2 snippets of clinical text. Due to the frequent use of templates in the Electronic Health Record system, a large amount of redundant text exists in clinical notes, making ClinicalSTS crucial for the secondary use of clinical text in downstream clinical natural language processing applications, such as clinical text summarization, clinical semantics extraction, and clinical information retrieval.
Our objective was to release ClinicalSTS data sets and to motivate natural language processing and biomedical informatics communities to tackle semantic text similarity tasks in the clinical domain.
We organized the first BioCreative/OHNLP ClinicalSTS shared task in 2018 by making available a real-world ClinicalSTS data set. We continued the shared task in 2019 in collaboration with National NLP Clinical Challenges (n2c2) and the Open Health Natural Language Processing (OHNLP) consortium and organized the 2019 n2c2/OHNLP ClinicalSTS track. We released a larger ClinicalSTS data set comprising 1642 clinical sentence pairs, including 1068 pairs from the 2018 shared task and 1006 new pairs from 2 electronic health record systems, GE and Epic. We released 80% (1642/2054) of the data to participating teams to develop and fine-tune the semantic textual similarity systems and used the remaining 20% (412/2054) as blind testing to evaluate their systems. The workshop was held in conjunction with the American Medical Informatics Association 2019 Annual Symposium.
Of the 78 international teams that signed on to the n2c2/OHNLP ClinicalSTS shared task, 33 produced a total of 87 valid system submissions. The top 3 systems were generated by IBM Research, the National Center for Biotechnology Information, and the University of Florida, with Pearson correlations of r=.9010, r=.8967, and r=.8864, respectively. Most top-performing systems used state-of-the-art neural language models, such as BERT and XLNet, and state-of-the-art training schemas in deep learning, such as pretraining and fine-tuning schema, and multitask learning. Overall, the participating systems performed better on the Epic sentence pairs than on the GE sentence pairs, despite a much larger portion of the training data being GE sentence pairs.
The 2019 n2c2/OHNLP ClinicalSTS shared task focused on computing semantic similarity for clinical text sentences generated from clinical notes in the real world. It attracted a large number of international teams. The ClinicalSTS shared task could continue to serve as a venue for researchers in natural language processing and medical informatics communities to develop and improve semantic textual similarity techniques for clinical text.
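As a rough illustration of the semantic textual similarity task itself (not of any participating system), the sketch below scores two clinical-style sentences with an off-the-shelf sentence encoder; the model name and the sentence pair are assumptions for illustration only.

```python
# Minimal STS sketch: score the similarity of two clinical-style sentences.
# The model name and the example sentences are illustrative assumptions; they
# do not reproduce the shared-task data or any submitted system.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # generic sentence encoder

pair = [
    "Patient denies chest pain, shortness of breath, or palpitations.",
    "No chest pain, dyspnea, or palpitations reported by the patient.",
]

# Encode both sentences and compute cosine similarity in [-1, 1]; STS systems
# typically map such scores onto the 0-5 human annotation scale.
embeddings = model.encode(pair, convert_to_tensor=True)
score = util.cos_sim(embeddings[0], embeddings[1]).item()
print(f"cosine similarity: {score:.3f}")
```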
Journal Article
The handbook of computational linguistics and natural language processing
by Clark, Alexander; Fox, Chris; Lappin, Shalom
in Computational; Computational linguistics; Language Arts & Disciplines
2010, 2013
This comprehensive reference work provides an overview of the concepts, methodologies, and applications in computational linguistics and natural language processing (NLP). Features contributions by the top researchers in the field, reflecting the work that is driving the discipline forward. Includes an introduction to the major theoretical issues in these fields, as well as the central engineering applications that the work has produced. Presents the major developments in an accessible way, explaining the close connection between scientific understanding of the computational properties of natural language and the creation of effective language technologies. Serves as an invaluable state-of-the-art reference source for computational linguists and software engineers developing NLP applications in industrial research and development labs of software companies.
Natural language processing recipes : unlocking text data with machine learning and deep learning using Python
Implement natural language processing applications with Python using a problem-solution approach. This book has numerous coding exercises that will help you to quickly deploy natural language processing techniques, such as text classification, parts-of-speech identification, topic modeling, text summarization, and sentiment analysis. "Natural language processing recipes" starts by offering solutions for cleaning and preprocessing text data and ways to analyze it with advanced algorithms. You'll see practical applications of the semantic as well as syntactic analysis of text, as well as complex natural language processing approaches that involve text normalization, advanced preprocessing, POS tagging, parsing, text summarization, and sentiment analysis. You will also learn various applications of machine learning and deep learning in natural language processing. By using the recipes in this book, you will have a toolbox of solutions to apply to your own projects in the real world, making your development time quicker and more efficient. You will: Apply NLP techniques using Python libraries such as NLTK, TextBlob, spaCy, Stanford CoreNLP, and many more; Implement the concepts of information retrieval, text summarization, sentiment analysis, and other advanced natural language processing techniques; Identify machine learning and deep learning techniques for natural language processing and natural language generation problems.
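As a taste of the recipe format the description refers to, here is a minimal sketch that strings together tokenization, POS tagging, and sentiment scoring with NLTK and TextBlob; the sample sentence is an assumption for illustration.

```python
# Recipe-style sketch: basic preprocessing, part-of-speech tagging, and
# sentiment scoring. The sample sentence is an illustrative assumption.
import nltk
from textblob import TextBlob

# One-time downloads of the tokenizer and tagger models used below.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

text = "The new library catalogue makes finding NLP titles surprisingly easy."

tokens = nltk.word_tokenize(text)             # tokenization
pos_tags = nltk.pos_tag(tokens)               # part-of-speech identification
polarity = TextBlob(text).sentiment.polarity  # sentiment polarity in [-1, 1]

print(pos_tags)
print(f"sentiment polarity: {polarity:+.2f}")
```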
University Student Dropout Prediction Using Pretrained Language Models
2023
Predicting student dropout from universities is an imperative but challenging task. Numerous data-driven approaches that utilize both student demographic information (e.g., gender, nationality, and high school graduation year) and academic information (e.g., GPA, participation in activities, and course evaluations) have shown meaningful results. Recently, pretrained language models have achieved very successful results in understanding the tasks associated with structured data as well as textual data. In this paper, we propose a novel student dropout prediction framework based on demographic and academic information, using a pretrained language model to capture the relationship between different forms of information. To this end, we first formulate both types of information in natural language form. We then recast the student dropout prediction task as a natural language inference (NLI) task. Finally, we fine-tune the pretrained language models to predict student dropout. In particular, we further enhance the model using a continuous hypothesis. The experimental results demonstrate that the proposed model is effective for the freshmen dropout prediction task. The proposed method exhibits significant improvements of as much as 9.00% in terms of F1-score compared with state-of-the-art techniques.
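To make the NLI recasting concrete, the following sketch verbalizes a fictional student record as a premise and scores a dropout hypothesis with an off-the-shelf NLI model; the model name, field names, and labels are assumptions for illustration, not the fine-tuned system described in the paper.

```python
# Hedged sketch of recasting tabular student data as natural language
# inference (NLI): the record becomes the premise, "this student will drop
# out" the hypothesis. Model, fields, and labels are illustrative assumptions.
from transformers import pipeline

nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

record = {"gender": "female", "nationality": "Korean",
          "first_semester_gpa": 2.1, "club_activities": "none"}

# Verbalize the structured record as a premise sentence.
premise = (
    f"The student is {record['gender']}, of {record['nationality']} nationality, "
    f"earned a first-semester GPA of {record['first_semester_gpa']}, "
    f"and participated in {record['club_activities']} club activities."
)

result = nli(premise,
             candidate_labels=["will drop out", "will stay enrolled"],
             hypothesis_template="This student {}.")
print(dict(zip(result["labels"], [round(s, 3) for s in result["scores"]])))
```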
Journal Article
Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification
by van Osch, Dirk; van der Harst, Pim; Teske, Arco
in Annotations; Artificial intelligence; Automatic classification
2025
Background
Clinical machine learning research and artificial intelligence driven clinical decision support models rely on clinically accurate labels. Manually extracting these labels with the help of clinical specialists is often time-consuming and expensive. This study tests the feasibility of automatic span- and document-level diagnosis extraction from unstructured Dutch echocardiogram reports.
Methods
We included 115,692 unstructured echocardiogram reports from the University Medical Center Utrecht, a large university hospital in the Netherlands. A randomly selected subset was manually annotated for the occurrence and severity of eleven commonly described cardiac characteristics. We developed and tested several automatic labelling techniques at both span and document levels, using weighted and macro F1-score, precision, and recall for performance evaluation. We compared the performance of span labelling against document labelling methods, which included both direct document classifiers and indirect document classifiers that rely on span classification results.
Results
The SpanCategorizer and MedRoBERTa.nl models outperformed all other span and document classifiers, respectively. The weighted F1-score varied between characteristics, ranging from 0.60 to 0.93 in SpanCategorizer and 0.96 to 0.98 in MedRoBERTa.nl. Direct document classification was superior to indirect document classification using span classifiers. SetFit achieved competitive document classification performance using only 10% of the training data. Utilizing a reduced label set yielded near-perfect document classification results.
Conclusion
We recommend using our published SpanCategorizer and MedRoBERTa.nl models for span- and document-level diagnosis extraction from Dutch echocardiography reports. For settings with limited training data, SetFit may be a promising alternative for document classification. Future research should aim at training a RoBERTa-based span classifier and applying English-based models to translated echocardiogram reports.
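For orientation, the sketch below shows how span-level diagnosis labels of this kind can be represented and a SpanCategorizer component registered in spaCy; the label names, example sentence, and configuration are assumptions for illustration and are not the authors' published models.

```python
# Minimal sketch of span-level diagnosis annotation plus a SpanCategorizer
# component in spaCy. Labels, the example sentence, and the spans_key are
# illustrative assumptions, not the published pipeline.
import spacy
from spacy.tokens import Span

nlp = spacy.blank("nl")  # blank Dutch pipeline
spancat = nlp.add_pipe("spancat", config={"spans_key": "sc"})
for label in ("aortic_stenosis_moderate", "lv_function_normal"):
    spancat.add_label(label)

# Annotated training text attaches labelled spans under the same key,
# e.g. marking a diagnosis mention inside a report sentence.
doc = nlp.make_doc("Matig ernstige aortaklepstenose, normale linkerventrikelfunctie.")
doc.spans["sc"] = [Span(doc, 0, 3, label="aortic_stenosis_moderate")]
print([(s.text, s.label_) for s in doc.spans["sc"]])
```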
Journal Article