Catalogue Search | MBRL

The book of the cat

by Hyland, Angus, author , Roberts, Caroline, author in Cats in art. , Cats in art History. , Cats Pictorial works.

Book

Getting More Out of Biomedical Documents with GATE's Full Lifecycle Open Source Text Analytics

by Tablan, Valentin , Bontcheva, Kalina , Roberts, Angus in Biology , Biomedical Research , Cloud computing

2013

This software article describes the GATE family of open source text analysis tools and processes. GATE is one of the most widely used systems of its type with yearly download rates of tens of thousands and many active users in both academic and industrial contexts. In this paper we report three examples of GATE-based systems operating in the life sciences and in medicine. First, in genome-wide association studies which have contributed to discovery of a head and neck cancer mutation association. Second, medical records analysis which has significantly increased the statistical power of treatment/outcome models in the UK's largest psychiatric patient cohort. Third, richer constructs in drug-related searching. We also explore the ways in which the GATE family supports the various stages of the lifecycle present in our examples. We conclude that the deployment of text mining for document abstraction or rich search and navigation is best thought of as a process, and that with the right computational tools and data collection strategies this process can be made defined and repeatable. The GATE research programme is now 20 years old and has grown from its roots as a specialist development tool for text processing to become a rather comprehensive ecosystem, bringing together software developers, language engineers and research staff from diverse fields. GATE now has a strong claim to cover a uniquely wide range of the lifecycle of text analysis systems. It forms a focal point for the integration and reuse of advances that have been made by many people (the majority outside of the authors' own group) who work in text processing for biomedicine and other areas. GATE is available online under GNU open source licences and runs on all major operating systems. Support is available from an active user and developer community and also on a commercial basis.

Journal Article

The book of the horse : horses in art

by Hyland, Angus, author , Roberts, Caroline, 1966- author in Horses in art.

Horses have been the inspiration for hundreds of works of art over the centuries. This book celebrates the horse in all its glory, from its role as working animal in war and sport, to beloved pet. The horse's beauty and majesty is caught by artists as diverse as Stubbs, Toulouse-Lautrec, Picasso, Magritte, Frink, and Freud. 'The Book of the Horse' is a cool and quirky collection of equine art and illustration by artists from around the world. Interspersed through the illustrations are short texts about the artists and their subjects. Beautifully designed and packaged, the book will appeal to horse lovers of all ages.

Book

Distributions of recorded pain in mental health records: a natural language processing based study

by Roberts, Angus , Chaturvedi, Jaya , Ashworth, Mark in Bipolar disorder , Chronic Pain , Comorbidity

2024

Objective: The objective of this study is to determine demographic and diagnostic distributions of physical pain recorded in clinical notes of a mental health electronic health records database by using natural language processing and examine the overlap in recorded physical pain between primary and secondary care. Design, setting and participants: The data were extracted from an anonymised version of the electronic health records of a large secondary mental healthcare provider serving a catchment of 1.3 million residents in south London. These included patients under active referral, aged 18+ at the index date of 1 July 2018 and having at least one clinical document (≥30 characters) between 1 July 2017 and 1 July 2019. This cohort was compared with linked primary care records from one of the four local government areas. Outcome: The primary outcome of interest was the presence of recorded physical pain within the clinical notes of the patients, not including psychological or metaphorical pain. Results: A total of 27 211 patients were retrieved. Of these, 52% (14 202) had narrative text containing relevant mentions of physical pain. Older patients (OR 1.17, 95% CI 1.15 to 1.19), females (OR 1.42, 95% CI 1.35 to 1.49), Asians (OR 1.30, 95% CI 1.16 to 1.45) or black (OR 1.49, 95% CI 1.40 to 1.59) ethnicities, living in deprived neighbourhoods (OR 1.64, 95% CI 1.55 to 1.73) showed higher odds of recorded pain. Patients with severe mental illnesses were found to be less likely to report pain (OR 0.43, 95% CI 0.41 to 0.46, p<0.001). 17% of the cohort from secondary care also had records from primary care. Conclusion: The findings of this study show sociodemographic and diagnostic differences in recorded pain. Specifically, lower documentation across certain groups indicates the need for better screening protocols and training on recognising varied pain presentations. Additionally, targeting improved detection of pain for minority and disadvantaged groups by care providers can promote health equity.

Journal Article

Natural language processing to extract symptoms of severe mental illness from clinical text: the Clinical Record Interactive Search Comprehensive Data Extraction (CRIS-CODE) project

by Kolliakou, Anna , Ball, Michael , Jayatilleke, Nishamali in Bipolar disorder , Business intelligence , Catatonia

2017

Objectives: We sought to use natural language processing to develop a suite of language models to capture key symptoms of severe mental illness (SMI) from clinical text, to facilitate the secondary use of mental healthcare data in research. Design: Development and validation of information extraction applications for ascertaining symptoms of SMI in routine mental health records using the Clinical Record Interactive Search (CRIS) data resource; description of their distribution in a corpus of discharge summaries. Setting: Electronic records from a large mental healthcare provider serving a geographic catchment of 1.2 million residents in four boroughs of south London, UK. Participants: The distribution of derived symptoms was described in 23 128 discharge summaries from 7962 patients who had received an SMI diagnosis, and 13 496 discharge summaries from 7575 patients who had received a non-SMI diagnosis. Outcome measures: Fifty SMI symptoms were identified by a team of psychiatrists for extraction based on salience and linguistic consistency in records, broadly categorised under positive, negative, disorganisation, manic and catatonic subgroups. Text models for each symptom were generated using the TextHunter tool and the CRIS database. Results: We extracted data for 46 symptoms with a median F1 score of 0.88. Four symptom models performed poorly and were excluded. From the corpus of discharge summaries, it was possible to extract symptomatology in 87% of patients with SMI and 60% of patients with non-SMI diagnosis. Conclusions: This work demonstrates the possibility of automatically extracting a broad range of SMI symptoms from English text discharge summaries for patients with an SMI diagnosis. Descriptive data also indicated that most symptoms cut across diagnoses, rather than being restricted to particular groups.

Journal Article

CogStack - experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital

by Groza, Tudor , Stringer, Clive , Jackson, Richard in Algorithms , Analysis , Architecture

2018

Background: Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the significance of the modern information economy expands in scope and permeates the healthcare domain, there is an increasing urgency for healthcare organisations to offer information systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined as the 3R principle (right data, right place, right time) to address deficiencies in organisational data flow while retaining the strict information governance policies that apply within the UK National Health Service (NHS). Here, we describe our work on creating and deploying a low cost structured and unstructured information retrieval and extraction architecture within King’s College Hospital, the management of governance concerns and the associated use cases and cost saving opportunities that such components present. Results: To date, our CogStack architecture has processed over 300 million lines of clinical data, making it available for internal service improvement projects at King’s College London. On generated data designed to simulate real world clinical text, our de-identification algorithm achieved up to 94% precision and up to 96% recall. Conclusion: We describe a toolkit which we feel is of huge value to the UK (and beyond) healthcare community. It is the only open source, easily deployable solution designed for the UK healthcare environment, in a landscape populated by expensive proprietary systems. Solutions such as these provide a crucial foundation for the genomic revolution in medicine.

Journal Article

Can natural language processing models extract and classify instances of interpersonal violence in mental healthcare electronic records: an applied evaluative study

by Williams, Marcus V , Velupillai, Sumithra , Botelle, Riley in Algorithms , Annotations , Archives & records

2022

Objective: This paper evaluates the application of a natural language processing (NLP) model for extracting clinical text referring to interpersonal violence using electronic health records (EHRs) from a large mental healthcare provider. Design: A multidisciplinary team iteratively developed guidelines for annotating clinical text referring to violence. Keywords were used to generate a dataset which was annotated (ie, classified as affirmed, negated or irrelevant) for: presence of violence, patient status (ie, as perpetrator, witness and/or victim of violence) and violence type (domestic, physical and/or sexual). An NLP approach using a pretrained transformer model, BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) was fine-tuned on the annotated dataset and evaluated using 10-fold cross-validation. Setting: We used the Clinical Records Interactive Search (CRIS) database, comprising over 500 000 de-identified EHRs of patients within the South London and Maudsley NHS Foundation Trust, a specialist mental healthcare provider serving an urban catchment area. Participants: Searches of CRIS were carried out based on 17 predefined keywords. Randomly selected text fragments were taken from the results for each keyword, amounting to 3771 text fragments from the records of 2832 patients. Outcome measures: We estimated precision, recall and F1 score for each NLP model. We examined sociodemographic and clinical variables in patients giving rise to the text data, and frequencies for each annotated violence characteristic. Results: Binary classification models were developed for six labels (violence presence, perpetrator, victim, domestic, physical and sexual). Among annotations affirmed for the presence of any violence, 78% (1724) referred to physical violence, 61% (1350) referred to patients as perpetrator and 33% (731) to domestic violence. NLP models’ precision ranged from 89% (perpetrator) to 98% (sexual); recall ranged from 89% (victim, perpetrator) to 97% (sexual). Conclusions: State of the art NLP models can extract and classify clinical text on violence from EHRs at acceptable levels of scale, efficiency and accuracy.

Journal Article

Dementia-related volumetric assessments in neuroradiology reports: a natural language processing-based study

by Mayers, Adam John , Booth, Christopher , Roberts, Angus in Aged , Aged, 80 and over , Algorithms

2025

Objectives: Structural MRI of the brain is routinely performed on patients referred to memory clinics; however, resulting radiology reports, including volumetric assessments, are conventionally stored as unstructured free text. We sought to use natural language processing (NLP) to extract text relating to intracranial volumetric assessment from brain MRI text reports to enhance routine data availability for research purposes. Setting: Electronic records from a large mental healthcare provider serving a geographic catchment of 1.3 million residents in four boroughs of south London, UK. Design: A corpus of 4007 de-identified brain MRI reports from patients referred to memory assessment services. An NLP algorithm was developed, using a span categorisation approach, to extract six binary (presence/absence) categories from the text reports: (i) global volume loss, (ii) hippocampal/medial temporal lobe volume loss and (iii) other lobar/regional volume loss. Distributions of these categories were evaluated. Results: The overall F1 score for the six categories was 0.89 (precision 0.92, recall 0.86), with the following precision/recall for each category: presence of global volume loss 0.95/0.95, absence of global volume loss 0.94/0.77, presence of regional volume loss 0.80/0.58, absence of regional volume loss 0.91/0.93, presence of hippocampal volume loss 0.90/0.88, and absence of hippocampal volume loss 0.94/0.92. Conclusions: These results support the feasibility and accuracy of using NLP techniques to extract volumetric assessments from radiology reports, and the potential for automated generation of novel meta-data from dementia assessments in electronic health records.

Journal Article

Generation and evaluation of artificial mental health records for Natural Language Processing

by Verma, Somain , Velupillai, Sumithra , Puntis, Stephen in 692/308 , 706/648 , Biomedicine

2020

A serious obstacle to the development of Natural Language Processing (NLP) methods in the clinical domain is the accessibility of textual data. The mental health domain is particularly challenging, partly because clinical documentation relies heavily on free text that is difficult to de-identify completely. This problem could be tackled by using artificial medical data. In this work, we present an approach to generate artificial clinical documents. We apply this approach to discharge summaries from a large mental healthcare provider and discharge summaries from an intensive care unit. We perform an extensive intrinsic evaluation where we (1) apply several measures of text preservation; (2) measure how much the model memorises training data; and (3) estimate clinical validity of the generated text based on a human evaluation task. Furthermore, we perform an extrinsic evaluation by studying the impact of using artificial text in a downstream NLP text classification task. We found that using this artificial data as training data can lead to classification results that are comparable to the original results. Additionally, using only a small amount of information from the original data to condition the generation of the artificial data is successful, which holds promise for reducing the risk of these artificial data retaining rare information from the original data. This is an important finding for our long-term goal of being able to generate artificial clinical data that can be released to the wider research community and accelerate advances in developing computational methods that use healthcare data.

Journal Article

A survey on clinical natural language processing in the United Kingdom from 2007 to 2022

by Poon, Michael T. C. , Dong, Hang , Collier, Nigel in 639/705/1042 , 692/308/575 , Biomedicine

2022

Much of the knowledge and information needed for enabling high-quality clinical research is stored in free-text format. Natural language processing (NLP) has been used to extract information from these sources at scale for several decades. This paper aims to present a comprehensive review of clinical NLP for the past 15 years in the UK to identify the community, depict its evolution, analyse methodologies and applications, and identify the main barriers. We collect a dataset of clinical NLP projects (n = 94; £ = 41.97 m) funded by UK funders or the European Union’s funding programmes. Additionally, we extract details on 9 funders, 137 organisations, 139 persons and 431 research papers. Networks are created from timestamped data interlinking all entities, and network analysis is subsequently applied to generate insights. 431 publications are identified as part of a literature review, of which 107 are eligible for final analysis. Results show, not surprisingly, that clinical NLP in the UK has increased substantially in the last 15 years: the total budget in the period of 2019–2022 was 80 times that of 2007–2010. However, effort is required to deepen areas such as disease (sub-)phenotyping and broaden application domains. There is also a need to improve links between academia and industry and enable deployments in real-world settings for the realisation of clinical NLP’s great potential in care delivery. The major barriers include research and development access to hospital data, lack of capable computational resources in the right places, the scarcity of labelled data and barriers to sharing of pretrained models.

Journal Article
