
Applications of Natural Language Processing and Large Language Models for Social Determinants of Health: A Systematic Review (Preprint)

by Zhang, Ziyuan; Chen, Yankai; Sarker, Abeed; Rogers, Hannah; Rajwal, Swati; Pandey, Avinash Kumar; Liu, Michael X.; Xiao, Yunyu; Das, Sudeshna

in Development and Evaluation of Research Methods, Instruments and Tools / Digital Health Reviews / Electronic Health Records / Electronic records / Humans / Large Language Models / Medical records / Natural Language Processing / Public health / Registered Report / Review / Social Determinants of Health / Social media / Technology application

2026

Journal Article


Overview

Social determinants of health (SDOH) are the social, economic, and environmental conditions that influence health outcomes. SDOH information is often embedded in unstructured text, such as notes in electronic health records and social media posts. Advances in natural language processing (NLP), including emergent large language models (LLMs), offer opportunities to extract, analyze, and interpret SDOH expressions from free text for inclusion in downstream analyses. Existing literature on NLP applications for SDOH is dispersed across disciplines and characterized by methodological heterogeneity and variability in study quality and scope, complicating synthesis and cross-study comparison.

This study aimed to examine the use of NLP, including LLMs, in SDOH research, and to highlight gaps and future research directions.

We conducted a systematic review following PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines, searching 7 major databases for publications between 2014 and November 2025. We included journal and conference proceedings papers that applied NLP methods to identify, classify, extract, or predict SDOH from text. Three reviewers independently screened studies and extracted data; conflicts were resolved by two senior reviewers. We abstracted study metadata, dataset characteristics, NLP approaches, SDOH domains addressed, and NLP performance metrics. We also conducted risk-of-bias analyses and identified influential studies based on relative citation counts.

A total of 142 studies met the inclusion criteria. Nearly two-thirds (89/142, 62.7%) were published between 2023 and 2025, reflecting rapid recent growth. Most studies relied on electronic health records (93/142, 65.5%) and private datasets (81/142, 57.0%), while only 20.4% (29/142) used publicly available data. Commonly studied SDOH domains were housing instability (72/142, 50.7%), employment (65/142, 45.8%), and financial conditions (63/142, 44.4%); structural factors, such as immigration status (5/142, 3.5%), were rarely examined. Of the studies that reported evaluation metrics, most focused on classification (26/83, 31.3%) or extraction (38/83, 45.8%) and used cross-sectional designs. Reported model performance was typically strong, with median F1-scores ranging roughly from 0.75 to 0.85 across model categories. Only 49 studies shared code, and fewer than half clearly described model interpretability or reproducibility practices. LLMs (including encoder-decoder models) appeared in 19.7% (28/142) of studies, highlighting emerging interest but also raising new concerns around transparency and governance.

This review provides a timely synthesis of NLP and LLM applications across the SDOH research spectrum, addressing an important gap in a topic receiving increasing research attention. By comparing task formulations, data sources, and performance patterns, the review clarifies the research readiness of current approaches and reveals critical gaps. Our findings advance the field by highlighting the absence of a unified SDOH framework, uneven availability of public benchmarks, and limited evaluation of real-world deployment. Addressing these gaps through transparent, inclusive dataset development and implementation-focused evaluation is essential for translating NLP advances into equitable, real-world health impact.

Publisher

Journal of Medical Internet Research, JMIR Publications Inc, JMIR Publications

Subject

Development and Evaluation of Research Methods, Instruments and Tools

/ Digital Health Reviews

/ Electronic Health Records

/ Electronic records

/ Humans

/ Large Language Models

/ Medical records