Catalogue Search | MBRL

Investigating behavioral intentions toward DeepSeek adoption using a moderated-mediation analysis with PLS-SEM and fsQCA

by Kumar, Jai , Gao, Han , Kumar, Dheeraj in 4000/4001 , 4000/4008 , 4014/160

2026

Despite the growing integration of AI in education, students often lack awareness of specialized tools, which limits their adoption and effective use. Using partial least squares structural equation modeling (PLS-SEM) and fuzzy-set qualitative comparative analysis (fsQCA), this study aimed to investigate behavioral intentions toward DeepSeek adoption among university students. The conceptual framework of the research combines key factors from the Unified Theory of Acceptance and Use of Technology (UTAUT). A survey using a five-point Likert scale was administered to students from Pakistani universities. The study’s findings show that DeepSeek awareness has a significant impact on DeepSeek adoption, which is moderated by perceived ease of use and mediated by facilitating conditions, social influence, and performance expectancy. fsQCA further reveals configurations of factors that jointly lead to high adoption, highlighting multiple pathways rather than single causal effects. This study intends to benefit companies, academic institutions, and the global community by providing insight into how students perceive the DeepSeek services in an educational setting. The study contributes theoretically by integrating awareness into the UTAUT framework and applying fsQCA, providing a novel perspective on AI adoption in education. Finally, the conclusions of this study will help AI developers improve their product and service delivery, and help regulators manage the usage of AI-powered bots.

Journal Article

Can large language models generate geospatial code?

by Hou, Shuyang , Guan, Xuefeng , Xiang, Longgang in DeepSeek , Geospatial code generation , large language models

2025

As large language models increasingly exhibit hallucinations such as refusal to respond, generation of non-executable code, and poor readability in geospatial code generation tasks, establishing a systematic and quantifiable evaluation framework has become essential for advancing their application in GIS. This paper introduces the GeoCode-Eval framework, the first comprehensive evaluation framework for LLMs in geospatial code generation. Grounded in three dimensions – cognition and memory, understanding and interpretation, and innovation and creation – the framework addresses eight competency levels, including platform and tool cognition, functional knowledge, dataset recognition, information extraction, and various code-related tasks. To support this, the GeoCode-Bench benchmark was developed, consisting of 5,000 multiple-choice questions, 1,500 true/false questions, 1,500 fill-in-the-blank questions, and 1,000 coding tasks. Using six indicators, namely executability, accuracy, readability, location correctness, content correctness, and summary completeness, the study evaluates twelve representative models spanning four categories, including DeepSeek-Coder-V2 and GeoCode-GPT (7B). A combination of analytical methods, including entropy weighting, the coefficient of variation, skewness, and kurtosis, is applied to examine model capability distribution, indicator distribution, code type characteristics, and error type patterns. Results show consistent performance in tool cognition and code summarization, while significant performance gaps persist in code generation, completion, and correction. Common errors include data type and syntax issues. 
This study provides a quantifiable foundation for the evaluation of capabilities and future optimization of LLMs in geospatial code generation, thereby extending the application boundaries of LLMs in GIS and offering valuable insights into the development of evaluation methodologies for LLM applications in other vertical domains.

Journal Article

Copilot, not autopilot: Clinician oversight as the linchpin of safe AI-driven clinical decision making

by Wang, Hui , Jiang, Yan , Wang, Hong in Anesthesia , Artificial intelligence , Chatbots

2025

Journal Article

Large Language Models for Transforming Healthcare: A Perspective on DeepSeek‐R1

by Zhou, Jinsong , Chen, Yingcong , He, Sixu in Accuracy , AI for healthcare , AI interpretability

2025

DeepSeek‐R1 is an open‐source Large Language Model (LLM) with advanced reasoning capabilities. It has gained significant attention for its advantages, including low costs and visualized reasoning steps. Recent reasoning LLMs such as ChatGPT‐o1 have shown considerable reasoning potential, but the closed‐source nature of existing models limits customization and transparency, presenting substantial barriers to their integration into healthcare systems. This gap motivates the exploration of DeepSeek‐R1 in the medical field. Thus, we comprehensively review the transformative potential, applications, and challenges of DeepSeek‐R1 in healthcare. Specifically, we investigate how DeepSeek‐R1 can enhance clinical decision support, patient engagement, and medical education in support of clinical care, outpatient services, and medical research. Furthermore, we critically evaluate challenges including modality limitations (text‐only), hallucination risks, and ethical issues, particularly related to patient autonomy and safety‐focused recommendations. By assessing DeepSeek‐R1's integration potential, this perspective highlights promising opportunities for advancing medical AI while emphasizing necessary improvements to maximize clinical reliability and ethical compliance. This paper provides valuable guidance for future research directions and elucidates practical application scenarios for DeepSeek‐R1's successful integration into healthcare settings. This paper explores the potential of DeepSeek‐R1, an open‐source LLM with transparent reasoning and low deployment costs, especially for clinical decision support, patient engagement, and medical education. We highlight integration opportunities in healthcare while discussing challenges such as hallucinations, ethical concerns, and text‐only modality, offering guidance for future research directions and responsible adoption of reasoning LLMs in medical settings.

Journal Article

Can deepseek and ChatGPT be used in the diagnosis of oral pathologies?

by Kaygisiz, Ömer Faruk , Teke, Mehmet Turhan in Accuracy , Artificial Intelligence , Artificial intelligent

2025

Objective: Artificial intelligence (AI) has been widely used in various medical fields to support diagnostic development. The development of different AI techniques has made important contributions to early diagnoses. This research compares and evaluates the diagnostic accuracy of the ChatGPT-4o and DeepSeek-V3 AI applications in 16 clinical case scenarios in oral pathologies. Methodology: Clinical case scenarios of 16 hypothetical oral pathologies were prepared by the authors. Two AI applications, DeepSeek-V3 and ChatGPT-4o, were each asked to provide 3 possible preliminary diagnoses for every case and to reference the literature for these diagnoses. The diagnoses of both AI applications were evaluated on a Likert scale by 20 specialists from two different specialties. Results: The mean score for DeepSeek-V3 was 4.02 ± 0.36. For ChatGPT-4o it was 3.15 ± 0.41. According to the average scores, both models performed at a moderate to high level. Comparing the two AI models, DeepSeek-V3 was statistically better in 9 out of 16 clinical scenarios, while ChatGPT-4o was statistically better in 1 scenario. Overall, DeepSeek-V3 was statistically more successful in the comparison of the two models (p = 0.024). In terms of references, ChatGPT-4o provided 62 references, of which 50 were fake, while 8 out of 48 references provided by DeepSeek-V3 were fake. Conclusions: Chatbot applications have the potential to become valuable consultants for clinicians in the future thanks to their fast processing ability. They can help healthcare services by reducing the workload of clinicians. The DeepSeek-V3 model produced better results than ChatGPT-4o, but both applications need to be improved for routine use. Releasing versions of AI models that search only within the medical field and respond to clinicians with more reliable sources may make these models more valuable.

Journal Article

Benchmarking Large Language Models on the Taiwan Neurology Board Examinations (2018–2024): A Comparative Evaluation of GPT-4o, GPT-o1, DeepSeek-V3, and DeepSeek-R1

by Lin, Shih-Yi , Hsu, Ying-Yu , Chang, Shih-Sheng in Accuracy , Benchmarks , Chatbots

2026

Background and Purpose: Neurology requires integration of clinical reasoning, imaging interpretation, and current knowledge, making it an ideal field for evaluating large language models (LLMs). Methods: Using 1715 questions from the Taiwan Neurology Board Examination (2018–2024), we assessed four LLMs—GPT-4o, GPT-o1, DeepSeek-V3, and DeepSeek-R1—across four formats: single-choice, multiple-choice, true–false, and image-based items. Results: GPT-o1 achieved the highest overall accuracy (83.86%) and demonstrated strong performance on cognitively demanding tasks (82.50% on true–false; 77.26% on image-based). DeepSeek-V3 scored lowest (65.62%) and showed the greatest variability. Statistical analyses confirmed significant inter-model differences (p < 0.01). Accuracy declined across all models in 2024, coinciding with shifts in question design. DeepSeek-R1 was further penalized by alignment-based refusals, resulting in up to 3.81% score loss. Conclusions: These results position the Taiwan Neurology Board Exam as a rigorous benchmark for LLM evaluation and underscore GPT-o1’s potential utility in neurology education and decision support.

Journal Article

Promoting Responsible DeepSeek Deployment in Health Care: Scoping Review Comparing Grey and White Literature

by Xu, Chang , Zeng, Yihang , Huang, Jiaqi in China , Decision support systems , Delivery of Health Care

2025

DeepSeek is an open-source large language model (LLM), and it has greatly accelerated LLM adoption in health care. Its rapid deployment has sparked concerns regarding its impact on patient outcomes and safety. However, little is known about how DeepSeek is used and regulated in health care. This study aimed to (1) systematically review the characteristics of DeepSeek deployed in the top 100 hospitals in China, and (2) compare the performance and risks of DeepSeek between hospital disclosures and research evidence. We searched the official websites and WeChat accounts of the top 100 hospitals in China and the databases of Web of Science and PubMed, using the terms "DeepSeek" and "large language models." Searches were limited to records after January 15, 2025, when DeepSeek was first released. All searches were conducted on May 20, 2025, with an update on June 28, 2025. We extracted the basic characteristics of DeepSeek; its aims, evaluation approach, performance, and risks; and hospital regulations. A coding framework was developed covering the application scenarios, evaluation dimensions, and risk sources of LLMs. The risk of bias was assessed using the Joanna Briggs Institute checklist. We identified a total of 58 DeepSeek models in 48 out of the top 100 Chinese hospitals and found 27 studies in the literature. The first hospital deployment of DeepSeek was recorded on February 10, 2025, and deployment rapidly expanded to 37 hospitals within a month. Concurrently, most related research studies (20/27, 74%) were published after May 2025. Among deployments and studies that reported version information, DeepSeek-reasoner (R1) was the most frequently used model, and private deployment was the predominant approach. DeepSeek was mainly used to assist in clinical decision-making, including patient diagnosis and treatment recommendation.
Among hospital disclosures, only 36% (21/58) clearly indicated a predeployment assessment, 22% (13/58) presented assessment results, and 9% (5/58) identified potential risks and countermeasures. We found poor transparency in hospital reporting, with none of the disclosures presenting evaluation details. Hospitals were more likely to report higher performance and fewer risks for DeepSeek. This is one of the first scoping reviews to reveal the rapid, widespread deployment of DeepSeek in China's leading hospitals, primarily for clinical decision support. The deployment of DeepSeek in China's leading hospitals poses potential risks to patient outcomes and safety. We highlight the urgent need for existing regulations to be expanded to downstream developers and users to promote the responsible use of LLMs in health care. Hospitals need to use a more rigorous validation process and adopt a more transparent reporting policy. The main limitations of this review include the restriction to top-tier hospitals and the inherent constraints of gray literature. These factors should be considered when interpreting the findings.

Journal Article

Educators’ Perspectives on DeepSeek in ELT: A Qualitative Case Study of Pedagogical Potentials and Pitfalls in Chinese Higher Education

by Racquel Xervaser, Claudia , Ashikin Izhar, Nurul , Zhu, Mengjia

2025

Aim/Purpose: This study aimed to investigate the perspectives of English Language Teaching (ELT) educators on DeepSeek, emphasizing its pedagogical value, practical challenges, and instructional potential in higher education. Background: The integration of artificial intelligence (AI) in ELT is reshaping instructional practices globally, particularly in response to rapid technological advancements and shifts toward digital and student-centered learning. In China, these transformations have been accelerated by national education reforms, globalization, and the COVID-19 pandemic, prompting a reconfiguration of teaching approaches through online, blended, and AI-supported modalities. AI tools, including writing assistants and speech recognition systems, have begun to enhance learner autonomy, engagement, and performance by providing real-time, personalized feedback. Among these tools, DeepSeek has emerged as a promising platform that combines advanced information retrieval and generative capabilities, supporting lesson planning, content development, and academic writing. This paper explores how ELT educators in higher education perceive and apply DeepSeek in their teaching, with a focus on its pedagogical benefits, practical challenges, and instructional potential. Methodology: This study adopted a qualitative approach to examine ELT educators’ perspectives on the benefits, challenges, and instructional potential of integrating DeepSeek into higher education in China. Using purposive sampling, data were collected through open-ended questionnaires from 12 ELT educators at a public Chinese university where DeepSeek has been implemented across academic and administrative functions. Thematic analysis was conducted to examine patterns in participants’ responses across three phases of implementation (before, during, and after classroom use) to provide an in-depth understanding of DeepSeek’s pedagogical impact.
Contribution: This study is one of the few to explore the integration of DeepSeek into ELT in higher education. Unlike more widely studied AI tools, DeepSeek was selected for its emerging use in Chinese educational settings and its distinct instructional features, including structured content generation and multimodal support. By focusing on this specific tool, the study expands the scope of AI in education research and offers new empirical insights into its pedagogical value, implementation challenges, and potential to support personalized and learner-centered teaching. Findings: Findings indicate that DeepSeek offered consistent pedagogical support across three instructional phases (before class, during class, and after class). The most pronounced impact was observed in the before-class phase, where it significantly enhanced lesson preparation efficiency and pedagogical innovation through structured content generation, procedural design, and instructional resource enrichment. During class, DeepSeek supported content diversification, real-time pedagogical adjustments, and student engagement. After class, DeepSeek supported feedback provision, learner autonomy, and extended learning, though its influence was comparatively limited. Overall, the integration of DeepSeek contributed to improved instructional coherence and fostered a shift toward more learner-centered pedagogical practices. Recommendations for Practitioners: This study recommends that practitioners who integrate DeepSeek ensure comprehensive educator training to utilize the tool’s features and functionalities effectively. Additionally, they should focus on maintaining a balance between AI-driven support and traditional pedagogical methods to preserve the human elements of teaching, such as empathy and critical thinking. 
Practitioners should also consider ethical implications, such as data privacy and potential biases in AI models, and ensure that DeepSeek is used as a complementary resource rather than a replacement for educator expertise. Recommendation for Researchers: Researchers need to understand the evolving role of AI tools such as DeepSeek in enhancing ELT practices and exploring their long-term impact on student outcomes. Future studies should investigate the scalability of AI integration across diverse educational settings and examine how AI tools can be further refined to address emerging pedagogical challenges. Additionally, research should focus on evaluating the ethical concerns associated with AI in education, including data privacy, algorithmic bias, and the implications for educator-student relationships. Researchers are also encouraged to explore the balance between AI and human interaction in fostering a more effective and holistic learning environment. Impact on Society: AI technology-based learning, using DeepSeek, could enhance students’ learning outcomes and assist educators in developing content, leading to a more efficient and effective higher education system. The proper integration of DeepSeek into traditional teaching methods can promote its use and maximize its potential for enhancing learning experiences. Future Research: Additional research should be conducted to explore and measure the impact of DeepSeek on student motivation, engagement, and academic performance. Further studies should investigate its use across different disciplines and educational contexts to evaluate its effectiveness in diverse learning environments.

Journal Article

Preliminary evaluation of DeepSeek-R1 and GPT-5.3 in selected PET/CT clinical scenarios: patient preparation, report interpretation, and diagnostic reasoning

by Tianyue Li , Runze Duan , Lu Zheng in [18F]FDG PET/CT , artificial intelligence , Chatbot

2026

Objective: To evaluate the performance of DeepSeek (R1 version), an open-source large language model, in three core clinical scenarios (answering patients’ common questions, interpreting PET/CT reports with follow-up inquiries, and diagnosing complex cases) and to compare it with GPT-5.3, in order to verify the clinical applicability of DeepSeek-R1 as an alternative AI assistant. Methods: A total of 39 standardized tasks were assigned to both models, including responding to 15 frequently asked questions about [18F]FDG PET/CT, interpreting 12 anonymized reports of lung cancer and lymphoma (with follow-up inquiries regarding tumor staging or treatment), and providing primary and differential diagnoses for 10 difficult cases. Both models were accessed via their official platforms with default parameters, and all prompts and evaluation criteria were kept identical for cross-model comparison. Two senior nuclear medicine physicians independently rated the model responses using a 4-point standardized scale (assessing appropriateness, helpfulness, inter-trial consistency, and reference validity) and a binary scale for empathy; Cohen’s Kappa coefficient was used to evaluate inter-rater agreement. McNemar’s test was used to compare paired proportions of appropriateness, empathy, and response inconsistency between the two models. Results: Across the 39 tasks, DeepSeek-R1 achieved 94.9% appropriateness and 100% helpfulness. Specifically, 91.7% of responses to follow-up inquiries about tumor staging or treatment were rated empathetic. However, 7.7% of regenerated responses showed substantial inconsistencies, primarily in tumor staging, and only 37% of cited references were fully valid, with 11.1% being invalid.
GPT-5.3 exhibited equivalent core performance to DeepSeek-R1 with 94.9% appropriateness and 100% helpfulness, a slightly lower substantial inconsistency rate (5.1%), favorable reference validity (33% fully valid, 7.4% invalid), but a notably lower empathy score (66.7%) for follow-up inquiries. McNemar tests showed identical appropriateness (p = 1.00) and no significant difference in inconsistency (p = 1.00, 95% CI 0.60–14.80) between models. DeepSeek-R1 had higher empathy, but the difference was not significant (p = 0.25, 95% CI 0.09–0.66). For the 10 identical difficult cases, both models reached 10% primary diagnosis accuracy and 60% differential diagnosis accuracy. Conclusion: DeepSeek-R1 and GPT-5.3 have complementary strengths but similar reference hallucination issues and cannot replace clinicians. DeepSeek-R1 is a cost-effective auxiliary tool, with future optimization needed for consistency, diagnostic accuracy, and reference validity.

Journal Article

Artificial intelligence, conceptions of distillation and the reframing of reasoning

by Beer, David , Jacobsen, Benjamin N

2026

This article, which takes a specific example in order to look at how concepts shape developments in AI models, claims that the deployment of the concept of distillation is facilitating a particular reframing of reasoning within artificial intelligence. The concept of distillation is circulating through AI development cultures and has taken a certain shape in recent developments. The launch of the DeepSeek-R1 reasoning model represents a pivotal and potentially disruptive moment in the direction of AI. This article explores how the concept of distillation has been central to its framing and impact, and how that concept will continue to shape AI into the future. Through a close reading of the research paper that accompanied the launch of DeepSeek-R1, this article looks at how distillation, as an existing idea within AI research, is reanimated as a concept. The article explores the way in which distillation is involved in the advancement of what we have called here power without scale. We then look directly at how distillation is applied to reframe reasoning, before looking at how measures and benchmarking of reasoning are established in order to validate the transformative effect of distillation in AI models. The article deals with the role of the concept of distillation in shaping perceptions, infrastructures, planning, and expectations of AI.

Journal Article
