Catalogue Search | MBRL
Explore the vast range of titles available.
10 result(s) for "AI-assisted programming"
Natural Language Generation and Understanding of Big Code for AI-Assisted Programming: A Review
by Guo, Shangxin; Ho, Siu-Wai; Tan, Chee-Wei
in AI-assisted programming; Algorithms; Applications programs
2023
This paper provides a comprehensive review of the literature concerning the utilization of Natural Language Processing (NLP) techniques, with a particular focus on transformer-based large language models (LLMs) trained using Big Code, within the domain of AI-assisted programming tasks. LLMs, augmented with software naturalness, have played a crucial role in facilitating AI-assisted programming applications, including code generation, code completion, code translation, code refinement, code summarization, defect detection, and clone detection. Notable examples of such applications include GitHub Copilot, powered by OpenAI's Codex, and DeepMind AlphaCode. This paper presents an overview of the major LLMs and their applications in downstream tasks related to AI-assisted programming. Furthermore, it explores the challenges and opportunities of incorporating NLP techniques with software naturalness in these applications, with a discussion on extending AI-assisted programming capabilities to Apple's Xcode for mobile software development, empowering developers with advanced coding assistance and streamlining the software development process.
Journal Article
A comparative study of AI and human programming on environmental sustainability
2025
Despite rising concerns over AI's environmental impact, a recent article claimed human writers emit 130 to 1,500 times more greenhouse gases than AI. While that study used life cycle assessment methodology, it overlooked a critical factor: output quality. Unlike AI, which instantly generates content, human writers work for months, producing superior results. To provide a more objective comparison, we analyzed the environmental impacts of human and AI programmers generating functionally equivalent code. Using the USA Computing Olympiad database, we developed infrastructure to evaluate multiple GPT-based models. To our knowledge, this is the first study to control for correctness and quantitatively assess AI's environmental impact in code generation. To address AI's inaccuracies, we built a multi-round correction process to iteratively fix responses. We calculated AI emissions from usage and embodied impacts, while human emissions were estimated using average computing power consumption. Our case study results show that smaller models can match the environmental impact of human programmers when they succeed, though they often fail. Standard, widely used models, however, are far more environmentally costly: for example, GPT-4 emitted between 5 and 19 times more CO₂-eq than humans, underscoring a much greater trade-off between efficiency and environmental cost than previously understood.
Journal Article
AI-Assisted Programming Tasks Using Code Embeddings and Transformers
by Kotsiantis, Sotiris; Verykios, Vassilios; Tzagarakis, Manolis
in Algorithms; Artificial intelligence; Computational linguistics
2024
This review article provides an in-depth analysis of the growing field of AI-assisted programming tasks, specifically focusing on the use of code embeddings and transformers. With the increasing complexity and scale of software development, traditional programming methods are becoming more time-consuming and error-prone. As a result, researchers have turned to the application of artificial intelligence to assist with various programming tasks, including code completion, bug detection, and code summarization. The utilization of artificial intelligence for programming tasks has garnered significant attention in recent times, with numerous approaches adopting code embeddings or transformer technologies as their foundation. While these technologies are popular in this field today, a rigorous discussion, analysis, and comparison of their abilities to cover AI-assisted programming tasks is still lacking. This article discusses the role of code embeddings and transformers in enhancing the performance of AI-assisted programming tasks, highlighting their capabilities, limitations, and future potential in an attempt to outline a future roadmap for these specific technologies.
Journal Article
Redefining the Programmer: Human-AI Collaboration, LLMs, and Security in Modern Software Engineering
by Le, Hanh; Cruz, Elyson De La; Meduri, Karthik
in Artificial intelligence; Automation; Collaboration
2025
The rapid integration of artificial intelligence (AI) into software development, driven by large language models (LLMs), is reshaping the role of programmers from traditional coders into strategic collaborators within Industry 4.0 ecosystems. This qualitative study employs a hermeneutic phenomenological approach to explore the lived experiences of Information Technology (IT) professionals as they navigate a dynamic technological landscape marked by intelligent automation, shifting professional identities, and emerging ethical concerns. Findings indicate that developers are actively adapting to AI-augmented environments by engaging in continuous upskilling, prompt engineering, interdisciplinary collaboration, and heightened ethical awareness. However, participants also voiced growing concerns about the reliability and security of AI-generated code, noting that these tools can introduce hidden vulnerabilities and reduce critical engagement due to automation bias. Many described instances of flawed logic, insecure patterns, or syntactically correct but contextually inappropriate suggestions, underscoring the need for rigorous human oversight. Additionally, the study reveals anxieties around job displacement and the gradual erosion of fundamental coding skills, particularly in environments where AI tools dominate routine development tasks. These findings highlight an urgent need for educational reforms, industry standards, and organizational policies that prioritize both technical robustness and the preservation of human expertise. As AI becomes increasingly embedded in software engineering workflows, this research offers timely insights into how developers and organizations can responsibly integrate intelligent systems to promote accountability, resilience, and innovation across the software development lifecycle.
Journal Article
The impact of AI-assisted pair programming on student motivation, programming anxiety, collaborative learning, and programming performance: a comparative study with traditional pair programming and individual approaches
by Zhang, Rui; Pan, Lihu; Liu, Dandan
in AI-assisted pair programming; Anxiety; Applications programs
2025
Purpose
This study investigates the impact of AI-assisted pair programming on undergraduate students’ intrinsic motivation, programming anxiety, and performance, relative to both human–human pair programming and individual programming approaches.
Methods
A quasi-experimental design was conducted over two academic years (2023–2024) with 234 undergraduate students in a Java web application development course. Intact class sections were randomly assigned to AI-assisted pair programming (using GPT-3.5 Turbo in 2023 and Claude 3 Opus in 2024), human–human pair programming, or individual programming conditions. Data on intrinsic motivation, programming anxiety, collaborative perceptions, and programming performance were collected at three time points using validated instruments.
Results
Compared to individual programming, AI-assisted pair programming significantly increased intrinsic motivation (p < .001, d = 0.35) and reduced programming anxiety (p < .001), producing outcomes comparable to human–human pair programming. AI-assisted groups also outperformed both individual and human–human groups in programming tasks (p < .001). However, human–human pair programming fostered the highest perceptions of collaboration and social presence, surpassing both AI-assisted and individual conditions (p < .001). Mediation analysis revealed that perceived usefulness of the AI assistant significantly mediated the relationship between the programming approach and student outcomes, highlighting the importance of positive perceptions in leveraging AI tools for educational benefits. No significant differences emerged between the two AI models employed, indicating that both GPT-3.5 Turbo and Claude 3 Opus provided similar benefits.
Conclusion
While AI-assisted pair programming enhances motivation, reduces anxiety, and improves performance, it does not fully match the collaborative depth and social presence achieved through human–human pairing. These findings highlight the complementary strengths of AI and human interaction: AI support can bolster learning outcomes, yet human partners offer richer social engagement. As AI capabilities advance, educators should integrate such tools thoughtfully, ensuring that technology complements rather than replaces the interpersonal dynamics and skill development central to effective programming education.
Journal Article
Impact of Developer Queries on the Effectiveness of Conversational Large Language Models in Programming
by Taneski, Viktor; Karakatič, Sašo; Rek, Patrik
in AI-assisted programming; Chatbots; Collaboration
2025
This study investigates the effects of LLM-based coding assistance on web application development by students using a frontend framework. Rather than comparing different models, it focuses on how students interact with LLM tools to isolate the impact of query type on coding success. To this end, participants were instructed to rely exclusively on LLMs for writing code, based on a given set of specifications, and their queries were categorized into seven types: Error Fixing (EF), Feature Implementation (FI), Code Optimization (CO), Code Understanding (CU), Best Practices (BP), Documentation (DOC), and Concept Clarification (CC). The results reveal that students who queried LLMs for error fixing (EF) were statistically more likely to have runnable code, regardless of prior knowledge. Additionally, students seeking code understanding (CU) and error fixing performed better, even when normalizing for previous coding ability. These findings suggest that the nature of the queries made to LLMs influences the success of programming tasks, and they provide insights into how AI tools can assist learning in software development.
Journal Article
A Comparative Study of Large Language Models in Programming Education: Accuracy, Efficiency, and Feedback in Student Assignment Grading
by Bernik, Andrija; Radošević, Danijel; Čep, Andrej
in Accuracy; AI-assisted grading; Artificial intelligence
2025
Programming education traditionally requires extensive manual assessment of student assignments, which is both time-consuming and resource-intensive for instructors. Recent advances in large language models (LLMs) open opportunities for automating this process and providing timely feedback. This paper investigates the application of artificial intelligence (AI) tools for preliminary assessment of undergraduate programming assignments. A multi-phase experimental study was conducted across three computer science courses: Introduction to Programming, Programming 2, and Advanced Programming Concepts. A total of 315 Python assignments were collected from the Moodle learning management system, with 100 randomly selected submissions analyzed in detail. AI evaluation was performed using ChatGPT-4 (GPT-4-turbo), Claude 3, and Gemini 1.5 Pro models, employing structured prompts aligned with a predefined rubric that assessed functionality, code structure, documentation, and efficiency. Quantitative results demonstrate high correlation between AI-generated scores and instructor evaluations, with ChatGPT-4 achieving the highest consistency (Pearson coefficient 0.91) and the lowest average absolute deviation (0.68 points). Qualitative analysis highlights AI’s ability to provide structured, actionable feedback, though variability across models was observed. The study identifies benefits such as faster evaluation and enhanced feedback quality, alongside challenges including model limitations, potential biases, and the need for human oversight. Recommendations emphasize hybrid evaluation approaches combining AI automation with instructor supervision, ethical guidelines, and integration of AI tools into learning management systems. The findings indicate that AI-assisted grading can improve efficiency and pedagogical outcomes while maintaining academic integrity.
Journal Article
An AI-Driven System for Learning MQTT Communication Protocols with Python Programming
by Funabiki, Nobuo; Zhu, Zihao; Kotama, I Nyoman Darma
in Active learning; Artificial intelligence; Automation
2025
With rapid developments in wireless communication and Internet of Things (IoT) technologies, an increasing number of devices and sensors are interconnected, generating massive amounts of data in real time. Among the underlying protocols, Message Queuing Telemetry Transport (MQTT) has become a widely adopted lightweight publish–subscribe standard due to its simplicity, minimal overhead, and scalability. Understanding such protocols is therefore essential for students and engineers engaged in designing IoT application systems. However, teaching and learning MQTT remains challenging: its asynchronous architecture, hierarchical topic structure, and constituent concepts such as retained messages, Quality of Service (QoS) levels, and wildcard subscriptions are often difficult for beginners. Moreover, traditional learning resources emphasize theory and provide limited hands-on guidance, leading to a steep learning curve. To address these challenges, we propose an AI-assisted, exercise-based learning platform for MQTT. The platform provides interactive exercises with intelligent feedback to bridge the gap between theory and practice. To lower the barrier for learners, all code examples for executing MQTT communication are implemented in Python for readability, and Docker is used to ensure portable deployment of the MQTT broker and AI assistant. For evaluation, we conducted a usability study with two groups. The first group, with no prior experience, focused on fundamental concepts through AI-guided exercises. The second group, with relevant background, engaged in advanced projects to apply and reinforce their knowledge. The results show that the proposed platform supports learners at different levels, reduces frustration, and improves both engagement and efficiency.
Journal Article
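One of the MQTT concepts the abstract flags as difficult for beginners is wildcard subscriptions: `+` matches exactly one topic level, and `#` (valid only as the final level) matches all remaining levels, including none. This is not code from the paper's platform, just a minimal pure-Python illustration of those matching rules.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Check whether an MQTT topic matches a subscription filter.

    '+' matches exactly one topic level; '#' (only valid as the last
    level) matches the remaining levels, including none.
    """
    sub_levels = subscription.split("/")
    top_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == "#":
            return True  # multi-level wildcard swallows the rest
        if i >= len(top_levels):
            return False  # filter is deeper than the topic
        if sub != "+" and sub != top_levels[i]:
            return False  # literal level must match exactly
    return len(sub_levels) == len(top_levels)

print(topic_matches("home/+/temperature", "home/kitchen/temperature"))  # True
print(topic_matches("home/#", "home/kitchen/humidity"))                 # True
print(topic_matches("home/+/temperature", "home/kitchen/humidity"))     # False
```

Note that per the MQTT specification, a filter like `sport/#` also matches the parent topic `sport` itself, which the early `#` check above preserves.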
Dynamic Assessment with AI (Agentic RAG) and Iterative Feedback: A Model for the Digital Transformation of Higher Education in the Global EdTech Ecosystem
by Molero, David; Juárez, Rubén; de Barros-Camargo, Claudia
in agentic RAG; AI-assisted assessment; Algorithms
2025
This article formalizes AI-assisted assessment as a discrete-time policy-level design for iterative feedback and evaluates it in a digitally transformed higher-education setting. We integrate an agentic retrieval-augmented generation (RAG) feedback engine—operationalized through planning (rubric-aligned task decomposition), tool use beyond retrieval (tests, static/dynamic analyzers, rubric checker), and self-critique (checklist-based verification)—into a six-iteration dynamic evaluation cycle. Learning trajectories are modeled with three complementary formulations: (i) an interpretable update rule with explicit parameters η and λ that links next-step gains to feedback quality and the gap-to-target and yields iteration-complexity and stability conditions; (ii) a logistic-convergence model capturing diminishing returns near ceiling; and (iii) a relative-gain regression quantifying the marginal effect of feedback quality on the fraction of the gap closed per iteration. In a Concurrent Programming course (n=35), the cohort mean increased from 58.4 to 91.2 (0–100), while dispersion decreased from 9.7 to 5.8 across six iterations; a Greenhouse–Geisser corrected repeated-measures ANOVA indicated significant within-student change. Parameter estimates show that higher-quality, evidence-grounded feedback is associated with larger next-step gains and faster convergence. Beyond performance, we engage the broader pedagogical question of what to value and how to assess in AI-rich settings: we elevate process and provenance—planning artifacts, tool-usage traces, test outcomes, and evidence citations—to first-class assessment signals, and outline defensible formats (trace-based walkthroughs and oral/code defenses) that our controller can instrument. We position this as a design model for feedback policy, complementary to state-estimation approaches such as knowledge tracing. We discuss implications for instrumentation, equity-aware metrics, reproducibility, and epistemically aligned rubrics. Limitations include the observational, single-course design; future work should test causal variants (e.g., stepped-wedge trials) and cross-domain generalization.
Journal Article
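The abstract above describes, without giving its closed form, an update rule in which the next-step gain scales with feedback quality (η) and the gap-to-target, with λ as a second parameter. The following is a hypothetical sketch of that kind of gap-closing dynamic; the functional form, the interpretation of λ as a small per-iteration decay, and all numeric values are illustrative assumptions, not the paper's model.

```python
def next_score(score, feedback_quality, target=100.0, eta=0.6, lam=0.05):
    """One iteration of a hypothetical gap-closing update.

    The gain toward the target scales with feedback quality (eta * q),
    while lam models a small per-iteration decay (e.g., forgetting).
    NOTE: illustrative form only; not the rule from the article.
    """
    gain = eta * feedback_quality * (target - score)
    return score + gain - lam * score

score, quality = 58.4, 0.8  # illustrative starting mean and feedback quality
trajectory = [score]
for _ in range(6):  # six-iteration cycle, matching the study design
    score = next_score(score, quality)
    trajectory.append(score)

print([round(s, 1) for s in trajectory])
```

Under these assumed parameters the trajectory rises monotonically from 58.4 toward a fixed point around 90, echoing the diminishing-returns-near-ceiling behavior the abstract's logistic-convergence model captures; higher feedback quality closes a larger fraction of the gap per iteration.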
AI-Assisted Exam Variant Generation: A Human-in-the-Loop Framework for Automatic Item Creation
2025
Educational assessment relies on well-constructed test items to measure student learning accurately, yet traditional item development is time-consuming and demands specialized psychometric expertise. Automatic item generation (AIG) offers template-based scalability, and recent large language model (LLM) advances promise to democratize item creation. However, fully automated approaches risk introducing factual errors, bias, and uneven difficulty. To address these challenges, we propose and evaluate a hybrid human-in-the-loop (HITL) framework for AIG that combines psychometric rigor with the linguistic flexibility of LLMs. In a Spring 2025 case study at Franklin University Switzerland, the instructor collaborated with ChatGPT (o4-mini-high) to generate parallel exam variants for two undergraduate business courses: Quantitative Reasoning and Data Mining. The instructor began by defining “radical” and “incidental” parameters to guide the model. Through iterative cycles of prompt, review, and refinement, the instructor validated content accuracy, calibrated difficulty, and mitigated bias. All interactions (including prompt templates, AI outputs, and human edits) were systematically documented, creating a transparent audit trail. Our findings demonstrate that a HITL approach to AIG can produce diverse, psychometrically equivalent exam forms with reduced development time, while preserving item validity and fairness, and potentially reducing cheating. This offers a replicable pathway for harnessing LLMs in educational measurement without sacrificing quality, equity, or accountability.
Journal Article