Catalogue Search | MBRL

The effects of change decomposition on code review—a controlled experiment

by Bruntink, Magiel , van Deursen, Arie , Bacchelli, Alberto in Best practice , Building codes , Change decomposition

2019

Code review is a cognitively demanding and time-consuming process. Previous qualitative studies hinted at how decomposing change sets into multiple yet internally coherent ones would improve the reviewing process. So far, literature provided no quantitative analysis of this hypothesis. (1) Quantitatively measure the effects of change decomposition on the outcome of code review (in terms of number of found defects, wrongly reported issues, suggested improvements, time, and understanding); (2) Qualitatively analyze how subjects approach the review and navigate the code, building knowledge and addressing existing issues, in large vs. decomposed changes. Controlled experiment using the pull-based development model involving 28 software developers among professionals and graduate students. Change decomposition leads to fewer wrongly reported issues, influences how subjects approach and conduct the review activity (by increasing context-seeking), yet impacts neither understanding the change rationale nor the number of found defects. Change decomposition reduces the noise for subsequent data analyses but also significantly supports the tasks of the developers in charge of reviewing the changes. As such, commits belonging to different concepts should be separated, adopting this as a best practice in software engineering.

Journal Article

Share this book

Add to My Shelf

Associating working memory capacity and code change ordering with code review performance

by Bacchelli, Alberto , Schneider, Kurt , Baum, Tobias in Defects , Efficiency , Memory

2019

Change-based code review is a software quality assurance technique that is widely used in practice. Therefore, better understanding what influences performance in code reviews and finding ways to improve it can have a large impact. In this study, we examine the association of working memory capacity and cognitive load with code review performance and we test the predictions of a recent theory regarding improved code review efficiency with certain code change part orders. We perform a confirmatory experiment with 50 participants, mostly professional software developers. The participants performed code reviews on one small and two larger code changes from an open source software system to which we had seeded additional defects. We measured their efficiency and effectiveness in defect detection, their working memory capacity, and several potential confounding factors. We find that there is a moderate association between working memory capacity and the effectiveness of finding delocalized defects, influenced by other factors, whereas the association with other defect types is almost non-existing. We also confirm that the effectiveness of reviews is significantly larger for small code changes. We cannot conclude reliably whether the order of presenting the code change parts influences the efficiency of code review.

Journal Article

Share this book

Add to My Shelf

On Refining the SZZ Algorithm with Bug Discussion Data

by Bacchelli, Alberto , Rani, Pooja , Petrulio, Fernando in Algorithms , Compilers , Computer Science

2024

Context Researchers testing hypotheses related to factors leading to low-quality software often rely on historical data, specifically on details regarding when defects were introduced into a codebase of interest. The prevailing techniques to determine the introduction of defects revolve around variants of the SZZ algorithm. This algorithm leverages information on the lines modified during a bug-fixing commit and finds when these lines were last modified, thereby identifying bug-introducing commits. Objectives Despite several improvements and variants, SZZ struggles with accuracy, especially in cases of unrelated modifications or that touch files not involved in the introduction of the bug in the version control systems (aka tangled commit and ghost commits ). Methods Our research investigates whether and how incorporating content retrieved from bug discussions can address these issues by identifying the related and external files and thus improve the efficacy of the SZZ algorithm. Results To conduct our investigation, we take advantage of the links manually inserted by Mozilla developers in bug reports to signal which commits inserted bugs. Thus, we prepared the dataset, RoTEB , comprised of 12,472 bug reports. We first manually inspect a sample of 369 bug reports related to these bug-fixing or bug-introducing commits and investigate whether the files mentioned in these reports could be useful for SZZ . After we found evidence that the mentioned files are relevant, we augment SZZ with this information, using different strategies, and evaluate the resulting approach against multiple SZZ variations. Conclusion We define a taxonomy outlining the rationale behind developers’ references to diverse files in their discussions. We observe that bug discussions often mention files relevant to enhancing the SZZ algorithm’s efficacy. Then, we verify that integrating these file references augments the precision of SZZ in pinpointing bug-introducing commits. Yet, it does not markedly influence recall. These results deepen our comprehension of the usefulness of bug discussions for SZZ . Future work can leverage our dataset and explore other techniques to further address the problem of tangled commits and ghost commits. Data & material: https://zenodo.org/records/11484723 .

Journal Article

Share this book

Add to My Shelf

Workflow analysis of data science code in public GitHub repositories

by Sarasua, Cristina , Bernstein, Abraham , Ramasamy, Dhivyabharathi in Coding , Data analysis , Data science

2023

Despite the ubiquity of data science, we are far from rigorously understanding how coding in data science is performed. Even though the scientific literature has hinted at the iterative and explorative nature of data science coding, we need further empirical evidence to understand this practice and its workflows in detail. Such understanding is critical to recognise the needs of data scientists and, for instance, inform tooling support. To obtain a deeper understanding of the iterative and explorative nature of data science coding, we analysed 470 Jupyter notebooks publicly available in GitHub repositories. We focused on the extent to which data scientists transition between different types of data science activities, or steps (such as data preprocessing and modelling), as well as the frequency and co-occurrence of such transitions. For our analysis, we developed a dataset with the help of five data science experts, who manually annotated the data science steps for each code cell within the aforementioned 470 notebooks. Using the first-order Markov chain model, we extracted the transitions and analysed the transition probabilities between the different steps. In addition to providing deeper insights into the implementation practices of data science coding, our results provide evidence that the steps in a data science workflow are indeed iterative and reveal specific patterns. We also evaluated the use of the annotated dataset to train machine-learning classifiers to predict the data science step(s) of a given code cell. We investigate the representativeness of the classification by comparing the workflow analysis applied to (a) the predicted data set and (b) the data set labelled by experts, finding an F1-score of about 71% for the 10-class data science step prediction problem.

Journal Article

Share this book

Add to My Shelf

The evolution of the code during review: an investigation on review changes

by Bacchelli, Alberto , Fregnan, Enrico , Petrulio, Fernando in Investigations , Open source software , Software engineering

2022

Code review is a software engineering practice in which reviewers manually inspect the code written by a fellow developer and propose any change that is deemed necessary or useful. The main goal of code review is to improve the quality of the code under review. Despite the widespread use of code review, only a few studies focused on the investigation of its outcomes, for example, investigating the code changes that happen to the code under review. The goal of this paper is to expand our knowledge on the outcome of code review while re-evaluating results from previous work. To this aim, we analyze changes that happened during the review process, which we define as review changes. Considering three popular open-source software projects, we investigate the types of review changes (based on existing taxonomies) and what triggers them; also, we study which code factors in a code review are most related to the number of review changes. Our results show that the majority of changes relate to evolvability concerns, with a strong prevalence of documentation and structure changes at type-level. Furthermore, differently from past work, we found that the majority of review changes are not triggered by reviewers’ comments. Finally, we find that the number of review changes in a code review is related to the size of the initial patch as well as the new lines of code that it adds. However, other factors, such as lines deleted or the author of the review patchset, do not always show an empirically supported relationship with the number of changes.

Journal Article

Share this book

Add to My Shelf

Visualising data science workflows to support third-party notebook comprehension: an empirical study

by Sarasua, Cristina , Bernstein, Abraham , Ramasamy, Dhivyabharathi in Annotations , Data science , Empirical analysis

2023

Data science is an exploratory and iterative process that often leads to complex and unstructured code. This code is usually poorly documented and, consequently, hard to understand by a third party. In this paper, we first collect empirical evidence for the non-linearity of data science code from real-world Jupyter notebooks, confirming the need for new approaches that aid in data science code interaction and comprehension. Second, we propose a visualisation method that elucidates implicit workflow information in data science code and assists data scientists in navigating the so-called garden of forking paths in non-linear code. The visualisation also provides information such as the rationale and the identification of the data science pipeline step based on cell annotations. We conducted a user experiment with data scientists to evaluate the proposed method, assessing the influence of (i) different workflow visualisations and (ii) cell annotations on code comprehension. Our results show that visualising the exploration helps the users obtain an overview of the notebook, significantly improving code comprehension. Furthermore, our qualitative analysis provides more insights into the difficulties faced during data science code comprehension.

Journal Article

Share this book

Add to My Shelf

Simplifying software compliance: AI technologies in drafting technical documentation for the AI Act

by Sovrano, Francesco , Hine, Emmie , Anzolut, Stefano in Artificial intelligence , Chatbots , Compilers

2025

The European AI Act has introduced specific technical documentation requirements for AI systems. Compliance with them is challenging due to the need for advanced knowledge of both legal and technical aspects, which is rare among software developers and legal professionals. Consequently, small and medium-sized enterprises may face high costs in meeting these requirements. In this study, we explore how contemporary AI technologies, including ChatGPT and an existing compliance tool (DoXpert), can aid software developers in creating technical documentation that complies with the AI Act. We specifically demonstrate how these AI tools can identify gaps in existing documentation according to the provisions of the AI Act. Using open-source high-risk AI systems as case studies, we collaborated with legal experts to evaluate how closely tool-generated assessments align with expert opinions. Findings show partial alignment, important issues with ChatGPT (3.5 and 4), and a moderate (and statistically significant) correlation between DoXpert and expert judgments, according to the Rank Biserial Correlation analysis. Nonetheless, these findings underscore the potential of AI to combine with human analysis and alleviate the compliance burden, supporting the broader goal of fostering responsible and transparent AI development under emerging regulatory frameworks.

Journal Article

Share this book

Add to My Shelf

To react, or not to react: Patterns of reaction to API deprecation

by Bacchelli, Alberto , Anand Ashok Sawant , Robbes, Romain in Application programming interface , Consumers , Data mining

2019

Application Programming Interfaces (API) provide reusable functionality to aid developers in the development process. The features provided by these APIs might change over time as the API evolves. To allow API consumers to peacefully transition from older obsolete features to new features, API producers make use of the deprecation mechanism that allows them to indicate to the consumer that a feature should no longer be used. The Java language designers noticed that no one was taking these deprecation warnings seriously and continued using outdated features. Due to this, they decided to change the implementation of this feature in Java 9. We question as to what extent this issue exists and whether the Java language designers have a case. We start by identifying the various ways in which an API consumer can react to deprecation. Following this we benchmark the frequency of the reaction patterns by creating a dataset consisting of data mined from 50 API consumers totalling 297,254 GitHub based projects and 1,322,612,567 type-checked method invocations. We see that predominantly consumers do not react to deprecation and we try to explain this behavior by surveying API consumers and by analyzing if the API’s deprecation policy has an impact on the consumers’ decision to react.

Journal Article

Share this book

Add to My Shelf

What happens in my code reviews? An investigation on automatically classifying review changes

by Petrulio, Fernando , Bacchelli, Alberto , Di Geronimo, Linda in Classification , Machine learning , Software engineering

2022

Code reviewing is a widespread practice used by software engineers to maintain high code quality. To date, the knowledge on the effect of code review on source code is still limited. Some studies have addressed this problem by classifying the types of changes that take place during the review process (a.k.a. review changes), as this strategy can, for example, pinpoint the immediate effect of reviews on code. Nevertheless, this classification (1) is not scalable, as it was conducted manually, and (2) was not assessed in terms of how meaningful the provided information is for practitioners. This paper aims at addressing these limitations: First, we investigate to what extent a machine learning-based technique can automatically classify review changes. Then, we evaluate the relevance of information on review change types and its potential usefulness, by conducting (1) semi-structured interviews with 12 developers and (2) a qualitative study with 17 developers, who are asked to assess reports on the review changes of their project. Key results of the study show that not only it is possible to automatically classify code review changes, but this information is also perceived by practitioners as valuable to improve the code review process. Data and materials: https://doi.org/10.5281/zenodo.5592254

Journal Article

Share this book

Add to My Shelf

The indolent lambdification of Java

by Bacchelli, Alberto , Sawant, Anand Ashok , Petrulio Fernando in Compilers , Consumers , Investigations

2021

As Java 8 introduced functional interfaces and lambda expressions to the Java programming language, the JDK API was changed to introduce support for lambda expressions, thus allowing consumers to define lambda functions when using Java’s collections. While the JDK API allows for a functional paradigm, for API consumers to be able to completely embrace Java’s new functional features, third-party APIs must also support lambda expressions. To understand the current state of the Java ecosystem, we investigate (i) the extent to which third-party Java APIs have changed their interfaces, (ii) why or why not they introduce functional interface support and (iii) in the case the API has changed its interface how it does so. We also investigate the consumers’ perspective, particularly their ease in using lambda expressions in Java with APIs. We perform our investigation by manually analyzing the top 50 popular Java APIs, conducting in-person and email interviews with 23 API producers, and surveying 110 developers. We find that only a minority of the top 50 APIs support functional interfaces, the rest does not support them, predominantly in the interest of backward compatibility. Java 7 support is still greatly desirable due to enterprise projects not migrating to newer versions of Java. This suggests that the Java ecosystem is stagnant and that the introduction of new language features will not be enough to save it from the advent of new languages such as Kotlin (JVM based) and Rust (non-JVM based).

Journal Article

Share this book

Add to My Shelf

Language Selector

MBRLGlobalSearch

Language Selector

Catalogue Search | MBRL

Search Results Heading

Explore the vast range of titles available.

MBRLSearchResults

MBRLHappinessMeter