Catalogue Search | MBRL

Recovering from a decade: a systematic mapping of information retrieval approaches to software traceability

by Borg, Markus; Ardö, Anders; Runeson, Per in Compilers, Computer and Information Sciences, Computer Science

2014

Engineers in large-scale software development have to manage large amounts of information, spread across many artifacts. Several researchers have proposed expressing retrieval of trace links among artifacts, i.e. trace recovery, as an Information Retrieval (IR) problem. The objective of this study is to produce a map of work on IR-based trace recovery, with a particular focus on previous evaluations and strength of evidence. We conducted a systematic mapping of IR-based trace recovery. Of the 79 publications classified, a majority applied algebraic IR models. While a set of studies on students indicates that IR-based trace recovery tools support certain work tasks, most previous studies do not go beyond reporting precision and recall of candidate trace links from evaluations using datasets containing fewer than 500 artifacts. Our review identified a need for industrial case studies. Furthermore, we conclude that the overall quality of reporting should be improved regarding both context and tool details, measures reported, and use of IR terminology. Finally, based on our empirical findings, we present suggestions on how to advance research on IR-based trace recovery.

Journal Article

Practical relevance of software engineering research: synthesizing the community’s voice

by Oivo, Markku; Garousi, Vahid; Borg, Markus in Engineering research, Literature reviews, Researchers

2020

Software engineering (SE) research should be relevant to industrial practice. There have been regular discussions in the SE community on this issue since the 1980s, led by pioneers such as Robert Glass. As we recently passed the milestone of “50 years of software engineering”, some recent positive efforts have been made in this direction, e.g., establishing “industrial” tracks in several SE conferences. However, many researchers and practitioners believe that we, as a community, are still struggling with research relevance and utility. The goal of this paper is to synthesize the evidence and experience-based opinions shared on this topic so far in the SE community, and to encourage the community to further reflect and act on the research relevance. For this purpose, we have conducted a Multi-vocal Literature Review (MLR) of 54 systematically-selected sources (papers and non-peer-reviewed articles). Instead of relying on and considering the individual opinions on research relevance, mentioned in each of the sources, the MLR aims to synthesize and provide the “holistic” view on the topic. The highlights of our MLR findings are as follows. The top three root causes of low relevance, discussed in the community, are: (1) Researchers having simplistic views (or wrong assumptions) about SE in practice; (2) Lack of connection with industry; and (3) Wrong identification of research problems. The top three suggestions for improving research relevance are: (1) Using appropriate research approaches such as action-research; (2) Choosing relevant (practical) research problems; and (3) Collaborating with industry. By synthesizing all the discussions on this important topic so far, this paper aims to encourage further discussions and actions in the community to increase our collective efforts to improve the research relevance. Furthermore, we raise the need for empirically-grounded and rigorous studies on the relevance problem in SE research, as carried out in other fields such as management science.

Journal Article

Automotive fault nowcasting with machine learning and natural language processing

by Randl, Korbinian; Curman, Jacob; Romell, Alv in Artificial Intelligence, Automation, Automobile industry

2024

Automated fault diagnosis can facilitate diagnostics assistance, speedier troubleshooting, and better-organised logistics. Currently, most AI-based prognostics and health management in the automotive industry ignore textual descriptions of the experienced problems or symptoms. With this study, however, we propose an ML-assisted workflow for automotive fault nowcasting that improves on current industry standards. We show that a multilingual pre-trained Transformer model can effectively classify the textual symptom claims from a large company with vehicle fleets, despite the task’s challenging nature due to the 38 languages and 1357 classes involved. Overall, we report an accuracy of more than 80% for high-frequency classes and above 60% for classes with reasonable minimum support, bringing novel evidence that automotive troubleshooting management can benefit from multilingual symptom text classification.

Journal Article

Adopting automated bug assignment in practice — a longitudinal case study at Ericsson

by Jonsson, Leif; Szabó, Attila; Bartalos, Béla in Automation, Case studies, Context

2024

[Context] The continuous inflow of bug reports is a considerable challenge in large development projects. Inspired by contemporary work on mining software repositories, we designed a prototype bug assignment solution based on machine learning in 2011-2016. The prototype evolved into an internal Ericsson product, TRR, in 2017-2018. TRR’s first bug assignment without human intervention happened in April 2019. [Objective] Our study evaluates the adoption of TRR within its industrial context at Ericsson, i.e., we provide lessons learned related to the productization of a research prototype within a company. Moreover, we investigate 1) how TRR performs in the field, 2) what value TRR provides to Ericsson, and 3) how TRR has influenced the ways of working. [Method] We conduct a preregistered industrial case study combining interviews with TRR stakeholders, minutes from sprint planning meetings, and bug-tracking data. The data analysis includes thematic analysis, descriptive statistics, and Bayesian causal analysis. [Results] TRR is now an incorporated part of the bug assignment process. Considering the abstraction levels of the telecommunications stack, high-level modules are more positive while low-level modules experienced some drawbacks. Most importantly, some bug reports directly reach low-level modules without first having passed through fundamental root-cause analysis steps at higher levels. On average, TRR automatically assigns 30% of the incoming bug reports with an accuracy of 75%. Auto-routed TRs are resolved around 21% faster within Ericsson, and TRR has saved highly seasoned engineers many hours of work. Indirect effects of adopting TRR include process improvements, process awareness, increased communication, and higher job satisfaction. [Conclusions] TRR has saved time at Ericsson, but the adoption of automated bug assignment was more intricate compared to similar endeavors reported from other companies. We primarily attribute the difference to the very large size of the organization and the complex products. Key facilitators in the successful adoption include a gradual introduction, product champions, and careful stakeholder analysis.

Journal Article

Requirements and software engineering for automotive perception systems: an interview study

by Habibullah, Khan Mohammad; Knauss, Alessia; Knauss, Eric in Annotations, Automation, Machine learning

2024

Driving automation systems, including autonomous driving and advanced driver assistance, are an important safety-critical domain. Such systems often incorporate perception systems that use machine learning to analyze the vehicle environment. We explore new or differing topics and challenges experienced by practitioners in this domain, which relate to requirements engineering (RE), quality, and systems and software engineering. We have conducted a semi-structured interview study with 19 participants across five companies and performed thematic analysis of the transcriptions. Practitioners have difficulty specifying upfront requirements and often rely on scenarios and operational design domains (ODDs) as RE artifacts. RE challenges relate to ODD detection and ODD exit detection, realistic scenarios, edge case specification, breaking down requirements, traceability, creating specifications for data and annotations, and quantifying quality requirements. Practitioners consider performance, reliability, robustness, user comfort, and—most importantly—safety as important quality attributes. Quality is assessed using statistical analysis of key metrics, and quality assurance is complicated by the addition of ML, simulation realism, and evolving standards. Systems are developed using a mix of methods, but these methods may not be sufficient for the needs of ML. Data quality methods must be a part of development methods. ML also requires a data-intensive verification and validation process, introducing data, analysis, and simulation challenges. Our findings contribute to understanding RE, safety engineering, and development methodologies for perception systems. This understanding and the collected challenges can drive future research for driving automation and other ML systems.

Journal Article

Ergo, SMIRK is safe: a safety case for a machine learning component in a pedestrian automatic emergency brake system

by Henriksson, Jens; Bui, Thanh; Tomaszewski, Piotr in Assurance, Engineering research, ISO standards

2023

Integration of machine learning (ML) components in critical applications introduces novel challenges for software certification and verification. New safety standards and technical guidelines are under development to support the safety of ML-based systems, e.g., ISO 21448 SOTIF for the automotive domain and the Assurance of Machine Learning for use in Autonomous Systems (AMLAS) framework. SOTIF and AMLAS provide high-level guidance but the details must be chiseled out for each specific case. We initiated a research project with the goal to demonstrate a complete safety case for an ML component in an open automotive system. This paper reports results from an industry-academia collaboration on safety assurance of SMIRK, an ML-based pedestrian automatic emergency braking demonstrator running in an industry-grade simulator. We demonstrate an application of AMLAS on SMIRK for a minimalistic operational design domain, i.e., we share a complete safety case for its integrated ML-based component. Finally, we report lessons learned and provide both SMIRK and the safety case under an open-source license for the research community to reuse.

Journal Article

Challenges and practices in aligning requirements with verification and validation: a case study of six companies

by Unterkalmsteiner, Michael; Runeson, Per; Feldt, Robert in Alignment, Case Study, Compilers

2014

Weak alignment of requirements engineering (RE) with verification and validation (VV) may lead to problems in delivering the required products in time with the right quality. For example, weak communication of requirements changes to testers may result in lack of verification of new requirements and incorrect verification of old invalid requirements, leading to software quality problems, wasted effort and delays. However, despite the serious implications of weak alignment, research and practice both tend to focus on one or the other of RE or VV rather than on the alignment of the two. We have performed a multi-unit case study to gain insight into issues around aligning RE and VV by interviewing 30 practitioners from 6 software-developing companies, involving 10 researchers in a flexible research process for case studies. The results describe current industry challenges and practices in aligning RE with VV, ranging from quality of the individual RE and VV activities, through tracing and tools, to change control and sharing a common understanding at strategy, goal and design level. The study identified that human aspects are central, i.e. cooperation and communication, and that requirements engineering practices are a critical basis for alignment. Further, the size of an organisation and its motivation for applying alignment practices, e.g. external enforcement of traceability, are variation factors that play a key role in achieving alignment. Our results provide a strategic roadmap for practitioners' improvement work to address alignment challenges. Furthermore, the study provides a foundation for continued research to improve the alignment of RE with VV.

Journal Article

An autonomous performance testing framework using self-adaptive fuzzy reinforcement learning

by Moghadam, Mahshid Helali; Saadatmand, Mehrdad; Bohlin, Markus in Algorithms, Automation, Business metrics

2022

Test automation brings the potential to reduce costs and human effort, but several aspects of software testing remain challenging to automate. One such example is automated performance testing to find performance breaking points. Current approaches to tackle automated generation of performance test cases mainly involve using source code or system model analysis or use-case-based techniques. However, source code and system models might not always be available at testing time. On the other hand, if the optimal performance testing policy for the intended objective in a testing process instead could be learned by the testing system, then test automation without advanced performance models could be possible. Furthermore, the learned policy could later be reused for similar software systems under test, thus leading to higher test efficiency. We propose SaFReL, a self-adaptive fuzzy reinforcement learning-based performance testing framework. SaFReL learns the optimal policy to generate performance test cases through an initial learning phase, then reuses it during a transfer learning phase, while keeping the learning running and updating the policy in the long term. Through multiple experiments in a simulated performance testing setup, we demonstrate that our approach generates the target performance test cases for different programs more efficiently than a typical testing process and performs adaptively without access to source code and performance models.

Journal Article

Component attributes and their importance in decisions and component selection

by Gorschek, Tony; Chatzipetrou, Panagiota; Papatheocharous, Efi in Commercial off-the-shelf technology, Cost analysis, Data analysis

2020

Component-based software engineering is a common approach in the development and evolution of contemporary software systems. Different component sourcing options are available, such as: (1) Software developed internally (in-house), (2) Software developed outsourced, (3) Commercial off-the-shelf software, and (4) Open-Source Software. However, there is little available research on what attributes of a component are the most important ones when selecting new components. The objective of this study is to investigate what matters the most to industry practitioners when they decide to select a component. We conducted a cross-domain anonymous survey with industry practitioners involved in component selection. First, the practitioners selected the most important attributes from a list. Next, they prioritized their selection using the Hundred-Dollar ($100) test. We analyzed the results using compositional data analysis. The results of this exploratory analysis showed that cost was clearly considered to be the most important attribute for component selection. Other important attributes for the practitioners were: support of the component, longevity prediction, and level of off-the-shelf fit to product. Moreover, several practitioners still consider in-house software development to be the sole option when adding or replacing a component. On the other hand, there is a trend to complement it with other component sourcing options and, apart from cost, different attributes factor into their decision. Furthermore, in our analysis, nonparametric tests and biplots were used to further investigate the practitioners’ inherent characteristics. It seems that smaller and larger organizations have different views on what attributes are the most important, and the most surprising finding is their contrasting views on the cost attribute: larger organizations with mature products are considerably more cost aware.

Journal Article

Trust Calibration in IDEs: Paving the Way for Widespread Adoption of AI Refactoring

by Borg, Markus in Human factors, Industrial development, Large language models

2024

In the software industry, the drive to add new features often overshadows the need to improve existing code. Large Language Models (LLMs) offer a new approach to improving codebases at an unprecedented scale through AI-assisted refactoring. However, LLMs come with inherent risks such as breaking changes and the introduction of security vulnerabilities. We advocate for encapsulating the interaction with the models in IDEs and validating refactoring attempts using trustworthy safeguards. However, equally important for the uptake of AI refactoring is research on trust development. In this position paper, we position our future work based on established models from research on human factors in automation. We outline action research within CodeScene on development of 1) novel LLM safeguards and 2) user interaction that conveys an appropriate level of trust. The industry collaboration enables large-scale repository analysis and A/B testing to continuously guide the design of our research interventions.

Paper
