ALFAA: Active Learning Fingerprint based Anti-Aliasing for correcting developer identity errors in version control systems
by Amreen, Sadika; Bogart, Christopher; Zaretzki, Russell; Mockus, Audris; Zhang, Yuxia
in Active learning / Algorithms / Aliasing / Applications programs / Control data (computers) / Control systems / Engineering research / Fingerprints / Open source software / Questions / Social networks / Software development / Software engineering / Strings / Supervised learning / Version control
2020
Journal Article
Overview
An accurate determination of developer identities is important for software engineering research and practice. Without it, even simple questions such as “how many developers does a project have?” cannot be answered. The commonly used version control data from Git is full of identity errors, and the existing approaches to correcting these errors are difficult to validate at large scale and cannot be easily improved. We therefore aim to develop a scalable, highly accurate, easy-to-use, and easy-to-improve approach to correcting software developer identity errors. We first amalgamate developer identities from version control systems in open source software repositories and investigate the nature and prevalence of these errors, design corrective algorithms, and estimate the impact of the errors on networks inferred from this data. We investigate these questions using a collection of over 1B Git commits with over 23M recorded author identities. By inspecting the author strings that occur most frequently, we group identity errors into categories. We then augment the author strings with three behavioral fingerprints: time-zone frequencies, the set of files modified, and a vector embedding of the commit messages. We create a manually validated set of identities for a subset of OpenStack developers using an active learning approach and use it to fit supervised learning models to predict the identities for the remaining author strings in OpenStack. We then compare these predictions with a competing commercially available effort and a leading research method. Finally, we compare network measures for file-induced author networks based on corrected and raw data. We find commits made from different environments, misspellings, organizational IDs, default values, and anonymous IDs to be the major sources of errors. We also find that supervised learning methods reduce errors severalfold compared with existing research and commercial methods, and that the active learning approach is an effective way to create validated datasets. Results also indicate that correcting developer identities has a large impact on the inference of the social network. We believe that our proposed Active Learning Fingerprint Based Anti-Aliasing (ALFAA) approach will expedite research progress in the software engineering domain for applications that involve developer identities.
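To make the fingerprint idea more concrete, the following is a minimal sketch of how pairwise similarity features between two author identities might be computed before being passed to a supervised classifier. It is not the authors' implementation: the function names, the record layout (`author`, `timezones`, `files`), the sample data, and the choice of similarity measures (a `difflib` ratio, cosine similarity, Jaccard overlap) are all assumptions made for illustration. The third fingerprint named in the abstract, the commit-message embedding, is omitted here because it would need a trained text model.

```python
from collections import Counter
from difflib import SequenceMatcher
from math import sqrt
from typing import Dict, List, Set


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] (an assumed stand-in metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Cosine similarity between two frequency counts, e.g. commits per time zone."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two sets of modified files."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_features(x: Dict, y: Dict) -> List[float]:
    """Feature vector for one candidate pair of author identities."""
    return [
        string_similarity(x["author"], y["author"]),
        cosine_similarity(Counter(x["timezones"]), Counter(y["timezones"])),
        jaccard(set(x["files"]), set(y["files"])),
    ]


# Hypothetical records: one entry per author string, aggregated over its commits.
alias_a = {
    "author": "Jane Doe <jane@example.org>",
    "timezones": ["+0100", "+0100", "+0200"],
    "files": ["src/a.py", "src/b.py"],
}
alias_b = {
    "author": "jane doe <jdoe@example.com>",
    "timezones": ["+0100", "+0200"],
    "files": ["src/b.py", "src/c.py"],
}

print(pair_features(alias_a, alias_b))
```

In a pipeline like the one the abstract describes, vectors of this kind, labeled as "same developer" or "different developer" by manual validation, would form the training data for the supervised model.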
Publisher
Springer Nature B.V.