Works (7)

Updated: July 5th, 2023 15:30

2022 journal article

Better Data Labelling With EMBLEM (and how that Impacts Defect Prediction)

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 48(1), 278–294.

By: H. Tu, Z. Yu & T. Menzies

author keywords: Human-in-the-loop AI; data labelling; defect prediction; software analytics
TL;DR: In this approach, called EMBLEM, an AI tool first explores the software development process to label the commits that are most problematic, and humans then apply their expertise to check those labels (perhaps resulting in the AI updating the support vectors within its SVM learner). (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: January 11, 2022

2022 journal article

DebtFree: minimizing labeling cost in self-admitted technical debt identification using semi-supervised learning

EMPIRICAL SOFTWARE ENGINEERING, 27(4).

By: H. Tu & T. Menzies

author keywords: Technical debt; Semi-supervised learning; Unsupervised learning; Labeling effort
TL;DR: The proposed DebtFree, a two-mode framework based on unsupervised learning for identifying SATDs, can reduce the labeling effort by 99% in mode 1 (unlabeled training data) and by up to 63% in mode 2 (labeled training data), while improving the current active learner's F1 by almost 100% (relative). (via Semantic Scholar)
Source: Web Of Science
Added: April 25, 2022

2022 article

Fair-SSL: Building fair ML Software with less data

2022 IEEE/ACM INTERNATIONAL WORKSHOP ON EQUITABLE DATA & TECHNOLOGY (FAIRWARE 2022), pp. 1–8.

By: J. Chakraborty, S. Majumder & H. Tu

author keywords: Machine Learning with and for SE; Ethics in Software Engineering
TL;DR: This is the first SE work where semi-supervised techniques are used to fight against ethical bias in SE ML models, and the clear advantage of Fair-SSL is that it requires only 10% of the labeled training data. (via Semantic Scholar)
Source: Web Of Science
Added: October 3, 2022

2022 journal article

Identifying Self-Admitted Technical Debts With Jitterbug: A Two-Step Approach

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 48(5), 1676–1691.

By: Z. Yu*, F. Fahid, H. Tu & T. Menzies

author keywords: Software; Machine learning; Pattern recognition; Training; Computer hacking; Machine learning algorithms; Estimation; Technical debt; software engineering; machine learning; pattern recognition
TL;DR: Jitterbug is proposed, a two-step framework for identifying SATDs: it first identifies the “easy to find” SATDs automatically with close to 100 percent precision using a novel pattern recognition technique, and machine learning techniques are then applied to assist human experts in manually identifying the remaining “hard to find (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 17, 2022

2021 article

FRUGAL: Unlocking Semi-Supervised Learning for Software Analytics

2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2021), pp. 394–406.

By: H. Tu & T. Menzies

author keywords: Software Analytics; Data Labelling Efforts; Semi-Supervised Learning
TL;DR: It is asserted that FRUGAL can save considerable effort in data labelling, especially when validating prior work or researching new problems, and it is suggested that proponents of complex and expensive methods should always baseline such methods against simpler and cheaper alternatives. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: April 25, 2022

2021 article

Mining Workflows for Anomalous Data Transfers

2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021), pp. 1–12.

author keywords: Scientific Workflow; TCP Signatures; Anomaly Detection; Hyper-Parameter Tuning; Sequential Optimization
TL;DR: X-FLASH is developed, a network anomaly detection tool for faulty TCP workflow transfers that incorporates novel hyperparameter tuning and data mining approaches for improving the performance of the machine learning algorithms to accurately classify the anomalous TCP packets. (via Semantic Scholar)
UN Sustainable Development Goal Categories
9. Industry, Innovation and Infrastructure (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: October 4, 2021

2018 article

Is One Hyperparameter Optimizer Enough?

PROCEEDINGS OF THE 4TH ACM SIGSOFT INTERNATIONAL WORKSHOP ON SOFTWARE ANALYTICS (SWAN'18), pp. 19–25.

By: H. Tu & V. Nair

author keywords: Defect Prediction; SBSE; Hyperparameter Tuning
TL;DR: It is concluded that hyperparameter optimization is more nuanced than previously believed and, while such optimization can certainly lead to large improvements in the performance of classifiers used in software analytics, it remains to be seen which specific optimizers should be applied to a new dataset. (via Semantic Scholar)
Source: Web Of Science
Added: April 2, 2019

