Works (155)

Updated: April 22nd, 2024 07:44

2024 journal article

Ethics: Why Software Engineers Can't Afford to Look Away

IEEE SOFTWARE, 41(1), 142–144.

By: B. Johnson* & T. Menzies n

author keywords: Ethics; Oral communication; Software; Software engineering
Sources: ORCID, Web Of Science, NC State University Libraries
Added: January 25, 2024

2024 journal article

Fighting for What's Right: An Interview With Marc Canellas

IEEE SOFTWARE, 41(2), 104–107.

By: B. Johnson* & T. Menzies n

author keywords: Ethics; Law; Public policy; Interviews; Artificial intelligence; Aerospace engineering
Sources: ORCID, Web Of Science, NC State University Libraries
Added: March 1, 2024

2024 journal article

Learning from Very Little Data: On the Value of Landscape Analysis for Predicting Software Project Health

ACM Transactions on Software Engineering and Methodology.

By: A. Lustosa n & T. Menzies n

TL;DR: NiSNEAK is both faster and more effective than prior state-of-the-art hyperparameter optimization algorithms (e.g. FLASH, HYPEROPT, OPTUNA). (via Semantic Scholar)
Source: ORCID
Added: March 15, 2024

2024 journal article

The Power of Positionality-Why Accessibility? An Interview With Kevin Moran and Arun Krishnavajjala

IEEE SOFTWARE, 41(3), 91–94.

By: B. Johnson* & T. Menzies n

author keywords: Software; Interviews
Sources: ORCID, Web Of Science, NC State University Libraries
Added: April 8, 2024

2024 journal article

Trading Off Scalability, Privacy, and Performance in Data Synthesis

IEEE ACCESS, 12, 26642–26654.

By: X. Ling n, T. Menzies n, C. Hazard*, J. Shu & J. Beel

author keywords: Synthetic data; Clustering algorithms; Data models; Engines; Biomedical imaging; Generative adversarial networks; Data privacy; Regression analysis; Classification algorithms; Scalability; Homomorphic encryption; Synthetic data generation; privacy preservation; regression; classification
TL;DR: It is shown that the synthetic data generated by the Howso engine has good privacy and accuracy, which results in the best overall score, and that the proposed random-projection-based synthetic data generation framework generates synthetic data with the highest accuracy score and has the fastest scalability. (via Semantic Scholar)
Sources: ORCID, Web Of Science, NC State University Libraries
Added: February 22, 2024

2024 journal article

When less is more: on the value of "co-training" for semi-supervised software defect predictors

EMPIRICAL SOFTWARE ENGINEERING, 29(2).

By: S. Majumder*, J. Chakraborty* & T. Menzies*

author keywords: Semi-supervised learning; SSL; Self-training; Co-training; Boosting methods; Semi-supervised preprocessing; Clustering-based semi-supervised preprocessing; Intrinsically semi-supervised methods; Graph-based methods; Co-forest; Effort aware tri-training
Sources: Web Of Science, NC State University Libraries
Added: March 11, 2024

2023 article

"The Best Data Are Fake Data?": An Interview With Chris Hazard

Menzies, T., & Hazard, C. (2023, September). IEEE SOFTWARE, Vol. 40, pp. 121–124.

By: T. Menzies n & C. Hazard n

author keywords: Ethics; Hazards; Interviews; Synthetic data
TL;DR: This issue, Chris Hazard, cofounder of Diveplane, which is a leader in the burgeoning international synthetic data market, discusses the ethical implications of using synthetic data generated from real information sources. (via Semantic Scholar)
Sources: ORCID, Web Of Science, NC State University Libraries
Added: October 31, 2023

2023 journal article

(Re)Use of Research Results (Is Rampant)

COMMUNICATIONS OF THE ACM, 66(2), 75–81.

Contributors: M. Baldassarre*, N. Ernst*, B. Hermann*, T. Menzies n & R. Yedida n

Sources: Web Of Science, NC State University Libraries, ORCID
Added: February 27, 2023

2023 journal article

A Tale of Two Cities: Data and Configuration Variances in Robust Deep Learning

IEEE INTERNET COMPUTING, 27(6), 13–20.

By: G. Zhang*, J. Sun*, F. Xu, Y. Sui*, H. Bandara*, S. Chen*, T. Menzies n

author keywords: Robustness; Data models; Perturbation methods; Training; Mathematical models; Forecasting; Predictive models
Sources: ORCID, Web Of Science, NC State University Libraries
Added: January 25, 2024

2023 journal article

An expert system for redesigning software for cloud applications

EXPERT SYSTEMS WITH APPLICATIONS, 219.

By: R. Yedida n, R. Krishna, A. Kalia, T. Menzies n, J. Xiao & M. Vukovic

author keywords: Software engineering; Microservices; Deep learning; Hyper-parameter optimization; Refactoring
TL;DR: This paper proposes DEEPLY, a new algorithm that extends the CO-GCN deep learning partition generator with a novel loss function and some hyper-parameter optimization, and generally outperforms prior work across multiple datasets and goals. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries, ORCID
Added: March 27, 2023

2023 journal article

Assessing the Early Bird Heuristic (for Predicting Project Quality)

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 32(5).

By: N. Shrikanth n & T. Menzies n

author keywords: Quality prediction; defects; early; data-lite
TL;DR: A case study with 240 projects finds that the information in those projects “clumps” towards the earliest parts of the project, and it is shown that a simple model (with just a few features) generalizes to hundreds of projects. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: September 11, 2023

2023 journal article

Don't Lie to Me: Avoiding Malicious Explanations With STEALTH

IEEE SOFTWARE, 40(3), 43–53.

By: L. Alvarez n & T. Menzies n

author keywords: Software algorithms; Clustering algorithms
Sources: Web Of Science, ORCID, NC State University Libraries
Added: July 19, 2023

2023 journal article

Fair Enough: Searching for Sufficient Measures of Fairness

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 32(6).

By: S. Majumder n, J. Chakraborty n, G. Bai n, K. Stolee n & T. Menzies n

author keywords: Software fairness; fairness metrics; clustering; theoretical analysis; empirical analysis
TL;DR: This article shows that many of those fairness metrics effectively measure the same thing, and it is no longer necessary (or even possible) to satisfy all fairness metrics. (via Semantic Scholar)
Sources: ORCID, Web Of Science, NC State University Libraries
Added: October 31, 2023

2023 journal article

FairMask: Better Fairness via Model-Based Rebalancing of Protected Attributes

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 49(4), 2426–2439.

By: K. Peng n, J. Chakraborty n & T. Menzies n

author keywords: Software fairness; explanation; bias mitigation
TL;DR: This work proposes a model-based extrapolation method that corrects the misleading latent correlation between the protected attributes and other non-protected ones and achieves significantly better group and individual fairness than benchmark methods. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 30, 2023

2023 journal article

Finding Trends in Software Research

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 49(4), 1397–1410.

By: G. Mathew n, A. Agrawal n & T. Menzies n

author keywords: Software engineering; Conferences; Software; Analytical models; Data models; Predictive models; Testing; bibliometrics; topic modeling; text mining
TL;DR: While there is no overall gender bias in SE authorship, it is noted that women are under-represented in the top-most cited papers in the authors' field and a previously unreported dichotomy between software conferences and journals is shown. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 30, 2023

2023 journal article

How to "Sell" Ethics (Using AI): An Interview With Alexander Serebrenik

IEEE SOFTWARE, 40(3), 95–97.

By: T. Menzies n

author keywords: Ethics; Organizations; Interviews; Artificial intelligence
TL;DR: “Most organizations are tone deaf when it comes to ethics,” says Prof. Alexander Serebrenik of the Eindhoven University of Technology, who has been trying to talk discrimination, diversity, and inclusion with them for years and has given up. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (Web of Science)
10. Reduced Inequalities (OpenAlex)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: July 19, 2023

2023 journal article

How to Find Actionable Static Analysis Warnings: A Case Study With FindBugs

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 49(4), 2856–2872.

By: R. Yedida n, H. Kang*, H. Tu*, X. Yang n, D. Lo* & T. Menzies n

author keywords: Codes; Computer bugs; Static analysis; Training; Source coding; Measurement; Industries; Software analytics; static analysis; false alarms; locality; hyperparameter optimization
TL;DR: It is shown here that effective predictors of static code warnings can be created by methods that locally adjust the decision boundary (between actionable warnings and others), and these methods yield a new high water-mark for recognizing actionable static code warnings. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 30, 2023

2023 article

The Engineering Mindset Is an Ethical Mindset (We Just Don't Teach It That Way ... Yet)

IEEE SOFTWARE, Vol. 40, pp. 103–110.

By: T. Menzies n, B. Johnson*, D. Roberts n & L. Alvarez n

author keywords: Ethics; Software
TL;DR: A proof-by-example of a CS class syllabus that (a) enables an ethical engineering mindset while (b) not detracting from core technical topics is offered. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: June 12, 2023

2023 journal article

Unfairness Is Everywhere, so What to Do? An Interview With Jeanna Matthews

IEEE SOFTWARE, 40(6), 135–138.

By: B. Johnson* & T. Menzies n

author keywords: Sociology; Legislation; Oral communication; Software; Software measurement; Statistics; Interviews
Sources: ORCID, Web Of Science, NC State University Libraries
Added: January 25, 2024

2023 journal article

VEER: enhancing the interpretability of model-based optimizations

EMPIRICAL SOFTWARE ENGINEERING, 28(3).

author keywords: Software analytics; Multi-objective optimization; Disagreement; Interpretable AI
TL;DR: A dimension reduction method called VEER is proposed that builds a useful one-dimensional approximation to the original N-objective space that improves the execution time, but also resolves the potential model disagreement problem. (via Semantic Scholar)
UN Sustainable Development Goal Categories
7. Affordable and Clean Energy (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: April 24, 2023

2023 journal article

What Not to Test (For Cyber-Physical Systems)

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 49(7), 3811–3826.

By: X. Ling n & T. Menzies n

author keywords: Search-based software engineering; modeling and model-driven engineering; validation and verification; software testing; simulation-based testing; multi-goal optimization
TL;DR: DoLesS (Domination with Least Squares Approximation) selects minimal and effective test cases by averaging over a coarse-grained grid of the information gained from multiple optimization goals, finding a minimal set of tests that can distinguish better from worse parts of the optimization goals. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: August 28, 2023

2022 journal article

Artificial Intelligence and Software Engineering: Are We Ready?

COMPUTER, 55(3), 24–28.

By: A. Mashkoor*, T. Menzies n, A. Egyed* & R. Ramler*

Sources: Web Of Science, NC State University Libraries
Added: April 4, 2022

2022 journal article

Assessing expert system-assisted literature reviews with a case study

EXPERT SYSTEMS WITH APPLICATIONS, 200.

By: Z. Yu*, J. Carver*, G. Rothermel n & T. Menzies n

author keywords: Systematic literature review; Expert systems; Software engineering; Active learning; Primary study selection; Test case prioritization
TL;DR: An expert system that incorporates an incrementally updated human-in-the-loop active learning tool is used to identify test case prioritization techniques for automated UI testing from 8,349 papers on IEEE Xplore. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: June 13, 2022

2022 journal article

Better Data Labelling With EMBLEM (and how that Impacts Defect Prediction)

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 48(1), 278–294.

By: H. Tu n, Z. Yu n & T. Menzies n

author keywords: Human-in-the-loop AI; data labelling; defect prediction; software analytics
TL;DR: In this approach, called EMBLEM, an AI tool first explores the software development process to label commits that are most problematic, and humans then apply their expertise to check those labels (perhaps resulting in the AI updating the support vectors within their SVM learner). (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: January 11, 2022

2022 article

Dazzle: Using Optimized Generative Adversarial Networks to Address Security Data Class Imbalance Issue

2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022), pp. 144–155.

By: R. Shu n, T. Xia n, L. Williams n & T. Menzies n

author keywords: Security Vulnerability Prediction; Class Imbalance; Hyperparameter Optimization; Generative Adversarial Networks
TL;DR: The use of optimized GANs is suggested as an alternative method for addressing class imbalance in security vulnerability data, which further helps build better prediction models with resampled datasets. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: September 19, 2022

2022 journal article

Defect Reduction Planning (Using TimeLIME)

IEEE Transactions on Software Engineering, 48(7), 2510–2525.

By: K. Peng n & T. Menzies n

TL;DR: TimeLIME is a new tool, introduced in this paper, that improves LIME by restricting its plans to just those attributes which change the most within a project, and for nine project trials, it is found that TimeLIME outperformed all other algorithms. (via Semantic Scholar)
Source: ORCID
Added: October 31, 2023

2022 journal article

Do I really need all this work to find vulnerabilities? An empirical case study comparing vulnerability detection techniques on a Java application

EMPIRICAL SOFTWARE ENGINEERING, 27(6).

By: S. Elder n, N. Zahan n, R. Shu n, M. Metro n, V. Kozarev n, T. Menzies n, L. Williams n

author keywords: Vulnerability Management; Web Application Security; Penetration Testing; Vulnerability Scanners
TL;DR: The goal of this research is to assist managers and other decision-makers in making informed choices about the use of software vulnerability detection techniques through an empirical study of the efficiency and effectiveness of four techniques on a Java-based web application. (via Semantic Scholar)
UN Sustainable Development Goal Categories
16. Peace, Justice and Strong Institutions (OpenAlex)
Sources: Web Of Science, NC State University Libraries, ORCID
Added: August 22, 2022

2022 journal article

How Different is Test Case Prioritization for Open and Closed Source Projects?

IEEE Transactions on Software Engineering, 48(7), 2526–2540.

By: X. Ling n, R. Agrawal n & T. Menzies n

TL;DR: It is found that prioritization approaches that work best for open-source projects can work worst for the closed-source project (and vice versa) and it is ill-advised to always apply one prioritization scheme to all projects. (via Semantic Scholar)
Source: ORCID
Added: October 31, 2023

2022 article

How to Improve Deep Learning for Software Analytics (a case study with code smell detection)

2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022), pp. 156–166.

By: R. Yedida n & T. Menzies n

author keywords: code smell detection; deep learning; autoencoders
TL;DR: The results of this paper show that the method can achieve better than state-of-the-art results on code smell detection with fuzzy oversampling, and suggest other lessons for other kinds of analytics. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries, ORCID
Added: September 19, 2022

2022 journal article

Identifying Self-Admitted Technical Debts With Jitterbug: A Two-Step Approach

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 48(5), 1676–1691.

By: Z. Yu*, F. Fahid n, H. Tu n & T. Menzies n

author keywords: Software; Machine learning; Pattern recognition; Training; Computer hacking; Machine learning algorithms; Estimation; Technical debt; software engineering; machine learning; pattern recognition
TL;DR: Jitterbug is proposed, a two-step framework for identifying SATDs that identifies the “easy to find” SATDs automatically with close to 100 percent precision using a novel pattern recognition technique and machine learning techniques are applied to assist human experts in manually identifying the remaining “hard to find (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 17, 2022

2022 article

Methods for Stabilizing Models Across Large Samples of Projects (with case studies on Predicting Defect and Project Health)

2022 MINING SOFTWARE REPOSITORIES CONFERENCE (MSR 2022), pp. 566–578.

By: S. Majumder n, T. Xia n, R. Krishna n & T. Menzies n

author keywords: Defect Prediction; Project Health; Bellwether; Hierarchical Clustering; Random Forest; Two Phase Transfer Learning; Transfer Learning
TL;DR: This paper provides a promising result showing such stable models can be generated using a new transfer learning framework called STABILIZER, and these case studies are the largest demonstration of the generalizability of quantitative predictions of project quality yet reported in the SE literature. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: September 19, 2022

2022 journal article

Omni: automated ensemble with unexpected models against adversarial evasion attack

EMPIRICAL SOFTWARE ENGINEERING, 27(1).

By: R. Shu n, T. Xia n, L. Williams n & T. Menzies n

author keywords: Hyperparameter optimization; Ensemble defense; Adversarial evasion attack
TL;DR: Omni is a promising approach as a defense strategy against adversarial attacks when compared with other baseline treatments, and it is suggested to create ensemble with unexpected models that are distant from the attacker’s expected model through methods such as hyperparameter optimization. (via Semantic Scholar)
UN Sustainable Development Goal Categories
16. Peace, Justice and Strong Institutions (OpenAlex)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: December 6, 2021

2022 journal article

Predicting health indicators for open source projects (using hyperparameter optimization)

EMPIRICAL SOFTWARE ENGINEERING, 27(6).

By: T. Xia n, W. Fu n, R. Shu n, R. Agrawal n & T. Menzies n

author keywords: Hyperparameter optimization; Project health; Machine learning
TL;DR: This is the largest study yet conducted, using recent data for predicting multiple health indicators of open-source projects, and finds that traditional estimation algorithms make many mistakes. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: July 5, 2022

2022 journal article

Revisiting process versus product metrics: a large scale analysis

EMPIRICAL SOFTWARE ENGINEERING, 27(3).

By: S. Majumder n, P. Mody n & T. Menzies n

author keywords: Software engineering; Software process; Process metrics; Product metrics; Developer metrics; Random forest; Logistic regression; Support vector machine; HPO
TL;DR: Prior small-scale results are rechecked and it is found that process metrics are better predictors for defects than product metrics, but it is unwise to trust metric importance results from analytics in-the-small studies since those change dramatically when moving to analytics in-the-large. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: April 4, 2022

2022 journal article

Sequential Model Optimization for Software Effort Estimation

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 48(6), 1994–2009.

By: T. Xia n, R. Shu n, X. Shen n & T. Menzies n

author keywords: Estimation; Software; Tools; Optimization; Data models; Task analysis; Mathematical model; Effort estimation; COCOMO; hyperparameter tuning; regression trees; sequential model optimization
TL;DR: This paper applies a configuration technique called “ROME” (Rapid Optimizing Methods for Estimation), which uses sequential model-based optimization (SMO) to find what configuration settings of effort estimation techniques work best for a particular data set. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: June 15, 2022

2022 article

The Secret to Better AI and Better Software (Is Requirements Engineering)

Bencomo, N., Guo, J. L. C., Harrison, R., Heyn, H.-M., & Menzies, T. (2022, January). IEEE SOFTWARE, Vol. 39, pp. 105–110.

TL;DR: Recently, practitioners and researchers met to discuss the role of requirements, and AI and SE. Offered here are notes on that fascinating discussion. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: January 3, 2022

2021 journal article

Assessing practitioner beliefs about software engineering

EMPIRICAL SOFTWARE ENGINEERING, 26(4).

By: N. Shrikanth n, W. Nichols*, F. Fahid n & T. Menzies n

author keywords: Software analytics; Beliefs; Productivity; Quality; Experience
TL;DR: It is found that a narrow scope could delude practitioners into misinterpreting certain effects to hold in their day-to-day work, and programming languages act as a confounding factor for developer productivity and software quality. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: June 10, 2021

2021 article

Bias in Machine Learning Software: Why? How? What to Do?

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 429–440.

By: J. Chakraborty n, S. Majumder n & T. Menzies n

author keywords: Software Fairness; Fairness Metrics; Bias Mitigation
TL;DR: This paper postulates that the root causes of bias are the prior decisions that affect what data was selected and the labels assigned to those examples, and proposes the Fair-SMOTE algorithm, which removes biased labels and rebalances internal distributions such that, based on the sensitive attribute, examples are equal in both positive and negative classes. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 journal article

Characterizing Crowds to Better Optimize Worker Recommendation in Crowdsourced Testing

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 47(6), 1259–1276.

By: J. Wang*, S. Wang*, J. Chen n, T. Menzies n, Q. Cui, M. Xie*, Q. Wang*

author keywords: Crowdsourced testing; crowd worker recommendation; multi-objective optimization
TL;DR: Multi-Objective Crowd wOrker recoMmendation approach (MOCOM), which aims at recommending a minimum number of crowd workers who could detect the maximum number of bugs for a crowdsourced testing task, significantly outperforms five commonly-used and state-of-the-art baselines. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: June 28, 2021

2021 article

Documenting Evidence of a Replication of 'Analyze This! 145 Questions for Data Scientists in Software Engineering'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1602–1602.

By: X. Yang n & T. Menzies n

author keywords: reuse; replication; data science; software analysis
TL;DR: The use of the 145 software engineering questions for data scientists presented in the Microsoft study is reported here in a recent FSE '20 paper by Huijgens et al. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 article

Documenting Evidence of a Replication of 'Populating a Release History Database from Version Control and Bug Tracking Systems'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1601–1601.

By: X. Yang n & T. Menzies n

author keywords: reuse; replication; bug fixing; text tagging
TL;DR: The use of a keyword-based and regular expression-based approach to identify bug-fixing commits by linking commit messages and issue tracker data in a recent FSE '20 paper is reported. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 article

Documenting Evidence of a Reproduction of 'Is There A "Golden" Feature Set for Static Warning Identification? - An Experimental Evaluation'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1603–1603.

By: X. Yang n & T. Menzies n

author keywords: reuse; reproduction; static analysis; deep learning
TL;DR: The use of the static analysis dataset generated by FindBugs in a recent EMSE '21 paper by Yang et al. is reported here. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 article

Documenting Evidence of a Reuse of "'Why Should I Trust You?": Explaining the Predictions of Any Classifier'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1600–1600.

By: K. Peng n & T. Menzies n

author keywords: Software analytics; Actionable analysis
TL;DR: The framework LIME, a local instance-based explanation generation framework that was originally proposed by Ribeiro et al. in their paper "'Why Should I Trust You?': Explaining the Predictions of Any Classifier", was reused in Peng et al.'s paper "Defect Reduction Planning (using TimeLIME)". (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 article

Documenting Evidence of a Reuse of 'A Systematic Literature Review of Techniques and Metrics to Reduce the Cost of Mutation Testing'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1597–1597.

By: A. Lustosa n & T. Menzies n

author keywords: reuse; reproduction; mutation testing; systematic literature review
TL;DR: This submission is a report on the reuse of Pizzoleto et al.'s Systematic Literature Review by Guizzo et al. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 article

Documenting Evidence of a Reuse of 'A Systematic Study of the Class Imbalance Problem in Convolutional Neural Networks'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1595–1595.

By: R. Yedida n & T. Menzies n

author keywords: reuse; replication; oversampling; defect prediction
TL;DR: The reuse of oversampling, and modifications to the basic approach, used in a recent TSE ’21 paper by Yedida & Menzies is reported, which is the oversampling technique studied by Buda et al. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: January 4, 2022

2021 article

Documenting Evidence of a Reuse of 'On the Number of Linear Regions of Deep Neural Networks'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1596–1596.

By: R. Yedida n & T. Menzies n

author keywords: reuse; replication; deep learning; defect prediction
TL;DR: The reuse of theoretical insights from deep learning literature is reported here, used in a recent TSE '21 paper by Yedida & Menzies, and the reuse of Theorem 4 from Montufar et al. is documented. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: January 4, 2022

2021 article

Documenting Evidence of a Reuse of 'RefactoringMiner 2.0'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1598–1598.

By: A. Lustosa n & T. Menzies n

author keywords: reuse; refactoring; bug introduction; mining software repositories
TL;DR: This submission is a report on the reuse of Tsantalis et al.'s Refactoring Miner (RMiner) package by Penta et al. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 article

Documenting Evidence of a Reuse of 'What is a Feature? A Qualitative Study of Features in Industrial Software Product Lines'

PROCEEDINGS OF THE 29TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING (ESEC/FSE '21), pp. 1599–1599.

By: K. Peng n & T. Menzies n

author keywords: Software analytics; Software product lines; Software configuration
TL;DR: An example of reuse is the paper "Dimensions of software configuration: on the configuration context in modern software development" by Siegmund et al., which reused definitions and theories about configuration features from the original paper. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: March 7, 2022

2021 article

Early Life Cycle Software Defect Prediction. Why? How?

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2021), pp. 448–459.

By: N. Shrikanth n, S. Majumder n & T. Menzies n

author keywords: sampling; early; defect prediction; analytics
TL;DR: It is shown that, at least for learning defect predictors, after the first few months, this may not be true, and researchers should adopt a "simplicity-first" approach to their work. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: September 13, 2021

2021 article

FRUGAL: Unlocking Semi-Supervised Learning for Software Analytics

2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2021), pp. 394–406.

By: H. Tu n & T. Menzies n

author keywords: Software Analytics; Data Labelling Efforts; Semi-Supervised Learning
TL;DR: It is asserted that FRUGAL can save considerable effort in data labelling especially in validating prior work or researching new problems, and suggested that proponents of complex and expensive methods should always baseline such methods against simpler and cheaper alternatives. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: April 25, 2022

2021 journal article

How to Better Distinguish Security Bug Reports (Using Dual Hyperparameter Optimization)

EMPIRICAL SOFTWARE ENGINEERING, 26(3).

By: R. Shu, T. Xia, J. Chen, L. Williams & T. Menzies

author keywords: Hyperparameter Optimization; Data pre-processing; Security bug report
TL;DR: The SWIFT’s dual optimization of both pre-processor and learner is more useful than optimizing each of them individually, and this approach can quickly optimize models that achieve better recalls than the prior state-of-the-art. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 3, 2021

2021 journal article

Improving Vulnerability Inspection Efficiency Using Active Learning

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 47(11), 2401–2420.

By: Z. Yu n, C. Theisen n, L. Williams n & T. Menzies n

author keywords: Inspection; Software; Tools; Security; Predictive models; Error correction; NIST; Active learning; security; vulnerabilities; software engineering; error correction
TL;DR: HARMLESS is an incremental support vector machine tool that builds a vulnerability prediction model from the source code inspected to date, then suggests what source code files should be inspected next, then provides feedback on when to stop. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: November 12, 2021

2021 journal article

Learning to recognize actionable static code warnings (is intrinsically easy)

EMPIRICAL SOFTWARE ENGINEERING, 26(3).

By: X. Yang n, J. Chen n, R. Yedida n, Z. Yu n & T. Menzies n

author keywords: Static code analysis; Actionable warnings; Deep learning; Linear SVM; Intrinsic dimensionality
TL;DR: It is found that data mining algorithms can find actionable warnings with remarkable ease and is concluded that learning to recognize actionable static code warnings is easy, using a wide range of learning algorithms, since the underlying data is intrinsically simple. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 17, 2021

2021 article

Mining Workflows for Anomalous Data Transfers

2021 IEEE/ACM 18TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2021), pp. 1–12.

author keywords: Scientific Workflow; TCP Signatures; Anomaly Detection; Hyper-Parameter Tuning; Sequential Optimization
TL;DR: X-FLASH is developed, a network anomaly detection tool for faulty TCP workflow transfers that incorporates novel hyperparameter tuning and data mining approaches for improving the performance of the machine learning algorithms to accurately classify the anomalous TCP packets. (via Semantic Scholar)
UN Sustainable Development Goal Categories
9. Industry, Innovation and Infrastructure (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: October 4, 2021

2021 journal article

On the Value of Oversampling for Deep Learning in Software Defect Prediction

IEEE Transactions on Software Engineering, 48(8), 1–1.

By: R. Yedida n & T. Menzies n

author keywords: Deep learning; Tuning; Predictive models; Standards; Prediction algorithms; Training; Tools; Defect prediction; oversampling; class imbalance; neural networks
TL;DR: The results present a cogent case for the use of oversampling prior to applying deep learning on software defect prediction datasets, which can do significantly better than the prior DL state of the art in 14/20 defect data sets. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries, Crossref
Added: June 12, 2021

2021 article

Shockingly Simple: "Keys" for Better AI for SE

IEEE SOFTWARE, Vol. 38, pp. 114–118.

By: T. Menzies n

TL;DR: As 2020 drew to a close, I was thinking about what lessons the authors have learned about software engineering (SE) for artificial intelligence (AI): things that they can believe now but, in the last century, would have seemed somewhat shocking. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: April 12, 2021

2021 journal article

Simpler Hyperparameter Optimization for Software Analytics: Why, How, When

IEEE Transactions on Software Engineering, 48(8), 1–1.

By: A. Agrawal*, X. Yang n, R. Agrawal n, R. Yedida n, X. Shen n & T. Menzies n

author keywords: Software analytics; hyperparameter optimization; defect prediction; bad smell detection; issue close time; bug reports
TL;DR: The simple DODGE works best for data sets with low “intrinsic dimensionality” and very poorly for higher-dimensional data; nearly all the SE data seen here was intrinsically low-dimensional, indicating that DODGE is applicable for many SE analytics tasks. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries, Crossref
Added: June 12, 2021

2021 article

Structuring a Comprehensive Software Security Course Around the OWASP Application Security Verification Standard

2021 IEEE/ACM 43RD INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: JOINT TRACK ON SOFTWARE ENGINEERING EDUCATION AND TRAINING (ICSE-JSEET 2021), pp. 95–104.

By: S. Elder n, N. Zahan n, V. Kozarev n, R. Shu n, T. Menzies n & L. Williams n

author keywords: Security and Protection; Computer and Information Science Education; Industry-Standards
TL;DR: A theme of the course assignments was to map vulnerability discovery to the security controls of the Open Web Application Security Project (OWASP) Application Security Verification Standard (ASVS), and this mapping may have increased students' depth of understanding of a wider range of security topics. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries, ORCID
Added: November 1, 2021

2021 journal article

Understanding static code warnings: An incremental AI approach

EXPERT SYSTEMS WITH APPLICATIONS, 167.

By: X. Yang n, Z. Yu n, J. Wang* & T. Menzies n

author keywords: Actionable warning identification; Active learning; Static analysis; Selection process
TL;DR: An incremental AI tool that watches humans reading false alarm reports can quickly learn to distinguish spurious false alarms from more serious matters that deserve further attention and can identify over 90% of actionable warnings in a priority order given by the algorithm. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries, ORCID
Added: November 24, 2020

2021 journal article

Whence to Learn? Transferring Knowledge in Configurable Systems Using BEETLE

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 47(12), 2956–2972.

By: R. Krishna*, V. Nair n, P. Jamshidi* & T. Menzies n

author keywords: Performance optimization; SBSE; transfer learning; bellwether
TL;DR: This paper proposes a novel transfer learning framework called BEETLE, which is a “bellwether”-based transfer learner that focuses on identifying and learning from the most relevant source from amongst the old data. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: January 3, 2022

2020 article

Assessing Practitioner Beliefs about Software Defect Prediction

2020 IEEE/ACM 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE (ICSE-SEIP), pp. 182–190.

By: N. Shrikanth n & T. Menzies n

author keywords: defects; beliefs; practitioner; empirical software engineering
TL;DR: The conclusion will be that the nature of the debate within Software Engineering needs to change: while it is important to report the effects that hold right now, it is also important to report on what effects change over time. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 16, 2021

2020 journal article

Better software analytics via "DUO": Data mining algorithms using/used-by optimizers

EMPIRICAL SOFTWARE ENGINEERING, 25(3), 2099–2136.

By: A. Agrawal n, T. Menzies n, L. Minku*, M. Wagner* & Z. Yu n

author keywords: Software analytics; Data mining; Optimization; Evolutionary algorithms
TL;DR: It is possible, useful and necessary to combine data mining and optimization using DUO, and the era of papers that just use data miners is coming to an end. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 8, 2020

2020 article

Expert Perspectives on AI

IEEE SOFTWARE, Vol. 37, pp. 87–94.

By: A. Carleton*, E. Harper*, M. Lyu*, S. Eldh*, T. Xie* & T. Menzies n

Sources: Web Of Science, NC State University Libraries
Added: July 13, 2020

2020 journal article

Finding Faster Configurations Using FLASH

IEEE Transactions on Software Engineering, 46(7), 794–811.

By: V. Nair n, Z. Yu n, T. Menzies n, N. Siegmund* & S. Apel*

author keywords: Software systems; Optimization; Throughput; Storms; Task analysis; Cloud computing; Performance prediction; search-based SE; configuration; multi-objective optimization; sequential model-based methods
TL;DR: Flash is introduced, a sequential model-based method that sequentially explores the configuration space by reflecting on the configurations evaluated so far to determine the next best configuration to explore, which reduces the effort required to find the better configuration. (via Semantic Scholar)
Source: ORCID
Added: July 16, 2020

2020 journal article

Learning actionable analytics from multiple software projects

EMPIRICAL SOFTWARE ENGINEERING, 25(5), 3468–3500.

By: R. Krishna* & T. Menzies n

author keywords: Data mining; Actionable analytics; Planning; Bellwethers; Defect prediction
TL;DR: This research seeks methods that generate demonstrably useful guidance on “what to do” within the context of a specific software project by proposing XTREE and BELLTREE to generate plans that can improve software quality. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 10, 2020

2020 article

Making Fair ML Software using Trustworthy Explanation

2020 35TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2020), pp. 1229–1233.

By: J. Chakraborty n, K. Peng n & T. Menzies n

TL;DR: This work shows how the proposed method based on K nearest neighbors can overcome shortcomings and find the underlying bias of black box models and describes the future framework combining explanation and planning to build fair software. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: June 10, 2021

2020 article

The AI Effect: Working at the Intersection of AI and SE

IEEE SOFTWARE, Vol. 37, pp. 26–35.

By: A. Carleton*, E. Harper*, T. Menzies n, T. Xie*, S. Eldh* & M. Lyu*

Sources: Web Of Science, NC State University Libraries
Added: July 13, 2020

2020 article

The Five Laws of SE for AI

IEEE SOFTWARE, Vol. 37, pp. 81–85.

By: T. Menzies n

TL;DR: It is time to talk about software engineering (SE) for artificial intelligence (AI) as industry is becoming increasingly dependent on AI software. (via Semantic Scholar)
UN Sustainable Development Goal Categories
9. Industry, Innovation and Infrastructure (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: January 21, 2020

2020 article

What disconnects Practitioner Belief and Empirical Evidence?

2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2020), pp. 286–287.

By: N. Shrikanth n & T. Menzies n

author keywords: defects; beliefs; practitioner; empirical software engineering
TL;DR: Most of the widely-held beliefs studied are only sporadically supported in the data; i.e. large effects can appear in project data and then disappear in subsequent releases. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: May 10, 2021

2020 journal article

iSENSE2.0: Improving Completion-aware Crowdtesting Management with Duplicate Tagger and Sanity Checker

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 29(4).

By: J. Wang*, Y. Yang*, T. Menzies n & Q. Wang*

author keywords: Crowdsourced testing; test management; close prediction; term coverage; capture-recapture
TL;DR: This article investigates the necessity and feasibility of close prediction of crowdtesting tasks based on an industrial dataset, and proposes a close prediction approach named iSENSE2.0, which applies incremental sampling technique to process crowdtesting reports arriving in chronological order and organizes them into fixed-sized groups as dynamic inputs. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: November 9, 2020

2019 journal article

"Bad smells" in software analytics papers

INFORMATION AND SOFTWARE TECHNOLOGY, 112, 35–47.

By: T. Menzies n & M. Shepperd*

TL;DR: This paper proposes using “bad smells”, i.e., surface indications of deeper problems, popular in the agile software community, and considers how they may be manifest in software analytics studies, to provide guidance for producers and consumers of software analytics studies. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: June 17, 2019

2019 journal article

"Sampling" as a Baseline Optimizer for Search-Based Software Engineering

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 45(6), 597–614.

By: J. Chen n, V. Nair n, R. Krishna n & T. Menzies n

author keywords: Search-based SE; sampling; evolutionary algorithms
TL;DR: This paper compares Sway versus state-of-the-art search-based SE tools using seven models: five software product line models; and two other software process control models (concerned with project management, effort estimation, and selection of requirements) during incremental agile development. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: July 1, 2019

2019 journal article

A Deep Learning Model for Estimating Story Points

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 45(7), 637–656.

author keywords: Software analytics; effort estimation; story point estimation; deep learning
TL;DR: A prediction model for estimating story points based on a novel combination of two powerful deep learning architectures: long short-term memory and recurrent highway network is proposed, which is end-to-end trainable from raw input data to prediction outcomes without any manual feature engineering. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: August 12, 2019

2019 journal article

Bellwethers: A Baseline Method for Transfer Learning

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 45(11), 1081–1105.

By: R. Krishna n & T. Menzies n

author keywords: Estimation; Software; Software engineering; Task analysis; Benchmark testing; Complexity theory; Analytical models; Transfer learning; defect prediction; bad smells; issue close time; effort estimation; prediction
TL;DR: Using bellwethers as a baseline method for transfer learning against which future work should be compared is recommended, because conclusions about a community are stable as long as this Bellwether continues as the best oracle. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: December 9, 2019

2019 journal article

FAST(2): An intelligent assistant for finding relevant papers

EXPERT SYSTEMS WITH APPLICATIONS, 120, 57–71.

By: Z. Yu n & T. Menzies n

author keywords: Active learning; Literature reviews; Text mining; Semi-supervised learning; Relevance feedback; Selection process
TL;DR: It is shown that FAST2 robustly optimizes the human effort to find most (95%) of the relevant software engineering papers while also compensating for the errors made by humans during the review process. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries, ORCID
Added: February 18, 2019

2019 journal article

How to "DODGE" Complex Software Analytics

IEEE Transactions on Software Engineering, 47(10), 1–1.

By: A. Agrawal*, W. Fu, D. Chen*, X. Shen n & T. Menzies n

author keywords: Tuning; Text mining; Software; Task analysis; Optimization; Software engineering; Tools; Software analytics; hyperparameter optimization; defect prediction; text mining
TL;DR: By ignoring redundant tunings, DODGE, a tuning tool, runs orders of magnitude faster, while also generating learners with more accurate predictions than seen in prior state-of-the-art approaches. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries, Crossref
Added: January 25, 2020

2019 journal article

Images don't lie: Duplicate crowdtesting reports detection with screenshot information

INFORMATION AND SOFTWARE TECHNOLOGY, 110, 139–155.

By: J. Wang*, M. Li*, S. Wang*, T. Menzies n & Q. Wang*

author keywords: Crowdtesting; Duplicate report; Similarity detection
TL;DR: This work proposes SETU which combines information from the ScrEenshots and the TextUal descriptions to detect duplicate crowdtesting reports and designs a hierarchical algorithm to detect duplicates based on the four similarity scores derived from the four features respectively. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: May 13, 2019

2019 article

Predicting Breakdowns in Cloud Services (with SPIKE)

ESEC/FSE'2019: PROCEEDINGS OF THE 2019 27TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, pp. 916–924.

By: J. Chen n, J. Chakraborty n, P. Clark*, K. Haverlock*, S. Cherian* & T. Menzies n

author keywords: Cloud; optimization; data mining; parameter tuning
TL;DR: SPIKE is a data mining tool which can predict upcoming service breakdowns, half an hour into the future, and performed relatively better than other widely-used learning methods (neural nets, random forests, logistic regression). (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: October 7, 2019

2019 article

TERMINATOR: Better Automated UI Test Case Prioritization

ESEC/FSE'2019: PROCEEDINGS OF THE 2019 27TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, pp. 883–894.

By: Z. Yu n, F. Fahid n, T. Menzies n, G. Rothermel n, K. Patrick* & S. Cherian*

author keywords: automated UI testing; test case prioritization; total recall
TL;DR: A novel TCP approach is proposed, that dynamically re-prioritizes the test cases when new failures are detected, by applying and adapting a state of the art framework from the total recall problem. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: October 7, 2019

2019 article

Take Control (On the Unreasonable Effectiveness of Software Analytics)

2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE (ICSE-SEIP 2019), pp. 265–266.

By: T. Menzies n

TL;DR: The number of variables required to make predictions about SE projects is remarkably small, which means that most of the things the authors think might affect software quality have little impact in practice and controlling just a few key variables can be enough to improve software quality. (via Semantic Scholar)
UN Sustainable Development Goal Categories
9. Industry, Innovation and Infrastructure (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: January 13, 2020

2019 article

iSENSE: Completion-Aware Crowdtesting Management

2019 IEEE/ACM 41ST INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2019), pp. 912–923.

author keywords: Crowdtesting; automated close prediction; test completion; crowdtesting management
TL;DR: This paper investigates the necessity and feasibility of close prediction of crowdtesting tasks based on an industrial dataset, and designs 8 methods for close prediction, based on various models including the bug trend, bug arrival model, and capture-recapture model; a median of 91% bugs can be detected with 49% saved cost. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: September 7, 2020

2018 article

Actionable Analytics for Software Engineering INTRODUCTION

IEEE SOFTWARE, Vol. 35, pp. 51–53.

By: Y. Yang*, D. Falessi*, T. Menzies n & J. Hihn*

TL;DR: This theme issue aims to reflect on actionable analytics for software engineering and to document a catalog of success stories in which analytics has been proven actionable and useful, in some significant way, in an organization. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2018 article

Applications of Psychological Science for Actionable Analytics

ESEC/FSE'18: PROCEEDINGS OF THE 2018 26TH ACM JOINT MEETING ON EUROPEAN SOFTWARE ENGINEERING CONFERENCE AND SYMPOSIUM ON THE FOUNDATIONS OF SOFTWARE ENGINEERING, pp. 456–467.

By: D. Chen n, W. Fu n, R. Krishna n & T. Menzies n

author keywords: Decision trees; heuristics; software analytics; psychological science; empirical studies; defect prediction
TL;DR: Assessment of Fast-and-Frugal Trees for software analytics finds that FFTs are remarkably effective in that their models are very succinct (5 lines or less describing a binary decision tree) while also outperforming results from very recent, top-level, conference papers. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: March 25, 2019

2018 journal article

Beyond evolutionary algorithms for search-based software engineering

INFORMATION AND SOFTWARE TECHNOLOGY, 95, 281–294.

By: J. Chen n, V. Nair n & T. Menzies n

TL;DR: This work builds a very large initial population which is then culled using a recursive bi-clustering chop approach, evaluates this approach on multiple SE models, unconstrained as well as constrained, and compares its performance with standard evolutionary algorithms. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2018 article

Data-Driven Search-based Software Engineering

2018 IEEE/ACM 15TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR), pp. 341–352.

By: V. Nair n, A. Agrawal n, J. Chen n, W. Fu n, G. Mathew n, T. Menzies n, L. Minku*, M. Wagner* & Z. Yu n

TL;DR: It is argued that combining these two fields is useful for situations which require learning from a large data source or when optimizers need to know the lay of the land to find better solutions, faster. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: March 4, 2019

2018 journal article

Faster discovery of faster system configurations with spectral learning

AUTOMATED SOFTWARE ENGINEERING, 25(2), 247–277.

By: V. Nair n, T. Menzies n, N. Siegmund* & S. Apel*

author keywords: Performance prediction; Spectral learning; Decision trees; Search-based software engineering; Sampling
TL;DR: It is demonstrated that predictive models generated by WHAT can be used by optimizers to discover system configurations that closely approach the optimal performance. (via Semantic Scholar)
UN Sustainable Development Goal Categories
9. Industry, Innovation and Infrastructure (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2018 journal article

Finding better active learners for faster literature reviews

EMPIRICAL SOFTWARE ENGINEERING, 23(6), 3161–3186.

By: Z. Yu n, N. Kraft* & T. Menzies n

author keywords: Active learning; Systematic literature review; Software engineering; Primary study selection
TL;DR: This paper finds and implements FASTREAD, a faster technique for studying a large corpus of documents, combining and parametrizing the most efficient active learning algorithms. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (OpenAlex)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: December 17, 2018

2018 article

Guest Editorial for the Special Section from the 9th International Symposium on Search Based Software Engineering

Petke, J., & Menzies, T. (2018, December). INFORMATION AND SOFTWARE TECHNOLOGY, Vol. 104, pp. 194–194.

By: J. Petke* & T. Menzies n

TL;DR: Looking forward, to the forthcoming age of autonomous cars and flying drones, it is clear that software will be the key that determines what the authors can do, when, where, and how. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: November 19, 2018

2018 journal article

Heterogeneous Defect Prediction

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 44(9), 874–896.

By: J. Nam*, W. Fu n, S. Kim*, T. Menzies n & L. Tan*

author keywords: Defect prediction; quality assurance; heterogeneous metrics; transfer learning
Sources: Web Of Science, ORCID, NC State University Libraries
Added: October 16, 2018

2018 article

Is "Better Data" Better Than "Better Data Miners"? On the Benefits of Tuning SMOTE for Defect Prediction

PROCEEDINGS 2018 IEEE/ACM 40TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE), pp. 1050–1061.

By: A. Agrawal n & T. Menzies n

author keywords: Search based SE; defect prediction; classification; data analytics for software engineering; SMOTE; imbalanced data; preprocessing
TL;DR: For software analytic tasks like defect prediction, data pre-processing can be more important than classifier choice, ranking studies are incomplete without such pre-processing, and SMOTUNED is a promising candidate for pre-processing. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: January 21, 2019

2018 article

Micky: A Cheaper Alternative for Selecting Cloud Instances

PROCEEDINGS 2018 IEEE 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), pp. 409–416.

By: C. Hsu n, V. Nair n, T. Menzies n & V. Freeh n

TL;DR: A collective-optimizer is created, MICKY, that reformulates the task of finding the near-optimal cloud configuration as a multi-armed bandit problem and can achieve on average 8.6 times reduction in measurement cost as compared to the state-of-the-art method while finding near-optimal solutions. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: January 21, 2019

2018 article

RIOT: a Stochastic-based Method for Workflow Scheduling in the Cloud

PROCEEDINGS 2018 IEEE 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD), pp. 318–325.

By: J. Chen n & T. Menzies n

author keywords: cloud computing; workflow scheduling; multiobjective optimization
TL;DR: RIOT (Randomized Instance Order Types) is a stochastic-based method for workflow scheduling that groups the tasks in the workflow into virtual machines via a probability model and then uses an effective surrogate-based method to assess a large number of potential schedulings. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: January 21, 2019

2018 journal article

The Unreasonable Effectiveness of Software Analytics

IEEE SOFTWARE, 35(2), 96–98.

By: T. Menzies n

TL;DR: In theory, software analytics shouldn’t work because software project behavior shouldn’t be predictable, but it does. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2018 article

Total Recall, Language Processing, and Software Engineering

PROCEEDINGS OF THE 4TH ACM SIGSOFT INTERNATIONAL WORKSHOP ON NLP FOR SOFTWARE ENGINEERING (NL4SE '18), pp. 10–13.

By: Z. Yu n & T. Menzies n

author keywords: Software engineering; active learning; natural language processing; information retrieval; total recall; literature review; vulnerabilities
TL;DR: It is claimed that by applying and adapting the state of the art active learning and natural language processing algorithms for solving the total recall problem, two important software engineering tasks can also be addressed: supporting large literature reviews and identifying software security vulnerabilities. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (Web of Science)
Sources: Web Of Science, NC State University Libraries
Added: April 15, 2019

2018 article

VOICE OF EVIDENCE From Voice of Evidence to Redirections

IEEE SOFTWARE, Vol. 35, pp. 11–13.

By: R. Prikladnicki & T. Menzies n

TL;DR: The Voice of Evidence department is being relaunched as Redirections, which will focus on the surprises in software engineering. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2018 journal article

What is wrong with topic modeling? And how to fix it using search-based software engineering

INFORMATION AND SOFTWARE TECHNOLOGY, 98, 74–88.

By: A. Agrawal n, W. Fu n & T. Menzies n

author keywords: Topic modeling; Stability; LDA; Tuning; Differential evolution
TL;DR: LDADE, a search-based software engineering tool which uses Differential Evolution (DE) to tune the LDA’s parameters, is used to provide a method in which distributions generated by LDA are more stable and can be used for further analysis. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: August 6, 2018

2017 article

A guest editorial: special issue on search based software engineering and data mining

Kessentini, M., & Menzies, T. (2017, September). AUTOMATED SOFTWARE ENGINEERING, Vol. 24, pp. 573–574.

By: M. Kessentini* & T. Menzies n

TL;DR: This special issue was to understand the cost/benefit tradeoffs in combining SBSE and DM based on real-world case studies covering several aspects of the software lifecycle and presents some of the latest innovative results in that direction. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2017 journal article

Are delayed issues harder to resolve? Revisiting cost-to-fix of defects throughout the lifecycle

EMPIRICAL SOFTWARE ENGINEERING, 22(4), 1903–1935.

By: T. Menzies n, W. Nichols*, F. Shull* & L. Layman*

author keywords: Software economics; Phase delay; Cost to fix
TL;DR: No evidence for the delayed issue effect is found, which predicts that new development processes that promise to faster retire more issues will not have a guaranteed return on investment (depending on the context where applied), and that a long-held truth in software engineering should not be considered a global truism. (via Semantic Scholar)
Sources: Web Of Science, ORCID, NC State University Libraries
Added: August 6, 2018

2017 journal article

Less is more: Minimizing code reorganization using XTREE

INFORMATION AND SOFTWARE TECHNOLOGY, 88, 53–66.

By: R. Krishna n, T. Menzies n & L. Layman*

author keywords: Bad smells; Performance prediction; Decision trees
TL;DR: Before undertaking a code reorganization based on a bad smell report, use a framework like XTREE to check and ignore any such operations that are useless; i.e. ones which lack evidence in the historical record that it is useful to make that change. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2017 journal article

Negative results for software effort estimation

EMPIRICAL SOFTWARE ENGINEERING, 22(5), 2658–2683.

By: T. Menzies n, Y. Yang*, G. Mathew n, B. Boehm* & J. Hihn*

author keywords: Effort estimation; COCOMO; CART; Nearest neighbor; Clustering; Feature selection; Prototype generation; Bootstrap sampling; Effect size; A12
TL;DR: The experiments of this paper show that, at least for effort estimation, how data is collected is more important than what learner is applied to that data, and that when COCOMO-style attributes are available, it is strongly recommended to use that data and use COCOMO to generate predictions. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2017 journal article

TMAP: Discovering relevant API methods through text mining of API documentation

Journal of Software: Evolution and Process, 29(12), e1845.

By: R. Pandita n, R. Jetley*, S. Sudarsan*, T. Menzies n & L. Williams n

author keywords: API documents; API mappings; text mining
TL;DR: Text mining based approach (TMAP) is proposed to discover relevant API mappings using text mining on the natural language API method descriptions to support software developers in migrating an application from a source API to a target API by automatically discovering relevant method mappings. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: February 24, 2020

2017 conference paper

The NASA analogy software cost model: A web-based cost analysis tool

2017 IEEE Aerospace Conference.

By: J. Hihn*, M. Saing*, E. Huntington*, J. Johnson*, T. Menzies n & G. Mathew n

TL;DR: This paper provides an overview of the many new features and algorithm updates in the release of the NASA Analogy Software Cost Tool (ASCoT). (via Semantic Scholar)
Sources: NC State University Libraries
Added: August 6, 2018

2017 article

Using Bad Learners to Find Good Configurations

ESEC/FSE 2017: PROCEEDINGS OF THE 2017 11TH JOINT MEETING ON FOUNDATIONS OF SOFTWARE ENGINEERING, pp. 257–267.

By: V. Nair n, T. Menzies n, N. Siegmund* & S. Apel*

author keywords: Performance Prediction; SBSE; Sampling; Rank-based method
TL;DR: This paper demonstrates that performance models that are cheap to learn but inaccurate can still be used to rank configurations and hence find the optimal configuration and significantly reduce the cost as well as the time required to build performance models. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2016 article

"How not to Do it": Anti-patterns for Data Science in Software Engineering

2016 IEEE/ACM 38TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING COMPANION (ICSE-C), pp. 887–887.

By: T. Menzies n

author keywords: Data Science; Software Analytics
TL;DR: This technical briefing will present common classes of errors seen when large communities of researchers and commercial software engineers use, and misuse, data mining tools, and show how to avoid them. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2016 article

An (Accidental) Exploration of Alternatives to Evolutionary Algorithms for SBSE

SEARCH BASED SOFTWARE ENGINEERING, SSBSE 2016, Vol. 9962, pp. 96–111.

By: V. Nair n, T. Menzies n & J. Chen n

author keywords: Search-based SE; Sampling; Evolutionary algorithms
TL;DR: Experiments with Software Engineering (SE) models show that SWAY’s performance improvement is competitive with standard MOEAs while terminating over an order of magnitude faster. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2016 journal article

Correlation is not causation (or, when not to scream "Eureka!")

Perspectives on Data Science for Software Engineering, 327–330.

By: T. Menzies n

TL;DR: A natural response that stems from the excitement of doing science and discovering an effect that no one has ever seen before: don’t do it. (via Semantic Scholar)
Sources: NC State University Libraries
Added: August 6, 2018

2016 conference paper

Improving and expanding NASA software cost estimation methods

2016 IEEE Aerospace Conference.

By: J. Hihn*, L. Juster*, J. Johnson*, T. Menzies n & G. Michael n

TL;DR: The research in developing an analogy method for estimating NASA spacecraft flight software using spectral clustering on system characteristics (symbolic non-numerical data) is summarized and its performance is evaluated by comparing it to a number of the most commonly used estimation methods. (via Semantic Scholar)
Sources: NC State University Libraries
Added: August 6, 2018

2016 journal article

Learning Mitigations for Pilot Issues When Landing Aircraft (via Multiobjective Optimization and Multiagent Simulations)

IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 46(2), 221–230.

By: J. Krall, T. Menzies n & M. Davies*

author keywords: Active Learning; cognitive modeling; human factors; multiobjective optimization
TL;DR: It is shown that, using CDA+GALE, it is possible to identify and mitigate factors that make pilots unable to complete all their required tasks in the context of different 1) function allocation strategies, 2) pilot cognitive control strategies, and 3) operational contexts that impact safe aircraft operation. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2016 journal article

Perspectives on data science for software engineering

Perspectives on Data Science for Software Engineering, 3–6.

By: T. Menzies n, L. Williams n & T. Zimmermann*

Sources: NC State University Libraries
Added: August 6, 2018

2016 journal article

Seven principles of inductive software engineering: What we do is different

Perspectives on Data Science for Software Engineering, 13–17.

By: T. Menzies n

TL;DR: Inductive software engineering is the branch of software engineering focusing on the delivery of data-mining based software applications, which is the extraction of small patterns from larger data sets. (via Semantic Scholar)
Sources: NC State University Libraries
Added: August 6, 2018

2016 article

Too Much Automation? The Bellwether Effect and Its Implications for Transfer Learning

2016 31ST IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE), pp. 122–131.

By: R. Krishna n, T. Menzies n & W. Fu n

author keywords: Defect Prediction; Data Mining; Transfer learning
TL;DR: This bellwether method is a useful (and very simple) transfer learning method; “bellwethers” are a baseline method against which future transfer learners should be compared; sometimes, when building increasingly complex automatic methods, researchers should pause and compare their supposedly more sophisticated method against simpler alternatives. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2016 article

Topic Modeling of NASA Space System Problem Reports

13TH WORKING CONFERENCE ON MINING SOFTWARE REPOSITORIES (MSR 2016), pp. 303–314.

By: L. Layman*, A. Nikora*, J. Meek* & T. Menzies n

author keywords: topic modeling; data mining; defects; natural language processing; LDA
TL;DR: Topic modeling is applied to a corpus of NASA problem reports to extract trends in testing and operational failures, and finds that hardware material and flight software issues are common during the integration and testing phase, while ground station software and equipment issues are more common during the operations phase. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (OpenAlex)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2016 journal article

Tuning for software analytics: Is it really necessary?

Information and Software Technology, 76, 135–146.

By: W. Fu n, T. Menzies n & X. Shen n

author keywords: Defect prediction; CART; Random forest; Differential evolution; Search-based software engineering
TL;DR: This paper finds that it is no longer enough to just run a data miner and present the result without conducting a tuning optimization study, and that standard methods in software analytics need to change. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries, Crossref
Added: August 6, 2018

2015 article

1st International Workshop on Big Data Software Engineering (BIGDSE 2015)

2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, Vol 2, pp. 965–966.

By: L. Baresi*, T. Menzies n, A. Metzger* & T. Zimmermann*

author keywords: Big data; software engineering; software analytics
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 article

Actionable = Cluster plus Contrast?

2015 30TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING WORKSHOP (ASEW), pp. 14–17.

By: R. Krishna n & T. Menzies n

author keywords: Prediction; planning; instance-based reasoning; model-based reasoning; data mining; software engineering
TL;DR: This paper explores two approaches for learning minimal, yet effective, changes to software project artifacts. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 article

Cross-Project Data for Software Engineering

Menzies, T. (2015, December). COMPUTER, Vol. 48, pp. 6–6.

By: T. Menzies n

TL;DR: This installment of Computer's series highlighting the work published in IEEE Computer Society journals comes from IEEE Transactions on Software Engineering. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 article

Data Mining Methods and Cost Estimation Models: Why is it so hard to infuse new ideas?

2015 30TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING WORKSHOP (ASEW), pp. 5–9.

By: J. Hihn* & T. Menzies n

author keywords: software; cost estimation; effort estimation; data mining; technology infusion
TL;DR: The underlying causes of the problems in the infusion of software costing models into NASA are suggested to be rooted in the fact that the different players have fundamental differences in mental models, vocabulary and objectives. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 journal article

GALE: Geometric Active Learning for Search-Based Software Engineering

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 41(10), 1001–1018.

By: J. Krall, T. Menzies n & M. Davies

author keywords: Multi-objective optimization; search based software engineering; active learning
TL;DR: GALE is a near-linear time MOEA that builds a piecewise approximation to the surface of best solutions along the Pareto frontier that finds comparable solutions to standard methods using far fewer evaluations. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 article

Guest editorial: special issue on realizing AI synergies in software engineering

Harrison, R., & Menzies, T. (2015, March). AUTOMATED SOFTWARE ENGINEERING, Vol. 22, pp. 1–2.

By: R. Harrison* & T. Menzies n

Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 journal article

Guest editorial: special issue on realizing AI synergies in software engineering (part 2)

Automated Software Engineering, 22(2), 143–144.

By: R. Harrison* & T. Menzies n

Sources: NC State University Libraries
Added: August 6, 2018

2015 article

Guest editorial: special multi-issue on selected topics in Automated Software Engineering

Menzies, T., & Pasareanu, C. (2015, September). AUTOMATED SOFTWARE ENGINEERING, Vol. 22, pp. 289–290.

By: T. Menzies n & C. Pasareanu

Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 journal article

Guest editorial: special multi-issue on selected topics in automated software engineering

Automated Software Engineering, 22(4), 437–438.

By: T. Menzies* & C. Pasareanu*

Sources: Crossref, NC State University Libraries
Added: February 24, 2020

2015 conference paper

LACE2: Better privacy-preserving data sharing for cross project defect prediction

2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, Vol 1, 801–811.

By: F. Peters*, T. Menzies n & L. Layman*

TL;DR: Previous work with LACE2 is extended, which reduces the amount of data shared by using multi-party data sharing, and the multi-party approach of LACE2 yields higher privacy than the prior approach without damaging predictive efficacy. (via Semantic Scholar)
Sources: NC State University Libraries
Added: August 6, 2018

2015 journal article

Reduced-Item Food Audits Based on the Nutrition Environment Measures Surveys

American Journal of Preventive Medicine, 49(4), e23–e33.

By: S. Partington*, T. Menzies*, T. Colburn, B. Saelens* & K. Glanz*

MeSH headings : California; Cities / statistics & numerical data; Environment; Food / statistics & numerical data; Food Supply / statistics & numerical data; Machine Learning; Nutrition Surveys; Residence Characteristics / statistics & numerical data; Restaurants / statistics & numerical data; Washington; West Virginia
TL;DR: Reduced-item audit tools can reduce the burden and complexity of large-scale or repeated assessments of the retail food environment without compromising measurement quality. (via Semantic Scholar)
UN Sustainable Development Goal Categories
2. Zero Hunger (OpenAlex)
Sources: Crossref, NC State University Libraries
Added: February 24, 2020

2015 article

The Art and Science of Analyzing Software Data; Quantitative Methods

2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, Vol 2, pp. 959–960.

By: T. Menzies n, L. Minku* & F. Peters

TL;DR: This tutorial discusses the following: when local data is scarce, how to adapt data from other organizations to local problems, and when working with data of dubious quality, how to prune spurious information. (via Semantic Scholar)
Sources: Web Of Science, NC State University Libraries
Added: August 6, 2018

2015 journal article

Transfer learning in effort estimation

Empirical Software Engineering, 20(3), 813–843.

By: E. Kocaguneli*, T. Menzies* & E. Mendes*

author keywords: Transfer learning; Effort estimation; Data mining; k-NN
TL;DR: This paper uses data on 154 projects from 2 sources to investigate transfer learning between different time intervals and 195 projects from 51 sources to provide evidence on the value of transfer learning for traditional cross-company learning problems. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 4, 2021

2014 journal article

Choose to Change: The West Virginia Early Childhood Obesity Prevention Project

Journal of Nutrition Education and Behavior, 46(4), S197.

By: S. Partington*, E. Murphy*, E. Bowen*, D. Lacombe*, G. Piras, L. Cottrell*, T. Menzies*

Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2014 journal article

Special issue on realizing artificial intelligence synergies in software engineering

Software Quality Journal, 22(1), 49–50.

By: T. Menzies* & M. Mernik*

Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2013 journal article

Choose to Change: The West Virginia Early Childhood Obesity Prevention Project

Journal of Nutrition Education and Behavior, 45(4), S92.

By: S. Partington*, E. Murphy*, E. Bowen*, D. Lacombe*, G. Piras, L. Carson, L. Cottrell*, T. Menzies*

Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2013 journal article

Guest editorial for the Special Section on BEST PAPERS from the 2011 conference on Predictive Models in Software Engineering (PROMISE)

Information and Software Technology, 55(8), 1477–1478.

By: T. Menzies*

Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2013 journal article

Predictive models in software engineering

Empirical Software Engineering, 18(3), 433–434.

By: T. Menzies* & G. Koru*

Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2013 journal article

Software effort models should be assessed via leave-one-out validation

Journal of Systems and Software, 86(7), 1879–1890.

By: E. Kocaguneli* & T. Menzies*

author keywords: Software cost estimation; Prediction system; Bias; Variance
TL;DR: This work deprecates N-way and endorses LOO validation for assessing effort models because of their generated B&V values and runtimes; in terms of reproducibility, LOO removes one cause of conclusion instability. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2012 chapter

Crowd-Sourced Knowledge Bases

In Knowledge Management and Acquisition for Intelligent Systems (pp. 258–271).

By: Y. Kim*, B. Kang*, S. Ryu*, P. Compton*, S. Han* & T. Menzies*

TL;DR: Although people vary in document classification, simple merging may produce reasonable consensus knowledge bases, according to the results of experiments with 27 students to classify the same set of 1000 documents. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2012 journal article

Finding conclusion stability for selecting the best effort predictor in software effort estimation

Automated Software Engineering, 20(4), 543–567.

By: J. Keung*, E. Kocaguneli* & T. Menzies*

author keywords: Effort estimation; Data mining; Stability; Linear regression; Regression trees; Neural nets; Analogy; MMRE; Evaluation criteria
TL;DR: Aggregate results show that it is now possible to draw stable conclusions about the relative performance of SEE predictors, and that regression trees or analogy-based methods are the best performers. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2012 journal article

Special issue on repeatable results in software engineering prediction

Empirical Software Engineering, 17(1-2), 1–17.

By: T. Menzies* & M. Shepperd*

Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2011 journal article

Guest editorial: learning to organize testing

Automated Software Engineering, 19(2), 137–140.

By: A. Bener* & T. Menzies*

TL;DR: After a decade of intensive work into data mining to make best use of testing resources, it is time to ask: what has been learned from all that research? (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2011 journal article

Kernel methods for software effort estimation

Empirical Software Engineering, 18(1), 1–24.

By: E. Kocaguneli*, T. Menzies* & J. Keung*

author keywords: Effort estimation; Data mining; Kernel function; Bandwidth
TL;DR: It is found that non-uniform weighting through kernel methods cannot outperform uniform weighting ABE and kernel type and bandwidth parameters do not produce a definite effect on estimation performance. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2011 journal article

Learning patterns of university student retention

Expert Systems with Applications, 38(12), 14984–14996.

By: A. Nandeshwar*, T. Menzies* & A. Nelson*

author keywords: Data mining; Student retention; Predictive modeling; Financial aid
TL;DR: Using these techniques, for the goal of predicting if students will remain for the first three years of an undergraduate degree, the following factors were found to be informative: family background and family's social-economic status, high school GPA and test scores. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (OpenAlex)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2010 journal article

A second look at Faster, Better, Cheaper

Innovations in Systems and Software Engineering, 6(4), 319–335.

By: O. El-Rawas* & T. Menzies*

author keywords: Software engineering; Predictor models; COCOMO; Faster Better Cheaper; Simulated annealing; Software processes
TL;DR: A stochastic AI tool is utilized to determine the behavior of FBC and it is found that, with certain caveats, it is a viable approach to systems development. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2010 journal article

Automatically finding the control variables for complex system behavior

Automated Software Engineering, 17(4), 439–468.

By: G. Gay, T. Menzies*, M. Davies* & K. Gundy-Burlet*

author keywords: Contrast-set learning; Search-based software engineering; Simulation; Optimization; Monte Carlo filtering
TL;DR: This paper benchmarks the TAR3 and TAR4.1 treatment learners against optimization techniques across three complex systems, including two projects from the Robust Software Engineering group within the National Aeronautics and Space Administration (NASA) Ames Research Center, and shows that treatment learning is both faster and more accurate than traditional optimization methods. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2010 journal article

Defect prediction from static code features: current results, limitations, new approaches

Automated Software Engineering, 17(4), 375–407.

By: T. Menzies*, Z. Milton*, B. Turhan*, B. Cukic*, Y. Jiang* & A. Bener*

author keywords: Defect prediction; Static code features; WHICH
TL;DR: It is hypothesized that the limits of the standard learning goal of maximizing area under the curve (AUC) of the probability of false alarms and probability of detection “AUC(pd, pf)” are reached, and certain widely used learners perform much worse than simple manual methods. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2010 journal article

Practical considerations in deploying statistical methods for defect prediction: A case study within the Turkish telecommunications industry

Information and Software Technology, 52(11), 1242–1257.

By: A. Tosun*, A. Bener*, B. Turhan* & T. Menzies*

author keywords: Software defect prediction; Experience report; Naive Bayes; Static code attributes
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2010 chapter

Regularities in Learning Defect Predictors

In Product-Focused Software Process Improvement (pp. 116–130).

By: B. Turhan*, A. Bener* & T. Menzies*

TL;DR: It is shown that bug reports need not necessarily come from the local projects in order to learn defect prediction models and it is demonstrated that using imported data from different sites can make it suitable for predicting defects at the local site. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2010 journal article

Sharing experiments using open-source software

Software: Practice and Experience, 41(3), 283–305.

By: A. Nelson*, T. Menzies* & G. Gay

author keywords: open source; data mining
Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2010 journal article

Stable rankings for different effort models

Automated Software Engineering, 17(4), 409–437.

By: T. Menzies*, O. Jalali*, J. Hihn*, D. Baker* & K. Lum*

author keywords: COCOMO; Effort estimation; Data mining; Evaluation
TL;DR: While there exists no single universal “best” effort estimation method, there appears to exist a small number of most useful methods, which should be preceded by a “selection study” that finds the best local estimator. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2009 journal article

Accurate estimates without local data?

Software Process: Improvement and Practice, 14(4), 213–225.

By: T. Menzies*, S. Williams*, O. Elrawas*, D. Baker*, B. Boehm*, J. Hihn, K. Lum, R. Madachy*

TL;DR: It is shown empirically that, for the USC COCOMO family of models, the effects of P dominate the effects of T, i.e., the output variance of these models can be controlled without using local data to constrain the tuning variance. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2009 journal article

Finding robust solutions in requirements models

Automated Software Engineering, 17(1), 87–116.

By: G. Gay, T. Menzies*, O. Jalali*, G. Mundy*, B. Gilkerson*, M. Feather*, J. Kiper*

TL;DR: In experiments with real-world requirements engineering models, it is shown that KEYS2 can generate decision ordering diagrams in O(N²) and out-performs other search algorithms (simulated annealing, ASTAR, MaxWalkSat) when assessed in terms of reducing inference times, increasing solution quality, and decreasing variance. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2009 chapter

On the Relative Merits of Software Reuse

In Trustworthy Software Development Processes (pp. 186–197).

By: A. Orrego*, T. Menzies* & O. El-Rawas*

TL;DR: It is argued that the merits of software reuse need to be evaluated on a project-by-project basis, and AI search over process models is useful for such an assessment, particularly when there is not sufficient data for precisely tuning a simulation model. (via Semantic Scholar)
UN Sustainable Development Goal Categories
4. Quality Education (OpenAlex)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2009 journal article

On the relative value of cross-company and within-company data for defect prediction

Empirical Software Engineering, 14(5), 540–578.

By: B. Turhan*, T. Menzies*, A. Bener* & J. Di Stefano*

author keywords: Defect prediction; Learning; Metrics (product metrics); Cross-company; Within-company; Nearest-neighbor filtering
TL;DR: It is demonstrated in this paper that the minimum number of data samples required to build effective defect predictors can be quite small and can be collected quickly within a few months. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2008 chapter

Accurate Estimates without Calibration?

In Making Globally Distributed Software Development a Success Story (pp. 210–221).

By: T. Menzies*, O. Elrawas*, B. Boehm*, R. Madachy*, J. Hihn, D. Baker*, K. Lum

TL;DR: It is shown that while it is always preferable to tune models to local data, it is possible to learn process control options without that data. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2008 journal article

Editorial, special issue, repeatable experiments in software engineering

Empirical Software Engineering, 13(5), 469–471.

By: T. Menzies*

TL;DR: The PROMISE project has been running for 4 years now and aims to create large libraries of repeatable experiments in software engineering and is somewhat different to other workshops that deal with learning from software data. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2008 journal article

Learning better IV&V practices

Innovations in Systems and Software Engineering, 4(2), 169–183.

By: T. Menzies*, M. Benson, K. Costello, C. Moats, M. Northey & J. Richardson

author keywords: IV&V; Data mining; Early life cycle defect prediction; NASA
TL;DR: This is the first reproducible report of a predictor for issue frequency and severity that can be applied early in the life cycle, and it is claimed that this predictor is built using public-domain data and software. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2008 journal article

Special issue on information retrieval for program comprehension

Empirical Software Engineering, 14(1), 1–4.

By: L. Etzkorn* & T. Menzies*

TL;DR: These new IR4PC semantic measures examine informal information in the tokens within the software itself as well as the natural language content in external documentation such as software requirements documents or software design documents. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 21, 2021

2006 journal article

Just enough learning (of association rules): the TAR2 “Treatment” learner

Artificial Intelligence Review, 25(3), 211–229.

By: T. Menzies* & Y. Hu*

author keywords: TAR2; treatment learning; contrast set learning
TL;DR: A much simpler learner can suffice in domains with narrow funnels; i.e. where most domain variables are controlled by a very small subset; such a learner is TAR2: a weighted-class minimal contrast-set association rule learner that utilizes confidence-based pruning, but not support-based pruning. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: August 28, 2020

2004 conference paper

Mining repositories to assist in project planning and resource allocation

"International Workshop on Mining Software Repositories (MSR 2004)" W17S Workshop - 26th International Conference on Software Engineering.

By: T. Menzies*

Event: "International Workshop on Mining Software Repositories (MSR 2004)" W17S Workshop - 26th International Conference on Software Engineering

TL;DR: This article is a reply to Software repositories plus defect logs are useful for learning defect detectors, a useful resource allocation tool for software managers and three counter arguments to such a proposal. (via Semantic Scholar)
Sources: Crossref, NC State University Libraries
Added: January 21, 2021
