Bowen Xu

Works (5)

Updated: May 24th, 2024 05:00

2024 journal article

Representation Learning for Stack Overflow Posts: How Far Are We?

ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY, 33(3).

By: J. He, X. Zhou, B. Xu, T. Zhang, K. Kim, Z. Yang, F. Thung, I. Irsan, D. Lo

author keywords: Stack Overflow; transformers; pre-trained models
Source: Web Of Science
Added: May 13, 2024

2024 journal article

Stealthy Backdoor Attack for Code Models

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 50(4), 721–741.

By: Z. Yang, B. Xu, J. Zhang, H. Kang, J. Shi, J. He, D. Lo

author keywords: Adversarial attack; data poisoning; backdoor attack; pre-trained models of code
Source: Web Of Science
Added: May 20, 2024

2023 article

Are We Ready to Embrace Generative AI for Software Q&A?

2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, pp. 1713–1717.

By: B. Xu, T. Nguyen, T. Le-Cong, T. Hoang, J. Liu, K. Kim, C. Gong, C. Niu ...

TL;DR: It is suggested that human-written and ChatGPT-generated answers are semantically similar; however, human-written answers consistently outperform ChatGPT-generated ones across multiple aspects, specifically by 10% on the overall score. (via Semantic Scholar)
Source: Web Of Science
Added: January 29, 2024

2023 article

CCBERT: Self-Supervised Code Change Representation Learning

2023 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION, ICSME, pp. 182–193.

By: X. Zhou, B. Xu, D. Han, Z. Yang, J. He, D. Lo

TL;DR: CCBERT (Code Change BERT), a new Transformer-based pre-trained model that learns a generic representation of code changes based on a large-scale dataset containing massive unlabeled code changes, is proposed. (via Semantic Scholar)
Source: Web Of Science
Added: February 26, 2024

2023 article

The Devil is in the Tails: How Long-Tailed Code Distributions Impact Large Language Models

2023 38TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING, ASE, pp. 40–52.

TL;DR: An exploratory study on the distribution of SE data found that such data usually follows a skewed distribution where a small number of classes have an extensive collection of samples, while a large number of classes have very few samples, which has a substantial impact on the effectiveness of LLMs for code. (via Semantic Scholar)
Source: Web Of Science
Added: January 29, 2024
