TY - JOUR TI - Practical detection of CMS plugin conflicts in large plugin sets AU - Lima, Igor AU - Cândido, Jeanderson AU - d’Amorim, Marcelo T2 - Information and Software Technology AB - Content Management Systems (CMS), such as WordPress, are a very popular category of software for creating web sites and blogs. These systems typically build on top of plugin architectures. Unfortunately, it is not uncommon that the combined activation of multiple plugins in a CMS web site will produce unexpected behavior. Conflict-detection techniques exist but they do not scale. This paper proposes Pena, a technique to detect conflicts in large sets of plugins as those present in plugin market places. Pena takes on input a configuration, consisting of a potentially large set of plugins, and reports on output the offending plugin combinations. Pena uses an iterative divide-and-conquer search to explore the large space of plugin combinations and a staged filtering process to eliminate false alarms. We evaluated Pena with plugins selected from the WordPress official repository and compared its efficiency and accuracy against the technique that checks conflicts in all pairs of plugins. Results show that Pena is 12.4x to 19.6x more efficient than the comparison baseline and can find as many conflicts as it. DA - 2020/2// PY - 2020/2// DO - 10.1016/j.infsof.2019.106212 UR - https://doi.org/10.1016/j.infsof.2019.106212 ER - TY - CONF TI - In Opinion Holders’ Shoes: Modeling Cumulative Influence for View Change in Online Argumentation AU - Guo, Zhen AU - Zhang, Zhe AU - Singh, Munindar AB - Understanding how people change their views during multiparty argumentative discussions is important in applications that involve human communication, e.g., in social media and education. 
Existing research focuses on lexical features of individual comments, dynamics of discussions, or the personalities of participants but deemphasizes the cumulative influence of the interplay of comments by different participants on a participant’s mindset. We address the task of predicting the points where a user’s view changes given an entire discussion, thereby tackling the confusion due to multiple plausible alternatives when considering the entirety of a discussion. We make the following contributions. (1) Through a human study, we show that modeling a user’s perception of comments is crucial in predicting persuasiveness. (2) We present a sequential model for cumulative influence that captures the interplay between comments as both local and nonlocal dependencies, and demonstrate its capability of selecting the most effective information for changing views. (3) We identify contextual and interactive features and propose sequence structures to incorporate these features. Our empirical evaluation using a Reddit Change My View dataset shows that contextual and interactive features are valuable in predicting view changes, and a sequential model notably outperforms the nonsequential baseline models. C2 - 2020/4/20/ C3 - Proceedings of The Web Conference 2020 CY - Taipei DA - 2020/4/20/ DO - 10.1145/3366423.3380302 SP - 2388-2399 PB - ACM UR - http://dx.doi.org/10.1145/3366423.3380302 ER - TY - JOUR TI - Text mining to identify and extract novel disease treatments from unstructured datasets AU - Yedida, R. AU - Abrar, S.M. AU - Melo-Filho, C. AU - Muratov, E. AU - Chirkova, R. AU - Tropsha, A. 
T2 - arXiv DA - 2020/// PY - 2020/// DO - 10.48550/arxiv.2011.07959 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85171055748&partnerID=MN8TOARS ER - TY - JOUR TI - Parsimonious computing: A minority training regime for effective prediction in large microarray expression data sets T2 - arXiv AB - Rigorous mathematical investigation of learning rates used in back-propagation in shallow neural networks has become a necessity. This is because experimental evidence needs to be endorsed by a theoretical background. Such theory may be helpful in reducing the volume of experimental effort to accomplish desired results. We leveraged the functional property of Mean Square Error, which is Lipschitz continuous to compute learning rate in shallow neural networks. We claim that our approach reduces tuning efforts, especially when a significant corpus of data has to be handled. We achieve remarkable improvement in saving computational cost while surpassing prediction accuracy reported in literature. The learning rate, proposed here, is the inverse of the Lipschitz constant. The work results in a novel method for carrying out gene expression inference on large microarray data sets with a shallow architecture constrained by limited computing resources. A combination of random sub-sampling of the dataset, an adaptive Lipschitz constant inspired learning rate and a new activation function, A-ReLU helped accomplish the results reported in the paper. DA - 2020/// PY - 2020/// DO - 10.48550/arxiv.2005.08442 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85169948146&partnerID=MN8TOARS ER - TY - JOUR TI - On the value of oversampling for deep learning in software defect prediction AU - Yedida, R. AU - Menzies, T. T2 - arXiv AB - One truism of deep learning is that the automatic feature engineering (seen in the first layers of those networks) excuses data scientists from performing tedious manual feature engineering prior to running DL. 
For the specific case of deep learning for defect prediction, we show that that truism is false. Specifically, when we preprocess data with a novel oversampling technique called fuzzy sampling, as part of a larger pipeline called GHOST (Goal-oriented Hyper-parameter Optimization for Scalable Training), then we can do significantly better than the prior DL state of the art in 14/20 defect data sets. Our approach yields state-of-the-art results significantly faster deep learners. These results present a cogent case for the use of oversampling prior to applying deep learning on software defect prediction datasets. DA - 2020/// PY - 2020/// DO - 10.48550/arxiv.2008.03835 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85171025165&partnerID=MN8TOARS ER - TY - JOUR TI - How to Recognize Actionable Static Code Warnings (Using Linear SVMs) AU - Yang, X. AU - Chen, J. AU - Yedida, R. AU - Yu, Z. AU - Menzies, T. T2 - arXiv AB - Static code warning tools often generate warnings that programmers ignore. Such tools can be made more useful via data mining algorithms that select the "actionable" warnings; i.e. the warnings that are usually not ignored. In this paper, we look for actionable warnings within a sample of 5,675 actionable warnings seen in 31,058 static code warnings from FindBugs. We find that data mining algorithms can find actionable warnings with remarkable ease. Specifically, a range of data mining methods (deep learners, random forests, decision tree learners, and support vector machines) all achieved very good results (recalls and AUC (TRN, TPR) measures usually over 95% and false alarms usually under 5%). Given that all these learners succeeded so easily, it is appropriate to ask if there is something about this task that is inherently easy. We report that while our data sets have up to 58 raw features, those features can be approximated by less than two underlying dimensions. 
For such intrinsically simple data, many different kinds of learners can generate useful models with similar performance. Based on the above, we conclude that learning to recognize actionable static code warnings is easy, using a wide range of learning algorithms, since the underlying data is intrinsically simple. If we had to pick one particular learner for this task, we would suggest linear SVMs (since, at least in our sample, that learner ran relatively quickly and achieved the best median performance) and we would not recommend deep learning (since this data is intrinsically very simple). DA - 2020/// PY - 2020/// DO - 10.48550/arxiv.2006.00444 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85171025965&partnerID=MN8TOARS ER - TY - CHAP TI - Infrastructure AU - Ruddell, Benjamin L. AU - Gao, Hongkai AU - Pala, Okan AU - Rushforth, Richard AU - Sabo, John T2 - The Food-Energy-Water Nexus AB - Infrastructures handle high-volume goods and services that require heavily capitalized, large-scale, durable, reliable, shared, interdependent, and specialized systems. Infrastructure facilitates social, economic, and environmental functions by achieving a high degree of efficiency at a low marginal cost to produce, transport, distribute, quality-control, and allocate high-volume goods and services. Infrastructure development usually requires large, long-term investments and substantial consideration of risk, change, and extreme events during the design phase. This chapter explains the basic structures that form infrastructure for FEW systems and provides useful diagrams of FEW supply chains that utilize those infrastructures. 
PY - 2020/// DO - 10.1007/978-3-030-29914-9_10 UR - http://dx.doi.org/10.1007/978-3-030-29914-9_10 ER - TY - CONF TI - Analysis of Access Control Enforcement in Android AU - Enck, William C2 - 2020/// C3 - Proceedings of the 25th ACM Symposium on Access Control Models and Technologies DA - 2020/// SP - 117-118 ER - TY - CONF TI - Actions speak louder than words: Entity-sensitive privacy policy and data flow analysis with PoliCheck AU - Andow, Benjamin AU - Mahmud, Samin Yaseer AU - Whitaker, Justin AU - Enck, William AU - Reaves, Bradley AU - Singh, Kapil AU - Egelman, Serge C2 - 2020/// C3 - Proceedings of the 29th USENIX Security Symposium (USENIX Security'20) DA - 2020/// ER - TY - CONF TI - nm-Variant Systems: Adversarial-Resistant Software Rejuvenation for Cloud-Based Web Applications AU - Polinsky, Isaac AU - Martin, Kyle AU - Enck, William AU - Reiter, Michael K C2 - 2020/// C3 - Proceedings of the Tenth ACM Conference on Data and Application Security and Privacy DA - 2020/// SP - 235-246 ER - TY - JOUR TI - Optimizing Vulnerability-Driven Honey Traffic Using Game Theory AU - Anjum, Iffat AU - Miah, Mohammad Sujan AU - Zhu, Mu AU - Sharmin, Nazia AU - Kiekintveld, Christopher AU - Enck, William AU - Singh, Munindar P T2 - arXiv preprint arXiv:2002.09069 DA - 2020/// PY - 2020/// ER - TY - CONF TI - Do configuration management tools make systems more secure?
an empirical research plan AU - Rahman, Md Rayhanur AU - Enck, William AU - Williams, Laurie C2 - 2020/// C3 - Proceedings of the 7th Symposium on Hot Topics in the Science of Security DA - 2020/// SP - 1-2 ER - TY - CONF TI - Cardpliance: PCI DSS compliance of android applications AU - Mahmud, Samin Yaseer AU - Acharya, Akhil AU - Andow, Benjamin AU - Enck, William AU - Reaves, Bradley C2 - 2020/// C3 - Proceedings of the 29th USENIX Conference on Security Symposium DA - 2020/// SP - 1517-1533 ER - TY - JOUR TI - Finding Faster Configurations Using FLASH AU - Nair, Vivek AU - Yu, Zhe AU - Menzies, Tim AU - Siegmund, Norbert AU - Apel, Sven T2 - IEEE Transactions on Software Engineering AB - Finding good configurations of a software system is often challenging since the number of configuration options can be large. Software engineers often make poor choices about configuration or, even worse, they usually use a sub-optimal configuration in production, which leads to inadequate performance. To assist engineers in finding the better configuration, this article introduces Flash, a sequential model-based method that sequentially explores the configuration space by reflecting on the configurations evaluated so far to determine the next best configuration to explore. Flash scales up to software systems that defeat the prior state-of-the-art model-based methods in this area. Flash runs much faster than existing methods and can solve both single-objective and multi-objective optimization problems. The central insight of this article is to use the prior knowledge of the configuration space (gained from prior runs) to choose the next promising configuration. This strategy reduces the effort (i.e., number of measurements) required to find the better configuration. 
We evaluate Flash using 30 scenarios based on 7 software systems to demonstrate that Flash saves effort in 100 and 80 percent of cases in single-objective and multi-objective problems respectively by up to several orders of magnitude compared to state-of-the-art techniques. DA - 2020/7/1/ PY - 2020/7/1/ DO - 10.1109/TSE.2018.2870895 VL - 46 IS - 7 SP - 794-811 UR - https://doi.org/10.1109/TSE.2018.2870895 ER - TY - CONF TI - Cloudy with a Chance of Misconceptions: Exploring Users’ Perceptions and Expectations of Security and Privacy in Cloud Office Suites AU - Wermke, Dominik AU - Huaman, Nicolas AU - Stransky, Christian AU - Busch, Niklas AU - Acar, Yasemin AU - Fahl, Sascha C2 - 2020/// C3 - Sixteenth Symposium on Usable Privacy and Security (SOUPS 2020) DA - 2020/// SP - 359-377 ER - TY - CONF TI - Typing Exercises as Interactive Worked Examples for Deliberate Practice in CS Courses AU - Gaweda, Adam M AU - Lynch, Collin F AU - Seamon, Nathan AU - Oliveira, Gabriel AU - Deliwa, Alay T2 - ACM C2 - 2020/// C3 - In Proceedings of the Twenty-Second Australasian Computing Education Conference DA - 2020/// SP - 1-9 ER - TY - JOUR TI - Student Teamwork on Programming Projects: What can GitHub logs show us? AU - Gitinabard, Niki AU - Okoilu, Ruth AU - Xu, Yiqao AU - Heckman, Sarah AU - Barnes, Tiffany AU - Lynch, Collin T2 - arXiv preprint arXiv:2008.11262 DA - 2020/// PY - 2020/// ER - TY - CONF TI - Integrating Testing Throughout the CS Curriculum AU - Heckman, Sarah AU - Schmidt, Jessica Young AU - King, Jason T2 - IEEE C2 - 2020/// C3 - 2020 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW) DA - 2020/// SP - 441-444 ER - TY - CHAP TI - Routing and Wavelength (Spectrum) Assignment AU - Simmons, Jane M. AU - Rouskas, George N. 
T2 - Springer Handbook of Optical Networks A2 - Mukherjee, Biswanath A2 - Tomkos, Ioannis A2 - Tornatore, Massimo A2 - Winzer, Peter A2 - Zhao, Yongli AB - Routing a connection from its source to its destination is a fundamental component of network design. The choice of route affects numerous properties of a connection, most notably cost, latency, and availability, as well as the resulting level of congestion in the network. This chapter addresses various algorithms, strategies, and tradeoffs related to routing. At the physical optical layer, connections are assigned a unique wavelength on a particular optical fiber, a process known as wavelength assignment (WA). Together with routing, the combination of these two processes is commonly referred to as routing and wavelength assignment (RWA). In networks based on all-optical technology, WA can be challenging. It becomes more so when the physical properties of the optical signal need to be considered. This chapter covers several WA algorithms and strategies that have produced efficient designs in practical networks. A recent development in the evolution of optical networks is flexible networking, where the amount of spectrum allocated to a connection can be variable. Spectrum assignment is analogous to, though more complex than, wavelength assignment; various heuristics have been proposed as covered in this chapter. Flexible (or elastic) networks are prone to more contention issues as compared to traditional optical networks. To maintain a high degree of capacity efficiency, it is likely that spectral defragmentation will be needed in these networks; several design choices are discussed.
PY - 2020/// DO - 10.1007/978-3-030-16250-4_12 SP - 447-484 PB - Springer International Publishing SN - 9783030162498 9783030162504 UR - http://dx.doi.org/10.1007/978-3-030-16250-4_12 ER - TY - CONF TI - Predicting lower limb 3D kinematics during gait using reduced number of wearable sensors via deep learning AU - Hossain, Md Sanzid Bin AU - Lee, Youngho AU - Hong, Junghwa AU - Choi, Hwan AU - Guo, Zhishan T2 - 44th Meetings of the American Society of Biomechanics (ASB) C2 - 2020/// C3 - Proceedings of the 44th Meetings of the American Society of Biomechanics (ASB) DA - 2020/// PY - 2020/8// ER - TY - CONF TI - Mixed Criticality Scheduling of Probabilistic Real-Time Systems AU - Singh, Jasdeep AU - Santinelli, Luca AU - Reghenzani, Federico AU - Bletsas, Konstantinos AU - Guo, Zhishan T2 - 10th European Congress on Embedded Real Time Software and Systems C2 - 2020/1// C3 - Proceedings of the 10th European Congress on Embedded Real Time Software and Systems CY - Toulouse, France DA - 2020/1// PY - 2020/1// ER - TY - CONF TI - CPU Energy-Aware Parallel Real-Time Scheduling AU - Saifullah, Abusayeed AU - Fahmida, Sezana AU - Modekurthy, Venkata AU - Fisher, Nathan AU - Guo, Zhishan T2 - 32nd Euromicro Conference on Real-Time Systems (ECRTS) C2 - 2020/6// C3 - Proceedings of the 32nd Euromicro Conference on Real-Time Systems (ECRTS) CY - Modena, Italy DA - 2020/6// PY - 2020/6// ER - TY - CONF TI - Efficient Feasibility Analysis for Graph-based Real-Time Task Systems AU - Sun, Jinghao AU - Shi, Rongxiao AU - Wang, Kexuan AU - Guan, Nan AU - Guo, Zhishan T2 - International Conference on Embedded Software (EMSOFT) C2 - 2020/9// C3 - International Conference on Embedded Software (EMSOFT) DA - 2020/9// PY - 2020/9// ER - TY - CONF TI - Optimizing Energy in Non-preemptive Mixed-Criticality Scheduling by Exploiting Probabilistic Information AU - Bhuiyan, Ashikahmed AU - Reghenzani, Federico AU - Fornaciari, William AU - Guo, Zhishan T2 - International Conference on Embedded
Software (EMSOFT) C2 - 2020/// C3 - International Conference on Embedded Software (EMSOFT) DA - 2020/// PY - 2020/9// ER - TY - JOUR TI - Understanding the Impact of COVID-19 Intervention Policies on the Labor Market of the Hospitality and Retail Industries AU - Huang, Arthur AU - Makridis, Christos AU - Baker, Mark AU - Medeiros, Marcos AU - Guo, Zhishan T2 - SSRN Electronic Journal DA - 2020/// PY - 2020/// DO - 10.2139/ssrn.3637766 J2 - SSRN Journal LA - en OP - SN - 1556-5068 UR - http://dx.doi.org/10.2139/ssrn.3637766 DB - Crossref ER - TY - JOUR TI - Optimizing Energy in Non-Preemptive Mixed-Criticality Scheduling by Exploiting Probabilistic Information AU - Bhuiyan, Ashikahmed AU - Reghenzani, Federico AU - Fornaciari, William AU - Guo, Zhishan T2 - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems AB - The strict requirements on the timing correctness biased the modeling and analysis of real-time systems toward the worst-case performances. Such focus on the worst-case, however, does not provide enough information to effectively steer the resource/energy optimization. In this article, we integrate a probabilistic-based energy prediction strategy with the precise scheduling of mixed-criticality tasks, where the timing correctness must be met for all tasks at all scenarios. The dynamic voltage and frequency scaling (DVFS) is applied to this precise scheduling policy to enable energy minimization. We propose a probabilistic technique to derive an energy-efficient speed (for the processor) that minimizes the average energy consumption, while guaranteeing the (worst-case) timing correctness for all tasks, including LO-criticality ones, under any execution condition.
We present a response time analysis for such systems under the nonpreemptive fixed-priority scheduling policy. Finally, we conduct an extensive simulation campaign based on randomly generated task sets to verify the effectiveness of our algorithm (with respect to energy savings) and it reports up to 46% energy-saving. DA - 2020/11// PY - 2020/11// DO - 10.1109/tcad.2020.3012231 VL - 39 IS - 11 SP - 3906-3917 J2 - IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. OP - SN - 0278-0070 1937-4151 UR - http://dx.doi.org/10.1109/tcad.2020.3012231 DB - Crossref ER - TY - JOUR TI - Tug of perspectives: Mobile app users versus developers AU - Kuttal, S. AU - Bai, Y. AU - Scott, E. AU - Sharma, R. T2 - International Journal of Computer Science and Information Security DA - 2020/6// PY - 2020/6// VL - 18 IS - 6 SP - 83–94 ER - TY - JOUR TI - Birds of a feather flock together? A study of developers’ flocking and migration behavior in GitHub and Stack Overflow AU - Kuttal, S. AU - Sun, M. AU - Ghosh, A. AU - Sharma, R. T2 - International Journal of Computer Science and Information Security DA - 2020/6// PY - 2020/6// VL - 18 IS - 6 SP - 1–12 ER - TY - JOUR TI - Source code comments: Overlooked in the realm of code clone detection AU - Kuttal, S. AU - Ghosh, A. T2 - International Journal of Computer Science and Information Security DA - 2020/11// PY - 2020/11// DO - 10.5281/zenodo.4361801 VL - 18 IS - 11 SP - 11–22 ER - TY - JOUR TI - MP 2 SDA AU - Bian, Jiang AU - Xiong, Haoyi AU - Fu, Yanjie AU - Huan, Jun AU - Guo, Zhishan T2 - ACM Transactions on Knowledge Discovery from Data AB - Sparse Discriminant Analysis (SDA) has been widely used to improve the performance of classical Fisher’s Linear Discriminant Analysis in supervised metric learning, feature selection, and classification. 
With the increasing needs of distributed data collection, storage, and processing, enabling the Sparse Discriminant Learning to embrace the multi-party distributed computing environments becomes an emerging research topic. This article proposes a novel multi-party SDA algorithm, which can learn SDA models effectively without sharing any raw data and basic statistics among machines. The proposed algorithm (1) leverages the direct estimation of SDA to derive a distributed loss function for the discriminant learning, (2) parameterizes the distributed loss function with local/global estimates through bootstrapping, and (3) approximates a global estimation of linear discriminant projection vector by optimizing the “distributed bootstrapping loss function” with gossip-based stochastic gradient descent. Experimental results on both synthetic and real-world benchmark datasets show that our algorithm can compete with the aggregated SDA with similar performance, and significantly outperforms the most recent distributed SDA in terms of accuracy and F1-score. DA - 2020/3/13/ PY - 2020/3/13/ DO - 10.1145/3374919 VL - 14 IS - 3 SP - 1-22 J2 - ACM Trans. Knowl. Discov. Data LA - en OP - SN - 1556-4681 1556-472X UR - http://dx.doi.org/10.1145/3374919 DB - Crossref ER - TY - CONF TI - Hard-Real-Time Routing in Probabilistic Graphs to Minimize Expected Delay AU - Agrawal, Kunal AU - Baruah, Sanjoy AU - Guo, Zhishan AU - Li, Jing AU - Vaidhun, Sudharsan AB - This work studies the hard-real-time routing problem in graphs: one needs to travel from a given vertex to another within a hard deadline. For each edge in the network, the worst-case delay that may be encountered across that edge is bounded. As far as this given bound is trustworthy at a very high level of assurance, it must be guaranteed that one will meet the specified deadline. The actual delays across edges are uncertain and the goal is to minimize the total expected delay while meeting the deadline. 
We propose a comprehensive solution to this problem. Specifically, if the precise a priori estimates of the delay probability distributions are available, we develop an optimal table-driven algorithm that identifies the route with the minimum expected delay. If those estimates are not precise (i.e., unknown or dynamic), we develop an efficient Q-Learning approach that leverages the table-driven algorithm to track the true distributions rapidly, while ensuring to meet the specified hard deadline. The proposed solution suggests a promising direction towards incorporating probabilistic information and learning-based approaches into safety-critical systems without compromising safety guarantees, when it is not feasible to establish the trustworthiness of the probabilistic information at the high assurance levels required for verification purposes. C2 - 2020/12// C3 - 2020 IEEE Real-Time Systems Symposium (RTSS) DA - 2020/12// DO - 10.1109/rtss49844.2020.00017 PB - IEEE UR - http://dx.doi.org/10.1109/rtss49844.2020.00017 ER - TY - JOUR TI - Efficient Feasibility Analysis for Graph-Based Real-Time Task Systems AU - Sun, Jinghao AU - Shi, Rongxiao AU - Wang, Kexuan AU - Guan, Nan AU - Guo, Zhishan T2 - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems AB - The demand bound function (DBF) is a powerful abstraction to analyze the feasibility/schedulability of real-time tasks. Computing the DBF for expressive system models, such as graph-based tasks, is typically very expensive. In this article, we develop new techniques to drastically improve the DBF computation efficiency for a representative graph-based task model, digraph real-time tasks (DRT). First, we apply the well-known quick processor-demand analysis (QPA) technique, which was originally designed for simple sporadic tasks, to the analysis of DRT. 
The challenge is that existing analysis techniques of DRT have to compute the demand for each possible interval size, which is contradictory to the idea of QPA that aims to aggressively skip the computation for most interval sizes. To solve this problem, we develop a novel integer linear programming (ILP)-based analysis technique for DRT, to which we can apply QPA to significantly improve the analysis efficiency. Second, we improve the task utilization computation (a major step in DBF computation for DRT) efficiency from pseudo-polynomial complexity to polynomial complexity. Experiments show that our approach can improve the analysis efficiency by dozens of times. DA - 2020/11// PY - 2020/11// DO - 10.1109/tcad.2020.3012174 VL - 39 IS - 11 SP - 3385-3397 UR - http://dx.doi.org/10.1109/tcad.2020.3012174 ER - TY - JOUR TI - Understanding the impact of COVID-19 intervention policies on the hospitality labor market AU - Huang, Arthur AU - Makridis, Christos AU - Baker, Mark AU - Medeiros, Marcos AU - Guo, Zhishan T2 - International Journal of Hospitality Management AB - Using new high-frequency data that covers a representative sample of small businesses in the United States, this study investigates the effects of the COVID-19 pandemic and the resulting state policies on the hospitality industry. First, business closure policies are associated with a 20–30% reduction of non-salaried workers in the food/drink and leisure/entertainment sectors during March-April of 2020. Second, business reopening policies play a statistically significant role in slowly reviving the labor market. Third, considerable differences exist in the impact of policies on the labor market by state. Fourth, the rise of new COVID-19 cases on a daily basis is associated with the continued deterioration of the labor market. Lastly, managerial, practical, and economic implications are described. 
DA - 2020/10// PY - 2020/10// DO - 10.1016/j.ijhm.2020.102660 VL - 91 SP - 102660 UR - https://doi.org/10.1016/j.ijhm.2020.102660 ER - TY - JOUR TI - MiniTEE—A Lightweight TrustZone-Assisted TEE for Real-Time Systems AU - Liu, Songran AU - Guan, Nan AU - Guo, Zhishan AU - Yi, Wang T2 - Electronics AB - While trusted execution environments (TEEs) provide industry standard security and isolation, TEE requests through secure monitor calls (SMCs) attribute to large time overhead and weakened temporal predictability. Moreover, as current available TEE solutions are designed for Linux and/or Android initially, it will encounter many constraints (e.g., driver libraries incompatible, large memory footprint, etc.) when integrating with low-end Real-Time Operating Systems, RTOSs. In this paper, we present MiniTEE to understand, evaluate and discuss the benefits and limitations when integrating TrustZone-assisted TEEs with RTOSs. We demonstrate how MiniTEE can be adequately exploited for meeting the real-time needs, while presenting a low performance overhead to the rich OSs (i.e., low-end RTOSs). DA - 2020/7/11/ PY - 2020/7/11/ DO - 10.3390/electronics9071130 VL - 9 IS - 7 SP - 1130 UR - https://doi.org/10.3390/electronics9071130 ER - TY - JOUR TI - Energy-Efficient Parallel Real-Time Scheduling on Clustered Multi-Core AU - Bhuiyan, Ashikahmed AU - Liu, Di AU - Khan, Aamir AU - Saifullah, Abusayeed AU - Guan, Nan AU - Guo, Zhishan T2 - IEEE Transactions on Parallel and Distributed Systems AB - Energy-efficiency is a critical requirement for computation-intensive real-time applications on multi-core embedded systems. Multi-core processors enable intra-task parallelism, and in this work, we study energy-efficient real-time scheduling of constrained deadline sporadic parallel tasks, where each task is represented as a directed acyclic graph (DAG). We consider a clustered multi-core platform where processors within the same cluster run at the same speed at any given time. 
A new concept named speed-profile is proposed to model per-task and per-cluster energy-consumption variations during run-time to minimize the expected long-term energy consumption. To our knowledge, no existing work considers energy-aware real-time scheduling of DAG tasks with constrained deadlines, nor on a clustered multi-core platform. The proposed energy-aware real-time scheduler is implemented upon an ODROID XU-3 board to evaluate and demonstrate its feasibility and practicality. To complement our system experiments in large-scale, we have also conducted simulations that demonstrate a CPU energy saving of up to 67 percent through our proposed approach compared to existing methods. DA - 2020/9/1/ PY - 2020/9/1/ DO - 10.1109/TPDS.2020.2985701 VL - 31 IS - 9 SP - 2097-2111 UR - https://doi.org/10.1109/TPDS.2020.2985701 ER - TY - JOUR TI - Optimizing IoT Energy Efficiency on Edge (EEE): a Cross-layer Design in a Cognitive Mesh Network AU - Liu, Jianqing AU - Pang, Yawei AU - Ding, Haichuan AU - Cai, Ying AU - Zhang, Haixia AU - Fang, Yuguang T2 - IEEE Transactions on Wireless Communications DA - 2020/// PY - 2020/// VL - 20 IS - 4 SP - 2472-2486 ER - TY - CONF TI - An Efficient Data Aggregation Scheme with Local Differential Privacy in Smart Grid AU - Gai, Na AU - Xue, Kaiping AU - He, Peixuan AU - Zhu, Bin AU - Liu, Jianqing AU - He, Debiao T2 - IEEE C2 - 2020/// C3 - 2020 16th International Conference on Mobility, Sensing and Networking (MSN) DA - 2020/// SP - 73-80 ER - TY - CONF TI - Cooperative Caching in a Content-Centric Network for High-Definition Map Delivery AU - Liu, Jiaxi AU - Zhang, Chi AU - Wang, Yuanyuan AU - Wei, Lingbo AU - Liu, Jianqing T2 - IEEE C2 - 2020/// C3 - 2020 3rd International Conference on Hot Information-Centric Networking (HotICN) DA - 2020/// SP - 96-101 ER - TY - JOUR TI - Energy efficiency and traffic offloading optimization in integrated satellite/terrestrial radio access networks AU - Li, Jian AU - Xue,
Kaiping AU - Wei, David SL AU - Liu, Jianqing AU - Zhang, Yongdong T2 - IEEE Transactions on Wireless Communications DA - 2020/// PY - 2020/// VL - 19 IS - 4 SP - 2367-2381 ER - TY - JOUR TI - ESAC: An Efficient and Secure Access Control Scheme in Vehicular Named Data Networking AU - Jiang, Shunrong AU - Liu, Jianqing AU - Wang, Liangmin AU - Zhou, Yong AU - Fang, Yuguang T2 - IEEE Transactions on Vehicular Technology DA - 2020/// PY - 2020/// VL - 69 IS - 9 SP - 10252-10263 ER - TY - CONF TI - Vehicular Edge Computing Meets Cache: An Access Control Scheme for Content Delivery AU - Jiang, Shunrong AU - Liu, Jianqing AU - Huang, Longxia AU - Wu, Haiqin AU - Zhou, Yong T2 - IEEE C2 - 2020/// C3 - ICC 2020-2020 IEEE International Conference on Communications (ICC) DA - 2020/// SP - 1-6 ER - TY - JOUR TI - Privacy-preserving conjunctive keyword search on encrypted data with enhanced fine-grained access control AU - Cao, Qiang AU - Li, Yanping AU - Wu, Zhenqiang AU - Miao, Yinbin AU - Liu, Jianqing T2 - World Wide Web DA - 2020/// PY - 2020/// VL - 23 IS - 2 SP - 959-989 ER - TY - JOUR TI - Memristor Based Variation Enabled Differentially Private Learning Systems for Edge Computing in IoT AU - Fu, Jingyan AU - Liao, Zhiheng AU - Liu, Jianqing AU - Smith, Scott C AU - Wang, Jinhui T2 - IEEE Internet of Things Journal DA - 2020/// PY - 2020/// ER - TY - JOUR TI - MPTCP Meets Big Data: Customizing Transmission Strategy for Various Data Flows AU - Xing, Yitao AU - Han, Jiangping AU - Xue, Kaiping AU - Liu, Jianqing AU - Pan, Miao AU - Hong, Peilin T2 - IEEE Network DA - 2020/// PY - 2020/// VL - 34 IS - 4 SP - 35-41 ER - TY - JOUR TI - A User-Centric Handover Scheme for Ultra-Dense LEO Satellite Networks AU - Li, Jian AU - Xue, Kaiping AU - Liu, Jianqing AU - Zhang, Yongdong T2 - IEEE Wireless Communications Letters DA - 2020/// PY - 2020/// VL - 9 IS - 11 SP - 1904-1908 ER - TY - JOUR TI - Energy-Efficient UAV Communications under Stochastic Trajectory: A Markov Decision 
Process Approach AU - Han, Di AU - Chen, Wei AU - Liu, Jianqing T2 - IEEE Transactions on Green Communications and Networking DA - 2020/// PY - 2020/// ER - TY - CONF TI - Traffic Optimization for In-flight Internet Access via Air-to-ground Communications AU - Wan, Kai AU - Wang, Zhen AU - Wang, Yuanyuan AU - Zhang, Chi AU - Liu, Jianqing T2 - IEEE C2 - 2020/// C3 - 2020 IEEE/CIC International Conference on Communications in China (ICCC) DA - 2020/// SP - 250-255 ER - TY - CHAP TI - Fault Tolerance in Multiagent Systems AU - Christie, Samuel H., V AU - Chopra, Amit K. T2 - Engineering Multi-Agent Systems AB - A decentralized multiagent system (MAS) is comprised of autonomous agents who interact with each other via asynchronous messaging. A protocol specifies a MAS by specifying the constraints on messaging between agents. Agents enact protocols by applying their own internal decision making. Various kinds of faults may occur when enacting a protocol. For example, messages may be lost, duplicates may be delivered, and agents may crash during the processing of a message. Our contribution in this paper is demonstrating how information protocols support rich fault tolerance mechanisms, and in a manner that is unanticipated by alternative approaches for engineering decentralized MAS. PY - 2020/// DO - 10.1007/978-3-030-66534-0_5 SP - 78-86 PB - Springer International Publishing UR - https://doi.org/10.1007/978-3-030-66534-0_5 ER - TY - CONF TI - Comparing feature engineering approaches to predict complex programming behaviors AU - Wang, W. AU - Rao, Y. AU - Shi, Y. AU - Milliken, A. AU - Martens, C. AU - Barnes, T. AU - Price, T.W. C2 - 2020/// C3 - CEUR Workshop Proceedings DA - 2020/// VL - 2734 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85096164837&partnerID=MN8TOARS ER - TY - JOUR TI - Test_positive at W-nut 2020 shared task-3: Joint event multi-task learning for slot filling in noisy text AU - Chen, C. AU - Huang, C.-Y. AU - Hou, Y. AU - Shi, Y. 
AU - Dai, E. AU - Wang, J. T2 - arXiv DA - 2020/// PY - 2020/// UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85098406978&partnerID=MN8TOARS ER - TY - CONF TI - Approximating vertex cover using structural rounding AU - Lavallee, B. AU - Russell, H. AU - Sullivan, B.D. AU - Poel, A. AB - In this work, we provide the first practical evaluation of the structural rounding framework for approximation algorithms. Structural rounding works by first editing to a well-structured class, efficiently solving the edited instance, and “lifting” the partial solution to recover an approximation on the input. We focus on the well-studied Vertex Cover problem, and edit to the class of bipartite graphs (where Vertex Cover has an exact polynomial time algorithm). In addition to the naïve lifting strategy for Vertex Cover described by Demaine et al. in the paper describing structural rounding, we introduce a suite of new lifting strategies and measure their effectiveness on a large corpus of synthetic graphs. We find that in this setting, structural rounding significantly outperforms standard 2-approximations. Further, simpler lifting strategies are extremely competitive with the more sophisticated approaches. The implementations are available as an open-source Python package, and all experiments are replicable. C2 - 2020/// C3 - Proceedings of the Workshop on Algorithm Engineering and Experiments DA - 2020/// DO - 10.1137/1.9781611976007.6 VL - 2020-January SP - 70-80 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85079379022&partnerID=MN8TOARS ER - TY - JOUR TI - A color-avoiding approach to subgraph counting in bounded expansion classes AU - Reidl, F. AU - Sullivan, B.D. 
T2 - arXiv DA - 2020/// PY - 2020/// UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85095561220&partnerID=MN8TOARS ER - TY - CONF TI - Problem detection in peer assessments between subjects by effective transfer learning and active learning AU - Xiao, Yunkai AU - Zingle, Gabriel AU - Jia, Qinjin AU - Akbar, Shoaib AU - Song, Yang AU - Dong, Muyao AU - Qi, Li AU - Gehringer, Edward T2 - EDM 2020: 13th International Conference on Educational Data Mining C2 - 2020/7// C3 - EDM 2020: 13th International Conference on Educational Data Mining DA - 2020/7// PY - 2020/7// SP - 516–523 ER - TY - CONF TI - Detecting Problem Statements in Peer Assessments AU - Xiao, Yunkai AU - Zingle, Gabriel AU - Jia, Qinjin AU - Shah, Harsh AU - Zhang, Yi AU - Li, Tianyi AU - Karovaliya, Mohsin AU - Zhao, Weixiang AU - Song, Yang AU - Ji, Jie AU - Balasubramaniam, Ashwin AU - Patel, Harshit AU - Bhalasubbramanian, Priyankha AU - Patel, Vikram AU - Gehringer, Edward T2 - EDM 2020: 13th International Conference on Educational Data Mining C2 - 2020/7// C3 - EDM 2020: 13th International Conference on Educational Data Mining DA - 2020/7// PY - 2020/7// SP - 704–709 ER - TY - CONF TI - Making large classes work for you and your students AU - Gehringer, Edward T2 - 2020 ASEE Virtual Annual Conference AB - Small classes offer the best opportunity for personal interaction and individualized instruction. But when rising enrollments and declining per-student funding make large classes a reality, all is not lost. Large classes offer many opportunities that small classes do not. If you are prepared to take advantage of them, you can make large classes work to advantage for yourself and your students. Six areas of opportunity can be identified. First, there is staffing. Larger classes receive more support from teaching assistants. TAs can specialize in different kinds of work (e.g., maintaining the gradebook, managing the web site, setting up programming environments). 
When one TA is busy with other work, someone else is always available to cover for them. Second is community. A large class can grow into a supportive learning community. Students have more opportunity to partner with, and learn from, other students. TAs are more effective too, in part because they collectively have enough experience to solve one another's problems. Questions are answered more quickly on Piazza or a message board. Clicker-style polling provides feedback with an impact not possible in small classes. Assessment also benefits. Grading is more efficient, as startup overhead plays less of a role. Assessing teaching effectiveness is easier, too, simply because there is more evidence. Students can be surveyed after class to see how well techniques have worked. In a small class, this might lead to survey fatigue, but in a large class, a different subset can be surveyed each class day. The instructor can determine more quickly when a technique is not working, and can make corrections sooner. Content generation. Teaching assistants can provide suggestions for improving content, delivery, and management that go far beyond what the instructor alone could devise. If students are asked to generate content (e.g., worked examples, homework/test questions), a large class can provide a lot more usable material. Common misconceptions become apparent far more quickly, so the instructor can design multiple-choice distractors to call them out, and draft specialized feedback on why each misconception is incorrect. Research. It’s much easier to do statistically valid studies with a control group and an experimental group in the same class. This bypasses the confounding issue that “it was a different semester with a different instructor, but we also added an intervention …” On subjective questions, the TAs’ grades can be compared with each other, allowing the instructor to identify and correct grader bias and rubric ambiguity. Recruitment. 
An instructor of a large class becomes known to a lot more students, and these students are more likely to consider working with you later on. You may become their graduate advisor, or advisor for an undergraduate research project. A large class is also a great place to recruit for independent-study students who may assist one of your research projects or generate resources for later offerings of the same course. The full paper will discuss how to take advantage of these opportunities, using examples provided by experienced faculty and past ASEE papers. C2 - 2020/// C3 - 2020 ASEE Virtual Annual Conference DA - 2020/// PY - 2020/6/22/ DO - 10.18260/1-2--34944 PB - American Society for Engineering Education Annual Conference ER - TY - CONF TI - A Course as Ecosystem: Melding Teaching, Research, and Practice AU - Gehringer, Edward T2 - 2020 ASEE Virtual Annual Conference AB - We often compartmentalize our academic life into the areas of teaching, research, and practice. In fact, there are many synergies to be realized by treating a course as a complete ecosystem, drawing from all three areas. In this paper, the author discusses how he has transitioned an advanced course into an opportunity for peer mentoring, a testbed for several Ph.D. research projects, and an occasion to practice the skills that he teaches. The course in question is [redacted], an advanced undergraduate and masters-level course in software engineering. Many years ago, the author started assigning homework involving contributions to open-source projects. Soon he realized that he could include his own open-source project, an educational technology application used in the class, as a source of student projects. 
This offered several benefits: assignments were more “real world,” because they related to software that they had actually used; students could use their talents to improve the experience of students in later semesters; and the instructor was incentivized to pay careful attention to mentoring and evaluating student work, because it directly benefited his application. That was only the beginning. With just a little bit of help, his Ph.D. students found AI and SE-related projects to improve the application. For three of those students, these projects formed a substantial part of their dissertations. They led to peer-reviewed papers in research conferences and journals, as well as education-related papers at ASEE Annual Conferences and Frontiers in Education. Indirectly, these projects led to an NSF grant of more than $1 million. Meanwhile, independent-study students worked on improving the application itself. Initially, these projects did not achieve very much, as students struggled with poor design in code written by other students. Over time, we have incorporated better tools and practices to improve the code base, and this allows the students to make useful contributions much sooner in the semester. As the course has grown in popularity, it has served as a recruiting platform for independent-study and masters-thesis students. As well as contributing to the author’s research, these students have helped design active-learning exercises for every class period during the semester. Every term, about the time that registration starts for the following semester, the author presents a list of independent-study topics to the class, and solicits student interest in each of them. One important independent-study topic is mentoring: mentors contribute to the application, help write the specs on open-source projects for the class, and meet weekly with project teams to check their progress and offer advice. 
This is one of the reasons that this course is frequently cited by students as instrumental in helping them get a job. In summary, the course helps the author fulfill his teaching responsibility, serves as a test bed for software-engineering and SoTL research, and gives the author an opportunity to serve as a practitioner guiding the design of an application used by thousands of students. C2 - 2020/// C3 - 2020 ASEE Virtual Annual Conference DA - 2020/// PY - 2020/6/22/ DO - 10.18260/1-2--33991 PB - American Society for Engineering Education Annual Conference ER - TY - CONF TI - EDM and Privacy: Ethics and Legalities of Data Collection, Usage, and Storage AU - Klose, Mark AU - Desai, Vasvi AU - Song, Yang AU - Gehringer, Edward T2 - EDM 2020: 13th International Conference on Educational Data Mining C2 - 2020/7// C3 - EDM 2020: 13th International Conference on Educational Data Mining DA - 2020/7// PY - 2020/7// SP - 451–459 PB - ERIC UR - https://educationaldatamining.org/files/conferences/EDM2020/papers/paper_135.pdf ER - TY - CONF TI - Comparing and combining tests for plagiarism detection in online exams AU - Gehringer, Edward F AU - Liu, Xiaohan AU - Kariya, Abhirav AU - Wang, Guoyi C2 - 2020/7// C3 - EDM 2020: 13th International Conference on Educational Data Mining DA - 2020/7// UR - https://educationaldatamining.org/files/conferences/EDM2020/papers/paper_179.pdf ER - TY - CONF TI - Clouseau: Generating Communication Protocols from Commitments AU - Singh, Munindar P. AU - Chopra, Amit K. AB - Engineering a decentralized multiagent system (MAS) requires realizing interactions modeled as a communication protocol between autonomous agents. We contribute Clouseau, an approach that takes a commitment-based specification of an interaction and generates a communication protocol amenable to decentralized enactment. 
We show that the generated protocol is (1) correct—realizes all and only the computations that satisfy the input specification; (2) safe—ensures the agents' local views remain consistent; and (3) live—ensures the agents can proceed to completion. C2 - 2020/2// C3 - Proceedings of the AAAI Conference on Artificial Intelligence DA - 2020/2// DO - 10.1609/aaai.v34i05.6215 VL - 34 SP - 7244-7252 M1 - 5 PB - AAAI Press UR - http://dx.doi.org/10.1609/aaai.v34i05.6215 ER - TY - JOUR TI - An Evaluation of Communication Protocol Languages for Engineering Multiagent Systems AU - Chopra, Amit K AU - V, Samuel H Christie AU - Singh, Munindar P. T2 - Journal of Artificial Intelligence Research AB - Communication protocols are central to engineering decentralized multiagent systems. Modern protocol languages are typically formal and address aspects of decentralization, such as asynchrony. However, modern languages differ in important ways in their basic abstractions and operational assumptions. This diversity makes a comparative evaluation of protocol languages a challenging task. We contribute a rich evaluation of diverse and modern protocol languages. Among the selected languages, Scribble is based on session types; Trace-C and Trace-F on trace expressions; HAPN on hierarchical state machines, and BSPL on information causality. Our contribution is four-fold. One, we contribute important criteria for evaluating protocol languages. Two, for each criterion, we compare the languages on the basis of whether they are able to specify elementary protocols that go to the heart of the criterion. Three, for each language, we map our findings to a canonical architecture style for multiagent systems, highlighting where the languages depart from the architecture. Four, we identify design principles for protocol languages as guidance for future research. 
DA - 2020/12/22/ PY - 2020/12/22/ DO - 10.1613/jair.1.12212 VL - 69 SP - 1351-1393 UR - http://dx.doi.org/10.1613/jair.1.12212 ER - TY - JOUR TI - The Impact of Virtual Reality in the Social Presence of a Virtual Agent AU - Guimaraes, Manuel AU - Prada, Rui AU - Santos, Pedro A. AU - Dias, Joao AU - Jhala, Arnav AU - Mascarenhas, Samuel T2 - PROCEEDINGS OF THE 20TH ACM INTERNATIONAL CONFERENCE ON INTELLIGENT VIRTUAL AGENTS (ACM IVA 2020) AB - In this work we test the hypothesis that interacting with an intelligent virtual character in Virtual Reality (VR) has a stronger impact compared to the same interaction in a traditional non-immersive platform, both in terms of presence and believability. DA - 2020/// PY - 2020/// DO - 10.1145/3383652.3423879 SP - KW - intelligent virtual agents KW - virtual reality KW - social presence KW - social skills training ER - TY - JOUR TI - Platinum: Reusing Constraint Solutions in Bounded Analysis of Relational Logic AU - Zheng, Guolong AU - Bagheri, Hamid AU - Rothermel, Gregg AU - Wang, Jianghao T2 - FUNDAMENTAL APPROACHES TO SOFTWARE ENGINEERING (FASE 2020) AB - Alloy is a lightweight specification language based on relational logic, with an analysis engine that relies on SAT solvers to automate bounded verification of specifications. In spite of its strengths, the reliance of the Alloy Analyzer on computationally heavy solvers means that it can take a significant amount of time to verify software properties, even within limited bounds. This challenge is exacerbated by the ever-evolving nature of complex software systems. This paper presents Platinum, a technique for efficient analysis of evolving Alloy specifications, that recognizes opportunities for constraint reduction and reuse of previously identified constraint solutions. 
The insight behind Platinum is that formula constraints recur often during the analysis of a single specification and across its revisions, and constraint solutions can be reused over sequences of analyses performed on evolving specifications. Our empirical results show that Platinum substantially reduces (by 66.4% on average) the analysis time required on specifications extracted from real-world software systems. DA - 2020/// PY - 2020/// DO - 10.1007/978-3-030-45234-6_2 VL - 12076 SP - 29-52 SN - 1611-3349 ER - TY - JOUR TI - GOPipe: A Granularity-Oblivious Programming Framework for Pipelined Stencil Executions on GPU AU - Oh, Chanyoung AU - Zheng, Zhen AU - Shen, Xipeng AU - Zhai, Jidong AU - Yi, Youngmin T2 - PACT '20: PROCEEDINGS OF THE ACM INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES AB - Recent studies have shown promising performance benefits when multiple stages of a pipelined stencil application are mapped to different parts of a GPU to run concurrently. An important factor for the computing efficiency of such pipelines is the granularity of a task. In previous programming frameworks that support true pipelined computations on GPU, the choice has to be made by the programmers during the application development time. Due to many difficulties, programmers' decisions are often far from optimal, causing inferior performance and performance portability. DA - 2020/// PY - 2020/// DO - 10.1145/3410463.3414656 SP - 43-54 SN - 1089-795X KW - Programming Framework KW - GPU KW - Optimizations ER - TY - JOUR TI - Self-Patch: Beyond Patch Tuesday for Containerized Applications AU - Tunde-Onadele, Olufogorehan AU - Lin, Yuhang AU - He, Jingzhu AU - Gu, Xiaohui T2 - 2020 IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING AND SELF-ORGANIZING SYSTEMS (ACSOS 2020) AB - Containers have become increasingly popular in distributed computing environments. 
However, recent studies have shown that containerized applications are susceptible to various security attacks. Traditional periodically scheduled software update approaches not only become ineffective under dynamic container environments but also impose high overhead to containers. In this paper, we present Self-Patch, a new self-triggering patching framework for applications running inside containers. Self-Patch combines light-weight runtime attack detection and dynamic targeted patching to achieve more efficient and effective security protection for containerized applications. We evaluated our schemes over 31 real world vulnerability attacks in 23 commonly used server applications. Results show that Self-Patch can accurately detect and classify 81% of attacks and reduce patching overhead by up to 84%. DA - 2020/// PY - 2020/// DO - 10.1109/ACSOS49614.2020.00022 SP - 21-27 KW - Container Security KW - Anomaly Detection KW - Security Patching ER - TY - JOUR TI - Debugging Hiring: What Went Right and What Went Wrong in the Technical Interview Process AU - Behroozi, Mahnaz AU - Shirolkar, Shivani AU - Barik, Titus AU - Parnin, Chris T2 - 2020 IEEE/ACM 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN SOCIETY (ICSE-SEIS 2021) AB - The typical hiring pipeline for software engineering occurs over several stages---from phone screening and technical on-site interviews, to offer and negotiation. When these hiring pipelines are leaky, otherwise qualified candidates are lost at some stage of the pipeline. 
These leaky pipelines impact companies in several ways, including hindering a company's ability to recruit competitive candidates and build diverse software teams. To understand where candidates become disengaged in the hiring pipeline---and what companies can do to prevent it---we conducted a qualitative study on over 10,000 reviews on 19 companies from Glassdoor, a website where candidates can leave reviews about their hiring process experiences. We identified several poor practices which prematurely sabotage the hiring process---for example, not adequately communicating hiring criteria, conducting interviews with inexperienced interviewers, and ghosting candidates. Our findings provide a set of guidelines to help companies improve their hiring pipeline practices---such as being deliberate about phrasing and language during initial contact with the candidate, providing candidates with constructive feedback after their interviews, and bringing salary transparency and long-term career discussions into offers and negotiations. Operationalizing these guidelines helps make the hiring pipeline more transparent, fair, and inclusive. DA - 2020/// PY - 2020/// DO - 10.1145/3377815.3381372 SP - 71-80 KW - career KW - hiring practices KW - interview feedback KW - opinion mining KW - reviews KW - software engineering KW - technical interviews KW - whiteboard ER - TY - JOUR TI - Engaging Students with Instructor Solutions in Online Programming Homework AU - Price, Thomas W. AU - Williams, Joseph Jay AU - Solyst, Jaemarie AU - Marwan, Samiha T2 - PROCEEDINGS OF THE 2020 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI'20) AB - Students working on programming homework do not receive the same level of support as in the classroom, relying primarily on automated feedback from test cases. 
One low-effort way to provide more support is by prompting students to compare their solution to an instructor's solution, but it is unclear the best way to design such prompts to support learning. We designed and deployed a randomized controlled trial during online programming homework, where we provided students with an instructor's solution, and randomized whether they were prompted to compare their solution to the instructor's, to fill in the blanks for a written explanation of the instructor's solution, to do both, or neither. Our results suggest that these prompts can effectively engage students in reflecting on instructor solutions, although the results point to design trade-offs between the amount of effort that different prompts require from students and instructors, and their relative impact on learning. DA - 2020/// PY - 2020/// DO - 10.1145/3313831.3376857 SP - KW - Computing Education KW - Programming KW - Self-explanation KW - Comparison ER - TY - JOUR TI - A Scalable Solution to Network Design Problems: Decomposition with Exhaustive Routing Search AU - Fayez, Mahmoud AU - Katib, Iyad AU - Rouskas, George N. AU - Gharib, Tarek F. AU - Ahmed, H. K. AU - Faheem, H. M. T2 - 2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM) AB - Many network design problems encompass two tasks, routing and resource allocation, that are so intricately intertwined as to contribute significantly to the intractability of such problems. In this paper, we make two contributions to addressing general network design problems of this nature. First, we present a new decomposition method that optimally decouples resource allocation from routing, making it possible to tackle each of these aspects separately. Second, we develop a recursive branch-and-bound algorithm to search the routing space exhaustively, yet in a scalable manner. We apply our method to a well-known intractable problem in optical networks, routing and spectrum assignment (RSA). 
Our results indicate that the recursive algorithm is able to search efficiently the entire routing space of topologies representative of large-scale wide area networks. DA - 2020/// PY - 2020/// DO - 10.1109/GLOBECOM42002.2020.9322439 SP - SN - 2576-6813 ER - TY - JOUR TI - Service Chain Rerouting for NFV Load Balancing AU - Gao, Lingnan AU - Rouskas, George N. T2 - 2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM) AB - Network function virtualization (NFV), with its potential to facilitate network service provisioning, has drawn growing interest from both academia and industry. One essential challenge is to allocate efficiently the bandwidth and computational resources to the service requests. In an online context, service chain requests may arrive, depart or evolve in an arbitrary fashion, adding more difficulty to the problem. Service chain reconfiguration may help improve the performance by individually rerouting a subset of the service chain requests. In this paper, we propose a new service chain reconfiguration framework to achieve load balancing in an NFV environment under varying levels of support from the underlying infrastructure. We show that our framework can achieve an approximation ratio of O(ln m / ln ln m) with high probability for the service chain request rerouting problem. DA - 2020/// PY - 2020/// DO - 10.1109/GLOBECOM42002.2020.9322265 SP - SN - 2576-6813 ER - TY - JOUR TI - Performance Implications of Problem Decomposition Approaches for SDN Pipelines AU - Brockelsby, William AU - Dutta, Rudra T2 - 2020 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM) AB - Software defined networking (SDN) allows organizations to modify networks programmatically to implement custom forwarding behavior and to react to changing conditions. 
While there are many approaches available to implement SDN, those that leverage forwarding table abstractions such as OpenFlow and P4 require developers to decompose problems into one or more tables associated with a definable pipeline. This paper explores tradeoffs between table depth and pipeline length associated with different problem decomposition options by analyzing the performance impact on hardware and software data planes, including software data planes leveraging hardware acceleration through the use of SmartNICs. DA - 2020/// PY - 2020/// DO - 10.1109/GLOBECOM42002.2020.9322392 SP - SN - 2576-6813 KW - computer networks KW - software defined networking ER - TY - CONF TI - Toward a Block-Based Programming Approach to Interactive Storytelling for Upper Elementary Students AU - Smith, Andy AU - Mott, Bradford AU - Taylor, Sandra AU - Hubbard-Cheuoua, Aleata AU - Minogue, James AU - Oliver, Kevin AU - Ringstaff, Cathy AB - Developing narrative and computational thinking skills is crucial for K-12 student learning. A growing number of K-12 teachers are utilizing digital storytelling, where students create short narratives around a topic, as a means of creating motivating problem-solving activities for a variety of domains, including history and science. At the same time, there is increasing awareness of the need to engage K-12 students in computational thinking, including elementary school students. Given the challenges that the syntax of text-based programming languages poses for even novice university-level learners, block-based programming languages have emerged as an effective tool for introducing computational thinking to elementary-level students. Leveraging the unique affordances of narrative and computational thinking offers significant potential for student learning; however, integrating them presents significant challenges. 
In this paper, we describe initial work toward solving this problem by introducing an approach to block-based programming for interactive storytelling to engage upper elementary students (ages 9 to 11) in computational thinking and narrative skill development. Leveraging design principles and best practices from prior research on elementary-grade block-based programming and digital storytelling, we propose a set of custom blocks enabling learners to create interactive narratives. We describe both the process used to derive the custom blocks, including their alignment with elements of interactive narrative and with specific computational thinking curricular goals, as well as lessons learned from students interacting with a prototype learning environment utilizing the block-based programming approach. C2 - 2020/11// C3 - Interactive Storytelling DA - 2020/11// DO - 10.1007/978-3-030-62516-0_10 SP - 111-119 PB - Springer International Publishing UR - http://dx.doi.org/10.1007/978-3-030-62516-0_10 ER - TY - JOUR TI - LeakyPick: IoT Audio Spy Detector AU - Mitev, Richard AU - Pazii, Anna AU - Miettinen, Markus AU - Enck, William AU - Sadeghi, Ahmad-Reza T2 - 36TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC 2020) AB - Manufacturers of smart home Internet of Things (IoT) devices are increasingly adding voice assistant and audio monitoring features to a wide range of devices including smart speakers, televisions, thermostats, security systems, and doorbells. Consequently, many of these devices are equipped with microphones, raising significant privacy concerns: users may not always be aware of when audio recordings are sent to the cloud, or who may gain access to the recordings. In this paper, we present the LeakyPick architecture that enables the detection of the smart home devices that stream recorded audio to the Internet in response to observing a sound. 
Our proof-of-concept is a LeakyPick device that is placed in a user’s smart home and periodically “probes” other devices in its environment and monitors the subsequent network traffic for statistical patterns that indicate audio transmission. Our prototype is built on a Raspberry Pi for less than USD $40 and has a measurement accuracy of 94% in detecting audio transmissions for a collection of 8 devices with voice assistant capabilities. Furthermore, we used LeakyPick to identify 89 words that an Amazon Echo Dot misinterprets as its wake-word, resulting in unexpected audio transmission. LeakyPick provides a cost effective approach to help regular consumers monitor their homes for sound-triggered devices that unexpectedly transmit audio to the cloud. DA - 2020/// PY - 2020/// DO - 10.1145/3427228.3427277 SP - 694-705 SN - 1063-9527 ER - TY - JOUR TI - CDL: Classified Distributed Learning for Detecting Security Attacks in Containerized Applications AU - Lin, Yuhang AU - Tunde-Onadele, Olufogorehan AU - Gu, Xiaohui T2 - 36TH ANNUAL COMPUTER SECURITY APPLICATIONS CONFERENCE (ACSAC 2020) AB - Containers have been widely adopted in production computing environments for their efficiency and low overhead of isolation. However, recent studies have shown that containerized applications are prone to various security attacks. Moreover, containerized applications are often highly dynamic and short-lived, which further exacerbates the problem. In this paper, we present CDL, a classified distributed learning framework to achieve efficient security attack detection for containerized applications. CDL integrates online application classification and anomaly detection to overcome the challenge of lacking sufficient training data for dynamic short-lived containers while considering diversified normal behaviors in different applications. We have implemented a prototype of CDL and evaluated it over 33 real world vulnerability attacks in 24 commonly used server applications. 
Our experimental results show that CDL can reduce the false positive rate from over 12% to 0.24% compared to traditional anomaly detection schemes without aggregating training data. By introducing application classification into container behavior learning, CDL can improve the detection rate from catching 20 attacks to 31 attacks before those attacks succeed. CDL is light-weight, which can complete application classification and anomaly detection for each data sample within a few milliseconds. DA - 2020/// PY - 2020/// DO - 10.1145/3427228.3427236 SP - 179-188 SN - 1063-9527 KW - Container Security KW - Anomaly Detection KW - Machine Learning ER - TY - CONF TI - EARS: Enabling Private Feedback Updates in Anonymous Reputation Systems AU - Kilari, Vishnu Teja AU - Yu, Ruozhou AU - Misra, Satyajayant AU - Xue, Guoliang AB - Reputation systems, designed to remedy the lack of information quality and assess credibility of information sources, have become an indispensable component of many online systems. A typical reputation system works by tracking all information originating from a source, and the feedback to the information with its attribution to the source. The tracking of information and the feedback, though essential, could violate the privacy of users who provide the information and/or the feedback, which could both cause harm to the users' online well-being, and discourage them from participation. Anonymous reputation systems have been designed to protect user privacy by ensuring anonymity of the users. Yet, current anonymous reputation systems suffer from several limitations, including but not limited to a) lack of support for core functionalities such as feedback update, b) lack of protocol efficiency for practical deployment, and c) reliance on a fully trusted authority. 
This paper proposes EARS, an anonymous reputation system that ensures user anonymity while supporting all core functionalities (including feedback update) of a reputation system both efficiently and practically, and without the need of a fully trusted central authority. We present security analysis of EARS against multiple types of attacks that could potentially violate user anonymity, such as feedback duplication, bad mouthing, and ballot stuffing. We also present evaluation of the efficiency and scalability of our system based on implementations. C2 - 2020/6// C3 - 2020 IEEE Conference on Communications and Network Security (CNS) DA - 2020/6// DO - 10.1109/cns48642.2020.9162328 PB - IEEE UR - http://dx.doi.org/10.1109/cns48642.2020.9162328 ER - TY - JOUR TI - GVPRoF: A Value Profiler for GPU-Based Clusters AU - Zhou, Keren AU - Hao, Yueming AU - Mellor-Crummey, John AU - Meng, Xiaozhu AU - Liu, Xu T2 - PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20) AB - GPGPUs are widely used in high-performance computing systems to accelerate scientific and machine learning workloads. Developing efficient GPU kernels is critically important to obtain “bare-metal” performance on GPU-based clusters. In this paper, we describe the design and implementation of GVPROF, the first value profiler that pinpoints value-related inefficiencies in applications running on NVIDIA GPU-based clusters. The novelty of GVPROF resides in its ability to detect temporal and spatial value redundancies, which provides useful information to guide code optimization. GVPROF can monitor production multi-node multi-GPU executions in clusters. Our experiments with well-known GPU benchmarks and HPC applications show that GVPROF incurs acceptable overhead and scales to large executions. Using GVPROF, we optimized several HPC and machine learning workloads on one NVIDIA V100 GPU. 
In one case study of LAMMPS, optimizations based on information from GVProf led to whole-program speedups ranging from 1.37x on a single GPU to 1.08x on 64 GPUs. DA - 2020/// PY - 2020/// DO - 10.1109/SC41405.2020.00093 VL - 2020-November SP - UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85102360712&partnerID=MN8TOARS KW - High performance computing KW - Performance analysis KW - Parallel programming KW - Supercomputers ER - TY - JOUR TI - SCALANA: Automating Scaling Loss Detection with Graph Analysis AU - Jin, Yuyang AU - Wang, Haojie AU - Yu, Teng AU - Tang, Xiongchao AU - Hoefler, Torsten AU - Liu, Xu AU - Zhai, Jidong T2 - PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20) AB - Scaling a parallel program to modern supercomputers is challenging due to inter-process communication, Amdahl’s law, and resource contention. Performance analysis tools for finding such scaling bottlenecks either base on profiling or tracing. Profiling incurs low overheads but does not capture detailed dependencies needed for root-cause analysis. Tracing collects all information at prohibitive overheads. In this work, we design SCALANA that uses static analysis techniques to achieve the best of both worlds - it enables the analyzability of traces at a cost similar to profiling. SCALANA first leverages static compiler techniques to build a Program Structure Graph, which records the main computation and communication patterns as well as the program’s control structures. At runtime, we adopt lightweight techniques to collect performance data according to the graph structure and generate a Program Performance Graph. With this graph, we propose a novel approach, called backtracking root cause detection, which can automatically and efficiently detect the root cause of scaling loss. We evaluate SCALANA with real applications. 
Results show that our approach can effectively locate the root cause of scaling loss for real applications and incurs 1.73% overhead on average for up to 2,048 processes. We achieve up to 11.11% performance improvement by fixing the root causes detected by SCALANA on 2,048 processes. DA - 2020/// PY - 2020/// DO - 10.1109/SC41405.2020.00032 SP - KW - Performance Analysis KW - Scalability Bottleneck KW - Root-Cause Detection KW - Static Analysis ER - TY - JOUR TI - ZeroSpy: Exploring Software Inefficiency with Redundant Zeros AU - You, Xin AU - Yang, Hailong AU - Luan, Zhongzhi AU - Qian, Depei AU - Liu, Xu T2 - PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20) AB - Redundant zeros cause inefficiencies in which the zero values are loaded and computed repeatedly, resulting in unnecessary memory traffic and identity computation that waste memory bandwidth and CPU resources. Optimizing compilers have difficulty eliminating these zero-related inefficiencies due to limitations in static analysis. Hardware approaches, in contrast, optimize inefficiencies without code modification, but are not widely adopted in commodity processors. In this paper, we propose ZeroSpy - a fine-grained profiler to identify redundant zeros caused by both inappropriate use of data structures and useless computation. ZeroSpy also provides intuitive optimization guidance by revealing the locations where the redundant zeros happen in source lines and calling contexts. The experimental results demonstrate ZeroSpy is capable of identifying redundant zeros in programs that have been highly optimized for years. Based on the optimization guidance revealed by ZeroSpy, we can achieve significant speedups after eliminating redundant zeros. 
DA - 2020/// PY - 2020/// DO - 10.1109/SC41405.2020.00033 SP - KW - Redundant Zero KW - Software Inefficiency KW - Performance Profiling and Optimization ER - TY - JOUR TI - DRCCTPROF: A Fine-Grained Call Path Profiler for ARM-Based Clusters AU - Zhao, Qidong AU - Liu, Xu AU - Chabbi, Milind T2 - PROCEEDINGS OF SC20: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC20) AB - ARM is an attractive CPU architecture for exascale systems because of its energy efficiency. As a recent entry into the HPC paradigm, ARM lags in its software stack, especially in the performance tooling aspect. Notably, there is a lack of fine-grained measurement tools to analyze fully optimized HPC binary executables on ARM processors. In this paper, we introduce DRCCTPROF — a fine-grained call path profiling framework for binaries running on ARM architectures. The unique ability of DRCCTPROF is to obtain full calling context at any and every machine instruction that executes, which provides more detailed diagnostic feedback for performance optimization and correctness tools. Furthermore, DRCCTPROF not only associates any instruction with source code along the call path, but also associates memory access instructions back to the constituent data object. Finally, DRCCTPROF incurs moderate overhead and provides a compact view to visualize the profiles collected from parallel executions. 
DA - 2020/// PY - 2020/// DO - 10.1109/SC41405.2020.00034 SP - KW - Fine-grained analysis KW - ARM KW - performance analysis KW - debugging KW - high-performance computing ER - TY - JOUR TI - A Blockchain-based Vehicle-trust Management Framework Under a Crowdsourcing Environment AU - Wang, Dawei AU - Chen, Xiao AU - Wu, Haiqin AU - Yu, Ruozhou AU - Zhao, Yishi T2 - 2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020) AB - Vehicular crowdsourcing networks (VCNs) enable vehicles to provide or obtain traffic-related services in a cost-efficient and flexible manner. Therefore, it is crucial to provide trusted management in VCNs for high reliability towards both service producers and consumers. However, most recent VCN platforms rely on a third party to manage crowdsourcing services which might not be fully trusted by users. To address this issue, this paper proposes a blockchain-based trust management scheme for VCNs to provide a decentralized and trusted service management. A comprehensive trust evaluation model (TEM) is designed to quantify the trust degree of each vehicular node, and a vehicle-trust blockchain framework called VTchain is proposed to preserve the trust values of nodes while guaranteeing transparency and trustworthiness. Particularly, we leverage a trusted execution environment (TEE) to provide secure trust evaluation to tackle possible untrusted road-side units. In addition, we introduce TEM-based Proof of Trust to support blockchain maintenance, which works together with an efficient consensus algorithm Zyzzyva for improved scalability. Finally, extensive experiments are conducted by developing a testbed deployed on cloud servers for measurements. 
DA - 2020/// PY - 2020/// DO - 10.1109/TrustCom50675.2020.00266 SP - 1950-1955 SN - 2324-898X KW - Vehicular crowdsourcing networks KW - trust management KW - blockchain KW - trusted execution environment ER - TY - JOUR TI - A Review of Geospatial Content in IEEE Visualization Publications AU - Yoshizumi, Alexander AU - Coffer, Megan M. AU - Collins, Elyssa L. AU - Gaines, Mollie D. AU - Gao, Xiaojie AU - Jones, Kate AU - McGregor, Ian R. AU - McQuillan, Katie A. AU - Perin, Vinicius AU - Tomkins, Laura M. AU - Worm, Thom AU - Tateosian, Laura T2 - 2020 IEEE VISUALIZATION CONFERENCE - SHORT PAPERS (VIS 2020) AB - Geospatial analysis is crucial for addressing many of the world's most pressing challenges. Given this, there is immense value in improving and expanding the visualization techniques used to communicate geospatial data. In this work, we explore this important intersection - between geospatial analytics and visualization - by examining a set of recent IEEE VIS Conference papers (a selection from 2017-2019) to assess the inclusion of geospatial data and geospatial analyses within these papers. After removing the papers with no geospatial data, we organize the remaining literature into geospatial data domain categories and provide insight into how these categories relate to VIS Conference paper types. We also contextualize our results by investigating the use of geospatial terms in IEEE Visualization publications over the last 30 years. Our work provides an understanding of the quantity and role of geospatial subject matter in recent IEEE VIS publications and supplies a foundation for future meta-analytical work around geospatial analytics and geovisualization that may shed light on opportunities for innovation. 
DA - 2020/// PY - 2020/// DO - 10.1109/VIS47514.2020.00017 SP - 51-55 KW - Human-centered computing KW - Visualization KW - Visualization application domains KW - Geographic visualization ER - TY - JOUR TI - VCFC: Structural and Semantic Compression and Indexing of Genetic Variant Data AU - Ferriter, Kyle AU - Mueller, Frank AU - Bahmani, Amir AU - Pan, Cuiping T2 - 2020 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE AB - Personalized genomic datasets are growing in size at an accelerating pace presenting a dilemma between the need for fast retrieval requiring “near data” and cost of storage, which decreases for “distant media” with larger capacity but longer access time. Instead of database technology, the bioinformatics community has developed an industry standard for compressing and indexing of genetic variant files that store the difference between a person's genome and a human reference genome. These standardizations rely on generic data compression schemes. This work contributes novel domain-specific compression and indexing algorithms that retain the structure and semantics of genetic variation data while supporting common query patterns. A line-based run-length partial compression technique for variant genotype data using a novel indexing strategy is developed and shown to perform well on large sample sets compared to the industry standard. The evaluation over genomic datasets indicates compression at a comparable size for our data representation while resulting in speedup of ≈2X in indexed queries compared to the industry standard. This underlines that our representation could replace existing standards resulting in reduced computational cost at equivalent storage size. DA - 2020/// PY - 2020/// DO - 10.1109/BIBM49941.2020.9313221 SP - 200-203 SN - 2156-1133 ER - TY - JOUR TI - Assessing Practitioner Beliefs about Software Defect Prediction AU - Shrikanth, N. C. 
AU - Menzies, Tim T2 - 2020 IEEE/ACM 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: SOFTWARE ENGINEERING IN PRACTICE (ICSE-SEIP) AB - Just because software developers say they believe in "X", that does not necessarily mean that "X" is true. As shown here, there exist numerous beliefs listed in the recent Software Engineering literature which are only supported by small portions of the available data. Hence we ask what is the source of this disconnect between beliefs and evidence? DA - 2020/// PY - 2020/// DO - 10.1145/3377813.3381367 SP - 182-190 KW - defects KW - beliefs KW - practitioner KW - empirical software engineering ER - TY - JOUR TI - Distributed and Privacy Preserving Routing of Connected Vehicles to Minimize Congestion AU - Boob, Surabhi AU - Mahmood, Shakir AU - Shahzad, Muhammad T2 - 2020 IEEE 17TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SMART SYSTEMS (MASS 2020) AB - With a large number of connected vehicles on the roads, there is an opportunity to leverage their connectivity to minimize congestion on roads by calculating fast routes for vehicles in a way that each vehicle contributes as little to the congestion as possible. The existing commercial and research based approaches of calculating routes for vehicles suffer from one or more of the following two limitations: 1) they are not privacy preserving in the sense that they receive destination addresses from users and may either store and use them for other commercial purposes or are at a risk of getting hacked and exposing these addresses to hackers; and 2) they require expensive infrastructure such as road side units (RSUs). To address these limitations, we propose a distributed and privacy preserving routing protocol, namely DPR, which the connected vehicles collaboratively and repeatedly execute to calculate fast routes to their destinations such that the overall congestion on the road network is significantly reduced and at the same time the privacy of the vehicles is preserved. 
The DPR protocol relies on direct vehicle to vehicle communication and does not need any new infrastructure such as RSUs. We have implemented and evaluated our DPR protocol through simulations on a real road network under several traffic conditions. Our results show that DPR reduces the average travel time of vehicles that travel a distance of 1000, 2500, and over 4000 meters by 15%, 32%, and 42%, respectively. This reduction in travel time is significant considering that this improvement results purely from software manipulations and without requiring any new infrastructure. DA - 2020/// PY - 2020/// DO - 10.1109/MASS50613.2020.00036 SP - 220-228 SN - 2155-6806 ER - TY - JOUR TI - A WiFi-based Home Security System AU - Zhang, Shaohu AU - Venkatnarayan, Raghav H. AU - Shahzad, Muhammad T2 - 2020 IEEE 17TH INTERNATIONAL CONFERENCE ON MOBILE AD HOC AND SMART SYSTEMS (MASS 2020) AB - Typical home security systems monitor homes for intrusions by installing contact sensors on doors and windows and motion sensors inside the house. Unfortunately, due to the high deployment and operational costs of today's home security systems, only a small fraction of homes have security systems installed (e.g., only 17% in the US and 15% in China). In this paper, we propose a WiFi based Home Security system (WiHS) that uses commodity WiFi devices, which most modern households already have, to perform the three primary tasks of typical home security systems: 1) detect when a door/window is opened/closed, 2) identify which door/window has been opened/closed, and 3) detect movements inside the house. The design of WiHS is based on our intuitive and theoretical understanding of the impacts of the movements of doors and windows on WiFi signals, which we will develop and present in this paper. We extensively evaluated WiHS using commodity WiFi devices in 3 different houses. 
WiHS detected intrusions with over 95% accuracy and identified the exact door/window that moved with just 4.5% average error. DA - 2020/// PY - 2020/// DO - 10.1109/MASS50613.2020.00026 SP - 129-137 SN - 2155-6806 ER - TY - JOUR TI - Efficient Constrained Subgraph Extraction for Exploratory Discovery in Large Knowledge Graphs AU - Gao, Sidan AU - Korchiev, Nodirbek AU - Samatova, Vodelina AU - Anyanwu, Kemafor T2 - 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) AB - Knowledge graphs which often integrate heterogeneous data can be exploited for serendipitous knowledge discovery using appropriate integration paradigms. We posit that a semi-structured querying model which blends the benefits of structured and unstructured querying could offer a sweetspot. However, there is a need for effective algorithmic techniques for such query processing. In this paper, we propose a class of constrained subgraph connection structure discovery queries whose specification is only partially structured. Graph theoretically, these amount to subgraph homeomorphism problems that tolerate flexibility in graph structure matching. Central to achieving the goals of performance and scale of query evaluation is the use of a path algebraic framework rather than a graph theoretic framework. The path algebraic framework is coupled with some efficient data encoding, representation and indexing. Together, these allow more effective querying than using the traditional graph traversal style algorithms, demonstrated by a comparative evaluation. 
DA - 2020/// PY - 2020/// DO - 10.1109/BigData50022.2020.9378338 SP - 623-630 SN - 2639-1589 KW - Exploratory Querying KW - RDF KW - Knowledge Graphs KW - Set Constrained Path Queries ER - TY - JOUR TI - MuLan: Multilevel Language-based Representation Learning for Disease Progression Modeling AU - Sohn, Hyunwoo AU - Park, Kyungjin AU - Chi, Min T2 - 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) AB - Modeling patient disease progression using Electronic Health Records (EHRs) is crucial to assist clinical decision making. In recent years, deep learning models such as Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) have shown great success in handling sequential multivariate data, such as EHRs. Despite their great success, it is often difficult to interpret and visualize patient disease progression learned from these models in a meaningful yet unified way. In this work, we present MuLan: a Multilevel Language-based representation learning framework that can automatically learn a hierarchical representation for EHRs at entry, event, and visit levels. We validate MuLan on modeling the progression of an extremely challenging disease, septic shock, by using real-world EHRs. Our results showed that these unified multilevel representations can be utilized not only for interpreting and visualizing the latent mechanism of patients' septic shock progressions but also for early detection of septic shock. 
DA - 2020/// PY - 2020/// DO - 10.1109/BigData50022.2020.9377829 SP - 1246-1255 SN - 2639-1589 KW - Electronic health records KW - disease progression modeling KW - interpretability KW - representation learning ER - TY - JOUR TI - An Adversarial Domain Separation Framework for Septic Shock Early Prediction Across EHR Systems AU - Khoshnevisan, Farzaneh AU - Chi, Min T2 - 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) AB - Modeling patient disease progression using Electronic Health Records (EHRs) is critical to assist clinical decision making. While most of prior work has mainly focused on developing effective disease progression models using EHRs collected from an individual medical system, relatively little work has investigated building robust yet generalizable diagnosis models across different systems. In this work, we propose a general domain adaptation (DA) framework that tackles two categories of discrepancies in EHRs collected from different medical systems: one is caused by heterogeneous patient populations (covariate shift) and the other is caused by variations in data collection procedures (systematic bias). Prior research in DA has mainly focused on addressing covariate shift but not systematic bias. In this work, we propose an adversarial domain separation framework that addresses both categories of discrepancies by maintaining one globally-shared invariant latent representation across all systems through an adversarial learning process, while also allocating a domain-specific model for each system to extract local latent representations that cannot and should not be unified across systems. Moreover, our proposed framework is based on variational recurrent neural network (VRNN) because of its ability to capture complex temporal dependencies and handling missing values in time-series data. 
We evaluate our framework for early diagnosis of an extremely challenging condition, septic shock, using two real-world EHRs from distinct medical systems in the U.S. The results show that by separating globally-shared from domain-specific representations, our framework significantly improves septic shock early prediction performance in both EHRs and outperforms the current state-of-the-art DA models. DA - 2020/// PY - 2020/// DO - 10.1109/BigData50022.2020.9378058 SP - 64-73 SN - 2639-1589 KW - adversarial domain adaptation variational RNN KW - Electronic health Record KW - septic shock KW - early prediction ER - TY - JOUR TI - Characterizing the Impact of TCP Coexistence in Data Center Networks AU - Ganji, Anirudh AU - Singh, Anand AU - Shahzad, Muhammad T2 - 2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS) AB - The switch fabrics of today's data centers carry traffic controlled by a variety of TCP congestion control algorithms. This leads us to ask: how does the coexistence of multiple variants of TCP on shared switch fabric impacts the performance achieved by different applications in data centers? To answer this question, we conducted an extensive set of experiments with coexisting TCP variants on Leaf-Spine and Fat-Tree switch fabrics. We executed common data center workloads, which include streaming, MapReduce, and storage workloads, using four commonly used TCP variants, namely BBR, DCTCP, CUBIC, and New Reno. We also extensively executed iPerf workloads using these 4 TCP variants to purely study the impact of the coexistence of TCP variants on each other's performance without incorporating the network behavior of the application layer. Our experiments resulted in a large set of network traces comprised of 160 billion packets (we will release these traces after publication of this work). 
We present comprehensive observations from these traces that have important implications in ensuring optimal utilization of data center switch fabric and in meeting the network performance needs of application layer workloads. DA - 2020/// PY - 2020/// DO - 10.1109/ICDCS47774.2020.00035 SP - 388-398 SN - 1063-6927 ER - TY - JOUR TI - A Literature Review on Mining Cyberthreat Intelligence from Unstructured Texts AU - Rahman, Md Rayhanur AU - Mahdavi-Hezaveh, Rezvan AU - Williams, Laurie T2 - 20TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2020) AB - Cyberthreat defense mechanisms have become more proactive these days, and thus leading to the increasing incorporation of cyberthreat intelligence (CTI). Cybersecurity researchers and vendors are powering the CTI with large volumes of unstructured textual data containing information on threat events, threat techniques, and tactics. Hence, extracting cyberthreat-relevant information through text mining is an effective way to obtain actionable CTI to thwart cyberattacks. The goal of this research is to aid cybersecurity researchers understand the source, purpose, and approaches for mining cyberthreat intelligence from unstructured text through a literature review of peer-reviewed studies on this topic. We perform a literature review to identify and analyze existing research on mining CTI. By using search queries in the bibliographic databases, 28,484 articles are found. From those, 38 studies are identified through the filtering criteria which include removing duplicates, non-English, non-peer-reviewed articles, and articles not about mining CTI. We find that the most prominent sources of unstructured threat data are the threat reports, Twitter feeds, and posts from hackers and security experts. We also observe that security researchers mined CTI from unstructured sources to extract Indicator of Compromise (IoC), threat-related topic, and event detection. 
Finally, natural language processing (NLP) based approaches: topic classification; keyword identification; and semantic relationship extraction among the keywords are mostly availed in the selected studies to mine CTI information from unstructured threat sources. DA - 2020/// PY - 2020/// DO - 10.1109/ICDMW51313.2020.00075 SP - 516-525 SN - 2375-9232 ER - TY - JOUR TI - Gang of Eight: A Defect Taxonomy for Infrastructure as Code Scripts AU - Rahman, Akond AU - Farhana, Effat AU - Parnin, Chris AU - Williams, Laurie T2 - 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020) AB - Defects in infrastructure as code (IaC) scripts can have serious consequences, for example, creating large-scale system outages. A taxonomy of IaC defects can be useful for understanding the nature of defects, and identifying activities needed to fix and prevent defects in IaC scripts. The goal of this paper is to help practitioners improve the quality of infrastructure as code (IaC) scripts by developing a defect taxonomy for IaC scripts through qualitative analysis. We develop a taxonomy of IaC defects by applying qualitative analysis on 1,448 defect-related commits collected from open source software (OSS) repositories of the Openstack organization. We conduct a survey with 66 practitioners to assess if they agree with the identified defect categories included in our taxonomy. We quantify the frequency of identified defect categories by analyzing 80,425 commits collected from 291 OSS repositories spanning across 2005 to 2019. 
DA - 2020/// PY - 2020/// DO - 10.1145/3377811.3380409 SP - 752-764 SN - 0270-5257 KW - bug KW - category KW - configuration as code KW - configuration scripts KW - defect KW - devops KW - infrastructure as code KW - puppet KW - software quality KW - taxonomy ER - TY - JOUR TI - HARP: Holistic Analysis for Refactoring Python-Based Analytics Programs AU - Zhou, Weijie AU - Zhao, Yue AU - Zhang, Guoqiang AU - Shen, Xipeng T2 - 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020) AB - Modern machine learning programs are often written in Python, with the main computations specified through calls to some highly optimized libraries (e.g., TensorFlow, PyTorch). How to maximize the computing efficiency of such programs is essential for many application domains, which has drawn lots of recent attention. This work points out a common limitation in existing efforts: they focus their views only on the static computation graphs specified by library APIs, but leave the influence from the hosting Python code largely unconsidered. The limitation often causes them to miss the big picture and hence many important optimization opportunities. This work proposes a new approach named HARP to address the problem. HARP enables holistic analysis that spans across computation graphs and their hosting Python code. HARP achieves it through a set of novel techniques: analytics-conscious speculative analysis to circumvent Python complexities, a unified representation augmented computation graphs to capture all dimensions of knowledge related with the holistic analysis, and conditioned feedback mechanism to allow risk-controlled aggressive analysis. Refactoring based on HARP gives 1.3--3X and 2.07X average speedups on a set of TensorFlow and PyTorch programs. 
DA - 2020/// PY - 2020/// DO - 10.1145/3377811.3380434 SP - 506-517 SN - 0270-5257 KW - machine learning program KW - computation graph KW - dynamic language KW - program analysis ER - TY - JOUR TI - An Initial Study on Adapting DTW at Individual Query for Electrocardiogram Analysis AU - Shen, Daniel AU - Chi, Min T2 - ADVANCED ANALYTICS AND LEARNING ON TEMPORAL DATA, AALTD 2019 AB - This paper describes an initial investigation on adapting windowed Dynamic Time Warping (DTW) for enhancing the reliability of fast DTW for Electrocardiogram analysis in Cardiology, a domain where risks are especially important to avoid. The key question it explores is whether it is worthwhile to adapt the window size of DTW for every query temporal sequence, a factor critically determining the speed-accuracy tradeoff of DTW. It in addition extends the adaptation to cover also the order of sequences for lower bound calculations. Experiments on ECG temporal sequences show that the techniques help significantly reduce risks that windowed DTW algorithms are subject to and at the same time keeping a high speed. DA - 2020/// PY - 2020/// DO - 10.1007/978-3-030-39098-3_16 VL - 11986 SP - 213-228 SN - 1611-3349 KW - DTW KW - Time series analytics KW - Algorithm optimizations KW - Electrocardiogram ER - TY - JOUR TI - Here We Go Again: Why Is It Difficult for Developers to Learn Another Programming Language? AU - Shrestha, Nischal AU - Botta, Colton AU - Barik, Titus AU - Parnin, Chris T2 - 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020) AB - Once a programmer knows one language, they can leverage concepts and knowledge already learned, and easily pick up another programming language. But is that always the case? To understand if programmers have difficulty learning additional programming languages, we conducted an empirical study of Stack Overflow questions across 18 different programming languages. 
We hypothesized that previous knowledge could potentially interfere with learning a new programming language. From our inspection of 450 Stack Overflow questions, we found 276 instances of interference that occurred due to faulty assumptions originating from knowledge about a different language. To understand why these difficulties occurred, we conducted semi-structured interviews with 16 professional programmers. The interviews revealed that programmers make failed attempts to relate a new programming language with what they already know. Our findings inform design implications for technical authors, toolsmiths, and language designers, such as designing documentation and automated tools that reduce interference, anticipating uncommon language transitions during language design, and welcoming programmers not just into a language, but its entire ecosystem. DA - 2020/// PY - 2020/// DO - 10.1145/3377811.3380352 SP - 691-701 SN - 0270-5257 KW - interference theory KW - learning KW - program comprehension KW - programming environments KW - programming languages ER - TY - JOUR TI - SLACC: Simion-based Language Agnostic Code Clones AU - Mathew, George AU - Parnin, Chris AU - Stolee, Kathryn T. T2 - 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020) AB - Successful cross-language clone detection could enable researchers and developers to create robust language migration tools, facilitate learning additional programming languages once one is mastered, and promote reuse of code snippets over a broader codebase. However, identifying cross-language clones presents special challenges to the clone detection problem. 
A lack of common underlying representation between arbitrary languages means detecting clones requires one of the following solutions: 1) a static analysis framework replicated across each targeted language with annotations matching language features across all languages, or 2) a dynamic analysis framework that detects clones based on runtime behavior. DA - 2020/// PY - 2020/// DO - 10.1145/3377811.3380407 SP - 210-221 SN - 0270-5257 KW - semantic code clone detection KW - cross-language analysis ER - TY - JOUR TI - Caspar: Extracting and Synthesizing User Stories of Problems from App Reviews AU - Guo, Hui AU - Singh, Munindar P. T2 - 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2020) AB - A user's review of an app often describes the user's interactions with the app. These interactions, which we interpret as mini stories, are prominent in reviews with negative ratings. In general, a story in an app review would contain at least two types of events: user actions and associated app behaviors. Being able to identify such stories would enable an app's developer in better maintaining and improving the app's functionality and enhancing user experience. DA - 2020/// PY - 2020/// DO - 10.1145/3377811.3380924 SP - 628-640 SN - 0270-5257 ER - TY - RPRT TI - Text Mining to Identify and Extract Novel Disease Treatments From Unstructured Datasets AU - Yedida, R. AU - Abrar, S.M. AU - Melo-Filho, C. AU - Muratov, E. AU - Chirkova, R. AU - Tropsha, A. DA - 2020/// PY - 2020/// M1 - 2011.07959 M3 - arXiv preprint SN - 2011.07959 ER - TY - CHAP TI - Optimizing Inter-nationality of Journals: A Classical Gradient Approach Revisited via Swarm Intelligence AU - Khaidem, Luckyson AU - Yedida, Rahul AU - Theophilus, Abhijit J. 
T2 - Modeling, Machine Learning and Astronomy T3 - Communications in Computer and Information Science AB - With the growth of a vast number of new journals, the de facto definition of internationality has raised debate among researchers. A robust set of metrics, not prone to manipulation, is paramount for evaluating influence when journals claim “International” status. The ScientoBASE project defines internationality in terms of publication quality and spread of influence beyond geographical boundaries. This is achieved through quantified metrics, like the NLIQ, OCQ, SNIP and ICR, passed into the Cobb Douglas Production Function to estimate the range of influence a journal has over its audience. The global optimum of this range is the maximum projected internationality score, or the internationality index of the journal. The optimization, however, being multivariate and constrained, presents several challenges to classical techniques, such as curvature variation, premature convergence and parameter scaling. This study approaches these issues by optimizing through the Swarm Intelligence meta-heuristic. Particle Swarm Optimization makes no assumptions on the function being optimized and does away with the need to calculate a gradient. These advantages circumvent the aforementioned issues and highlight the need for traction on machine learning in optimization. The model presented here observes that each journal has an associated globally optimal internationality score that fluctuates proportionally to input metrics, thereby describing a robust confluence of key influence indicators that pave the way for investigating alternative criteria for attributing credits to publications. 
PY - 2020/// DO - 10.1007/978-981-33-6463-9_1 VL - 1290 SP - 3–14 PB - Springer Singapore SN - 9789813364622 9789813364639 SV - 1290 UR - http://dx.doi.org/10.1007/978-981-33-6463-9_1 ER - TY - CONF TI - Parsimonious Computing: A Minority Training Regime for Effective Prediction in Large Microarray Expression Data Sets AU - Sridhar, Shailesh AU - Saha, Snehanshu AU - Shaikh, Azhar AU - Yedida, Rahul AU - Saha, Sriparna T2 - 2020 International Joint Conference on Neural Networks (IJCNN) AB - Rigorous mathematical investigation of learning rates used in back-propagation in shallow neural networks has become a necessity. This is because experimental evidence needs to be endorsed by a theoretical background. Such theory may be helpful in reducing the volume of experimental effort to accomplish desired results. We leveraged the functional property of Mean Square Error, which is Lipschitz continuous to compute learning rate in shallow neural networks. We claim that our approach reduces tuning efforts, especially when a significant corpus of data has to be handled. We achieve remarkable improvement in saving computational cost while surpassing prediction accuracy reported in literature. The learning rate, proposed here, is the inverse of the Lipschitz constant. The work results in a novel method for carrying out gene expression inference on large microarray data sets with a shallow architecture constrained by limited computing resources. A combination of random sub-sampling of the dataset, an adaptive Lipschitz constant inspired learning rate and a new activation function, A-ReLU helped accomplish the results reported in the paper. 
C2 - 2020/7// C3 - 2020 International Joint Conference on Neural Networks (IJCNN) DA - 2020/7// DO - 10.1109/ijcnn48605.2020.9207083 SP - 1-8 PB - IEEE SN - 9781728169262 UR - http://dx.doi.org/10.1109/ijcnn48605.2020.9207083 ER - TY - JOUR TI - LipschitzLR: Using theoretically computed adaptive learning rates for fast convergence AU - Yedida, Rahul AU - Saha, Snehanshu AU - Prashanth, Tejas T2 - APPLIED INTELLIGENCE AB - We present a novel theoretical framework for computing large, adaptive learning rates. Our framework makes minimal assumptions on the activations used and exploits the functional properties of the loss function. Specifically, we show that the inverse of the Lipschitz constant of the loss function is an ideal learning rate. We analytically compute formulas for the Lipschitz constant of several loss functions, and through extensive experimentation, demonstrate the strength of our approach using several architectures and datasets. In addition, we detail the computation of learning rates when other optimizers, namely, SGD with momentum, RMSprop, and Adam, are used. Compared to standard choices of learning rates, our approach converges faster, and yields better results. DA - 2020/// PY - 2020/// DO - 10.1007/s10489-020-01892-0 VL - 9 KW - Lipschitz constant KW - Adaptive learning KW - Machine learning KW - Deep learning ER - TY - JOUR TI - A Note on Sparse Polynomial Interpolation in Dickson Polynomial Basis AU - Imamoglu, Erdal AU - Kaltofen, Erich L. T2 - ACM COMMUNICATIONS IN COMPUTER ALGEBRA
DA - 2020/12// PY - 2020/12// DO - 10.1145/3465002.3465003 VL - 54 IS - 4 SP - 125-128 SN - 1932-2240 ER - TY - JOUR TI - Aarohi: Making Real-Time Node Failure Prediction Feasible AU - Das, Anwesha AU - Mueller, Frank AU - Rountree, Barry T2 - 2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM IPDPS 2020 AB - Large-scale production systems are well known to encounter node failures, which affect compute capacity and energy. Both in HPC systems and enterprise data centers, combating failures is becoming challenging with increasing hardware and software complexity. Several data mining solutions of logs have been investigated in the context of anomaly detection in such systems. However, with subsequent proactive failure mitigation, the existing log mining solutions are not sufficiently fast for real-time anomaly detection. Machine learning (ML)-based training can produce high accuracy but the inference scheme needs to be enhanced with rapid parsers to assess anomalies in real-time. 
This work tackles online anomaly prediction in computing systems by exploiting context free grammar-based rapid event analysis. We present our framework Aarohi, which describes an effective way to predict failures online. Aarohi is designed to be generic and scalable making it suitable as a real-time predictor. Aarohi obtains more than 3 minutes lead times to node failures with an average of 0.31 msecs prediction time for a chain length of 18. The overall improvement obtained w.r.t. the existing state-of-the-art is over a factor of 27.4×. Our compiler-based approach provides new research directions for lead time optimization with a significant prediction speedup required for the deployment of proactive fault tolerant solutions in practice. DA - 2020/// PY - 2020/// DO - 10.1109/IPDPS47924.2020.00115 SP - 1092-1101 SN - 1530-2075 KW - Online Prediction KW - HPC KW - Node Failures KW - Parsing ER - TY - JOUR TI - Just-in-time Quantum Circuit Transpilation Reduces Noise AU - Wilson, Ellis AU - Singh, Sudhakar AU - Mueller, Frank T2 - IEEE INTERNATIONAL CONFERENCE ON QUANTUM COMPUTING AND ENGINEERING (QCE20) AB - Running quantum programs is fraught with challenges on today's noisy intermediate scale quantum (NISQ) devices. Many of these challenges originate from the error characteristics that stem from rapid decoherence and noise during measurement, qubit connections, crosstalk, the qubits themselves, and transformations of qubit state via gates. Not only are qubits not “created equal”, but their noise level also changes over time. IBM is said to calibrate their quantum systems once per day and reports noise levels (errors) at the time of such calibration. This information is subsequently used to map circuits to higher quality qubits and connections up to the next calibration point. This work provides evidence that there is room for improvement over this daily calibration cycle. 
It contributes a technique to measure noise levels (errors) related to qubits immediately before executing one or more sensitive circuits and shows that just-in-time noise measurements can benefit late physical qubit mappings. With this just-in-time recalibrated transpilation, the fidelity of results is improved over IBM's default mappings, which only use their daily calibrations. The framework assesses two major sources of noise, namely readout errors (measurement errors) and two-qubit gate/connection errors. Experiments indicate that the accuracy of circuit results improves by 3-304% on average and up to 400% with on-the-fly circuit mappings based on error measurements just prior to application execution. DA - 2020/// PY - 2020/// DO - 10.1109/QCE49297.2020.00050 SP - 345-355 KW - quantum computing KW - errors KW - dynamic compilation ER - TY - JOUR TI - Making Fair ML Software using Trustworthy Explanation AU - Chakraborty, Joymallya AU - Peng, Kewen AU - Menzies, Tim T2 - 2020 35TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING (ASE 2020) AB - Machine learning software is being used in many applications (finance, hiring, admissions, criminal justice) having huge social impact. But sometimes the behavior of this software is biased and it shows discrimination based on some sensitive attributes such as sex, race, etc. Prior works concentrated on finding and mitigating bias in ML models. A recent trend is using instance-based model-agnostic explanation methods such as LIME[36] to find out bias in the model prediction. Our work concentrates on finding shortcomings of current bias measures and explanation methods. We show how our proposed method based on K nearest neighbors can overcome those shortcomings and find the underlying bias of black box models. Our results are more trustworthy and helpful for the practitioners. Finally, we describe our future framework combining explanation and planning to build fair software. 
DA - 2020/// PY - 2020/// DO - 10.1145/3324884.3418932 SP - 1229-1233 SN - 1527-1366 ER - TY - JOUR TI - What disconnects Practitioner Belief and Empirical Evidence? AU - Shrikanth, N. C. AU - Menzies, Tim T2 - 2020 ACM/IEEE 42ND INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING: COMPANION PROCEEDINGS (ICSE-COMPANION 2020) AB - Just because software developers say they believe in "X", that does not necessarily mean that "X" is true. As shown here, there exist numerous beliefs listed in the recent Software Engineering literature which are only supported by small portions of the available data. Hence we ask what is the source of this disconnect between beliefs and evidence? DA - 2020/// PY - 2020/// DO - 10.1145/3377812.3390802 SP - 286-287 SN - 0270-5257 KW - defects KW - beliefs KW - practitioner KW - empirical software engineering ER - TY - JOUR TI - Hardware-Based Domain Virtualization for Intra-Process Isolation of Persistent Memory Objects AU - Xu, Yuanchao AU - Ye, ChenCheng AU - Solihin, Yan AU - Shen, Xipeng T2 - 2020 ACM/IEEE 47TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA 2020) AB - Persistent memory has appealing properties in serving as main memory. While file access is protected by system calls, an attached persistent memory object (PMO) is one load/store away from accidental (or malicious) reads or writes, which may arise from use of just one buggy library. The recent progress in intra-process isolation could potentially protect PMO by enabling a process to partition sensitive data and code into isolated components. However, the existing intra-process isolations (e.g., Intel MPK) support isolation of only up to 16 domains, forming a major barrier for PMO protections. Although there is some recent effort trying to virtualize MPK to circumvent the limit, it suffers large overhead. 
This paper presents two novel architecture supports, which provide 11-52× higher efficiency while offering the first known domain-based protection for PMOs. DA - 2020/// PY - 2020/// DO - 10.1109/ISCA45697.2020.00062 SP - 680-692 SN - 0884-7495 KW - Persistent Memory Objects KW - Memory Protection Keys KW - Intra-process Isolation ER - TY - JOUR TI - Kobold: Evaluating Decentralized Access Control for Remote NSXPC Methods on iOS AU - Deshotels, Luke AU - Carabas, Costin AU - Beichler, Jordan AU - Deaconescu, Razvan AU - Enck, William T2 - 2020 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP 2020) AB - Apple uses several access control mechanisms to prevent third party applications from directly accessing security sensitive resources, including sandboxing and file access control. However, third party applications may also indirectly access these resources using inter-process communication (IPC) with system daemons. If these daemons fail to properly enforce access control on IPC, confused deputy vulnerabilities may result. Identifying such vulnerabilities begins with an enumeration of all IPC services accessible to third party applications. However, the IPC interfaces and their corresponding access control policies are unknown and must be reverse engineered at a large scale. In this paper, we present the Kobold framework to study NSXPC-based system services using a combination of static and dynamic analysis. Using Kobold, we discovered multiple NSXPC services with confused deputy vulnerabilities and daemon crashes. Our findings include the ability to activate the microphone, disable access to all websites, and leak private data stored in iOS File Providers. 
DA - 2020/// PY - 2020/// DO - 10.1109/SP40000.2020.00023 SP - 1056-1070 SN - 1081-6011 KW - access control KW - iOS KW - iPhone KW - inter-process communication KW - fuzzer KW - attack surface KW - automation KW - policy analysis ER - TY - JOUR TI - Anonymous Lottery In The Proof-of-Stake Setting AU - Baldimtsi, Foteini AU - Madathil, Varun AU - Scafuro, Alessandra AU - Zhou, Linfeng T2 - 2020 IEEE 33RD COMPUTER SECURITY FOUNDATIONS SYMPOSIUM (CSF 2020) AB - When Proof-of-Stake (PoS) underlies a consensus protocol, parties who are eligible to participate in the protocol are selected via a public selection function that depends on the stake they own. Identity and stake of the selected parties must then be disclosed in order to allow verification of their eligibility, and this can raise privacy concerns. In this paper, we present a modular approach for addressing the identity leaks of selection functions, decoupling the problem of implementing an anonymous selection of the participants, from the problem of implementing other tasks, e.g. consensus. We present an ideal functionality for anonymous selection that can be more easily composed with other protocols. We then show an instantiation of our anonymous selection functionality based on the selection function of Algorand. DA - 2020/// PY - 2020/// DO - 10.1109/CSF49147.2020.00030 SP - 318-333 SN - 2374-8303 KW - Blockchain KW - Proof-of-Stake KW - Privacy ER - TY - JOUR TI - Integrating Testing Throughout the CS Curriculum AU - Heckman, Sarah AU - Schmidt, Jessica Young AU - King, Jason T2 - 2020 IEEE 13TH INTERNATIONAL CONFERENCE ON SOFTWARE TESTING, VERIFICATION AND VALIDATION WORKSHOPS (ICSTW) AB - Software testing is a critical component of any software development lifecycle, but becoming an experienced software tester requires understanding many strategies for writing high-quality test cases and a significant amount of practice. 
Situated learning theory suggests that students should be exposed to things they would see in a professional workplace. In terms of software testing, students should be exposed to real-world software testing practices in a variety of contexts, from the simplest of programs to the very complex. The goal of this paper is to share our experience integrating software testing into our undergraduate curriculum at North Carolina State University. In this paper, we discuss how software testing is taught in our CS1 - Introductory Programming, CS2 - Software Development Fundamentals, and several other courses beyond CS2. Over the past 10 years of teaching software testing in introductory programming courses, we discuss lessons learned and highlight open concerns for future research. DA - 2020/// PY - 2020/// DO - 10.1109/ICSTW50294.2020.00079 SP - 441-444 SN - 2159-4848 KW - testing KW - CS1 KW - CS2 ER - TY - CONF TI - Robust resource provisioning in time-varying edge networks AB - Edge computing is one of the revolutionary technologies that enable high-performance and low-latency modern applications, such as smart cities, connected vehicles, etc. Yet its adoption has been limited by factors including high cost of edge resources, heterogeneous and fluctuating demands, and lack of reliability. In this paper, we study resource provisioning in edge computing, taking into account these different factors. First, based on observations from real demand traces, we propose a time-varying stochastic model to capture the time-dependent and uncertain demand and network dynamics in an edge network. We then apply a novel robustness model that accounts for both expected and worst-case performance of a service. Based on these models, we formulate edge provisioning as a multi-stage stochastic optimization problem. The problem is NP-hard even in the deterministic case. Leveraging the multi-stage structure, we apply nested Benders decomposition to solve the problem. 
We also describe several efficiency enhancement techniques, including a novel technique for quickly solving the large number of decomposed subproblems. Finally, we present results from real dataset-based simulations, which demonstrate the advantages of the proposed models, algorithm and techniques. C2 - 2020/10/14/ C3 - Twenty-First International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing (MobiHoc) DA - 2020/10/14/ DO - 10.1145/3397166.3409146 ER - TY - CONF TI - A Lightweight Intervention to Decrease Gender Bias in Student Evaluations of Teaching AU - Fisk, Susan AU - Stolee, Kathryn T. AU - Battestilli, Lina T2 - 2020 Research on Equity and Sustained Participation in Engineering, Computing, and Technology (RESPECT) AB - Women are underrepresented as instructors in engineering, computing, and technology classes. One factor that disadvantages women in the classroom is student evaluations of teaching (SETs), as research finds they contain significant gender bias. This may contribute to the dearth of women in computing education, as SETs are used in decisions about contract renewals, hiring, tenure, and promotion. The double-bind is one cause of gender bias in SETs, meaning that it is more difficult for women than for men in leadership positions (such as being a professor) to be perceived as both competent and likable. We examine a lightweight intervention's impact on gender bias caused by the double-bind. Specifically, we conducted a field experiment in which the woman professor of a CS1 class for non-majors gave students in the intervention condition additional, positive exam feedback via email. We hypothesized this would increase students' perceptions of the professor's likability, which would then increase her SETs. We find that the intervention increased top-performing students' ratings of the professor's likability. 
We also find that the professor received significantly higher SETs the semester she sent the intervention emails. While women should not have to alter their behavior to accommodate students' gender biases, this intervention may be a useful survival strategy for women impacted by gender bias in SETs. C2 - 2020/3/10/ C3 - 2020 Research on Equity and Sustained Participation in Engineering, Computing, and Technology (RESPECT) DA - 2020/3/10/ DO - 10.1109/respect49803.2020.9272454 PB - IEEE SN - 9781728171722 UR - http://dx.doi.org/10.1109/respect49803.2020.9272454 DB - Crossref ER - TY - JOUR TI - Database-Access Performance Antipatterns in Database-Backed Web Applications AU - Shao, Shudi AU - Qiu, Zhengyi AU - Yu, Xiao AU - Yang, Wei AU - Jin, Guoliang AU - Xie, Tao AU - Wu, Xintao T2 - 2020 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE MAINTENANCE AND EVOLUTION (ICSME 2020) AB - Database-backed web applications are prone to performance bugs related to database accesses. While much work has been conducted on database-access antipatterns with some recent work focusing on performance impact, there is still no comprehensive view of database-access performance antipatterns in database-backed web applications. To date, no existing work systematically reports known antipatterns in the literature, and no existing work has studied database-access performance bugs in major types of web applications that access databases differently. To address this issue, we first summarize all known database-access performance antipatterns found through our literature survey, and we report all of them in this paper. We further collect database-access performance bugs from web applications that access databases through language-provided SQL interfaces, which have been largely ignored by recent work, to check how extensively the known antipatterns can cover these bugs. 
For bugs not covered by the known antipatterns, we extract new database-access performance antipatterns based on real-world performance bugs from such web applications. Our study in total reports 24 known and 10 new database-access performance antipatterns. Our results can guide future work to develop effective tool support for different types of web applications. DA - 2020/// PY - 2020/// DO - 10.1109/ICSME46990.2020.00016 SP - 58-69 SN - 1063-6773 KW - performance antipatterns KW - performance bugs KW - database-backed web applications KW - characteristic study ER - TY - JOUR TI - PRIME: Block-Wise Missingness Handling for Multi-modalities in Intelligent Tutoring Systems AU - Yang, Xi AU - Kim, Yeo-Jin AU - Taub, Michelle AU - Azevedo, Roger AU - Chi, Min T2 - MULTIMEDIA MODELING (MMM 2020), PT II AB - Block-wise missingness in multimodal data poses a challenging barrier for the analysis over it, which is quite common in practical scenarios such as the multimedia intelligent tutoring systems (ITSs). In this work, we collected data from 194 undergraduates via a biology ITS which involves three modalities: student-system logfiles, facial expressions, and eye tracking. However, only 32 out of the 194 students had all three modalities and 83% of them were missing the facial expression data, eye tracking data, or both. To handle such a block-wise missing problem, we propose a Progressively Refined Imputation for Multi-modalities by auto-Encoder (PRIME), which trains the model based on single, pairwise, and entire modalities for imputation in a progressive manner, and therefore enables us to maximally utilize all the available data. We have evaluated PRIME against single-modality log-only (without missingness handling) and five state-of-the-art missing data handling methods on one important yet challenging student modeling task: to predict students’ learning gains. 
Our results show that using multimodal data as a result of missing data handling yields better prediction performance than using logfiles only, and PRIME outperforms other baseline methods for both learning gain prediction and data reconstruction tasks. DA - 2020/// PY - 2020/// DO - 10.1007/978-3-030-37734-2_6 VL - 11962 SP - 63-75 SN - 1611-3349 KW - Multimodal KW - Block-wise missing KW - Learning gain prediction ER - TY - JOUR TI - RFMap: Generating Indoor Maps using RF Signals AU - Khan, Usman Mahmood AU - Venkatnarayan, Raghav H. AU - Shahzad, Muhammad T2 - 2020 19TH ACM/IEEE INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING IN SENSOR NETWORKS (IPSN 2020) AB - Generating maps of indoor environments beyond the line of sight finds applications in several areas such as planning, navigation, and security. While researchers have previously explored the use of RF signals to generate maps, prior work has two important limitations: (i) it requires moving the mapping setup along the entire lengths of the sides of the building, and (ii) it generates maps that are not fully connected, rather are scatter plots of locations from where some obstacles reflected the signals. Thus, prior approaches require human interpretation to locate the walls and determine how they merge. In this paper, we address these limitations and propose RFMap, which generates fully connected maps, and does not require the measurement setup to be moved along the sides of the buildings. To generate the map, RFMap first transmits RF signals in many different directions by rotating the antennas while keeping them at the same location and then measures the distances of different reflectors inside the building. Next, it identifies these reflectors and classifies them into various types based on the properties of the reflections. A key challenge is that RFMap does not receive reflections from all the directions due to the specular nature of the reflectors. 
Due to this, it only gets sparse data about the objects in the environment. To address this challenge, RFMap trains a deep generative adversarial network (GAN) to intelligently predict the missing information. At runtime, it feeds the locations and types of the detected reflectors to the trained GAN and generates the complete and accurate map. We implemented RFMap using software defined radios and extensively evaluated it in several real world environments. Our results show that RFMap generated the maps of all the buildings that we tested it on with high accuracy. DA - 2020/// PY - 2020/// DO - 10.1109/IPSN48710.2020.00-40 SP - 133-144 KW - RF KW - Indoor Maps KW - Machine Learning KW - Wireless ER - TY - JOUR TI - Efficient Algorithm for the Topological Characterization of Worm-like and Branched Micelle Structures from Simulations AU - Conchuir, Breanndan O. AU - Gardner, Kirk AU - Jordan, Kirk E. AU - Bray, David J. AU - Anderson, Richard L. AU - Johnston, Michael A. AU - Swope, William C. AU - Harrison, Alex AU - Sheehy, Donald R. AU - Peters, Thomas J. T2 - JOURNAL OF CHEMICAL THEORY AND COMPUTATION AB - Many surfactant-based formulations are utilized in industry as they produce desirable viscoelastic properties at low concentrations. These properties are due to the presence of worm-like micelles (WLMs), and as a result, understanding the processes that lead to WLM formation is of significant interest. Various experimental techniques have been applied with some success to this problem but can encounter issues probing key microscopic characteristics or the specific regimes of interest. The complementary use of computer simulations could provide an alternate route to accessing their structural and dynamic behavior. However, few computational methods exist for measuring key characteristics of WLMs formed in particle simulations. Further, their mathematical formulations are challenged by WLMs with sharp curvature profiles or density fluctuations along the backbone. 
Here, we present a new topological algorithm for identifying and characterizing WLMs in particle simulations, which has desirable mathematical properties that address shortcomings in previous techniques. We apply the algorithm to the case of sodium dodecyl sulfate micelles to demonstrate how it can be used to construct a comprehensive topological characterization of the observed structures. DA - 2020/7/14/ PY - 2020/7/14/ DO - 10.1021/acs.jctc.0c00311 VL - 16 IS - 7 SP - 4588-4598 SN - 1549-9626 ER - TY - CONF TI - Infusing Computing: A Scaffolding and Teacher Accessibility Analysis of Computing Lessons Designed by Novices AU - Cateté, V. AU - Isvik, A. AU - Barnes, T. AB - Creators of computing curricula do not always have formal pedagogical training. We investigated if exposing novice lesson designers to pedagogical best practices would result in the creation of lessons where evidence of successful use of these practices could be identified. We trained 29 high school students who were in a full-time computer science summer internship on how to create Snap! programming lessons for non-computing courses. Over the course of three weeks they developed computing-infused lessons on their choice of learning topic (science, business, language, etc.). We examined these lessons for their use of scaffolding, teacher accessibility, equity, and content. We found that students implemented many of the scaffolding techniques that they themselves experienced and created lessons that were detailed enough to be accessible for teacher use. We also identified significant relationships between both subject area and gender on equity scores, as well as an impact of collaboration on scaffolding type included. No difference in artifact quality was identified by prior student coding experience. This project represents an innovative way to engage students in learning more computer science while creating educational materials for computing in K-12 classrooms. 
C2 - 2020/// C3 - ACM International Conference Proceeding Series DA - 2020/// DO - 10.1145/3428029.3428056 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85096921632&partnerID=MN8TOARS ER - TY - CONF TI - FLAMES: A Socially Relevant Computing Summer Internship for High School Students AU - Isvik, A. AU - Catete, V. AU - Barnes, T. AB - In this article, we examine a female-oriented, high school computing outreach program, FLAMES, consisting of an 8-week high school summer intern program run within a university computer science (CS) department. We focus on examining the effects of the program on students' skills and affect towards computing. Much of the literature in CS outreach research examines summer camps, after-school programs, and other school-year events that often have a focus on only teaching students computing content. Our program is unique and socially relevant as students are trained to assist teachers with the development of Computational Thinking-Infused curricula for their classrooms. This paper presents the design of our program, an overview of the curriculum, and results including both student and teacher feedback. Results show that the program has benefited each of the parties involved, including its student participants, facilitators, and the teachers assisted by the participants. We share our lessons learned in order to help other CS departments develop similar broadening participation in computing programs. C2 - 2020/// C3 - 2020 Research on Equity and Sustained Participation in Engineering, Computing, and Technology, RESPECT 2020 - Proceedings DA - 2020/// DO - 10.1109/RESPECT49803.2020.9272515 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85093961043&partnerID=MN8TOARS ER - TY - CONF TI - FIRST Principles to Design for Online, Synchronous High School CS Teacher Training and Curriculum Co-Design AU - Grover, S. AU - Cateté, V. AU - Barnes, T. AU - Hill, M. AU - Ledeczi, A. AU - Broll, B. 
AB - The Covid-19 pandemic has offered new challenges and opportunities for teaching and research. It has forced constraints on in-person gathering of researchers, teachers, and students, and conversely, has also opened doors to creative instructional design. This paper describes a novel approach to designing an online, synchronous teacher professional development (PD) and curriculum co-design experience. It shares our work in bringing together high school teachers and researchers in four US states. The teachers participated in a 3-week summer PD on ideas of Distributed Computing and how to teach this advanced topic to high school students using NetsBlox, an extension of the Snap! block-based programming environment. C2 - 2020/// C3 - ACM International Conference Proceeding Series DA - 2020/// DO - 10.1145/3428029.3428059 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85096938644&partnerID=MN8TOARS ER - TY - CONF TI - Creating a School-wide CS/CT-focused STEM Ecosystem to Address Access Barriers AU - Boulden, D. AU - Edwards, C. AU - Catete, V. AU - Lytle, N. AU - Barnes, T. AU - Wiebe, E.N. AU - Frye, D. AB - STEM ecosystem is an emerging model for identifying the barriers and support structures that students have in their learning trajectories in STEM. In this paper our university-based research team presents a CS/CT-focused STEM ecosystem strategy designed to address underrepresentation in computing fields. We describe our current and future work within our school-level research-practice partnership (RPP) with a local middle school, used to guide the creation of this ecosystem. C2 - 2020/// C3 - 2020 Research on Equity and Sustained Participation in Engineering, Computing, and Technology, RESPECT 2020 - Proceedings DA - 2020/// DO - 10.1109/RESPECT49803.2020.9272485 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85098802518&partnerID=MN8TOARS ER - TY - CONF TI - Bridge to Computing: An outreach program for at-risk young men AU - Catete, V. 
AU - Bell, D. AU - Isvik, A. AU - Lytle, N. AU - Dong, Y. AU - Barnes, T. AB - In 2017, our police department and the Give Back Organization (GBO), a local non-profit, contacted our university about hosting a game development summer camp. The camp was proposed to keep boys ages 12-15 living in a community with high levels of gang enrollment off the streets while providing an opportunity to learn about college and computing careers. The police also wanted to improve officer-youth relations. Our lab provided camp counselors, space, and content for the camp. Each camp supported a total of twelve African-American boys. Over three years, we refactored the curricula and organization of the camp and present our lessons learned from the experience. C2 - 2020/// C3 - 2020 Research on Equity and Sustained Participation in Engineering, Computing, and Technology, RESPECT 2020 - Proceedings DA - 2020/// DO - 10.1109/RESPECT49803.2020.9272475 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85098792411&partnerID=MN8TOARS ER - TY - CONF TI - Aligning Theory and Practice in Teacher Professional Development for Computer Science AU - Cateté, V. AU - Alvarez, L. AU - Isvik, A. AU - Milliken, A. AU - Hill, M. AU - Barnes, T. AB - Since the Advanced Placement Computer Science Principles (AP CSP) course has been released, it has vastly increased the need for highly trained CSP teachers who are prepared to bring CS to a diverse group of students. We have designed professional development (PD) workshops for high school teachers learning to teach this new CSP course, basing our design and iterative refinements on effective practices from other STEM disciplines. In summers 2012-2019, we have prepared over 600 teachers to teach CSP. Our PD provides teachers with time to learn CS content and pedagogical content knowledge. 
A key component of our PD design focuses on professionally-relevant activities– specifically, teachers develop and lead CSP lessons and provide feedback to each other through practice-focused discussions with experienced teachers and their peers. Another key component of our PD is including opportunities for continued professional growth, where we provide opportunities for teachers to engage as leaders who mentor others, curate materials, or facilitate future PDs. Our data has shown an increase in teachers’ confidence to lead in the classroom and also shown equal accessibility and growth for participants regardless of prior programming experience. In this paper, we examine and articulate the foundational theories for our CSP PD and how we have adapted these methodologies to cultivate an inclusive and productive learning environment for teachers. We also perform a retrospective analysis to determine which PD and program activities our teachers found most meaningful and relevant to their daily teaching. C2 - 2020/// C3 - ACM International Conference Proceeding Series DA - 2020/// DO - 10.1145/3428029.3428560 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85096914379&partnerID=MN8TOARS ER - TY - CONF TI - Data-driven approaches for exploring the effects of teacher instruction on student programming behaviors AU - Lytle, N. AU - Catete, V. AU - Dong, Y. C2 - 2020/// C3 - CEUR Workshop Proceedings DA - 2020/// VL - 2734 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85096177425&partnerID=MN8TOARS ER - TY - CONF TI - A block-based modeling curriculum for teaching middle grade science students about Covid-19 AU - Cateté, V. AU - Lytle, N. AU - Boulden, D. AU - Hinckle, M. AU - Wiebe, E. AU - Barnes, T. 
AB - While the scientific community is learning more about the novel Coronavirus and its associated disease, Covid-19, it is important to begin efforts to educate students on the disease, how it is transmitted, and the possible steps we as societies and individuals can take to combat the spread. To this end, we adapted an existing computational thinking curriculum originally designed to teach students about how infectious diseases are spread by having them build a model within the block-based programming environment, Cellular. This new curriculum introduces relevant scientific terms and tasks students to program an increasingly complex model, ending the activity by choosing which risk-reduction strategy to employ. C2 - 2020/// C3 - ACM International Conference Proceeding Series DA - 2020/// DO - 10.1145/3421590.3421624 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85094951953&partnerID=MN8TOARS ER - TY - CONF TI - Symbiotic HW Cache and SW DTLB Prefetching for DRAM/NVM Hybrid Memory AU - Patil, Onkar AU - Mueller, Frank AU - Ionkov, Latchesar AU - Lee, Jason AU - Lang, Michael T2 - 2020 28th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) AB - The introduction of NVDIMM memory devices has encouraged the use of DRAM/NVM based hybrid memory systems to increase the memory-per-core ratio in compute nodes and obtain possible energy and cost benefits. However, Non-Volatile Memory (NVM) is slower than DRAM in terms of read/write latency. This difference in performance will adversely affect memory-bound applications. Traditionally, data prefetching at the hardware level has been used to increase the number of cache hits to mitigate performance degradation. However, software (SW) prefetching has not been used effectively to reduce the effects of high memory access latencies. Also, the current cache hierarchy and hardware (HW) prefetching are not optimized for a hybrid memory system. 
We hypothesize that HW and SW prefetching can complement each other in placing data in caches and the Data Translation Look-aside Buffer (DTLB) prior to their references, and by doing so adaptively, highly varying access latencies in a DRAM/NVM hybrid memory system are taken into account. This work contributes an adaptive SW prefetch method based on the characterization of read/write/unroll prefetch distances for NVM and DRAM. Prefetch performance is characterized via custom benchmarks based on STREAM2 specifications in a multicore MPI runtime environment and compared to the performance of the standard SW prefetch pass in GCC. Furthermore, the effects of HW prefetching on kernels executing on hybrid memory system are evaluated. Experimental results indicate that SW prefetching targeted to populate the DTLB results in up to 26% performance improvement when symbiotically used in conjunction with HW prefetching, as opposed to only HW prefetching. Based on our findings, changes to GCC's prefetch-loop-arrays compiler pass are proposed to take advantage of DTLB prefetching in a hybrid memory system for kernels that are frequently used in HPC applications. C2 - 2020/11/17/ C3 - 2020 28th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) CY - Nice, France DA - 2020/11/17/ PY - 2020/// DO - 10.1109/MASCOTS50786.2020.9285963 SP - 1-8 PB - IEEE SN - 9781728192383 UR - http://dx.doi.org/10.1109/MASCOTS50786.2020.9285963 ER - TY - JOUR TI - A Systematic Literature Review of Animal-Assisted Interventions in Oncology (Part II): Theoretical Mechanisms and Frameworks AU - Holder, Timothy R. N. AU - Gruen, Margaret E. AU - Roberts, David L. AU - Somers, Tamara AU - Bozkurt, Alper T2 - INTEGRATIVE CANCER THERAPIES AB - Animal-assisted interventions (AAIs) can improve patients’ quality of life as complementary medical treatments. 
Part I of this 2-paper systematic review focused on the methods and results of cancer-related AAIs; Part II discusses the theories of the field’s investigators. Researchers cite animal personality, physical touch, physical movement, distraction, and increased human interaction as sources of observed positive outcomes. These mechanisms then group under theoretical frameworks such as the social support hypothesis or the human-animal bond concept to fully explain AAI in oncology. The cognitive activation theory of stress, the science of unitary human beings, and the self-object hypothesis are additional frameworks mentioned by some researchers. We also discuss concepts of neurobiological transduction connecting mechanisms to AAI benefits. Future researchers should base study design on theories with testable hypotheses and use consistent terminology to report results. This review aids progress toward a unified theoretical framework and toward more holistic cancer treatments. DA - 2020/7// PY - 2020/7// DO - 10.1177/1534735420943269 VL - 19 SP - SN - 1552-695X KW - animal-assisted interventions KW - animal-assisted activities KW - animal-assisted therapy KW - oncology KW - cancer KW - human-animal bond KW - mechanisms KW - theoretical frameworks ER - TY - JOUR TI - Quantitative assessment of linear noise-reduction filters for spectroscopy AU - Le, Long V AU - Kim, Young D. AU - Aspnes, David E. T2 - OPTICS EXPRESS AB - Linear noise-reduction filters used in spectroscopy must strike a balance between reducing noise and preserving lineshapes, the two conflicting requirements of interest. Here, we quantify this tradeoff by capitalizing on Parseval’s Theorem to cast two measures of performance, mean-square error (MSE) and noise, into reciprocal- (Fourier-) space (RS). The resulting expressions are simpler and more informative than those based in direct- (spectral-) space (DS). 
These results provide quantitative insight not only into the effectiveness of different linear filters, but also information as to how they can be improved. Surprisingly, the rectangular (“ideal” or “brick wall”) filter is found to be nearly optimal, a consequence of eliminating distortion in low-order Fourier coefficients where the major fraction of spectral information is contained. Using the information provided by the RS version of MSE, we develop a version that is demonstrably superior to the brick-wall and also the Gauss-Hermite filter, its former nearest competitor. DA - 2020/12/21/ PY - 2020/12/21/ DO - 10.1364/OE.411768 VL - 28 IS - 26 SP - 38917-38933 SN - 1094-4087 ER - TY - JOUR TI - Polynomial Treedepth Bounds in Linear Colorings AU - Kun, Jeremy AU - O'Brien, Michael P. AU - Pilipczuk, Marcin AU - Sullivan, Blair D. T2 - ALGORITHMICA AB - Low-treedepth colorings are an important tool for algorithms that exploit structure in classes of bounded expansion; they guarantee subgraphs that use few colors have bounded treedepth. These colorings have an implicit tradeoff between the total number of colors used and the treedepth bound, and prior empirical work suggests that the former dominates the run time of existing algorithms in practice. We introduce p-linear colorings as an alternative to the commonly used p-centered colorings. They can be efficiently computed in bounded expansion classes and use at most as many colors as p-centered colorings. Although a set of k < p colors from a p-centered coloring induces a subgraph of treedepth at most k, the same number of colors from a p-linear coloring may induce subgraphs of larger treedepth. We establish a polynomial upper bound on the treedepth in general graphs, and give tighter bounds in trees and interval graphs via constructive coloring algorithms. We also give a co-NP-completeness reduction for recognizing p-linear colorings and discuss ways to overcome this limitation in practice. 
DA - 2020/// PY - 2020/// DO - 10.1007/s00453-020-00760-0 KW - Linear colorings KW - p-centered colorings KW - Bounded expansion KW - Treedepth ER - TY - JOUR TI - TADOC: Text analytics directly on compression AU - Zhang, Feng AU - Zhai, Jidong AU - Shen, Xipeng AU - Wang, Dalin AU - Chen, Zheng AU - Mutlu, Onur AU - Chen, Wenguang AU - Du, Xiaoyong T2 - VLDB JOURNAL AB - This article provides a comprehensive description of text analytics directly on compression (TADOC), which enables direct document analytics on compressed textual data. The article explains the concept of TADOC and the challenges to its effective realizations. Additionally, a series of guidelines and technical solutions that effectively address those challenges, including the adoption of a hierarchical compression method and a set of novel algorithms and data structure designs, are presented. Experiments on six data analytics tasks of various complexities show that TADOC can save 90.8% storage space and 87.9% memory usage, while halving data processing times. DA - 2020/// PY - 2020/// DO - 10.1007/s00778-020-00636-3 KW - Text analytics KW - Document analytics KW - Compression KW - Sequitur ER - TY - JOUR TI - Efficient algorithms for finding 2-medians of a tree AU - Oudjit, Aissa AU - Stallmann, Matthias T2 - NETWORKS AB - The p‐median problem for networks is NP‐hard, but polynomial time algorithms exist for trees (n is the number of nodes): O(pn^2) by Tamir, and O(n lg^(p+2) n) by Benkoczi and Bhattacharya. Goldman gave an O(n) algorithm for the 1‐median problem on trees. Mirchandani and Oudjit proved localization properties for 2‐medians on trees; these were later used to obtain an O(n lg n) bound, and, in special cases, O(n). We present a framework that unifies all efficient algorithms for the 2‐median problem on trees. Our framework isolates the nonlinear part of the computation so that future time‐bound improvements are easily incorporated. 
We also introduce a method for reducing the search space, improving all known runtimes in many instances. Finally, we give a new algorithm for the case where edge lengths are positive integers. The associated time bound is O(n + D), where D is the sum of the logarithms of edge lengths. This is O(n) if edge lengths are bounded by a constant and O(n lg lg n) if they are O(lg n). The algorithm is flexible enough to extend to noninteger edge lengths, preserving the time bound in some circumstances. DA - 2020/// PY - 2020/// DO - 10.1002/net.21978 VL - 9 KW - 2-median KW - binary search KW - linear time KW - priority queue KW - sorting KW - trees ER - TY - JOUR TI - The one-dimensional Green-Naghdi equations with a time dependent bottom topography and their conservation laws AU - Kaptsov, E. AU - Meleshko, S. AU - Samatova, N. F. T2 - PHYSICS OF FLUIDS AB - This paper deals with the one-dimensional Green–Naghdi equations describing the behavior of fluid flow over an uneven bottom topography depending on time. Using Matsuno’s approach, the corresponding equations are derived in Eulerian coordinates. Further study is performed in Lagrangian coordinates. This study allowed us to find the general form of the Lagrangian corresponding to the analyzed equations. Then, Noether’s theorem is used to derive conservation laws. As some of the tools in the application of Noether’s theorem are admitted generators, a complete group classification of the Green–Naghdi equations with respect to the bottom depending on time is performed. Using Noether’s theorem, the found Lagrangians, and the group classification, conservation laws of the one-dimensional Green–Naghdi equations with uneven bottom topography depending on time are obtained. DA - 2020/12/1/ PY - 2020/12/1/ DO - 10.1063/5.0031238 VL - 32 IS - 12 SP - SN - 1089-7666 ER - TY - JOUR TI - Congestion Minimization for Service Chain Routing Problems With Path Length Considerations AU - Gao, Lingnan AU - Rouskas, George N. 
T2 - IEEE-ACM TRANSACTIONS ON NETWORKING AB - Network function virtualization (NFV), with its perceived potential to accelerate service deployment and to introduce flexibility in service provisioning, has drawn a growing interest from industry and academia alike over the past few years. One of the key challenges in realizing NFV is the service chain routing problem, whereby traffic must be routed so as to traverse the various components of a network service that have been mapped onto the underlying network. In this work, we consider the online service chain routing problem. We route the service chain with the goal of jointly minimizing the maximum network congestion and the number of hops from the source to the destination. To this end, we present a simple yet effective online algorithm in which the routing decision is irrevocably made without prior knowledge of future requests. We prove that our algorithm is O(log m)-competitive in terms of congestion minimization, where m is the number of edges of the underlying network topology, and we show that this ratio is asymptotically optimal. DA - 2020/12// PY - 2020/12// DO - 10.1109/TNET.2020.3017792 VL - 28 IS - 6 SP - 2643-2656 SN - 1558-2566 UR - https://doi.org/10.1109/TNET.2020.3017792 KW - Routing KW - Heuristic algorithms KW - Minimization KW - Network function virtualization KW - Resource management KW - IEEE transactions KW - Noise measurement KW - Network function virtualization KW - virtual network functions KW - NFV orchestration KW - online algorithm KW - resource allocation ER - TY - CONF TI - Enabling Efficient Random Access to Hierarchically-Compressed Data AU - Zhang, Feng AU - Zhai, Jidong AU - Shen, Xipeng AU - Mutlu, Onur AU - Du, Xiaoyong AB - Recent studies have shown the promise of direct data processing on hierarchically-compressed text documents. By removing the need for decompressing data, the direct data processing technique brings large savings in both time and space. 
However, its benefits have been limited to data traversal operations; for random accesses, direct data processing is several times slower than the state-of-the-art baselines. This paper presents a set of techniques that successfully eliminate the limitation, and for the first time, establishes the feasibility of effectively handling both data traversal operations and random data accesses on hierarchically-compressed data. The work yields a new library, which achieves 3.1× speedup over the state-of-the-art on random data accesses to compressed data, while preserving the capability of supporting traversal operations efficiently and providing large (3.9×) space savings. C2 - 2020/// C3 - 2020 IEEE 36th International Conference on Data Engineering (ICDE) DA - 2020/// DO - 10.1109/ICDE48307.2020.00097 SP - 1069-1080 ER - TY - JOUR TI - DIAC An Inter-app Conflicts Detector for Open IoT Systems AU - Li, Xinyi AU - Zhang, Lei AU - Shen, Xipeng T2 - ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS AB - This article tackles the problem of detecting and solving potential conflicts among independently developed apps that are to be installed into an open Internet-of-Things (IoT) environment. It provides a new set of definitions and categorizations of the conflicts to more precisely characterize the nature of the problem, and it proposes a representation named “IA Graphs” for formally representing IoT controls and inter-app interplays. Based on the definitions and representation, it then describes an efficient conflict detection algorithm. Combining conflict categories, seriousness indicator, and conflict frequency, an innovative solution policy for solving various detected conflicts is developed, which also takes into account user preference and interest by providing interactive process. It implements a compiler and runtime software system that integrates all the proposed techniques together into a comprehensive solution. 
Experiments on SmartThings apps validate its significantly better detection efficacy over prior methods and effectiveness of conflict solution with user preference. DA - 2020/11// PY - 2020/11// DO - 10.1145/3391895 VL - 19 IS - 6 SP - SN - 1558-3465 KW - IoT KW - compiler KW - conflicts detection ER - TY - JOUR TI - Automated real-time anomaly detection of temperature sensors through machine-learning AU - Nayak, Debanjana AU - Perros, Harry T2 - INTERNATIONAL JOURNAL OF SENSOR NETWORKS AB - Fast identification of faulty sensors is necessary for guaranteeing their robust functions in diverse applications ranging from extreme weather prediction to energy saving to healthcare. We present an automated machine-learning based framework that can detect anomalies of temperature sensor data in real-time. We adopted a purely temporal approach that utilises a univariate time-series (UTS) generated by a single sensor. The framework divides the UTS into subsequences, models each subsequence stochastically as an autoregressive function, and finally mines the function parameters with a one-class support vector machines (OC-SVM) that classifies any outlier as an anomaly. Extensive experimentation showed that the framework identifies both normal and anomalous data correctly with high degrees of accuracy. DA - 2020/// PY - 2020/// DO - 10.1504/IJSNET.2020.111233 VL - 34 IS - 3 SP - 137-152 SN - 1748-1287 KW - UTS KW - univariate time-series KW - anomaly detection KW - temperature sensors KW - OC-SVMs KW - one-class support vector machines KW - autoregression ER - TY - JOUR TI - Some Stress Is Good Stress: The Challenge-Hindrance Framework, Academic Self-Efficacy, and Academic Outcomes AU - Travis, Justin AU - Kaszycki, Alyssa AU - Geden, Michael AU - Bunde, James T2 - JOURNAL OF EDUCATIONAL PSYCHOLOGY AB - Historically, most investigations involving stress have assumed its undesirability, and deleterious effects have been identified across a variety of domains. 
Recently, however, researchers in management and health have differentiated between types of stress, and revealed a more complicated picture as a result. Specifically, stressors perceived as goal-relevant and manageable (i.e., challenging) are thought to increase motivation, performance, and well-being, while stressors viewed as goal-relevant but unmanageable (i.e., hindering) are believed to hamper performance and occasion maladaptive behaviors. Empirical support for this theoretical framework has accumulated in employment settings, but the model has yet to be adequately extended to higher education. The current study used a longitudinal design and multiple academic outcomes to explore the challenge-hindrance distinction in a large, diverse student sample. Students from 2 Southeastern institutions (N = 853) were assessed for challenge stress (e.g., class difficulty, high expectations), hindrance stress (e.g., ambiguous expectations, favoritism), academic self-efficacy (ASE), grade point average (GPA), hours withdrawn, and transfer intentions. Results were generally theory-consistent, as ratings of challenge and hindrance stress were associated with positive and negative academic outcomes, respectively. ASE did not moderate the challenge–GPA relationship, but emerged as an independent predictor of academic functioning. Implications for stress researchers, educators, and academic decision-makers are discussed. (PsycInfo Database Record (c) 2020 APA, all rights reserved) DA - 2020/11// PY - 2020/11// DO - 10.1037/edu0000478 VL - 112 IS - 8 SP - 1632-1643 SN - 1939-2176 KW - stressors KW - self-efficacy KW - academic performance ER - TY - CONF TI - GIS-Based Estimation of Seasonal Solar Energy Potential for Parking Lots and Roads AU - Vivek Nanda, V.M. AU - Tateosian, L. AU - Baran, P. 
T2 - 2020 IEEE Green Technologies Conference C2 - 2020/// C3 - IEEE Greentech Conference Proceedings CY - Oklahoma City, OK DA - 2020/// PY - 2020/4/1/ ER - TY - CONF TI - A Review of Geospatial Content in IEEE Visualization Publications AU - Yoshizumi, A. AU - Coffer, M. AU - Collins, E. AU - Gaines, M. AU - Gao, X. AU - Jones, K. AU - McGregor, I. AU - McQuillan, K. AU - Perin, V. AU - Worm, T. AU - Tomkins, L. AU - Tateosian, L. T2 - 2020 IEEE Visualization Conference C2 - 2020/10/20/ C3 - Proceedings IEEE Visualization 2020 CY - Salt Lake City, Utah DA - 2020/10/20/ PY - 2020/10/25/ ER - TY - JOUR TI - Evolution of novel activation functions in neural network training for astronomy data: habitability classification of exoplanets AU - Saha, Snehanshu AU - Nagaraj, Nithin AU - Mathur, Archana AU - Yedida, Rahul AU - Sneha, H. R. T2 - EUROPEAN PHYSICAL JOURNAL-SPECIAL TOPICS AB - We present analytical exploration of novel activation functions as consequence of integration of several ideas leading to implementation and subsequent use in habitability classification of exoplanets. Neural networks, although a powerful engine in supervised methods, often require expensive tuning efforts for optimized performance. Habitability classes are hard to discriminate, especially when attributes used as hard markers of separation are removed from the data set. The solution is approached from the point of investigating analytical properties of the proposed activation functions. The theory of ordinary differential equations and fixed point are exploited to justify the "lack of tuning efforts" to achieve optimal performance compared to traditional activation functions. Additionally, the relationship between the proposed activation functions and the more popular ones is established through extensive analytical and empirical evidence. Finally, the activation functions have been implemented in plain vanilla feed-forward neural network to classify exoplanets. 
DA - 2020/11// PY - 2020/11// DO - 10.1140/epjst/e2020-000098-9 VL - 229 IS - 16 SP - 2629-2738 SN - 1951-6401 UR - https://doi.org/10.1140/epjst/e2020-000098-9 ER - TY - JOUR TI - Robust Revocable Anonymous Authentication for Vehicle to Grid Communications AU - Kilari, Vishnu T. AU - Yu, Ruozhou AU - Misra, Satyajayant AU - Xue, Guoliang T2 - IEEE Transactions on Intelligent Transportation Systems (T-ITS) AB - Electric vehicles can place a significant load on the power grid due to their unscheduled charging events. One way of improving power grid stability is to schedule electric vehicle charging in advance. Before a charging visit, the electric vehicle provides necessary information to request for charging at a charging station, which prepares and reserves the energy before the visit. However, the reported information can cause privacy leakage of the electric vehicle user. Anonymous information reporting can protect user privacy, but also enables attacks on the charging station by unauthorized users. An anonymous authentication system can address these issues, but cannot detect misbehaviors by authenticated users. One remedy to this is revocable anonymity-based authentication, which can revoke the anonymity of malicious users after their misbehaviors. However, we show that such a system is still vulnerable to application-level Denial of Service attacks, where a malicious user requests for large amounts of energy simultaneously from many charging stations, preventing these stations from serving other users. To address this, we improve upon an existing revocable anonymity-based authentication framework. We propose a permit-based mechanism, where each electric vehicle is only issued with one blind signature-based permit at a time. A request is valid only if it contains a valid and unused permit, which protects the system from the application-level Denial of Service attacks. 
Security analysis and experiments demonstrate that our framework, while ensuring user anonymity and being robust to the aforementioned attack, is also scalable and lightweight. DA - 2020/11// PY - 2020/11// DO - 10.1109/tits.2019.2948803 VL - 21 IS - 11 SP - 4845–4857 KW - Smart grid KW - V2G communications KW - anonymous authentication KW - revocable anonymity ER - TY - JOUR TI - Protocols Over Things: A Decentralized Programming Model for the Internet of Things AU - V, Samuel H. Christie AU - Smirnova, Daria AU - Chopra, Amit K. AU - Singh, Munindar P. T2 - Computer AB - Current programming models for developing Internet of Things (IoT) applications are logically centralized and ill-suited for most IoT applications. We contribute Protocols over Things, a decentralized programming model that represents an IoT application via a protocol between the parties involved and provides improved performance over network-level delivery guarantees. DA - 2020/12// PY - 2020/12// DO - 10.1109/MC.2020.3023887 VL - 53 IS - 12 SP - 60-68 UR - https://doi.org/10.1109/MC.2020.3023887 KW - Protocols KW - Internet of Things KW - Logistics KW - Programming KW - Correlation KW - Decision making ER - TY - CHAP TI - Traffic Grooming AU - Dutta, Rudra AU - Harai, Hiroaki T2 - Springer Handbook of Optical Networks AB - A particular thread of research in optical networking that is concerned with the efficient assignment of traffic demands to available network bandwidth became known as traffic grooming in the mid-1990s. Initially motivated by the distinctly different network characteristics of optical and electronic communication channels, the area focused on how subwavelength traffic components were to be mapped to wavelength communication channels, such that the need to convert traffic back to the electronic domain at intermediate network nodes, for the purpose of differential routing, was minimized. 
Over time, it broadened to include joint considerations with other network design goals and constraints. It was influenced in turn by existing technology limitations, and in turn served to influence continuing technology trends. Traffic grooming has had a significant effect on both the research and practice of transport networking. It continues to be a meaningful area not just in historical terms, but as a wealth of techniques that can be called upon for considering the traffic engineering problem afresh as each new development at the optical layer, or change in economic realities of networking equipment or traffic requirements, redefines the conditions of that problem. PY - 2020/// DO - 10.1007/978-3-030-16250-4_14 SP - 513-534 PB - Springer International Publishing UR - https://doi.org/10.1007/978-3-030-16250-4_14 ER - TY - JOUR TI - Advanced Wireless for Unmanned Aerial Systems: 5G Standardization, Research Challenges, and AERPAW Architecture AU - Marojevic, Vuk AU - Guvenc, Ismail AU - Dutta, Rudra AU - Sichitiu, Mihail L. AU - Floyd, Brian A. T2 - IEEE Vehicular Technology Magazine AB - The 5G mobile communications systems merge traditionally separate communications and networking systems and services to effectively support a myriad of heterogeneous applications. Researchers and industry working groups are investigating the integration of aerial nodes, shared spectrum techniques, and new network architectures, which are gradually being introduced into standards. This article discusses relevant standardization efforts for the integration of unmanned aerial systems (UASs) into 5G and the requirements for an aerial wireless testbed. We introduce the aerial experimentation and research platform for advanced wireless (AERPAW) and, specifically, its architecture, which is designed for enabling experimental research in controlled yet production-like environments. 
Sample research projects and trials show the critical R&D needs, broad scope, and impact that such a platform can have on technology evolution, regulation, and standardization as well as future services. DA - 2020/6// PY - 2020/6// DO - 10.1109/MVT.2020.2979494 VL - 15 IS - 2 SP - 22-30 UR - https://doi.org/10.1109/MVT.2020.2979494 KW - 5G mobile communication KW - Wireless communication KW - Software KW - Long Term Evolution KW - 3GPP KW - Three-dimensional displays ER - TY - JOUR TI - Studying Programming in the Neuroage: Just a Crazy Idea? AU - Siegmund, Janet AU - Peitek, Norman AU - Brechmann, Andre AU - Parnin, Chris AU - Apel, Sven T2 - COMMUNICATIONS OF THE ACM AB - Programming research has entered the Neuroage. DA - 2020/6// PY - 2020/6// DO - 10.1145/3347093 VL - 63 IS - 6 SP - 30-34 SN - 1557-7317 ER - TY - JOUR TI - iSENSE2.0: Improving Completion-aware Crowdtesting Management with Duplicate Tagger and Sanity Checker AU - Wang, Junjie AU - Yang, Ye AU - Menzies, Tim AU - Wang, Qing T2 - ACM TRANSACTIONS ON SOFTWARE ENGINEERING AND METHODOLOGY AB - Software engineers get questions of “how much testing is enough” on a regular basis. Existing approaches in software testing management employ experience-, risk-, or value-based analysis to prioritize and manage testing processes. However, very few are applicable to the emerging crowdtesting paradigm to cope with extremely limited information and control over unknown, online crowdworkers. In practice, deciding when to close a crowdtesting task is largely done by experience-based guesswork and frequently results in ineffective crowdtesting. More specifically, it is found that an average of 32% testing cost was wasteful spending in current crowdtesting practice. This article intends to address this challenge by introducing automated decision support for monitoring and determining appropriate time to close crowdtesting tasks. 
To that end, it first investigates the necessity and feasibility of close prediction of crowdtesting tasks based on an industrial dataset. Next, it proposes a close prediction approach named iSENSE2.0, which applies an incremental sampling technique to process crowdtesting reports arriving in chronological order and organizes them into fixed-sized groups as dynamic inputs. Then, a duplicate tagger analyzes the duplicate status of received crowd reports, and a CRC-based (Capture-ReCapture) close estimator generates the close decision based on the dynamic bug arrival status. In addition, a coverage-based sanity checker is designed to reinforce the stability and performance of close prediction. Finally, the evaluation of iSENSE2.0 is conducted on 56,920 reports of 306 crowdtesting tasks from one of the largest crowdtesting platforms. The results show that a median of 100% of bugs can be detected with 30% saved cost. The performance of iSENSE2.0 does not differ significantly from that of the state-of-the-art approach iSENSE, while the latter relies on the duplicate tag, which is generally considered time-consuming and tedious to obtain. DA - 2020/10// PY - 2020/10// DO - 10.1145/3394602 VL - 29 IS - 4 SP - SN - 1557-7392 UR - https://doi.org/10.1145/3394602 KW - Crowdsourced testing KW - test management KW - close prediction KW - term coverage KW - capture-recapture ER - TY - JOUR TI - Avoiding Help Avoidance: Using Interface Design Changes to Promote Unsolicited Hint Usage in an Intelligent Tutor AU - Maniktala, Mehak AU - Cody, Christa AU - Barnes, Tiffany AU - Chi, Min T2 - INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE IN EDUCATION AB - Within intelligent tutoring systems, considerable research has investigated hints, including how to generate data-driven hints, what hint content to present, and when to provide hints for optimal learning outcomes. However, less attention has been paid to how hints are presented.
In this paper, we propose a new hint delivery mechanism called “Assertions” for providing unsolicited hints in a data-driven intelligent tutor. Assertions are partially-worked example steps designed to appear within a student workspace, and in the same format as student-derived steps, to show students a possible subgoal leading to the solution. We hypothesized that Assertions can help address the well-known hint avoidance problem. In systems that only provide hints upon request, hint avoidance results in students not receiving hints when they are needed. Our unsolicited Assertions do not seek to improve student help-seeking, but rather seek to ensure students receive the help they need. We contrast Assertions with Messages, text-based, unsolicited hints that appear after student inactivity. Our results show that Assertions significantly increase unsolicited hint usage compared to Messages. Further, they show a significant aptitude-treatment interaction between Assertions and prior proficiency, with Assertions leading students with low prior proficiency to generate shorter (more efficient) posttest solutions faster. We also present a clustering analysis that shows patterns of productive persistence among students with low prior knowledge when the tutor provides unsolicited help in the form of Assertions. Overall, this work provides encouraging evidence that hint presentation can significantly impact how students use them and using Assertions can be an effective way to address help avoidance. DA - 2020/11// PY - 2020/11// DO - 10.1007/s40593-020-00213-3 VL - 30 IS - 4 SP - 637-667 SN - 1560-4306 KW - Intelligent tutoring system KW - Help avoidance KW - User experience KW - Unsolicited hints KW - Aptitude-treatment interaction KW - Logic proofs KW - Productive persistence KW - Clustering KW - problem solving ER - TY - CONF TI - New Foundations of Ethical Multiagent Systems AU - Murukannaiah, Pradeep K. AU - Ajmeri, Nirav AU - Jonker, Catholijn M. 
AU - Singh, Munindar P. C2 - 2020/5/1/ C3 - Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS) DA - 2020/5/1/ SP - 1706-1710 PB - International Foundation for Autonomous Agents and MultiAgent Systems UR - https://research-information.bris.ac.uk/en/publications/766baf8c-9f8a-4311-89f4-9b436a30d64c N1 - Blue Sky Ideas Track RN - Blue Sky Ideas Track ER - TY - CONF TI - Elessar: Ethics in Norm-Aware Agents AU - Ajmeri, Nirav AU - Guo, Hui AU - Murukannaiah, Pradeep K. AU - Singh, Munindar P. C2 - 2020/5/1/ C3 - Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems (AAMAS) DA - 2020/5/1/ SP - 16-24 PB - International Foundation for Autonomous Agents and MultiAgent Systems UR - https://research-information.bris.ac.uk/en/publications/fb85ded1-f0d2-4ef2-93bb-4a15fccafc6a ER - TY - JOUR TI - Elementary Students' Understanding of CS Terms AU - Vandenberg, Jessica AU - Tsan, Jennifer AU - Boulden, Danielle AU - Zakaria, Zarifa AU - Lynch, Collin AU - Boyer, Kristy Elizabeth AU - Wiebe, Eric T2 - ACM TRANSACTIONS ON COMPUTING EDUCATION AB - The language and concepts used by curriculum designers are not always interpreted by children as designers intended. This can be problematic when researchers use self-reported survey instruments in concert with curricula, which often rely on the implicit belief that students’ understanding aligns with their own. We report on our refinement of a validated survey to measure upper elementary students’ attitudes and perspectives about computer science (CS), using an iterative, design-based research approach informed by educational and psychological cognitive interview processes. We interviewed six groups of students over three iterations of the instrument on their understanding of CS concepts and attitudes toward coding. Our findings indicated that students could not explain the terms computer programs or computer science as expected.
Furthermore, they struggled to understand how coding may support their learning in other domains. These results may guide the development of appropriate CS-related survey instruments and curricular materials for K–6 students. DA - 2020/9// PY - 2020/9// DO - 10.1145/3386364 VL - 20 IS - 3 SP - SN - 1946-6226 KW - Cognitive interviewing KW - elementary KW - computer science KW - instrument development ER - TY - JOUR TI - Ethics in Self-* Sociotechnical Systems (Tutorial Abstract) AU - Ajmeri, Nirav AU - Murukannaiah, Pradeep K. AU - Singh, Munindar P. T2 - 2020 IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING AND SELF-ORGANIZING SYSTEMS COMPANION (ACSOS-C 2020) AB - The surprising capabilities demonstrated by AI technologies overlaid on detailed data and fine-grained control give cause for concern that agents can wield enormous power over human welfare, drawing increasing attention to ethics in AI. DA - 2020/// PY - 2020/// DO - 10.1109/ACSOS-C51401.2020.00070 SP - 249-249 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85092729693&partnerID=MN8TOARS ER - TY - JOUR TI - Robust Group Subspace Recovery: A New Approach for Multi-Modality Data Fusion AU - Ghanem, Sally AU - Panahi, Ashkan AU - Krim, Hamid AU - Kerekes, Ryan A. T2 - IEEE SENSORS JOURNAL AB - Robust Subspace Recovery (RoSuRe) algorithm was recently introduced as a principled and numerically efficient algorithm that unfolds underlying Unions of Subspaces (UoS) structure, present in the data. The union of Subspaces (UoS) is capable of identifying more complex trends in data sets than simple linear models. We build on and extend RoSuRe to prospect the structure of different data modalities individually. We propose a novel multi-modal data fusion approach based on group sparsity which we refer to as Robust Group Subspace Recovery (RoGSuRe). 
Relying on a bi-sparsity pursuit paradigm and non-smooth optimization techniques, the introduced framework learns a new joint representation of the time series from different data modalities, respecting an underlying UoS model. We subsequently integrate the obtained structures to form a unified subspace structure. The proposed approach exploits the structural dependencies between the different modalities data to cluster the associated target objects. The resulting fusion of the unlabeled sensors' data from experiments on audio and magnetic data has shown that our method is competitive with other state of the art subspace clustering methods. The resulting UoS structure is employed to classify newly observed data points, highlighting the abstraction capacity of the proposed method. DA - 2020/// PY - 2020/// DO - 10.1109/JSEN.2020.2999461 VL - 20 IS - 20 SP - 12307-12316 SN - 1558-1748 KW - Sensor fusion KW - Data integration KW - Data models KW - Sparse matrices KW - Magnetic sensors KW - Sensor phenomena and characterization KW - Sparse learning KW - unsupervised classification KW - data fusion KW - multimodal data ER - TY - JOUR TI - An Improved Broadcast Authentication Protocol for Wireless Sensor Networks Based on the Self-Reinitializable Hash Chains AU - Huang, Haiping AU - Huang, Qinglong AU - Xiao, Fu AU - Wang, Wenming AU - Li, Qi AU - Dai, Ting T2 - SECURITY AND COMMUNICATION NETWORKS AB - Broadcast authentication is a fundamental security primitive in wireless sensor networks (WSNs), which is a critical sensing component of IoT. Although symmetric-key-based μTESLA protocol has been proposed, some concerns about the difficulty of predicting the network lifecycle in advance and the security problems caused by an overlong hash chain still remain.
This paper presents a scalable broadcast authentication scheme named DH-μTESLA, which is an extension and improvement of μTESLA and Multilevel μTESLA, to achieve several vital properties, such as infinite lifecycle of hash chains, security authentication, scalability, and strong tolerance of message loss. The proposal consists of the (t,n)-threshold-based self-reinitializable hash chain scheme (SRHC-TD) and the d-left-counting-Bloom-filter-based authentication scheme (AdlCBF). In comparison to other broadcast authentication protocols, our proposal achieves more security properties such as fresh node’s participation and DoS resistance. Furthermore, the reinitializable hash chain constructed in SRHC-TD is proved to be secure and has less computation and communication overhead compared with typical solutions, and efficient storage is realized based on AdlCBF, which can also defend against DoS attacks. DA - 2020/9/1/ PY - 2020/9/1/ DO - 10.1155/2020/8897282 VL - 2020 SP - SN - 1939-0122 ER - TY - JOUR TI - Minimizing Inter-Core Crosstalk Jointly in Spatial, Frequency, and Time Domains for Scheduled Lightpath Demands in Multi-Core Fiber-based Elastic Optical Network AU - Tang, Fengxian AU - Li, Yongcheng AU - Shen, Gangxiang AU - Rouskas, George N. T2 - JOURNAL OF LIGHTWAVE TECHNOLOGY AB - Elastic optical networks (EON) technology in combination with space division multiplexing (SDM) is considered as having the potential to expand the transmission capacity of optical transport networks. However, inter-core crosstalk may cause serious signal impairment in multi-core fiber (MCF) links. At the same time, scheduled lightpath demands, for which the expected setup and teardown times are known in advance, are considered as an important type of traffic demand for future networks. In this article, we develop approaches to schedule simultaneous lightpaths onto non-adjacent MCF cores so as to reduce inter-core crosstalk between these lightpaths.
To this end, we first define a new metric to estimate the inter-core crosstalk jointly considering the spatial, frequency, and time domains. We then tackle the routing, spectrum, core, and time assignment (RSCTA) problem for the MCF-based EON by developing an integer linear programming (ILP) model, as well as an auxiliary graph (AG) based heuristic algorithm, which jointly optimize spectrum resource utilization and reduce the lightpath inter-core crosstalk. Simulation studies show the effectiveness of the proposed approach in terms of both performance aspects. In addition, the performance of the proposed heuristic algorithm is shown to be close to that of the ILP model in small networks. DA - 2020/10/15/ PY - 2020/10/15/ DO - 10.1109/JLT.2020.3004138 VL - 38 IS - 20 SP - 5595-5607 SN - 1558-2213 UR - https://doi.org/10.1109/JLT.2020.3004138 KW - Crosstalk KW - Optical fiber networks KW - Heuristic algorithms KW - Routing KW - Resource management KW - Time-domain analysis KW - Inter-core crosstalk KW - RSCTA KW - MCF-based EON KW - scheduled lightpath demand ER - TY - JOUR TI - Bimodal affect recognition based on autoregressive hidden Markov models from physiological signals AU - Akbulut, Fatma Patlar AU - Perros, Harry G. AU - Shahzad, Muhammad T2 - COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE AB - Background and objective: Affect provides contextual information about the emotional state of a person as he/she communicates in both verbal and/or non-verbal forms. While humans are great at determining the emotional state of people while they communicate in person, it is challenging and still largely an unsolved problem to computationally determine the emotional state of a person.
Methods: Emotional states of a person manifest in the physiological biosignals such as electrocardiogram (ECG) and electrodermal activity (EDA) because these signals are impacted by the peripheral nervous system of the body, and the peripheral nervous system is strongly coupled with the mental state of the person. In this paper, we present a method to accurately recognize six emotions using ECG and EDA signals and applying autoregressive hidden Markov models (AR-HMMs) and heart rate variability analysis on these signals. The six emotions include happiness, sadness, surprise, fear, anger, and disgust. Results: We evaluated our method on a comprehensive new dataset collected from 30 participants. Our results show that our proposed method achieves an average accuracy of 88.6% in distinguishing across the 6 emotions. Conclusions: The key technical depth of the paper is in the use of the AR-HMMs to model the EDA signal and the use of LDA to enable accurate emotion recognition without requiring a large number of training samples. Unlike other studies, we have taken a hierarchical approach to classify emotions, where we first categorize the emotion as either positive or negative and then identify the exact emotion. DA - 2020/10// PY - 2020/10// DO - 10.1016/j.cmpb.2020.105571 VL - 195 SP - SN - 1872-7565 KW - Affect recognition KW - Autoregressive hidden Markov models KW - Machine learning KW - Biosignals KW - Heart rate variability ER - TY - CONF TI - Unproductive Help-seeking in Programming: What it is and How to Address it AB - While programming, novices often lack the ability to effectively seek help, such as when to ask for a hint or feedback. Students may avoid help when they need it, or abuse help to avoid putting in effort, and both behaviors can impede learning. In this paper we present two main contributions. 
First, we investigated log data from students working in a programming environment that offers automated hints, and we propose a taxonomy of unproductive help-seeking behaviors in programming. Second, we used these findings to design a novel user interface for hints that subtly encourages students to seek help with the right frequency, estimated with a data-driven algorithm. We conducted a pilot study to evaluate our data-driven (DD) hint display, compared to a traditional interface, where students request hints on-demand as desired. We found students with the DD display were less than half as likely to engage in unproductive help-seeking, and we found suggestive evidence that this may improve their learning. C2 - 2020/6/15/ C3 - Proceedings of the 2020 ACM Conference on Innovation and Technology in Computer Science Education DA - 2020/6/15/ DO - 10.1145/3341525.3387394 UR - http://dx.doi.org/10.1145/3341525.3387394 ER - TY - JOUR TI - Community Detection and Improved Detectability in Multiplex Networks AU - Huang, Yuming AU - Panahi, Ashkan AU - Krim, Hamid AU - Dai, Liyi T2 - IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING AB - We investigate the widely encountered problem of detecting communities in multiplex networks, such as social networks, with an unknown arbitrary heterogeneous structure. To improve detectability, we propose a generative model that leverages the multiplicity of a single community in multiple layers, with no prior assumption on the relation of communities among different layers. Our model relies on a novel idea of incorporating a large set of generic localized community label constraints across the layers, in conjunction with the celebrated Stochastic Block Model (SBM) in each layer. Accordingly, we build a probabilistic graphical model over the entire multiplex network by treating the constraints as Bayesian priors. 
We mathematically prove that these constraints/priors promote existence of identical communities across layers without introducing further correlation between individual communities. The constraints are further tailored to render a sparse graphical model and the numerically efficient Belief Propagation algorithm is subsequently employed. We further demonstrate by numerical experiments that in the presence of consistent communities between different layers, consistent communities are matched, and the detectability is improved over a single layer. We compare our model with a “correlated model” which exploits the prior knowledge of community correlation between layers. Similar detectability improvement is obtained under such a correlation, even though our model relies on much milder assumptions than the correlated model. Our model even shows a better detection performance over a certain correlation and signal to noise ratio (SNR) range. In the absence of community correlation, the correlation model naturally fails, while ours maintains its performance. DA - 2020/// PY - 2020/// DO - 10.1109/TNSE.2019.2949036 VL - 7 IS - 3 SP - 1697-1709 SN - 2327-4697 KW - Multiplexing KW - Stochastic processes KW - Belief propagation KW - Correlation KW - Periodic structures KW - Computational modeling KW - Bayes methods KW - Network theory (graphs) KW - graphical models KW - belief propagation ER - TY - JOUR TI - Adaptive Metrics for Adaptive Samples AU - Cavanna, Nicholas J. AU - Sheehy, Donald R. T2 - ALGORITHMS AB - We generalize the local-feature size definition of adaptive sampling used in surface reconstruction to relate it to an alternative metric on Euclidean space. In the new metric, adaptive samples become uniform samples, making it simpler both to give adaptive sampling versions of homological inference results and to prove topological guarantees using the critical points theory of distance functions. 
This ultimately leads to an algorithm for homology inference from samples whose spacing depends on their distance to a discrete representation of the complement space. DA - 2020/8// PY - 2020/8// DO - 10.3390/a13080200 VL - 13 IS - 8 SP - SN - 1999-4893 UR - https://doi.org/10.3390/a13080200 KW - surface reconstruction KW - homology inference KW - adaptive sampling KW - topological data analysis ER - TY - JOUR TI - Group classification of the two-dimensional shallow water equations with the beta-plane approximation of coriolis parameter in Lagrangian coordinates AU - Meleshko, S. AU - Samatova, N. F. T2 - COMMUNICATIONS IN NONLINEAR SCIENCE AND NUMERICAL SIMULATION AB - Two-dimensional shallow water equations with uneven bottom and a Coriolis parameter f=f0+βy, (β ≠ 0) in mass Lagrangian coordinates are studied in this paper. The equations describing these flows are reduced to two Euler–Lagrange equations. The paper provides a complete group classification of the equations and applications of Noether’s theorem for constructing conservation laws. 
DA - 2020/11// PY - 2020/11// DO - 10.1016/j.cnsns.2020.105337 VL - 90 SP - SN - 1878-7274 KW - Lagrangian coordinates KW - Shallow water equations KW - Uneven bottom KW - Admitted Lie group ER - TY - JOUR TI - BarrierFinder: recognizing ad hoc barriers AU - Wang, Tao AU - Yu, Xiao AU - Qiu, Zhengyi AU - Jin, Guoliang AU - Mueller, Frank T2 - EMPIRICAL SOFTWARE ENGINEERING DA - 2020/11// PY - 2020/11// DO - 10.1007/s10664-020-09862-3 VL - 25 IS - 6 SP - 4676-4706 SN - 1573-7616 KW - Ad hoc synchronizations KW - Barriers KW - Program slicing KW - Symbolic execution KW - Temporal invariants ER - TY - JOUR TI - Featured Research on Equity and Sustained Participation in Engineering, Computing, and Technology AU - Barnes, Tiffany AU - Payton, Jamie AU - Washington, Nicki AU - Stukes, Felesia AU - Peterfreund, Alan AU - Dunton, Sarah T2 - COMPUTING IN SCIENCE & ENGINEERING AB - This special issue presents five invited research articles featuring distinguished contributions to the Fourth IEEE Special Technical Community on Broadening Participation (STCBP) Conference for Research on Equity and Sustained Participation in Engineering, Computing, and Technology (RESPECT 2019). The articles advance our understanding of challenges for and evidence-based approaches to advancing diversity, equity, and inclusion in computing education. DA - 2020/// PY - 2020/// DO - 10.1109/MCSE.2020.3010595 VL - 22 IS - 5 SP - 4-6 SN - 1558-366X ER - TY - JOUR TI - A Systematic Literature Review of Animal-Assisted Interventions in Oncology (Part I): Methods and Results AU - Holder, Timothy R. N. AU - Gruen, Margaret E. AU - Roberts, David L. AU - Somers, Tamara AU - Bozkurt, Alper T2 - INTEGRATIVE CANCER THERAPIES AB - Animal-assisted interventions (AAIs) use human-animal interactions to positive effect in various contexts including cancer care.
As the first installment of a 2-part series, this systematic literature review focuses on the research methods and quantitative results of AAI studies in oncology. We find methodological consistency in the use of canines as therapy animals, in the types of high-risk patients excluded from studies, and in the infection precautions taken with therapy animals throughout cancer wards. The investigated patient endpoints are not significantly affected by AAI, with the exceptions of improvements in oxygen consumption, quality of life, depression, mood, and satisfaction with therapy. The AAI field in oncology has progressed significantly since its inception and has great potential to positively affect future patient outcomes. To advance the field, future research should consistently improve the methodological design of studies, report data more completely, and focus more on the therapy animal’s well-being. DA - 2020/8// PY - 2020/8// DO - 10.1177/1534735420943278 VL - 19 SP - SN - 1552-695X KW - animal-assisted interventions KW - animal-assisted activities KW - animal-assisted therapy KW - oncology KW - cancer KW - human-animal bond KW - quantitative ER - TY - JOUR TI - Coordinating scaffolds for collaborative inquiry in a game-based learning environment AU - Saleh, Asmalina AU - Chen, Yuxin AU - Hmelo-Silver, Cindy E. AU - Glazewski, Krista D. AU - Mott, Bradford W. AU - Lester, James C. T2 - JOURNAL OF RESEARCH IN SCIENCE TEACHING AB - Collaborative inquiry learning affords educators a context within which to support understanding of scientific practices, disciplinary core ideas, and crosscutting concepts. One approach to supporting collaborative science inquiry is through problem‐based learning (PBL). However, there are two key challenges in scaffolding collaborative inquiry learning in technology rich environments. First, it is unclear how we might understand the impact of scaffolds that address multiple functions (e.g., to support inquiry and argumentation).
Second, scaffolds take different forms, further complicating how to coordinate the forms and functions of scaffolds to support effective collaborative inquiry. To address these issues, we identify two functions that needed to be scaffolded: the PBL inquiry cycle and accountable talk. We then designed predefined hard scaffolds and just‐in‐time soft scaffolds that target the regulation of collaborative inquiry processes and accountable talk. Drawing on a mixed method approach, we examine how middle school students from a rural school engaged with Crystal Island: EcoJourneys for two weeks (N=45). Findings indicate that hard scaffolds targeting the PBL inquiry process and soft scaffolds that targeted accountable talk fostered engagement in these processes. Although the one‐to‐one mapping between form and function generated positive results, additional soft scaffolds were also needed for effective engagement in collaborative inquiry, and these soft scaffolds were often contingent on hard scaffolds. Our findings have implications for how we might design the form of scaffolds across multiple functions in game‐based learning environments. DA - 2020/11// PY - 2020/11// DO - 10.1002/tea.21656 VL - 57 IS - 9 SP - 1490-1518 SN - 1098-2736 KW - collaborative inquiry learning KW - problem-based learning KW - scaffolds KW - game-based learning environments ER - TY - JOUR TI - Peeking through the Classroom Window: A Detailed Data-Driven Analysis on the Usage of a Curriculum Integrated Math Game in Authentic Classrooms AU - Shabrina, Preya AU - Akintunde, Ruth Okoilu AU - Maniktala, Mehak AU - Barnes, Tiffany AU - Lynch, Collin AU - Rutherford, Teomara T2 - LAK20: THE TENTH INTERNATIONAL CONFERENCE ON LEARNING ANALYTICS & KNOWLEDGE AB - We present a data-driven analysis that provides generalized insights of how a curriculum integrated educational math game gets used as a routinized classroom activity throughout the year in authentic primary school classrooms.
Our study relates observations from a field study on Spatial Temporal Math (ST Math) to our findings mined from ST Math students' sequential game play data. We identified features that vary across game play sessions and modeled their relationship with session performance. We also derived data-informed suggestions that may provide teachers with insights into how to design classroom game play sessions to facilitate more effective learning. DA - 2020/// PY - 2020/// DO - 10.1145/3375462.3375525 SP - 625-634 KW - Curriculum Integrated Math Games KW - Game Analytics KW - Integration ER - TY - CHAP TI - Exploring the Impact of Simple Explanations and Agency on Batch Deep Reinforcement Learning Induced Pedagogical Policies AU - Ausin, Markel Sanz AU - Maniktala, Mehak AU - Barnes, Tiffany AU - Chi, Min T2 - Lecture Notes in Computer Science AB - In recent years, Reinforcement learning (RL), especially Deep RL (DRL), has shown outstanding performance in video games from Atari, Mario, to StarCraft. However, little evidence has shown that DRL can be successfully applied to real-life human-centric tasks such as education or healthcare. Different from classic game-playing where the RL goal is to make an agent smart, in human-centric tasks the ultimate RL goal is to make the human-agent interactions productive and fruitful. Additionally, in many real-life human-centric tasks, data can be noisy and limited. As a sub-field of RL, batch RL is designed for handling situations where data is limited yet noisy, and building simulations is challenging. In two consecutive classroom studies, we investigated applying batch DRL to the task of pedagogical policy induction for an Intelligent Tutoring System (ITS), and empirically evaluated the effectiveness of induced pedagogical policies. 
In Fall 2018 (F18), the DRL policy is compared against an expert-designed baseline policy and in Spring 2019 (S19), we examined the impact of explaining the batch DRL-induced policy with student decisions and the expert baseline policy. Our results showed that 1) while no significant difference was found between the batch RL-induced policy and the expert policy in F18, the batch RL-induced policy with simple explanations significantly improved students’ learning performance more than the expert policy alone in S19; and 2) no significant differences were found between the student decision making and the expert policy. Overall, our results suggest that pairing simple explanations with induced RL policies can be an important and effective technique for applying RL to real-life human-centric tasks. PY - 2020/// DO - 10.1007/978-3-030-52237-7_38 SP - 472-485 PB - Springer International Publishing UR - https://doi.org/10.1007/978-3-030-52237-7_38 KW - Deep reinforcement learning KW - Pedagogical policy KW - Explanation ER - TY - CONF TI - Writing Effective Autograded Exercises Using Bloom's Taxonomy AU - Battestilli, Lina AU - Korkes, Sarah AB - Computer science enrollment continues to grow every year, with class sizes growing into the hundreds. Many instructors in introductory computing courses have turned to auto-graded exercises to ease grading load while still allowing students to practice concepts. As the use of auto-graders becomes more common, it is imperative to ensure that the exercise sets are being written to maximize student benefit. In this paper, we use Bloom's Taxonomy (BT) to create auto-graded exercise sets that scale up from lower to higher levels of complexity. We focus on evaluating learning efficiency, code quality, and student perception of their learning experience. We found that it takes students more submission attempts in the auto-grader when they are given BT Apply/Analyze-type questions that contain some starter code.
Students complete the auto-graded assignments with fewer submissions when there is no starter code and they have to write the solution from scratch, i.e., BT Create-type questions. However, when writing code from scratch, the students' code quality can suffer because the students are not required to actually understand the concept being tested and might be able to find a workaround to pass the tests of the auto-grader. C2 - 2020/6// C3 - 2020 ASEE Virtual Annual Conference Content Access Proceedings DA - 2020/6// DO - 10.18260/1-2--35711 PB - ASEE Conferences UR - https://doi.org/10.18260/1-2--35711 ER - TY - CONF TI - Predictive Student Modeling in Block-Based Programming Environments with Bayesian Hierarchical Models AB - Recent years have seen a growing interest in block-based programming environments for computer science education. Although block-based programming offers a gentle introduction to coding for novice programmers, introductory computer science still presents significant challenges, so there is a great need for block-based programming environments to provide students with adaptive support. Predictive student modeling holds significant potential for adaptive support in block-based programming environments because it can identify early on when a student is struggling. However, predictive student models often make a number of simplifying assumptions, such as assuming a normal response distribution or homogeneous student characteristics, which can limit the predictive performance of models. These assumptions, when invalid, can significantly reduce the predictive accuracy of student models.
C2 - 2020/7/7/ C3 - Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization DA - 2020/7/7/ DO - 10.1145/3340631.3394853 UR - http://dx.doi.org/10.1145/3340631.3394853 ER - TY - JOUR TI - The 'as code' activities: development anti-patterns for infrastructure as code AU - Rahman, Akond AU - Farhana, Effat AU - Williams, Laurie T2 - EMPIRICAL SOFTWARE ENGINEERING AB - Context: The 'as code' suffix in infrastructure as code (IaC) refers to applying software engineering activities, such as version control, to maintain IaC scripts. Without the application of these activities, defects that can have serious consequences may be introduced in IaC scripts. A systematic investigation of the development anti-patterns for IaC scripts can guide practitioners in identifying activities to avoid defects in IaC scripts. Development anti-patterns are recurring development activities that relate with defective IaC scripts. Goal: The goal of this paper is to help practitioners improve the quality of infrastructure as code (IaC) scripts by identifying development activities that relate with defective IaC scripts. Methodology: We identify development anti-patterns by adopting a mixed-methods approach, where we apply quantitative analysis with 2,138 open source IaC scripts and conduct a survey with 51 practitioners. Findings: We observe five development activities to be related with defective IaC scripts from our quantitative analysis. We identify five development anti-patterns namely, 'boss is not around', 'many cooks spoil', 'minors are spoiler', 'silos', and 'unfocused contribution'. Conclusion: Our identified development anti-patterns suggest the importance of 'as code' activities in IaC because these activities are related to quality of IaC scripts. 
DA - 2020/9// PY - 2020/9// DO - 10.1007/s10664-020-09841-8 VL - 25 IS - 5 SP - 3430-3467 SN - 1573-7616 KW - Anti-pattern KW - Bugs KW - Configuration script KW - Continuous deployment KW - Defect KW - Devops KW - Infrastructure as code KW - Practice KW - Puppet KW - Quality ER - TY - JOUR TI - Novel Geomechanics Concepts for Earthquake Excitations Applied in Time Domain AU - Haldar, Achintya AU - Gaxiola-Camacho, J. Ramon AU - Azizsoltani, Hamoon AU - Villegas-Mercado, Francisco J. AU - Vazirizade, Sayyed Mohsen T2 - INTERNATIONAL JOURNAL OF GEOMECHANICS AB - Novel geomechanics concepts for seismic design satisfying the current performance-based seismic design (PBSD) requirements are presented. Issues related to soil conditions are explicitly addressed. To satisfy the underlying dynamics as realistically as possible, structures are represented by finite elements and the earthquake excitations are applied in time domain. PBSD is essentially a sophisticated risk-based design concept. To incorporate uncertainty in the seismic loading, the current design guidelines require the consideration of at least 11 design earthquake time histories. For wider acceptance, information on the seismic risk is extracted using multiple deterministic analyses. The proposed concept is showcased by estimating the underlying risk of a nine-story steel building designed by experts for several performance levels and different soil conditions. The basic intent of PBSD is to limit the probability of collapse to about 0.10. The study confirms that the building was well designed by the experts and the proposed method confirms this requirement. The authors believe that they proposed an alternative to simulation and the classical random vibration concept. 
DA - 2020/9/1/ PY - 2020/9/1/ DO - 10.1061/(ASCE)GM.1943-5622.0001799 VL - 20 IS - 9 SP - SN - 1943-5622 KW - Earthquake engineering KW - Dynamics of geomaterials KW - Site conditions KW - Multiple deterministic analyses KW - Design earthquake time history KW - Performance-based seismic design ER - TY - JOUR TI - From Machine Ethics to Internet Ethics: Broadening the Horizon AU - Murukannaiah, Pradeep K. AU - Singh, Munindar P. T2 - IEEE INTERNET COMPUTING AB - This article introduces some of the key concepts and challenges pertaining to ethics from the standpoint of Internet applications. DA - 2020/// PY - 2020/// DO - 10.1109/MIC.2020.2989935 VL - 24 IS - 3 SP - 51-57 SN - 1941-0131 KW - Ethics KW - Internet KW - Hospitals KW - Urban areas KW - Roads KW - Cognition KW - Smart phones ER - TY - JOUR TI - Exploring neighborhoods in large metagenome assembly graphs using spacegraphcats reveals hidden sequence diversity AU - Brown, C. Titus AU - Moritz, Dominik AU - O'Brien, Michael P. AU - Reidl, Felix AU - Reiter, Taylor AU - Sullivan, Blair D. T2 - GENOME BIOLOGY AB - Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software implementation is available at https://github.com/spacegraphcats/spacegraphcats under the 3-Clause BSD License. 
DA - 2020/7/6/ PY - 2020/7/6/ DO - 10.1186/s13059-020-02066-4 VL - 21 IS - 1 SP - SN - 1474-760X UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85087661224&partnerID=MN8TOARS KW - Metagenomics KW - Sequence assembly KW - Strain variation KW - Bounded expansion KW - Dominating set ER - TY - JOUR TI - Fostering Engagement in Health Behavior Change: Iterative Development of an Interactive Narrative Environment to Enhance Adolescent Preventive Health Services AU - Ozer, Elizabeth M. AU - Rowe, Jonathan AU - Tebb, Kathleen P. AU - Berna, Mark AU - Penilla, Carlos AU - Giovanelli, Alison AU - Jasik, Carolyn AU - Lester, James C. T2 - JOURNAL OF ADOLESCENT HEALTH AB - Accidents and unintentional injuries account for the greatest number of adolescent deaths, often involving use of alcohol and other substances. This article describes the iterative design and development of Interactive Narrative System for Patient-Individualized Reflective Exploration (INSPIRE), a narrative-centered behavior change environment for adolescents focused on reducing alcohol use. INSPIRE is designed to serve as an extension to clinical preventive care, engaging adolescents in a theoretically grounded intervention for health behavior change by leveraging 3D game engine and interactive narrative technologies. Adolescents were engaged in all aspects of the iterative, multiyear development process of INSPIRE through over 20 focus groups and iterative pilot testing involving more than 145 adolescents. Qualitative findings from focus groups are reported, as well as quantitative findings from small-scale pilot sessions investigating adolescent engagement with a prototype version of INSPIRE using a combination of questionnaire and interaction trace log data. Adolescents reported that they found INSPIRE to be engaging, believable, and relevant to their lives. 
The majority of participants indicated that the narrative's protagonist character was like them (84%) and that the narrative featured virtual characters that they could relate to (79%). In the interactive narrative, the goals most frequently chosen by adolescents were "stay in control" (60%) and "do not get in trouble" (55%). With a strong theoretical framework (social-cognitive behavior change theory) and technology advances (narrative-centered learning environments), the field is well positioned to design health behavior change systems that can realize significant impacts on behavior change for adolescent preventive health. DA - 2020/8// PY - 2020/8// DO - 10.1016/j.jadohealth.2020.04.022 VL - 67 IS - 2 SP - S34-S44 SN - 1879-1972 KW - Health behavior change KW - Prevention KW - Interactive narrative technologies KW - Adolescent risk behavior KW - Alcohol use KW - Games for health KW - Narrative-centered behavior change environments KW - Health information technology KW - Self-efficacy KW - Social-cognitive theory ER - TY - JOUR TI - Innovative Digital Technologies to Improve Adolescent and Young Adult Health AU - Ozer, Elizabeth M. AU - Lester, James C. T2 - JOURNAL OF ADOLESCENT HEALTH AB - The lives of adolescents and young adults (AYAs) have become increasingly intertwined with technology. Multidisciplinary perspectives and collaboration are needed to capitalize on the strategic use of technology during key developmental windows. Technology-rich models of behavior change, with opportunities for personalizing health interventions, offer significant transformative potential to improve adolescent and young adult health. There is considerable momentum behind advancing integration of digital health technology to enhance the efficiency and effectiveness of the clinical encounter, and rapid advances in technology provide mechanisms for enabling AYAs to take agentic roles in promoting health practice and policy. 
This Special Issue, Innovative Digital Technologies to Improve Adolescent and Young Adult Health, evolved from our collaborative multidisciplinary research that has been supported by the National Science Foundation under the Smart and Connected Health: Connecting Data, People, and Systems program (IIS-1344670 & IIS-1344803), with the goal of accelerating the development and integration of innovative technology to support the transformation of health and medicine. In the special issue, we are excited to share articles that highlight the potential of innovative technologies to promote AYA health and well-being. The need for the AYA health community to engage in multidisciplinary work to address key challenges posed by health technologies, including access, inequity, bias, privacy, security, and integration into clinical workflows and adolescent lives, is essential. This requires an intentional focus on inclusivity for all AYAs, especially those historically excluded, without which these technologies will reproduce existing inequalities and structural racism in health care. The rapid shift to online technology for clinical services delivery, research data collection, and education in response to the COVID-19 pandemic highlights the opportunities and perils of innovative technologies, particularly in regard to disparities in access, inequity, and privacy concerns. We are grateful to the guest editor of this Special Issue, Professor Lena Sanci, and the many authors who have contributed their work. DA - 2020/8// PY - 2020/8// DO - 10.1016/j.jadohealth.2020.05.015 VL - 67 IS - 2 SP - S3-S3 SN - 1879-1972 ER - TY - JOUR TI - Learning actionable analytics from multiple software projects AU - Krishna, Rahul AU - Menzies, Tim T2 - EMPIRICAL SOFTWARE ENGINEERING AB - The current generation of software analytics tools are mostly prediction algorithms (e.g. support vector machines, naive bayes, logistic regression, etc). 
While prediction is useful, after prediction comes planning about what actions to take in order to improve quality. This research seeks methods that generate demonstrably useful guidance on “what to do” within the context of a specific software project. Specifically, we propose XTREE (for within-project planning) and BELLTREE (for cross-project planning) to generate plans that can improve software quality. Each such plan has the property that, if followed, it reduces the expected number of future defect reports. To find this expected number, planning was first applied to data from release x. Next, we looked for changes in release x + 1 that conformed to our plans. This procedure was applied using a range of planners from the literature, as well as XTREE. In 10 open-source JAVA systems, several hundreds of defects were reduced in sections of the code that conformed to XTREE’s plans. Further, when compared to other planners, XTREE’s plans were found to be easier to implement (since they were shorter) and more effective at reducing the expected number of defects. DA - 2020/9// PY - 2020/9// DO - 10.1007/s10664-020-09843-6 VL - 25 IS - 5 SP - 3468-3500 SN - 1573-7616 KW - Data mining KW - Actionable analytics KW - Planning KW - Bellwethers KW - Defect prediction ER - TY - JOUR TI - Preliminary Evaluation of a Wearable Sensor System for Heart Rate Assessment in Guide Dog Puppies AU - Foster, Marc AU - Brugarolas, Rita AU - Walker, Katherine AU - Mealin, Sean AU - Cleghern, Zach AU - Yuschak, Sherrie AU - Clark, Julia Condit AU - Adin, Darcy AU - Russenberger, Jane AU - Gruen, Margaret AU - Sherman, Barbara L. AU - Roberts, David L. AU - Bozkurt, Alper T2 - IEEE SENSORS JOURNAL AB - This paper details the development of a novel wireless heart rate sensing system for puppies in training as guide dogs. 
The system includes a harness with an on-board electrocardiography (ECG) front-end circuit, an inertial measurement unit, and a micro-computer with wireless capability. The major research focus of this paper was the ergonomic design and evaluation of the system on puppies. The first phase of our evaluation was performed on a Labrador Retriever between 12 and 26 weeks of age as a pilot study. The longitudinal weekly data collected revealed the expected trend of a decreasing average heart rate and increased heart rate variability as the age increased. In the second phase, we improved the system ergonomics for a larger scale deployment in a guide dog school (Guiding Eyes for the Blind (Guiding Eyes)) on seventy 7.5-week-old puppies (heart rate coverage average of 86.7%). The acquired ECG-based heart rate data was used to predict the performance of puppies in Guiding Eyes's temperament test. We used the data as an input to a machine learning model which predicted two Behavior Checklist (BCL) scores as determined by expert Guiding Eyes puppy evaluators with an accuracy above 90%. DA - 2020/// PY - 2020/// DO - 10.1109/JSEN.2020.2986159 VL - 20 IS - 16 SP - 9449-9459 SN - 1558-1748 KW - ECG KW - heart rate variability KW - electrodes KW - machine learning KW - 3D printing KW - wearable ER - TY - JOUR TI - Artificial Intelligence for Personalized Preventive Adolescent Healthcare AU - Rowe, Jonathan P. AU - Lester, James C. T2 - JOURNAL OF ADOLESCENT HEALTH AB - Recent advances in artificial intelligence (AI) are creating new opportunities for personalizing technology-based health interventions to adolescents. This article provides a computer science perspective on how emerging AI technologies (intelligent learning environments, interactive narrative generation, user modeling, and adaptive coaching) can be utilized to model adolescent learning and engagement and deliver personalized support in adaptive health technologies. 
Many of these technologies have emerged from human-centered applications of AI in education, training, and entertainment. However, their application to improving healthcare, to date, has been comparatively limited. We illustrate the opportunities provided by AI-driven adaptive technologies for adolescent preventive healthcare by describing a vision of how future adolescent preventive health interventions might be delivered both inside and outside of the clinic. Key challenges posed by AI-driven health technologies are also presented, including issues of privacy, ethics, encoded bias, and integration into clinical workflows and adolescent lives. Examples of empirical findings about the effectiveness of AI technologies for user modeling and adaptive coaching are presented, which underscore their promise for application toward adolescent health. The article concludes with a brief discussion of future research directions for the field, which is well positioned to leverage AI to improve adolescent health and well-being. DA - 2020/8// PY - 2020/8// DO - 10.1016/j.jadohealth.2020.02.021 VL - 67 IS - 2 SP - 552-558 SN - 1879-1972 KW - Artificial intelligence KW - Prevention KW - Health information technology KW - Adaptive learning technologies KW - User modeling KW - Interactive narrative generation KW - Adolescents ER - TY - JOUR TI - Exploring Differences Between Student and Teacher Created Snap! Projects AU - Isvik, Amy AU - Catete, Veronica AU - Alvarez, Lauren AU - Lytle, Nicholas AU - Barnes, Tiffany T2 - 2020 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) AB - This paper illustrates coding decisions by in-service teachers and high school interns working independently versus collaboratively to build computing activities for non-computing classrooms. We investigate code written in Snap! to gain insights on project type and subject matter. We also share case studies on how intern collaboration influences final product execution. 
Through our research, we found student-only teams often created tutorial projects whereas teachers-only teams create interactive narratives. We found students were able to reuse code across projects to replicate similar mechanics and that students specialize in different aspects of project creation. Overall, we find it beneficial to have collaborative teacher-student teams. DA - 2020/8// PY - 2020/8// DO - 10.1109/vl/hcc50065.2020.9127249 VL - 2020-August UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85093919704&partnerID=MN8TOARS ER - TY - JOUR TI - Poster: Designing GradeSnap for Block-Based Code AU - Milliken, Alexandra AU - Catete, Veronica AU - Isvik, Amy AU - Barnes, Tiffany T2 - 2020 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) AB - Many K-12 CS education instructors are new to teaching computing and use block-based programming (BBP), allowing them and their students to learn computing concepts with less frustration. Based on prior work with instructors, we found that they need assistance with grading student projects as they are new to coding and developing corresponding assessments. Existing methods are either not intuitive and too restrictive for custom assignments, or tediously unnecessary. Additionally, providing a grading support tool will help increase adoption of our Advanced Placement Computer Science Principles curriculum. Therefore, we began developing a tool to assist teachers with grading BBP projects. This paper reports on a modified User Experience Research method, which we used to determine the critical and necessary features needed for the tool to allow teachers to successfully and efficiently assess BBP student artifacts. DA - 2020/8// PY - 2020/8// DO - 10.1109/vl/hcc50065.2020.9127284 VL - 2020-August UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85093947070&partnerID=MN8TOARS ER - TY - CONF TI - Engaging Students with Instructor Solutions in Online Programming Homework AU - Price, Thomas W. 
AU - Williams, Joseph Jay AU - Solyst, Jaemarie AU - Marwan, Samiha C2 - 2020/// C3 - ACM CHI Conference on Human Factors in Computing Systems CY - Honolulu, HI, USA DA - 2020/// ER - TY - CONF TI - Crescendo: Engaging Students to Self-Paced Programming Practices AU - Wang, Wengran AU - Zhi, Rui AU - Milliken, Alexandra AU - Lytle, Nicholas AU - Price, Thomas W C2 - 2020/// C3 - Proceedings of the ACM Technical Symposium on Computer Science Education DA - 2020/// ER - TY - CONF TI - Adaptive Immediate Feedback Can Improve Novice Programming Engagement and Intention to Persist in Computer Science AU - Marwan, Samiha AU - Gao, Ge AU - Fisk, Susan AU - Price, Thomas W. AU - Barnes, Tiffany C2 - 2020/// C3 - Proceedings of the International Computing Education Research Conference DA - 2020/// SP - 1-10 ER - TY - CONF TI - Step Tutor: Supporting Students through Step-by-Step Example-Based Feedback AU - Wang, Wengran AU - Rao, Yudong AU - Zhi, Rui AU - Marwan, Samiha AU - Gao, Ge AU - Price, Thomas W. C2 - 2020/// C3 - Proceedings of the International Conference on Innovation and Technology in Computer Science Education DA - 2020/// ER - TY - JOUR TI - Multimodal learning analytics for game-based learning AU - Emerson, Andrew AU - Cloude, Elizabeth B. AU - Azevedo, Roger AU - Lester, James T2 - BRITISH JOURNAL OF EDUCATIONAL TECHNOLOGY AB - A distinctive feature of game‐based learning environments is their capacity to create learning experiences that are both effective and engaging. Recent advances in sensor‐based technologies such as facial expression analysis and gaze tracking have introduced the opportunity to leverage multimodal data streams for learning analytics. 
Learning analytics informed by multimodal data captured during students’ interactions with game‐based learning environments hold significant promise for developing a deeper understanding of game‐based learning, designing game‐based learning environments to detect maladaptive behaviors and informing adaptive scaffolding to support individualized learning. This paper introduces a multimodal learning analytics approach that incorporates student gameplay, eye tracking and facial expression data to predict student posttest performance and interest after interacting with a game‐based learning environment, Crystal Island. We investigated the degree to which separate and combined modalities (i.e., gameplay, facial expressions of emotions and eye gaze) captured from students (n = 65) were predictive of student posttest performance and interest after interacting with Crystal Island. Results indicate that when predicting student posttest performance and interest, models utilizing multimodal data either perform equally well or outperform models utilizing unimodal data. We discuss the synergistic effects of combining modalities for predicting both student interest and posttest performance. The findings suggest that multimodal learning analytics can accurately predict students’ posttest performance and interest during game‐based learning and hold significant potential for guiding real‐time adaptive scaffolding. DA - 2020/9// PY - 2020/9// DO - 10.1111/bjet.12992 VL - 51 IS - 5 SP - 1505-1526 SN - 1467-8535 ER - TY - JOUR TI - Large Scale Characterization of Software Vulnerability Life Cycles AU - Shahzad, Muhammad AU - Shafiq, M. Zubair AU - Liu, Alex X. T2 - IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING AB - Software systems inherently contain vulnerabilities that have been exploited in the past resulting in significant revenue losses. 
The study of various aspects related to vulnerabilities such as their severity, rates of disclosure, exploit and patch release, and existence of common vulnerabilities in different products can help in improving the development, deployment, and maintenance process of software systems. It can also help in designing future security policies and conducting audits of past incidents. Furthermore, such an analysis can help customers to assess the security risks associated with software products of different vendors. In this paper, we conduct an exploratory measurement study of a large software vulnerability data set containing 56077 vulnerabilities disclosed from 1988 to 2013. We investigate vulnerabilities along the following eight dimensions: (1) phases in the life cycle of vulnerabilities, (2) evolution of vulnerabilities over the years, (3) functionality of vulnerabilities, (4) access requirement for exploitation of vulnerabilities, (5) risk level of vulnerabilities, (6) software vendors, (7) software products, and (8) existence of common vulnerabilities in multiple software products. Our exploratory analysis uncovers several statistically significant findings that have important implications for software development and deployment. DA - 2020/// PY - 2020/// DO - 10.1109/TDSC.2019.2893950 VL - 17 IS - 4 SP - 730-744 SN - 1941-0018 KW - Computer hacking KW - Data aggregation KW - Linux KW - Market research KW - Microsoft Windows KW - Vulnerability KW - disclosure KW - patch KW - exploit KW - diversity ER - TY - JOUR TI - Expert Perspectives on AI AU - Carleton, Anita D. AU - Harper, Erin AU - Lyu, Michael R. AU - Eldh, Sigrid AU - Xie, Tao AU - Menzies, Tim T2 - IEEE SOFTWARE AB - IEEE Software: With the rapid changes occurring in the fields of artificial intelligence (AI) and machine learning (ML), what areas do you think are the most important to focus on right now, especially in relation to software engineering? 
DA - 2020/// PY - 2020/// DO - 10.1109/MS.2020.2987673 VL - 37 IS - 4 SP - 87-94 SN - 1937-4194 ER - TY - JOUR TI - The AI Effect: Working at the Intersection of AI and SE AU - Carleton, Anita D. AU - Harper, Erin AU - Menzies, Tim AU - Xie, Tao AU - Eldh, Sigrid AU - Lyu, Michael R. T2 - IEEE SOFTWARE AB - This special issue explores the intersection of artificial intelligence (AI) and software engineering (SE), that is, what can AI do for SE, and how can we as software engineers design and build better AI systems? DA - 2020/// PY - 2020/// DO - 10.1109/MS.2020.2987666 VL - 37 IS - 4 SP - 26-35 SN - 1937-4194 ER - TY - JOUR TI - A Combinatorial Testing-Based Approach to Fault Localization AU - Ghandehari, Laleh Sh AU - Lei, Yu AU - Kacker, Raghu AU - Kuhn, Richard AU - Xie, Tao AU - Kung, David T2 - IEEE TRANSACTIONS ON SOFTWARE ENGINEERING AB - Combinatorial testing has been shown to be a very effective strategy for software testing. After a failure is detected, the next task is to identify one or more faulty statements in the source code that have caused the failure. In this paper, we present a fault localization approach, called BEN, which produces a ranking of statements in terms of their likelihood of being faulty by leveraging the result of combinatorial testing. BEN consists of two major phases. In the first phase, BEN identifies a combination that is very likely to be failure-inducing. A combination is failure-inducing if it causes any test in which it appears to fail. In the second phase, BEN takes as input a failure-inducing combination identified in the first phase and produces a ranking of statements in terms of their likelihood to be faulty. We conducted an experiment in which our approach was applied to the Siemens suite and four real-world programs, flex, grep, gzip and sed, from Software Infrastructure Repository (SIR). The experimental results show that our approach can effectively and efficiently localize the faulty statements in these programs. 
DA - 2020/6/1/ PY - 2020/6/1/ DO - 10.1109/TSE.2018.2865935 VL - 46 IS - 6 SP - 616-645 SN - 1939-3520 KW - Testing KW - Fault diagnosis KW - Flexible printed circuits KW - Software KW - Task analysis KW - Debugging KW - Computer science KW - Combinatorial testing KW - fault localization KW - debugging ER - TY - JOUR TI - MERR: Improving Security of Persistent Memory Objects via Efficient Memory Exposure Reduction and Randomization AU - Xu, Yuanchao AU - Solihin, Yan AU - Shen, Xipeng T2 - TWENTY-FIFTH INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXV) AB - This paper proposes a new defensive technique for memory, especially useful for long-living objects on Non-Volatile Memory (NVM), also called Persistent Memory Objects (PMOs). The method takes a distinctive perspective, trying to reduce memory exposure time by largely shortening the overhead in attaching and detaching PMOs into the memory space. It does so through a novel idea, embedding page table subtrees inside PMOs. The paper discusses the complexities the technique brings to permission controls and hardware implementations, and provides solutions. Experimental results show that the new technique reduces memory exposure time by 60% with a 5% time overhead (70% with 10.9% overhead). It allows much more frequent address randomizations (shortening the period from seconds to less than 41.4us), offering significant potential for enhancing memory security. DA - 2020/// PY - 2020/// DO - 10.1145/3373376.3378492 SP - 987-1000 KW - persistent memory objects KW - memory exposure reduction KW - runtime randomization ER - TY - CHAP TI - An Evaluation of Data-Driven Programming Hints in a Classroom Setting AU - Price, Thomas W. 
AU - Marwan, Samiha AU - Winters, Michael AU - Williams, Joseph Jay T2 - Lecture Notes in Computer Science AB - Data-driven programming hints are a scalable way to support students when they are stuck by automatically offering suggestions and identifying errors. However, few classroom studies have investigated data-driven hints’ impact on students’ performance and learning. In this work, we ran a controlled experiment with 241 students in an authentic classroom setting, comparing students who learned with and without hints. We found no evidence that hints improved student performance or learning overall, and we discuss possible reasons why. PY - 2020/// DO - 10.1007/978-3-030-52240-7_45 SP - 246-251 PB - Springer International Publishing UR - https://doi.org/10.1007/978-3-030-52240-7_45 KW - Data-driven hints KW - Computing education ER - TY - JOUR TI - Recursive algorithm for selecting optimum routing tables to solve offline routing and spectrum assignment problem AU - Fayez, Mahmoud AU - Katib, Iyad AU - Rouskas, George N. AU - Gharib, Tarek F. AU - Khaleed, Heba AU - Faheem, Hossam M. T2 - AIN SHAMS ENGINEERING JOURNAL AB - The Routing and Spectrum Assignment (RSA) problem is NP-hard, so searching the entire problem space is impractical. Many decomposition algorithms rely on reducing the search space in the routing space and applying heuristic algorithms to the spectrum assignment sub-problem. This is not necessarily the right approach, as the ignored routing tables may lead to a better solution when they are used later as input to the Spectrum Assignment sub-problem. In this paper, we develop a new recursive decomposition approach for the RSA problem in optical networks. At the core of our approach is a new recursive branch-and-bound procedure for carrying out an exhaustive search of the routing space in a scalable manner. This recursion effectively decouples the routing from the spectrum assignment part of the problem. 
Sequential generation of the full set of routing tables requires huge memory and very large processing time. Alternatively, our approach deploys multi-core architectures to generate the routing tables in parallel using OpenMP. Experimental results indicate that our recursive algorithm is quite efficient in searching the entire routing space for topologies representing large-scale wide area networks. Importantly, the decomposition may be more generally applied to any network design problem whose solution involves a search over both a routing and a resource allocation space. The main contributions of this paper are twofold. First, we are able to generate the entire search space in parallel in less than 1 min for a 32-node network. Second, we are able to investigate all the routing tables, eliminate most of the search space, and select the promising routing tables that are proven to lead to a better solution in the Spectrum Assignment sub-problem. DA - 2020/6// PY - 2020/6// DO - 10.1016/j.asej.2019.10.008 VL - 11 IS - 2 SP - 273-280 SN - 2090-4495 KW - RSA KW - Exhaustive routing search KW - RSA decomposition ER - TY - JOUR TI - DeepStealth: Game-Based Learning Stealth Assessment With Deep Neural Networks AU - Min, Wookhee AU - Frankosky, Megan H. AU - Mott, Bradford W. AU - Rowe, Jonathan P. AU - Smith, Andy AU - Wiebe, Eric AU - Boyer, Kristy Elizabeth AU - Lester, James C. T2 - IEEE Transactions on Learning Technologies AB - A distinctive feature of game-based learning environments is their capacity for enabling stealth assessment. Stealth assessment analyzes a stream of fine-grained student interaction data from a game-based learning environment to dynamically draw inferences about students' competencies through evidence-centered design. In evidence-centered design, evidence models have been traditionally designed using statistical rules authored by domain experts that are encoded using Bayesian networks. 
This article presents DeepStealth, a deep learning-based stealth assessment framework, that yields significant reductions in the feature engineering labor that has previously been required to create stealth assessments. DeepStealth utilizes end-to-end trainable deep neural network-based evidence models. Using this framework, evidence models are devised using a set of predictive features captured from raw, low-level interaction data to infer evidence for competencies. We investigate two deep learning-based evidence models, long short-term memory networks (LSTMs) and n-gram encoded feedforward neural networks (FFNNs). We compare these models' predictive performance for inferring students' knowledge to linear-chain conditional random fields (CRFs) and naïve Bayes models. We perform feature set-level analyses of game trace logs and external pre-learning measures, and we examine the models' early prediction capacity. The framework is evaluated using data collected from 182 middle school students interacting with a game-based learning environment for middle grade computational thinking. Results indicate that LSTM-based stealth assessors outperform competitive baseline approaches with respect to predictive accuracy and early prediction capacity. We find that LSTMs, FFNNs, and CRFs all benefit from combined feature sets derived from both game trace logs and external pre-learning measures. 
DA - 2020/4/1/ PY - 2020/4/1/ DO - 10.1109/TLT.2019.2922356 VL - 13 IS - 2 SP - 312-325 UR - https://doi.org/10.1109/TLT.2019.2922356 KW - Hidden Markov models KW - Computational modeling KW - Games KW - Predictive models KW - Task analysis KW - Adaptation models KW - Computer science KW - Computational thinking KW - deep learning KW - educational games KW - game-based learning KW - stealth assessment ER - TY - JOUR TI - SF-Sketch: A Two-Stage Sketch for Data Streams AU - Liu, Lingtong AU - Shen, Yulong AU - Yan, Yibo AU - Yang, Tong AU - Shahzad, Muhammad AU - Cui, Bin AU - Xie, Gaogang T2 - IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS AB - Sketches are probabilistic data structures designed for recording frequencies of items in a multi-set. They are widely used in various fields, especially for gathering Internet statistics from distributed data streams in network measurements. In a distributed streaming application with high data rates, a sketch in each monitoring node “fills up” very quickly and then its content is transferred to a remote collector responsible for answering queries. Thus, the size of the contents transferred must be kept as small as possible while meeting the desired accuracy requirement. To obtain significantly higher accuracy while keeping the same update and query speed as the best prior sketches, in this article, we propose a new sketch - the Slim-Fat (SF) sketch. The key idea behind the SF-sketch is to maintain two separate sketches: a larger sketch, the Fat-subsketch, and a smaller sketch, the Slim-subsketch. The Fat-subsketch is used for updating and periodically producing the Slim-subsketch, which is then transferred to the remote collector for answering queries quickly and accurately. We also present the error bound as well as an accurate model of the correct rate of the SF-sketch, and verify their correctness through experiments. We implemented and extensively evaluated the SF-sketch along with several prior sketches. 
Our results show that when the size of our Slim-subsketch and of the widely used Count-Min (CM) sketch are kept the same, our SF-sketch outperforms the CM-sketch by up to 33.1 times in terms of accuracy (when the ratio of the sizes of the Fat-subsketch and the Slim-subsketch is 16:1). We have made all source code publicly available on GitHub at https://github.com/paper2017/SF-sketch. DA - 2020/10/1/ PY - 2020/10/1/ DO - 10.1109/TPDS.2020.2987609 VL - 31 IS - 10 SP - 2263-2276 SN - 1558-2183 KW - Distributed databases KW - Monitoring KW - Bars KW - Frequency measurement KW - Registers KW - Fats KW - Hash functions KW - Network measurements KW - sketch KW - distributed monitoring KW - multiset KW - frequent items ER - TY - CONF TI - A conceptual assessment framework for k-12 computer science rubric design AU - Akram, B. AU - Min, W. AU - Wiebe, E. AU - Navied, A. AU - Mott, B. AU - Boyer, K.E. AU - Lester, J. AB - The lack of effective guidelines for assessing students' computer science (CS) competencies is creating significant demand by K-12 teachers for CS assessments to evaluate students' learning. We propose a conceptual assessment framework that guides teachers through designing appropriate assessments for CS activities in their classrooms. The framework addresses the critical problem of incorporating CS into K-12 curricula without corresponding assessments. We illustrate its use with the design of a rubric for a bubble sort algorithm situated in a game-based learning environment for middle-grade students. We also apply a preliminary and a revised version of this assessment to two datasets collected from students' interactions with the learning environment. We found consistency among results identified through applying the preliminary and the revised rubric. 
The results reveal distinctive patterns in students' approaches to CS problem solving and coherency with respect to different aspects of the rubric. C2 - 2020/// C3 - Annual Conference on Innovation and Technology in Computer Science Education, ITiCSE DA - 2020/// DO - 10.1145/3328778.3372643 SP - 1328 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85081624104&partnerID=MN8TOARS KW - CS Assessment KW - Evidence Centered Design KW - K-12 CS Instruction ER - TY - JOUR TI - A Look into Programmers' Heads AU - Peitek, Norman AU - Siegmund, Janet AU - Apel, Sven AU - Kastner, Christian AU - Parnin, Chris AU - Bethmann, Anja AU - Leich, Thomas AU - Saake, Gunter AU - Brechmann, Andre T2 - IEEE TRANSACTIONS ON SOFTWARE ENGINEERING AB - Program comprehension is an important, but hard to measure cognitive process. This makes it difficult to provide suitable programming languages, tools, or coding conventions to support developers in their everyday work. Here, we explore whether functional magnetic resonance imaging (fMRI) is feasible for soundly measuring program comprehension. To this end, we observed 17 participants inside an fMRI scanner while they were comprehending source code. The results show a clear, distinct activation of five brain regions related to working memory, attention, and language processing, all of which fit well with our understanding of program comprehension. Furthermore, we found reduced activity in the default mode network, indicating the cognitive effort necessary for program comprehension. We also observed that familiarity with Java as the underlying programming language reduced cognitive effort during program comprehension. To gain confidence in the results and the method, we replicated the study with 11 new participants and largely confirmed our findings. Our results encourage us and, hopefully, others to use fMRI to observe programmers and, in the long run, answer questions, such as: How should we train programmers? 
Can we train someone to become an excellent programmer? How effective are new languages and tools for program comprehension? DA - 2020/4/1/ PY - 2020/4/1/ DO - 10.1109/TSE.2018.2863303 VL - 46 IS - 4 SP - 442-462 SN - 1939-3520 KW - Functional magnetic resonance imaging KW - Task analysis KW - Cognition KW - Brain KW - Programming KW - Blood KW - program comprehension ER - TY - JOUR TI - Desen: Specification of Sociotechnical Systems via Patterns of Regulation and Control AU - Kafali, Özgür AU - Ajmeri, Nirav AU - Singh, Munindar P. T2 - ACM Transactions on Software Engineering and Methodology AB - We address the problem of engineering a sociotechnical system (STS) with respect to its stakeholders’ requirements. We motivate a two-tier STS conception composed of a technical tier that provides control mechanisms and describes what actions are allowed by the software components, and a social tier that characterizes the stakeholders’ expectations of each other in terms of norms. We adopt agents as computational entities, each representing a different stakeholder. Unlike previous approaches, our framework, Desen, incorporates the social dimension into the formal verification process. Thus, Desen supports agents potentially violating applicable norms—a consequence of their autonomy. In addition to requirements verification, Desen supports refinement of STS specifications via design patterns to meet stated requirements. We evaluate Desen at three levels. We illustrate how Desen carries out refinement via the application of patterns on a hospital emergency scenario. We show via a human-subject study that a design process based on our patterns is helpful for participants who are inexperienced in conceptual modeling and norms. 
We provide an agent-based environment to simulate the hospital emergency scenario to compare STS specifications (including participant solutions from the human-subject study) with metrics indicating social welfare and norm compliance, and other domain dependent metrics. DA - 2020/2/5/ PY - 2020/2/5/ DO - 10.1145/3365664 VL - 29 IS - 1 SP - 7:1-7:50 UR - http://dx.doi.org/10.1145/3365664 KW - Agent-oriented software engineering KW - norms KW - security and privacy requirements KW - design patterns KW - simulation ER - TY - JOUR TI - Better software analytics via "DUO": Data mining algorithms using/used-by optimizers AU - Agrawal, Amritanshu AU - Menzies, Tim AU - Minku, Leandro L. AU - Wagner, Markus AU - Yu, Zhe T2 - EMPIRICAL SOFTWARE ENGINEERING AB - This paper claims that a new field of empirical software engineering research and practice is emerging: data mining using/used-by optimizers for empirical studies or DUO. For example, data miners can generate models that are explored by optimizers. Also, optimizers can advise how to best adjust the control parameters of a data miner. This combined approach acts like an agent leaning over the shoulder of an analyst that advises "ask this question next" or "ignore that problem, it is not relevant to your goals". Further, those agents can help us build "better" predictive models, where "better" can be either greater predictive accuracy or faster modeling time (which, in turn, enables the exploration of a wider range of options). We also caution that the era of papers that just use data miners is coming to an end. Results obtained from an unoptimized data miner can be quickly refuted, just by applying an optimizer to produce a different (and better performing) model. Our conclusion, hence, is that for software analytics it is possible, useful and necessary to combine data mining and optimization using DUO. 
DA - 2020/5// PY - 2020/5// DO - 10.1007/s10664-020-09808-9 VL - 25 IS - 3 SP - 2099-2136 SN - 1573-7616 UR - https://doi.org/10.1007/s10664-020-09808-9 KW - Software analytics KW - Data mining KW - Optimization KW - Evolutionary algorithms ER - TY - CONF TI - Data-informed curriculum sequences for a curriculum-integrated game AU - Akintunde, Ruth Okoilu AU - Shabrina, Preya AU - Catete, Veronica AU - Barnes, Tiffany AU - Lynch, Collin AU - Rutherford, Teomara AB - In this paper, we perform a predictive analysis of a curriculum-integrated math game, ST Math, to suggest a partial ordering for the game's curriculum sequence. We analyzed the sequence of ST Math objectives played by elementary school students in 5 U.S. districts and grouped each objective into difficult and easy categories according to how many retries were needed for students to master an objective. We observed that retries on some objectives were high in one district and low in another district where the objectives are played in a different order. Motivated by this observation, we investigated what makes an effective curriculum sequence. To infer a new partially-ordered sequence, we performed an expanded replication study of a novel predictive analysis from a prior study to find predictive relationships between 15 objectives played in different sequences by 3,328 students from 5 districts. Based on the predictive abilities of objectives in these districts, we found 17 suggested objective orderings. After deriving these orderings, we confirmed the validity of the order by evaluating the impact of the suggested sequence on changes in rates of retries and corresponding performance. We observed that when the objectives were played in the suggested sequence, retries dropped drastically, implying that these objectives are easier for students. This indicates that objectives that come earlier can provide prerequisite knowledge for later objectives. 
We believe that data-informed sequences, such as the ones we suggest, may improve efficiency of instruction and increase content learning and performance. C2 - 2020/// C3 - Proceedings of the Tenth International Conference on Learning Analytics & Knowledge DA - 2020/// DO - 10.1145/3375462.3375530 SP - 635-644 PB - ACM UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85082400636&partnerID=MN8TOARS KW - Serious Game Analytics KW - Curricular Sequencing KW - Retries KW - Educational Games ER - TY - CONF TI - Investigating Different Assignment Designs to Promote Collaboration in Block-Based Environments AU - Lytle, Nicholas AU - Milliken, Alexandra AU - Cateté, Veronica AU - Barnes, Tiffany AB - Pair Programming is often employed in educational settings as a means of promoting collaboration and scaffolding the assignment difficulty for teams. While much research supports its inclusion as a pedagogical practice at the university level, some research has demonstrated that in K-12 contexts, it can potentially lead to inequitable learning environments and create dynamics between partners that might negatively affect novice learners. New block-based programming environments like Netsblox have attempted to address this by creating ways for both partners to program simultaneously, but this feature has yet to be examined in detail. In this paper, we introduce several modes of collaboration afforded by Netsblox. These include Pair-Separate, Pair-Together, and Partner Puzzles, a mode that splits the necessary blocks to build the assignment between team members. From an initial pilot study involving 25 pairs of middle and high school students, we find that most pairs preferred working on assignments in the Partner Puzzle mode as it presented a fun challenge to teams. 
We end with recommendations for building assignments using this methodology and future research directions investigating the role of collaboration in programming. C2 - 2020/// C3 - Proceedings of the 51st ACM Technical Symposium on Computer Science Education DA - 2020/// DO - 10.1145/3328778.3366943 SP - 832-838 PB - ACM UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85081623137&partnerID=MN8TOARS KW - Pair-Programming KW - Collaboration KW - Block-based environments ER - TY - CONF TI - Code, Connect, Create: The 3C Professional Development Model to Support Computational Thinking Infusion AU - Jocius, Robin AU - Joshi, Deepti AU - Dong, Yihuan AU - Robinson, Richard AU - Cateté, Veronica AU - Barnes, Tiffany AU - Albert, Jennifer AU - Andrews, Ashley AU - Lytle, Nicholas AB - Despite the increasing attention to infusing CT into middle and high school content area classrooms, there is a lack of information about the most effective practices and models to support teachers in their efforts to integrate disciplinary content and CT principles. To address this need, this paper proposes the Code, Connect and Create (3C) professional development (PD) model, which was designed to support middle and high school content area teachers in infusing computational thinking into their classrooms. To evaluate the model, we analyzed quantitative and qualitative data collected from Infusing Computing PD workshops designed for in-service science, math, English language arts, and social studies teachers located in two Southeastern states. Drawing on findings from our analysis of teacher-created learning segments, surveys, and interviews, we argue that the 3C professional development model supported shifts in teacher understandings of the role of computational thinking in content area classrooms, as well as their self-efficacy and beliefs regarding CT integration into disciplinary content. 
We conclude by offering implications for the use of this model to increase teacher and student access to computational thinking practices in middle and high school classrooms. C2 - 2020/// C3 - Proceedings of the 51st ACM Technical Symposium on Computer Science Education DA - 2020/// DO - 10.1145/3328778.3366797 SP - 971-977 PB - ACM UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85081610818&partnerID=MN8TOARS KW - Computational thinking KW - Professional development KW - K-12 computing KW - Teacher education ER - TY - CONF TI - Designing Block-Based Programming Language Features to Support Upper Elementary Students in Creating Interactive Science Narratives AB - Recent years have seen a growing recognition of the importance of enabling K-12 students to engage in computational thinking, particularly in elementary grades where students' dispositions toward STEM are developing. Block-based programming has emerged as an effective tool for engaging these novice learners in computational thinking. At the same time, digital storytelling has emerged as a promising avenue for creating motivating problem-solving scenarios that engage students in science investigations. Although block-based programming and digital storytelling are in many ways synergistic, there is a lingering question of how to design block-based languages at an age-appropriate level to enable effective and engaging storytelling. In this work, we review design principles from prior block-based and digital storytelling systems as well as propose the design of block-based programming language features to enable the creation of rich, interactive science narratives by upper elementary students. 
C2 - 2020/2/26/ C3 - Proceedings of the 51st ACM Technical Symposium on Computer Science Education DA - 2020/2/26/ DO - 10.1145/3328778.3372653 UR - http://dx.doi.org/10.1145/3328778.3372653 KW - Digital storytelling KW - Block-based programming ER - TY - CONF TI - Cluster-Based Analysis of Novice Coding Misconceptions in Block-Based Programming AB - Recent years have seen an increasing interest in identifying common student misconceptions during introductory programming. In a parallel development, block-based programming environments for novice programmers have grown in popularity, especially in introductory courses. While these environments eliminate many syntax-related errors faced by novice programmers, there has been limited work that investigates the types of misconceptions students might exhibit in these environments. Developing a better understanding of these misconceptions will enable these programming environments and instructors to more effectively tailor feedback to students, such as prompts and hints, when they face challenges. In this paper, we present results from a cluster analysis of student programs from interactions with programming activities in a block-based programming environment for introductory computer science education. Using the interaction data from students' programming activities, we identify three families of student misconceptions and discuss their implications for refinement of the activities as well as design of future activities. We then examine the value of block counts, block sequence counts, and system interaction counts as programming features for clustering block-based programs. These clusters can help researchers identify which students would benefit from feedback or interventions and what kind of feedback provides the most benefit to that particular student. 
C2 - 2020/2/26/ C3 - Proceedings of the 51st ACM Technical Symposium on Computer Science Education DA - 2020/2/26/ DO - 10.1145/3328778.3366924 UR - http://dx.doi.org/10.1145/3328778.3366924 KW - Block-based programming KW - introductory programming education KW - cluster analysis ER - TY - JOUR TI - Modeling buffer capacity and pH in acid and acidified foods AU - Price, Robert E. AU - Longtin, Madyson AU - Conley-Payton, Summer AU - Osborne, Jason A. AU - Johanningsmeier, Suzanne D. AU - Bitzer, Donald AU - Breidt, Fred T2 - JOURNAL OF FOOD SCIENCE AB - Standard ionic equilibria equations may be used for calculating pH of weak acid and base solutions. These calculations are difficult or impossible to solve analytically for foods that include many unknown buffering components, making pH prediction in these systems impractical. We combined buffer capacity (BC) models with a pH prediction algorithm to allow pH prediction in complex food matrices from BC data. Numerical models were developed using Matlab software to estimate the pH and buffering components for mixtures of weak acid and base solutions. The pH model was validated with laboratory solutions of acetic or citric acids with ammonia, in combinations with varying salts using Latin hypercube designs. Linear regressions of observed versus predicted pH values based on the concentration and pK values of the solution components resulted in estimated slopes between 0.96 and 1.01 with and without added salts. BC models were generated from titration curves for 0.6 M acetic acid or 12.4 mM citric acid resulting in acid concentration and pK estimates. Predicted pH values from these estimates were within 0.11 pH units of the measured pH. Acetic acid concentration measurements based on the model were within 6% accuracy compared to high-performance liquid chromatography measurements for concentrations less than 400 mM, although they were underestimated above that. 
The models may have application for use in determining the BC of food ingredients with unknown buffering components. Predicting pH changes for food ingredients using these models may be useful for regulatory purposes with acid or acidified foods and for product development. PRACTICAL APPLICATION: Buffer capacity models may benefit regulatory agencies and manufacturers of acid and acidified foods to determine pH stability (below pH 4.6) and how low-acid food ingredients may affect the safety of these foods. Predicting pH for solutions with known or unknown buffering components was based on titration data and models that use only monoprotic weak acids and bases. These models may be useful for product development and food safety by estimating pH and buffering capacity. DA - 2020/4// PY - 2020/4// DO - 10.1111/1750-3841.15091 VL - 85 IS - 4 SP - 918-925 SN - 1750-3841 KW - acid KW - base KW - acid foods KW - acidified foods KW - buffer capacity KW - buffer model KW - pH ER - TY - CONF TI - Toward Finding Online Activity Patterns in a Flipped Programming Course AU - Battestilli, Lina AU - Domínguez, Ignacio X. AU - Thyagarajan, Maanasa AB - Instructors are increasingly flipping their classrooms, where students are required to study on their own prior to in-class time with the instructor. We present preliminary results on identifying student online behavior patterns in a CS1 flipped course that correlate with students' test scores covering the material explained in the online videos. We found that clustering students based on how much of the online lecture videos they watched allows us to find significant differences in the average test scores of each cluster. 
C2 - 2020/2/26/ C3 - Proceedings of the 51st ACM Technical Symposium on Computer Science Education CY - New York, NY, USA DA - 2020/2/26/ DO - 10.1145/3328778.3372626 SP - 1345 PB - Association for Computing Machinery UR - http://dx.doi.org/10.1145/3328778.3372626 KW - CS1 KW - flipped classroom KW - online student behavior KW - online video KW - activity patterns ER - TY - JOUR TI - The Effect of Metacognitive Scaffolding for Learning by Teaching a Teachable Agent AU - Matsuda, Noboru AU - Weng, Wenting AU - Wall, Natalie T2 - International Journal of Artificial Intelligence in Education DA - 2020/1/8/ PY - 2020/1/8/ DO - 10.1007/s40593-019-00190-2 VL - 30 IS - 1 SP - 1-37 J2 - Int J Artif Intell Educ LA - en OP - SN - 1560-4292 1560-4306 UR - http://dx.doi.org/10.1007/s40593-019-00190-2 DB - Crossref KW - Learning by teaching KW - Goal-oriented practice KW - Teachable agent KW - Algebra ER - TY - JOUR TI - The Impact of Contextualized Emotions on Self-Regulated Learning and Scientific Reasoning during Learning with a Game-Based Learning Environment AU - Taub, Michelle AU - Sawyer, Robert AU - Lester, James AU - Azevedo, Roger T2 - INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE IN EDUCATION DA - 2020/3// PY - 2020/3// DO - 10.1007/s40593-019-00191-1 VL - 30 IS - 1 SP - 97-120 ER - TY - JOUR TI - Story-telling maps generated from semantic representations of events AU - Tateosian, Laura AU - Glatz, Michelle AU - Shukunobe, Makiko T2 - BEHAVIOUR & INFORMATION TECHNOLOGY AB - Narratives enable readers to assimilate disparate facts. Accompanying maps can make the narratives even more accessible. As work in computer science has begun to generate stories from low-level event/activity data, there is a need for systems that complement these tools to generate maps illustrating spatial components of these stories. 
While traditional maps display static spatial relationships, story maps need to not only dynamically display relationships based on the flow of the story but also display character perceptions and intentions. In this work, we study cartographic illustrations of historical battles to design a map generation system for reports produced from a multiplayer battle game log. We then create a story and ask viewers to describe mapped events and rate their own descriptions relative to intended interpretations. Some viewers received training prior to seeing the story, which was shown to be effective, though training may have been unnecessary for certain map types. Self-rating correlated highly with expert ratings, revealing an efficient proxy for expert analysis of map interpretability, a shortcut for determining if training is needed for story-telling maps or other novel visualisation techniques. The study's semantic questions and feedback solicitation demonstrate a process for identifying user-centric improvements to story-telling map design. DA - 2020/4/2/ PY - 2020/4/2/ DO - 10.1080/0144929X.2019.1569162 VL - 39 IS - 4 SP - 391-413 SN - 1362-3001 KW - Visualisation KW - maps KW - user study ER - TY - JOUR TI - Going deeper: Automatic short-answer grading by combining student and question models AU - Zhang, Yuan AU - Lin, Chen AU - Chi, Min T2 - USER MODELING AND USER-ADAPTED INTERACTION DA - 2020/3// PY - 2020/3// DO - 10.1007/s11257-019-09251-6 VL - 30 IS - 1 SP - 51-80 SN - 1573-1391 KW - Automatic short-answer grading KW - Machine learning KW - Deep belief network ER - TY - JOUR TI - Better together: Comparing vulnerability prediction models AU - Theisen, Christopher AU - Williams, Laurie T2 - INFORMATION AND SOFTWARE TECHNOLOGY AB - Vulnerability Prediction Models (VPMs) are an approach for prioritizing security inspection and testing to find and fix vulnerabilities. 
VPMs have been created based on a variety of metrics and approaches, yet widespread adoption of VPM usage in practice has not occurred. Knowing which VPMs have strong prediction and which VPMs have low data requirements and resource usage would be useful for practitioners to match VPMs to their project’s needs. The low density of vulnerabilities compared to defects is also an obstacle for practical VPMs. The goal of the paper is to help security practitioners and researchers choose appropriate features for vulnerability prediction through a comparison of Vulnerability Prediction Models. We performed replications of VPMs on Mozilla Firefox with 28,750 source code files featuring 271 vulnerabilities using software metrics, text mining, and crash data. We then combined features from each VPM and reran our classifiers. We improved the F-score of the best VPM (0.20 to 0.28) by combining features from three types of VPMs and using Naive Bayes as the classifier. The strongest features in the combined model were the number of times a file was involved in a crash, the number of outgoing calls from a file, and the string “nullptr”. Our results indicate that further work is needed to develop new features for input into classifiers. In addition, new analytic approaches for VPMs are needed for VPMs to be useful in practical situations, due to the low density of vulnerabilities in software (less than 1% for our dataset). DA - 2020/3// PY - 2020/3// DO - 10.1016/j.infsof.2019.106204 VL - 119 SP - SN - 1873-6025 KW - Security KW - Vulnerabilities KW - Prediction model KW - Software engineering ER - TY - JOUR TI - PeakPass: Automating ChIP-Seq Blacklist Creation AU - Wimberley, Charles E. AU - Heber, Steffen T2 - JOURNAL OF COMPUTATIONAL BIOLOGY AB - ChIP-Seq blacklists contain genomic regions that frequently produce artifacts and noise in ChIP-Seq experiments. To improve signal-to-noise ratio, ChIP-Seq pipelines often remove data points that map to blacklist regions. 
Existing blacklists have been compiled in a manual or semiautomated way. In this article we describe PeakPass, an efficient method to generate blacklists, and demonstrate that blacklists can increase ChIP-Seq data quality. PeakPass leverages machine learning and attempts to automate blacklist generation. PeakPass uses a random forest classifier in combination with genomic features such as sequence, annotated repeats, complexity, assembly gaps, and the ratio of multimapping to uniquely mapping reads to identify artifact regions. We have validated PeakPass on a large data set and tested it for the purpose of upgrading a blacklist to a new reference genome version. We trained PeakPass on the ENCODE blacklist for the hg19 human reference genome, and created an updated blacklist for hg38. To assess the performance of this blacklist, we tested 42 ChIP-Seq replicates from 24 experiments using 10 ChIP-Seq quality metrics including relative strand coefficient, standardized standard deviation, and enrichment of reads in promoter regions. Using the blacklist generated by PeakPass resulted in a statistically significant improvement for nine of these metrics. DA - 2020/2/1/ PY - 2020/2/1/ DO - 10.1089/cmb.2019.0295 VL - 27 IS - 2 SP - 259-268 SN - 1557-8666 KW - blacklist KW - ChIP-seq KW - classification KW - quality control ER - TY - JOUR TI - The agency effect: The impact of student agency on learning, emotions, and problem-solving behaviors in a game-based learning environment AU - Taub, M. AU - Sawyer, R. AU - Smith, A. AU - Rowe, J. AU - Azevedo, R. AU - Lester, J. T2 - Computers and Education AB - Game-based learning environments are designed to foster high levels of student engagement and motivation during learning of complex topics. Game-based learning environments allow students freedom to navigate a space to interact with game elements that foster learning, i.e., agency. 
Agency has been studied in learning, and it has been demonstrated that increased student agency results in greater learning outcomes. However, it is unclear what is the level of agency that is required to demonstrate this effect, and whether this effect applies only to learning or to problem solving and affect during game-based learning as well. To investigate how the level of student agency impacts learning, problem solving, and affect, a study was conducted with 138 college students interacting with a game-based learning environment for microbiology, Crystal Island. This study is an extension of a previous study that examined the impact of agency on learning and problem-solving behaviors during game-based learning with Crystal Island. Students were randomly assigned to either a High Agency condition, a Low Agency condition, or a No Agency condition. It was found that students in the Low Agency condition achieved significantly higher normalized learning gain scores than students in the No Agency condition, and marginally higher normalized learning gains than the High Agency condition. Post-surveys of interest and presence indicated that students in the No Agency condition were less interested, and perceived themselves as less present in the virtual environment, than students in the other conditions. Students in the No Agency condition also experienced less frustration, confusion, and joy than the other agency conditions, indicating a less cognitively stimulating experience. Overall the results indicate that a moderate degree of agency provided to students in game-based learning environments leads to better learning outcomes without sacrificing interest and without yielding a negative emotional experience, demonstrating how even low levels of agency can positively impact learning, problem solving, and affect during game-based learning. 
DA - 2020/// PY - 2020/// DO - 10.1016/j.compedu.2019.103781 VL - 147 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85076323543&partnerID=MN8TOARS KW - Agency in learning KW - Game-based learning KW - Learner-centered emotions KW - Self-regulated learning ER - TY - JOUR TI - Blockchain-Based Financial Technologies and Cryptocurrencies for Low-Income People: Technical Potential Versus Practical Reality AU - Singh, Munindar P. AU - Chopra, Amit K. T2 - COMPUTER AB - Several blockchain-based financial technologies and cryptocurrencies have been launched for low-income people. Blockchain’s technical potential can be used to serve the needs of unbanked and underbanked populations, but there is no evidence that these needs are being met. DA - 2020/1// PY - 2020/1// DO - 10.1109/MC.2019.2951977 VL - 53 IS - 1 SP - 53-62 SN - 1558-0814 ER - TY - JOUR TI - Enabling Runtime SpMV Format Selection through an Overhead Conscious Method AU - Zhou, Weijie AU - Zhao, Yue AU - Shen, Xipeng AU - Chen, Wang T2 - IEEE Transactions on Parallel and Distributed Systems AB - Sparse matrix-vector multiplication (SpMV) is an important kernel and its performance is critical for many applications. Storage format selection is to select the best format to store a sparse matrix; it is essential for SpMV performance. Prior studies have focused on predicting the format that helps SpMV run fastest, but have ignored the runtime prediction and format conversion overhead. This work shows that the runtime overhead makes the predictions from previous solutions frequently sub-optimal and sometimes inferior regarding the end-to-end time. It proposes a new paradigm for SpMV storage selection, an overhead-conscious method. Through carefully designed regression models and neural network-based time series prediction models, the method captures the influence imposed on the overall program performance by the overhead and the benefits of format prediction and conversions. 
The method employs a novel two-stage lazy-and-light scheme to help control the possible negative effects of format predictions, and at the same time, maximize the overall format conversion benefits. Experiments show that the technique outperforms previous techniques significantly. It improves the overall performance of applications by 1.21X to 1.53X, significantly larger than the 0.83X to 1.25X upper-bound speedups overhead-oblivious methods could give. DA - 2020/1// PY - 2020/1// DO - 10.1109/TPDS.2019.2932931 VL - 31 IS - 1 SP - 80-93 KW - SpMV KW - high performance computing KW - program optimization KW - sparse matrix format KW - prediction model ER - TY - JOUR TI - The Five Laws of SE for AI AU - Menzies, Tim T2 - IEEE SOFTWARE AB - It is time to talk about software engineering (SE) for artificial intelligence (AI). As shown in Figure 1, industry is becoming increasingly dependent on AI software. Clearly, AI is useful for SE. But what about the other way around? How important is SE for AI? Many thought leaders in the AI industry are asking how to better develop and maintain AI software (see Figure 2). DA - 2020/// PY - 2020/// DO - 10.1109/MS.2019.2954841 VL - 37 IS - 1 SP - 81-85 SN - 1937-4194 ER - TY - JOUR TI - Computational Governance and Violable Contracts for Blockchain Applications AU - Singh, Munindar P. AU - Chopra, Amit K. T2 - Computer AB - We propose a sociotechnical, yet computational, approach to building decentralized applications that accommodates and exploits blockchain technology. Our architecture incorporates the notion of a declarative, violable contract and enables flexible governance based on formal organizational structures, correctness verification without obstructing autonomy, and a basis for trust. 
DA - 2020/1// PY - 2020/1// DO - 10.1109/MC.2019.2947372 VL - 53 IS - 1 SP - 53-62 UR - https://doi.org/10.1109/MC.2019.2947372 ER - TY - JOUR TI - Special Issue: Graph Computing AU - Jin, Hai AU - Shen, Xipeng AU - Lovas, Robert AU - Liao, Xiaofei T2 - CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE AB - Graph computing is now popular in many areas, including social networks and gene sequence alignment. Graph computing systems and algorithms have a history prior to the use of graph databases and have a future that is not necessarily entangled with typical database concerns. With the data's increasing size, many distributed graph-computing systems have been developed in recent years to process and analyze massive graphs. Researchers pay more attention to graph partition schemes in distributed environments. However, other researchers think a single system can avoid the network overhead and may have better performance even if the data size is too big for the memory space. With the rapid development of coprocessors, some researchers think it is promising to build a domain-specific computer, just for graph computing. The proposed special issue of Concurrency and Computation: Practice and Experience contains revised and extended versions of selected best papers with respect to graph computing at the 21st IEEE International Conference on Parallel and Distributed Systems (ICPADS’16), which was held at Wuhan, China, on December 13-16, 2016. Established in 1992, ICPADS has been a major international forum for scientists, engineers, and users to exchange and share their experiences, new ideas, and latest research results on all aspects of parallel and distributed computing systems. The purpose of this special issue is to provide a comprehensive view into recent advances in systems software, algorithms, partition schemes, and even graph computers based on new advances in computer architecture and applications. The five selected papers are summarized as follows. 
The first paper, titled “An efficient iterative graph data processing framework based on bulk synchronous parallel model” by Liu et al,1 presents an efficient computational framework for graph data processing based on the bulk synchronous parallel model. Existing Pregel-like graph processing systems remain in their early stage, and there still exist many challenges with prohibitive superstep-synchronized overhead. Furthermore, the graph data partition strategy in these earlier graph systems fails to support load balancing, thereby increasing network I/O overhead as the scale of graph data grows. Thus, this paper leverages a global synchronization mechanism to enhance the performance of graph computation. Meanwhile, a balanced hash-based graph partition mechanism is presented to optimize large-scale graph data processing. The work has a real implementation on the Pregel system, which can better support a variety of graph analytics applications. The second paper, titled “An efficient iterative graph data processing framework based on bulk synchronous parallel model” by Linchen Yu,2 proposes an optimized scheduling system for parallelizing programs in Xen. Virtualization challenges traditional CPU scheduling: a spin lock in a virtualized environment can be preempted by the VMM, which increases synchronization overhead and decreases the performance of parallel programs. Many studies have proposed co-scheduling to alleviate this problem. However, these earlier attempts are not suitable for non-parallel workloads with the CPU fragmentation problem as well. Therefore, a simultaneous optimization scheduling system, called CCHybrid, is proposed in the Xen virtualized environment. Results show the efficiency of CCHybrid over the traditional Xen Credit scheduler. 
The third paper, titled “ms-PoSW: A multi-server aided proof of shared ownership scheme for secure deduplication in cloud” by Xiong et al,3 introduces a novel concept of proof of shared ownership (PoSW) for securing client-side deduplication of shared files. With the rapid development of cloud computing and big data technologies, collaborative cloud applications are inextricably linked to our daily life and, therefore, produce a large number of shared files, which is challenging for secure access and data duplication in the cloud. This paper proposes a novel multiserver-aided PoSW scheme for collaborative cloud applications and proposes a hybrid PoSW scheme to reduce the computational cost of the shared owner's client. Furthermore, a hybrid PoSW scheme is constructed to address the secure proof of hybrid cloud architectures. The fourth paper, titled “Sparse random compressive sensing based data aggregation in wireless sensor networks” by Yin et al,4 introduces a compressive data aggregation scheme. In wireless sensor networks, the increasingly expanding data volume has high spatial-temporal correlation. Although some earlier studies attempt to eliminate data redundancy, few can handle energy consumption and latency simultaneously. In this paper, the authors propose a delay-minimum energy-balanced data aggregation method, which can eliminate the redundancy among the readings and prolong the network lifetime. A sparse random matrix is adopted as a measurement matrix to balance communication cost. Particularly, each measurement can form an aggregation tree with minimum delay. Furthermore, a novel scheduling method is used to avoid information interference as well. The fifth paper, titled “Dynamic cluster strategy for hierarchical rollback-recovery protocols in MPI HPC applications” by Liao et al,5 proposes a dynamic cluster strategy to adapt to the runtime variation of communication pattern by using a prediction scheme. 
The idea comes from the fact that hierarchical rollback-recovery protocols provide failure containment and reduce the number of messages to be logged, making them an attractive and scalable solution for fault tolerance even at a large scale. This paper shows how the communication pattern changes with the stages of the application because MPI HPC applications scale up and become more complex. Therefore, to further increase the efficiency of hierarchical rollback-recovery protocols, the authors propose a dynamic cluster strategy (DCS) to adapt to changes in communication pattern. In contrast to the existing static process partition algorithms, this strategy adopts a prediction mechanism by using the clusters of processes obtained from the prior part of an application in the succeeding part. Detailed experiments are then performed to evaluate the effectiveness and efficiency of DCS at an extremely large scale. We hope that readers will find the contents of this special issue interesting and that it will further inspire them to look ahead into the challenges of designing, exploring, and exploiting graph analytics applications. DA - 2020/2/10/ PY - 2020/2/10/ DO - 10.1002/cpe.5452 VL - 32 IS - 3 SP - SN - 1532-0634 ER - TY - CHAP TI - A simple Hybrid Event-B model of an active control system for earthquake protection AU - Banach, Richard AU - Baugh, John T2 - From Astrophysics to Unconventional Computation A2 - Adamatzky, Andrew A2 - Kendon, Vivien AB - In earthquake-prone zones of the world, severe damage to buildings and life endangering harm to people pose a major risk when severe earthquakes happen. In recent decades, active and passive measures to prevent building damage have been designed and deployed. A simple model of an active damage prevention system, founded on earlier work, is investigated from a model based formal development perspective, using Hybrid Event-B. The non-trivial physical behaviour in the model is readily captured within the formalism. 
However, when the usual approximation and discretization techniques from engineering and applied mathematics are used, the rather brittle refinement techniques used in model based formal development start to break down. Despite this, the model developed stands up well when compared via simulation with a standard approach. The requirements of a richer formal development framework, better able to cope with applications exhibiting non-trivial physical elements are discussed. PY - 2020/// DO - 10.1007/978-3-030-15792-0_7 VL - 35 SP - 157-194 PB - Springer ER -