TY - CONF TI - Characterization of Sweetpotato Inheritance Using Ultradense Multilocus Genetic Map AU - Mollinari, M. AU - Bode, A.O. AU - Pereira, G.S. AU - Gemenet, D.C. AU - Khan, A. AU - Yencho, G.C. AU - Zeng, Z.-B. T2 - International Plant & Animal Genome XXVIII Conference C2 - 2020/// C3 - International Plant & Animal Genome XXVIII Conference DA - 2020/// ER - TY - JOUR TI - fullsibQTL: An R package for QTL mapping in biparental populations of outcrossing species AU - Gazaffi, R. AU - Amadeu, R.R. AU - Mollinari, M. AU - Rosa, J.R.B.F. AU - Taniguti, C.H. AU - Margarido, G.R.A. AU - Garcia, A.A.F. T2 - bioRxiv AB - Accurate QTL mapping in outcrossing species requires software programs which consider genetic features of these populations, such as markers with different segregation patterns and different level of information. Although the available mapping procedures to date allow inferring QTL position and effects, they are mostly not based on multilocus genetic maps. Having a QTL analysis based in such maps is crucial since they allow informative markers to propagate their information to less informative intervals of the map. We developed fullsibQTL, a novel and freely available R package to perform composite interval QTL mapping considering outcrossing populations and markers with different segregation patterns. It allows to estimate QTL position, effects, segregation patterns, and linkage phase with flanking markers. Additionally, several statistical and graphical tools are implemented, for straightforward analysis and interpretations. fullsibQTL is an R open source package with C and R source code (GPLv3). It is multiplatform and can be installed from https://github.com/augusto-garcia/fullsibQTL . 
DA - 2020/// PY - 2020/// DO - 10.1101/2020.12.04.412262 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85099897031&partnerID=MN8TOARS ER - TY - JOUR TI - Quantitative trait locus mapping for common scab resistance in a tetraploid potato full-sib population AU - Silva Pereira, G. AU - Mollinari, M. AU - Qu, X. AU - Thill, C. AU - Zeng, Z.-B. AU - Haynes, K. AU - Yencho, G.C. T2 - bioRxiv AB - Abstract Despite the negative impact of common scab ( Streptomyces spp.) to the potato industry, little is known about the genetic architecture of resistance to this bacterial disease in the crop. We evaluated a mapping population (~150 full-sibs) derived from a cross between two tetraploid potatoes (‘Atlantic’ × B1829-5) in three environments (MN11, PA11, ME12) under natural common scab pressure. Three measures to common scab reaction were assessed, namely percentage of scabby tubers, and disease area and lesion indices, which were highly correlated (>0.76). Due to large environmental effect, heritability values were zero for all three traits in MN11, but moderate to high in PA11 and ME12 (0.44~0.79). We identified a single quantitative trait locus (QTL) for lesion index in PA11, ME12 and joint analyses on linkage group 3, explaining 22~30% of the total variation. The identification of QTL haplotypes and candidate genes contributing to disease resistance can support genomics-assisted breeding approaches. DA - 2020/// PY - 2020/// DO - 10.1101/2020.10.24.353557 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85098810824&partnerID=MN8TOARS ER - TY - JOUR TI - Mercury exposure in relation to sleep duration, timing, and fragmentation among adolescents in Mexico City AU - Jansen, E.C. AU - Hector, Emily C. AU - Goodrich, J.M. AU - Cantoral, A. AU - Rojo, M.M. Téllez AU - Basu, N. AU - Song, P.X.-K. AU - Olascoaga, L. Torres AU - Peterson, K.E. 
T2 - Environmental Research DA - 2020/// PY - 2020/// VL - 191 SP - 110216 ER - TY - JOUR TI - Doubly distributed supervised learning and inference with high-dimensional correlated outcomes AU - Hector, Emily C. AU - Song, Peter X.-K. T2 - Journal of Machine Learning Research DA - 2020/// PY - 2020/// VL - 21 SP - 1-35 ER - TY - JOUR TI - The value of summary statistics for anomaly detection in temporally evolving networks: A performance evaluation study AU - Kodali, Lata AU - Sengupta, Srijan AU - House, Leanna AU - Woodall, William H T2 - Applied Stochastic Models in Business and Industry DA - 2020/// PY - 2020/// VL - 36 IS - 6 SP - 980-1013 ER - TY - JOUR TI - Scalable estimation of epidemic thresholds via node sampling AU - Dasgupta, Anirban AU - Sengupta, Srijan T2 - arXiv preprint arXiv:2007.14820 DA - 2020/// PY - 2020/// ER - TY - JOUR TI - Online Social Deception and Its Countermeasures: A Survey AU - Guo, Zhen AU - Cho, Jin-Hee AU - Chen, Ray AU - Sengupta, Srijan AU - Hong, Michin AU - Mitra, Tanushree T2 - IEEE Access DA - 2020/// PY - 2020/// ER - TY - JOUR TI - Improved understanding and prediction of freshwater fish communities through the use of joint species distribution models AU - Wagner, Tyler AU - Hansen, Gretchen J.A. AU - Schliep, Erin M. AU - Bethke, Bethany J. AU - Honsey, Andrew E. AU - Jacobson, Peter C. AU - Kline, Benjamen C. AU - White, Shannon L. T2 - Canadian Journal of Fisheries and Aquatic Sciences AB - Two primary goals in fisheries research are to (i) understand how habitat and environmental conditions influence the distribution of fishes across the landscape and (ii) make predictions about how fish communities will respond to environmental and anthropogenic change. 
In inland, freshwater ecosystems, quantitative approaches traditionally used to accomplish these goals largely ignore the effects of species interactions (competition, predation, mutualism) on shaping community structure, potentially leading to erroneous conclusions regarding habitat associations and unrealistic predictions about species distributions. Using two contrasting case studies, we highlight how joint species distribution models (JSDMs) can address the aforementioned deficiencies by simultaneously quantifying the effects of abiotic habitat variables and species dependencies. In particular, we show that conditional predictions of species occurrence from JSDMs can better predict species presence or absence compared with predictions that ignore species dependencies. JSDMs also allow for the estimation of site-specific probabilities of species co-occurrence, which can be informative for generating hypotheses about species interactions. JSDMs provide a flexible framework that can be used to address a variety of questions in fisheries science and management. DA - 2020/9// PY - 2020/9// DO - 10.1139/cjfas-2019-0348 VL - 77 IS - 9 SP - 1540-1551 J2 - Can. J. Fish. Aquat. Sci. LA - en OP - SN - 0706-652X 1205-7533 UR - http://dx.doi.org/10.1139/cjfas-2019-0348 DB - Crossref ER - TY - JOUR TI - Ecological prediction at macroscales using big data: Does sampling design matter? AU - Soranno, Patricia A. AU - Cheruvelil, Kendra Spence AU - Liu, Boyang AU - Wang, Qi AU - Tan, Pang‐Ning AU - Zhou, Jiayu AU - King, Katelyn B. S. AU - McCullough, Ian M. AU - Stachelek, Jemma AU - Bartley, Meridith AU - Filstrup, Christopher T. AU - Hanks, Ephraim M. AU - Lapierre, Jean‐François AU - Lottig, Noah R. AU - Schliep, Erin M. AU - Wagner, Tyler AU - Webster, Katherine E. 
T2 - Ecological Applications AB - Although ecosystems respond to global change at regional to continental scales (i.e., macroscales), model predictions of ecosystem responses often rely on data from targeted monitoring of a small proportion of sampled ecosystems within a particular geographic area. In this study, we examined how the sampling strategy used to collect data for such models influences predictive performance. We subsampled a large and spatially extensive data set to investigate how macroscale sampling strategy affects prediction of ecosystem characteristics in 6,784 lakes across a 1.8‐million‐km² area. We estimated model predictive performance for different subsets of the data set to mimic three common sampling strategies for collecting observations of ecosystem characteristics: random sampling design, stratified random sampling design, and targeted sampling. We found that sampling strategy influenced model predictive performance such that (1) stratified random sampling designs did not improve predictive performance compared to simple random sampling designs and (2) although one of the scenarios that mimicked targeted (non‐random) sampling had the poorest performing predictive models, the other targeted sampling scenarios resulted in models with similar predictive performance to that of the random sampling scenarios. Our results suggest that although potential biases in data sets from some forms of targeted sampling may limit predictive performance, compiling existing spatially extensive data sets can result in models with good predictive performance that may inform a wide range of science questions and policy goals related to global change. 
DA - 2020/4/27/ PY - 2020/4/27/ DO - 10.1002/eap.2123 VL - 30 IS - 6 J2 - Ecol Appl LA - en OP - SN - 1051-0761 1939-5582 UR - http://dx.doi.org/10.1002/eap.2123 DB - Crossref KW - data-intensive ecology KW - ecological context KW - extrapolation KW - interpolation KW - lakes KW - macroscale KW - monitoring KW - prediction KW - sampling KW - sampling design ER - TY - JOUR TI - On the spatial and temporal shift in the archetypal seasonal temperature cycle as driven by annual and semi‐annual harmonics AU - North, Joshua S. AU - Schliep, Erin M. AU - Wikle, Christopher K. T2 - Environmetrics AB - Abstract Statistical methods are required to evaluate and quantify the uncertainty in environmental processes, such as land and sea surface temperature, in a changing climate. Typically, annual harmonics are used to characterize the variation in the seasonal temperature cycle. However, an often overlooked feature of the climate seasonal cycle is the semi‐annual harmonic, which can account for a significant portion of the variance of the seasonal cycle and varies in amplitude and phase across space. Together, the spatial variation in the annual and semi‐annual harmonics can play an important role in driving processes that are tied to seasonality (e.g., ecological and agricultural processes). We propose a multivariate spatiotemporal model to quantify the spatial and temporal change in minimum and maximum temperature seasonal cycles as a function of the annual and semi‐annual harmonics. Our approach captures spatial dependence, temporal dynamics, and multivariate dependence of these harmonics through spatially and temporally varying coefficients. We apply the model to minimum and maximum temperature over North American for the years 1979–2018. Formal model inference within the Bayesian paradigm enables the identification of regions experiencing significant changes in minimum and maximum temperature seasonal cycles due to the relative effects of changes in the two harmonics. 
DA - 2020/12/28/ PY - 2020/12/28/ DO - 10.1002/env.2665 VL - 32 IS - 6 J2 - Environmetrics LA - en OP - SN - 1180-4009 1099-095X UR - http://dx.doi.org/10.1002/env.2665 DB - Crossref KW - dynamic system modeling KW - North American temperature cycle KW - predictive process KW - spatial synchrony KW - spatiotemporal statistics ER - TY - JOUR TI - Data fusion model for speciated nitrogen to identify environmental drivers and improve estimation of nitrogen in lakes AU - Schliep, Erin M. AU - Collins, Sarah M. AU - Rojas-Salazar, Shirley AU - Lottig, Noah R. AU - Stanley, Emily H. T2 - The Annals of Applied Statistics AB - Concentrations of nitrogen provide a critical metric for understanding ecosystem function and water quality in lakes. However, varying approaches for quantifying nitrogen concentrations may bias the comparison of water quality across lakes and regions. Different measurements of total nitrogen exist based on its composition (e.g., organic versus inorganic, dissolved versus particulate), which we refer to as nitrogen species. Fortunately, measurements of multiple nitrogen species are often collected and can, therefore, be leveraged together to inform our understanding of the controls on total nitrogen in lakes. We develop a multivariate hierarchical statistical model that fuses speciated nitrogen measurements, obtained across multiple methods of reporting, in order to improve our estimates of total nitrogen. The model accounts for lower detection limits and measurement error that vary across lake, species and observation. By modeling speciated nitrogen, as opposed to previous efforts that mostly consider only total nitrogen, we obtain more resolved inference with regard to differences in sources of nitrogen and their relationship with complex environmental drivers. We illustrate the inferential benefits of our model using speciated nitrogen data from the LAke GeOSpatial and temporal database (LAGOS). 
DA - 2020/12/1/ PY - 2020/12/1/ DO - 10.1214/20-aoas1371 VL - 14 IS - 4 J2 - Ann. Appl. Stat. OP - SN - 1932-6157 UR - http://dx.doi.org/10.1214/20-aoas1371 DB - Crossref KW - Bayesian hierarchical model KW - detection limits KW - LAGOS KW - multivariate KW - Markov chain Monte Carlo ER - TY - JOUR TI - Statistical data integration in survey sampling: a review AU - Yang, Shu AU - Kim, Jae Kwang T2 - JAPANESE JOURNAL OF STATISTICS AND DATA SCIENCE AB - Finite population inference is a central goal in survey sampling. Probability sampling is the main statistical approach to finite population inference. Challenges arise due to high cost and increasing non-response rates. Data integration provides a timely solution by leveraging multiple data sources to provide more robust and efficient inference than using any single data source alone. The technique for data integration varies depending on types of samples and available information to be combined. This article provides a systematic review of data integration techniques for combining probability samples, probability and non-probability samples, and probability and big data samples. We discuss a wide range of integration methods such as generalized least squares, calibration weighting, inverse probability weighting, mass imputation, and doubly robust methods. Finally, we highlight important questions for future research. DA - 2020/12// PY - 2020/12// DO - 10.1007/s42081-020-00093-w VL - 3 IS - 2 SP - 625-650 SN - 2520-8764 KW - Generalizability KW - Meta-analysis KW - Missing at random KW - Transportability ER - TY - JOUR TI - Water quality performance of a permeable pavement and stormwater harvesting treatment train stormwater control measure AU - Winston, Ryan J. AU - Arend, Kristi AU - Dorsey, Jay D. AU - Hunt, William F. T2 - BLUE-GREEN SYSTEMS AB - Abstract Stormwater runoff from urban development causes undesired impacts to surface waters, including discharge of pollutants, erosion, and loss of habitat. 
A treatment train consisting of permeable interlocking concrete pavement and underground stormwater harvesting was monitored to quantify water quality improvements. The permeable pavement provided primary treatment and the cistern contributed to final polishing of total suspended solids (TSS) and turbidity concentrations (>96%) and loads (99.5% for TSS). Because of this, >40% reduction of sediment-bound nutrient forms and total nitrogen was observed. Nitrate reduction (>70%) appeared to be related to an anaerobic zone in water stored in the scarified soil beneath the permeable pavement, allowing denitrification to occur. Sequestration of copper, lead, and zinc occurred during the first 5 months of monitoring, with leaching observed during the second half of the monitoring period. This was potentially caused by a decrease in pH within the cistern or residual chloride from deicing salt causing de-sorption of metals from accumulated sediment. Pollutant loading followed the same trends as pollutant concentrations, with load reduction improved vis-à-vis concentrations because of the 27% runoff reduction provided by the treatment train. This study has shown that permeable pavement can serve as an effective pretreatment for stormwater harvesting schemes. DA - 2020/1/1/ PY - 2020/1/1/ DO - 10.2166/bgs.2020.914 VL - 2 IS - 1 SP - 91-111 SN - 2617-4782 KW - green infrastructure KW - pervious pavement KW - porous pavement KW - rainwater harvesting KW - series KW - WSUD ER - TY - CONF TI - A New Framework for Online Testing of Heterogeneous Treatment Effect AU - Yu, M. AU - Lu, W. AU - Song, R. T2 - Thirty-Fourth AAAI Conference on Artificial Intelligence AB - We propose a new framework for online testing of heterogeneous treatment effects. The proposed test, named sequential score test (SST), is able to control type I error under continuous monitoring and detect multi-dimensional heterogeneous treatment effects. 
We provide an online p-value calculation for SST, making it convenient for continuous monitoring, and extend our tests to online multiple testing settings by controlling the false discovery rate. We examine the empirical performance of the proposed tests and compare them with a state-of-art online test, named mSPRT using simulations and a real data. The results show that our proposed test controls type I error at any time, has higher detection power and allows quick inference on online A/B testing. C2 - 2020/// C3 - Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence CY - New York Hilton Midtown, New York, New York, USA DA - 2020/// PY - 2020/2/7/ DO - 10.1609/aaai.v34i06.6594 VL - 34 SP - 10310-10317 M1 - 6 PB - AAAI Press ER - TY - JOUR TI - Differences in proteome response to cold acclimation in Zoysia japonica cultivars with different levels of freeze tolerance AU - Brown, Jessica M. AU - Yu, Xingwang AU - Holloway, H. McCamy P. AU - DaCosta, Michelle AU - Bernstein, Rachael P. AU - Lu, Jefferson AU - Tuong, Tan D. AU - Patton, Aaron J. AU - Dunne, Jeffrey C. AU - Arellano, Consuelo AU - Livingston, David P. AU - Milla-Lewis, Susana R. T2 - CROP SCIENCE AB - Abstract Zoysiagrasses ( Zoysia spp.) are warm‐season turfgrasses primarily grown in the southern and transition zones of the United States. An understanding of the physiological and proteomic changes that zoysiagrasses undergo during cold acclimation may shed light on phenotypic traits and proteins useful in selection of freeze‐tolerant genotypes. We investigated the relationship between cold acclimation, protein expression, and freeze tolerance in cold acclimated (CA) and nonacclimated (NA) plants of Zoysia japonica Steud. cultivars Meyer (freeze‐tolerant) and Victoria (freeze‐susceptible). Meristematic tissues from the grass crowns were harvested for proteomic analysis. 
Freeze testing indicated that cold acclimation accounted for a 1.9‐fold increase in plant survival than nonacclimation treatment. Overall, proteomic analysis identified 62 protein spots differentially accumulated in abundance under cold acclimation. Nine and 22 unique protein spots were identified for Meyer and Victoria, respectively, with increased abundance or decreased abundance. In addition, 23 shared protein spots were found among the two cultivars in response to cold acclimation. Function classification revealed that these proteins were involved primarily in transcription, signal transduction and stress defense, carbohydrate and energy metabolism, and protein and amino acid metabolism. Several proteins of interest for their association with cold acclimation were identified. Further investigation of these proteins and their functional categories may contribute to increase our understanding of the differences in freezing tolerance among zoysiagrass germplasm. DA - 2020/// PY - 2020/// DO - 10.1002/csc2.20225 VL - 60 IS - 5 SP - 2744-2756 SN - 1435-0653 ER - TY - JOUR TI - Provable Convex Co-clustering of Tensors AU - Chi, Eric C. AU - Gaines, Brian J. AU - Sun, Will Wei AU - Zhou, Hua AU - Yang, Jian T2 - Journal of Machine Learning Research DA - 2020/// PY - 2020/// VL - 21 IS - 214 SP - 1-58 UR - http://jmlr.org/papers/v21/18-155.html ER - TY - JOUR TI - Estimating Average Treatment Effects Utilizing Fractional Imputation when Confounders are Subject to Missingness AU - Corder, Nathan AU - Yang, Shu T2 - JOURNAL OF CAUSAL INFERENCE AB - Abstract The problem of missingness in observational data is ubiquitous. When the confounders are missing at random, multiple imputation is commonly used; however, the method requires congeniality conditions for valid inferences, which may not be satisfied when estimating average causal treatment effects. 
Alternatively, fractional imputation, proposed by Kim 2011, has been implemented to handling missing values in regression context. In this article, we develop fractional imputation methods for estimating the average treatment effects with confounders missing at random. We show that the fractional imputation estimator of the average treatment effect is asymptotically normal, which permits a consistent variance estimate. Via simulation study, we compare fractional imputation’s accuracy and precision with that of multiple imputation. DA - 2020/1// PY - 2020/1// DO - 10.1515/jci-2019-0024 VL - 8 IS - 1 SP - 249-271 SN - 2193-3685 KW - Missing Data KW - Fractional Imputation KW - Multiple Imputation ER - TY - JOUR TI - Novel Imaging Modalities Shedding Light on Plant Biology: Start Small and Grow Big AU - Clark, Natalie M. AU - Broeck, Lisa AU - Guichard, Marjorie AU - Stager, Adam AU - Tanner, Herbert G. AU - Blilou, Ikram AU - Grossmann, Guido AU - Iyer-Pascuzzi, Anjali S. AU - Maizel, Alexis AU - Sparks, Erin E. AU - Sozzani, Rosangela T2 - ANNUAL REVIEW OF PLANT BIOLOGY, VOL 71, 2020 AB - The acquisition of quantitative information on plant development across a range of temporal and spatial scales is essential to understand the mechanisms of plant growth. Recent years have shown the emergence of imaging methodologies that enable the capture and analysis of plant growth, from the dynamics of molecules within cells to the measurement of morphometric and physiological traits in field-grown plants. In some instances, these imaging methods can be parallelized across multiple samples to increase throughput. When high throughput is combined with high temporal and spatial resolution, the resulting image-derived data sets could be combined with molecular large-scale data sets to enable unprecedented systems-level computational modeling. 
Such image-driven functional genomics studies may be expected to appear at an accelerating rate in the near future given the early success of the foundational efforts reviewed here. We present new imaging modalities and review how they have enabled a better understanding of plant growth from the microscopic to the macroscopic scale. DA - 2020/// PY - 2020/// DO - 10.1146/annurev-arplant-050718-100038 VL - 71 SP - 789-816 SN - 1545-2123 KW - Forster resonance energy transfer KW - scanning fluorescent correlation spectroscopy KW - microfluid devices KW - light sheet microscopy KW - imaging of macroscopic traits KW - multiscale imaging techniques ER - TY - JOUR TI - INTEGRATIVE STATISTICAL METHODS FOR EXPOSURE MIXTURES AND HEALTH AU - Reich, Brian J. AU - Guan, Yawen AU - Fourches, Denis AU - Warren, Joshua L. AU - Sarnat, Stefanie E. AU - Chang, Howard H. T2 - ANNALS OF APPLIED STATISTICS AB - Humans are concurrently exposed to chemically, structurally and toxicologically diverse chemicals. A critical challenge for environmental epidemiology is to quantify the risk of adverse health outcomes resulting from exposures to such chemical mixtures and to identify which mixture constituents may be driving etiologic associations. A variety of statistical methods have been proposed to address these critical research questions. However, they generally rely solely on measured exposure and health data available within a specific study. Advancements in understanding of the role of mixtures on human health impacts may be better achieved through the utilization of external data and knowledge from multiple disciplines with innovative statistical tools. In this paper we develop new methods for health analyses that incorporate auxiliary information about the chemicals in a mixture, such as physicochemical, structural and/or toxicological data. 
We expect that the constituents identified using auxiliary information will be more biologically meaningful than those identified by methods that solely utilize observed correlations between measured exposure. We develop flexible Bayesian models by specifying prior distributions for the exposures and their effects that include auxiliary information and examine this idea over a spectrum of analyses from regression to factor analysis. The methods are applied to study the effects of volatile organic compounds on emergency room visits in Atlanta. We find that including cheminformatic information about the exposure variables improves prediction and provides a more interpretable model for emergency room visits for respiratory diseases. DA - 2020/12// PY - 2020/12// DO - 10.1214/20-AOAS1364 VL - 14 IS - 4 SP - 1945-1963 SN - 1941-7330 KW - Cheminformatics KW - collinearity KW - factor analysis KW - principal components KW - stochastic search KW - variable selection ER - TY - JOUR TI - Uniform convergence of penalized splines AU - Xiao, Luo AU - Nan, Zhe T2 - STAT AB - Penalized splines are popular for nonparametric regression. We establish the minimax rate optimality of penalized splines for uniform convergence, thus improving the existing rate in the literature. The result is applicable to several types of penalized splines that are commonly used and holds under mild conditions on the design points. DA - 2020/// PY - 2020/// DO - 10.1002/sta4.297 VL - 9 IS - 1 SP - SN - 2049-1573 KW - nonparametric regression KW - penalized splines KW - rate optimality KW - uniform convergence ER - TY - JOUR TI - Fast covariance estimation for multivariate sparse functional data AU - Li, Cai AU - Xiao, Luo AU - Luo, Sheng T2 - STAT AB - Covariance estimation is essential yet underdeveloped for analyzing multivariate functional data. We propose a fast covariance estimation method for multivariate sparse functional data using bivariate penalized splines. 
The tensor-product B-spline formulation of the proposed method enables a simple spectral decomposition of the associated covariance operator and explicit expressions of the resulting eigenfunctions as linear combinations of B-spline bases, thereby dramatically facilitating subsequent principal component analysis. We derive a fast algorithm for selecting the smoothing parameters in covariance smoothing using leave-one-subject-out cross-validation. The method is evaluated with extensive numerical studies and applied to an Alzheimer's disease study with multiple longitudinal outcomes. DA - 2020/// PY - 2020/// DO - 10.1002/sta4.245 VL - 9 IS - 1 SP - SN - 2049-1573 KW - bivariate smoothing KW - covariance function KW - functional principal component analysis KW - longitudinal data KW - multivariate functional data KW - prediction ER - TY - JOUR TI - Field Assessment of the Hydrologic Mitigation Performance of Three Aging Bioretention Cells AU - Johnson, Jeffrey P. AU - Hunt, William F. T2 - JOURNAL OF SUSTAINABLE WATER IN THE BUILT ENVIRONMENT AB - Increasing imperviousness has driven regulation and design philosophies to offset consequent increases in runoff volumes and peak flows. Previous research has shown bioretention to reduce runoff volumes and peak flows. Since most research has focused on newly constructed systems, the long-term performance of bioretention has been questioned. Because bioretention is a biologically based practice, changes over time could impact hydrologic performance. This research examined and compared the hydrologic mitigation performance of three bioretention cells (BRCs) in central North Carolina with postconstruction ages ranging from 8 to 17 years old. Observed runoff volumes were significantly reduced at each of the three cells by 90%, 81%, and 64%. The volume discharge ratio for each cell was at or below low impact development (LID) target thresholds (0.33) for 63%, 67%, and 48% of observed storm events. 
Similar to volume reduction, all three BRCs significantly reduced peak flows. Peak discharge ratios at each site were less than the LID target threshold (0.33) for over 75% of observed storm events, and the interquartile range of peak discharge ratios was less than the LID target threshold for all observed storm events <25.4 mm. All three BRCs struggled to mitigate volumes and peak flows for large storm events (>50 mm). As the frequency and magnitude of larger events increases, guidance recommending additional surface storage should be considered. When compared to the hydrologic performance of “young” BRCs (less than 3 years old), “old” BRCs (at least 3 years old) perform at least as well with respect to peak flow mitigation while appearing to reduce runoff volumes better than newly constructed BRCs. That the three BRCs presented herein ranged from 8 to 17 years old during their respective monitoring periods while significantly reducing peak flows and runoff volumes (while meeting LID target thresholds) supports the prediction of long-term hydrologic mitigation of bioretention. DA - 2020/11// PY - 2020/11// DO - 10.1061/JSWBAY.0000925 VL - 6 IS - 4 SP - SN - 2379-6111 ER - TY - SOUND TI - Monte Carlo Methods in Practice AU - Ghosh, Sujit DA - 2020/7/17/ PY - 2020/7/17/ ER - TY - SOUND TI - A Glimpse of Monte Carlo Methods AU - Ghosh, Sujit DA - 2020/9/29/ PY - 2020/9/29/ UR - https://youtu.be/9Rvb3X3V8bc ER - TY - SOUND TI - A Gambler's Journey through Monte Carlo AU - Ghosh, Sujit DA - 2020/11/5/ PY - 2020/11/5/ ER - TY - CONF TI - On Empirical Estimation of Mode Based on Weakly Dependent Samples AU - Ghosh, Sujit T2 - International Conference on Statistics for Twenty-First Century C2 - 2020/12/18/ DA - 2020/12/18/ PY - 2020/12/18/ PB - University of Kerala ER - TY - JOUR TI - Rapid Hazard Characterization of Environmental Chemicals Using a Compendium of Human Cell Lines from Different Organs AU - Chen, Zunwei AU - Liu, Yizhong AU - Wright, Fred A. 
AU - Chiu, Weihsueh A. AU - Rusyn, Ivan T2 - ALTEX-ALTERNATIVES TO ANIMAL EXPERIMENTATION AB - The lack of adequate toxicity data for the vast majority of chemicals in the environment has spurred the development of new approach methodologies (NAMs). This study aimed to develop a practical high-throughput in vitro model for rapidly evaluating potential hazards of chemicals using a small number of human cells. Forty-two compounds were tested using human induced pluripotent stem cell (iPSC)-derived cells (hepatocytes, neurons, cardiomyocytes and endothelial cells), and a primary endothelial cell line. Both functional and cytotoxicity endpoints were evaluated using high-content imaging. Concentration-response was used to derive points-of-departure (POD). PODs were integrated with ToxPi and used as surrogate NAM-based PODs for risk characterization. We found chemical class-specific similarity among the chemicals tested; metal salts exhibited the highest overall bioactivity. We also observed cell type-specific patterns among classes of chemicals, indicating the ability of the proposed in vitro model to recognize effects on different cell types. Compared to available NAM datasets, such as ToxCast/Tox21 and chemical structure-based descriptors, we found that the data from the five-cell-type model was as good or even better in assigning compounds to chemical classes. Additionally, the PODs from this model performed well as a conservative surrogate for regulatory in vivo PODs and were less likely to underestimate in vivo potency and potential risk compared to other NAM-based PODs. In summary, we demonstrate the potential of this in vitro screening model to inform rapid risk-based decision-making through ranking, clustering, and assessment of both hazard and risks of diverse environmental chemicals. 
DA - 2020/// PY - 2020/// DO - 10.14573/altex.2002291 VL - 37 IS - 4 SP - 623-638 SN - 1868-8551 ER - TY - CONF TI - A Non-Iterative Quantile Change Detection Method in Mixture Model with Heavy-Tailed Components AU - Li, Yuantong AU - Ma, Qi AU - Ghosh, Sujit K. T2 - KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining AB - Estimating parameters of mixture model has wide applications ranging from classification problems to estimating of complex distributions. Most of the current literature on estimating the parameters of the mixture densities are based on iterative Expectation Maximization (EM) type algorithms which require the use of either taking expectations over the latent label variables or generating samples from the conditional distribution of such latent labels using the Bayes rule. Moreover, when the number of components is unknown, the problem becomes computationally more demanding due to well-known label switching issues [28]. In this paper, we propose a robust and quick approach based on change-point methods to determine the number of mixture components that works for almost any location-scale families even when the components are heavy tailed (e.g., Cauchy). We present several numerical illustrations by comparing our method with some of popular methods available in the literature using simulated data and real case studies. The proposed method is shown be as much as 500 times faster than some of the competing methods and are also shown to be more accurate in estimating the mixture distributions by goodness-of-fit tests. 
C2 - 2020/7/6/ C3 - Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining DA - 2020/7/6/ DO - 10.1145/3394486.3403240 PB - ACM SN - 9781450379984 UR - http://dx.doi.org/10.1145/3394486.3403240 DB - Crossref KW - mixture model KW - heavy-tailed distribution KW - Cauchy distribution KW - stock data ER - TY - JOUR TI - Joint modeling of longitudinal continuous, longitudinal ordinal, and time-to-event outcomes AU - Alam, Khurshid AU - Maity, Arnab AU - Sinha, Sanjoy K. AU - Rizopoulos, Dimitris AU - Sattar, Abdus T2 - LIFETIME DATA ANALYSIS DA - 2020/// PY - 2020/// DO - 10.1007/s10985-020-09511-3 KW - Joint models KW - Association parameters KW - Frailty model KW - Linear mixed model KW - Proportional odds model ER - TY - JOUR TI - Bayesian Regression Using a Prior on the Model Fit: The R2-D2 Shrinkage Prior AU - Zhang, Yan Dora AU - Naughton, Brian P. AU - Bondell, Howard D. AU - Reich, Brian T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - Prior distributions for high-dimensional linear regression require specifying a joint distribution for the unobserved regression coefficients, which is inherently difficult. We instead propose a new class of shrinkage priors for linear regression via specifying a prior first on the model fit, in particular, the coefficient of determination, and then distributing through to the coefficients in a novel way. The proposed method compares favorably to previous approaches in terms of both concentration around the origin and tail behavior, which leads to improved performance both in posterior contraction and in empirical performance. The limiting behavior of the proposed prior is 1/x, both around the origin and in the tails. This behavior is optimal in the sense that it simultaneously lies on the boundary of being an improper prior both in the tails and around the origin. None of the existing shrinkage priors obtain this behavior in both regions simultaneously. 
We also demonstrate that our proposed prior leads to the same near-minimax posterior contraction rate as the spike-and-slab prior. Supplementary materials for this article are available online. DA - 2020/// PY - 2020/// DO - 10.1080/01621459.2020.1825449 KW - Beta-prime distribution KW - Coefficient of determination KW - Global-local shrinkage KW - High-dimensional regression ER - TY - JOUR TI - Independent increments in group sequential tests: a review AU - Kim, Kyung Mann AU - Tsiatis, Anastasios A. T2 - SORT-STATISTICS AND OPERATIONS RESEARCH TRANSACTIONS DA - 2020/// PY - 2020/// DO - 10.2436/20.8080.02.101 VL - 44 IS - 2 SP - 223-264 SN - 2013-8830 KW - Failure time data KW - interim analysis KW - longitudinal data KW - clinical trials KW - repeated significance tests KW - sequential methods ER - TY - JOUR TI - Parameter Estimation for Multi-state Coherent Series and Parallel Systems with Positively Quadrant Dependent Models AU - Kulkarni, Leena AU - Sabnis, Sanjeev AU - Ghosh, Sujit K. T2 - SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY DA - 2020/// PY - 2020/// DO - 10.1007/s13171-020-00217-0 KW - Multi-state series system KW - Generalized method of moments KW - Maximum likelihood estimation KW - Positively quadrant dependent KW - Farlie-Gumbel-Morgenstern distribution ER - TY - JOUR TI - Statistical Downscaling with Spatial Misalignment: Application to Wildland Fire PM2.5 Concentration Forecasting AU - Majumder, Suman AU - Guan, Yawen AU - Reich, Brian AU - O'Neill, Susan AU - Rappold, Ana G. T2 - JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS AB - Fine particulate matter, PM2.5, has been documented to have adverse health effects and wildland fires are a major contributor to PM2.5 air pollution in the US. Forecasters use numerical models to predict PM2.5 concentrations to warn the public of impending health risk. 
Statistical methods are needed to calibrate the numerical model forecast using monitor data to reduce bias and quantify uncertainty. Typical model calibration techniques do not allow for errors due to misalignment of geographic locations. We propose a spatiotemporal downscaling methodology that uses image registration techniques to identify the spatial misalignment and accounts for and corrects the bias produced by such warping. Our model is fitted in a Bayesian framework to provide uncertainty quantification of the misalignment and other sources of error. We apply this method to different simulated data sets and show enhanced performance of the method in presence of spatial misalignment. Finally, we apply the method to a large fire in Washington state and show that the proposed method provides more realistic uncertainty quantification than standard methods. DA - 2020/// PY - 2020/// DO - 10.1007/s13253-020-00420-4 KW - Image registration KW - Public health KW - Smoothing KW - Warping ER - TY - JOUR TI - BAM1/2 receptor kinase signaling drives CLE peptide-mediated formative cell divisions in Arabidopsis roots AU - Crook, Ashley D. AU - Willoughby, Andrew C. AU - Hazak, Ora AU - Okuda, Satohiro AU - VanDerMolen, Kylie R. AU - Soyars, Cara L. AU - Cattaneo, Pietro AU - Clark, Natalie M. AU - Sozzani, Rosangela AU - Hothorn, Michael AU - Hardtke, Christian S. AU - Nimchuk, Zachary L. T2 - PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA AB - Cell division is often regulated by extracellular signaling networks to ensure correct patterning during development. In Arabidopsis, the SHORT-ROOT (SHR)/SCARECROW (SCR) transcription factor dimer activates CYCLIND6;1 (CYCD6;1) to drive formative divisions during root ground tissue development. 
Here, we show plasma-membrane-localized BARELY ANY MERISTEM1/2 (BAM1/2) family receptor kinases are required for SHR-dependent formative divisions and CYCD6;1 expression, but not SHR-dependent ground tissue specification. Root-enriched CLE ligands bind the BAM1 extracellular domain and are necessary and sufficient to activate SHR-mediated divisions and CYCD6;1 expression. Correspondingly, BAM-CLE signaling contributes to the restriction of formative divisions to the distal root region. Additionally, genetic analysis reveals that BAM-CLE and SHR converge to regulate additional cell divisions outside of the ground tissues. Our work identifies an extracellular signaling pathway regulating formative root divisions and provides a framework to explore this pathway in patterning and evolution. DA - 2020/12/22/ PY - 2020/12/22/ DO - 10.1073/pnas.2018565117 VL - 117 IS - 51 SP - 32750-32756 SN - 0027-8424 KW - Arabidopsis KW - receptor kinase KW - cell cycle KW - SHORT-ROOT KW - CLE peptide ER - TY - JOUR TI - Systematic Comparisons for Composition Profiles, Taxonomic Levels, and Machine Learning Methods for Microbiome-Based Disease Prediction AU - Song, Kuncheng AU - Wright, Fred A. AU - Zhou, Yi-Hui T2 - FRONTIERS IN MOLECULAR BIOSCIENCES AB - Microbiome composition profiles generated from 16S rRNA sequencing have been extensively studied for their usefulness in phenotype trait prediction, including for complex diseases such as diabetes and obesity. These microbiome compositions have typically been quantified in the form of Operational Taxonomic Unit (OTU) count matrices. However, alternate approaches such as Amplicon Sequence Variants (ASV) have been used, as well as the direct use of k-mer sequence counts. The overall effect of these different types of predictors when used in concert with various machine learning methods has been difficult to assess, due to varied combinations described in the literature. 
Here we provide an in-depth investigation of more than 1,000 combinations of these three clustering/counting methods, in combination with varied choices for normalization and filtering, grouping at various taxonomic levels, and the use of more than ten commonly used machine learning methods for phenotype prediction. The use of short k-mers, which have computational advantages and conceptual simplicity, is shown to be effective as a source for microbiome-based prediction. Among machine-learning approaches, tree-based methods show consistent, though modest, advantages in prediction accuracy. We describe the various advantages and disadvantages of combinations in analysis approaches, and provide general observations to serve as a useful guide for future trait-prediction explorations using microbiome data. DA - 2020/12/16/ PY - 2020/12/16/ DO - 10.3389/fmolb.2020.610845 VL - 7 SP - SN - 2296-889X KW - phenotype prediction KW - machine learning method KW - k-mers KW - operational taxonomic unit (OTU) KW - amplicon sequence variant (ASV) KW - phylogenetic analysis ER - TY - JOUR TI - EMPIRICAL BAYES ORACLE UNCERTAINTY QUANTIFICATION FOR REGRESSION AU - Belitser, Eduard AU - Ghosal, Subhashis T2 - ANNALS OF STATISTICS AB - We propose an empirical Bayes method for high-dimensional linear regression models. Following an oracle approach that quantifies the error locally for each possible value of the parameter, we show that an empirical Bayes posterior contracts at the optimal rate at all parameters and leads to uniform size-optimal credible balls with guaranteed coverage under an “excessive bias restriction” condition. This condition gives rise to a new slicing of the entire space that is suitable for ensuring uniformity in uncertainty quantification. The obtained results immediately lead to optimal contraction and coverage properties for many conceivable classes simultaneously. The results are also extended to high-dimensional additive nonparametric regression models. 
DA - 2020/12// PY - 2020/12// DO - 10.1214/19-AOS1845 VL - 48 IS - 6 SP - 3113-3137 SN - 0090-5364 KW - Credible ball KW - coverage KW - empirical Bayes KW - excessive bias restriction KW - oracle rate ER - TY - JOUR TI - Quantitative Trait Loci Associated with Gray Leaf Spot Resistance in St. Augustinegrass AU - Yu, Xingwang AU - Mulkey, Steve E. AU - Zuleta, Maria C. AU - Arellano, Consuelo AU - Ma, Bangya AU - Milla-Lewis, Susana R. T2 - PLANT DISEASE AB - Gray leaf spot (GLS), caused by Magnaporthe grisea, is a major fungal disease of St. Augustinegrass (Stenotaphrum secundatum), causing widespread blighting of the foliage under warm, humid conditions. To identify quantitative trait loci (QTL) controlling GLS resistance, an F1 mapping population consisting of 153 hybrids was developed from crosses between cultivar Raleigh (susceptible parent) and plant introduction PI 410353 (resistant parent). Single-nucleotide polymorphism (SNP) markers generated from genotyping-by-sequencing constituted nine linkage groups for each parental linkage map. The Raleigh map consisted of 2,257 SNP markers and spanned 916.63 centimorgans (cM), while the PI 410353 map comprised 511 SNP markers and covered 804.27 cM. GLS resistance was evaluated under controlled environmental conditions with measurements of final disease incidence and lesion length. Additionally, two derived traits, area under the disease progress curve and area under the lesion expansion curve, were calculated for QTL analysis. Twenty QTL were identified as being associated with these GLS resistance traits, which explained 7.6 to 37.2% of the total phenotypic variation. Three potential GLS QTL “hotspots” were identified on two linkage groups: P2 (106.26 to 110.36 cM and 113.15 to 116.67 cM) and P5 (17.74 to 19.28 cM). The two major effect QTL glsp2.3 and glsp5.2 together reduced 20.2% of disease incidence in this study. 
Sequence analysis showed that two candidate genes encoding β-1,3-glucanases were found in the intervals of two QTL, which might function in GLS resistance response. These QTL and linked markers can be potentially used to assist the transfer of GLS resistance genes to elite St. Augustinegrass breeding lines. DA - 2020/11// PY - 2020/11// DO - 10.1094/PDIS-04-20-0905-RE VL - 104 IS - 11 SP - 2799-2806 SN - 1943-7692 KW - gray leaf spot KW - Magnaporthe grisea KW - quantitative trait loci KW - St. Augustinegrass ER - TY - JOUR TI - Goodness-of-fit test for skew normality based on energy statistics AU - Opperman, Logan AU - Ning, Wei T2 - RANDOM OPERATORS AND STOCHASTIC EQUATIONS AB - In this paper, we propose a goodness-of-fit test based on the energy statistic for skew normality. Simulations indicate that the Type-I error of the proposed test can be controlled reasonably well for given nominal levels. Power comparisons to other existing methods under different settings show the advantage of the proposed test. Such a test is applied to two real data sets to illustrate the testing procedure. DA - 2020/9// PY - 2020/9// DO - 10.1515/rose-2020-2042 VL - 28 IS - 3 SP - 227-236 SN - 1569-397X KW - Goodness-of-fit test KW - energy statistic KW - skew normal distribution KW - skew normality ER - TY - PCOMM TI - Questioning Existing Cancer Hazard Evaluation Standards in the Name of Statistics AU - Rusyn, Ivan AU - Chiu, Weihsueh A. AU - Wright, Fred A. 
DA - 2020/10// PY - 2020/10// DO - 10.1093/toxsci/kfaa077 SP - 521-522 ER - TY - JOUR TI - Integrative Analysis of Gene-Specific DNA Methylation and Untargeted Metabolomics Data from the ELEMENT Cohort AU - Goodrich, Jaclyn M AU - Hector, Emily C AU - Tang, Lu AU - LaBarre, Jennifer L AU - Dolinoy, Dana C AU - Mercado-Garcia, Adriana AU - Cantoral, Alejandra AU - Song, Peter XK AU - Téllez-Rojo, Martha Maria AU - Peterson, Karen E T2 - Epigenetics Insights AB - Epigenetic modifications, such as DNA methylation, influence gene expression and cardiometabolic phenotypes that are manifest in developmental periods in later life, including adolescence. Untargeted metabolomics analyses provide a comprehensive snapshot of physiological processes and metabolism and have been related to DNA methylation in adults, offering insights into the regulatory networks that influence cellular processes. We analyzed the cross-sectional correlation of blood leukocyte DNA methylation with 3758 serum metabolite features (574 of which are identifiable) in 238 children (ages 8-14 years) from the Early Life Exposures in Mexico to Environmental Toxicants (ELEMENT) study. Associations between these features and percent DNA methylation in adolescent blood leukocytes at LINE-1 repetitive elements and genes that regulate early life growth (IGF2, H19, HSD11B2) were assessed by mixed effects models, adjusting for sex, age, and puberty status. After false discovery rate correction (FDR q < 0.05), 76 metabolites were significantly associated with LINE-1 DNA methylation, 27 with HSD11B2, 103 with H19, and 4 with IGF2. The ten identifiable metabolites included dicarboxylic fatty acids (five associated with LINE-1 or H19 methylation at q < 0.05) and 1-octadecanoyl-rac-glycerol (q < 0.0001 for association with H19 and q = 0.04 for association with LINE-1). We then assessed the association between these ten known metabolites and adiposity 3 years later. 
Two metabolites, dicarboxylic fatty acid 17:3 and 5-oxo-7-octenoic acid, were inversely associated with measures of adiposity (P < .05) assessed approximately 3 years later in adolescence. In stratified analyses, sex-specific and puberty-stage specific (Tanner stage = 2 to 5 vs Tanner stage = 1) associations were observed. Most notably, hundreds of statistically significant associations were observed between H19 and LINE-1 DNA methylation and metabolites among children who had initiated puberty. Understanding relationships between subclinical molecular biomarkers (DNA methylation and metabolites) may increase our understanding of genes and biological pathways contributing to metabolic changes that underlie the development of adiposity during adolescence. DA - 2020/1// PY - 2020/1// DO - 10.1177/2516865720977888 UR - https://doi.org/10.1177/2516865720977888 KW - Metabolic programming KW - epigenetics KW - DNA methylation KW - IGF2 KW - H19 KW - HSD11B2 KW - LINE-1 KW - adolescence KW - biomarkers KW - adiposity KW - children's health ER - TY - JOUR TI - Multiway Graph Signal Processing on Tensors: Integrative Analysis of Irregular Geometries AU - Stanley, Jay S., III AU - Chi, Eric C. AU - Mishne, Gal T2 - IEEE SIGNAL PROCESSING MAGAZINE AB - Graph signal processing (GSP) is an important methodology for studying data residing on irregular structures. As acquired data is increasingly taking the form of multi-way tensors, new signal processing tools are needed to maximally utilize the multi-way structure within the data. In this paper, we review modern signal processing frameworks generalizing GSP to multi-way data, starting from graph signals coupled to familiar regular axes such as time in sensor networks, and then extending to general graphs across all tensor modes. This widely applicable paradigm motivates reformulating and improving upon classical problems and approaches to creatively address the challenges in tensor-based data. 
We synthesize common themes arising from current efforts to combine GSP with tensor analysis and highlight future directions in extending GSP to the multi-way paradigm. DA - 2020/11// PY - 2020/11// DO - 10.1109/MSP.2020.3013555 VL - 37 IS - 6 SP - 160-173 SN - 1558-0792 KW - Tensors KW - Signal processing KW - Two dimensional displays KW - Geometry KW - Discrete Fourier transforms KW - Graphical models KW - Laplace equations ER - TY - JOUR TI - High-Dimensional Precision Medicine From Patient-Derived Xenografts AU - Rashid, Naim U. AU - Luckett, Daniel J. AU - Chen, Jingxiang AU - Lawson, Michael T. AU - Wang, Longshaokan AU - Zhang, Yunshu AU - Laber, Eric B. AU - Liu, Yufeng AU - Yeh, Jen Jen AU - Zeng, Donglin AU - Kosorok, Michael R. T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - The complexity of human cancer often results in significant heterogeneity in response to treatment. Precision medicine offers the potential to improve patient outcomes by leveraging this heterogeneity. Individualized treatment rules (ITRs) formalize precision medicine as maps from the patient covariate space into the space of allowable treatments. The optimal ITR is that which maximizes the mean of a clinical outcome in a population of interest. Patient-derived xenograft (PDX) studies permit the evaluation of multiple treatments within a single tumor, and thus are ideally suited for estimating optimal ITRs. PDX data are characterized by correlated outcomes, a high-dimensional feature space, and a large number of treatments. Here we explore machine learning methods for estimating optimal ITRs from PDX data. We analyze data from a large PDX study to identify biomarkers that are informative for developing personalized treatment recommendations in multiple cancers. We estimate optimal ITRs using regression-based (Q-learning) and direct-search methods (outcome weighted learning). 
Finally, we implement a superlearner approach to combine multiple estimated ITRs and show that the resulting ITR performs better than any of the input ITRs, mitigating uncertainty regarding user choice. Our results indicate that PDX data are a valuable resource for developing individualized treatment strategies in oncology. Supplementary materials for this article are available online. DA - 2020/11/5/ PY - 2020/11/5/ DO - 10.1080/01621459.2020.1828091 VL - 116 IS - 535 SP - 1140-1154 SN - 1537-274X KW - Biomarkers KW - Deep learning autoencoders KW - Machine learning KW - Outcome weighted learning KW - Precision medicine KW - Q-learning ER - TY - JOUR TI - Assembly of whole-chromosome pseudomolecules for polyploid plant genomes using outbred mapping populations AU - Zhou, Chenxi AU - Olukolu, Bode AU - Gemenet, Dorcus C. AU - Wu, Shan AU - Gruneberg, Wolfgang AU - Cao, Minh Duc AU - Fei, Zhangjun AU - Zeng, Zhao-Bang AU - George, Andrew W. AU - Khan, Awais AU - Yencho, G. Craig AU - Coin, Lachlan J. M. T2 - NATURE GENETICS DA - 2020/11// PY - 2020/11// DO - 10.1038/s41588-020-00717-7 VL - 52 IS - 11 SP - 1256-+ SN - 1546-1718 ER - TY - JOUR TI - LEPTOSPIRA, PARVOVIRUS, AND TOXOPLASMA IN THE NORTH AMERICAN RIVER OTTER (LONTRA CANADENSIS) IN NORTH CAROLINA, USA AU - Sanders, Charles W., II AU - Olfenbuttel, Colleen AU - Pacifici, Krishna AU - Hess, George R. AU - Livingston, Robert S. AU - DePerno, Christopher S. T2 - JOURNAL OF WILDLIFE DISEASES AB - The North American river otter (Lontra canadensis) is the largest mustelid in North Carolina, US, and was once extirpated from the central and western portions of the state. Over time and after a successful reintroduction project, otters are now abundant and occur throughout North Carolina. 
However, there is a concern that diseases may have an impact on the otter population, as well as on other aquatic mammals, either through exposure to emerging diseases, contact with domestic animals such as domestic cats (Felis catus), or less robust condition of individuals through declines in water quality. We tested brain and kidney tissue from harvested otters for the pathogens that cause leptospirosis, parvovirus, and toxoplasmosis. Leptospirosis and toxoplasmosis are priority zoonoses and are maintained by domestic and wild mammals. Although parvovirus is not zoonotic, it does affect pets, causing mild to fatal symptoms. Across the 2014–15 and 2015–16 trapping seasons, we tested 220 otters (76 females, 144 males) using real-time PCR for Leptospira interrogans, parvovirus, and Toxoplasma gondii. Of the otters tested, 1% (3/220) were positive for L. interrogans, 19% (41/220) were positive for parvovirus, and 24% (53/220) were positive for T. gondii. Although the pathogens for parvovirus and toxoplasmosis are relatively common in North Carolina otters, the otter harvest has remained steady and the population appears to be abundant and self-sustaining. Therefore, parvovirus and toxoplasmosis do not currently appear to be negatively impacting the population. However, subsequent research should examine transmission parameters between domestic and wild species and the sublethal effects of infection. 
DA - 2020/10// PY - 2020/10// DO - 10.7589/2019-05-129 VL - 56 IS - 4 SP - 791-802 SN - 1943-3700 KW - Disease KW - leptospirosis KW - Lontra canadensis KW - North Carolina KW - otter KW - parvovirus KW - toxoplasmosis ER - TY - JOUR TI - Exploring the Limits of Combined Image/'omics Analysis for Non-cancer Histological Phenotypes AU - Gallins, Paul AU - Saghapour, Ehsan AU - Zhou, Yi-Hui T2 - FRONTIERS IN GENETICS AB - The last several years have witnessed an explosion of methods and applications for combining image data with 'omics data, and for prediction of clinical phenotypes. Much of this research has focused on cancer histology, for which genetic perturbations are large, and the signal to noise ratio is high. Related research on chronic, complex diseases is limited by tissue sample availability, lower genomic signal strength, and the less extreme and tissue-specific nature of intermediate histological phenotypes. Data from the GTEx Consortium provides a unique opportunity to investigate the connection among phenotypic histological variation, imaging data, and 'omics profiling, from multiple tissue-specific phenotypes at the sub-clinical level. Investigating histological designations in multiple tissues, we survey the evidence for genomic association and prediction of histology, and use the results to test the limits of prediction accuracy using machine learning methods applied to the imaging data, genomics data, and their combination. We find that expression data has similar or superior accuracy for pathology prediction as our use of imaging data. A variety of machine learning methods have similar performance, while network embedding methods offer at best limited improvements. These observations hold across a range of tissues and predictor types. The results are supportive of the use of genomic measurements in the same target tissue in which pathological phenotyping has been performed, which to our knowledge is a novel finding. 
Even while prediction accuracy remains a challenge, the results show clear evidence of pathway and tissue-specific biology. DA - 2020/10/23/ PY - 2020/10/23/ DO - 10.3389/fgene.2020.555886 VL - 11 SP - SN - 1664-8021 KW - imaging KW - genomics KW - pathology KW - prediction KW - integration KW - histology KW - machine learning KW - embedding ER - TY - JOUR TI - The GTEx Consortium atlas of genetic regulatory effects across human tissues AU - Aguet, Francois AU - Barbeira, Alvaro N. AU - Bonazzola, Rodrigo AU - Brown, Andrew AU - Castel, Stephane E. AU - Jo, Brian AU - Kasela, Silva AU - Kim-Hellmuth, Sarah AU - Liang, Yanyu AU - Parsana, Princy AU - Flynn, Elise AU - Fresard, Laure AU - Gamazon, Eric R. AU - Hamel, Andrew R. AU - He, Yuan AU - Hormozdiari, Farhad AU - Mohammadi, Pejman AU - Munoz-Aguirre, Manuel AU - Ardlie, Kristin G. AU - Battle, Alexis AU - Bonazzola, Rodrigo AU - Brown, Christopher D. AU - Cox, Nancy AU - Dermitzakis, Emmanouil T. AU - Engelhardt, Barbara E. AU - Garrido-Martin, Diego AU - Gay, Nicole R. AU - Getz, Gad AU - Guigo, Roderic AU - Hamel, Andrew R. AU - Handsaker, Robert E. AU - He, Yuan AU - Hoffman, Paul J. AU - Hormozdiari, Farhad AU - Im, Hae Kyung AU - Jo, Brian AU - Kasela, Silva AU - Kashin, Seva AU - Kim-Hellmuth, Sarah AU - Kwong, Alan AU - Lappalainen, Tuuli AU - Li, Xiao AU - Liang, Yanyu AU - MacArthur, Daniel G. AU - Mohammadi, Pejman AU - Montgomery, Stephen B. AU - Munoz-Aguirre, Manuel AU - Rouhana, John M. AU - Hormozdiari, Farhad AU - Im, Hae Kyung AU - Kim-Hellmuth, Sarah AU - Ardlie, Kristin G. AU - Getz, Gad AU - Guigo, Roderic AU - Im, Hae Kyung AU - Lappalainen, Tuuli AU - Montgomery, Stephen B. AU - Im, Hae Kyung AU - Lappalainen, Tuuli AU - Lappalainen, Tuuli AU - Anand, Shankara AU - Gabriel, Stacey AU - Getz, Gad AU - Graubert, Aaron AU - Hadley, Kane AU - Handsaker, Robert E. AU - Huang, Katherine H. AU - Kashin, Seva AU - Li, Xiao AU - MacArthur, Daniel G. AU - Meier, Samuel R. AU - Nedzel, Jared L. 
AU - Balliu, Brunilda AU - Conrad, Don AU - Cotter, Daniel J. AU - Das, Sayantan AU - Goede, Olivia M. AU - Eskin, Eleazar AU - Eulalio, Tiffany Y. AU - Ferraro, Nicole M. AU - Garrido-Martin, Diego AU - Gay, Nicole R. AU - Getz, Gad AU - Graubert, Aaron AU - Guigo, Roderic AU - Hadley, Kane AU - Hamel, Andrew R. AU - Handsaker, Robert E. AU - He, Yuan AU - Hoffman, Paul J. AU - Hormozdiari, Farhad AU - Hou, Lei AU - Huang, Katherine H. AU - Im, Hae Kyung AU - Jo, Brian AU - Kasela, Silva AU - Kashin, Seva AU - Kellis, Manolis AU - Kim-Hellmuth, Sarah AU - Kwong, Alan AU - Lappalainen, Tuuli AU - Li, Xiao AU - Li, Xin AU - Liang, Yanyu AU - MacArthur, Daniel G. AU - Mangul, Serghei AU - Meier, Samuel R. AU - Mohammadi, Pejman AU - Montgomery, Stephen B. AU - Munoz-Aguirre, Manuel AU - Nachun, Daniel C. AU - Nedzel, Jared L. AU - Nguyen, Duyen Y. AU - Nobel, Andrew B. AU - Park, YoSon AU - Reverter, Ferran AU - Sabatti, Chiara AU - Saha, Ashis AU - Segre, Ayellet V AU - Stephens, Matthew AU - Strober, Benjamin J. AU - Teran, Nicole A. AU - Todres, Ellen AU - Vinuela, Ana AU - Wang, Gao AU - Wen, Xiaoquan AU - Wright, Fred AU - Wucher, Valentin AU - Zou, Yuxin AU - Ferreira, Pedro G. AU - Li, Gen AU - Mele, Marta AU - Yeger-Lotem, Esti AU - Barcus, Mary E. AU - Bradbury, Debra AU - Krubit, Tanya AU - McLean, Jeffrey A. AU - Qi, Liqun AU - Robinson, Karna AU - Roche, Nancy V AU - Smith, Anna M. AU - Tabor, David E. AU - Undale, Anita AU - Bridge, Jason AU - Brigham, Lori E. AU - Foster, Barbara A. AU - Gillard, Bryan M. AU - Hasz, Richard AU - Hunter, Marcus AU - Johns, Christopher AU - Johnson, Mark AU - Karasik, Ellen AU - Kopen, Gene AU - Leinweber, William F. AU - McDonald, Alisa AU - Moser, Michael T. AU - Myer, Kevin AU - Ramsey, Kimberley D. AU - Roe, Brian AU - Shad, Saboor AU - Thomas, Jeffrey A. AU - Walters, Gary AU - Washington, Michael AU - Wheeler, Joseph AU - Jewell, Scott D. AU - Rohrer, Daniel C. AU - Valley, Dana R. AU - Davis, David A. 
AU - Mash, Deborah C. AU - Branton, Philip A. AU - Sobin, Leslie AU - Barker, Laura K. AU - Gardiner, Heather M. AU - Mosavel, Maghboeba AU - Siminoff, Laura A. AU - Flicek, Paul AU - Haeussler, Maximilian AU - Juettemann, Thomas AU - Kent, W. James AU - Lee, Christopher M. AU - Powell, Conner C. AU - Rosenbloom, Kate R. AU - Ruffier, Magali AU - Sheppard, Dan AU - Taylor, Kieron AU - Trevanion, Stephen J. AU - Zerbino, Daniel R. AU - Abell, Nathan S. AU - Akey, Joshua AU - Chen, Lin AU - Demanelis, Kathryn AU - Doherty, Jennifer A. AU - Feinberg, Andrew P. AU - Hansen, Kasper D. AU - Hickey, Peter F. AU - Hou, Lei AU - Jasmine, Farzana AU - Jiang, Lihua AU - Kaul, Rajinder AU - Kellis, Manolis AU - Kibriya, Muhammad G. AU - Li, Jin Billy AU - Li, Qin AU - Lin, Shin AU - Linder, Sandra E. AU - Montgomery, Stephen B. AU - Oliva, Meritxell AU - Park, Yongjin AU - Pierce, Brandon L. AU - Rizzardi, Lindsay F. AU - Skol, Andrew D. AU - Smith, Kevin S. AU - Snyder, Michael AU - Stamatoyannopoulos, John AU - Tang, Hua AU - Wang, Meng AU - Carithers, Latarsha J. AU - Guan, Ping AU - Koester, Susan E. AU - Little, A. Roger AU - Moore, Helen M. AU - Nierras, Concepcion R. AU - Rao, Abhi K. AU - Vaught, Jimmie B. AU - Volpi, Simona T2 - SCIENCE AB - The Genotype-Tissue Expression (GTEx) project was established to characterize genetic effects on the transcriptome across human tissues and to link these regulatory mechanisms to trait and disease associations. Here, we present analyses of the version 8 data, examining 15,201 RNA-sequencing samples from 49 tissues of 838 postmortem donors. We comprehensively characterize genetic associations for gene expression and splicing in cis and trans, showing that regulatory associations are found for almost all genes, and describe the underlying molecular mechanisms and their contribution to allelic heterogeneity and pleiotropy of complex traits. 
Leveraging the large diversity of tissues, we provide insights into the tissue specificity of genetic effects and show that cell type composition is a key factor in understanding gene regulatory mechanisms in human tissues. DA - 2020/9/11/ PY - 2020/9/11/ DO - 10.1126/science.aaz1776 VL - 369 IS - 6509 SP - 1318-1330 SN - 1095-9203 ER - TY - JOUR TI - Classification of estrogenic compounds by coupling high content analysis and machine learning algorithms AU - Mukherjee, Rajib AU - Beykal, Burcu AU - Szafran, Adam T. AU - Onel, Melis AU - Stossi, Fabio AU - Mancini, Maureen G. AU - Lloyd, Dillon AU - Wright, Fred A. AU - Zhou, Lan AU - Mancini, Michael A. AU - Pistikopoulos, Efstratios N. T2 - PLOS COMPUTATIONAL BIOLOGY AB - Environmental toxicants affect human health in various ways. Of the thousands of chemicals present in the environment, those with adverse effects on the endocrine system are referred to as endocrine-disrupting chemicals (EDCs). Here, we focused on a subclass of EDCs that impacts the estrogen receptor (ER), a pivotal transcriptional regulator in health and disease. Estrogenic activity of compounds can be measured by many in vitro or cell-based high throughput assays that record various endpoints from large pools of cells, and increasingly at the single-cell level. To simultaneously capture multiple mechanistic ER endpoints in individual cells that are affected by EDCs, we previously developed a sensitive high throughput/high content imaging assay that is based upon a stable cell line harboring a visible multicopy ER responsive transcription unit and expressing a green fluorescent protein (GFP) fusion of ER. High content analysis generates voluminous multiplex data comprised of minable features that describe numerous mechanistic endpoints. 
In this study, we present a machine learning pipeline for rapid, accurate, and sensitive assessment of the endocrine-disrupting potential of benchmark chemicals based on data generated from high content analysis. The multidimensional imaging data was used to train a classification model to ultimately predict the impact of unknown compounds on the ER, either as agonists or antagonists. To this end, both linear logistic regression and nonlinear Random Forest classifiers were benchmarked and evaluated for predicting the estrogenic activity of unknown compounds. Furthermore, through feature selection, data visualization, and model discrimination, the most informative features were identified for the classification of ER agonists/antagonists. The results of this data-driven study showed that highly accurate and generalized classification models with a minimum number of features can be constructed without loss of generality, where these machine learning models serve as a means for rapid mechanistic/phenotypic evaluation of the estrogenic potential of many chemicals. DA - 2020/9// PY - 2020/9// DO - 10.1371/journal.pcbi.1008191 VL - 16 IS - 9 SP - SN - 1553-7358 ER - TY - JOUR TI - MODELING AND ESTIMATION OF CONTAGION-BASED SOCIAL NETWORK DEPENDENCE WITH TIME-TO-EVENT DATA AU - Yu, Lin AU - Lu, Wenbin AU - Huang, Danyang T2 - STATISTICA SINICA DA - 2020/10// PY - 2020/10// DO - 10.5705/ss.202018.0222 VL - 30 IS - 4 SP - 2051-2074 SN - 1996-8507 KW - Contagion-based social correlation KW - generalized linear transformation model KW - nonparametric maximum likelihood estimation KW - social network KW - time-to-event data ER - TY - JOUR TI - SPARSE BAYESIAN ADDITIVE NONPARAMETRIC REGRESSION WITH APPLICATION TO HEALTH EFFECTS OF PESTICIDES MIXTURES AU - Wei, Ran AU - Reich, Brian J. AU - Hoppin, Jane A. 
AU - Ghosal, Subhashis T2 - STATISTICA SINICA DA - 2020/1// PY - 2020/1// DO - 10.5705/ss.202017.0315 VL - 30 IS - 1 SP - 55-79 SN - 1996-8507 KW - Additive nonparametric regression KW - Bayesian variable selection KW - continuous shrinkage prior KW - environmental epidemiology KW - posterior consistency ER - TY - JOUR TI - OPTIMAL EMG PLACEMENT FOR A ROBOTIC PROSTHESIS CONTROLLER WITH SEQUENTIAL, ADAPTIVE FUNCTIONAL ESTIMATION (SAFE) AU - Stallrich, Jonathan AU - Islam, Md Nazmul AU - Staicu, Ana-Maria AU - Crouch, Dustin AU - Pan, Lizhi AU - Huang, He T2 - ANNALS OF APPLIED STATISTICS AB - Robotic hand prostheses require a controller to decode muscle contraction information, such as electromyogram (EMG) signals, into the user’s desired hand movement. State-of-the-art decoders demand extensive training, require data from a large number of EMG sensors and are prone to poor predictions. Biomechanical models of a single movement degree-of-freedom tell us that relatively few muscles, and, hence, fewer EMG sensors are needed to predict movement. We propose a novel decoder based on a dynamic, functional linear model with velocity or acceleration as its response and the recent past EMG signals as functional covariates. The effect of each EMG signal varies with the recent position to account for biomechanical features of hand movement, increasing the predictive capability of a single EMG signal compared to existing decoders. The effects are estimated with a multistage, adaptive estimation procedure that we call Sequential Adaptive Functional Estimation (SAFE). Starting with 16 potential EMG sensors, our method correctly identifies the few EMG signals that are known to be important for an able-bodied subject. Furthermore, the estimated effects are interpretable and can significantly improve understanding and development of robotic hand prostheses. 
DA - 2020/9// PY - 2020/9// DO - 10.1214/20-AOAS1324 VL - 14 IS - 3 SP - 1164-1181 SN - 1932-6157 KW - Electromyography signal KW - varying functional regression KW - functional variable selection KW - adaptive group LASSO KW - correlated functional predictors KW - sequential adaptive functional estimation ER - TY - JOUR TI - Synonymous Site-to-Site Substitution Rate Variation Dramatically Inflates False Positive Rates of Selection Analyses: Ignore at Your Own Peril AU - Wisotsky, Sadie R. AU - Pond, Sergei L. Kosakovsky AU - Shank, Stephen D. AU - Muse, Spencer V. T2 - MOLECULAR BIOLOGY AND EVOLUTION AB - Most molecular evolutionary studies of natural selection maintain the decades-old assumption that synonymous substitution rate variation (SRV) across sites within genes occurs at levels that are either nonexistent or negligible. However, numerous studies challenge this assumption from a biological perspective and show that SRV is comparable in magnitude to that of nonsynonymous substitution rate variation. We evaluated the impact of this assumption on methods for inferring selection at the molecular level by incorporating SRV into an existing method (BUSTED) for detecting signatures of episodic diversifying selection in genes. Using simulated data we found that failing to account for even moderate levels of SRV in selection testing is likely to produce intolerably high false positive rates. To evaluate the effect of the SRV assumption on actual inferences we compared results of tests with and without the assumption in an empirical analysis of over 13,000 Euteleostomi (bony vertebrate) gene alignments from the Selectome database. This exercise reveals that close to 50% of positive results (i.e., evidence for selection) in empirical analyses disappear when SRV is modeled as part of the statistical analysis and are thus candidates for being false positives. 
The results from this work add to a growing literature establishing that tests of selection are much more sensitive to certain model assumptions than previously believed. DA - 2020/8// PY - 2020/8// DO - 10.1093/molbev/msaa037 VL - 37 IS - 8 SP - 2430-2439 SN - 1537-1719 KW - evolutionary model KW - synonymous rate variation KW - codon model KW - episodic selection ER - TY - JOUR TI - Transitioning Machine Learning from Theory to Practice in Natural Resources Management AU - Saia, Sheila M. AU - Nelson, Natalie AU - Huseth, Anders S. AU - Grieger, Khara AU - Reich, Brian J. T2 - ECOLOGICAL MODELLING DA - 2020/11/1/ PY - 2020/11/1/ DO - 10.1016/j.ecolmodel.2020.109257 VL - 435 SP - SN - 1872-7026 KW - Machine learning KW - Natural resources management KW - Stakeholders KW - Decision-support tools KW - Decision-making KW - Process-based modeling ER - TY - JOUR TI - On empirical estimation of mode based on weakly dependent samples AU - Liu, Bowen AU - Ghosh, Sujit K. T2 - COMPUTATIONAL STATISTICS & DATA ANALYSIS AB - Given a large sample of observations from an unknown univariate continuous distribution, it is often of interest to empirically estimate the global mode of the underlying density. Applications include samples obtained by Monte Carlo methods with independent observations, or Markov Chain Monte Carlo methods with weakly dependent samples from the underlying stationary density. In either case, often the generating density is not available in closed form and only empirical determination of the mode is possible. Assuming that the generating density has a unique global mode, a non-parametric estimate of the density is proposed based on a sequence of mixtures of Beta densities which allows for the estimation of the mode even when the mode is possibly located on the boundary of the support of the density. Furthermore, the estimated mode is shown to be strongly universally consistent under a set of mild regularity conditions. 
The proposed method is compared with other empirical estimates of the mode based on popular kernel density estimates. Numerical results based on extensive simulation studies show benefits of the proposed methods in terms of empirical bias, standard errors and computation time. An R package implementing the method is also made available online. DA - 2020/12// PY - 2020/12// DO - 10.1016/j.csda.2020.107046 VL - 152 SP - SN - 1872-7352 UR - https://doi.org/10.1016/j.csda.2020.107046 KW - Bernstein polynomials KW - Empirical mode estimator KW - Strong consistency KW - Anderson-Darling test ER - TY - JOUR TI - Stickiness of rental rate and housing vacancy rate AU - Wang, Haoyu T2 - ECONOMICS LETTERS AB - We study the stickiness in house rent by examining rent under two settings: rent is determined solely by landlord and rent is set by Nash bargaining between landlord and tenant. Our results show that, under Nash bargaining, the vacancy rate is able to restrain the growth of rents and hence generates the stickiness feature in rents. DA - 2020/10// PY - 2020/10// DO - 10.1016/j.econlet.2020.109487 VL - 195 SP - SN - 1873-7374 KW - Sticky rent KW - Nash bargaining KW - Business cycle ER - TY - JOUR TI - Evaluation of a Stepped-Care eHealth HIV Prevention Program for Diverse Adolescent Men Who Have Sex With Men: Protocol for a Hybrid Type 1 Effectiveness Implementation Trial of SMART AU - Mustanski, Brian AU - Moskowitz, David A. AU - Moran, Kevin O. AU - Newcomb, Michael E. AU - Macapagal, Kathryn AU - Rodriguez-Diaz, Carlos AU - Rendina, H. Jonathon AU - Laber, Eric B. AU - Li, Dennis H. AU - Matson, Margaret AU - Talan, Ali J. AU - Cabral, Cynthia T2 - JMIR RESEARCH PROTOCOLS AB - Background Adolescent men who have sex with men (AMSM), aged 13 to 18 years, account for more than 80% of teen HIV occurrences. Despite this disproportionate burden, there is a conspicuous lack of evidence-based HIV prevention programs. 
Implementation issues are critical as traditional HIV prevention delivery channels (eg, community-based organizations, schools) have significant access limitations for AMSM. As such, eHealth interventions, such as our proposed SMART program, represent an excellent modality for delivering AMSM-specific intervention material where youth are. Objective This randomized trial aimed to test the effectiveness of the SMART program in reducing condom-less anal sex and increasing condom self-efficacy, condom use intentions, and HIV testing for AMSM. We also plan to test whether SMART has differential effectiveness across important subgroups of AMSM based on race and ethnicity, urban versus rural residence, age, socioeconomic status, and participation in an English versus a Spanish version of SMART. Methods Using a sequential multiple assignment randomized trial design, we will evaluate the impact of a stepped-care package of increasingly intensive eHealth interventions (ie, the universal, information-based SMART Sex Ed; the more intensive, selective SMART Squad; and a higher cost, indicated SMART Sessions). All intervention content is available in English and Spanish. Participants are recruited primarily from social media sources using paid and unpaid advertisements. Results The trial has enrolled 1285 AMSM aged 13 to 18 years, with a target enrollment of 1878. Recruitment concluded in June 2020. Participants were recruited from 49 US states as well as Puerto Rico and the District of Columbia. Assessments of intervention outcomes at 3, 6, 9, and 12 months are ongoing. Conclusions SMART is the first web-based program for AMSM to take a stepped-care approach to sexual education and HIV prevention. This design indicates that SMART delivers resources to all adolescents, but more costly treatments (eg, video chat counseling in SMART Sessions) are conserved for individuals who need them the most. 
SMART has the potential to reach AMSM to provide them with a sex-positive curriculum that empowers them with the information, motivation, and skills to make better health choices. Trial Registration ClinicalTrials.gov Identifier NCT03511131; https://clinicaltrials.gov/ct2/show/NCT03511131 International Registered Report Identifier (IRRID) DERR1-10.2196/19701 DA - 2020/8// PY - 2020/8// DO - 10.2196/19701 VL - 9 IS - 8 SP - SN - 1929-0748 KW - HIV prevention KW - eHealth KW - adolescents KW - men who have sex with men KW - implementation science KW - mobile phone ER - TY - RPRT TI - Statistical data integration in survey sampling: a review AU - Yang, S. AU - Kim, J.K. DA - 2020/1/9/ PY - 2020/1/9/ UR - https://arxiv.org/abs/2001.03259 ER - TY - RPRT TI - Double score matching estimators of average and quantile treatment effects AU - Yang, S. AU - Zhang, Y. DA - 2020/// PY - 2020/// UR - https://arxiv.org/abs/2001.06049 ER - TY - RPRT TI - Estimating Average Treatment Effects Utilizing Fractional Imputation when Confounders are Subject to Missingness AU - Corder, N. AU - Yang, S. DA - 2020/// PY - 2020/// UR - https://arxiv.org/pdf/1905.11497 ER - TY - RPRT TI - Integrative analysis of randomized clinical trials with real world evidence studies AU - Dong, L. AU - Yang, S. AU - Wang, X. AU - Zeng, D. AU - Cai, J.W. DA - 2020/// PY - 2020/// UR - https://arxiv.org/pdf/2003.01242 ER - TY - CHAP TI - Hierarchical continuous time hidden Markov model, with application in zero-inflated accelerometer data AU - Xu, Z. AU - Laber, E.B. AU - Staicu, A. T2 - Statistical Modeling for Biomedical Research: Contemporary Topics and Voices in the Field A2 - Zhao, Y. A2 - Chen, D.G T3 - Emerging Topics of Statistics and Biostatistics Book Series AB - Wearable devices including accelerometers are increasingly being used to collect high-frequency human activity data in situ. There is tremendous potential to use such data to inform medical decision making and public health policies. 
However, modeling such data is challenging as they are high-dimensional, heterogeneous, and subject to informative missingness, e.g., zero readings when the device is removed by the participant. We propose a flexible and extensible continuous-time hidden Markov model to extract meaningful activity patterns from human accelerometer data. To facilitate estimation with massive data we derive an efficient learning algorithm that exploits the hierarchical structure of the parameters indexing the proposed model. We also propose a bootstrap procedure for interval estimation. The proposed methods are illustrated using data from the 2003 - 2004 and 2005 - 2006 National Health and Nutrition Examination Survey. PY - 2020/// DO - 10.1007/978-3-030-33416-1_7 SP - 125-142 PB - Springer SN - 978-3-030-33416-1 ER - TY - JOUR TI - Statistical Inference for Online Decision Making: In a Contextual Bandit Setting AU - Chen, Haoyu AU - Lu, Wenbin AU - Song, Rui T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - Online decision-making problem requires us to make a sequence of decisions based on incremental information. Common solutions often need to learn a reward model of different actions given the contextual information and then maximize the long-term reward. It is meaningful to know if the posited model is reasonable and how the model performs in the asymptotic sense. We study this problem under the setup of the contextual bandit framework with a linear reward model. The ε-greedy policy is adopted to address the classic exploration-and-exploitation dilemma. Using the martingale central limit theorem, we show that the online ordinary least squares estimator of model parameters is asymptotically normal. When the linear model is misspecified, we propose the online weighted least squares estimator using the inverse propensity score weighting and also establish its asymptotic normality. 
Based on the properties of the parameter estimators, we further show that the in-sample inverse propensity weighted value estimator is asymptotically normal. We illustrate our results using simulations and an application to a news article recommendation dataset from Yahoo!. DA - 2020/// PY - 2020/// DO - 10.1080/01621459.2020.1770098 VL - 7 SP - 1-16 UR - http://dx.doi.org/10.1080/01621459.2020.1770098 KW - Epsilon-greedy KW - Inverse propensity weighted estimator KW - Model misspecification KW - Online decision making KW - Statistical inference ER - TY - JOUR TI - Sequencing depth and genotype quality: accuracy and breeding operation considerations for genomic selection applications in autopolyploid crops AU - Gemenet, Dorcus C. AU - Lindqvist-Kreuze, Hannele AU - De Boeck, Bert AU - da Silva Pereira, Guilherme AU - Mollinari, Marcelo AU - Zeng, Zhao-Bang AU - Craig Yencho, G. AU - Campos, Hugo T2 - Theoretical and Applied Genetics AB - Polyploid crop breeders can balance resources between density and sequencing depth, dosage information and fewer highly informative SNPs recommended, non-additive models and QTL advantages on prediction dependent on trait architecture. The autopolyploid nature of potato and sweetpotato ensures a wide range of meiotic configurations and linkage phases leading to complex gene-action and poses problems in genotype data quality and genomic selection analyses. We used a 315-progeny biparental F1 population of hexaploid sweetpotato and a diversity panel of 380 tetraploid potato, genotyped using different platforms to answer the following questions: (i) do polyploid crop breeders need to invest more for additional sequencing depth? (ii) how many markers are required to make selection decisions? (iii) does considering non-additive genetic effects improve predictive ability (PA)? (iv) does considering dosage or quantitative trait loci (QTL) offer significant improvement to PA? 
Our results show that only a small number of highly informative single nucleotide polymorphisms (SNPs; ≤ 1000) are adequate for prediction in the type of populations we analyzed. We also show that considering dosage information and models considering only additive effects had the best PA for most traits, while the comparative advantage of considering non-additive genetic effects and including known QTL in the predictive model depended on trait architecture. We conclude that genomic selection can help accelerate the rate of genetic gains in potato and sweetpotato. However, application of genomic selection should be considered as part of optimizing the entire breeding program. Additionally, since the predictions in the current study are based on single populations, further studies on the effects of haplotype structure and inheritance on PA should be studied in actual multi-generation breeding populations. DA - 2020/9/2/ PY - 2020/9/2/ DO - 10.1007/s00122-020-03673-2 VL - 133 IS - 12 SP - 3345-3363 J2 - Theor Appl Genet LA - en OP - SN - 0040-5752 1432-2242 UR - http://dx.doi.org/10.1007/s00122-020-03673-2 DB - Crossref ER - TY - JOUR TI - Effect of bicyclopyrone herbicide on sweetpotato and Palmer amaranth (Amaranthus palmeri) AU - Lindley, Jennifer J. AU - Jennings, Katherine M. AU - Monks, David W. AU - Chaudhari, Sushila AU - Schultheis, Jonathan R. AU - Waldschmidt, Matthew AU - Brownie, Cavell T2 - WEED TECHNOLOGY AB - Management options are needed to limit sweetpotato yield loss due to weeds. Greenhouse studies were conducted in 2018 in Greensboro, NC, and in the field from 2016 to 2018 in Clinton, NC, to evaluate the effect of bicyclopyrone on sweetpotato and Palmer amaranth (field only). In greenhouse studies, Covington and NC04-531 clones were treated with bicyclopyrone (0, 25, 50, 100, or 150 g ai ha⁻¹) either preplant (PP; i.e., immediately before transplanting) or post-transplant (PT; i.e., on the same day after transplanting). 
Sweetpotato plant injury and stunting increased, and vine length and shoot dry weight decreased with increasing rate of bicyclopyrone regardless of clone or application timing. In field studies, Beauregard (2016) or Covington (2017 and 2018) sweetpotato clones were treated with bicyclopyrone at 50 g ha⁻¹ PP, flumioxazin at 107 g ai ha⁻¹ PP, bicyclopyrone at 50 or 100 g ha⁻¹ PP followed by (fb) S-metolachlor at 800 g ai ha⁻¹ PT, flumioxazin at 107 g ha⁻¹ PP fb S-metolachlor at 800 g ha⁻¹ PT, flumioxazin at 107 g ha⁻¹ PP fb S-metolachlor at 800 g ha⁻¹ PT fb bicyclopyrone at 50 g ha⁻¹ PT-directed, and clomazone at 420 g ai ha⁻¹ PP fb S-metolachlor at 800 g ha⁻¹ PT. Bicyclopyrone PP at 100 g ha⁻¹ fb S-metolachlor PT caused 33% or greater crop stunting and 44% or greater marketable yield reduction compared with the weed-free check in 2016 (Beauregard) and 2017 (Covington). Bicyclopyrone PP at 50 g ha⁻¹ alone or fb S-metolachlor PT resulted in 12% or less injury and similar no. 1 and jumbo yields as the weed-free check in 2 of 3 yr. Injury to Covington from bicyclopyrone PT-directed was 4% or less at 4 or 5 wk after transplanting and marketable yield was similar to that of the weed-free check in 2017 and 2018. DA - 2020/8// PY - 2020/8// DO - 10.1017/wet.2020.13 VL - 34 IS - 4 SP - 552-559 SN - 1550-2740 KW - Greenhouse KW - weed control KW - crop injury KW - interference ER - TY - JOUR TI - A general frequency domain method for assessing spatial covariance structures AU - Van Hala, Matthew AU - Bandyopadhyay, Soutir AU - Lahiri, Soumendra N. AU - Nordman, Daniel J. T2 - BERNOULLI AB - When examining dependence in spatial data, it can be helpful to formally assess spatial covariance structures that may not be parametrically specified or fully model-based. That is, one may wish to test for general features regarding spatial covariance without presupposing any particular, or potentially restrictive, assumptions about the joint data distribution. 
Current methods for testing spatial covariance are often intended for specialized inference scenarios, usually with spatial lattice data. We propose instead a general method for estimation and testing of spatial covariance structure, which is valid for a variety of inference problems (including nonparametric hypotheses) and applies to a large class of spatial sampling designs with irregular data locations. In this setting, spatial statistics have limiting distributions with complex standard errors depending on the intensity of spatial sampling, the distribution of sampling locations, and the process dependence. The proposed method has the advantage of providing valid inference in the frequency domain without estimation of such standard errors, which are often intractable, and without particular distributional assumptions about the data (e.g., Gaussianity). To illustrate, we develop the method for formally testing isotropy and separability in spatial covariance and consider confidence regions for spatial parameters in variogram model fitting. A broad result is also presented to justify the method for application to other potential problems and general scenarios with testing spatial covariance. The approach uses spatial test statistics, based on an extended version of empirical likelihood, having simple chi-square limits for calibrating tests. We demonstrate the proposed method through several numerical studies. DA - 2020/11// PY - 2020/11// DO - 10.3150/19-BEJ1160 VL - 26 IS - 4 SP - 2463-2487 SN - 1573-9759 KW - confidence sets KW - spatial periodogram KW - spatial testing KW - spectral moment conditions KW - stochastic sampling ER - TY - JOUR TI - Asymptotic properties of penalized splines for functional data AU - Xiao, Luo T2 - BERNOULLI AB - Penalized spline methods are popular for functional data analysis but their asymptotic properties have not been established. 
We present a theoretic study of the $L_{2}$ and uniform convergence of penalized splines for estimating the mean and covariance functions of functional data under general settings. The established convergence rates for the mean function estimation are mini-max rate optimal and the rates for the covariance function estimation are comparable to those using other smoothing methods. DA - 2020/11// PY - 2020/11// DO - 10.3150/20-BEJ1209 VL - 26 IS - 4 SP - 2847-2875 SN - 1573-9759 KW - L-2 convergence KW - functional data analysis KW - nonparametric regression KW - penalized splines KW - uniform convergence ER - TY - JOUR TI - Mechanistic model of hormonal contraception AU - Wright, A. Armean AU - Fayad, Ghassan N. AU - Selgrade, James F. AU - Olufsen, Mette S. T2 - PLOS COMPUTATIONAL BIOLOGY AB - Contraceptive drugs intended for family planning are used by the majority of married or in-union women in almost all regions of the world. The two most prevalent types of hormones associated with contraception are synthetic estrogens and progestins. Hormonal based contraceptives contain a dose of a synthetic progesterone (progestin) or a combination of a progestin and a synthetic estrogen. In this study we use mathematical modeling to understand better how these contraceptive paradigms prevent ovulation, special focus is on understanding how changes in dose impact hormonal cycling. To explain this phenomenon, we added two autocrine mechanisms essential to achieve contraception within our previous menstrual cycle models. This new model predicts mean daily blood concentrations of key hormones during a contraceptive state achieved by administering progestins, synthetic estrogens, or a combined treatment. Model outputs are compared with data from two clinical trials: one for a progestin only treatment and one for a combined hormonal treatment. Results show that contraception can be achieved with synthetic estrogen, with progestin, and by combining the two hormones. 
An advantage of the combined treatment is that a contraceptive state can be obtained at a lower dose of each hormone. The model studied here is qualitative in nature, but can be coupled with a pharmacokinetic/pharmacodynamic (PKPD) model providing the ability to fit exogenous inputs to specific bioavailability and affinity. A model of this type may allow insight into a specific drug's effects, which has potential to be useful in the pre-clinical trial stage identifying the lowest dose required to achieve contraception. DA - 2020/6// PY - 2020/6// DO - 10.1371/journal.pcbi.1007848 VL - 16 IS - 6 SP - SN - 1553-7358 ER - TY - JOUR TI - Comparison of the Effectiveness of Online Homework With Handwritten Homework in Electrical and Computer Engineering Classes AU - Trussell, H. Joel AU - Gumpertz, Marcia L. T2 - IEEE TRANSACTIONS ON EDUCATION AB - Contribution: This article compares the predictive performance of the scores on WeBWorK homework (online) with those of standard handwritten homework. The comparison is done across six undergraduate electrical engineering classes where each of the nine instructors have used both homework modalities. Background: Online homework systems have been used for many years, but analysis of their effectiveness is mixed. Previous work has been limited to a small number of classes in a wide variety of disciplines. This article has a larger number of classes and instructors than previous studies. The classes cover many basic topic areas in electrical and computer engineering, so is directly applicable to the audience of these transactions. Research Question: What is the effect of online homework compared to traditional handwritten homework on the performance of the students on the final exams in selected ECE classes? Methodology: Mixed-effects analysis of variance models are used to determine the predictive ability of performance on homework of the two modalities on the performance on the final exams. 
The data are limited to classes where the instructors have taught the class using both modalities. These models incorporate the effect of modalities for each instructor and the effect of the modalities across all classes. Findings: The result is that there is no significant statistical difference in the two modalities to predict final exam scores. This indicates that the advantages of using the automated online system can be obtained with no detrimental effect on the students' learning. DA - 2020/8// PY - 2020/8// DO - 10.1109/TE.2020.2971198 VL - 63 IS - 3 SP - 209-215 SN - 1557-9638 KW - Electronic mail KW - Software KW - Education KW - Standards KW - Electrical engineering KW - Testing KW - Programming KW - Effectiveness KW - handwritten homework KW - online homework KW - statistical analysis KW - traditional homework KW - WeBWorK ER - TY - JOUR TI - Sequential Optimization in Locally Important Dimensions AU - Winkel, Munir A. AU - Stallrich, Jonathan W. AU - Storlie, Curtis B. AU - Reich, Brian T2 - TECHNOMETRICS AB - Optimizing an expensive, black-box function f(·) is challenging when its input space is high-dimensional. Sequential design frameworks first model f(·) with a surrogate function and then optimize an acquisition function to determine input settings to evaluate next. Optimization of both f(·) and the acquisition function benefit from effective dimension reduction. Global variable selection detects and removes input variables that do not affect f(·) across the input space. Further dimension reduction may be possible if we consider local variable selection around the current optimum estimate. We develop a sequential design algorithm called sequential optimization in locally important dimensions (SOLID) that incorporates global and local variable selection to optimize a continuous, differentiable function. 
SOLID performs local variable selection by comparing the surrogate’s predictions in a localized region around the estimated optimum with the p alternative predictions made by removing each input variable. The search space of the acquisition function is further restricted to focus only on the variables that are deemed locally active, leading to greater emphasis on refining the surrogate model in locally active dimensions. A simulation study across multiple test functions and an application to the Sarcos robot dataset show that SOLID outperforms conventional approaches. Supplementary materials for this article are available online. DA - 2020/// PY - 2020/// DO - 10.1080/00401706.2020.1714738 KW - Augmented expected improvement KW - Bayesian analysis KW - Computer experiments KW - Gaussian process KW - Local importance KW - Sequential design ER - TY - JOUR TI - Ascertaining properties of weighting in the estimation of optimal treatment regimes under monotone missingness AU - Dong, Lin AU - Laber, Eric AU - Goldberg, Yair AU - Song, Rui AU - Yang, Shu T2 - STATISTICS IN MEDICINE AB - Dynamic treatment regimes operationalize precision medicine as a sequence of decision rules, one per stage of clinical intervention, that map up‐to‐date patient information to a recommended intervention. An optimal treatment regime maximizes the mean utility when applied to the population of interest. Methods for estimating an optimal treatment regime assume the data to be fully observed, which rarely occurs in practice. A common approach is to first use multiple imputation and then pool the estimators across imputed datasets. However, this approach requires estimating the joint distribution of patient trajectories, which can be high‐dimensional, especially when there are multiple stages of intervention. We examine the application of inverse probability weighted estimating equations as an alternative to multiple imputation in the context of monotonic missingness. 
This approach applies to a broad class of estimators of an optimal treatment regime including both Q‐learning and a generalization of outcome weighted learning. We establish consistency under mild regularity conditions and demonstrate its advantages in finite samples using a series of simulation experiments and an application to a schizophrenia study. DA - 2020/11/10/ PY - 2020/11/10/ DO - 10.1002/sim.8678 VL - 39 IS - 25 SP - 3503-3520 SN - 1097-0258 KW - augmented inverse probability weighting KW - dynamic treatment regimes KW - monotonic coarseness KW - outcome weighted learning KW - Q-learning ER - TY - JOUR TI - A deep learning approach to identify smoke plumes in satellite imagery in near-real time for health risk communication AU - Larsen, Alexandra AU - Hanigan, Ivan AU - Reich, Brian J. AU - Qin, Yi AU - Cope, Martin AU - Morgan, Geoffrey AU - Rappold, Ana G. T2 - JOURNAL OF EXPOSURE SCIENCE AND ENVIRONMENTAL EPIDEMIOLOGY AB - Wildland fire (wildfire; bushfire) pollution contributes to poor air quality, a risk factor for premature death. The frequency and intensity of wildfires are expected to increase; improved tools for estimating exposure to fire smoke are vital. New-generation satellite-based sensors produce high-resolution spectral images, providing real-time information of surface features during wildfire episodes. Because of the vast size of such data, new automated methods for processing information are required. We present a deep fully convolutional neural network (FCN) for predicting fire smoke in satellite imagery in near-real time (NRT). The FCN identifies fire smoke using output from operational smoke identification methods as training data, leveraging validated smoke products in a framework that can be operationalized in NRT. We demonstrate this for a fire episode in Australia; the algorithm is applicable to any geographic region. 
The algorithm has high classification accuracy (99.5% of pixels correctly classified on average) and precision (average intersection over union = 57.6%). The FCN algorithm has high potential as an exposure-assessment tool, capable of providing critical information to fire managers, health and environmental agencies, and the general public to prevent the health risks associated with exposure to hazardous smoke from wildland fires in NRT. DA - 2020/// PY - 2020/// DO - 10.1038/s41370-020-0246-y ER - TY - JOUR TI - Semiparametric estimation of the cure fraction in population-based cancer survival analysis AU - Gu, Ennan AU - Zhang, Jiajia AU - Lu, Wenbin AU - Wang, Lianming AU - Felizzi, Federico T2 - STATISTICS IN MEDICINE AB - With rapid development in medical research, the treatment of diseases including cancer has progressed dramatically and those survivors may die from causes other than the one under study, especially among elderly patients. Motivated by the Surveillance, Epidemiology, and End Results (SEER) female breast cancer study, background mortality is incorporated into the mixture cure proportional hazards (MCPH) model to improve the cure fraction estimation in population‐based cancer studies. Here, that patients are “cured” is defined as when the mortality rate of the individuals in diseased group returns to the same level as that expected in the general population, where the population level mortality is presented by the mortality table of the United States. The semiparametric estimation method based on the EM algorithm for the MCPH model with background mortality (MCPH+BM) is further developed and validated via comprehensive simulation studies. Real data analysis shows that the proposed semiparametric MCPH+BM model may provide more accurate estimation in population‐level cancer study. 
DA - 2020/11/20/ PY - 2020/11/20/ DO - 10.1002/sim.8693 VL - 39 IS - 26 SP - 3787-3805 SN - 1097-0258 KW - Breslow estimator KW - EM algorithm KW - mixture cure model KW - perturbation KW - population-based study KW - semiparametric regression ER - TY - JOUR TI - Estimating the drivers of species distributions with opportunistic data using mediation analysis AU - Huberman, David B. AU - Reich, Brian J. AU - Pacifici, Krishna AU - Collazo, Jaime A. T2 - ECOSPHERE AB - Abstract Ecological occupancy modeling has historically relied on high‐quality, low‐quantity designed‐survey data for estimation and prediction. In recent years, there has been a large increase in the amount of high‐quantity, unknown‐quality opportunistic data. This has motivated research on how best to combine these two data sources in order to optimize inference. Existing methods can be infeasible for large datasets or require opportunistic data to be located where designed‐survey data exist. These methods map species occupancies, motivating a need to properly evaluate covariate effects (e.g., land cover proportion) on their distributions. We describe a spatial estimation method for supplementarily including additional opportunistic data using mediation analysis concepts. The opportunistic data mediate the effect of the covariate on the designed‐survey data response, decomposing it into a direct and indirect effect. A component of the indirect effect can then be quickly estimated via regressing the mediator on the covariate, while the other components are estimated through a spatial occupancy model. The regression step allows for use of large quantities of opportunistic data that can be collected in locations with no designed‐survey data available. Simulation results suggest that the mediated method produces an improvement in relative MSE when the data are of reasonable quality. 
However, when the simulated opportunistic data are poorly correlated with the true spatial process, the standard, unmediated method is still preferable. We also develop a spatiotemporal extension of the method to analyze the effect of deciduous forest land cover on red‐eyed vireo distribution in the southeastern United States and find that including the opportunistic data does not lead to a substantial improvement. Opportunistic data quality remains an important consideration when employing this method, as with other data integration methods. DA - 2020/6// PY - 2020/6// DO - 10.1002/ecs2.3165 VL - 11 IS - 6 SP - SN - 2150-8925 KW - mediation analysis KW - occupancy modeling KW - opportunistic data KW - spatial statistics ER - TY - JOUR TI - GRID: A VARIABLE SELECTION AND STRUCTURE DISCOVERY METHOD FOR HIGH DIMENSIONAL NONPARAMETRIC REGRESSION AU - Giordano, Francesco AU - Lahiri, Soumendra Nath AU - Parrella, Maria Lucia T2 - ANNALS OF STATISTICS AB - We consider nonparametric regression in high dimensions where only a relatively small subset of a large number of variables are relevant and may have nonlinear effects on the response. We develop methods for variable selection, structure discovery and estimation of the true low-dimensional regression function, allowing any degree of interactions among the relevant variables that need not be specified a-priori. The proposed method, called the GRID, combines empirical likelihood based marginal testing with the local linear estimation machinery in a novel way to select the relevant variables. Further, it provides a simple graphical tool for identifying the low dimensional nonlinear structure of the regression function. 
Theoretical results establish consistency of variable selection and structure discovery, and also Oracle risk property of the GRID estimator of the regression function, allowing the dimension $d$ of the covariates to grow with the sample size $n$ at the rate $d=O(n^{a})$ for any $a\in(0,\infty)$ and the number of relevant covariates $r$ to grow at a rate $r=O(n^{\gamma})$ for some $\gamma\in(0,1)$ under some regularity conditions that, in particular, require finiteness of certain absolute moments of the error variables depending on $a$. Finite sample properties of the GRID are investigated in a moderately large simulation study. DA - 2020/6// PY - 2020/6// DO - 10.1214/19-AOS1846 VL - 48 IS - 3 SP - 1848-1874 SN - 0090-5364 KW - Empirical likelihood KW - marginal testing KW - variable selection consistency ER - TY - JOUR TI - ROBUST AND RATE-OPTIMAL GIBBS POSTERIOR INFERENCE ON THE BOUNDARY OF A NOISY IMAGE AU - Syring, Nicholas AU - Martin, Ryan T2 - ANNALS OF STATISTICS AB - Detection of an image boundary when the pixel intensities are measured with noise is an important problem in image segmentation, with numerous applications in medical imaging and engineering. From a statistical point of view, the challenge is that likelihood-based methods require modeling the pixel intensities inside and outside the image boundary, even though these are typically of no practical interest. Since misspecification of the pixel intensity models can negatively affect inference on the image boundary, it would be desirable to avoid this modeling step altogether. Towards this, we develop a robust Gibbs approach that constructs a posterior distribution for the image boundary directly, without modeling the pixel intensities. We prove that, for a suitable prior on the image boundary, the Gibbs posterior concentrates asymptotically at the minimax optimal rate, adaptive to the boundary smoothness. 
Monte Carlo computation of the Gibbs posterior is straightforward, and simulation experiments show that the corresponding inference is more accurate than that based on existing Bayesian methodology. DA - 2020/6// PY - 2020/6// DO - 10.1214/19-AOS1856 VL - 48 IS - 3 SP - 1498-1513 SN - 0090-5364 KW - Adaptation KW - boundary detection KW - likelihood-free inference KW - model misspecification KW - posterior concentration rate ER - TY - JOUR TI - Genetic and environmental risk for lymphoma in boxer dogs AU - Craun, Kaitlyn AU - Ekena, Joanne AU - Sacco, James AU - Jiang, Tao AU - Motsinger-Reif, Alison AU - Trepanier, Lauren A. T2 - JOURNAL OF VETERINARY INTERNAL MEDICINE AB - Non-Hodgkin lymphoma in humans is associated with environmental chemical exposures, and risk is enhanced by genetic variants in glutathione S-transferases (GST) enzymes. We hypothesized that boxer dogs, a breed at risk for lymphoma, would have a higher prevalence of GST variants with predicted low activity, and greater accumulated DNA damage, compared to other breeds. We also hypothesized that lymphoma in boxers would be associated with specific environmental exposures and a higher prevalence of canine GST variants. Fifty-four healthy boxers and 56 age-matched nonboxer controls; 63 boxers with lymphoma and 89 unaffected boxers ≥10 years old. We resequenced variant loci in canine GSTT1, GSTT5, GSTM1, and GSTP1 and compared endogenous DNA damage in peripheral leukocytes of boxers and nonboxers using the comet assay. We also compared GST variants and questionnaire-based environmental exposures in boxers with and without lymphoma. Endogenous DNA damage did not differ between boxers and nonboxers. Boxers with lymphoma were more likely to live within 10 miles of a nuclear power plant and within 2 miles of a chemical supplier or crematorium. 
Lymphoma risk was not modulated by known canine GST variants. Proximity to nuclear power plants, chemical suppliers, and crematoria were significant risk factors for lymphoma in this population of boxers. These results support the hypothesis that aggregate exposures to environmental chemicals and industrial waste may contribute to lymphoma risk in dogs. DA - 2020/9// PY - 2020/9// DO - 10.1111/jvim.15849 VL - 34 IS - 5 SP - 2068-2077 SN - 1939-1676 KW - canine KW - detoxification KW - exposure KW - lymphosarcoma ER - TY - JOUR TI - Posterior contraction and credible sets for filaments of regression functions AU - Li, Wei AU - Ghosal, Subhashis T2 - ELECTRONIC JOURNAL OF STATISTICS AB - A filament consists of local maximizers of a smooth function $f$ when moving in a certain direction. A filamentary structure is an important feature of the shape of an object and is also considered as an important lower dimensional characterization of multivariate data. There have been some recent theoretical studies of filaments in the nonparametric kernel density estimation context. This paper supplements the current literature in two ways. First, we provide a Bayesian approach to the filament estimation in regression context and study the posterior contraction rates using a finite random series of B-splines basis. Compared with the kernel-estimation method, this has a theoretical advantage as the bias can be better controlled when the function is smoother, which allows obtaining better rates. Assuming that $f:\mathbb{R}^{2}\mapsto \mathbb{R}$ belongs to an isotropic Hölder class of order $\alpha \geq 4$, with the optimal choice of smoothing parameters, the posterior contraction rates for the filament points on some appropriately defined integral curves and for the Hausdorff distance of the filament are both $(n/\log n)^{(2-\alpha )/(2(1+\alpha ))}$. Secondly, we provide a way to construct a credible set with sufficient frequentist coverage for the filaments. 
We demonstrate the success of our proposed method in simulations and one application to earthquake data. DA - 2020/// PY - 2020/// DO - 10.1214/20-EJS1705 VL - 14 IS - 1 SP - 1707-1743 SN - 1935-7524 KW - Filament KW - nonparametric regression KW - posterior contraction KW - credibility KW - coverage KW - B-splines ER - TY - JOUR TI - Central limit theorems for classical multidimensional scaling AU - Li, Gongkai AU - Tang, Minh AU - Charon, Nichlas AU - Priebe, Carey T2 - ELECTRONIC JOURNAL OF STATISTICS AB - Classical multidimensional scaling is a widely used method in dimensionality reduction and manifold learning. The method takes in a dissimilarity matrix and outputs a low-dimensional configuration matrix based on a spectral decomposition. In this paper, we present three noise models and analyze the resulting configuration matrices, or embeddings. In particular, we show that under each of the three noise models the resulting embedding gives rise to a central limit theorem. We also provide compelling simulations and real data illustrations of these central limit theorems. This perturbation analysis represents a significant advancement over previous results regarding classical multidimensional scaling behavior under randomness. DA - 2020/// PY - 2020/// DO - 10.1214/20-EJS1720 VL - 14 IS - 1 SP - 2362-2394 SN - 1935-7524 KW - Classical multidimensional scaling KW - dissimilarity matrix KW - perturbation analysis KW - central limit theorem ER - TY - JOUR TI - Semiparametric regression of the illness-death model with interval censored disease incidence time: An application to the ACLS data AU - Zhou, Jie AU - Zhang, Jiajia AU - McLain, Alexander C. AU - Lu, Wenbin AU - Sui, Xuemei AU - Hardin, James W. 
T2 - STATISTICAL METHODS IN MEDICAL RESEARCH AB - To investigate the effect of fitness on cardiovascular disease and all-cause mortality using the Aerobics Center Longitudinal Study, we develop a semiparametric illness-death model that accounts for intermittent observations of the cardiovascular disease incidence time and the right censored data of all-cause mortality. The main challenge in estimation is to handle the intermittent observations (interval censoring) of cardiovascular disease incidence time, and we develop a semiparametric estimation method based on the expectation-maximization algorithm for a Markov illness-death regression model. The variance of the parameters is estimated using profile likelihood methods. The proposed method is evaluated using extensive simulation studies and illustrated with an application to the Aerobics Center Longitudinal Study data. DA - 2020/12// PY - 2020/12// DO - 10.1177/0962280220939123 VL - 29 IS - 12 SP - 3707-3720 SN - 1477-0334 KW - Semi-competing model KW - illness-death model KW - semi-parametric regression KW - interval censoring KW - Markov models ER - TY - JOUR TI - Gastric artery embolization: studying the effects of catheter type and injection method on microsphere distributions within a benchtop arterial model AU - Jernigan, Shaphan R. AU - Osborne, Jason A. AU - Buckner, Gregory D. T2 - BIOMEDICAL ENGINEERING ONLINE AB - Abstract Aims The objective of the study is to investigate the effect of catheter type and injection method on microsphere distributions, specifically vessel targeting accuracy. Materials and methods The study utilized three catheter types (a standard end-hole micro-catheter, a Surefire anti-reflux catheter, and an Endobar occlusion balloon catheter) and both manual and computer-controlled injection schemes. A closed-loop, dynamically pressurized surrogate arterial system was assembled to replicate arterial flow for bariatric embolization procedures. 
Four vessel branches immediately distal to the injection site were targeted for embolization. Embolic microspheres were injected into the model using these three catheter types and both manual and computer-controlled injections. Results Across all injection methods, the catheter effect on the proportion of microspheres to target vessels (vs. non-target vessels) was significant ( p = 0.005). The catheter effect on the number of non-target vessels embolized was nearly significant ( p = 0.059). Across all catheter types, the injection method effect was not statistically significant for either of two outcome measures (percent microspheres to target vessels: p = 0.265, number of non-target vessels embolized: p = 0.148). Conclusion Catheter type had a significant effect on targeting accuracy across all injection methods. The Endobar catheter exhibited a higher targeting accuracy in pairwise comparisons with the other two injection catheters across all injection schemes and when considering the Endobar catheter with the manifold injection method vs. each of the catheters with the manual injection method; the differences were significant in three of four analyses. The injection method effect was not statistically significant across all catheter types and when considering the Endobar catheter/Endobar manifold combination vs. Endobar catheter injections with manual and pressure-replicated methods. DA - 2020/6/26/ PY - 2020/6/26/ DO - 10.1186/s12938-020-00794-z VL - 19 IS - 1 SP - SN - 1475-925X KW - Gastric artery KW - Embolization KW - Vessel targeting KW - Reflux ER - TY - JOUR TI - Ultrasoft Liquid Metal Elastomer Foams with Positive and Negative Piezopermittivity for Tactile Sensing AU - Yang, Jiayi AU - Tang, David AU - Ao, Jinping AU - Ghosh, Tushar AU - Neumann, Taylor V. AU - Zhang, Dongguang AU - Piskarev, Yegor AU - Yu, Tingting AU - Truong, Vi Khanh AU - Xie, Kai AU - Lai, Ying-Chih AU - Li, Yang AU - Dickey, Michael D. 
T2 - ADVANCED FUNCTIONAL MATERIALS AB - Abstract Soft, capacitive tactile (pressure) sensors are important for applications including human–machine interfaces, soft robots, and electronic skins. Such capacitors consist of two electrodes separated by a soft dielectric. Pressing the capacitor brings the electrodes closer together and thereby increases capacitance. Thus, sensitivity to a given force is maximized by using dielectric materials that are soft and have a high dielectric constant, yet such properties are often in conflict with each other. Here, a liquid metal elastomer foam (LMEF) is introduced that is extremely soft (elastic modulus 7.8 kPa), highly compressible (70% strain), and has a high permittivity. Compressing the LMEF displaces the air in the foam structure, increasing the permittivity over a large range (5.6–11.7). This is called “positive piezopermittivity.” Interestingly, it is discovered that the permittivity of such materials decreases (“negative piezopermittivity”) when compressed to large strain due to the geometric deformation of the liquid metal droplets. This mechanism is theoretically confirmed via electromagnetic theory, and finite element simulation. Using these materials, a soft tactile sensor with high sensitivity, high initial capacitance, and large capacitance change is demonstrated. In addition, a tactile sensor powered wirelessly (from 3 m away) with high power conversion efficiency (84%) is demonstrated. DA - 2020/9// PY - 2020/9// DO - 10.1002/adfm.202002611 VL - 30 IS - 36 SP - SN - 1616-3028 KW - foams KW - liquid metals KW - pressuring sensing KW - stretchable electronics KW - tactile sensors ER - TY - JOUR TI - Microstructural classification of unirradiated LiAlO2 pellets by deep learning methods AU - Pazdernik, Karl AU - LaHaye, Nicole L. AU - Artman, Conor M. 
AU - Zhu, Yuanyuan T2 - COMPUTATIONAL MATERIALS SCIENCE AB - Microstructural features and defects can greatly impact material properties and performance in a wide range of application areas. Recognition and characterization of microstructural features is essential to the understanding and prediction of material performance under various operational conditions, including irradiation. In this work, we tested a collection of Deep Convolutional Neural Network (DCNN) architectures that have been optimized for image segmentation and selected the best performer to obtain pixel-level classification of the main microstructural features in unirradiated LiAlO2 pellets, including grains, grain boundaries, voids, precipitates, and zirconia impurities. LiAlO2 is an important material that is used as a tritium producer for the Tritium Sustainment Program. While LiAlO2 pellets have been employed in tritium-producing burnable absorber rods (TPBARs) for years, comprehensive microstructural analysis of unirradiated LiAlO2, and therefore time-dependent tritium release from the material during irradiation, has not been established. A full understanding of unirradiated LiAlO2 microstructure and how it evolves as a result of neutron irradiation is necessary to produce an integrated performance model to predict in-reactor behavior as well as to target strategic experiments. This work aims at developing a fast and quantitative analysis method to classify various microstructural features in unirradiated LiAlO2 pellets that are visualized by scanning electron microscopy (SEM). Given classification results obtained, statistical analysis was then carried out to evaluate the performance of the DCNN classification and to describe the properties of the microstructural features as a whole, based on standard aggregation and spatial point-process methodology. Our results show improved performance over a baseline heuristic approach. 
Also, the computational efficiency of the computer-aided analytical method allows for quantitative characterization of a larger volume of SEM images than was previously possible using manual segmentation. DA - 2020/8// PY - 2020/8// DO - 10.1016/j.commatsci.2020.109728 VL - 181 SP - SN - 1879-0801 KW - Deep convolutional neural network KW - Scanning electron microscopy KW - Spatial point process KW - Image segmentation ER - TY - JOUR TI - BASELINE DRIFT ESTIMATION FOR AIR QUALITY DATA USING QUANTILE TREND FILTERING AU - Brantley, Halley L. AU - Guinness, Joseph AU - Chi, Eric C. T2 - ANNALS OF APPLIED STATISTICS AB - We address the problem of estimating smoothly varying baseline trends in time series data. This problem arises in a wide range of fields, including chemistry, macroeconomics and medicine; however, our study is motivated by the analysis of data from low cost air quality sensors. Our methods extend the quantile trend filtering framework to enable the estimation of multiple quantile trends simultaneously while ensuring that the quantiles do not cross. To handle the computational challenge posed by very long time series, we propose a parallelizable alternating direction method of multipliers (ADMM) algorithm. The ADMM algorithm enables the estimation of trends in a piecewise manner, both reducing the computation time and extending the limits of the method to larger data sizes. We also address smoothing parameter selection and propose a modified criterion based on the extended Bayesian information criterion. Through simulation studies and our motivating application to low cost air quality sensor data, we demonstrate that our model provides better quantile trend estimates than existing methods and improves signal classification of low-cost air quality sensor output. 
DA - 2020/6// PY - 2020/6// DO - 10.1214/19-AOAS1318 VL - 14 IS - 2 SP - 585-604 SN - 1932-6157 KW - Air quality KW - nonparametric quantile regression KW - trend estimation ER - TY - JOUR TI - Distributions of pattern statistics in sparse Markov models AU - Martin, Donald E. K. T2 - Annals of the Institute of Statistical Mathematics DA - 2020/8// PY - 2020/8// DO - 10.1007/s10463-019-00714-6 VL - 72 IS - 4 SP - 895-913 SN - 0020-3157 1572-9052 UR - http://dx.doi.org/10.1007/S10463-019-00714-6 KW - Auxiliary Markov chain KW - Pattern distribution KW - Sparse Markov model KW - Variable length Markov chain ER - TY - JOUR TI - In-Plane Thermoelectric Properties of Flexible and Room-Temperature-Doped Carbon Nanotube Films AU - Chatterjee, Kony AU - Negi, Ankit AU - Kim, Kyunghoon AU - Liu, Jun AU - Ghosh, Tushar K. T2 - ACS Applied Energy Materials AB - Soft materials with high power factors (PFs) and low thermal conductivity (κ) are critically important for integration of thermoelectric (TE) modules into flexible form factors for energy harvesting or cooling applications. Here, air stable p- and n-type multiwalled carbon nanotube films with high PFs (up to 521 μW/m·K²) are reported, with n-type doping carried out in a facile two-step process. The maximum figures of merit (ZTs) of p-type and n-type CNTs are obtained as 0.019 and 0.015 at 300 K, respectively, with all three transport properties (Seebeck coefficient, electrical conductivity, and κ) measured in-plane, providing a more accurate ZT. Using time-domain thermoreflectance, we report a fast and non-contact measurement of κ without complex microfabrication or material processing. Moreover, there is no material mismatch between the p- and n-type legs of the TE module. Such materials have the potential for widespread applications in inexpensive and scalable wearable energy harvesting and localized heating/cooling. 
DA - 2020/7/27/ PY - 2020/7/27/ DO - 10.1021/acsaem.0c00995 VL - 3 IS - 7 SP - 6929-6936 UR - https://doi.org/10.1021/acsaem.0c00995 KW - thermoelectrics KW - carbon nanotubes KW - flexible film KW - in-plane thermal conductivity KW - air stable ER - TY - JOUR TI - Comparative Exposure Assessment Using Silicone Passive Samplers Indicates That Domestic Dogs Are Sentinels To Support Human Health Research AU - Wise, Catherine F. AU - Hammel, Stephanie C. AU - Herkert, Nicholas AU - Ma, Jun AU - Motsinger-Reif, Alison AU - Stapleton, Heather M. AU - Breen, Matthew T2 - ENVIRONMENTAL SCIENCE & TECHNOLOGY AB - Silicone wristbands are promising passive samplers to support epidemiological studies in characterizing exposure to organic contaminants; however, investigating associated health risks remains challenging because of the latency period for many chronic diseases that take years to manifest. Dogs provide valuable insights as sentinels for exposure-related human disease because they share similar exposures in the home, have shorter life spans, share many clinical/biological features, and have closely related genomes. Here, we evaluated exposures among pet dogs and their owners using silicone dog tags and wristbands to determine if contaminant levels were correlated with validated exposure biomarkers. Significant correlations between measures on dog tags and wristbands were observed (rs = 0.38–0.90; p < 0.05). Correlations with their respective urinary biomarkers were stronger in dog tags compared to that in human wristbands (rs = 0.50–0.71; p < 0.01) for several organophosphate esters. This supports the value of using silicone bands with dogs to investigate health impacts on humans from shared exposures. 
DA - 2020/6/16/ PY - 2020/6/16/ DO - 10.1021/acs.est.9b06605 VL - 54 IS - 12 SP - 7409-7419 SN - 1520-5851 ER - TY - JOUR TI - Vecchia Approximations of Gaussian-Process Predictions AU - Katzfuss, Matthias AU - Guinness, Joseph AU - Gong, Wenlong AU - Zilber, Daniel T2 - JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS AB - Gaussian processes are popular and flexible models for spatial, temporal, and functional data, but they are computationally infeasible for large datasets. We discuss Gaussian-process approximations that use basis functions at multiple resolutions to achieve fast inference and that can (approximately) represent any spatial covariance structure. We consider two special cases of this multi-resolution-approximation framework, a taper version and a domain-partitioning (block) version. We describe theoretical properties and inference procedures, and study the computational complexity of the methods. Numerical comparisons and an application to satellite data are also provided. DA - 2020/6/23/ PY - 2020/6/23/ DO - 10.1007/s13253-020-00401-7 SP - SN - 1537-2693 KW - Computational complexity KW - Kriging KW - Large datasets KW - Sparsity KW - Spatial statistics ER - TY - JOUR TI - Tenure and Promotion Outcomes at Four Large Land Grant Universities: Examining the Role of Gender, Race, and Academic Discipline AU - Durodoye, Raifu, Jr. AU - Gumpertz, Marcia AU - Wilson, Alyson AU - Griffith, Emily AU - Ahmad, Seher T2 - RESEARCH IN HIGHER EDUCATION DA - 2020/8// PY - 2020/8// DO - 10.1007/s11162-019-09573-9 VL - 61 IS - 5 SP - 628-651 SN - 1573-188X KW - Tenure KW - Faculty KW - Race KW - Gender KW - Discipline ER - TY - JOUR TI - The influence of packed cell volume versus plasma proteins on thromboelastographic variables in canine blood AU - Lynch, Alex M. AU - Ruterbories, Laura AU - Jack, John AU - Motsinger-Reif, Alison A. 
AU - Hanel, Rita T2 - JOURNAL OF VETERINARY EMERGENCY AND CRITICAL CARE AB - Abstract Objective Determine the correlation between kaolin‐activated thromboelastography (TEG) variables (R, K, angle, and maximum amplitude [MA]) and PCV, fibrinogen concentration (FC), and total fibrinogen (TF) in an ex vivo model. Animals Two healthy adult mixed‐breed dogs. Procedures Citrated whole blood was obtained and separated into packed red cells, platelet rich plasma, and platelet poor plasma (PPP). An aliquot of PPP was heated to denature heat labile proteins (fibrinogen, factor V, factor VIII). Blood components were recombined for analyses of 6 physiological scenarios: anemia with low fibrinogen; anemia with moderate fibrinogen; anemia with normal fibrinogen; anemia with normal saline; normal PCV and normal fibrinogen; and normal PCV and low fibrinogen. A Kruskal–Wallis test, along with linear regressions on pairwise combinations of TEG variables, was used to determine the correlation between TEG variables and PCV, FC, and TF. Results Maximum amplitude correlated with FC (R² = 0.60, P < 0.001) and TF (R² = 0.57, P < 0.001) but not PCV (R² = 0.003, P = 0.7). Angle and K time were moderately correlated with FC ([angle: R² = 0.53, P < 0.001]; [K: R² = 0.55, P < 0.001]) and TF ([alpha angle: R² = 0.52, P < 0.001]; [K: R² = 0.51, P < 0.001]) but not PCV. The R time was weakly correlated with PCV (R² = 0.15, P < 0.009) but not FC or TF. Conclusions and clinical relevance In an ex vivo model, plasma proteins but not PCV impacted TEG variables. This suggests that TEG changes noted with anemia are imparted by changes in available fibrinogen in a fixed microenvironment rather than artifact of anemia. DA - 2020/7// PY - 2020/7// DO - 10.1111/vec.12979 VL - 30 IS - 4 SP - 418-425 SN - 1476-4431 ER - TY - JOUR TI - Peptide variability and signatures associated with disease progression in CSF collected longitudinally from ALS patients AU - Mellinger, Allyson L. AU - Griffith, Emily H. 
AU - Bereman, Michael S. T2 - ANALYTICAL AND BIOANALYTICAL CHEMISTRY DA - 2020/9// PY - 2020/9// DO - 10.1007/s00216-020-02765-8 VL - 412 IS - 22 SP - 5465-5475 SN - 1618-2650 KW - Amyotrophic lateral sclerosis KW - Cerebrospinal fluid KW - Longitudinal modeling KW - Proteomics KW - Biomarker ER - TY - JOUR TI - Global forensic geolocation with deep neural networks AU - Grantham, Neal S. AU - Reich, Brian J. AU - Laber, Eric B. AU - Pacifici, Krishna AU - Dunn, Robert R. AU - Fierer, Noah AU - Gebert, Matthew AU - Allwood, Julia S. AU - Faith, Seth A. T2 - JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS AB - Summary An important problem in modern forensic analyses is identifying the provenance of materials at a crime scene, such as biological material on a piece of clothing. This procedure, which is known as geolocation, is conventionally guided by expert knowledge of the biological evidence and therefore tends to be application specific, labour intensive and often subjective. Purely data-driven methods have yet to be fully realized in this domain, in part because of the lack of a sufficiently rich source of data. However, high throughput sequencing technologies can identify tens of thousands of fungi and bacteria taxa by using DNA recovered from a single swab collected from nearly any object or surface. This microbial community, or microbiome, may be highly informative of the provenance of the sample, but data on the spatial variation of microbiomes are sparse and high dimensional and have a complex dependence structure that render them difficult to model with standard statistical tools. Deep learning algorithms have generated a tremendous amount of interest within the machine learning community for their predictive performance in high dimensional problems. 
We present DeepSpace: a new algorithm for geolocation that aggregates over an ensemble of deep neural network classifiers trained on randomly generated Voronoi partitions of a spatial domain. The DeepSpace algorithm makes remarkably good point predictions; for example, when applied to the microbiomes of over 1300 dust samples collected across continental USA, more than half of geolocation predictions produced by this model fall less than 100 km from their true origin, which is a 60% reduction in error from competing geolocation methods. Moreover, we apply DeepSpace to a novel data set of global dust samples collected from nearly 30 countries, finding that dust-associated fungi alone predict a sample's country of origin with nearly 90% accuracy. DA - 2020/8// PY - 2020/8// DO - 10.1111/rssc.12427 VL - 69 IS - 4 SP - 909-929 SN - 1467-9876 KW - Citizen science KW - Machine learning KW - Microbiome KW - Non-homogeneous Poisson process KW - Spatial point pattern ER - TY - JOUR TI - Association test using Copy Number Profile Curves (CONCUR) enhances power in rare copy number variant analysis AU - Brucker, Amanda AU - Lu, Wenbin AU - West, Rachel Marceau AU - Yu, Qi-You AU - Hsiao, Chuhsing Kate AU - Hsiao, Tzu-Hung AU - Lin, Ching-Heng AU - Magnusson, Patrik K. E. AU - Sullivan, Patrick F. AU - Szatkiewicz, Jin P. AU - Lu, Tzu-Pin AU - Tzeng, Jung-Ying T2 - PLOS COMPUTATIONAL BIOLOGY AB - Copy number variants (CNVs) are the gain or loss of DNA segments in the genome that can vary in dosage and length. CNVs comprise a large proportion of variation in human genomes and impact health conditions. To detect rare CNV associations, kernel-based methods have been shown to be a powerful tool due to their flexibility in modeling the aggregate CNV effects, their ability to capture effects from different CNV features, and their accommodation of effect heterogeneity. 
To perform a kernel association test, a CNV locus needs to be defined so that locus-specific effects can be retained during aggregation. However, CNV loci are arbitrarily defined and different locus definitions can lead to different performance depending on the underlying effect patterns. In this work, we develop a new kernel-based test called CONCUR (i.e., copy number profile curve-based association test) that is free from a definition of locus and evaluates CNV-phenotype associations by comparing individuals' copy number profiles across the genomic regions. CONCUR is built on the proposed concepts of "copy number profile curves" to describe the CNV profile of an individual, and the "common area under the curve (cAUC) kernel" to model the multi-feature CNV effects. The proposed method captures the effects of CNV dosage and length, accounts for the numerical nature of copy numbers, and accommodates between- and within-locus etiological heterogeneity without the need to define artificial CNV loci as required in current kernel methods. In a variety of simulation settings, CONCUR shows comparable or improved power over existing approaches. Real data analyses suggest that CONCUR is well powered to detect CNV effects in the Swedish Schizophrenia Study and the Taiwan Biobank. DA - 2020/5// PY - 2020/5// DO - 10.1371/journal.pcbi.1007797 VL - 16 IS - 5 SP - SN - 1553-7358 ER - TY - JOUR TI - The association between neuraxial anesthesia and the development of childhood asthma - a secondary analysis of the newborn epigenetics study cohort AU - Huang, Yueyang AU - Tzeng, Jung-Ying AU - Maguire, Rachel AU - Hoyo, Cathrine AU - Allen, Terrence T2 - CURRENT MEDICAL RESEARCH AND OPINION AB - Objectives Childhood asthma is a common chronic illness that has been associated with mode of delivery. However, the effect of cesarean delivery alone does not fully account for the increased prevalence of childhood asthma. 
We tested the hypothesis that neuraxial anesthesia used for labor analgesia and cesarean delivery alters the risk of developing childhood asthma. Methods Within the Newborn Epigenetics Study birth cohort, 196 mother and child pairs with entries in the electronic anesthesia records were included. From these records, data on maternal anesthesia type, duration of exposure, and drugs administered peripartum were abstracted and combined with questionnaire-derived prenatal risk factors and medical records and questionnaire-derived asthma diagnosis data in children. Logistic regression models were used to evaluate associations between type of anesthesia, duration of anesthesia, and the development of asthma in males and females. Results We found that longer duration of epidural anesthesia was associated with a lower risk of asthma in male children (OR = 0.80; 95% CI = 0.66–0.95) for each hour of epidural exposure. Additionally, a unit increase in the composite dose of local anesthetics and opioid analgesics administered via the spinal route was associated with a lower risk of asthma in both male (OR = 0.59, 95% CI = 0.36–0.96) and female children (OR = 0.26, 95% CI = 0.09–0.82). Conclusion Our data suggest that peripartum exposure to neuraxial anesthesia may reduce the risk of childhood asthma primarily in males. Larger human studies and model systems with longer follow-up are required to elucidate these findings. 
DA - 2020/6/2/ PY - 2020/6/2/ DO - 10.1080/03007995.2020.1747417 VL - 36 IS - 6 SP - 1025-1032 SN - 1473-4877 KW - Anesthesia KW - opioid analgesics KW - asthma KW - children KW - sex-specific ER - TY - JOUR TI - Growth performance, oxidative stress, and antioxidant capacity of newly weaned piglets fed dietary peroxidized lipids with vitamin E or phytogenic compounds in drinking water AU - Silva-Guillen, Ysenia AU - Arellano, Consuelo AU - Martinez, Gabriela AU - Heugten, Eric T2 - APPLIED ANIMAL SCIENCE AB - This study evaluated the use of vitamin E and phytogenic compounds in drinking water on growth performance, oxidative stress, and immune status of piglets fed peroxidized lipids. In a 35-d study, 21-d-old weaned piglets (n = 96; 6.10 ± 0.64 kg of BW) were assigned within sex and BW blocks to 1 of 4 treatments, using 24 pens (4 pigs per pen; 6 replications per treatment). Diets contained either 6% soybean oil or 6% peroxidized soybean oil. Pigs fed peroxidized soybean oil received drinking water without (control) or with supplemental vitamin E (100 IU/L of RRR-α-tocopherol) or phytogenic compounds (60 μL/L for wk 1 and 30 μL/L for wk 2 to 5). Peroxidized soybean oil decreased (P < 0.001) final BW (18.2 vs. 21.6 kg) and ADG (346 vs. 441 g/d) and tended to decrease ADFI (P = 0.14; 542 vs. 617 g/d) and G:F (P = 0.07; 645 vs. 715 g/kg). Peroxidation decreased serum vitamin E concentrations (P = 0.03), which could be restored (P = 0.01) by vitamin E in the water, but not phytogenic compounds. Peroxidized soybean oil decreased serum 8-hydroxydeoxyguanosine, increased serum protein carbonyl, and had no effects on serum malondialdehyde or cytokines. Peroxidized soybean oil reduced growth performance of weaned nursery pigs, which did not appear to be related to oxidative stress or immune status. The negative effects of peroxidized soybean oil on animal performance could not be improved by supplementation of vitamin E or phytogenic compounds in the drinking water. 
DA - 2020/6// PY - 2020/6// DO - 10.15232/aas.2019-01976 VL - 36 IS - 3 SP - 341-351 SN - 2590-2865 KW - health KW - oxidation KW - plant extracts KW - tocopherol ER - TY - JOUR TI - Multiple QTL Mapping in Autopolyploids: A Random-Effect Model Approach with Application in a Hexaploid Sweetpotato Full-Sib Population AU - Da Silva Pereira, G. AU - Gemenet, D.C. AU - Mollinari, M. AU - Olukolu, B.A. AU - Wood, J.C. AU - Diaz, F. AU - Mosquera, V. AU - Gruneberg, W.J. AU - Khan, A. AU - Buell, C.R. AU - Yencho, G.C. AU - Zeng, Z.-B. T2 - Genetics AB - Abstract Genetic analysis in autopolyploids is a very complicated subject due to the enormous number of genotypes at a locus that needs to be considered. For instance, the number of... In developing countries, the sweetpotato, Ipomoea batatas (L.) Lam. (2n=6x=90), is an important autopolyploid species, both socially and economically. However, quantitative trait loci (QTL) mapping has remained limited due to its genetic complexity. Current fixed-effect models can fit only a single QTL and are generally hard to interpret. Here, we report the use of a random-effect model approach to map multiple QTL based on score statistics in a sweetpotato biparental population (‘Beauregard’ × ‘Tanzania’) with 315 full-sibs. Phenotypic data were collected for eight yield component traits in six environments in Peru, and jointly adjusted means were obtained using mixed-effect models. An integrated linkage map consisting of 30,684 markers distributed along 15 linkage groups (LGs) was used to obtain the genotype conditional probabilities of putative QTL at every centiMorgan position. Multiple interval mapping was performed using our R package QTLpoly and detected a total of 13 QTL, ranging from none to four QTL per trait, which explained up to 55% of the total variance. Some regions, such as those on LGs 3 and 15, were consistently detected among root number and yield traits, and provided a basis for candidate gene search. 
In addition, some QTL were found to affect commercial and noncommercial root traits distinctly. Further best linear unbiased predictions were decomposed into additive allele effects and were used to compute multiple QTL-based breeding values for selection. Together with quantitative genotyping and its appropriate usage in linkage analyses, this QTL mapping methodology will facilitate the use of genomic tools in sweetpotato breeding as well as in other autopolyploids. DA - 2020/5/5/ PY - 2020/5/5/ DO - 10.1534/genetics.120.303080 VL - 215 IS - 3 SP - 579-595 UR - http://dx.doi.org/10.1534/genetics.120.303080 KW - multiple interval mapping KW - polyploid QTL model KW - restricted maximum likelihood KW - variance components KW - yield components KW - heritability ER - TY - JOUR TI - Disturbances drive changes in coral community assemblages and coral calcification capacity AU - Courtney, Travis A. AU - Barnes, Brian B. AU - Chollett, Iliana AU - Elahi, Robin AU - Gross, Kevin AU - Guest, James R. AU - Kuffner, Ilsa B. AU - Lenz, Elizabeth A. AU - Nelson, Hannah R. AU - Rogers, Caroline S. AU - Toth, Lauren T. AU - Andersson, Andreas J. T2 - ECOSPHERE AB - Abstract Anthropogenic environmental change has increased coral reef disturbance regimes in recent decades, altering the structure and function of many coral reefs globally. In this study, we used coral community survey data collected from 1996 to 2015 to evaluate reef‐scale coral calcification capacity (CCC) dynamics with respect to recorded pulse disturbances for 121 reef sites in the Main Hawaiian Islands and Mo'orea (French Polynesia) in the Pacific and the Florida Keys Reef Tract and St. John (U.S. Virgin Islands) in the western Atlantic. 
CCC remained relatively high in the Main Hawaiian Islands in the absence of recorded widespread disturbances; declined and subsequently recovered in Mo'orea following a crown‐of‐thorns sea star outbreak, coral bleaching, and major cyclone; decreased and remained low following coral bleaching in the Florida Keys Reef Tract; and decreased following coral bleaching and disease in St. John. Individual coral taxa have variable calcification rates and susceptibility to disturbances because of their differing life‐history strategies. As a result, temporal changes in CCC in this study were driven by shifts in both overall coral cover and coral community composition. Analysis of our results considering coral life‐history strategies showed that weedy corals generally increased their contributions to CCC over time while the contribution of competitive corals decreased. Shifts in contributions by stress‐tolerant and generalist corals to CCC were more variable across regions. The increasing frequency and intensity of disturbances under 21st century global change therefore has the potential to drive lower and more variable CCC because of the increasing dominance of weedy and some stress‐tolerant corals. DA - 2020/4// PY - 2020/4// DO - 10.1002/ecs2.3066 VL - 11 IS - 4 SP - SN - 2150-8925 KW - carbonate budgets KW - climate change KW - coral bleaching KW - coral disease KW - ecological traits KW - environmental monitoring KW - resilience KW - scleractinians ER - TY - JOUR TI - DHPA: Dynamic Human Preference Analytics Framework— A Case Study on Taxi Drivers' Learning Curve Analysis AU - Pan, M. AU - Li, Y. AU - Zhou, X. AU - Liu, Z. AU - Song, R. AU - Liu, H. AU - Luo, J. AU - Huang, Weixiao AU - Tian, Zhihong T2 - ACM Transactions on Intelligent Systems and Technology AB - Many real-world human behaviors can be modeled and characterized as sequential decision-making processes, such as a taxi driver’s choices of working regions and times. 
Each driver possesses unique preferences on the sequential choices over time and improves the driver’s working efficiency. Understanding the dynamics of such preferences helps accelerate the learning process of taxi drivers. Prior works on taxi operation management mostly focus on finding optimal driving strategies or routes, lacking in-depth analysis on what the drivers learned during the process and how they affect the performance of the driver. In this work, we make the first attempt to establish Dynamic Human Preference Analytics. We inversely learn the taxi drivers’ preferences from data and characterize the dynamics of such preferences over time. We extract two types of features (i.e., profile features and habit features) to model the decision space of drivers. Then through inverse reinforcement learning, we learn the preferences of drivers with respect to these features. The results illustrate that self-improving drivers tend to keep adjusting their preferences to habit features to increase their earning efficiency while keeping the preferences to profile features invariant. However, experienced drivers have stable preferences over time. The exploring drivers tend to randomly adjust the preferences over time. DA - 2020/1// PY - 2020/1// DO - 10.1145/3360312 VL - 11 IS - 1 SP - SN - 2157-6912 KW - Urban computing KW - inverse reinforcement learning KW - preference dynamics ER - TY - JOUR TI - Nonlinear Dose-Response Modeling of High-Throughput Screening Data Using an Evolutionary Algorithm AU - Ma, Jun AU - Bair, Eric AU - Motsinger-Reif, Alison T2 - DOSE-RESPONSE AB - Nonlinear dose-response relationships exist extensively in the cellular, biochemical, and physiologic processes that are affected by varying levels of biological, chemical, or radiation stress. Modeling such responses is a crucial component of toxicity testing and chemical screening. 
Traditional model fitting methods such as nonlinear least squares (NLS) are very sensitive to initial parameter values and often fail to converge. The use of evolutionary algorithms (EAs) has been proposed to address many of the limitations of traditional approaches, but previous methods have been limited in the types of models they can fit. Therefore, we propose the use of an EA for dose-response modeling for a range of potential response model functional forms. This new method can not only fit the most commonly used nonlinear dose-response models (e.g., exponential models and 3-, 4-, and 5-parameter logistic models) but also select the best model if no model assumption is made, which is especially useful in the case of high-throughput curve fitting. Compared with NLS, the new method provides stable and robust solutions without sensitivity to initial values. DA - 2020/4// PY - 2020/4// DO - 10.1177/1559325820926734 VL - 18 IS - 2 SP - SN - 1559-3258 KW - evolutionary algorithm KW - hillslope model KW - parameter estimation KW - nonlinear regression KW - model selection ER - TY - JOUR TI - Equiprobable discrete models of site-specific substitution rates underestimate the extent of rate variability AU - Mannino, Frank AU - Wisotsky, Sadie AU - Pond, Sergei L. Kosakovsky AU - Muse, Spencer V T2 - PLOS ONE AB - It is standard practice to model site-to-site variability of substitution rates by discretizing a continuous distribution into a small number, K, of equiprobable rate categories. We demonstrate that the variance of this discretized distribution has an upper bound determined solely by the choice of K and the mean of the distribution. This bound can introduce biases into statistical inference, especially when estimating parameters governing site-to-site variability of substitution rates. Applications to two large collections of sequence alignments demonstrate that this upper bound is often reached in analyses of real data. 
When parameter estimation is of primary interest, additional rate categories or more flexible modeling methods should be considered. DA - 2020/3/2/ PY - 2020/3/2/ DO - 10.1371/journal.pone.0229493 VL - 15 IS - 3 SP - SN - 1932-6203 ER - TY - JOUR TI - Regional and field-specific differences in Fusarium species and mycotoxins associated with blighted North Carolina wheat AU - Cowger, Christina AU - Ward, Todd J. AU - Nilsson, Kathryn AU - Arellano, Consuelo AU - McCormick, Susan P. AU - Busman, Mark T2 - INTERNATIONAL JOURNAL OF FOOD MICROBIOLOGY AB - Worldwide, while Fusarium graminearum is the main causal species of Fusarium head blight (FHB) in small-grain cereals, a diversity of FHB-causing species belonging to different species complexes has been found in most countries. In the U.S., FHB surveys have focused on the Fusarium graminearum species complex (FGSC) and the frequencies of 3-ADON, 15-ADON, and nivalenol (NIV) chemotypes. A large-scale survey was undertaken across the state of North Carolina in 2014 to explore the frequency and distribution of F. graminearum capable of producing NIV, which is not monitored at grain intake points. Symptomatic wheat spikes were sampled from 59 wheat fields in 24 counties located in three agronomic zones typical of several states east of the Appalachian Mountains: Piedmont, Coastal Plain, and Tidewater. Altogether, 2197 isolates were identified to species using DNA sequence-based methods. Surprisingly, although F. graminearum was the majority species detected, species in the Fusarium tricinctum species complex (FTSC) that produce “emerging mycotoxins” were frequent, and even dominant in some fields. The FTSC percentage was 50–100% in four fields, 30–49% in five fields, 20–29% in five fields, and < 20% in the remaining 45 fields. FTSC species were at significantly higher frequency in the Coastal Plain than in the Piedmont or Tidewater (P < .05). Moniliformin concentrations in samples ranged from 0.0 to 38.7 μg g−1. 
NIV producing isolates were rare statewide (2.2%), and never >12% in a single field, indicating that routine testing for NIV is probably unnecessary. The patchy distribution of FTSC species in wheat crops demonstrated the need to investigate the potential importance of their mycotoxins and the factors that allow them to sometimes outcompete trichothecene producers. An increased sampling intensity of wheat fields led to the unexpected discovery of a minority FHB-causing population. DA - 2020/6/16/ PY - 2020/6/16/ DO - 10.1016/j.ijfoodmicro.2020.108594 VL - 323 SP - SN - 1879-3460 KW - Fusarium graminearum KW - Fusarium head blight KW - Fusarium tricinctum species complex KW - Scab KW - Deoxynivalenol KW - Moniliformin KW - Gibberella ear rot KW - Small grains KW - Chemotype ER - TY - JOUR TI - Research note: Shout-out survey for quantifying reasons for trail use AU - Hess, George R. AU - Loflin, Alexandria M. AU - Selm, Kathryn R. T2 - JOURNAL OF OUTDOOR RECREATION AND TOURISM-RESEARCH PLANNING AND MANAGEMENT AB - Gathering data about why people use greenway trails (e.g., health, recreation, transportation) requires interaction with trail users who typically do not want to stop for a survey; runners and bicyclists are particularly challenging. We placed a series of signs along trails asking users to shout out their answer to a simple question as they passed a surveyor, who also recorded observational data. In a feasibility study along greenway trails in Raleigh, NC, USA, we counted 541 users, 66% of whom shouted out whether they were using the trail for recreation or transportation. Of all users who passed, 45% were on bicycles and 55% on foot. Of those who responded, 11% were using the trail for transportation and 89% for recreation; 86% of transportation users were bicyclists. This method is generalizable and offers a way to collect additional information as individuals pass surveyors who might otherwise collect only observational data. 
DA - 2020/3// PY - 2020/3// DO - 10.1016/j.jort.2019.100234 VL - 29 SP - SN - 2213-0799 KW - Bicyclist KW - Greenway trail use KW - Pedestrian KW - Poll KW - Recreation KW - Survey KW - Transportation ER - TY - JOUR TI - Fully‐Textile Seam‐Line Sensors for Facile Textile Integration and Tunable Multi‐Modal Sensing of Pressure, Humidity, and Wetness AU - Agcayazi, Talha AU - Tabor, Jordan AU - McKnight, Michael AU - Martin, Isaac AU - Ghosh, Tushar K. AU - Bozkurt, Alper T2 - Advanced Materials Technologies AB - The unique potential of e‐textiles for unobtrusive and ubiquitous monitoring and their innovative interfacing with electronic devices has garnered great attention. Sensors are one of the few essential devices or components necessary for most functional e‐textile applications. Ideally, any e‐textile based sensor should be soft, easily integrated in textile manufacturing processes, and tunable for the desired applications. Here, an easy‐to‐manufacture, tunable, fully‐textile sensor system with capability of detecting pressure, humidity, or wetness is presented. Capacitive pressure sensors are formed via a traditional sewing process with two commercially available conductive sewing yarns (silver‐plated polyamide (silver) and stainless steel (SS)) with cotton knit, polyethylene‐terephthalate (PET) knit and elastomeric meltblown textile dielectrics. The relationship between the sensor's physical, mechanical, and electromechanical properties including hysteresis, sensitivity, response, and relaxation time is evaluated. In addition, the same sensor configuration is assessed for its humidity and wetness sensing performance. Results indicate that pressure, relative humidity (RH), and wetness sensing performance are easily tunable using different combinations of the conductive and dielectric textile materials. 
Finally, proof of concept deployment demonstrations as human‐machine interfaces within a pressure sensing mat and a smart glove capable of remotely controlling a drone are provided. DA - 2020/8// PY - 2020/8// DO - 10.1002/admt.202000155 UR - https://doi.org/10.1002/admt.202000155 KW - e-textiles KW - flexible sensors KW - humidity sensing KW - pressure sensing KW - wetness sensing ER - TY - JOUR TI - Bayesian Inference in Nonparanormal Graphical Models AU - Mulgrave, Jami J. AU - Ghosal, Subhashis T2 - BAYESIAN ANALYSIS AB - Gaussian graphical models have been used to study intrinsic dependence among several variables, but the Gaussianity assumption may be restrictive in many applications. A nonparanormal graphical model is a semiparametric generalization for continuous variables where it is assumed that the variables follow a Gaussian graphical model only after some unknown smooth monotone transformations on each of them. We consider a Bayesian approach in the nonparanormal graphical model by putting priors on the unknown transformations through a random series based on B-splines where the coefficients are ordered to induce monotonicity. A truncated normal prior leads to partial conjugacy in the model and is useful for posterior simulation using Gibbs sampling. On the underlying precision matrix of the transformed variables, we consider a spike-and-slab prior and use an efficient posterior Gibbs sampling scheme. We use the Bayesian Information Criterion to choose the hyperparameters for the spike-and-slab prior. We present a posterior consistency result on the underlying transformation and the precision matrix. We study the numerical performance of the proposed method through an extensive simulation study and finally apply the proposed method on a real data set. 
DA - 2020/6// PY - 2020/6// DO - 10.1214/19-BA1159 VL - 15 IS - 2 SP - 449-475 SN - 1936-0975 KW - Bayesian inference KW - nonparanormal KW - Gaussian graphical models KW - sparsity KW - continuous shrinkage prior ER - TY - JOUR TI - Comparison of decay rates between native and non-native wood species in invaded forests of the southeastern US: a rapid assessment AU - Ulyshen, Michael D. AU - Horn, Scott AU - Brownie, Cavell AU - Strickland, Michael S. AU - Wurzburger, Nina AU - Zanne, Amy T2 - BIOLOGICAL INVASIONS DA - 2020/8// PY - 2020/8// DO - 10.1007/s10530-020-02276-8 VL - 22 IS - 8 SP - 2619-2632 SN - 1573-1464 KW - Chinese privet KW - Exotic species KW - Japanese stiltgrass KW - Novel ecosystems KW - Plant traits ER - TY - JOUR TI - Use of standardized bioinformatics for the analysis of fungal DNA signatures applied to sample provenance AU - Allwood, Julia S. AU - Fierer, Noah AU - Dunn, Robert R. AU - Breen, Matthew AU - Reich, Brian J. AU - Laber, Eric B. AU - Clifton, Jesse AU - Grantham, Neal S. AU - Faith, Seth A. T2 - FORENSIC SCIENCE INTERNATIONAL AB - The use of environmental trace material to aid criminal investigations is an ongoing field of research within forensic science. The application of environmental material thus far has focused upon a variety of different objectives relevant to forensic biology, including sample provenance (also referred to as sample attribution). The capability to predict the provenance or origin of an environmental DNA sample would be an advantageous addition to the suite of investigative tools currently available. A metabarcoding approach is often used to predict sample provenance, through the extraction and comparison of the DNA signatures found within different environmental materials, such as the bacteria within soil or fungi within dust. 
Such approaches are combined with bioinformatics workflows and statistical modelling, often as part of a large-scale study, with less emphasis on the investigation of the adaptation of these methods to a smaller scale method for forensic use. The present work investigated a small-scale approach as an adaptation of a larger metabarcoding study to develop a model for global sample provenance using fungal DNA signatures collected from dust swabs. This adaptation was intended to facilitate a standardized method for consistent, reproducible sample treatment, including bioinformatics processing and final application of resulting data to the available prediction model. To investigate this small-scale method, 76 DNA samples were treated as anonymous test samples and analyzed using the standardized process to demonstrate and evaluate processing and customized sequence data analysis. This testing included samples originating from countries previously used to train the model, samples artificially mixed to represent multiple or mixed countries, as well as outgroup samples. Positive controls were also developed to monitor laboratory processing and bioinformatics analysis. Through this evaluation we were able to demonstrate that the samples could be processed and analyzed in a consistent manner, facilitated by a relatively user-friendly bioinformatic pipeline for sequence data analysis. Such investigation into standardized analyses and application of metabarcoding data is of key importance for the future use of applied microbiology in forensic science. 
DA - 2020/5// PY - 2020/5// DO - 10.1016/j.forsciint.2020.110250 VL - 310 SP - SN - 1872-6283 KW - Forensic microbiology KW - Bioinformatics KW - Metabarcoding KW - Sample provenance ER - TY - JOUR TI - Bayesian ordinal probit semiparametric regression models: KNHANES 2016 data analysis of the relationship between smoking behavior and coffee intake AU - Lee, Dasom AU - Lee, Eunji AU - Jo, Seogil AU - Choi, Taeryeon T2 - KOREAN JOURNAL OF APPLIED STATISTICS DA - 2020/2// PY - 2020/2// DO - 10.5351/KJAS.2020.33.1.025 VL - 33 IS - 1 SP - 25-46 SN - 2383-5818 KW - BSAR KW - Gaussian process KW - KNHANES data KW - Markov chain Monte Carlo KW - Ordinal probit KW - Semiparametric regression ER - TY - JOUR TI - Mechanistic models of PLC/PKC signaling implicate phosphatidic acid as a key amplifier of chemotactic gradient sensing AU - Nosbisch, Jamie L. AU - Rahman, Anisur AU - Mohan, Krithika AU - Elston, Timothy C. AU - Bear, James E. AU - Haugh, Jason M. T2 - PLOS COMPUTATIONAL BIOLOGY AB - Chemotaxis of fibroblasts and other mesenchymal cells is critical for embryonic development and wound healing. Fibroblast chemotaxis directed by a gradient of platelet-derived growth factor (PDGF) requires signaling through the phospholipase C (PLC)/protein kinase C (PKC) pathway. Diacylglycerol (DAG), the lipid product of PLC that activates conventional PKCs, is focally enriched at the up-gradient leading edge of fibroblasts responding to a shallow gradient of PDGF, signifying polarization. To explain the underlying mechanisms, we formulated reaction-diffusion models including as many as three putative feedback loops based on known biochemistry. These include the previously analyzed mechanism of substrate-buffering by myristoylated alanine-rich C kinase substrate (MARCKS) and two newly considered feedback loops involving the lipid, phosphatidic acid (PA). DAG kinases and phospholipase D, the enzymes that produce PA, are identified as key regulators in the models. 
Paradoxically, increasing DAG kinase activity can enhance the robustness of DAG/active PKC polarization with respect to chemoattractant concentration while decreasing their whole-cell levels. Finally, in simulations of wound invasion, efficient collective migration is achieved with thresholds for chemotaxis matching those of polarization in the reaction-diffusion models. This multi-scale modeling framework offers testable predictions to guide further study of signal transduction and cell behavior that affect mesenchymal chemotaxis. DA - 2020/4// PY - 2020/4// DO - 10.1371/journal.pcbi.1007708 VL - 16 IS - 4 SP - SN - 1553-7358 ER - TY - JOUR TI - Model-free posterior inference on the area under the receiver operating characteristic curve AU - Wang, Zhe AU - Martin, Ryan T2 - JOURNAL OF STATISTICAL PLANNING AND INFERENCE AB - The area under the receiver operating characteristic curve (AUC) serves as a summary of a binary classifier’s performance. For inference on the AUC, a common modeling assumption is binormality, which restricts the distribution of the score produced by the classifier. However, this assumption introduces an infinite-dimensional nuisance parameter and may be restrictive in certain machine learning settings. To avoid making distributional assumptions, and to avoid the computational challenges of a fully nonparametric analysis, we develop a direct and model-free Gibbs posterior distribution for inference on the AUC. We present the asymptotic Gibbs posterior concentration rate, and a strategy for tuning the learning rate so that the corresponding credible intervals achieve the nominal frequentist coverage probability. Simulation experiments and a real data analysis demonstrate the Gibbs posterior’s strong performance compared to existing Bayesian methods. 
DA - 2020/12// PY - 2020/12// DO - 10.1016/j.jspi.2020.03.008 VL - 209 SP - 174-186 SN - 1873-1171 KW - Credible interval KW - Gibbs posterior KW - Generalized bayesian inference KW - Model misspecification KW - Robustness ER - TY - JOUR TI - Spine and dine: A key defensive trait promotes ecological success in spiny ants AU - Blanchard, Benjamin D. AU - Nakamura, Akihiro AU - Cao, Min AU - Chen, Stephanie T. AU - Moreau, Corrie S. T2 - ECOLOGY AND EVOLUTION AB - Abstract A key focus of ecologists is explaining the origin and maintenance of morphological diversity and its association with ecological success. We investigate potential benefits and costs of a common and varied morphological trait, cuticular spines, for foraging behavior, interspecific competition, and predator–prey interactions in naturally co‐occurring spiny ants (Hymenoptera: Formicidae: Polyrhachis ) in an experimental setting. We expect that a defensive trait like spines might be associated with more conspicuous foraging, a greater number of workers sent out to forage, and potentially increased competitive ability. Alternatively, consistent with the ecological trade‐off hypothesis, we expect that investment in spines for antipredator defense might be negatively correlated with these other ecological traits. We find little evidence for any costs to ecological traits, instead finding that species with longer spines either outperform or do not differ from species with shorter spines for all tested metrics, including resource discovery rate and foraging effort as well as competitive ability and antipredator defense. Spines appear to confer broad antipredator benefits and serve as a form of defense with undetectable costs to key ecological abilities like resource foraging and competitive ability, providing an explanation for both the ecological success of the study genus and the large number of evolutionary origins of this trait across all ants. 
This study also provides a rare quantitative empirical test of ecological effects related to a morphological trait in ants. DA - 2020/6// PY - 2020/6// DO - 10.1002/ece3.6322 VL - 10 IS - 12 SP - 5852-5863 SN - 2045-7758 KW - competition KW - defense KW - morphological trait KW - predator-prey interactions KW - spines ER - TY - JOUR TI - Bayesian linear regression for multivariate responses under group sparsity AU - Ning, Bo AU - Jeong, Seonghyun AU - Ghosal, Subhashis T2 - BERNOULLI AB - We study frequentist properties of a Bayesian high-dimensional multivariate linear regression model with correlated responses. The predictors are separated into many groups and the group structure is pre-determined. Two features of the model are unique: (i) group sparsity is imposed on the predictors; (ii) the covariance matrix is unknown and its dimensions can also be high. We choose a product of independent spike-and-slab priors on the regression coefficients and a new prior on the covariance matrix based on its eigendecomposition. Each spike-and-slab prior is a mixture of a point mass at zero and a multivariate density involving the $\ell_{2,1}$-norm. We first obtain the posterior contraction rate, the bounds on the effective dimension of the model with high posterior probabilities. We then show that the multivariate regression coefficients can be recovered under certain compatibility conditions. Finally, we quantify the uncertainty for the regression coefficients with frequentist validity through a Bernstein–von Mises type theorem. The result leads to selection consistency for the Bayesian method. We derive the posterior contraction rate using the general theory by constructing a suitable test from the first principle using moment bounds for certain likelihood ratios. This leads to posterior concentration around the truth with respect to the average Rényi divergence of order $1/2$. 
This technique of obtaining the required tests for posterior contraction rate could be useful in many other problems. DA - 2020/8// PY - 2020/8// DO - 10.3150/20-BEJ1198 VL - 26 IS - 3 SP - 2353-2382 SN - 1573-9759 KW - Bayesian variable selection KW - covariance matrix KW - group sparsity KW - multivariate linear regression KW - posterior contraction rate KW - Renyi divergence KW - spike-and-slab prior ER - TY - JOUR TI - Rating exotic price coverage in crop revenue insurance AU - Ramsey, A. Ford AU - Ghosh, Sujit K. AU - Goodwin, Barry K. T2 - AGRICULTURAL FINANCE REVIEW AB - Purpose: Revenue insurance is the most popular form of insurance available in the US federal crop insurance program. The majority of crop revenue policies are sold with a harvest price replacement feature that pays out on lost crop yields at the maximum of a realized or projected harvest price. The authors introduce a novel actuarial and statistical approach to rate revenue insurance policies with exotic price coverage: the payout depends on an order statistic or average of prices. The authors examine the price implications of different dependence models and demonstrate the feasibility of policies of this type. Design/methodology/approach: Hierarchical Archimedean copulas and vine copulas are used to model dependence between prices and yields and serial dependence of prices. The authors construct several synthetic exotic price coverage insurance policies and evaluate the impact of copula models on policies covering different types of risk. Findings: The authors’ findings show that the price of exotic price coverage policies is sensitive to the choice of dependence model. Serial dependence varies across the growing season. It is possible to accurately price exotic coverage policies and we suggest these add-ons as a possible avenue for developing private crop insurance markets. 
Originality/value: The authors apply hierarchical Archimedean copulas and vine copulas that allow for flexibility in the modeling of multivariate dependence. Unlike previous research, which has primarily considered dependence across space, the form of exotic price coverage requires modeling serial dependence in relative prices. Results are important for this segment of the agricultural insurance market: one of the main areas that insurers can develop private products around the federal program. DA - 2020/// PY - 2020/// DO - 10.1108/AFR-10-2019-0107 VL - 80 IS - 5 SP - 609-631 SN - 2041-6326 KW - Crop revenue insurance KW - Nested copulas KW - Domestic credit KW - Exotic price coverage ER - TY - JOUR TI - Exploring the Usefulness of Meteorological Data for Predicting Malaria Cases in Visakhapatnam, Andhra Pradesh AU - Sehgal, Meena AU - Ghosh, Sujit T2 - WEATHER CLIMATE AND SOCIETY AB - Malaria and dengue fever are among the most important vectorborne diseases in the tropics and subtropics. Average weekly meteorological parameters—specifically, minimum temperature, maximum temperature, humidity, and rainfall—were collected using data from 100 automated weather stations from the Indian Space Research Organization. We obtained district-level weekly reported malaria cases from the Integrated Disease Surveillance Program (IDSP), Department of Health and Family Welfare, Andhra Pradesh, India, for three years, 2014–16. We used a generalized linear model with Poisson distribution and default logarithm-link to estimate model parameters, and we used a quasi-Poisson method with a generalized additive model that uses nonparametric regression with smoothing splines. It appears that higher minimum temperatures (e.g., >24°C) tend to lead to higher malaria counts but lower values do not seem to have an impact on the malaria counts. On the other hand, higher values of maximum temperature (e.g., >32°C) seem to negatively affect the malaria counts. 
The relationships with rainfall and humidity appear to be not as strong once we account for smooth (weekly) trends and temperatures; both smooth curves seem to hover around zero across all of their values. We note that a rainfall amount between 40 and 50 mm seems to have a positive impact on malaria counts. Our analyses show that the incremental increase in meteorological parameters does not lead to an increase in reported malaria cases in the same manner for all of the districts within the same state. This suggests that other factors such as vegetation, elevation, and water index in the environment also influence disease occurrence. DA - 2020/4// PY - 2020/4// DO - 10.1175/WCAS-D-19-0029.1 VL - 12 IS - 2 SP - 323-330 SN - 1948-8335 KW - Atmosphere KW - Asia KW - Air quality KW - Climate change ER - TY - JOUR TI - Variable selection in functional linear concurrent regression AU - Ghosal, Rahul AU - Maity, Arnab AU - Clark, Timothy AU - Longo, Stefano B. T2 - JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS AB - We propose a novel method for variable selection in functional linear concurrent regression. Our research is motivated by a fisheries footprint study where the goal is to identify important time-varying sociostructural drivers influencing patterns of seafood consumption, and hence the fisheries footprint, over time, as well as estimating their dynamic effects. We develop a variable-selection method in functional linear concurrent regression extending the classically used scalar-on-scalar variable-selection methods like the lasso, smoothly clipped absolute deviation (SCAD) and minimax concave penalty (MCP). We show that in functional linear concurrent regression the variable-selection problem can be addressed as a group lasso problem or one of its natural extensions: a group SCAD or group MCP problem. 
Through simulations, we illustrate that our method, particularly with the group SCAD or group MCP, can pick out the relevant variables with high accuracy and has minuscule false positive and false negative rate even when data are observed sparsely, are contaminated with noise and the error process is highly non-stationary. We also demonstrate two real data applications of our method in studies of dietary calcium absorption and fisheries footprint in the selection of influential time-varying covariates. DA - 2020/6// PY - 2020/6// DO - 10.1111/rssc.12408 VL - 69 IS - 3 SP - 565-587 SN - 1467-9876 KW - Fisheries footprint KW - Functional linear concurrent regression KW - Variable selection ER - TY - JOUR TI - Form-stable phase-change elastomer gels derived from thermoplastic elastomer copolyesters swollen with fatty acids AU - Armstrong, Daniel P. AU - Chatterjee, Kony AU - Ghosh, Tushar K. AU - Spontak, Richard J. T2 - THERMOCHIMICA ACTA AB - Phase-change materials (PCMs) are of considerable scientific and technological interest in applications related to energy management and storage, especially as they pertain to residential or commercial construction and packaging. Most PCMs developed for these purposes consist of a crystallizable species encapsulated within an impermeable polymeric shell. Such encapsulants can then be strategically embedded throughout a construct to promote thermal stability in close proximity to the normal melting point of the encapsulated species. In this study, we introduce form-stable PCMs, which avoid the need for costly and inconvenient encapsulation and consist of commercial thermoplastic elastomer copolyesters selectively swollen with crystallizable fatty acids. Since the copolyester matrices endow the PCMs with solid-like characteristics even when swollen with liquid, we refer to this particular class of materials as phase-change elastomer gels (PCEGs). 
In this study, we explore the thermal characteristics of PCEG films wherein the copolyester grade, gel composition and fatty acid are all varied. Our results indicate that these PCEGs exhibit non-hysteretic thermal cycling, unaffected transition temperatures, and competitive latent transition heats. Relative to model and commercially available encapsulated PCMs, the form-stable PCEGs examined here afford an alternative capable of superior thermal performance and versatility. DA - 2020/4// PY - 2020/4// DO - 10.1016/j.tca.2020.178566 VL - 686 SP - SN - 1872-762X KW - Thermoplastic elastomer KW - Physical crosslinking KW - Thermal storage KW - Phase-change material KW - Energy conservation ER - TY - JOUR TI - A STATISTICAL ANALYSIS OF NOISY CROWDSOURCED WEATHER DATA AU - Chakraborty, Arnab AU - Lahiri, Soumendra Nath AU - Wilson, Alyson T2 - ANNALS OF APPLIED STATISTICS AB - Spatial prediction of weather elements like temperature, precipitation, and barometric pressure are generally based on satellite imagery or data collected at ground stations. None of these data provide information at a more granular or “hyperlocal” resolution. On the other hand, crowdsourced weather data, which are captured by sensors installed on mobile devices and gathered by weather-related mobile apps like WeatherSignal and AccuWeather, can serve as potential data sources for analyzing environmental processes at a hyperlocal resolution. However, due to the low quality of the sensors and the nonlaboratory environment, the quality of the observations in crowdsourced data is compromised. This paper describes methods to improve hyperlocal spatial prediction using this varying-quality, noisy crowdsourced information. We introduce a reliability metric, namely Veracity Score (VS), to assess the quality of the crowdsourced observations using a coarser, but high-quality, reference data. A VS-based methodology to analyze noisy spatial data is proposed and evaluated through extensive simulations. 
The merits of the proposed approach are illustrated through case studies analyzing crowdsourced daily average ambient temperature readings for one day in the contiguous United States. DA - 2020/3// PY - 2020/3// DO - 10.1214/19-AOAS1290 VL - 14 IS - 1 SP - 116-142 SN - 1932-6157 KW - Veracity score KW - geostatistics KW - robust kriging KW - hyperlocal spatial prediction ER - TY - JOUR TI - FastLORS: Joint modelling for expression quantitative trait loci mapping in R AU - Rhyne, Jacob AU - Jeng, X. Jessie AU - Chi, Eric C. AU - Tzeng, Jung-Ying T2 - STAT AB - FastLORS is a software package that implements a new algorithm to solve sparse multivariate regression for expression quantitative trait loci (eQTLs) mapping. FastLORS solves the same optimization problem as LORS, an existing popular algorithm. The optimization problem is solved through inexact block coordinate descent with updates by proximal gradient steps, which reduces the computational cost compared with LORS. We apply LORS and FastLORS to a real dataset for eQTL mapping and demonstrate that FastLORS delivers comparable results with LORS in much less computing time. DA - 2020/// PY - 2020/// DO - 10.1002/sta4.265 VL - 9 IS - 1 SP - SN - 2049-1573 UR - https://doi.org/10.1002/sta4.265 KW - block coordinate descent KW - eQTL mapping KW - low-rank approximation KW - proximal gradient descent KW - sparse regression ER - TY - JOUR TI - Untargeted metabolomic profiling identifies disease-specific signatures in food allergy and asthma AU - Crestani, Elena AU - Harb, Hani AU - Charbonnier, Louis-Marie AU - Leirer, Jonathan AU - Motsinger-Reif, Alison AU - Rachid, Rima AU - Phipatanakul, Wanda AU - Kaddurah-Daouk, Rima AU - Chatila, Talal A. T2 - JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY AB - Background: Food allergy (FA) affects an increasing proportion of children for reasons that remain obscure. 
Novel disease biomarkers and curative treatment options are strongly needed. Objective: We sought to apply untargeted metabolomic profiling to identify pathogenic mechanisms and candidate disease biomarkers in patients with FA. Methods: Mass spectrometry–based untargeted metabolomic profiling was performed on serum samples of children with either FA alone, asthma alone, or both FA and asthma, as well as healthy pediatric control subjects. Results: In this pilot study patients with FA exhibited a disease-specific metabolomic signature compared with both control subjects and asthmatic patients. In particular, FA was uniquely associated with a marked decrease in sphingolipid levels, as well as levels of a number of other lipid metabolites, in the face of normal frequencies of circulating natural killer T cells. Specific comparison of patients with FA and asthmatic patients revealed differences in the microbiota-sensitive aromatic amino acid and secondary bile acid metabolism. Children with both FA and asthma exhibited a metabolomic profile that aligned with that of FA alone but not asthma. Among children with FA, the history of severe systemic reactions and the presence of multiple FAs were associated with changes in levels of tryptophan metabolites, eicosanoids, plasmalogens, and fatty acids. Conclusions: Children with FA have a disease-specific metabolomic profile that is informative of disease mechanisms and severity and that dominates in the presence of asthma. Lower levels of sphingolipids and ceramides and other metabolomic alterations observed in children with FA might reflect the interplay between an altered microbiota and immune cell subsets in the gut. 
DA - 2020/3// PY - 2020/3// DO - 10.1016/j.jaci.2019.10.014 VL - 145 IS - 3 SP - 897-906 SN - 1097-6825 KW - Asthma KW - food allergy KW - invariant natural killer T cells KW - metabolomics KW - metabolites KW - secondary bile acids KW - sphingolipids KW - tryptophan ER - TY - JOUR TI - Aridity Trends in Central America: A Spatial Correlation Analysis AU - Córdoba, Marcela Alfaro AU - Hidalgo, Hugo AU - Alfaro, Eric T2 - Atmosphere AB - Trend analyses are common in several types of climate change studies. 
In many cases, finding evidence that the trends are different from zero in hydroclimate variables is of particular interest. However, when estimating the confidence interval of a set of hydroclimate stations or gridded data, the spatial correlation between them can affect the significance assessment using, for example, traditional non-parametric and parametric methods. For this reason, Monte Carlo simulations are needed in order to generate maps of corrected trend significance. In this article, we determined the significance of trends in aridity, modeled runoff using the Variable Infiltration Capacity Macroscale Hydrological model, Hargreaves potential evapotranspiration (PET) and near-surface temperature in Central America. Linear-regression models were fitted considering that the predictor variable is the time variable (years from 1970 to 1999) and the predictand variable corresponds to each of the previously mentioned hydroclimate variables. In order to establish if the temporal trends were significantly different from zero, a Mann–Kendall and a Monte Carlo test were used. The spatial correlation was calculated first to correct the variance of each trend. It was assumed in this case that the trends form a spatial stochastic process that can be modeled as such. Results show that the analysis considering the spatial correlation proposed here can be used for identifying those extreme trends. However, a set of variables with strong spatial correlation such as temperature can have robust and widespread significant trends assuming independence, but the vast majority of the stations can still fail the Monte Carlo test. We must be vigilant of the statistically robust changes in key primary parameters such as temperature and precipitation, which are the driving sources of hydrological alterations that may affect social and environmental systems in the future. 
DA - 2020/4/23/ PY - 2020/4/23/ DO - 10.3390/atmos11040427 UR - http://dx.doi.org/10.3390/atmos11040427 KW - aridity KW - Central American climate KW - spatial correlation KW - trend analysis KW - variability ER - TY - JOUR TI - Tuning parameter selection for penalised empirical likelihood with a diverging number of parameters AU - Zheng, Chaowen AU - Wu, Yichao T2 - JOURNAL OF NONPARAMETRIC STATISTICS AB - Penalised likelihood methods have been a success in analysing high dimensional data. Tang and Leng [(2010), ‘Penalized High-Dimensional Empirical Likelihood’, Biometrika, 97(4), 905–920] extended the penalisation approach to the empirical likelihood scenario and showed that the penalised empirical likelihood estimator could identify the true predictors consistently in the linear regression models. However, this desired selection consistency property of the penalised empirical likelihood method relies heavily on the choice of the tuning parameter. In this work, we propose a tuning parameter selection procedure for penalised empirical likelihood to guarantee that this selection consistency can be achieved. Specifically, we propose a generalised information criterion (GIC) for the penalised empirical likelihood in the linear regression case. We show that the tuning parameter selected by the GIC yields the true model consistently even when the number of predictors diverges to infinity with the sample size. We demonstrate the performance of our procedure by numerical simulations and a real data analysis. DA - 2020/1/2/ PY - 2020/1/2/ DO - 10.1080/10485252.2020.1717491 VL - 32 IS - 1 SP - 246-261 SN - 1029-0311 KW - Tuning parameter selection KW - variable selection KW - generalised information criterion KW - empirical likelihood ER - TY - JOUR TI - Incorporating Nearest-Neighbor Site Dependence into Protein Evolution Models AU - Larson, Gary AU - Thorne, Jeffrey L. 
AU - Schmidler, Scott T2 - JOURNAL OF COMPUTATIONAL BIOLOGY AB - Evolutionary models of proteins are widely used for statistical sequence alignment and inference of homology and phylogeny. However, the vast majority of these models rely on an unrealistic assumption of independent evolution between sites. Here we focus on the related problem of protein structure alignment, a classic tool of computational biology that is widely used to identify structural and functional similarity and to infer homology among proteins. A site-independent statistical model for protein structural evolution has previously been introduced and shown to significantly improve alignments and phylogenetic inferences compared with approaches that utilize only amino acid sequence information. Here we extend this model to account for correlated evolutionary drift among neighboring amino acid positions. The result is a spatiotemporal model of protein structure evolution, described by a multivariate diffusion process convolved with a spatial birth–death process. This extended site-dependent model (SDM) comes with little additional computational cost or analytical complexity compared with the site-independent model (SIM). We demonstrate that this SDM yields a significant reduction of bias in estimated evolutionary distances and helps further improve phylogenetic tree reconstruction. We also develop a simple model of site-dependent sequence evolution, which we use to demonstrate the bias resulting from the application of standard site-independent sequence evolution models. DA - 2020/3/1/ PY - 2020/3/1/ DO - 10.1089/cmb.2019.0500 VL - 27 IS - 3 SP - 361-375 SN - 1557-8666 KW - diffusion process KW - dynamic programming KW - evolution KW - phylogeny KW - protein structure ER - TY - JOUR TI - Evidence for temperature-dependent shifts in spawning times of anadromous alewife (Alosa pseudoharengus) and blueback herring (Alosa aestivalis) AU - Lombardo, Steven M. AU - Buckel, Jeffrey A. AU - Hain, Ernie F. 
AU - Griffith, Emily H. AU - White, Holly T2 - CANADIAN JOURNAL OF FISHERIES AND AQUATIC SCIENCES AB - We analyzed four decades of presence–absence data from a fishery-independent survey to characterize the long-term phenology of river herring (alewife, Alosa pseudoharengus; and blueback herring, Alosa aestivalis) spawning migrations in their southern distribution. We used logistic generalized additive models to characterize the average ingress, peak, and egress timing of spawning. In the 2010s, alewife arrived to spawning habitat 16 days earlier and egressed 27 days earlier (peak 12 days earlier) relative to the 1970s. Blueback herring arrived 5 days earlier and egressed 23 days earlier (peak 13 days earlier) in the 2010s relative to the 1980s. The changes in ingress and egress timing have shortened the occurrence in spawning systems by 11 days for alewife over four decades and 18 days for blueback herring over three decades. We found that the rate of vernal warming was faster during 2001–2016 relative to 1973–1988 and is the most parsimonious explanation for changes in spawning phenology. The influence of a shortened spawning season on river herring population dynamics warrants further investigation. DA - 2020/4// PY - 2020/4// DO - 10.1139/cjfas-2019-0140 VL - 77 IS - 4 SP - 741-751 SN - 1205-7533 ER - TY - JOUR TI - Probabilistic Detection and Estimation of Conic Sections From Noisy Data AU - Guha, Subharup AU - Ghosh, Sujit K. T2 - JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS AB - Subharup Guha (Department of Biostatistics, University of Florida, Gainesville, FL) and Sujit K. 
Ghosh (Department of Statistics, North Carolina State University, Raleigh, NC) DA - 2020/7/2/ PY - 2020/7/2/ DO - 10.1080/10618600.2020.1737084 VL - 29 IS - 3 SP - 513-522 SN - 1537-2715 UR - https://doi.org/10.1080/10618600.2020.1737084 KW - Bayesian hierarchical model KW - Bernstein basis polynomials KW - Focus-directrix approach KW - Markov chain Monte Carlo KW - Metropolis-Hastings algorithm KW - Partial conics ER - TY - JOUR TI - Correlation models for monitoring fetal growth AU - Feng, Yuan AU - Xiao, Luo AU - Li, Cai AU - Chen, Stephanie T. AU - Ohuma, Eric O. T2 - STATISTICAL METHODS IN MEDICAL RESEARCH AB - Ultrasound growth measurements are monitored to evaluate if a fetus is growing normally compared with a defined standard chart at a specified gestational age. Using data from the Fetal Growth Longitudinal Study of the INTERGROWTH-21st project, we have modelled the longitudinal dependence of fetal head circumference, biparietal diameter, occipito-frontal diameter, abdominal circumference, and femur length using a two-stage approach. The first stage involved finding a suitable transformation of the raw fetal measurements (as the marginal distributions of ultrasound measurements were non-normal) to standardized deviations (Z-scores). In the second stage, a correlation model for a Gaussian process is fitted, yielding a correlation for any pair of observations made between 14 and 40 weeks. The correlation structure of the fetal Z-score can be used to assess whether the growth, for example, between successive measurements is satisfactory. The paper is accompanied by a Shiny application, see https://lxiao5.shinyapps.io/shinycalculator/ . 
DA - 2020/10// PY - 2020/10// DO - 10.1177/0962280220905623 VL - 29 IS - 10 SP - 2795-2813 SN - 1477-0334 KW - Fetal health KW - longitudinal study KW - correlation KW - reference chart ER - TY - JOUR TI - Using Irrigation to Increase Stormwater Mitigation Potential of Rainwater Harvesting Systems AU - Gee, K. D. AU - Hunt, W. F. AU - Peacock, C. H. AU - Woodward, M. D. AU - Arellano, C. T2 - JOURNAL OF SUSTAINABLE WATER IN THE BUILT ENVIRONMENT AB - Rainwater harvesting (RWH) systems used for irrigation often provide fewer stormwater management benefits than systems used for year-round, nondiscretionary purposes because there is diminished demand for harvested rainwater during the nongrowing season or rainy periods. Thus, identifying demands during these periods would improve the stormwater mitigation potential of RWH systems. This study evaluated how irrigating bermudagrass year-round at rates exceeding those for minimum water conservation affected the stormwater benefits provided by an RWH system. Results indicated significant increases in runoff volume retention when turf was irrigated at 25 and 50 mm/week, compared to an evapotranspiration/effective precipitation (or agronomic)–based regime. While overall soil moisture content increased with irrigation rate, there were no concomitant increases in pest occurrences or runoff generation. Turf quality did not differ from the control irrigation regime for either application rate, and there were no indications of soil nitrate leaching. Irrigating at rates up to 50 mm/week resulted in stormwater volume reductions up to 65% without causing a decline in turf quality. DA - 2020/5/1/ PY - 2020/5/1/ DO - 10.1061/JSWBAY.0000913 VL - 6 IS - 2 SP - SN - 2379-6111 ER - TY - JOUR TI - Water Quality and Hydrologic Performance of Two Dry Detention Basins Receiving Highway Stormwater Runoff in the Piedmont Region of North Carolina AU - Wissler, Austin D. AU - Hunt, William F. AU - McLaughlin, Richard A. 
T2 - JOURNAL OF SUSTAINABLE WATER IN THE BUILT ENVIRONMENT AB - Dry detention basins (DDBs) are a stormwater control measure (SCM) designed to provide flood storage, peak discharge abatement, and some water quality improvement through sedimentation; however, little data characterize DDB water quality performance in the highway environment. In this study, two DDBs [Hughes Farm Road and Poole Road basin (HFR and PRB henceforth)], constructed in 2010, mowed twice a year, receiving highway runoff, and located in the Piedmont of North Carolina (NC), USA, were monitored for up to 11 months. Flow-weighted composite samples were collected during storm events and analyzed for total phosphorus (TP); ortho-phosphorus (OP); ammonia (NH3); nitrate-nitrite (NOX); total Kjeldahl nitrogen (TKN); total suspended solids (TSS); and total Cd, Cu, Pb, and Zn. Influent runoff concentrations were similar to other studies in NC, and the monitoring revealed significant concentration reductions for most constituents in HFR. PRB significantly reduced concentrations for all pollutants except TSS, particulate phosphorus, and NH3, while significantly exporting Zn. HFR exhibited soil infiltration that led to significant pollutant load reductions (LRs) for all analytes except Cu. PRB exhibited little infiltration but had significant LRs for dissolved nutrients. This study provides evidence that DDB inlet and outlet configuration and the presence of standing water may impact DDB water quality improvement. DA - 2020/5/1/ PY - 2020/5/1/ DO - 10.1061/JSWBAY.0000915 VL - 6 IS - 2 SP - SN - 2379-6111 ER - TY - JOUR TI - A Functional Metric Approach to Assess Biosimilarity With Application to Rheumatoid Arthritis Trials AU - Ghosh, Sujit K. AU - Dong, Lin T2 - STATISTICS IN BIOPHARMACEUTICAL RESEARCH AB - In recent years there has been a lot of interest in testing for similarity between biological drug products, commonly known as biologics. 
Biologics are large and complex molecule drugs that are produced by living cells and hence are sensitive to environmental changes. In addition, biologics usually induce antibodies, which raise safety and efficacy issues. The manufacturing process is also much more complicated and often costlier than that of small-molecule generic drugs. Because of these complexities and the inherent variability of biologics, the testing paradigm of traditional generic drugs cannot be directly used to test for biosimilarity. Taking into account some of these concerns, we propose a functional distance based methodology that takes into consideration the entire time course of the study and is based on a class of flexible semiparametric models. The empirical results show that the proposed approach is more sensitive than the classical equivalence-test approach, which is usually based on an arbitrarily chosen time point. Bootstrap based methodologies are also presented for statistical inference. DA - 2020/4/2/ PY - 2020/4/2/ DO - 10.1080/19466315.2020.1733071 VL - 12 IS - 2 SP - 234-243 SN - 1946-6315 UR - https://doi.org/10.1080/19466315.2020.1733071 KW - Bernstein polynomials KW - Binary responses KW - Rheumatic arthritis KW - Semiparametric models ER - TY - JOUR TI - Low-Dose Silver Nanoparticle Surface Chemistry and Temporal Effects on Gene Expression in Human Liver Cells AU - House, John S. AU - Bouzos, Evangelia AU - Fahy, Kira M. AU - Francisco, Victorino Miguel AU - Lloyd, Dillon T. AU - Wright, Fred A. AU - Motsinger-Reif, Alison A. AU - Asuri, Prashanth AU - Wheeler, Korin E. T2 - SMALL AB - Silver nanoparticles (AgNPs) are widely incorporated into consumer and biomedical products for their antimicrobial and plasmonic properties with limited risk assessment of low-dose cumulative exposure in humans. 
To evaluate cellular responses to low-dose AgNP exposures across time, human liver cells (HepG2) are exposed to AgNPs with three different surface charges (1.2 µg mL⁻¹) and complete gene expression is monitored across a 24 h period. Time and AgNP surface chemistry mediate gene expression. In addition, since cells are fed, time has marked effects on gene expression that should be considered. Surface chemistry of AgNPs alters gene transcription in a time-dependent manner, with the most dramatic effects in cationic AgNPs. Universal to all surface coatings, AgNP-treated cells responded by inactivating proliferation and enabling cell cycle checkpoints. Further analysis of these universal features of AgNP cellular response, as well as more detailed analysis of specific AgNP treatments, time points, or specific genes, is facilitated with an accompanying application. Taken together, these results provide a foundation for understanding hepatic response to low-dose AgNPs for future risk assessment. DA - 2020/5// PY - 2020/5// DO - 10.1002/smll.202000299 VL - 16 IS - 21 SP - SN - 1613-6829 KW - nanotoxicity KW - silver nanoparticles KW - transcriptomics ER - TY - JOUR TI - Effective SNP ranking improves the performance of eQTL mapping AU - Jeng, X. Jessie AU - Rhyne, Jacob AU - Zhang, Teng AU - Tzeng, Jung-Ying T2 - GENETIC EPIDEMIOLOGY AB - Genome‐wide expression quantitative trait loci (eQTLs) mapping explores the relationship between gene expression and DNA variants, such as single‐nucleotide polymorphisms (SNPs), to understand the genetic basis of human diseases. Due to the large number of genes and SNPs that need to be assessed, current methods for eQTL mapping often suffer from low detection power, especially for identifying trans‐eQTLs. In this paper, we propose the idea of performing SNP ranking based on the higher criticism statistic, a summary statistic developed in large‐scale signal detection. 
We illustrate how the HC‐based SNP ranking can effectively prioritize eQTL signals over noise, greatly reduce the burden of joint modeling, and improve the power for eQTL mapping. Numerical results in simulation studies demonstrate the superior performance of our method compared to existing methods. The proposed method is also evaluated in HapMap eQTL data analysis and the results are compared to a database of known eQTLs. DA - 2020/9// PY - 2020/9// DO - 10.1002/gepi.22293 VL - 44 IS - 6 SP - 611-619 SN - 1098-2272 KW - HC ranking KW - hotspot KW - multivariate response KW - penalized regression KW - trans-eQTL ER - TY - JOUR TI - Modeling buffer capacity and pH in acid and acidified foods AU - Price, Robert E. AU - Longtin, Madyson AU - Conley-Payton, Summer AU - Osborne, Jason A. AU - Johanningsmeier, Suzanne D. AU - Bitzer, Donald AU - Breidt, Fred T2 - JOURNAL OF FOOD SCIENCE AB - Standard ionic equilibria equations may be used for calculating pH of weak acid and base solutions. These calculations are difficult or impossible to solve analytically for foods that include many unknown buffering components, making pH prediction in these systems impractical. We combined buffer capacity (BC) models with a pH prediction algorithm to allow pH prediction in complex food matrices from BC data. Numerical models were developed using Matlab software to estimate the pH and buffering components for mixtures of weak acid and base solutions. The pH model was validated with laboratory solutions of acetic or citric acids with ammonia, in combinations with varying salts using Latin hypercube designs. Linear regressions of observed versus predicted pH values based on the concentration and pK values of the solution components resulted in estimated slopes between 0.96 and 1.01 with and without added salts. BC models were generated from titration curves for 0.6 M acetic acid or 12.4 mM citric acid resulting in acid concentration and pK estimates. 
Predicted pH values from these estimates were within 0.11 pH units of the measured pH. Acetic acid concentration measurements based on the model were within 6% accuracy compared to high-performance liquid chromatography measurements for concentrations less than 400 mM, although they were underestimated above that. The models may have application for use in determining the BC of food ingredients with unknown buffering components. Predicting pH changes for food ingredients using these models may be useful for regulatory purposes with acid or acidified foods and for product development. PRACTICAL APPLICATION: Buffer capacity models may benefit regulatory agencies and manufacturers of acid and acidified foods to determine pH stability (below pH 4.6) and how low-acid food ingredients may affect the safety of these foods. Predicting pH for solutions with known or unknown buffering components was based on titration data and models that use only monoprotic weak acids and bases. These models may be useful for product development and food safety by estimating pH and buffering capacity. DA - 2020/4// PY - 2020/4// DO - 10.1111/1750-3841.15091 VL - 85 IS - 4 SP - 918-925 SN - 1750-3841 KW - acid KW - base KW - acid foods KW - acidified foods KW - buffer capacity KW - buffer model KW - pH ER - TY - JOUR TI - Managing a Destructive, Episodic Crop Disease: A National Survey of Wheat and Barley Growers' Experience With Fusarium Head Blight AU - Cowger, Christina AU - Smith, Joy AU - Boos, Dennis AU - Bradley, Carl A. AU - Ransom, Joel AU - Bergstrom, Gary C. T2 - PLANT DISEASE AB - The main techniques for minimizing Fusarium head blight (FHB, or scab) and deoxynivalenol in wheat and barley are well established and generally available: planting of moderately FHB-resistant cultivars, risk monitoring, and timely use of the most effective fungicides. Yet the adoption of these techniques remains uneven across the FHB-prone portions of the U.S. cereal production area. 
A national survey was undertaken by the U.S. Wheat and Barley Scab Initiative in 17 states where six market classes of wheat and barley are grown. In 2014, 5,107 usable responses were obtained. The highest percentages reporting losses attributable to FHB in the previous 5 years were in North Dakota, Maryland, Kentucky, and states bordering the Great Lakes but across all states, ≥75% of respondents reported no FHB-related losses in the previous 5 years. Adoption of cultivar resistance was uneven by state and market class and was low except among hard red spring wheat growers. In 13 states, a majority of respondents had not applied an FHB-targeted fungicide in the previous 5 years. Although the primary FHB information source varied by state, crop consultants were considered to be an important source or their primary source of information on risk or management of FHB by the largest percentage of respondents. Use of an FHB risk forecasting website was about twice as high in North Dakota as the 17-state average of 6%. The most frequently cited barriers to adopting FHB management practices were weather or logistics preventing timely fungicide application, difficulty in determining flowering timing for fungicide applications, and the impracticality of FHB-reducing rotations. The results highlight the challenges of managing an episodically damaging crop disease and point to specific areas for improvement. DA - 2020/3// PY - 2020/3// DO - 10.1094/PDIS-10-18-1803-SR VL - 104 IS - 3 SP - 634-648 SN - 1943-7692 KW - cereals and grains KW - chemical cultivar/resistance KW - disease management KW - disease warning systems KW - epidemiology KW - field crops KW - fungi ER - TY - JOUR TI - Growth performance, oxidative stress and immune status of newly weaned pigs fed peroxidized lipids with or without supplemental vitamin E or polyphenols AU - Silva-Guillen, Y. V. AU - Arellano, C. AU - Boyd, R. D. AU - Martinez, G. AU - Heugten, E. 
T2 - JOURNAL OF ANIMAL SCIENCE AND BIOTECHNOLOGY AB - This study evaluated the use of dietary vitamin E and polyphenols on growth, immune and oxidative status of weaned pigs fed peroxidized lipids. A total of 192 piglets (21 days of age and body weight of 6.62 ± 1.04 kg) were assigned within sex and weight blocks to a 2 × 3 factorial arrangement using 48 pens with 4 pigs per pen. Dietary treatments consisted of lipid peroxidation (6% edible soybean oil or 6% peroxidized soybean oil), and antioxidant supplementation (control diet containing 33 IU/kg DL-α-tocopheryl-acetate; control with 200 IU/kg additional DL-α-tocopheryl-acetate; or control with 400 mg/kg polyphenols). Pigs were fed in 2 phases for 14 and 21 days, respectively. Peroxidation of oil for 12 days at 80 °C with exposure to 50 L/min of air substantially increased peroxide values, anisidine value, hexanal, and 2,4-decadienal concentrations. Feeding peroxidized lipids decreased (P < 0.001) body weight (23.16 vs. 18.74 kg), daily gain (473 vs. 346 g/d), daily feed intake (658 vs. 535 g/d) and gain:feed ratio (719 vs. 647 g/kg). Lipid peroxidation decreased serum vitamin E (P < 0.001) and this decrease was larger on day 35 (1.82 vs. 0.81 mg/kg) than day 14 (1.95 vs. 1.38 mg/kg). Supplemental vitamin E, but not polyphenols, increased (P ≤ 0.002) serum vitamin E by 84% and 22% for control and peroxidized diets, respectively (interaction, P = 0.001). Serum malondialdehyde decreased (P < 0.001) with peroxidation on day 14, but not day 35 and protein carbonyl increased (P < 0.001) with peroxidation on day 35, but not day 14. Serum 8-hydroxydeoxyguanosine was not affected (P > 0.05). Total antioxidant capacity decreased with peroxidation (P < 0.001) and increased with vitamin E (P = 0.065) and polyphenols (P = 0.046) for the control oil diet only. 
Serum cytokine concentrations increased with feeding peroxidized lipids on day 35, but were not affected by antioxidant supplementation (P > 0.05). Feeding peroxidized lipids negatively impacted growth performance and antioxidant capacity of nursery pigs. Supplementation of vitamin E and polyphenols improved total antioxidant capacity, especially in pigs fed control diets, but did not restore growth performance. DA - 2020/3/5/ PY - 2020/3/5/ DO - 10.1186/s40104-020-0431-9 VL - 11 IS - 1 SP - SN - 2049-1891 KW - Antioxidants KW - Immune status KW - Lipid peroxidation KW - Oxidative stress KW - Piglets KW - Polyphenols KW - Vitamin E ER - TY - JOUR TI - Heatwave duration: Characterizations using probabilistic inference AU - Raha, Sohini AU - Ghosh, Sujit K. T2 - ENVIRONMETRICS AB - Abstract Characterization of heatwave duration is becoming increasingly important in environmental research as they pose a significant threat to many human lives worldwide. Although several quantifications of the extremities of a heatwave have been proposed in the literature, they are mostly improvised and there does not exist a universally accepted definition of heatwave. In this article, we devise a probabilistic inferential framework to characterize heatwave and come up with a definition that can capture the essence of all existing ad hoc definitions. We derive an exact distribution on the frequency of such durations for a stationary Markov process and also an approximate distribution of durations for a stationary non‐Markov time series. For a given site, using a daily time series (of ambient temperature or heat‐index), we define a heatwave as the number of sustained days above a given threshold using the probability distribution of the durations. We illustrate the proposed methodology using daily time series of ambient temperature for a fixed site (of Atlanta) and also using the USCRN consisting of 126 sites across the United States. 
Furthermore, we also derive an empirical quadratic curve based relationship between expected durations and extreme thresholds. The proofs of the theorems, datasets, algorithms, and computer codes are provided in the supplementary materials. DA - 2020/8// PY - 2020/8// DO - 10.1002/env.2626 VL - 31 IS - 5 SP - SN - 1099-095X UR - https://doi.org/10.1002/env.2626 KW - Bayesian KW - hierarchical model KW - Poisson approximation KW - sum of dependent Bernoulli sequence ER - TY - JOUR TI - Hydrologic and water quality performance of two aging and unmaintained dry detention basins receiving highway stormwater runoff AU - Wissler, Austin D. AU - Hunt, William F. AU - McLaughlin, Richard A. T2 - JOURNAL OF ENVIRONMENTAL MANAGEMENT AB - Dry detention basins (DDBs) are a type of stormwater control measure (SCM) designed to provide flood storage, peak discharge reduction, and some water quality improvement through sedimentation. DDBs are ubiquitous in the urban environment, but are expensive to maintain. In this study, two overgrown DDBs near Raleigh, NC, receiving highway runoff were monitored for up to one year to quantify their water quality and hydrologic performance. Both basins, B1 and B2, have not received vegetation maintenance since construction in 2007. Flow-weighted composite samples were collected during storm events and analyzed for nutrients (Total Phosphorus (TP), Ortho-phosphorus (OP), Ammonia-N (NH3), NO2-3-N (NOX), and Total Kjeldahl Nitrogen (TKN)), total suspended solids (TSS), and total Cd, Cu, Pb, and Zn. An annual water balance was also conducted to quantify runoff volume reduction. Despite low influent concentrations from the highway, significant removal efficiencies were found for all constituents except NH3 in B1. TP, OP, NOX, TSS, and Zn were reduced in B2. 
Both basins achieved greater than 41% volume reduction through soil infiltration and evapotranspiration, resulting in significant pollutant load reductions for all detected constituents, between 59% and 79% in B1 and 35% and 81% in B2. This study provides evidence that overgrown and unmaintained DDBs can reduce pollutant concentrations comparable to those reported for maintained DDBs, while reducing more volume than standard DDBs. Moreover, carbon sequestration likely increases while maintenance costs decrease. DA - 2020/2/1/ PY - 2020/2/1/ DO - 10.1016/j.jenvman.2019.109853 VL - 255 SP - SN - 1095-8630 KW - Highway KW - Stormwater KW - Dry detention basin KW - Maintenance KW - Non-point source pollution KW - Carbon sequestration ER - TY - JOUR TI - Robust estimation for moment condition models with data missing not at random AU - Li, W. AU - Yang, S. AU - Han, P. T2 - Journal of Statistical Planning and Inference AB - We consider estimation for parameters defined through moment conditions when data are missing not at random. The missingness mechanism cannot be determined from the data alone, and inference under missingness not at random may be sensitive to unverifiable assumptions about the missingness mechanism. To add protection against model misspecification, we posit multiple models for the response probability and propose a weighting estimator with calibrated weights. Assuming the conditional distribution of the outcome given covariates is correctly modeled, we show that if any one of the multiple models for the response probability is correctly specified, the proposed estimator is consistent for the true value. A simulation study confirms that our estimator has multiple robustness when the outcome data is missing not at random. The method is also applied to an application. 
DA - 2020/7// PY - 2020/7// DO - 10.1016/j.jspi.2020.01.001 VL - 207 SP - 246-254 SN - 1873-1171 KW - Identification KW - Empirical likelihood KW - Missing not at random KW - Multiple robustness KW - Semiparametric maximum likelihood estimator ER - TY - JOUR TI - Robust kernel association testing (RobKAT) AU - Martinez, Kara AU - Maity, Arnab AU - Yolken, Robert H. AU - Sullivan, Patrick F. AU - Tzeng, Jung-Ying T2 - GENETIC EPIDEMIOLOGY AB - Abstract Testing the association between single‐nucleotide polymorphism (SNP) effects and a response is often carried out through kernel machine methods based on least squares, such as the sequence kernel association test (SKAT). However, these least‐squares procedures are designed for a normally distributed conditional response, which may not apply. Other robust procedures such as the quantile regression kernel machine (QRKM) restrict the choice of the loss function and only allow inference on conditional quantiles. We propose a general and robust kernel association test that allows a flexible choice of the loss function, makes no distributional assumptions, and has SKAT and QRKM as special cases. We evaluate our proposed robust association test (RobKAT) across various data distributions through a simulation study. When errors are normally distributed, RobKAT controls type I error and shows comparable power with SKAT. In all other distributional settings investigated, our robust test has similar or greater power than SKAT. Finally, we apply our robust testing method to data from the Clinical Antipsychotic Trials of Intervention Effectiveness (CATIE) clinical trial to detect associations between selected genes including the major histocompatibility complex (MHC) region on chromosome six and neurotropic herpesvirus antibody levels in schizophrenia patients. RobKAT detected significant association with four SNP sets (HST1H2BJ, MHC, POM12L2, and SLC17A1), three of which were undetected by SKAT. 
DA - 2020/4// PY - 2020/4// DO - 10.1002/gepi.22280 VL - 44 IS - 3 SP - 272-282 SN - 1098-2272 UR - https://doi.org/10.1002/gepi.22280 KW - kernel association test KW - multimarker hypothesis test KW - robust regression KW - schizophrenia KW - semiparametric ER - TY - JOUR TI - Q-Learning: Theory and Applications AU - Clifton, Jesse AU - Laber, Eric T2 - ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION, VOL 7, 2020 AB - Q-learning, originally an incremental algorithm for estimating an optimal decision strategy in an infinite-horizon decision problem, now refers to a general class of reinforcement learning methods widely used in statistics and artificial intelligence. In the context of personalized medicine, finite-horizon Q-learning is the workhorse for estimating optimal treatment strategies, known as treatment regimes. Infinite-horizon Q-learning is also increasingly relevant in the growing field of mobile health. In computer science, Q-learning methods have achieved remarkable performance in domains such as game-playing and robotics. In this article, we (a) review the history of Q-learning in computer science and statistics, (b) formalize finite-horizon Q-learning within the potential outcomes framework and discuss the inferential difficulties for which it is infamous, and (c) review variants of infinite-horizon Q-learning and the exploration-exploitation problem, which arises in decision problems with a long time horizon. We close by discussing issues arising with the use of Q-learning in practice, including arguments for combining Q-learning with direct-search methods; sample size considerations for sequential, multiple assignment randomized trials; and possibilities for combining Q-learning with model-based methods. 
DA - 2020/// PY - 2020/// DO - 10.1146/annurev-statistics-031219-041220 VL - 7 SP - 279-301 SN - 2326-831X KW - reinforcement learning KW - dynamic treatment regimes KW - model-free KW - causal inference KW - policy search ER - TY - JOUR TI - Contraction properties of shrinkage priors in logistic regression AU - Wei, Ran AU - Ghosal, Subhashis T2 - JOURNAL OF STATISTICAL PLANNING AND INFERENCE AB - Bayesian shrinkage priors have received a lot of attention recently because of their efficiency in computation and accuracy in estimation and variable selection. In this paper, we study the contraction properties of shrinkage priors in a logistic regression model where the number of covariates is high. For a shrinkage prior distribution that is heavy-tailed and concentrated around zero with high probability such as the horseshoe prior, the Dirichlet–Laplace prior, and the normal-gamma prior with appropriate choices of hyper-parameters, estimates of the logistic regression coefficient are shown to asymptotically concentrate around the true sparse vector in the L2-sense. It is shown that the proposed contraction rate is comparable with the point mass prior that is studied in Atchadé (2017). The simulation study under the logistic regression model verifies the theoretical results by showing that the horseshoe prior and the Dirichlet–Laplace prior perform like the point mass prior for the estimation, variable selection and prediction, and yield much better results than Bayesian lasso and the non-informative normal prior. 
DA - 2020/7// PY - 2020/7// DO - 10.1016/j.jspi.2019.12.004 VL - 207 SP - 215-229 SN - 1873-1171 KW - Bayesian variable selection KW - Continuous shrinkage KW - Contraction rate KW - Logistic regression KW - Point mass prior ER - TY - JOUR TI - Preface of the Special Issue in Honor of Professor Jayanta Kumar Ghosh AU - Ghosal, Subhashis T2 - SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY DA - 2020/3/3/ PY - 2020/3/3/ DO - 10.1007/s13171-020-00199-z SP - SN - 0976-8378 ER - TY - JOUR TI - Smart Textile‐Based Personal Thermal Comfort Systems: Current Status and Potential Solutions AU - Tabor, Jordan AU - Chatterjee, Kony AU - Ghosh, Tushar K. T2 - Advanced Materials Technologies AB - Abstract Thermophysiological comfort in humans is sought universally but seldom achieved due to biological and physiological variances. Most people in developed parts of the world rely on highly energy‐intensive, and inefficient central heating/cooling systems to achieve thermophysiological comfort which is rarely satisfactory. A potential solution to this issue is a wearable personal thermal comfort system (PTCS) consisting of textile‐based temperature and moisture sensors, thermal and moisture responsive actuators, and/or heating/cooling devices, that can sense the environment and physiology of the wearer, and accordingly provide an individualized thermal environment. Moving thermal regulation away from the built environment to the microclimate surrounding the human body using textiles has the potential to provide personalized thermal comfort and energy savings. Such a system may employ thermal comfort models and leverage the Internet of Things (IoT) and machine learning (ML) to understand individuals' comfort requirements. Herein, the current state of textile‐based active and passive comfort systems/technologies are summarized, including their environmental impact, major thermal comfort models, and factors influencing comfort. 
Also, active and passive textile‐based devices (sensors, actuators, and flexible heating/cooling devices) that may be incorporated into a textile‐based wearable PTCS are comprehensively discussed with an emphasis on their advantages, limitations, and prospects. DA - 2020/5// PY - 2020/5// DO - 10.1002/admt.201901155 UR - https://doi.org/10.1002/admt.201901155 KW - actuators KW - e-textiles KW - flexible sensors KW - thermal comfort KW - thermoelectric fabrics ER - TY - JOUR TI - Lethal and sublethal effects of toxicants on bumble bee populations: a modelling approach AU - Banks, J. E. AU - Banks, H. T. AU - Myers, N. AU - Laubmeier, A. N. AU - Bommarco, R. T2 - ECOTOXICOLOGY AB - Abstract Pollinator decline worldwide is well-documented; globally, chemical pesticides (especially the class of pesticides known as neonicotinoids) have been implicated in hymenopteran decline, but the mechanics and drivers of population trends and dynamics of wild bees are poorly understood. Declines and shifts in community composition of bumble bees (Bombus spp.) have been documented in North America and Europe, with a suite of lethal and sub-lethal effects of pesticides on bumble bee populations documented. We employ a mathematical model parameterized with values taken from the literature that uses differential equations to track bumble bee populations through time in order to attain a better understanding of toxicant effects on a developing colony of bumble bees. We use a delay differential equation (DDE) model, which requires fewer parameter estimations than agent-based models while affording us the ability to explicitly describe the effect of larval incubation and colony history on population outcomes. We explore how both lethal and sublethal effects such as reduced foraging ability may combine to affect population outcomes, and discuss the implications for the protection and conservation of ecosystem services. 
DA - 2020/4// PY - 2020/4// DO - 10.1007/s10646-020-02162-y VL - 29 IS - 3 SP - 237-245 SN - 1573-3017 KW - Hymenoptera KW - Neonicotinoid KW - Delay differential equation ER - TY - JOUR TI - Spatiotemporal signal detection using continuous shrinkage priors AU - Jhuang, An-Ting AU - Fuentes, Montserrat AU - Bandyopadhyay, Dipankar AU - Reich, Brian J. T2 - STATISTICS IN MEDICINE AB - Periodontal disease (PD) is a chronic inflammatory disease that affects the gum tissue and bone supporting the teeth. Although tooth‐site level PD progression is believed to be spatio‐temporally referenced, the whole‐mouth average periodontal pocket depth (PPD) has been commonly used as an indicator of the current/active status of PD. This leads to imminent loss of information, and imprecise parameter estimates. Despite availability of statistical methods that accommodate spatiotemporal information for responses collected at the tooth‐site level, the enormity of longitudinal databases derived from oral health practice‐based settings renders them unscalable for application. To mitigate this, we introduce a Bayesian spatiotemporal model to detect problematic/diseased tooth‐sites dynamically inside the mouth for any subject obtained from large databases. This is achieved via a spatial continuous sparsity‐inducing shrinkage prior on spatially varying linear‐trend regression coefficients. A low‐rank representation captures the nonstationary covariance structure of the PPD outcomes, and facilitates the relevant Markov chain Monte Carlo computing steps applicable to thousands of study subjects. Application of our method to both simulated data and to a rich database of electronic dental records from the HealthPartners Institute reveals improved prediction performance, compared with alternative models with usual Gaussian priors for regression parameters and conditionally autoregressive specification of the covariance structure. 
DA - 2020/6/15/ PY - 2020/6/15/ DO - 10.1002/sim.8514 VL - 39 IS - 13 SP - 1817-1832 SN - 1097-0258 KW - nonstationary covariance KW - periodontal disease KW - shrinkage priors KW - space-time disease surveillance ER - TY - JOUR TI - Metal contamination of river otters in North Carolina AU - Sanders, Charles W., II AU - Pacifici, Krishna AU - Hess, George R. AU - Olfenbuttel, Colleen AU - DePerno, Christopher S. T2 - ENVIRONMENTAL MONITORING AND ASSESSMENT DA - 2020/// PY - 2020/// DO - 10.1007/s10661-020-8106-8 VL - 192 IS - 2 ER - TY - JOUR TI - HyPhy 2.5-A Customizable Platform for Evolutionary Hypothesis Testing Using Phylogenies AU - Pond, Sergei L. Kosakovsky AU - Poon, Art F. Y. AU - Velazquez, Ryan AU - Weaver, Steven AU - Hepler, N. Lance AU - Murrell, Ben AU - Shank, Stephen D. AU - Magalis, Brittany Rife AU - Bouvier, Dave AU - Nekrutenko, Anton AU - Wisotsky, Sadie AU - Spielman, Stephanie J. AU - Frost, Simon D. W. AU - Muse, Spencer V T2 - MOLECULAR BIOLOGY AND EVOLUTION AB - Abstract HYpothesis testing using PHYlogenies (HyPhy) is a scriptable, open-source package for fitting a broad range of evolutionary models to multiple sequence alignments, and for conducting subsequent parameter estimation and hypothesis testing, primarily in the maximum likelihood statistical framework. It has become a popular choice for characterizing various aspects of the evolutionary process: natural selection, evolutionary rates, recombination, and coevolution. The 2.5 release (available from www.hyphy.org) includes a completely re-engineered computational core and analysis library that introduces new classes of evolutionary models and statistical tests, delivers substantial performance and stability enhancements, improves usability, streamlines end-to-end analysis workflows, makes it easier to develop custom analyses, and is mostly backward compatible with previous HyPhy releases. 
DA - 2020/1// PY - 2020/1// DO - 10.1093/molbev/msz197 VL - 37 IS - 1 SP - 295-299 SN - 1537-1719 KW - evolutionary analysis KW - natural selection KW - hypothesis testing KW - statistical inference KW - software engineering ER - TY - JOUR TI - Improving Cancer Drug Discovery by Studying Cancer across the Tree of Life AU - Somarelli, Jason A. AU - Boddy, Amy M. AU - Gardner, Heather L. AU - DeWitt, Suzanne Bartholf AU - Tuohy, Joanne AU - Megquier, Kate AU - Sheth, Maya U. AU - Hsu, Shiaowen David AU - Thorne, Jeffrey L. AU - London, Cheryl A. AU - Eward, William C. T2 - MOLECULAR BIOLOGY AND EVOLUTION AB - Abstract Despite a considerable expenditure of time and resources and significant advances in experimental models of disease, cancer research continues to suffer from extremely low success rates in translating preclinical discoveries into clinical practice. The continued failure of cancer drug development, particularly late in the course of human testing, not only impacts patient outcomes, but also drives up the cost for those therapies that do succeed. It is clear that a paradigm shift is necessary if improvements in this process are to occur. One promising direction for increasing translational success is comparative oncology—the study of cancer across species, often involving veterinary patients that develop naturally-occurring cancers. Comparative oncology leverages the power of cross-species analyses to understand the fundamental drivers of cancer protective mechanisms, as well as factors contributing to cancer initiation and progression. Clinical trials in veterinary patients with cancer provide an opportunity to evaluate novel therapeutics in a setting that recapitulates many of the key features of human cancers, including genomic aberrations that underly tumor development, response and resistance to treatment, and the presence of comorbidities that can affect outcomes. 
With a concerted effort from basic scientists, human physicians and veterinarians, comparative oncology has the potential to enhance the cost-effectiveness and efficiency of pipelines for cancer drug discovery and other cancer treatments. DA - 2020/1// PY - 2020/1// DO - 10.1093/molbev/msz254 VL - 37 IS - 1 SP - 11-17 SN - 1537-1719 KW - veterinary oncology KW - cross-species studies KW - cancer drug discovery KW - evolutionary biology ER - TY - JOUR TI - Data transforming augmentation for heteroscedastic models AU - Tak, Hyungsuk AU - You, Kisung AU - Ghosh, Sujit K. AU - Su, Bingyue AU - Kelly, Joseph T2 - JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS AB - Data augmentation (DA) turns seemingly intractable computational problems into simple ones by augmenting latent missing data. In addition to computational simplicity, it is now well-established that DA equipped with a deterministic transformation can improve the convergence speed of iterative algorithms such as an EM algorithm or Gibbs sampler. In this article, we outline a framework for the transformation-based DA, which we call data transforming augmentation (DTA), allowing augmented data to be a deterministic function of latent and observed data, and unknown parameters. Under this framework, we investigate a novel DTA scheme that turns heteroscedastic models into homoscedastic ones to take advantage of simpler computations typically available in homoscedastic cases. Applying this DTA scheme to fitting linear mixed models, we demonstrate simpler computations and faster convergence rates of resulting iterative algorithms, compared with those under a non-transformation-based DA scheme. We also fit a Beta-Binomial model using the proposed DTA scheme, which enables sampling approximate marginal posterior distributions that are available only under homoscedasticity. An R package, Rdta, is publicly available at CRAN. 
DA - 2020/7/2/ PY - 2020/7/2/ DO - 10.1080/10618600.2019.1704295 VL - 29 IS - 3 SP - 659-667 SN - 1537-2715 UR - https://doi.org/10.1080/10618600.2019.1704295 KW - Beta-Binomial KW - EM algorithm KW - Gibbs sampler KW - hierarchical model KW - linear mixed model KW - missing data ER - TY - JOUR TI - Use of unconventional mixed Acetone-Butanol-Ethanol solvents for anthocyanin extraction from Purple-Fleshed sweetpotatoes AU - Zuleta-Correa, Ana AU - Chinn, Mari Sum AU - Alfaro-Córdoba, Marcela AU - Truong, Van-Den AU - Yencho, George Craig AU - Bruno-Bárcena, José Manuel T2 - Food Chemistry AB - Anthocyanins from purple-fleshed sweetpotatoes constitute highly valued natural colorants and functional ingredients. In the past, anthocyanin extraction conditions and efficiencies using a single acidified solvent have been assessed. However, the potential of solvent mixes that can be generated by fermentation of biomass-derived sugars have not been explored. In this study, the effects of single and mixed solvent, time, temperature, sweetpotato genotype and preparation, on anthocyanin and phenolic extraction were evaluated. Results indicated that unconventional diluted solvent mixes containing acetone, butanol, and ethanol were superior or equally efficient for extracting anthocyanins when compared to commonly used concentrated extractants. In addition, analysis of anthocyanidins concentrations including cyanidin (cy), peonidin (pe), and pelargonidin (pl), indicated that different ratios of pn/cy were obtained depending on the solvent used. These results could be useful when selecting processing conditions that better suit particular end-use applications and more environmentally friendly process development for purple sweetpotatoes. 
DA - 2020/// PY - 2020/// DO - 10.1016/j.foodchem.2019.125959 VL - 314 SP - 125959 UR - http://www.scopus.com/inward/record.url?eid=2-s2.0-85078103152&partnerID=MN8TOARS KW - Ipomoea batatas KW - Anthocyanidins KW - Phenolics KW - Cyanidin KW - Peonidin KW - Temperature KW - Flour ER - TY - JOUR TI - Unraveling the Hexaploid Sweetpotato Inheritance Using Ultra-Dense Multilocus Mapping AU - Mollinari, M. AU - Olukolu, B.A. AU - Da Pereira, G.S. AU - Khan, A. AU - Gemenet, D. AU - Craig Yencho, G. AU - Zeng, Z.-B. T2 - G3: Genes|Genomes|Genetics AB - The hexaploid sweetpotato (Ipomoea batatas (L.) Lam., 2n = 6x = 90) is an important staple food crop worldwide and plays a vital role in alleviating famine in developing countries. Due to its high ploidy level, genetic studies in sweetpotato lag behind major diploid crops significantly. We built an ultra-dense multilocus integrated genetic map and characterized the inheritance system in a sweetpotato full-sib family using our newly developed software, MAPpoly. The resulting genetic map revealed 96.5% collinearity between I. batatas and its diploid relative I. trifida. We computed the genotypic probabilities across the whole genome for all individuals in the mapping population and inferred their complete hexaploid haplotypes. We provide evidence that most of the meiotic configurations (73.3%) were resolved in bivalents, although a small portion of multivalent signatures (15.7%), among other inconclusive configurations (11.0%), were also observed. Except for low levels of preferential pairing in linkage group 2, we observed a hexasomic inheritance mechanism in all linkage groups. We propose that the hexasomic-bivalent inheritance promotes stability to the allelic transmission in sweetpotato. 
DA - 2020/1// PY - 2020/1// DO - 10.1534/g3.119.400620 VL - 10 IS - 1 SP - 281-292 UR - http://dx.doi.org/10.1534/g3.119.400620 KW - Polyploidy KW - Genetic Linkage KW - Hexasomic Inheritance KW - Haplotyping KW - Preferential Pairing KW - Multivalent ER - TY - JOUR TI - Quantitative trait loci and differential gene expression analyses reveal the genetic basis for negatively associated β-carotene and starch content in hexaploid sweetpotato [Ipomoea batatas (L.) Lam.] AU - Gemenet, D.C. AU - Silva Pereira, G. AU - De Boeck, B. AU - Wood, J.C. AU - Mollinari, M. AU - Olukolu, B.A. AU - Diaz, F. AU - Mosquera, V. AU - Ssali, R.T. AU - David, M. AU - Kitavi, M.N. AU - Burgos, G. AU - Felde, T.Z. AU - Ghislain, M. AU - Carey, E. AU - Swanckaert, J. AU - Coin, L.J.M. AU - Fei, Z. AU - Hamilton, J.P. AU - Yada, B. AU - Yencho, G.C. AU - Zeng, Z.-B. AU - Mwanga, R.O.M. AU - Khan, A. AU - Gruneberg, W.J. AU - Buell, C.R. T2 - Theoretical and Applied Genetics AB - β-Carotene content in sweetpotato is associated with the Orange and phytoene synthase genes; due to physical linkage of phytoene synthase with sucrose synthase, β-carotene and starch content are negatively correlated. In populations depending on sweetpotato for food security, starch is an important source of calories, while β-carotene is an important source of provitamin A. The negative association between the two traits contributes to the low nutritional quality of sweetpotato consumed, especially in sub-Saharan Africa. Using a biparental mapping population of 315 F1 progeny generated from a cross between an orange-fleshed and a non-orange-fleshed sweetpotato variety, we identified two major quantitative trait loci (QTL) on linkage group (LG) three (LG3) and twelve (LG12) affecting starch, β-carotene, and their correlated traits, dry matter and flesh color. 
Analysis of parental haplotypes indicated that these two regions acted pleiotropically to reduce starch content and increase β-carotene in genotypes carrying the orange-fleshed parental haplotype at the LG3 locus. Phytoene synthase and sucrose synthase, the rate-limiting and linked genes located within the QTL on LG3 involved in the carotenoid and starch biosynthesis, respectively, were differentially expressed in Beauregard versus Tanzania storage roots. The Orange gene, the molecular switch for chromoplast biogenesis, located within the QTL on LG12 while not differentially expressed was expressed in developing roots of the parental genotypes. We conclude that these two QTL regions act together in a cis and trans manner to inhibit starch biosynthesis in amyloplasts and enhance chromoplast biogenesis, carotenoid biosynthesis, and accumulation in orange-fleshed sweetpotato. Understanding the genetic basis of this negative association between starch and β-carotene will inform future sweetpotato breeding strategies targeting sweetpotato for food and nutritional security. DA - 2020/1// PY - 2020/1// DO - 10.1007/s00122-019-03437-7 VL - 133 IS - 1 SP - 23-36 UR - http://dx.doi.org/10.1007/s00122-019-03437-7 ER - TY - JOUR TI - Distribution of fiber intersections in two-dimensional random fiber web cases with a mixture of two fiber lengths AU - Chun, Heuiju AU - Suh, Moon W. T2 - TEXTILE RESEARCH JOURNAL AB - The statistical distribution of the number of fiber intersections in a unit area is of great importance in determining the physical and mechanical properties of random fiber webs and the products produced. The distribution of the number of fiber intersections determines the non-uniformity of the basis weight and can be used in designing optimal control strategies relating to such physical properties as strength, elongation, air/water permeability, acoustics and filtering efficiencies of fiber webs and nonwoven fabrics. 
This paper developed a geometrical and probabilistic model for the number of fiber intersections in two-dimensional random fiber webs, where two distinct fiber lengths are mixed at varying ratios. This work is an extension of a previously derived paper where the model assumed that all fiber lengths are equal. Here, we present a geometrical probabilistic model, theories for deriving expectations and variances of the number of intersections in random fiber webs. The model and statistical parameters are validated through an extensive computer simulation study. DA - 2020/8// PY - 2020/8// DO - 10.1177/0040517519898158 VL - 90 IS - 15-16 SP - 1851-1859 SN - 1746-7748 KW - fiber intersections KW - mixed fiber length webs KW - mean KW - variance KW - nonwovens ER - TY - JOUR TI - Statistical Inference for High-Dimensional Models via Recursive Online-Score Estimation AU - Shi, Chengchun AU - Song, Rui AU - Lu, Wenbin AU - Li, Runze T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - In this paper, we develop a new estimation and valid inference method for single or low-dimensional regression coefficients in high-dimensional generalized linear models. The number of the predictors is allowed to grow exponentially fast with respect to the sample size. The proposed estimator is computed by solving a score function. We recursively conduct model selection to reduce the dimensionality from high to a moderate scale and construct the score equation based on the selected variables. The proposed confidence interval (CI) achieves valid coverage without assuming consistency of the model selection procedure. When the selection consistency is achieved, we show the length of the proposed CI is asymptotically the same as the CI of the "oracle" method which works as well as if the support of the control variables were known. 
In addition, we prove the proposed CI is asymptotically narrower than the CIs constructed based on the de-sparsified Lasso estimator (van de Geer et al., 2014) and the decorrelated score statistic (Ning and Liu, 2017). Simulation studies and real data applications are presented to back up our theoretical findings. DA - 2020/// PY - 2020/// DO - 10.1080/01621459.2019.1710154 KW - Confidence interval KW - Generalized linear models KW - Online estimation KW - Ultrahigh dimensions ER - TY - JOUR TI - A New Liver Expression Quantitative Trait Locus Map From 1,183 Individuals Provides Evidence for Novel Expression Quantitative Trait Loci of Drug Response, Metabolic, and Sex-Biased Phenotypes AU - Etheridge, Amy S. AU - Gallins, Paul J. AU - Jima, Dereje AU - Broadaway, K. Alaine AU - Ratain, Mark J. AU - Schuetz, Erin AU - Schadt, Eric AU - Schroder, Adrian AU - Molony, Cliona AU - Zhou, Yihui AU - Mohlke, Karen L. AU - Wright, Fred A. AU - Innocenti, Federico T2 - CLINICAL PHARMACOLOGY & THERAPEUTICS AB - Expression quantitative trait locus (eQTL) studies in human liver are crucial for elucidating how genetic variation influences variability in disease risk and therapeutic outcomes and may help guide strategies to obtain maximal efficacy and safety of clinical interventions. Associations between expression microarray and genome-wide genotype data from four human liver eQTL studies (n = 1,183) were analyzed. More than 2.3 million cis-eQTLs for 15,668 genes were identified. When eQTLs were filtered against a list of 1,496 drug response genes, 187,829 cis-eQTLs for 1,191 genes were identified. Additionally, 1,683 sex-biased cis-eQTLs were identified, as well as 49 and 73 cis-eQTLs that colocalized with genome-wide association study signals for blood metabolite or lipid levels, respectively. 
Translational relevance of these results is evidenced by linking DPYD eQTLs to differences in safety of chemotherapy, linking the sex-biased regulation of PCSK9 expression to anti-lipid therapy, and identifying the G-protein coupled receptor GPR180 as a novel drug target for hypertriglyceridemia. DA - 2020/6// PY - 2020/6// DO - 10.1002/cpt.1751 VL - 107 IS - 6 SP - 1383-1393 SN - 1532-6535 ER - TY - JOUR TI - Designing Dry Swales for Stormwater Quality Improvement Using the Aberdeen Equation AU - Hunt, W. F. AU - Fassman-Beck, E. A. AU - Ekka, S. A. AU - Shaneyfelt, K. C. AU - Deletic, A. T2 - JOURNAL OF SUSTAINABLE WATER IN THE BUILT ENVIRONMENT AB - This case study presents a semiempirical method for designing water quality swales to treat stormwater runoff that is an alternative to current mostly anecdotal design approaches. Water quality swales are intended to reduce pollutant concentrations; they are not just flow conveyance systems. The design presented herein is a two-part process: (1) hydraulic design, and (2) treatment design. A hydraulic design feature unique to water quality swales includes maximum flow depths typically lower than grass height. Frequency analysis is used to estimate the water quality design storm intensity, and the design peak flow rate is estimated using the Rational method. Subsequently, Manning’s equation is used to determine the swale cross-section and slope. A relatively high roughness coefficient (n=∼0.35) is applied because the water is not intended to overtop the vegetation. This case study used the Aberdeen equation to calculate pollutant removal efficiencies if particle-size information was available. The method was applied to field-monitored swales in Auckland, New Zealand and Knightdale, North Carolina, US, and was found to accurately predict sediment capture. The conceptual approach presented here can be used to estimate reductions in total suspended solids by swales. 
However, the method needs to be validated with appropriate monitoring data in estimating removal of metals and other particulate-bound pollutants, but it is not applicable to the dissolved fraction of pollutants. DA - 2020/2// PY - 2020/2// DO - 10.1061/JSWBAY.0000886 VL - 6 IS - 1 SP - SN - 2379-6111 ER - TY - JOUR TI - Platelet aggregometry testing during aspirin or clopidogrel treatment and measurement of clopidogrel metabolite concentrations in dogs with protein-losing nephropathy AU - Shropshire, Sarah AU - Johnson, Tyler AU - Olver, Christine T2 - JOURNAL OF VETERINARY INTERNAL MEDICINE AB - Abstract Background Dogs with protein‐losing nephropathy (PLN) are treated with antiplatelet drugs for thromboprophylaxis but no standardized method exists to measure drug response. It is also unknown if clopidogrel metabolite concentrations [CM] differ between healthy and PLN dogs. Objectives Assess response to aspirin or clopidogrel in PLN dogs using platelet aggregometry (PA) and compare [CM] between healthy and PLN dogs. Animals Six healthy and 14 PLN dogs. Methods Platelet aggregometry using adenosine diphosphate (ADP), arachidonic acid (AA), and saline was performed in healthy dogs at baseline and 1‐week postclopidogrel administration to identify responders or nonresponders. A decrease of ≥60% for ADP or ≥30% for AA at 1 or 3 hours postpill was used to define a responder. At 1 and 3 hours postclopidogrel, [CM] and PA were measured in healthy and PLN dogs. Platelet aggregometry was performed in PLN dogs at baseline, 1, 6, and 12 weeks after clopidogrel or aspirin administration. Results In PLN dogs receiving clopidogrel, PA differed from baseline at all time points for ADP but not for AA at any time point. Most dogs responded at 1 or both time points except for 1 dog that showed no response. For PLN dogs receiving aspirin, no differences from baseline were observed at any time point for either ADP or AA. 
No differences in [CM] were found at either time point between healthy and PLN dogs. Conclusions and Clinical Importance Platelet aggregometry may represent an objective method to evaluate response to clopidogrel or aspirin treatment and PLN dogs appear to metabolize clopidogrel similarly to healthy dogs. DA - 2020/// PY - 2020/// DO - 10.1111/jvim.15694 ER - TY - JOUR TI - How Urban Identity, Affect, and Knowledge Predict Perceptions About Coyotes and Their Management AU - Drake, Michael D. AU - Peterson, M. Nils AU - Griffith, Emily H. AU - Olfenbuttel, Colleen AU - DePerno, Cristopher S. AU - Moorman, Christopher E. T2 - ANTHROZOOS AB - Globally, the number of humans and wildlife species sharing urban spaces continues to grow. As these populations grow, so too does the frequency of human–wildlife interactions in urban areas. Carnivores in particular pose urban wildlife conservation challenges owing to the strong emotions they elicit and the potential threats they can present to humans. These challenges can be better addressed with an understanding of the different factors that influence public perceptions of carnivores and their management. We conducted mail surveys in four cities in North Carolina (n = 721) to explore how (a) city of residence, (b) affectual connections to coyotes (Canis latrans), and (c) biological knowledge predicted perceptions of the danger posed by coyotes, the support for wild coyotes living nearby, and the support for lethal coyote removal methods. Our results provide the first assessment of how public perceptions of carnivores and their management vary between cities of different types. Residents from a tourism-driven city were more supportive of coyotes than residents from an industrial city and less concerned about risk than residents from a commercial city. We found affectual connection to coyotes and city of residence were consistent predictors of coyote perceptions. 
Respondents’ knowledge of coyote biology was not a significant predictor of any perceptions of coyotes despite the relatively high statistical power of the tests. Affectual connection to coyotes had the greatest effect on predicting coyote perceptions, suggesting efforts to promote positive emotional connections to wildlife may be a better way to increase acceptance of carnivores in urban areas than focusing on biological knowledge. DA - 2020/1/2/ PY - 2020/1/2/ DO - 10.1080/08927936.2020.1694302 VL - 33 IS - 1 SP - 5-19 SN - 1753-0377 KW - affect KW - Canis latrans KW - coyotes KW - urban identity KW - wildlife knowledge ER - TY - JOUR TI - Improving Safety, Efficiency, and Productivity: Evaluation of Fall Protection Systems for Bridge Work Using Wearable Technology and Utility Analysis AU - Zuluaga, Carlos M. AU - Albert, Alex AU - Winkel, Munir A. T2 - JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT AB - The construction industry is experiencing a number of challenges. For example, construction workplaces report poor safety performance, widespread inefficiencies, and stagnant productivity rates. These challenges often translate into higher-order issues including cost overruns, schedule growths, and project failure. Accordingly, much of construction research has focused on identifying best practices to improve safety, efficiency, and productivity. However, the majority of these efforts focus on resolving one of these challenges (e.g., safety) rather than holistically addressing safety, efficiency, and productivity in unison. Unfortunately, such an approach can yield unintended consequences in certain circumstances. For example, a narrow focus on productivity may adversely affect safety performance, and vice versa. One nationwide safety issue that has received much recent attention is the protection of highway and bridge workers from falls to lower levels when working on bridge decks. 
In these circumstances, highway and bridge workers largely rely on existing bridge guardrails for their protection against falls. However, most bridge guardrails do not offer a barrier height of 107±8 cm (42±3 in.) for sufficient protection as per regulatory requirements. To protect these workers, a few transportation agencies are beginning to adopt passive fall protection systems that can be attached to the guardrails to temporarily increase the barrier height. The purpose of the current research was to support these efforts by evaluating four fall protection systems that are actively being considered for adoption based on the expected safety, efficiency, and productivity benefits they offer. The study objectives were accomplished through 96 field trials where physiological responses, postural demands, activity rates, and the associated utility were gathered from participating workers using wearable technology and a questionnaire survey. The research effort identified fall protection systems that offer the most advantages in terms of safety, efficiency, and productivity. The adoption of the recommended systems can yield substantial benefits in terms of safety, efficiency, and productivity, apart from reducing the risk of falls. DA - 2020/2/1/ PY - 2020/2/1/ DO - 10.1061/(ASCE)CO.1943-7862.0001764 VL - 146 IS - 2 SP - SN - 1943-7862 KW - Fall protection KW - Construction safety KW - Wearable technology KW - Productivity KW - Efficiency ER - TY - JOUR TI - Model misspecification, Bayesian versus credibility estimation, and Gibbs posteriors AU - Hong, Liang AU - Martin, Ryan T2 - SCANDINAVIAN ACTUARIAL JOURNAL AB - In the context of predicting future claims, a fully Bayesian analysis – one that specifies a statistical model, prior distribution, and updates using Bayes's formula – is often viewed as the gold-standard, while Bühlmann's credibility estimator serves as a simple approximation. 
But those desirable properties that give the Bayesian solution its elevated status depend critically on the posited model being correctly specified. Here we investigate the asymptotic behavior of Bayesian posterior distributions under a misspecified model, and our conclusion is that misspecification bias generally has damaging effects that can lead to inaccurate inference and prediction. The credibility estimator, on the other hand, is not sensitive at all to model misspecification, giving it an advantage over the Bayesian solution in those practically relevant cases where the model is uncertain. This begs the question: does robustness to model misspecification require that we abandon uncertainty quantification based on a posterior distribution? Our answer to this question is No, and we offer an alternative Gibbs posterior construction. Furthermore, we argue that this Gibbs perspective provides a new characterization of Bühlmann's credibility estimator. DA - 2020/8/8/ PY - 2020/8/8/ DO - 10.1080/03461238.2019.1711154 VL - 2020 IS - 7 SP - 634-649 SN - 1651-2030 KW - Asymptotics KW - Bernstein-von Mises phenomenon KW - exponential family KW - robustness KW - uncertainty quantification ER - TY - BOOK TI - Dynamic Treatment Regimes: Statistical Methods for Precision Medicine AU - Tsiatis, A.A. AU - Davidian, M. AU - Laber, E.B. AU - Holloway, S.T. DA - 2020/// PY - 2020/// DO - 10.1201/9780429192692/dynamic-treatment-regimes-anastasios-tsiatis-marie-davidian-shannon-holloway-eric-labe PB - Chapman & Hall/CRC Press SN - 9781498769778 UR - https://www.taylorfrancis.com/books/mono/10.1201/9780429192692/dynamic-treatment-regimes-anastasios-tsiatis-marie-davidian-shannon-holloway-eric-labe ER - TY - JOUR TI - A test of homogeneity of distributions when observations are subject to measurement errors AU - Lee, DongHyuk AU - Lahiri, Soumendra N. 
AU - Sinha, Samiran T2 - BIOMETRICS AB - When the observed data are contaminated with errors, the standard two-sample testing approaches that ignore measurement errors may produce misleading results, including a higher type-I error rate than the nominal level. To tackle this inconsistency, a nonparametric test is proposed for testing equality of two distributions when the observed contaminated data follow the classical additive measurement error model. The proposed test takes into account the presence of errors in the observed data, and the test statistic is defined in terms of the (deconvoluted) characteristic functions of the latent variables. Proposed method is applicable to a wide range of scenarios as no parametric restrictions are imposed either on the distribution of the underlying latent variables or on the distribution of the measurement errors. Asymptotic null distribution of the test statistic is derived, which is given by an integral of a squared Gaussian process with a complicated covariance structure. For data-based calibration of the test, a new nonparametric Bootstrap method is developed under the two-sample measurement error framework and its validity is established. Finite sample performance of the proposed test is investigated through simulation studies, and the results show superior performance of the proposed method than the standard tests that exhibit inconsistent behavior. Finally, the proposed method was applied to real data sets from the National Health and Nutrition Examination Survey. An R package MEtest is available through CRAN. 
DA - 2020/9// PY - 2020/9// DO - 10.1111/biom.13207 VL - 76 IS - 3 SP - 821-833 SN - 1541-0420 KW - Bootstrap KW - characteristic function KW - chi-square KW - Gaussian process KW - power KW - two-sample test ER - TY - JOUR TI - Writing Assignments to Assess Statistical Thinking AU - Woodard, Victoria AU - Lee, Hollylynne AU - Woodard, Roger T2 - JOURNAL OF STATISTICS EDUCATION AB - One of the main goals of statistics is to use data to provide evidence in support of an argument. This article will discuss some popular forms of writing assessments currently in use, to demonstrate the differences between the methods for structuring the students’ learning to support their arguments with evidence. We share a model, which was originally created to assess students in introductory statistics and has been adapted for the second course in statistics, which takes a unique approach toward assessing the students’ understanding of statistical concepts through writing. In this model, students are expected to answer prompts that required them to (1) take a stance on an argument, (2) defend their position with facts given in the prompt, (3) discern the implications that those facts implied, and (4) give a proper conclusion to their argument. We provide examples of a few of the writing assignment prompts used in the course, their intended assessment purpose, and common answers that students gave to these assignments. Supplementary materials for this article are available online. DA - 2020/1/2/ PY - 2020/1/2/ DO - 10.1080/10691898.2019.1696257 VL - 28 IS - 1 SP - 32-44 SN - 1069-1898 KW - Argumentation KW - Statistical thinking KW - Second statistics course KW - Written assessment ER - TY - JOUR TI - Correctly modeling plant-insect-herbivore-pesticide interactions as aggregate data AU - Banks, H. T. AU - Banks, John E. 
AU - Catenacci, Jared AU - Joyner, Michele AU - Stark, John T2 - MATHEMATICAL BIOSCIENCES AND ENGINEERING AB - We consider a population dynamics model in investigating data from controlled experiments with aphids in broccoli patches surrounded by different margin types (bare or weedy ground) and three levels of insecticide spray (no, light, or heavy spray). The experimental data is clearly aggregate in nature. In previous efforts [1], the aggregate nature of the data was ignored. In this paper, we embrace this aspect of the experiment and correctly model the data as aggregate data, comparing the results to the previous approach. We discuss cases in which the approach may provide similar results as well as cases in which there is a clear difference in the resulting fit to the data. DA - 2020/// PY - 2020/// DO - 10.3934/mbe.2020091 VL - 17 IS - 2 SP - 1743-1756 SN - 1551-0018 KW - plant-insect interactions KW - inverse problems KW - hypothesis testing and standard errors in dynamical models KW - aggregate data KW - Prohorov metric ER - TY - JOUR TI - Effects of Proportional Hazard Assumption on Variable Selection Methods for Censored Data AU - Sheng, Alvin AU - Ghosh, Sujit K. T2 - STATISTICS IN BIOPHARMACEUTICAL RESEARCH AB - The Cox proportional hazard (PH) model is widely used to determine the effects of risk factors and treatments (covariates) on survival time of subjects that might be right censored. The selection of covariates depends crucially on the specific form of the conditional hazard model, which is often assumed to be PH, accelerated failure time (AFT), or proportional odds (PO). However, we show that none of these semiparametric models allow for the crossing of the survival functions and hence such strong assumptions may adversely affect the selection of variables. Moreover, the most commonly used PH assumption may also be violated when there is a delayed effect of the risk factors. 
Taking into account all of these modeling assumptions, this study examines the effect of the PH assumption on covariate selection when the data generating model may have non-PH. In particular, variable selection under two alternative models are explored: (i) the penalized PH model (using the elastic-net penalty) and (ii) the linear spline based hazard regression model. We apply the aforementioned models to the ACTG-175 dataset and simulated datasets with survival times generated from the Weibull and log-normal distributions. We also examine the effect on covariate selection of stratifying the analysis on the off-treatment indicator. DA - 2020/4/2/ PY - 2020/4/2/ DO - 10.1080/19466315.2019.1694578 VL - 12 IS - 2 SP - 199-209 SN - 1946-6315 UR - https://doi.org/10.1080/19466315.2019.1694578 KW - AIDS trials KW - Crossing survival curves KW - Hazard regression KW - Penalized regression ER - TY - JOUR TI - Doubly robust inference when combining probability and non-probability samples with high dimensional data AU - Yang, Shu AU - Kim, Jae Kwang AU - Song, Rui T2 - Journal of the Royal Statistical Society: Series B (Statistical Methodology) AB - Summary We consider integrating a non-probability sample with a probability sample which provides high dimensional representative covariate information of the target population. We propose a two-step approach for variable selection and finite population inference. In the first step, we use penalized estimating equations with folded concave penalties to select important variables and show selection consistency for general samples. In the second step, we focus on a doubly robust estimator of the finite population mean and re-estimate the nuisance model parameters by minimizing the asymptotic squared bias of the doubly robust estimator. 
This estimating strategy mitigates the possible first-step selection error and renders the doubly robust estimator root n consistent if either the sampling probability or the outcome model is correctly specified. DA - 2020/1/7/ PY - 2020/1/7/ DO - 10.1111/rssb.12354 VL - 1 J2 - J. R. Stat. Soc. B LA - en OP - SN - 1369-7412 UR - http://dx.doi.org/10.1111/rssb.12354 DB - Crossref KW - Data integration KW - Double robustness KW - Generalizability KW - Penalized estimating equation KW - Variable selection ER - TY - JOUR TI - Inference in partially identified models with many moment inequalities using Lasso AU - Bugni, Federico A. AU - Caner, Mehmet AU - Kock, Anders Bredahl AU - Lahiri, Soumendra T2 - JOURNAL OF STATISTICAL PLANNING AND INFERENCE AB - This paper considers inference in a partially identified moment (in)equality model with many moment inequalities. We propose a novel two-step inference procedure that combines the methods proposed by Chernozhukov et al. (2018a) (Chernozhukov et al., 2018a, hereafter) with a first step moment inequality selection based on the Lasso. Our method controls asymptotic size uniformly, both in the underlying parameter and the data distribution. Also, the power of our method compares favorably with that of the corresponding two-step method in Chernozhukov et al. (2018a) for large parts of the parameter space, both in theory and in simulations. Finally, we show that our Lasso-based first step can be implemented by thresholding standardized sample averages, and so it is straightforward to implement. 
DA - 2020/5// PY - 2020/5// DO - 10.1016/j.jspi.2019.09.013 VL - 206 SP - 211-248 SN - 1873-1171 KW - Many moment inequalities KW - Self-normalizing sum KW - Multiplier bootstrap KW - Empirical bootstrap KW - Lasso KW - Inequality selection ER - TY - JOUR TI - Asymptotic theory and inference of predictive mean matching imputation using a superpopulation model framework AU - Yang, Shu AU - Kim, Jae Kwang T2 - SCANDINAVIAN JOURNAL OF STATISTICS AB - Abstract Predictive mean matching imputation is popular for handling item nonresponse in survey sampling. In this article, we study the asymptotic properties of the predictive mean matching estimator for finite‐population inference using a superpopulation model framework. We also clarify conditions for its robustness. For variance estimation, the conventional bootstrap inference is invalid for matching estimators with a fixed number of matches due to the nonsmoothness nature of the matching estimator. We propose a new replication variance estimator, which is asymptotically valid. The key strategy is to construct replicates directly based on the linear terms of the martingale representation for the matching estimator, instead of individual records of variables. Simulation studies confirm that the proposed method provides valid inference. DA - 2020/9// PY - 2020/9// DO - 10.1111/sjos.12429 VL - 47 IS - 3 SP - 839-861 SN - 1467-9469 KW - hot deck imputation KW - Jackknife variance estimation KW - martingale central limit theorem KW - missing at random ER - TY - JOUR TI - Empirical Priors and Coverage of Posterior Credible Sets in a Sparse Normal Mean Model AU - Martin, Ryan AU - Ning, Bo T2 - SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY AB - Bayesian methods provide a natural means for uncertainty quantification, that is, credible sets can be easily obtained from the posterior distribution. 
But is this uncertainty quantification valid in the sense that the posterior credible sets attain the nominal frequentist coverage probability? This paper investigates the frequentist validity of posterior uncertainty quantification based on a class of empirical priors in the sparse normal mean model. In particular, we show that our marginal posterior credible intervals achieve the nominal frequentist coverage probability under conditions slightly weaker than needed for selection consistency and a Bernstein–von Mises theorem for the full posterior, and numerical investigations suggest that our empirical Bayes method has superior frequentist coverage probability properties compared to other fully Bayes methods. DA - 2020/8// PY - 2020/8// DO - 10.1007/s13171-019-00189-w VL - 82 IS - 2 SP - 477-498 SN - 0976-8378 KW - Bayesian inference KW - Bernstein-von Mises theorem KW - Concentration rate KW - High-dimensional model KW - Uncertainty quantification ER - TY - JOUR TI - tuxnet: a simple interface to process RNA sequencing data and infer gene regulatory networks AU - Spurney, Ryan J. AU - Broeck, Lisa AU - Clark, Natalie M. AU - Fisher, Adam P. AU - Balaguer, Maria A. de Luis AU - Sozzani, Rosangela T2 - PLANT JOURNAL AB - Summary Predicting gene regulatory networks (GRNs) from expression profiles is a common approach for identifying important biological regulators. Despite the increased use of inference methods, existing computational approaches often do not integrate RNA‐sequencing data analysis, are not automated or are restricted to users with bioinformatics backgrounds. To address these limitations, we developed tuxnet, a user‐friendly platform that can process raw RNA‐sequencing data from any organism with an existing reference genome using a modified tuxedo pipeline (hisat2 + cufflinks package) and infer GRNs from these processed data. 
tuxnet is implemented as a graphical user interface and can mine gene regulations, either by applying a dynamic Bayesian network (DBN) inference algorithm, genist, or a regression tree‐based pipeline, rtp‐star. We obtained time‐course expression data of a PERIANTHIA (PAN) inducible line and inferred a GRN using genist to illustrate the use of tuxnet while gaining insight into the regulations downstream of the Arabidopsis root stem cell regulator PAN. Using rtp‐star, we inferred the network of ATHB13, a downstream gene of PAN, for which we obtained wild‐type and mutant expression profiles. Additionally, we generated two networks using temporal data from developmental leaf data and spatial data from root cell‐type data to highlight the use of tuxnet to form new testable hypotheses from previously explored data. Our case studies feature the versatility of tuxnet when using different types of gene expression data to infer networks and its accessibility as a pipeline for non‐bioinformaticians to analyze transcriptome data, predict causal regulations, assess network topology and identify key regulators. DA - 2020/2// PY - 2020/2// DO - 10.1111/tpj.14558 VL - 101 IS - 3 SP - 716-730 SN - 1365-313X KW - Arabidopsis thaliana KW - gene regulatory network inference KW - graphical user interface KW - RNA sequencing processing KW - stem cell maintenance KW - technical advance ER - TY - JOUR TI - 3D Printing of Textiles: Potential Roadmap to Printing with Fibers AU - Chatterjee, Kony AU - Ghosh, Tushar K. T2 - Advanced Materials AB - 3D printing (3DP) has transformed engineering, manufacturing, and the use of advanced materials due to its ability to produce objects from a variety of materials, ranging from soft polymers to rigid ceramics. 3DP offers the advantage of being able to print at a variety of lengths scales; from a few micrometers to many meters. 3DP has the unique ability to produce customized small lots, efficiently. 
Yet, one crucial industry that has not been able to adequately explore its potential is textile manufacturing. The research in 3DP of textiles has lagged behind other areas primarily due to the difficulty in obtaining some of the unique characteristics of strength, flexibility, etc., of textiles, utilizing a fundamentally different manufacturing technology. Textiles are their own class of materials due to the specific structural developments that occur during the various stages of textile manufacturing: from fiber extrusion to assembly of the fibers to fabrics. Here, the current 3DP technologies are reviewed with emphasis on soft and anisotropic structures, as well as the efforts toward 3DP of textiles. Finally, a potential pathway to 3DP of textiles, dubbed as printing with fibers to create textile structures is proposed for further exploration. DA - 2020/1// PY - 2020/1// DO - 10.1002/adma.201902086 VL - 12 SP - 1902086 UR - https://doi.org/10.1002/adma.201902086 KW - 3D printing textiles KW - additive manufacturing KW - printing with fibers KW - soft materials ER - TY - JOUR TI - Solution paths for the generalized lasso with applications to spatially varying coefficients regression AU - Zhao, Yaqing AU - Bondell, Howard T2 - COMPUTATIONAL STATISTICS & DATA ANALYSIS AB - Penalized regression can improve prediction accuracy and reduce dimension. The generalized lasso problem is used in many applications in various fields. The generalized lasso penalizes a linear transformation of the coefficients rather than the coefficients themselves. The proposed algorithm solves the generalized lasso problem and provides the full solution path. A confidence set can then be constructed on the generalized lasso parameters based on the modified residual bootstrap lasso. The approach is demonstrated using spatially varying coefficients regression, and it is shown to be both accurate and efficient compared to previous work. 
DA - 2020/2// PY - 2020/2// DO - 10.1016/j.csda.2019.106821 VL - 142 SP - SN - 1872-7352 KW - Generalized lasso KW - Penalized regression KW - Regularization KW - Solution path algorithm ER - TY - JOUR TI - A multivariate spatial skew-t process for joint modeling of extreme precipitation indexes AU - Hazra, Arnab AU - Reich, Brian J. AU - Staicu, Ana-Maria T2 - ENVIRONMETRICS AB - Abstract To study trends in extreme precipitation across the United States over the years 1951–2017, we analyze 10 climate indexes that represent extreme precipitation, such as annual maximum of daily precipitation and annual maximum of consecutive five‐day average precipitation. We consider the gridded data produced by the CLIMDEX project (http://www.climdex.org/gewocs.html), constructed using daily precipitation data. These indexes exhibit spatial and mutual dependence. In this paper, we propose a multivariate spatial skew‐t process for joint modeling of extreme precipitation indexes and discuss its theoretical properties. The model framework allows Bayesian inference while maintaining a computational time that is competitive with common multivariate geostatistical approaches. In a numerical study, we find that the proposed model outperforms several simpler alternatives in terms of various model selection criteria. We apply the proposed model to estimate the average decadal change in the extreme precipitation indexes throughout the United States and find several significant local changes. DA - 2020/5// PY - 2020/5// DO - 10.1002/env.2602 VL - 31 IS - 3 SP - SN - 1099-095X KW - climate change KW - extremal dependence KW - extremal trend analysis KW - extreme precipitation indexes KW - multivariate spatial skew-t process KW - separable covariance ER - TY - JOUR TI - Fine-Scale Spatiotemporal Air Pollution Analysis Using Mobile Monitors on Google Street View Vehicles AU - Guan, Yawen AU - Johnson, Margaret C. AU - Katzfuss, Matthias AU - Mannshardt, Elizabeth AU - Messier, Kyle P. 
AU - Reich, Brian J. AU - Song, Joon J. T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - People are increasingly concerned with understanding their personal environment, including possible exposure to harmful air pollutants. To make informed decisions on their day-to-day activities, they are interested in real-time information on a localized scale. Publicly available, fine-scale, high-quality air pollution measurements acquired using mobile monitors represent a paradigm shift in measurement technologies. A methodological framework utilizing these increasingly fine-scale measurements to provide real-time air pollution maps and short-term air quality forecasts on a fine-resolution spatial scale could prove to be instrumental in increasing public awareness and understanding. The Google Street View study provides a unique source of data with spatial and temporal complexities, with the potential to provide information about commuter exposure and hot spots within city streets with high traffic. We develop a computationally efficient spatiotemporal model for these data and use the model to make short-term forecasts and high-resolution maps of current air pollution levels. We also show via an experiment that mobile networks can provide more nuanced information than an equally sized fixed-location network. This modeling framework has important real-world implications in understanding citizens’ personal environments, as data production and real-time availability continue to be driven by the ongoing development and improvement of mobile measurement technologies. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement. 
DA - 2020/7/2/ PY - 2020/7/2/ DO - 10.1080/01621459.2019.1665526 VL - 115 IS - 531 SP - 1111-1124 SN - 1537-274X KW - Google Street View Air Quality Data KW - Kriging KW - Mobile sensors KW - Spatiotemporal models KW - Vecchia approximation ER - TY - JOUR TI - Bayesian Nonparametric Policy Search With Application to Periodontal Recall Intervals AU - Guan, Qian AU - Reich, Brian J. AU - Laber, Eric B. AU - Bandyopadhyay, Dipankar T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - Tooth loss from periodontal disease is a major public health burden in the United States. Standard clinical practice is to recommend a dental visit every six months; however, this practice is not evidence-based, and poor dental outcomes and increasing dental insurance premiums indicate room for improvement. We consider a tailored approach that recommends recall time based on patient characteristics and medical history to minimize disease progression without increasing resource expenditures. We formalize this method as a dynamic treatment regime which comprises a sequence of decisions, one per stage of intervention, that follow a decision rule which maps current patient information to a recommendation for their next visit time. The dynamics of periodontal health, visit frequency, and patient compliance are complex, yet the estimated optimal regime must be interpretable to domain experts if it is to be integrated into clinical practice. We combine nonparametric Bayesian dynamics modeling with policy-search algorithms to estimate the optimal dynamic treatment regime within an interpretable class of regimes. Both simulation experiments and application to a rich database of electronic dental records from the HealthPartners HMO show that our proposed method leads to better dental health without increasing the average recommended recall time relative to competing methods. 
Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement. DA - 2020/7/2/ PY - 2020/7/2/ DO - 10.1080/01621459.2019.1660169 VL - 115 IS - 531 SP - 1066-1078 SN - 1537-274X KW - Dirichlet process prior KW - Dynamic treatment regimes KW - Observational data KW - Periodontal disease KW - Practice-based setting KW - Precision medicine KW - Sequential optimization ER - TY - JOUR TI - Construction, Properties, and Analysis of Group-Orthogonal Supersaturated Designs AU - Jones, Bradley AU - Lekivetz, Ryan AU - Majumdar, Dibyen AU - Nachtsheim, Christopher J. AU - Stallrich, Jonathan W. T2 - TECHNOMETRICS AB - In this article, we propose a new method for constructing supersaturated designs that is based on the Kronecker product of two carefully chosen matrices. The construction method leads to a partitioning of the factors of the design such that the factors within a group are correlated to the others within the same group, but are orthogonal to any factor in any other group. We refer to the resulting designs as group-orthogonal supersaturated designs. We leverage this group structure to obtain an unbiased estimate of the error variance, and to develop an effective, design-based model selection procedure. Simulation results show that the use of these designs, in conjunction with our model selection procedure enables the identification of larger numbers of active main effects than have previously been reported for supersaturated designs. The designs can also be used in group screening; however, unlike previous group-screening procedures, with our designs, main effects in a group are not confounded. Supplementary materials for this article are available online. 
DA - 2020/7/2/ PY - 2020/7/2/ DO - 10.1080/00401706.2019.1654926 VL - 62 IS - 3 SP - 403-414 SN - 1537-2723 KW - E(s2)-optimality KW - UE(s(2))-optimality KW - Group screening designs KW - Hadamard matrices KW - Model selection ER - TY - JOUR TI - Detection, variability, and predictability of monsoon onset and withdrawal dates: A review AU - Bombardi, Rodrigo J. AU - Moron, Vincent AU - Goodnight, James S. T2 - INTERNATIONAL JOURNAL OF CLIMATOLOGY AB - Abstract This article presents a review of the scientific literature on detection, sources of variability, and predictability of the timing of monsoons. The timing of monsoons is characterized by the beginning (commonly referred to as onset) and end (commonly referred to as demise, cessation, retreat, or withdrawal) dates of the summer monsoons. The main methods used to detect the timing of monsoons are divided into two categories: local‐scale methods and regional to large‐scale methods. The sources of variability of the timing of monsoons are also separated into two categories: local‐scale and large‐scale sources. Finally, the article presents a summary of the literature on the predictability of the timing of monsoons using both dynamical and statistical approaches. We show that all methods are parameterized in some way. A comparison between two different methods shows that while there might be large differences in the definition of onset and demise dates at the local level, spatial aggregation usually reduces the noise and enhances the regional monsoonal signal, which may be predictable. DA - 2020/2// PY - 2020/2// DO - 10.1002/joc.6264 VL - 40 IS - 2 SP - 641-667 SN - 1097-0088 KW - demise KW - monsoon KW - onset KW - predictability KW - timing KW - variability ER - TY - JOUR TI - Modelling the effects of field spatial scale and natural enemy colonization behaviour on pest suppression in diversified agroecosystems AU - Banks, John E. AU - Laubmeier, Amanda N. AU - Banks, H. 
Thomas T2 - AGRICULTURAL AND FOREST ENTOMOLOGY AB - Abstract Diversifying agroecosystems by establishing or retaining natural vegetation in and around crop areas has long been recognized as a potentially effective means of bolstering pest control as a result of attracting more numerous and diverse natural enemies, although outcomes are inconsistent across species. Little is known about the underlying mechanisms driving such differences in species responses, creating challenges for determining how best to manage landscapes for maximizing environmental services such as biological control. The present study addresses gaps in our understanding of the link between noncrop vegetation in field margins and pest suppression by using a system of partial differential equations to model population‐level predator–prey interactions, as well as spatial processes, aiming to capture the dynamics of crop plants, herbivores and two generalist predators. We focus on differences in how two predators (a carabid and a ladybird beetle) colonize crop fields where they forage for prey, examining differences in how they move into the fields from adjacent vegetation as a potential driver of differences in overall pest suppression. The results obtained demonstrate that predator colonization behaviour and spatial scale are important factors with respect to determining the effectiveness of biological control. DA - 2020/2// PY - 2020/2// DO - 10.1111/afe.12354 VL - 22 IS - 1 SP - 30-40 SN - 1461-9563 KW - Beetle KW - differential equation KW - diffusion KW - dispersal KW - habitat heterogeneity ER - TY - JOUR TI - Nonparametric Estimation of Multivariate Mixtures AU - Zheng, Chaowen AU - Wu, Yichao T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - A multivariate mixture model is determined by three elements: the number of components, the mixing proportions, and the component distributions. Assuming that the number of components is given and ... 
DA - 2020/7/2/ PY - 2020/7/2/ DO - 10.1080/01621459.2019.1635481 VL - 115 IS - 531 SP - 1456-1471 SN - 1537-274X KW - Density estimation KW - Nonparametric mixture model KW - Tensor ER - TY - JOUR TI - A Sparse Random Projection-Based Test for Overall Qualitative Treatment Effects AU - Shi, Chengchun AU - Lu, Wenbin AU - Song, Rui T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - In contrast to the classical “one-size-fits-all” approach, precision medicine proposes the customization of individualized treatment regimes to account for patients’ heterogeneity in response to treatments. Most existing works in the literature have focused on estimating optimal individualized treatment regimes. However, there has been less attention devoted to hypothesis testing regarding the existence of overall qualitative treatment effects, especially when there are a large number of prognostic covariates. When covariates do not have qualitative treatment effects, the optimal treatment regime will assign the same treatment to all patients regardless of their covariate values. In this article, we consider testing the overall qualitative treatment effects of patients’ prognostic covariates in a high-dimensional setting. We propose a sample splitting method to construct the test statistic, based on a nonparametric estimator of the contrast function. When the dimension of covariates is large, we construct the test based on sparse random projections of covariates into a low-dimensional space. We prove the consistency of our test statistic. In the regular cases, we show the asymptotic power function of our test statistic is asymptotically the same as the “oracle” test statistic which is constructed based on the “optimal” projection matrix. Simulation studies and real data applications validate our theoretical findings. Supplementary materials for this article are available online. 
DA - 2020/7/2/ PY - 2020/7/2/ DO - 10.1080/01621459.2019.1604368 VL - 115 IS - 531 SP - 1201-1213 SN - 1537-274X KW - High-dimensional testing KW - Optimal treatment regime KW - Precision medicine KW - Qualitative treatment effects KW - Sparse random projection ER - TY - JOUR TI - MIMIX: A Bayesian Mixed-Effects Model for Microbiome Data From Designed Experiments AU - Grantham, Neal S. AU - Guan, Yawen AU - Reich, Brian J. AU - Borer, Elizabeth T. AU - Gross, Kevin T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - Recent advances in bioinformatics have made high-throughput microbiome data widely available, and new statistical tools are required to maximize the information gained from these data. For example, analysis of high-dimensional microbiome data from designed experiments remains an open area in microbiome research. Contemporary analyses work on metrics that summarize collective properties of the microbiome, but such reductions preclude inference on the fine-scale effects of environmental stimuli on individual microbial taxa. Other approaches model the proportions or counts of individual taxa as response variables in mixed models, but these methods fail to account for complex correlation patterns among microbial communities. In this article, we propose a novel Bayesian mixed-effects model that exploits cross-taxa correlations within the microbiome, a model we call microbiome mixed model (MIMIX). MIMIX offers global tests for treatment effects, local tests and estimation of treatment effects on individual taxa, quantification of the relative contribution from heterogeneous sources to microbiome variability, and identification of latent ecological subcommunities in the microbiome. MIMIX is tailored to large microbiome experiments using a combination of Bayesian factor analysis to efficiently represent dependence between taxa and Bayesian variable selection methods to achieve sparsity. 
We demonstrate the model using a simulation experiment and on a 2 × 2 factorial experiment of the effects of nutrient supplement and herbivore exclusion on the foliar fungal microbiome of Andropogon gerardii, a perennial bunchgrass, as part of the global Nutrient Network research initiative. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement. DA - 2020/4/2/ PY - 2020/4/2/ DO - 10.1080/01621459.2019.1626242 VL - 115 IS - 530 SP - 599-609 SN - 1537-274X KW - Continuous shrinkage prior KW - Factor analysis KW - Microbiome KW - Mixed model KW - Nutrient Network KW - OTU abundance data ER - TY - JOUR TI - Testing and Estimation of Social Network Dependence With Time to Event Data AU - Su, Lin AU - Lu, Wenbin AU - Song, Rui AU - Huang, Danyang T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - Lin Su (a), Wenbin Lu (a, *), Rui Song (a) & Danyang Huang (b); (a) Department of Statistics, North Carolina State University, Raleigh, NC; (b) School of Statistics, Renmin University, Beijing, China DA - 2020/4/2/ PY - 2020/4/2/ DO - 10.1080/01621459.2019.1617153 VL - 115 IS - 530 SP - 570-582 SN - 1537-274X KW - Cox model KW - EM algorithm KW - Social network dependence KW - Time-to-event data ER - TY - JOUR TI - Estimating Dynamic Treatment Regimes in Mobile Health Using V-Learning AU - Luckett, Daniel J. AU - Laber, Eric B. AU - Kahkoska, Anna R. AU - Maahs, David M. AU - Mayer-Davis, Elizabeth AU - Kosorok, Michael R. T2 - JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION AB - The vision for precision medicine is to use individual patient characteristics to inform a personalized treatment plan that leads to the best possible health-care for each patient. 
Mobile technologies have an important role to play in this vision as they offer a means to monitor a patient's health status in real-time and subsequently to deliver interventions if, when, and in the dose that they are needed. Dynamic treatment regimes formalize individualized treatment plans as sequences of decision rules, one per stage of clinical intervention, that map current patient information to a recommended treatment. However, most existing methods for estimating optimal dynamic treatment regimes are designed for a small number of fixed decision points occurring on a coarse time-scale. We propose a new reinforcement learning method for estimating an optimal treatment regime that is applicable to data collected using mobile technologies in an out-patient setting. The proposed method accommodates an indefinite time horizon and minute-by-minute decision making that are common in mobile health applications. We show that the proposed estimators are consistent and asymptotically normal under mild conditions. The proposed methods are applied to estimate an optimal dynamic treatment regime for controlling blood glucose levels in patients with type 1 diabetes. DA - 2020/4/2/ PY - 2020/4/2/ DO - 10.1080/01621459.2018.1537919 VL - 115 IS - 530 SP - 692-706 SN - 1537-274X KW - Markov decision processes KW - Precision medicine KW - Reinforcement learning KW - Type 1 diabetes ER -