Haoyu Chen
College of Sciences
Works (1)
Updated: July 5th, 2023 14:41
2020 article
Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
Chen, H., Lu, W., & Song, R. (2020, May 28). Journal of the American Statistical Association, Vol. 7, pp. 1–16.
author keywords: Epsilon-greedy; Inverse propensity weighted estimator; Model misspecification; Online decision making; Statistical inference
topics (OpenAlex): Advanced Bandit Algorithms Research; Optimization and Search Problems; Reinforcement Learning in Robotics
TL;DR:
Using the martingale central limit theorem, it is shown that both the online ordinary least squares estimator of the model parameters and the in-sample inverse propensity weighted value estimator are asymptotically normal.
(via Semantic Scholar)
UN Sustainable Development Goal Categories
16. Peace, Justice and Strong Institutions
(OpenAlex)
Sources: Web Of Science, NC State University Libraries, ORCID
Added: July 27, 2020