2021 journal article
A projector-based approach to quantifying total and excess uncertainties for sketched linear regression
Information and Inference: A Journal of the IMA, 8.
Abstract: Linear regression is a classic method of data analysis. In recent years, sketching (a method of dimension reduction using random sampling, random projections or both) has gained popularity as an effective computational approximation when the number of observations greatly exceeds the number of variables. In this paper, we address the following question: how does sketching affect the statistical properties of the solution and key quantities derived from it? To answer this question, we present a projector-based approach to sketched linear regression that is exact and that requires minimal assumptions on the sketching matrix. Therefore, downstream analyses hold exactly and generally for all sketching schemes. Additionally, a projector-based approach enables derivation of key quantities from classic linear regression that account for the combined model- and algorithm-induced uncertainties. We demonstrate the usefulness of a projector-based approach in quantifying, and providing insight into, excess uncertainties and bias-variance decompositions for sketched linear regression. Finally, we demonstrate how the insights from our projector-based analyses can be used to produce practical sketching diagnostics to aid the design of judicious sketching schemes.
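As an illustration of the setting the abstract describes, the following is a minimal NumPy sketch rather than the authors' implementation. It assumes a Gaussian random-projection sketching matrix `S`, synthetic data, and illustrative names (`A`, `b`, `x_sk`, `P`) that do not come from the paper. It solves the sketched least-squares problem and forms the matrix `P = A (SA)^+ S`, which maps the response to the sketched fitted values and is an oblique projector onto the column space of `A` whenever `SA` has full column rank.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic tall problem: n observations, p variables (assumed sizes).
n, p, r = 500, 5, 100          # r = number of sketched rows, p <= r << n
A = rng.standard_normal((n, p))
x_true = rng.standard_normal(p)
b = A @ x_true + 0.1 * rng.standard_normal(n)

# Classic least-squares solution on the full data.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Gaussian sketching matrix (one possible scheme; sampling would also work).
S = rng.standard_normal((r, n)) / np.sqrt(r)
SA, Sb = S @ A, S @ b

# Sketched solution: least squares on the reduced r-by-p problem.
x_sk, *_ = np.linalg.lstsq(SA, Sb, rcond=None)

# Projector view: P maps b to the sketched fitted values A @ x_sk.
P = A @ np.linalg.pinv(SA) @ S

# Checks that hold when SA has full column rank.
is_idempotent = np.allclose(P @ P, P, atol=1e-8)        # P is a projector
fixes_range_of_A = np.allclose(P @ A, A, atol=1e-8)     # range(P) = range(A)
matches_fit = np.allclose(P @ b, A @ x_sk, atol=1e-8)   # P b = A x_sk
is_symmetric = np.allclose(P, P.T, atol=1e-8)           # False: P is oblique

# Relative deviation of the sketched solution from the full solution.
rel_err = np.linalg.norm(x_sk - x_ls) / np.linalg.norm(x_ls)
print(is_idempotent, fixes_range_of_A, matches_fit, is_symmetric, rel_err)
```

Forming the n-by-n matrix `P` explicitly is done here only to make the projector properties checkable on a small problem; at realistic sizes one would work with `SA` and `Sb` directly.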