Data Science Seminar
Hosted by Department of Mathematical Sciences

Abstract

As an alternative to model selection, model averaging has received considerable attention in recent years, especially in the frequentist paradigm. This dissertation proposes an approach to choosing the weights under the frequentist model averaging (FMA) framework that has optimal properties with respect to the asymptotic risk. To demonstrate the idea, we adopt the linear regression model as our main analytical framework. Instead of averaging over least squares (LS) estimators, we develop James-Stein-type FMA estimators by combining the James-Stein estimators under each candidate model, an approach motivated by the James-Stein shrinkage estimator for Gaussian means. Viewed from another perspective, this construction defines a new class of weights for the LS estimators, and the resulting weight choice strategy minimizes the asymptotic risk of the corresponding FMA estimator.
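
As a rough illustration of the two ingredients named above, the following Python sketch shows the classical positive-part James-Stein shrinkage estimator for a Gaussian mean vector, followed by a plain weighted average of candidate estimates. It is not the dissertation's estimator: the function names `james_stein` and `average_estimates` are hypothetical, the noise variance is assumed known, and the weights are assumed to be supplied by the user rather than chosen by minimizing an asymptotic risk.

```python
def james_stein(x, sigma2=1.0):
    """Positive-part James-Stein estimate of a Gaussian mean vector.

    Assumes x is one observation of X ~ N(theta, sigma2 * I) with a known
    variance sigma2 and dimension p >= 3 (illustrative setting only).
    """
    p = len(x)
    if p < 3:
        raise ValueError("James-Stein shrinkage needs dimension p >= 3")
    norm_sq = sum(v * v for v in x)
    if norm_sq == 0.0:
        return [0.0] * p
    # Shrink toward the origin; the positive part keeps the factor >= 0.
    shrink = max(0.0, 1.0 - (p - 2) * sigma2 / norm_sq)
    return [shrink * v for v in x]


def average_estimates(estimates, weights):
    """Weighted average of candidate estimates of equal length.

    Assumes the weights are nonnegative and sum to one; how they are
    chosen is outside the scope of this sketch.
    """
    if len(estimates) != len(weights):
        raise ValueError("need one weight per candidate estimate")
    if abs(sum(weights) - 1.0) > 1e-9 or any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative and sum to one")
    dim = len(estimates[0])
    return [
        sum(w * est[j] for w, est in zip(weights, estimates))
        for j in range(dim)
    ]
```

For example, `james_stein([3.0, 4.0, 0.0, 0.0])` multiplies each coordinate by the factor 1 - 2/25 = 0.92, and `average_estimates` could then combine several such shrunken estimates once each has been padded with zeros to a common length.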

Additionally, we propose a post-selection model averaging procedure that combines model selection and model averaging in a unified way. A special type of variable screening procedure is introduced to eliminate poor candidate models before model averaging is conducted, and a corresponding weight choice strategy is developed. The asymptotic optimality of the proposed approaches is investigated, and their effectiveness is illustrated through simulation studies and real data analysis.