##Statistics Seminar##\\ Department of Mathematics and Statistics
^ **DATE:**|Thursday, November 20, 2025 |
^ **TIME:**|1:10pm -- 2:30pm |
^ **LOCATION:**|WH 100E|
^ **SPEAKER:**|Bruce Phillips, Binghamton University|
^ **TITLE:**|Subdata selection for Principal Component Analysis|
**Abstract**
Principal component analysis (PCA) is a powerful statistical tool for dimensionality reduction. Performing PCA requires computing the eigen-decomposition of the sample covariance matrix, or the singular value decomposition of the data matrix, which is often too computationally expensive for datasets with many observations or high dimension. In this article, we introduce a novel subdata selection method for PCA motivated by the D-optimal design criterion. We show that our method is more computationally efficient than performing PCA on the full data or PCA following other popular subdata selection methods, especially when the dimension of the data is large. At the same time, our method achieves a lower eigenspace estimation error than other subdata selection methods adapted for PCA.
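
For readers unfamiliar with the setup, the following is a minimal illustrative sketch, not the speaker's method. It computes the leading principal subspace two ways (eigen-decomposition of the sample covariance matrix and SVD of the centered data matrix) and then measures the eigenspace estimation error of PCA on a subsample. The subsample here is drawn uniformly at random as a simple baseline; the D-optimality-motivated selection rule from the talk is not reproduced. The function names `top_eigenspace` and `eigenspace_error`, the synthetic data, and the choice of error metric (Frobenius distance between projection matrices) are all assumptions made for illustration.

```python
import numpy as np


def top_eigenspace(X, k):
    """Orthonormal (p, k) basis of the top-k principal subspace of X (n x p)."""
    Xc = X - X.mean(axis=0)  # center each column
    # Right singular vectors of the centered data are the eigenvectors
    # of the sample covariance matrix, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k].T


def eigenspace_error(V_hat, V_ref):
    """Frobenius distance between the two projection matrices (assumed metric)."""
    return np.linalg.norm(V_hat @ V_hat.T - V_ref @ V_ref.T, ord="fro")


rng = np.random.default_rng(0)
n, p, k, m = 5000, 20, 3, 500  # observations, dimension, components, subdata size

# Synthetic data with three dominant directions (illustrative only).
scales = np.concatenate([np.array([10.0, 8.0, 6.0]), np.ones(p - k)])
X = rng.standard_normal((n, p)) * scales

# Full-data PCA via SVD of the centered data matrix.
V_full = top_eigenspace(X, k)

# Same subspace via eigen-decomposition of the sample covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))  # ascending order
V_eig = eigvecs[:, ::-1][:, :k]

# Baseline subdata: uniform random rows (NOT the D-optimal-motivated rule).
idx = rng.choice(n, size=m, replace=False)
V_sub = top_eigenspace(X[idx], k)

err_sub = eigenspace_error(V_sub, V_full)
err_eig = eigenspace_error(V_eig, V_full)
print(f"eigh vs SVD subspace distance: {err_eig:.2e}")
print(f"random-subsample eigenspace error: {err_sub:.4f}")
```

Under these assumptions, the two full-data routes should agree up to floating-point error, while the subsample error is strictly positive; a subdata selection method would aim to make that error smaller than uniform sampling does at the same subdata size.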