<WRAP centeralign>##Statistics Seminar##\\ Department of Mathematics and Statistics</WRAP>

<WRAP 70% center>
^  **DATE:**|Thursday, February 9, 2023 |
^  **TIME:**|1:15pm -- 2:15pm |
^  **LOCATION:**|WH 100E |
^  **SPEAKER:**|Zhou Wang, Binghamton University |
^  **TITLE:**|Neural Contextual Bandits with Deep Representation and Shallow Exploration |
</WRAP>
\\ 

<WRAP center box 80%>
<WRAP centeralign>**Abstract**</WRAP>
This talk presents a paper on neural contextual bandits, a general class of contextual bandits in which each context-action pair is associated with a raw feature vector but the reward-generating function is unknown. The authors propose a novel learning algorithm that transforms the raw feature vector using the last hidden layer of a deep ReLU neural network (deep representation learning) and uses an upper confidence bound (UCB) approach to explore only the last linear layer (shallow exploration). They prove that, under standard assumptions, the algorithm achieves O(sqrt(T) log T) finite-time regret, where T is the learning time horizon. Compared with existing neural contextual bandit algorithms, the approach is computationally much more efficient because it only needs to explore the last layer of the deep neural network.
</WRAP>
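
For readers new to the topic, the sketch below illustrates the two ingredients named in the abstract: features taken from the last hidden layer of a ReLU network, and UCB exploration restricted to a linear layer on top of those features. It is a simplified illustration, not the algorithm from the paper. In particular, it assumes the hidden layers are fixed random weights that are never retrained, and that rewards are linear in the hidden features. The names ''relu_features'', ''ShallowUCB'', ''run_demo'', and all parameter values are hypothetical.

```python
import numpy as np


def relu_features(x, weights):
    """Pass a raw feature vector through the hidden ReLU layers.

    The returned vector plays the role of the last-hidden-layer representation.
    """
    h = x
    for W in weights:
        h = np.maximum(W @ h, 0.0)
    return h


class ShallowUCB:
    """LinUCB-style exploration over the last linear layer only."""

    def __init__(self, dim, lam=1.0, alpha=1.0):
        self.A = lam * np.eye(dim)   # regularized Gram matrix of chosen features
        self.b = np.zeros(dim)       # reward-weighted sum of chosen features
        self.alpha = alpha           # exploration strength (assumed constant)

    def select(self, feats):
        """Pick the arm with the largest upper confidence bound.

        feats has shape (num_arms, dim), one feature row per arm.
        """
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                      # ridge estimate of last layer
        mean = feats @ theta
        # Confidence width sqrt(phi^T A^{-1} phi), computed for each arm.
        width = np.sqrt(np.einsum("kd,de,ke->k", feats, A_inv, feats))
        return int(np.argmax(mean + self.alpha * width))

    def update(self, feat, reward):
        self.A += np.outer(feat, feat)
        self.b += reward * feat


def run_demo(T=200, K=4, d=8, hidden=16, seed=0):
    """Simulate T rounds and return the cumulative expected regret."""
    rng = np.random.default_rng(seed)
    # Assumption: fixed random hidden layers; no representation learning here.
    weights = [
        rng.normal(size=(hidden, d)) / np.sqrt(d),
        rng.normal(size=(hidden, hidden)) / np.sqrt(hidden),
    ]
    theta_star = rng.normal(size=hidden)  # hypothetical true last-layer weights
    bandit = ShallowUCB(hidden)
    regret = 0.0
    for _ in range(T):
        contexts = rng.normal(size=(K, d))          # one raw vector per arm
        feats = np.stack([relu_features(x, weights) for x in contexts])
        expected = feats @ theta_star
        arm = bandit.select(feats)
        reward = expected[arm] + 0.1 * rng.normal() # noisy observed reward
        bandit.update(feats[arm], reward)
        regret += expected.max() - expected[arm]
    return regret
```

Calling ''run_demo()'' returns the cumulative regret of one simulated run. Because exploration only involves a matrix of size ''hidden'' by ''hidden'', the per-round cost does not depend on the number of parameters in the earlier layers, which is the intuition behind the efficiency claim in the abstract.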