<WRAP centeralign>##Statistics Seminar##\\ Department of Mathematical Sciences</WRAP>

<WRAP 70% center>
^  **DATE:**|Thursday, Nov. 12, 2020 |
^  **TIME:**|1:15pm -- 2:15pm |
^  **LOCATION:**|Zoom meeting |
^  **SPEAKER:**|Xiaoke Qin, Binghamton University |
^  **TITLE:**|Variable selection for sparse Dirichlet-Multinomial regression with an application to microbiome data analysis. |
</WRAP>
\\ 

<WRAP center box 80%>
<WRAP centeralign>**Abstract**</WRAP>
With the development of next-generation sequencing technology, 
researchers are now able to study microbiome composition using 
direct sequencing, whose output is bacterial taxa counts for each 
microbiome sample. One goal of microbiome studies is to associate the 
microbiome composition with environmental covariates. This paper 
proposes to model the taxa counts using a Dirichlet-multinomial (DM) 
regression model in order to account for overdispersion of observed 
counts. The DM regression model can be used for testing the association 
between taxa composition and covariates using the likelihood ratio test. 
However, when the number of covariates is large, multiple testing can 
lead to loss of power. To address the high dimensionality of the 
problem, a penalized likelihood approach is proposed to estimate the 
regression parameters and to select the variables by imposing a sparse 
group l_2 penalty to encourage both group-level and within-group 
sparsity. Such a variable selection procedure can lead to selection of 
the relevant covariates and their associated bacterial taxa. An 
efficient block-coordinate descent algorithm is developed to solve the 
optimization problem in this paper. The authors also demonstrate the 
power of the method in the analysis of a data set evaluating the 
effect of nutrient intake on the human gut microbiome.
</WRAP>