Statistics Seminar
Department of Mathematics and Statistics

DATE: Thursday, Nov 16, 2023
TIME: 1:15pm – 2:15pm
SPEAKER: Wenshu Dai, Binghamton University
TITLE: Logistic t-multinomial Clustering for Microbiome Data


In bioinformatics, we frequently encounter discrete data, particularly microbiome taxa count data obtained through 16S rRNA sequencing. These microbiome datasets are typically high-dimensional and carry information only about relative abundance, so they must be treated as compositional data. Analyzing such data is challenging because compositions are confined to a simplex. Although the multinomial distribution accounts for the compositional nature of the data and a Gaussian prior provides flexibility in modeling covariance matrices, the log-ratio transformed compositions of microbiome data can exhibit long-tailed characteristics. Thus, we develop a robust mixture of logistic t-multinomial models using hierarchical structures of the log-ratio transformed compositional data. This mixture provides a longer-tailed alternative to the normal distribution and employs a variational Gaussian approximation in tandem with the Expectation-Maximization (EM) algorithm for parameter recovery.

