Department of Mathematics and Statistics
Thursday, Nov 16, 2023
1:15pm – 2:15pm
Wenshu Dai, Binghamton University
Logistic t-multinomial Clustering for Microbiome Data
In bioinformatics, we frequently encounter discrete data, in particular microbiome taxa count data obtained through 16S rRNA sequencing. These datasets are typically high-dimensional and carry information only about relative abundance, so they must be treated as compositional data. Analyzing such data is challenging because compositions are confined to a simplex. Although the multinomial distribution accounts for the compositional nature of the data and a Gaussian prior provides flexibility in modeling covariance matrices, the log-ratio transformed compositions of microbiome data can exhibit long-tailed characteristics. We therefore develop a robust mixture of logistic t-multinomial models based on a hierarchical structure for the log-ratio transformed compositional data. The model provides a longer-tailed alternative to the normal distribution, and it employs a variational Gaussian approximation in tandem with the Expectation-Maximization (EM) algorithm for parameter recovery.
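To make the hierarchical structure described in the abstract concrete, the following is a minimal illustrative sketch of how a single count vector could be simulated from one logistic t-multinomial component. It is not the speaker's implementation. Several choices are assumptions made here for illustration: the additive log-ratio (ALR) transform with the last taxon as reference (the abstract says only "log-ratio transformed"), the function names `alr`, `alr_inv`, and `sample_logistic_t_multinomial`, and the parameters `mu`, `sigma`, `df`, and `depth`. The variational Gaussian approximation and EM fitting steps are not shown.

```python
import numpy as np


def alr(p):
    """Additive log-ratio transform (assumed here): K-part composition -> R^(K-1),
    using the last part as the reference."""
    p = np.asarray(p, dtype=float)
    return np.log(p[..., :-1] / p[..., -1:])


def alr_inv(y):
    """Inverse ALR: map a point in R^(K-1) back onto the K-part simplex."""
    y = np.asarray(y, dtype=float)
    # Append a zero for the reference part, then apply a stable softmax.
    z = np.concatenate([y, np.zeros(y.shape[:-1] + (1,))], axis=-1)
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def sample_logistic_t_multinomial(rng, mu, sigma, df, depth):
    """Draw one count vector from a single (hypothetical) component.

    Latent layer: y follows a multivariate t with location mu, scale matrix
    sigma, and df degrees of freedom, drawn as a normal scale mixture.
    Observed layer: counts follow a multinomial with total `depth` and
    probabilities alr_inv(y).
    """
    chol = np.linalg.cholesky(sigma)
    z = rng.standard_normal(mu.shape[0])
    # u ~ Gamma(shape=df/2, rate=df/2); small u inflates y, giving heavy tails.
    u = rng.gamma(shape=df / 2.0, scale=2.0 / df)
    y = mu + (chol @ z) / np.sqrt(u)
    p = alr_inv(y)
    counts = rng.multinomial(depth, p)
    return counts, p


rng = np.random.default_rng(0)
counts, composition = sample_logistic_t_multinomial(
    rng, mu=np.zeros(3), sigma=np.eye(3), df=5.0, depth=1000
)
```

Here a three-dimensional latent vector yields counts over four taxa that sum to the sequencing depth. Lowering `df` makes extreme log-ratios more likely, which is the long-tailed behavior the abstract refers to; as `df` grows, the latent layer approaches the Gaussian case.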