Data Science Seminar
Hosted by Department of Mathematical Sciences
Clustering a large number of time series into relatively homogeneous groups is a well-studied unsupervised learning technique that has been widely used to group financial instruments (say, stocks) based on their stochastic properties across the entire time period under consideration. However, conventional clustering algorithms ignore the notion of co-clustering, i.e., grouping stocks only within a subset of times rather than over the entire time period. Biclustering techniques address this by simultaneously clustering the rows and columns of a data matrix. Over the past two decades, biclustering approaches have proliferated, and interest in this area continues to grow. Biclustering algorithms are well suited to the large sets of long time series that occur in many application domains, such as the biosciences and finance. This talk will give an overview of biclustering approaches, followed by descriptions of two algorithms that we have developed to bicluster intra-day stock return time series over multiple trading days. The algorithms employ the mean residue score and mutual information as metrics. Through some post-biclustering analyses, we show how data analysts may use the biclustering results to study co-movement patterns in sets of stock returns within time blocks. This is joint work with Jian Zou and Haitao Liu from Worcester Polytechnic Institute.
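For readers unfamiliar with residue-based scoring, the following is a minimal sketch of how a candidate bicluster might be scored. It assumes that the "mean residue score" refers to the mean squared residue of Cheng and Church, which the abstract does not confirm, and it is not the speakers' algorithm. The function name `mean_squared_residue` and the toy `returns` matrix (rows as stocks, columns as intra-day intervals) are hypothetical and chosen only for illustration.

```python
def mean_squared_residue(matrix, rows, cols):
    """Cheng-Church style mean squared residue of the submatrix (rows x cols).

    A score near zero means the selected rows move together, up to an
    additive shift, over the selected columns.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    # Row means, column means, and overall mean of the submatrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    # Residue of each cell: what remains after removing row and column effects.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: 3 stocks (rows) over 4 intra-day intervals (columns).
# The first three columns follow a shared additive pattern; the fourth does not.
returns = [
    [1.0, 2.0, 3.0, 0.5],
    [2.0, 3.0, 4.0, -1.0],
    [0.0, 1.0, 2.0, 2.5],
]

coherent_score = mean_squared_residue(returns, [0, 1, 2], [0, 1, 2])
full_score = mean_squared_residue(returns, [0, 1, 2], [0, 1, 2, 3])
```

In this toy example the three stocks co-move over the first three intervals only, so restricting the columns to that time block gives a lower score than scoring the whole matrix. This is the kind of within-block co-movement that biclustering is meant to surface.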