Abstract
Several statistical methods are now available for analysing gene expression data recorded with microarray technology. In this article, we take a closer look at several Gaussian mixture models that have recently been proposed for gene expression data. It can be shown that these models are special cases of a more general model, the mixture of structural equation models (mixture of SEMs), which was developed in psychometrics. This model combines mixture modelling and SEMs by assuming that the component-specific means and covariances are subject to a SEM. The connection with SEM is useful for at least two reasons: (1) it makes the basic assumptions of existing methods more explicit and (2) it makes it straightforward to develop alternative mixture models for gene expression data with alternative mean/covariance structures. Different specifications of the mixture of SEMs for clustering gene expression data are illustrated using two benchmark datasets.
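As a rough sketch of what "component-specific means and covariances subject to a SEM" can look like, the display below writes a Gaussian mixture whose component moments are implied by a standard LISREL-type SEM. The notation (K, \pi_k, \nu_k, \Lambda_k, B_k, \alpha_k, \Psi_k, \Theta_k) is an assumption chosen for illustration and is not taken from the abstract; the article's own parameterisation may differ.

```latex
% Illustrative only: symbols are assumed, not quoted from the article.
% Within component k: y = \nu_k + \Lambda_k \eta + \varepsilon,
%                     \eta = \alpha_k + B_k \eta + \zeta,
% with Cov(\zeta) = \Psi_k and Cov(\varepsilon) = \Theta_k.
\begin{align}
  f(\mathbf{y}_i)
    &= \sum_{k=1}^{K} \pi_k \,
       \phi\!\left(\mathbf{y}_i ;\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
       \qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1, \\
  \boldsymbol{\mu}_k
    &= \boldsymbol{\nu}_k
       + \boldsymbol{\Lambda}_k (\mathbf{I} - \mathbf{B}_k)^{-1} \boldsymbol{\alpha}_k, \\
  \boldsymbol{\Sigma}_k
    &= \boldsymbol{\Lambda}_k (\mathbf{I} - \mathbf{B}_k)^{-1} \boldsymbol{\Psi}_k
       (\mathbf{I} - \mathbf{B}_k)^{-\top} \boldsymbol{\Lambda}_k^{\top}
       + \boldsymbol{\Theta}_k .
\end{align}
```

Under this sketch, restricting the structural parameters recovers familiar special cases: for example, setting \mathbf{B}_k = \mathbf{0} gives a mixture of factor analysers with component covariance \boldsymbol{\Lambda}_k \boldsymbol{\Psi}_k \boldsymbol{\Lambda}_k^{\top} + \boldsymbol{\Theta}_k.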
Keywords: biclustering, correlated data, microarray data, mixture of SEMs, simultaneous clustering and dimensional reduction
| Original language | English |
|---|---|
| Pages (from-to) | 567-582 |
| Journal | Statistical Methods in Medical Research |
| Volume | 22 |
| Issue number | 6 |
| DOIs | |
| Publication status | Published - 2013 |