Measuring Similarity between Gene Expression Profiles with the Consideration of Both Shape and Magnitude

June, 2006
Report Number: 
Kyungpil Kim, Keni Jiang, Shibo Zhang, Li Cai, In-Beum Lee, Lewis Feldman, Haiyan Huang

Clustering methods have been widely applied to gene expression data to group genes with common or similar expression profiles into discrete functional groups. In such analyses, designing an appropriate (dis)similarity measure is critical. Motivated by PoissonC, the Poisson-based similarity measure designed for SAGE data (Cai et al., 2004), we explore more generally applicable similarity measures for clustering analysis that consider both the shape and the magnitude of a gene expression profile. Our idea is to model the shape and magnitude information separately and to use the estimated shape and magnitude parameters to define a similarity measure in a new data space, in which each dimension represents a different aspect of the shape of an expression profile. We expect the new measure to be more effective than PoissonC at detecting shape changes while retaining the necessary sensitivity to magnitude. Applying the new measure to different types of expression data demonstrates the effectiveness of our method.
