Kernel-based Data Fusion and its Application to Protein Function Prediction in Yeast

Report Number
646
Authors
G. R. G. Lanckriet, M. Deng, N. Cristianini, M. I. Jordan, and W. S. Noble
Abstract

Kernel methods provide a principled framework in which to represent many types of data, including vectors, strings, trees and graphs. As such, these methods are useful for drawing inferences about biological phenomena. We describe a method for combining multiple kernel representations in an optimal fashion, by formulating the combination as a convex optimization problem that can be solved using semidefinite programming techniques. The method is applied to the problem of predicting yeast protein functional classifications using a support vector machine (SVM) trained on five types of data. For this problem, the new method performs better than a previously described Markov random field method, and better than the SVM trained on any single type of data.
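
As a rough illustration of what "combining multiple kernel representations" can mean, the sketch below forms a nonnegative weighted sum of two kernel (Gram) matrices computed on the same items. The abstract does not state the exact formulation, so the weighted-sum form, the toy data, the fixed weights, and the function names (`linear_kernel`, `gaussian_kernel`, `combine_kernels`) are all assumptions made for illustration. In particular, the weights here are chosen by hand; the report's method selects them by solving a semidefinite program, which this sketch does not attempt.

```python
import math

def linear_kernel(X):
    """Gram matrix of inner products between the rows of X."""
    return [[sum(a * b for a, b in zip(x, y)) for y in X] for x in X]

def gaussian_kernel(X, sigma=1.0):
    """Gram matrix of a Gaussian (RBF) kernel with width sigma."""
    def k(x, y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
        return math.exp(-sq_dist / (2.0 * sigma ** 2))
    return [[k(x, y) for y in X] for x in X]

def combine_kernels(kernels, weights):
    """Return sum_i weights[i] * kernels[i] as a new Gram matrix.

    Nonnegative weights are required here (an assumption of this sketch)
    because a nonnegative combination of positive semidefinite matrices
    is itself positive semidefinite, so the result is a valid kernel.
    """
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel matrix")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be nonnegative")
    n = len(kernels[0])
    return [
        [sum(w * K[r][c] for w, K in zip(weights, kernels)) for c in range(n)]
        for r in range(n)
    ]

# Hypothetical toy data: three items described by 2-D feature vectors.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

K_lin = linear_kernel(X)
K_rbf = gaussian_kernel(X, sigma=1.0)

# Hand-picked weights stand in for the weights an optimizer would return.
K = combine_kernels([K_lin, K_rbf], [0.3, 0.7])
```

The combined matrix `K` can be passed to any SVM implementation that accepts a precomputed kernel. Each data type contributes its own Gram matrix, so heterogeneous sources can be fused without forcing them into a single feature-vector representation.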
