DISCRETE APPLIED MATHEMATICS, vol. 157, pp. 2388-2394, 2009 (journal indexed in SCI)
Support vector machines (SVMs) have many applications in the analysis of biological data, from gene expression arrays to EEG signals of sleep stages. In this paper, we develop an application that supports the prediction of the pro-peptide cleavage site of fungal extracellular proteins, which mostly display a monobasic or dibasic processing site. Many secretory proteins and peptides are synthesized as inactive precursors and become active after post-translational processing. A collection of fungal proprotein sequences is used as the training data set. A specifically designed kernel is expressed as an application of the well-known Gaussian kernel via feature spaces defined for our problem. Rather than fixing the kernel parameters by cross-validation or other methods, we introduce a novel approach that performs model selection simultaneously with the testing of accuracy and confidence levels. This leads to higher accuracy at significantly reduced training times. The results of the server ProP1.0, which predicts pro-peptide cleavage sites, are compared with the results of this study. A similar mathematical approach may be adapted to pro-peptide cleavage prediction in other eukaryotes. (c) 2008 Elsevier B.V. All rights reserved.
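To illustrate what "a Gaussian kernel via feature spaces" can look like for sequence data, here is a minimal sketch. The abstract does not specify the paper's feature space, so everything below is an assumption made for illustration only: the one-hot encoding of residues, the fixed-length window around a candidate cleavage site, the function names `one_hot_window` and `gaussian_kernel`, and the default `gamma`. It is not the kernel designed in the paper.

```python
import math

# The 20 standard amino acids, one-letter codes (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_window(window):
    """Map a peptide window to a flat 0/1 feature vector (20 entries per residue).

    Hypothetical feature map; the paper's own feature space is not given
    in the abstract.
    """
    features = []
    for residue in window:
        features.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return features


def gaussian_kernel(window_a, window_b, gamma=0.1):
    """Gaussian kernel exp(-gamma * ||phi(a) - phi(b)||^2) on encoded windows."""
    if len(window_a) != len(window_b):
        raise ValueError("windows must have the same length")
    phi_a = one_hot_window(window_a)
    phi_b = one_hot_window(window_b)
    squared_distance = sum((a - b) ** 2 for a, b in zip(phi_a, phi_b))
    return math.exp(-gamma * squared_distance)
```

With this encoding, each mismatched position contributes 2 to the squared distance, so the kernel value decays with the number of residues in which two windows differ. A matrix of such values over all training windows could then be passed to any SVM solver that accepts a precomputed kernel.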