Prediction of protein subcellular localization based on primary sequence data

Ozarar, M; Atalay, MEHMET; Atalay, RENGÜL

COMPUTER AND INFORMATION SCIENCES - ISCIS 2003, vol.2869, pp.611-618, 2003 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 2869
Publication Date: 2003
Journal Name: COMPUTER AND INFORMATION SCIENCES - ISCIS 2003
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, EMBASE, MathSciNet, Philosopher's Index, zbMATH
Pages: pp.611-618
Middle East Technical University Affiliated: Yes

Abstract

This paper describes a system called prediction of protein subcellular localization (P2SL), which predicts the subcellular localization of proteins in eukaryotic organisms based on the amino acid content of their primary sequences, taking amino acid order into account. Our approach is to find the most frequent motifs for each protein class by clustering and then to use these motifs as features for classification. This approach allows classification independent of sequence length. Another important property of the approach is that it provides a means to perform reverse analysis and to extract rules. More importantly, we describe a new encoding scheme for amino acids, based on the point accepted mutation (PAM) substitution matrix, that conserves biological function. We present preliminary results of our system on a two-class (dichotomy) classifier; the system can be extended to multiple classes with some modifications.
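
The idea of using frequent motifs as length-independent features can be illustrated with a minimal sketch. This is not the P2SL implementation: it swaps the clustering step for plain k-mer counting, omits the PAM-based encoding, and uses hypothetical function names (`kmers`, `top_motifs`, `featurize`) and made-up toy sequences. It only shows how a fixed set of motifs yields a feature vector of the same size for sequences of any length.

```python
from collections import Counter


def kmers(seq, k=3):
    """Return all overlapping length-k substrings of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def top_motifs(sequences, k=3, n=5):
    """Pick the n most frequent k-mers across one class of sequences.

    Assumption: simple counting stands in for the clustering step
    described in the abstract.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(kmers(seq, k))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [motif for motif, _ in ranked[:n]]


def featurize(seq, motifs, k=3):
    """Relative frequency of each motif in seq.

    The vector length equals len(motifs), regardless of len(seq).
    """
    counts = Counter(kmers(seq, k))
    total = max(len(seq) - k + 1, 1)  # number of k-mers in seq
    return [counts[m] / total for m in motifs]


# Toy sequences, invented for illustration only.
class_a = ["MKKLLA", "MKKLAV"]
motifs = top_motifs(class_a, k=3, n=2)
short_vec = featurize("MKKL", motifs)
long_vec = featurize("MKKLMKKLMKKL", motifs)
```

Both `short_vec` and `long_vec` have two entries even though the input sequences differ in length, so any standard two-class classifier could be trained on such vectors.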