Prediction of protein subcellular localization based on primary sequence data

2004-04-30
Subcellular localization is crucial for determining the functions of proteins. This work presents a system called prediction of protein subcellular localization (P2SL), which predicts the subcellular localization of proteins in eukaryotic organisms from the amino acid content of their primary sequences, taking amino acid order into account. The approach is to find the most frequent motifs for each protein in a given class by clustering with self-organizing maps, and then to use these motifs as features for classification with multilayer perceptrons. This makes the classification independent of the length of the sequence. In addition, a new encoding scheme for amino acids is described that conserves biological function and is based on the point accepted mutation (PAM) substitution matrix. Statistical test results for the system are presented on a four-class problem. P2SL achieves slightly higher prediction accuracy than similar studies.
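The length-independence claim in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-letter alphabet, the similarity scores in TOY_MATRIX (which are not real PAM values), the window length, and the two prototypes standing in for trained self-organizing map units are all assumptions made for illustration. The sketch encodes each residue as its row of a substitution-style matrix, assigns every fixed-length window to its nearest prototype, and returns the normalized prototype frequencies, so sequences of any length map to a vector of the same size. The multilayer perceptron that would consume these vectors is left out.

```python
import numpy as np

def windows(sequence, k):
    """Return every length-k subsequence (candidate motif) of a sequence."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]

def encode(window, alphabet, matrix):
    """Encode a window by concatenating each residue's row of the matrix."""
    return np.concatenate([matrix[alphabet.index(aa)] for aa in window])

def motif_histogram(sequence, k, alphabet, matrix, prototypes):
    """Map a sequence of any length to a fixed-length frequency vector."""
    counts = np.zeros(len(prototypes))
    for w in windows(sequence, k):
        v = encode(w, alphabet, matrix)
        # Nearest prototype by Euclidean distance (stand-in for a SOM unit).
        nearest = np.argmin(np.linalg.norm(prototypes - v, axis=1))
        counts[nearest] += 1
    total = counts.sum()
    return counts / total if total else counts

# Hypothetical toy alphabet and made-up similarity scores, NOT real PAM values.
ALPHABET = "ACD"
TOY_MATRIX = np.array([[2.0, -1.0, 0.0],
                       [-1.0, 3.0, -2.0],
                       [0.0, -2.0, 2.0]])

# Two hypothetical prototypes standing in for trained SOM units (k = 2).
PROTOTYPES = np.stack([encode("AA", ALPHABET, TOY_MATRIX),
                       encode("CD", ALPHABET, TOY_MATRIX)])

shorter = motif_histogram("AACD", 2, ALPHABET, TOY_MATRIX, PROTOTYPES)
longer = motif_histogram("AACDAACDAA", 2, ALPHABET, TOY_MATRIX, PROTOTYPES)
```

Both shorter and longer have one entry per prototype, which is the property that lets a fixed-input classifier handle sequences of different lengths.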

Suggestions

Prediction of protein subcellular localization based on primary sequence data
Özarar, Mert; Atalay, Mehmet Volkan; Department of Computer Engineering (2003)
Subcellular localization is crucial for determining the functions of proteins. A system called prediction of protein subcellular localization (P2SL) that predicts the subcellular localization of proteins in eukaryotic organisms based on the amino acid content of primary sequences using amino acid order is designed. The approach for prediction is to find the most frequent motifs for each protein in a given class based on clustering via self organizing maps and then to use these most frequent motifs as features...
Prediction of protein subcellular localization based on primary sequence data
Ozarar, M; Atalay, Mehmet Volkan; Atalay, Rengül (2003-01-01)
This paper describes a system called prediction of protein subcellular localization (P2SL) that predicts the subcellular localization of proteins in eukaryotic organisms based on the amino acid content of primary sequences using amino acid order. Our approach for prediction is to find the most frequent motifs for each protein (class) based on clustering and then to use these most frequent motifs as features for classification. This approach allows a classification independent of the length of the sequence. ...
Characterization and prediction of protein interfaces to infer protein-protein interaction networks
Keskin, Ozlem; Tunçbağ, Nurcan; GÜRSOY, Attila (2008-04-01)
Complex protein-protein interaction networks govern biological processes in cells. Protein interfaces are the sites where proteins physically interact. Identification and characterization of protein interfaces will lead to understanding how proteins interact with each other and how they are involved in protein-protein interaction networks. What makes a given interface bind to different proteins, and how similar or different the interactions in proteins are, are some key questions to be answered. Enormous amount of prot...
Multi-view subcellular localization prediction of human proteins
Özsarı, Gökhan; Atalay, M. Volkan; Department of Computer Engineering (2019)
Determining the subcellular localization of proteins is crucial for understanding the functions of proteins, drug targeting, systems biology, and proteomics research. Experimental validation of subcellular localization is an expensive and challenging process. There exist several computational methods for automated prediction of protein subcellular localization; however, there is still room for better performance. Here, we propose a multi-view SVM-based approach that provides predictions for human proteins. ...
Modeling of various biological networks via LCMARS
AYYILDIZ DEMİRCİ, EZGİ; Purutçuoğlu Gazi, Vilda (Elsevier BV, 2018-09-01)
In systems biology, the interactions between components such as genes and proteins can be represented by a network. To understand the molecular mechanism of complex biological systems, construction of their networks plays a crucial role. However, estimation of these biological networks is a challenging problem because of their high dimensional and sparse structures. Several statistical methods are proposed to overcome this issue. The Conic Multivariate Adaptive Regression Splines (CMARS) is one of the recent n...
Citation Formats
M. Ozarar, M. V. Atalay, and R. Atalay, “Prediction of protein subcellular localization based on primary sequence data,” presented at the IEEE 12th Signal Processing and Communications Applications Conference, Kusadasi, TURKEY, 2004, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/30491.