Show/Hide Menu
Hide/Show Apps
Logout
Türkçe
Türkçe
Search
Search
Login
Login
OpenMETU
OpenMETU
About
About
Open Science Policy
Open Science Policy
Open Access Guideline
Open Access Guideline
Postgraduate Thesis Guideline
Postgraduate Thesis Guideline
Communities & Collections
Communities & Collections
Help
Help
Frequently Asked Questions
Frequently Asked Questions
Guides
Guides
Thesis submission
Thesis submission
MS without thesis term project submission
MS without thesis term project submission
Publication submission with DOI
Publication submission with DOI
Publication submission
Publication submission
Supporting Information
Supporting Information
General Information
General Information
Copyright, Embargo and License
Copyright, Embargo and License
Contact us
Contact us
Identification of functionally orthologous protein groups in different species based on protein network alignment
Download
index.pdf
Date
2010
Author
Yaveroğlu, Ömer Nebil
Metadata
Show full item record
Item Usage Stats
159
views
62
downloads
Cite This
In this study, an algorithm named ClustOrth is proposed for determining and matching functionally orthologous protein clusters in different species. The algorithm requires protein interaction networks of the organisms to be compared and GO terms of the proteins in these interaction networks as prior information. After determining the functionally related protein groups using the Repeated Random Walks algorithm, the method maps the identified protein groups according to the similarity metric defined. In order to evaluate the similarities of protein groups, graph theoretical information is used together with the context information about the proteins. The clusters are aligned using GO-Term-based protein similarity measures defined in previous studies. These alignments are used to evaluate cluster similarities by defining a cluster similarity metric from protein similarities. The top scoring cluster alignments are considered as orthologous. Several data sources providing orthology information have shown that the defined cluster similarity metric can be used to make inferences about the orthological relevance of protein groups. Comparison with a protein orthology prediction algorithm named ISORANK also showed that the ClustOrth algorithm is successful in determining orthologies between proteins. However, the cluster similarity metric is too strict and many cluster matches are not able to produce high scores for this metric. For this reason, the number of predictions performed is low. This problem can be overcomed with the introduction of different sources of information related to proteins in the clusters for the evaluation of the clusters. The ClustOrth algorithm also outperformed the NetworkBLAST algorithm which aims to find orthologous protein clusters using protein sequence information directly for determining orthologies. It can be concluded that this study is one of the leading studies addressing the protein cluster matching problem for identifying orthologous functional modules of protein interaction networks computationally.
Subject Keywords
Electronic computers.
,
Computer science.
URI
http://etd.lib.metu.edu.tr/upload/12612395/index.pdf
https://hdl.handle.net/11511/19924
Collections
Graduate School of Natural and Applied Sciences, Thesis
Suggestions
OpenMETU
Core
Computational representation of protein sequences for homology detection and classification
Oğul, Hasan; Mumcuoğlu, Ünal Erkan; Department of Information Systems (2006)
Machine learning techniques have been widely used for classification problems in computational biology. They require that the input must be a collection of fixedlength feature vectors. Since proteins are of varying lengths, there is a need for a means of representing protein sequences by a fixed-number of features. This thesis introduces three novel methods for this purpose: n-peptide compositions with reduced alphabets, pairwise similarity scores by maximal unique matches, and pairwise similarity scores by...
A pattern classification approach boosted with genetic algorithms
Yalabık, İsmet; Yarman Vural, Fatoş Tunay; Department of Computer Engineering (2007)
Ensemble learning is a multiple-classier machine learning approach which combines, produces collections and ensembles statistical classiers to build up more accurate classier than the individual classiers. Bagging, boosting and voting methods are the basic examples of ensemble learning. In this thesis, a novel boosting technique targeting to solve partial problems of AdaBoost, a well-known boosting algorithm, is proposed. The proposed systems nd an elegant way of boosting a bunch of classiers successively t...
Modeling of various biological networks via LCMARS
AYYILDIZ DEMİRCİ, EZGİ; Purutçuoğlu Gazi, Vilda (Elsevier BV, 2018-09-01)
In system biology, the interactions between components such as genes, proteins, can be represented by a network. To understand the molecular mechanism of complex biological systems, construction of their networks plays a crucial role. However, estimation of these biological networks is a challenging problem because of their high dimensional and sparse structures. Several statistical methods are proposed to overcome this issue. The Conic Multivariate Adaptive Regression Splines (CMARS) is one of the recent n...
An image encryption algorithm robust to post-encryption bitrate conversion
Akdağ, Sadık Bahaettin; Candan, Çağatay; Department of Electrical and Electronics Engineering (2006)
In this study, a new method is proposed to protect JPEG still images through encryption by employing integer-to-integer transforms and frequency domain scrambling in DCT channels. Different from existing methods in the literature, the encrypted image can be further compressed, i.e. transcoded, after the encryption. The method provides selective encryption/security level with the adjustment of its parameters. The encryption method is tested with various images and compared with the methods in the literature ...
A clustering method for the problem of protein subcellular localization
Bezek, Perit; Atalay, Mehmet Volkan; Department of Computer Engineering (2006)
In this study, the focus is on predicting the subcellular localization of a protein, since subcellular localization is helpful in understanding a protein’s functions. Function of a protein may be estimated from its sequence. Motifs or conserved subsequences are strong indicators of function. In a given sample set of protein sequences known to perform the same function, a certain subsequence or group of subsequences should be common; that is, occurrence (frequency) of common subsequences should be high. Our ...
Citation Formats
IEEE
ACM
APA
CHICAGO
MLA
BibTeX
Ö. N. Yaveroğlu, “Identification of functionally orthologous protein groups in different species based on protein network alignment,” M.S. - Master of Science, Middle East Technical University, 2010.