Multi-view subcellular localization prediction of human proteins
Date
2019
Author
Özsarı, Gökhan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Determining the subcellular localization of proteins is crucial for understanding protein function, drug targeting, systems biology, and proteomics research. Experimental validation of subcellular localization is an expensive and challenging process. Several computational methods exist for automated prediction of protein subcellular localization; however, there is still room for better performance. Here, we propose a multi-view SVM-based approach that provides predictions for human proteins. We represent each protein sequence by multi-view features, i.e., physicochemical properties, amino acid compositions, and homology-based features. Our classification model contains seven classifiers for each localization, where each classifier provides a probabilistic result. To develop a multi-view voting classifier, we employ a weighted classifier combination method that assigns different weights to classifiers based on their discriminative strengths. We evaluated the described method on previously used datasets, as well as on our in-house dataset, called the Trust dataset. The Trust dataset was created using a new subcellular localization hierarchy, which merges the UniProt Subcellular Location hierarchy and the GO Cellular Component hierarchy, applied only to manual experimental annotations in UniProtKB. We compared our results with five state-of-the-art methods: SubCons, LocTree2, CELLO2.5, MultiLoc2, and DeepLoc. Our approach outperformed the others, with overall Matthews correlation coefficient (MCC) scores of 59%, 61%, and 68% on the Trust, Golden (SubCons benchmark dataset), and Golden-Trust (refined Golden dataset) datasets, respectively, whereas the MCC scores of SubCons were 43%, 53%, and 56%.
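The weighted combination and the MCC metric mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the weights are derived or how the combined scores are turned into a final label, so everything here is an assumption made for illustration: the function names (`combine_weighted`, `predict_location`, `mcc`), the normalized weighted average, the "highest combined score wins" rule, and all numeric values are hypothetical and are not taken from the thesis.

```python
import math


def combine_weighted(probs, weights):
    """Weighted average of the per-classifier probabilities for one location.

    Assumption: weights are non-negative numbers that reflect each
    classifier's discriminative strength (e.g. a validation score).
    """
    if len(probs) != len(weights):
        raise ValueError("one weight per classifier is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probs, weights)) / total


def predict_location(per_location_probs, per_location_weights):
    """Return the location with the highest combined score, plus all scores."""
    scores = {
        loc: combine_weighted(per_location_probs[loc], per_location_weights[loc])
        for loc in per_location_probs
    }
    best = max(scores, key=scores.get)
    return best, scores


def mcc(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal count is zero
    return (tp * tn - fp * fn) / denom


# Hypothetical outputs of seven classifiers for two candidate locations.
example_probs = {
    "nucleus": [0.9, 0.8, 0.7, 0.6, 0.9, 0.8, 0.7],
    "cytoplasm": [0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.5],
}
# Hypothetical weights, one per classifier and location.
example_weights = {
    "nucleus": [0.6, 0.5, 0.7, 0.4, 0.6, 0.5, 0.7],
    "cytoplasm": [0.5, 0.5, 0.6, 0.3, 0.4, 0.6, 0.7],
}

best_location, combined_scores = predict_location(example_probs, example_weights)
print(best_location, combined_scores)
```

The abstract reports MCC as a percentage; under that reading, a value of 0.59 from `mcc` corresponds to the reported 59%.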
Subject Keywords
Proteins
Keywords: Subcellular localization prediction, human proteins, SVM, multi-view
URI
http://etd.lib.metu.edu.tr/upload/12623896/index.pdf
https://hdl.handle.net/11511/44466
Collections
Graduate School of Natural and Applied Sciences, Thesis