SLPred: a multi-view subcellular localization prediction tool for multi-location human proteins

2022-09-01
Özsarı, Gökhan
Rifaioglu, Ahmet Sureyya
Atakan, Ahmet
DOĞAN, TUNCA
Martin, Maria Jesus
Atalay, Rengul Cetin
Atalay, Mehmet Volkan
Accurate prediction of the subcellular locations (SLs) of proteins is a critical topic in protein science. In this study, we present SLPred, an ensemble-based multi-view and multi-label protein subcellular localization prediction tool. For a query protein sequence, SLPred provides predictions for nine main SLs using independent machine-learning models trained for each location. We used UniProtKB/Swiss-Prot human protein entries and their curated SL annotations as our source data. We connected all disjoint terms in the UniProt SL hierarchy based on the corresponding term relationships in the cellular component category of Gene Ontology and constructed a training dataset that is both reliable and large scale using the re-organized hierarchy. We tested SLPred on multiple benchmarking datasets, including our in-house sets, and compared its performance against six state-of-the-art methods. Results indicated that SLPred outperforms other tools in the majority of cases.
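
The multi-label setup described in the abstract, with one independent model per location, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three placeholder labels, the toy residue-fraction scorers, and the 0.3 threshold are hypothetical and are not SLPred's locations, features, or models. The multi-view ensemble part of the method is not modelled here.

```python
from typing import Callable, Dict, List

# A scorer maps a protein sequence to a score in [0, 1].
Scorer = Callable[[str], float]


def fraction_of(residues: str) -> Scorer:
    """Build a toy scorer: the fraction of positions whose residue is in `residues`."""
    def score(sequence: str) -> float:
        if not sequence:
            return 0.0
        return sum(ch in residues for ch in sequence) / len(sequence)
    return score


# One independent binary model per location.
# Labels and scoring rules are hypothetical placeholders, not SLPred's.
MODELS: Dict[str, Scorer] = {
    "location_A": fraction_of("KR"),
    "location_B": fraction_of("LIVF"),
    "location_C": fraction_of("DE"),
}


def predict_locations(sequence: str, threshold: float = 0.3) -> List[str]:
    """Query every per-location model and keep each label whose score passes the threshold.

    Because each model decides on its own, a sequence can receive
    zero, one, or several locations (multi-label output).
    """
    seq = sequence.upper()
    return [label for label, scorer in MODELS.items() if scorer(seq) >= threshold]


if __name__ == "__main__":
    print(predict_locations("KRKRLLIV"))
```

Since each location has its own decision, a protein that resides in more than one compartment does not force the models to compete for a single label, which is the property a multi-location predictor needs.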

Suggestions

ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature
Dalkıran, Alperen; Rifaioğlu, Ahmet Süreyya; Dogan, Tunca; Atalay, Mehmet Volkan (2018-09-21)
Background: The automated prediction of the enzymatic functions of uncharacterized proteins is a crucial topic in bioinformatics. Although several methods and tools have been proposed to classify enzymes, most of these studies are limited to specific functional classes and levels of the Enzyme Commission (EC) number hierarchy. Besides, most of the previous methods incorporated only a single input feature type, which limits the applicability to the wide functional space. Here, we proposed a novel enzymatic f...
DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks
Rifaioğlu, Ahmet Süreyya; Martin, Maria Jesus; Atalay, Rengül; Atalay, Mehmet Volkan (2019-05-14)
Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to the prevention of overfitting and efficient training. Here, we propose DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, as a solution to Gene Ontology (GO) based protein function pred...
Learning functional properties of proteins with language models
Unsal, Serbulent; Atas, Heval; ALBAYRAK, MUAMMER; TURHAN, KEMAL; Acar, Aybar Can; DOĞAN, TUNCA (2022-03-01)
Data-centric approaches have been used to develop predictive methods for elucidating uncharacterized properties of proteins; however, studies indicate that these methods should be further improved to effectively solve critical problems in biomedicine and biotechnology, which can be achieved by better representing the data at hand. Novel data representation approaches mostly take inspiration from language models that have yielded ground-breaking improvements in natural language processing. Lately, these appr...
Distance-based Indexing of Residue Contacts for Protein Structure Retrieval and Alignment
Sacan, Ahmet; Toroslu, İsmail Hakkı; Ferhatosmanoglu, Hakan (2008-10-10)
New protein structures are continuously being determined with the hope of deriving insights into the function and mechanisms of proteins, and consequently, protein structure repositories are growing by leaps and bounds. However, we are still far from having the right methods for sensitive and effective use of the available structural data. The fact that current structural analysis tools are impractical for large-scale applications has given rise to several approaches that try to quickly identify candidate ...
Multi-view subcellular localization prediction of human proteins
Özsarı, Gökhan; Atalay, M. Volkan; Department of Computer Engineering (2019)
Determining the subcellular localization of proteins is crucial for understanding the functions of proteins, drug targeting, systems biology, and proteomics research. Experimental validation of subcellular localization is an expensive and challenging process. There exist several computational methods for automated prediction of protein subcellular localization; however, there is still room for better performance. Here, we propose a multi-view SVM-based approach that provides predictions for human proteins. ...
Citation Formats
G. Özsarı et al., “SLPred: a multi-view subcellular localization prediction tool for multi-location human proteins,” BIOINFORMATICS, vol. 38, no. 17, pp. 4226–4229, 2022. [Online]. Available: https://hdl.handle.net/11511/100343.