DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks

2019-05-14
Rifaioğlu, Ahmet Süreyya
Martin, Maria Jesus
Atalay, Rengül
Atalay, Mehmet Volkan
Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing, owing to techniques that prevent overfitting and make training efficient. Here, we propose DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, as a solution to Gene Ontology (GO) based protein function prediction. DEEPred was optimized through rigorous hyper-parameter tests, and benchmarked using three types of protein descriptors, training datasets of varying sizes, and GO terms from different levels. Furthermore, in order to explore how training with larger but potentially noisy data would change the performance, electronically made GO annotations were also included in the training process. The overall predictive performance of DEEPred was assessed using the CAFA2 and CAFA3 challenge datasets, in comparison with state-of-the-art protein function prediction methods. Finally, we evaluated selected novel annotations produced by DEEPred with a literature-based case study considering the 'biofilm formation process' in Pseudomonas aeruginosa. This study reports that deep learning algorithms have significant potential in protein function prediction, particularly when the source data is large. The neural network architecture of DEEPred can also be applied to the prediction of other types of ontological associations. The source code and all datasets used in this study are available at: https://github.com/cansyl/DEEPred.
SCIENTIFIC REPORTS
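
To make the phrase "stack of multi-task feed-forward networks" concrete, the following is a minimal sketch of the general idea and not the DEEPred implementation (see the repository linked above for that). It assumes each network shares one hidden layer across several GO terms and emits one sigmoid score per term, and that a stack is simply a list of such networks, each covering its own group of terms. The class name `MultiTaskNet`, the helper `predict_with_stack`, the layer sizes, the random untrained weights, and the GO term labels are all hypothetical.

```python
import math
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def sigmoid(value):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def dense(inputs, weights, biases):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights, used here only for illustration."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """One feed-forward network with a shared hidden layer and one
    sigmoid output per GO term (the "tasks"). Hypothetical sketch."""

    def __init__(self, n_features, n_hidden, go_terms, seed=0):
        rng = random.Random(seed)
        self.go_terms = list(go_terms)
        self.hidden = init_layer(n_features, n_hidden, rng)
        self.output = init_layer(n_hidden, len(self.go_terms), rng)

    def predict(self, features):
        """Return a score in (0, 1) for every GO term of this network."""
        shared = relu(dense(features, *self.hidden))
        logits = dense(shared, *self.output)
        return {term: sigmoid(z) for term, z in zip(self.go_terms, logits)}


def predict_with_stack(models, features, threshold=0.5):
    """Run every network in the stack and keep terms scoring >= threshold."""
    annotations = []
    for model in models:
        for term, score in model.predict(features).items():
            if score >= threshold:
                annotations.append(term)
    return annotations


if __name__ == "__main__":
    # Hypothetical groups of GO terms, one network per group.
    stack = [
        MultiTaskNet(4, 8, ["GO:term_a", "GO:term_b"], seed=1),
        MultiTaskNet(4, 8, ["GO:term_c"], seed=2),
    ]
    descriptor = [0.2, 0.0, 0.7, 0.1]  # made-up protein feature vector
    print(predict_with_stack(stack, descriptor))
```

Because the weights are random, the printed annotations carry no biological meaning; the sketch only shows how several output tasks share one hidden representation and how predictions from several networks can be pooled.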

Citation Formats
A. S. Rifaioğlu, M. J. Martin, R. Atalay, and M. V. Atalay, “DEEPred: Automated Protein Function Prediction with Multi-task Feed-forward Deep Neural Networks,” SCIENTIFIC REPORTS, 2019. [Online]. Available: https://hdl.handle.net/11511/30761.