Multi-task Deep Neural Networks in Protein Function Prediction

Date

2017-05-01

Author

Rifaioğlu, Ahmet Süreyya
Doğan, Tunca
Martin, Maria Jesus
Atalay, Rengül
Atalay, Mehmet Volkan


In recent years, deep learning algorithms have outperformed state-of-the-art methods in several areas, thanks to efficient methods for training and for preventing overfitting, advances in computer hardware, and the availability of vast amounts of data. The high performance of multi-task deep neural networks in drug discovery has drawn attention to deep learning algorithms in bioinformatics. Here, we proposed a hierarchical multi-task deep neural network architecture based on Gene Ontology (GO) terms as a solution to the protein function prediction problem, and investigated various aspects of the proposed architecture through several experiments. First, we showed that there is a positive correlation between the performance of the system and the size of the training datasets. Second, we investigated whether the level of a GO term in the GO hierarchy is related to its performance, and found no relation between the depth of GO terms in the hierarchy and their performance. In addition, we included all annotations in the training of a set of GO terms to investigate whether adding noisy data to the training datasets changes the performance of the system. The results showed that including less reliable annotations in the training of deep neural networks significantly increased the performance of the low-performing GO terms. We evaluated the performance of the system using a hierarchical evaluation method. The Matthews correlation coefficient was 0.75, 0.49 and 0.63 for the molecular function, biological process and cellular component categories, respectively. These results indicate that deep learning algorithms have great potential in protein function prediction. We plan to further improve DEEPred by including other types of annotations from various biological data sources, and to make it available as an open access online tool.
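
The abstract names two technical ingredients: a multi-task network that scores several GO terms at once, and the Matthews correlation coefficient (MCC) used to report performance. The following is a minimal, generic sketch of both, and not the authors' implementation. The abstract gives no architectural details, so the layer sizes, the ReLU and sigmoid activations, the weights, and the GO term labels below are all illustrative assumptions. The hierarchical evaluation method mentioned in the abstract is not reproduced here; only the plain binary MCC is computed.

```python
import math
import random


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary (0/1) labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def multitask_forward(x, w_shared, w_heads):
    """One shared ReLU hidden layer, then one sigmoid output per task.

    Each row of w_heads is a separate task head (here: one GO term),
    and all heads read the same shared hidden representation.
    """
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_shared]
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_heads]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Hypothetical setup: three GO terms predicted from a 4-dimensional feature vector.
rng = random.Random(0)
go_terms = ["GO:term_a", "GO:term_b", "GO:term_c"]  # placeholder labels, not real GO IDs
n_features, n_hidden = 4, 5
w_shared = [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_hidden)]
w_heads = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in go_terms]

scores = multitask_forward([0.2, 0.0, 1.0, 0.5], w_shared, w_heads)
for term, score in zip(go_terms, scores):
    print(term, round(score, 3))

# MCC for one task, given true labels and thresholded predictions for four proteins.
print(matthews_corrcoef([1, 1, 0, 0], [1, 0, 0, 0]))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction). Because it uses all four cells of the confusion matrix, it remains informative when positive annotations for a GO term are much rarer than negatives.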

Subject Keywords

Gene Ontology, Multi-task deep neural networks, Deep Learning, Protein Function Prediction

URI

https://arxiv.org/abs/1705.04802
https://hdl.handle.net/11511/86536
https://arxiv.org/pdf/1705.04802.pdf

Journal

arXiv

Collections

Graduate School of Informatics, Article


Citation Formats

A. S. Rifaioğlu, T. Doğan, M. J. Martin, R. Atalay, and M. V. Atalay, “Multi-task Deep Neural Networks in Protein Function Prediction,” arXiv, pp. 1–19, 2017, Accessed: 00, 2021. [Online]. Available: https://arxiv.org/abs/1705.04802.