PREDICTING MULTIPLE TYPES OF BIOLOGICAL RELATIONSHIPS WITH INTEGRATIVE NON-NEGATIVE MATRIX FACTORIZATION
Date
2022-05-09
Author
KARTLI, Onur Savaş
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Integrative research on multi-modal biological data is difficult because the data are complex and diverse in structure. A critical issue in bioinformatics and computational biology is that many relationships between biological components and concepts (e.g., genes, proteins, drugs, and diseases) remain unknown, because the wet-lab experiments that uncover them are costly and time-consuming. This thesis aims to predict unknown relationships in biological data by leveraging documented protein-protein, drug-target, gene-disease, and drug-side effect associations. First, biological datasets are obtained from the UniProt, STRING, STITCH, SIDER, DrugBank, DrugCentral, DisGeNET, and KEGG databases, and their relationships are extracted and reformatted as multiple pairwise relationship matrices. Some of these matrices contain continuous values that serve as association weights. The resulting matrices are highly sparse, mainly because of the large amount of missing data in biological databases. Second, missing relationships are predicted via integrative matrix factorization, using the non-negative matrix tri-factorization algorithm, which has been shown in the literature to solve similar problems successfully. A prediction model is trained and evaluated using both classification- and regression-based metrics. Subsequently, large-scale prediction of pairwise relationships between proteins, drugs, diseases, and side effects is carried out using the optimized model. We obtained new predictions for drug-side effect, drug-disease, drug-target protein, and gene/protein-disease interactions. We evaluated the 250 highest-scoring predictions and validated selected ones against the literature. We hope that the results of this thesis will help life scientists plan experimental work by providing preliminary sets of biological associations.
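To make the factorization step concrete, the following is a minimal sketch of non-negative matrix tri-factorization on a single relationship matrix, followed by ranking of the currently unobserved pairs by reconstructed score. It is not the thesis implementation: the function names `nmtf` and `top_unknown_pairs`, the plain multiplicative update rules, the rank choices, and the toy matrix are all assumptions made for illustration. The integrative variant described in the abstract factorizes several relationship matrices jointly, which this sketch does not attempt.

```python
import numpy as np


def nmtf(R, k1, k2, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative matrix R (n x m) as G1 @ S @ G2.T.

    G1 (n x k1) and G2 (m x k2) are low-rank factors for the row and
    column entities; S (k1 x k2) links the two latent spaces.
    Returns the factors and the Frobenius error before each update round
    plus the final error.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    G1 = rng.random((n, k1)) + eps
    S = rng.random((k1, k2)) + eps
    G2 = rng.random((m, k2)) + eps
    errors = [np.linalg.norm(R - G1 @ S @ G2.T)]
    for _ in range(n_iter):
        # Multiplicative updates: each factor is scaled by a ratio of
        # non-negative terms, so it stays non-negative throughout.
        G1 *= (R @ G2 @ S.T) / (G1 @ S @ G2.T @ G2 @ S.T + eps)
        G2 *= (R.T @ G1 @ S) / (G2 @ S.T @ G1.T @ G1 @ S + eps)
        S *= (G1.T @ R @ G2) / (G1.T @ G1 @ S @ G2.T @ G2 + eps)
        errors.append(np.linalg.norm(R - G1 @ S @ G2.T))
    return G1, S, G2, errors


def top_unknown_pairs(R, R_hat, k):
    """Return the k highest-scoring (row, col, score) triples among
    entries that are zero (unobserved) in R."""
    rows, cols = np.nonzero(R == 0)
    scores = R_hat[rows, cols]
    order = np.argsort(scores)[::-1][:k]
    return [(int(rows[i]), int(cols[i]), float(scores[i])) for i in order]


# Toy, made-up 6 x 5 association matrix (rows and columns are
# hypothetical entities, e.g. drugs and targets); 1 = documented pair.
R_toy = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

G1, S, G2, errors = nmtf(R_toy, k1=2, k2=2)
R_hat = G1 @ S @ G2.T
for row, col, score in top_unknown_pairs(R_toy, R_hat, k=3):
    print(f"candidate pair ({row}, {col}) score={score:.3f}")
```

Treating every zero as a candidate is a simplification: in a sparse association matrix a zero may mean "unknown" rather than "absent", which is why the reconstructed scores for zero entries can be read as predictions at all. A fuller treatment would likely mask held-out entries during training so that the classification and regression metrics mentioned above can be computed on them.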
Subject Keywords
Non-negative matrix factorization, multi-relational data, drug-target interactions, drug-side effects relationships, gene-disease associations
URI
https://hdl.handle.net/11511/97348
Collections
Graduate School of Informatics, Thesis
Citation (IEEE)
O. S. KARTLI, “PREDICTING MULTIPLE TYPES OF BIOLOGICAL RELATIONSHIPS WITH INTEGRATIVE NON-NEGATIVE MATRIX FACTORIZATION,” M.S. - Master of Science, Middle East Technical University, 2022.