Loop-based conic multivariate adaptive regression splines is a novel method for advanced construction of complex biological networks

2018-11-01
Ayyıldız Demirci, Ezgi
Purutçuoğlu Gazi, Vilda
Weber, Gerhard Wilhelm
The Gaussian Graphical Model (GGM) and its Bayesian alternative, the Gaussian copula graphical model (GCGM), are two widely used approaches for constructing undirected networks of biological systems. They define the interactions between species through the conditional dependencies implied by the multivariate normality assumption. However, when the system's dimension is high, fitting these models becomes computationally demanding, and the accuracy of GGM in particular decreases when the observations are far from normality. Here, we suggest Conic Multivariate Adaptive Regression Splines (CMARS) as an alternative to GGM and GCGM to ameliorate both problems. CMARS is a modified version of Multivariate Adaptive Regression Splines, a well-known modeling approach used in Operational Research (OR) to represent biological, environmental, and economic data. The main benefit of this model is its compatibility with high-dimensional, correlated, and strongly nonlinear measurements, which allows for a wide field of application. We adapted CMARS to describe biological systems and called it "LCMARS" due to its loop-based description. We then applied LCMARS to simulated and real datasets, where it produced more accurate results than GGM and GCGM. The ability to use LCMARS in the description of biological networks thus has the potential to open up new avenues in the application of OR to computational biology and bioinformatics, and can help us better understand complex diseases such as cancer and hepatitis.
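To make the GGM idea in the abstract concrete, the following minimal sketch shows how conditional dependencies can be read from the inverse covariance (precision) matrix of Gaussian data. It illustrates the baseline GGM only, not LCMARS. The function name `partial_correlations`, the simulated three-variable chain, and the 0.1 edge threshold are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np


def partial_correlations(samples):
    """Estimate partial correlations from an (n_samples, n_variables) array.

    Hypothetical helper: inverts the sample covariance and rescales the
    precision matrix so that entry (i, j) is the correlation between
    variables i and j given all remaining variables.
    """
    covariance = np.cov(samples, rowvar=False)
    precision = np.linalg.inv(covariance)
    scale = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(scale, scale)
    np.fill_diagonal(partial, 1.0)
    return partial


# Simulated chain x0 -> x1 -> x2 (illustrative data, not from the paper):
# x0 and x2 are correlated, but conditionally independent given x1.
rng = np.random.default_rng(0)
n = 5000
x0 = rng.normal(size=n)
x1 = 0.8 * x0 + rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
samples = np.column_stack([x0, x1, x2])

pcor = partial_correlations(samples)

# An edge is drawn where the partial correlation is clearly nonzero.
# The 0.1 cutoff is an arbitrary choice for this sketch.
edges = np.abs(pcor) > 0.1
print(np.round(pcor, 2))
print(edges)
```

In this sketch the edge between the first and third variables should be absent, since their dependence is fully explained by the middle variable. Direct matrix inversion of this kind is what becomes costly or unstable in high dimensions, which is one of the two problems the abstract attributes to GGM.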
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH

Suggestions

Semi-Bayesian Inference of Time Series Chain Graphical Models in Biological Networks
Farnoudkia, Hajar; Purutçuoğlu Gazi, Vilda (2018-09-20)
The construction of biological networks via time-course datasets can be performed with both deterministic models, such as ordinary differential equations, and stochastic models, such as the diffusion approximation. Of these two branches, the former has wider application since more data can be available. In this study, we particularly deal with the probabilistic approaches for the steady-state or deterministic description of biological systems when the systems are observed through time. Hence, we consider time s...
Bernstein approximations in glasso-based estimation of biological networks
Purutçuoğlu Gazi, Vilda; Wit, Ernst (2017-03-01)
The Gaussian graphical model (GGM) is one of the common dynamic modelling approaches in the construction of gene networks. In inference with this model, the interactions between genes can be detected mainly via graphical lasso (glasso) or coordinate descent-based approaches. Although these methods are successful in moderate networks, their accuracy decreases when the system becomes sparser. We here implement a particular type of polynomial transformation, called the Bernstein polynomials, of...
Novel model selection criteria on high dimensional biological networks
Bülbül, Gül Baha; Purutçuoğlu Gazi, Vilda; Department of Statistics (2019)
The Gaussian graphical model (GGM) is a useful tool to describe the undirected associations among the genes in a sparse biological network. To infer such high dimensional biological networks, the l1-penalized maximum-likelihood estimation method is used. This approach performs a variable selection procedure by using a regularization parameter which controls the sparsity in the network. Thus, the selection of the regularization parameter becomes crucial to define the true interactions in the biological ne...
TRACEMIN-Fiedler: A Parallel Algorithm for Computing the Fiedler Vector
Manguoğlu, Murat; Saied, Faisal; Sameh, Ahmed (2010-06-25)
The eigenvector corresponding to the second smallest eigenvalue of the Laplacian of a graph, known as the Fiedler vector, has a number of applications in areas that include matrix reordering, graph partitioning, protein analysis, data mining, machine learning, and web search. The computation of the Fiedler vector has been regarded as an expensive process as it involves solving a large eigenvalue problem. We present a novel and efficient parallel algorithm for computing the Fiedler vector of large graphs bas...
Hybrid wavelet-neural network models for time series data
Kılıç, Deniz Kenan; Uğur, Ömür; Department of Financial Mathematics (2021-3-3)
The thesis aims to combine wavelet theory with nonlinear models, particularly neural networks, to find an appropriate time series model structure. Data such as financial time series are nonstationary, noisy, and chaotic. Therefore, wavelet analysis helps to model them better in terms of both frequency and time. S&P500 (^GSPC) and NASDAQ (^IXIC) data are divided into several components by using multiresolution analysis (MRA). Subsequently, each part is modeled by using a suitable neural network structure. ...
Citation Formats
E. Ayyıldız Demirci, V. Purutçuoğlu Gazi, and G. W. Weber, “Loop-based conic multivariate adaptive regression splines is a novel method for advanced construction of complex biological networks,” EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, pp. 852–861, 2018, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/43832.