Novel model selection criteria on high dimensional biological networks

Bülbül, Gül Baha
The Gaussian graphical model (GGM) is a useful tool for describing the undirected associations among the genes in a sparse biological network. To infer such high-dimensional biological networks, the l1-penalized maximum-likelihood estimation method is used. This approach performs variable selection through a regularization parameter that controls the sparsity of the network. Thus, the selection of the regularization parameter becomes crucial for identifying the true interactions in biological networks. In this sense, we suggest combining information-theoretic measures such as CAIC, CAICF and ICOMP with a penalized likelihood approach in order to recover the true graph. In addition, loop-based multivariate adaptive regression splines (LMARS) can be used as a nonparametric modelling technique that deals well with the nonlinearity and collinearity that arise in data from high-dimensional networks. In this study, we modify the model selection procedure of LMARS, which was originally introduced with generalized cross-validation as its model selection technique, by applying our measures to find the correct structure.
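
As a rough illustration of how an information criterion can guide the choice of the regularization parameter, the sketch below scores candidate precision matrices with CAIC, taken here as -2 log L + k (log n + 1). It is not the thesis's implementation. The function names, the `candidates` dictionary, the zero-mean Gaussian likelihood and the parameter count (diagonal entries plus nonzero edges) are all assumptions made for this example. In practice each candidate would come from an l1-penalized fit at a different penalty value. CAICF and ICOMP, which add further terms to the penalty, are not shown.

```python
import numpy as np


def gaussian_loglik(theta, S, n):
    """Log-likelihood of centered Gaussian data with precision matrix theta.

    S is the sample covariance of n observations (assumed form).
    """
    p = S.shape[0]
    sign, logdet = np.linalg.slogdet(theta)
    if sign <= 0:
        raise ValueError("precision matrix must be positive definite")
    return 0.5 * n * (logdet - np.trace(S @ theta) - p * np.log(2 * np.pi))


def caic(theta, S, n, tol=1e-8):
    """CAIC = -2 log L + k (log n + 1).

    k counts the diagonal entries plus the nonzero off-diagonal pairs
    (edges); this parameter count is an assumption of the sketch.
    """
    p = theta.shape[0]
    upper = theta[np.triu_indices(p, k=1)]
    k = p + int((np.abs(upper) > tol).sum())
    return -2.0 * gaussian_loglik(theta, S, n) + k * (np.log(n) + 1.0)


def select_by_caic(candidates, S, n):
    """Return the penalty value whose estimate has the smallest CAIC.

    candidates is a hypothetical dict: penalty value -> precision matrix.
    """
    scores = {lam: caic(theta, S, n) for lam, theta in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    n = 50
    S = np.eye(3)                      # toy sample covariance
    sparse = np.eye(3)                 # estimate with no edges
    denser = np.eye(3)
    denser[0, 1] = denser[1, 0] = 0.1  # estimate with one edge
    best, scores = select_by_caic({0.5: sparse, 0.1: denser}, S, n)
    print("selected penalty:", best)
```

The same loop works with any other criterion of the form "lack of fit plus complexity penalty" by replacing the `caic` function.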


Loop-based conic multivariate adaptive regression splines is a novel method for advanced construction of complex biological networks
Ayyıldız Demirci, Ezgi; Purutçuoğlu Gazi, Vilda; Weber, Gerhard Wilhelm (2018-11-01)
The Gaussian Graphical Model (GGM) and its Bayesian alternative, called, the Gaussian copula graphical model (GCGM) are two widely used approaches to construct the undirected networks of biological systems. They define the interactions between species by using the conditional dependencies of the multivariate normality assumption. However, when the system's dimension is high, the performance of the model becomes computationally demanding, and, particularly, the accuracy of GGM decreases when the observations...
Transformations of Data in Deterministic Modelling of Biological Networks
Agraz, Melih; Purutçuoğlu Gazi, Vilda (2015-05-21)
The Gaussian graphical model (GGM) is a probabilistic modelling approach used in the system biology to represent the relationship between genes with an undirected graph. In graphical models, the genes and their interactions are denoted by nodes and the edges between nodes. Hereby, in this model, it is assumed that the structure of the system can be described by the inverse of the covariance matrix, Theta, which is also called as the precision, when the observations are formulated via a lasso regression unde...
Different types of Bernstein operators in inference of Gaussian graphical model
Agraz, Melih; Purutçuoğlu Gazi, Vilda (2016-01-01)
The Gaussian graphical model (GGM) is a powerful tool to describe the relationship between the nodes via the inverse of the covariance matrix in a complex biological system. But the inference of this matrix is problematic because of its high dimension and sparsity. From previous analyses, it has been shown that the Bernstein and Szasz polynomials can improve the accuracy of the estimate if they are used in advance of the inference as a processing step of the data. Hereby in this study, we consider whether a...
Novel model selection criteria for LMARS: MARS designed for biological networks
Bulbul, Gul Bahar; Purutçuoğlu Gazi, Vilda (2021-03-01)
In higher dimensions, the loop-based multivariate adaptive regression splines (LMARS) model is used to build sparse and complex gene structure nonparametrically by correctly defining its interactions in the network. Also, it prefers to apply the generalized cross-validation (GCV) value as its original model selection criterion in order to select the best model, in turn, represent the true network structure. In this study, we suggest to modify the model selection procedure of LMARS by changing GCV with our K...
Comparison of two inference approaches in Gaussian graphical models
Purutçuoğlu Gazi, Vilda; Wit, Ernst (Walter de Gruyter GmbH, 2017-04-01)
Introduction: The Gaussian Graphical Model (GGM) is one of the well-known probabilistic models which is based on the conditional independency of nodes in the biological system. Here, we compare the estimates of the GGM parameters by the graphical lasso (glasso) method and the threshold gradient descent (TGD) algorithm.
Citation Formats
G. B. Bülbül, “Novel model selection criteria on high dimensional biological networks,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Statistics., Middle East Technical University, 2019.