Novel model selection criteria on high dimensional biological networks

Bülbül, Gül Baha
The Gaussian graphical model (GGM) is a useful tool for describing the undirected associations among genes in a sparse biological network. To infer such high-dimensional biological networks, the l1-penalized maximum-likelihood estimation method is used. This approach performs variable selection by means of a regularization parameter that controls the sparsity of the network. Thus, the selection of the regularization parameter becomes crucial for identifying the true interactions in biological networks. In this sense, we suggest combining information-theoretic measures such as CAIC, CAICF and ICOMP with a penalized likelihood approach in order to recover the true graph. In addition, loop-based multivariate adaptive regression splines (LMARS) can be used as a nonparametric modelling technique that handles the nonlinearity and collinearity problems arising in high-dimensional networks. In this study, we modify the model selection procedure of LMARS by applying our measures to find the correct structure, whereas the method was originally introduced with generalized cross-validation as its model selection technique.
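The idea of choosing the regularization parameter with an information criterion can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a plain ADMM solver for the l1-penalized Gaussian likelihood (a stand-in for whatever solver the thesis uses), Bozdogan's CAIC in the form -2 log L + k (log n + 1), and a parameter count k equal to the number of diagonal entries plus the number of nonzero edges. The function names `glasso_admm`, `caic` and `select_alpha`, the grid of penalty values, and the simulated chain network are all hypothetical. CAICF and ICOMP are left out because they additionally require the Fisher information matrix.

```python
import numpy as np


def glasso_admm(S, alpha, rho=1.0, n_iter=200):
    """Sketch of an ADMM solver for the l1-penalized Gaussian MLE.

    Minimizes -logdet(Theta) + tr(S Theta) + alpha * sum_{i != j} |Theta_ij|.
    Returns the sparse iterate Z (exact zeros off the diagonal).
    """
    p = S.shape[0]
    Z = np.eye(p)
    U = np.zeros((p, p))
    off = ~np.eye(p, dtype=bool)
    for _ in range(n_iter):
        # Theta-step: closed form through an eigendecomposition
        w, Q = np.linalg.eigh(rho * (Z - U) - S)
        theta_eig = (w + np.sqrt(w ** 2 + 4.0 * rho)) / (2.0 * rho)
        Theta = (Q * theta_eig) @ Q.T
        # Z-step: soft-threshold the off-diagonal entries only
        A = Theta + U
        Z = A.copy()
        Z[off] = np.sign(A[off]) * np.maximum(np.abs(A[off]) - alpha / rho, 0.0)
        # Dual update
        U += Theta - Z
    return Z


def caic(Theta, S, n):
    """CAIC = -2 log L + k (log n + 1) for a zero-mean Gaussian model."""
    p = S.shape[0]
    sign, logdet = np.linalg.slogdet(Theta)
    if sign <= 0:  # not positive definite: treat as an invalid model
        return np.inf
    loglik = 0.5 * n * (logdet - np.trace(S @ Theta) - p * np.log(2.0 * np.pi))
    n_edges = np.count_nonzero(np.triu(Theta, k=1))
    k = p + n_edges  # assumption: diagonal entries plus estimated edges
    return -2.0 * loglik + k * (np.log(n) + 1.0)


def select_alpha(X, alphas):
    """Fit one penalized model per alpha and keep the one with the lowest CAIC."""
    n = X.shape[0]
    S = np.cov(X, rowvar=False, bias=True)  # MLE covariance of centered data
    scores = [caic(glasso_admm(S, a), S, n) for a in alphas]
    return alphas[int(np.argmin(scores))], scores


# Toy data: a chain network of 10 "genes" (hypothetical example)
rng = np.random.default_rng(0)
p, n = 10, 200
true_prec = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(true_prec), size=n)

alphas = [0.01, 0.02, 0.05, 0.1, 0.2, 0.4]
best_alpha, scores = select_alpha(X, alphas)
print("selected alpha:", best_alpha)
```

A small penalty yields a dense graph with a high likelihood but many parameters, while a large penalty yields a sparse graph with a lower likelihood. The criterion trades these off, which is the role the abstract assigns to CAIC, CAICF and ICOMP.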


Loop-based conic multivariate adaptive regression splines is a novel method for advanced construction of complex biological networks
Ayyıldız Demirci, Ezgi; Purutçuoğlu Gazi, Vilda; Weber, Gerhard Wilhelm (2018-11-01)
The Gaussian Graphical Model (GGM) and its Bayesian alternative, called, the Gaussian copula graphical model (GCGM) are two widely used approaches to construct the undirected networks of biological systems. They define the interactions between species by using the conditional dependencies of the multivariate normality assumption. However, when the system's dimension is high, the performance of the model becomes computationally demanding, and, particularly, the accuracy of GGM decreases when the observations...
Transformations of Data in Deterministic Modelling of Biological Networks
Agraz, Melih; Purutçuoğlu Gazi, Vilda (2015-05-21)
The Gaussian graphical model (GGM) is a probabilistic modelling approach used in the system biology to represent the relationship between genes with an undirected graph. In graphical models, the genes and their interactions are denoted by nodes and the edges between nodes. Hereby, in this model, it is assumed that the structure of the system can be described by the inverse of the covariance matrix, Theta, which is also called as the precision, when the observations are formulated via a lasso regression unde...
Bernstein approximations in glasso-based estimation of biological networks
Purutçuoğlu Gazi, Vilda; Wit, Ernst (2017-03-01)
The Gaussian graphical model (GGM) is one of the common dynamic modelling approaches in the construction of gene networks. In inference of this modelling the interaction between genes can be detected mainly via graphical lasso (glasso) or coordinate descent-based approaches. Although these methods are successful in moderate networks, their performances in accuracy decrease when the system becomes sparser. We here implement a particular type of polynomial transformations, called the Bernstein polynomials, of...
Novel model selection criteria for LMARS: MARS designed for biological networks
Bulbul, Gul Bahar; Purutçuoğlu Gazi, Vilda (2021-03-01)
In higher dimensions, the loop-based multivariate adaptive regression splines (LMARS) model is used to build sparse and complex gene structure nonparametrically by correctly defining its interactions in the network. Also, it prefers to apply the generalized cross-validation (GCV) value as its original model selection criterion in order to select the best model, in turn, represent the true network structure. In this study, we suggest to modify the model selection procedure of LMARS by changing GCV with our K...
Comparison of two inference approaches in Gaussian graphical models
Purutçuoğlu Gazi, Vilda; Wit, Ernst (Walter de Gruyter GmbH, 2017-04-01)
Introduction: The Gaussian Graphical Model (GGM) is one of the well-known probabilistic models which is based on the conditional independency of nodes in the biological system. Here, we compare the estimates of the GGM parameters by the graphical lasso (glasso) method and the threshold gradient descent (TGD) algorithm.
Citation Formats
G. B. Bülbül, “Novel model selection criteria on high dimensional biological networks,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Statistics., Middle East Technical University, 2019.