Transformations of Data in Deterministic Modelling of Biological Networks

2015-05-21
The Gaussian graphical model (GGM) is a probabilistic modelling approach used in systems biology to represent the relationships between genes by an undirected graph. In graphical models, the genes are denoted by nodes and their interactions by the edges between nodes. The model assumes that the structure of the system can be described by the inverse of the covariance matrix, Theta, also called the precision matrix, when the observations are formulated via a lasso regression under the assumption that the states are multivariate normal. There are several approaches to estimating Theta in GGM; the best known are the neighborhood selection algorithm and the graphical lasso (glasso). Multivariate adaptive regression splines (MARS), on the other hand, is a non-parametric regression technique that can successfully model nonlinear and highly dependent data. Previous simulation studies have found that MARS can be a strong alternative to GGM if the model is constructed similarly to a lasso model and the interaction terms in the optimal model are ignored, so that the results are comparable with the GGM findings. Moreover, the major challenge in both modelling approaches has been found to be the high sparsity of Theta due to the possible non-linear interactions between genes, in particular when the dimensions of the networks are realistically large. As the novelty of this study, we propose applying Bernstein operators, namely Bernstein and Szasz polynomials, to the raw data before any lasso-type modelling and the associated inference approaches, because our findings via GGM with small and moderately large systems show that the Bernstein polynomials can increase the accuracy of the estimates. Hence, in this work, we first apply these operators to the best-known inference approaches used in GGM under realistically large networks. Then, we assess these transformations for MARS modelling as the alternative to GGM, again under the same large complexity. In this way, we aim to propose these transformation techniques for all sorts of modelling under the steady-state condition of protein-protein interaction networks in order to get more accurate estimates without additional computational cost. In the evaluation of the results, we compare the precision and F-measures of the simulated datasets.
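The evaluation step mentioned above can be illustrated with a minimal sketch, assuming that an edge is any off-diagonal entry of Theta whose absolute value exceeds a small tolerance. The function names `support` and `precision_and_f_measure` and the tolerance `tol` are hypothetical and are not taken from the study.

```python
import numpy as np


def support(theta, tol=1e-8):
    """Boolean adjacency matrix: off-diagonal entries with |value| > tol (assumed rule)."""
    adj = np.abs(np.asarray(theta, dtype=float)) > tol
    np.fill_diagonal(adj, False)  # the diagonal carries no edge information
    return adj


def precision_and_f_measure(theta_true, theta_hat, tol=1e-8):
    """Compare estimated edges with true edges of an undirected graph."""
    a_true = support(theta_true, tol)
    a_hat = support(theta_hat, tol)
    # Each undirected edge is counted once, via the strict upper triangle.
    upper = np.triu_indices_from(a_true, k=1)
    t, h = a_true[upper], a_hat[upper]
    tp = int(np.sum(t & h))    # edges present in both
    fp = int(np.sum(~t & h))   # estimated but not true
    fn = int(np.sum(t & ~h))   # true but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, f_measure


# Toy 4-node example: true edges (0,1), (1,2), (2,3); estimate has (0,1), (1,2), (0,3).
theta_true = np.array([[1.0, 0.4, 0.0, 0.0],
                       [0.4, 1.0, 0.3, 0.0],
                       [0.0, 0.3, 1.0, 0.2],
                       [0.0, 0.0, 0.2, 1.0]])
theta_hat = np.array([[1.0, 0.5, 0.0, 0.1],
                      [0.5, 1.0, 0.2, 0.0],
                      [0.0, 0.2, 1.0, 0.0],
                      [0.1, 0.0, 0.0, 1.0]])
prec, f1 = precision_and_f_measure(theta_true, theta_hat)
```

In this toy case two of the three estimated edges are true and one true edge is missed, so precision, recall and F-measure all equal 2/3. In practice the estimate of Theta would come from glasso, neighborhood selection or a MARS-based model fitted to the transformed data.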
3rd International Conference on Applied Mathematics and Approximation Theory (AMAT), (18 - 21 May 2015)

Suggestions

Novel model selection criteria on high dimensional biological networks
Bülbül, Gül Baha; Purutçuoğlu Gazi, Vilda; Department of Statistics (2019)
Gaussian graphical model (GGM) is a useful tool to describe the undirected associations among the genes in the sparse biological network. To infer such high dimensional biological networks, the l1-penalized maximum-likelihood estimation method is used. This approach performs a variable selection procedure by using a regularization parameter which controls the sparsity in the network. Thus, a selection of the regularization parameter becomes crucial to define the true interactions in the biological ne...
TRACEMIN Fiedler A Parallel Algorithm for Computing the Fiedler Vector
Manguoğlu, Murat; Saied, Faisal; Sameh, Ahmed (2010-06-25)
The eigenvector corresponding to the second smallest eigenvalue of the Laplacian of a graph, known as the Fiedler vector, has a number of applications in areas that include matrix reordering, graph partitioning, protein analysis, data mining, machine learning, and web search. The computation of the Fiedler vector has been regarded as an expensive process as it involves solving a large eigenvalue problem. We present a novel and efficient parallel algorithm for computing the Fiedler vector of large graphs bas...
Application of ODSA to population calculation
Ulukaya, Mustafa; Demirbaş, Kerim; Department of Electrical and Electronics Engineering (2006)
In this thesis, the Optimum Decoding-based Smoothing Algorithm (ODSA) is applied to the well-known discrete Lotka-Volterra model. The performance of the algorithm is investigated for various parameters by simulations. Moreover, ODSA is compared with the SIR Particle Filter Algorithm. The advantages and disadvantages of both algorithms are presented.
Bernstein approximations in glasso-based estimation of biological networks
Purutçuoğlu Gazi, Vilda; Wit, Ernst (2017-03-01)
The Gaussian graphical model (GGM) is one of the common dynamic modelling approaches in the construction of gene networks. In inference for this model, the interactions between genes can be detected mainly via graphical lasso (glasso) or coordinate descent-based approaches. Although these methods are successful in moderate networks, their accuracy decreases when the system becomes sparser. We here implement a particular type of polynomial transformations, called the Bernstein polynomials, of...
Comparison of two inference approaches in Gaussian graphical models
Purutçuoğlu Gazi, Vilda; Wit, Ernst (Walter de Gruyter GmbH, 2017-04-01)
Introduction: The Gaussian Graphical Model (GGM) is one of the well-known probabilistic models and is based on the conditional independence of nodes in the biological system. Here, we compare the estimates of the GGM parameters obtained by the graphical lasso (glasso) method and the threshold gradient descent (TGD) algorithm.
Citation Formats
M. Agraz and V. Purutçuoğlu Gazi, “Transformations of Data in Deterministic Modelling of Biological Networks,” Ankara, Türkiye, 2015, vol. 441, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/35602.