Modeling of biochemical networks via classification and regression tree methods

2018-08-01
Seçilmiş, Deniz
Purutçuoğlu Gazi, Vilda
A number of modeling approaches, built on different assumptions, have been suggested for describing biological networks. The major problems for these models and their associated inference approaches are the complexity of biological systems, which results in a high number of model parameters, the few observations available for each variable in the system, the sparse structure of these systems, and the high correlation between model parameters. Recent studies have shown that nonparametric methods can ameliorate these challenges and are a strong alternative. Furthermore, not only regression-type nonparametric models but also nonparametric clustering methods whose calculations are adapted to biochemical systems appear to be a promising choice. In this study, we therefore propose the classification and regression tree (CART) method as a new approach for constructing complex systems when the system’s activity is described under its steady-state condition. Basically, CART is a classification technique for highly correlated data and can be represented as the nonparametric version of the generalized additive model. In this work, we use CART to construct biological modules and then networks. We analyze the performance of CART comprehensively under various Monte Carlo scenarios, such as different data distributions and dimensions, and compare our results with the outputs of the Gaussian graphical model (GGM), which is the best-known model under the given condition of the system. We also compare the performance of CART with the GGM findings on real systems. For this purpose, we choose the pathways that play a crucial role in cervical cancer. We consider this particular illness because it is the second most common cancer type in women, after breast cancer, both in Turkey and in the world, and because only limited information is available for describing this complex system disease.
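The abstract does not spell out how CART is turned into a network, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes a node-wise scheme: each variable is regressed on all the others with a small regression tree, and an undirected edge is drawn between a variable and every predictor the tree splits on. The function names, the `min_gain` threshold, the depth limit, and the simulated chain network are all hypothetical choices made for illustration.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (CART regression impurity)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, target, predictors):
    """Return (gain, feature, threshold) of the split that most reduces SSE."""
    parent = sse([r[target] for r in rows])
    best = (0.0, None, None)
    for f in predictors:
        levels = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(levels, levels[1:]):
            thr = (lo + hi) / 2.0
            left = [r[target] for r in rows if r[f] <= thr]
            right = [r[target] for r in rows if r[f] > thr]
            gain = parent - sse(left) - sse(right)
            if gain > best[0]:
                best = (gain, f, thr)
    return best

def split_features(rows, target, predictors, root_sse, min_gain, depth):
    """Grow a depth-limited regression tree; collect the predictors it splits on."""
    if depth == 0 or len(rows) < 4 or root_sse == 0.0:
        return set()
    gain, f, thr = best_split(rows, target, predictors)
    # Assumption: a split counts only if it removes at least `min_gain`
    # of the target's total (root) sum of squares.
    if f is None or gain / root_sse < min_gain:
        return set()
    left = [r for r in rows if r[f] <= thr]
    right = [r for r in rows if r[f] > thr]
    used = {f}
    used |= split_features(left, target, predictors, root_sse, min_gain, depth - 1)
    used |= split_features(right, target, predictors, root_sse, min_gain, depth - 1)
    return used

def infer_edges(rows, min_gain=0.2, max_depth=2):
    """Node-wise trees: link each target to the predictors its tree uses."""
    p = len(rows[0])
    edges = set()
    for target in range(p):
        predictors = [j for j in range(p) if j != target]
        root = sse([r[target] for r in rows])
        for j in split_features(rows, target, predictors, root, min_gain, max_depth):
            edges.add((min(target, j), max(target, j)))  # undirected ("or" rule)
    return edges

if __name__ == "__main__":
    # Hypothetical Monte Carlo draw: chain x0 - x1 - x2, with x3 unconnected.
    rng = random.Random(0)
    rows = []
    for _ in range(150):
        x0 = rng.gauss(0, 1)
        x1 = x0 + rng.gauss(0, 0.6)
        x2 = x1 + rng.gauss(0, 0.6)
        x3 = rng.gauss(0, 1)
        rows.append([x0, x1, x2, x3])
    print(sorted(infer_edges(rows)))
```

A GGM-based comparison of the kind mentioned in the abstract would instead estimate a sparse inverse covariance matrix and read edges off its nonzero off-diagonal entries; that step is not shown here.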

Suggestions

Modeling of Biochemical Networks via Classification and Regression Tree Methods
Seçilmiş, Deniz; Purutçuoğlu Gazi, Vilda (Springer, 2019-02-01)
In the description of biological networks, a number of modeling approaches have been suggested based on different assumptions. The major problems in these models and their associated inference approaches are the complexity of biological systems, resulting in a high number of model parameters, few observations from each variable in the system, their sparse structures, and high correlation between model parameters. Recent studies have shown that nonparametric methods can ameliorate these challeng...
Modeling, inference and optimization of regulatory networks based on time series data
Weber, Gerhard Wilhelm; DEFTERLİ, ÖZLEM; ALPARSLAN GÖK, Sırma Zeynep; Kropat, Erik (2011-05-16)
In this survey paper, we present advances achieved in recent years in the development and use of OR, in particular optimization methods, in new gene-environment and eco-finance networks, based on usually finite data series, with an emphasis on uncertainty in them and in the interactions of the model items. Indeed, our networks represent models in the form of time-continuous and time-discrete dynamics, whose unknown parameters we estimate under constraints on complexity and regularization by variou...
Novel model selection criteria on high dimensional biological networks
Bülbül, Gül Baha; Purutçuoğlu Gazi, Vilda; Department of Statistics (2019)
The Gaussian graphical model (GGM) is a useful tool to describe the undirected associations among the genes in a sparse biological network. To infer such high dimensional biological networks, the l1-penalized maximum-likelihood estimation method is used. This approach performs a variable selection procedure by using a regularization parameter which controls the sparsity in the network. Thus, the selection of the regularization parameter becomes crucial to define the true interactions in the biological ne...
Semi-Bayesian Inference of Time Series Chain Graphical Models in Biological Networks
Farnoudkia, Hajar; Purutçuoğlu Gazi, Vilda (2018-09-20)
The construction of biological networks via time-course datasets can be performed with both deterministic models, such as ordinary differential equations, and stochastic models, such as the diffusion approximation. Of these two branches, the former has wider application since more data can be available. In this study, we particularly deal with the probabilistic approaches for the steady-state or deterministic description of the biological systems when the systems are observed through time. Hence, we consider time s...
Graphical models in inference of biological networks
Farnoudkia, Hajar; Purutçuoğlu Gazi, Vilda; Department of Statistics (2020)
In recent years, particularly in studies of complex system diseases, the understanding of biological systems and the observation of how their behaviors are affected by treatment or similar conditions have accelerated with the help of mathematical models that describe these systems. The Gaussian graphical model (GGM) is a model that describes the relationship between the system’s elements via regression and represents the states of the system via the multivariate Gaussia...
Citation Formats
D. Seçilmiş and V. Purutçuoğlu Gazi, Modeling of biochemical networks via classification and regression tree methods. 2018, p. 102.