Model comparison for gynecological cancer datasets and selection of threshold value

Bahçivancı, Başak
Cancer is a very common systems disease whose structural and functional complexity stems from the high dimensionality of, and strong correlation among, genes, as well as from the sparsity of gene interactions. Accordingly, different mathematical models have been suggested in the literature to address these challenges. Among the many alternatives, in this study we use the Gaussian graphical model, the Gaussian copula graphical model, and loop-based multivariate adaptive regression splines with and without interaction terms, owing to their advantages over other models on simulated datasets. In the first part of the thesis, we apply these models to our quasi-true cancer network using real microarray datasets. Gynecological cancer is the second leading cancer type in women after breast cancer, yet it has been studied less than breast cancer for sociological reasons. We therefore first survey the related literature and compile a list of core genes for this illness. Then, we construct a quasi-true network from these genes. Finally, we infer this network with the models listed above and assess their accuracy. In this way, we can realistically evaluate the performance of these models on an actual disease system. In these analyses, we also observe that the estimates of the models depend strongly on their threshold values, which convert the estimated strengths of gene interactions into binary form in order to construct the graphical network. Therefore, in the second part of the thesis, we propose a novel approach for selecting this value that takes the topology of the networks into account, and we assess its performance in terms of accuracy and computational time.
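To make the role of the threshold concrete, the following is a minimal Python sketch of how estimated interaction strengths can be converted into a binary network and compared with a reference network. It is an illustration only, not the topology-based selection method proposed in the thesis: the function names `threshold_to_adjacency` and `edge_accuracy`, the three-gene matrices, and the candidate threshold values are all hypothetical.

```python
import numpy as np


def threshold_to_adjacency(strengths, threshold):
    """Binarize a symmetric matrix of estimated interaction strengths.

    An edge is declared wherever the absolute strength exceeds the threshold.
    """
    strengths = np.asarray(strengths, dtype=float)
    adjacency = (np.abs(strengths) > threshold).astype(int)
    np.fill_diagonal(adjacency, 0)  # self-interactions are not counted as edges
    return adjacency


def edge_accuracy(estimated, reference):
    """Fraction of gene pairs whose edge status matches the reference network."""
    estimated = np.asarray(estimated)
    reference = np.asarray(reference)
    upper = np.triu_indices_from(reference, k=1)  # each undirected pair once
    return float(np.mean(estimated[upper] == reference[upper]))


# Hypothetical estimated strengths for three genes (illustrative numbers only).
strengths = np.array([
    [1.00, 0.62, 0.05],
    [0.62, 1.00, -0.31],
    [0.05, -0.31, 1.00],
])

# Hypothetical reference (quasi-true) network for the same three genes.
reference = np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
])

# The same estimates yield different networks under different thresholds.
for threshold in (0.1, 0.4):
    adjacency = threshold_to_adjacency(strengths, threshold)
    print(threshold, edge_accuracy(adjacency, reference))
```

In this toy case, the lower threshold recovers both reference edges, while the higher one drops the weaker edge, which is the kind of sensitivity the abstract describes.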
Citation Formats
B. Bahçivancı, “Model comparison for gynecological cancer datasets and selection of threshold value,” Thesis (M.S.) -- Graduate School of Natural and Applied Sciences. Statistics., Middle East Technical University, 2019.