A novel SVM-ID3 Hybrid Feature Selection Method to Build a Disease Model for Melanoma using Integrated Genotyping and Phenotype Data from dbGaP

The relations between single nucleotide polymorphisms (SNPs) and complex diseases are likely to be non-linear, and studying them requires analysis of high-dimensional data. Previous studies in the field mostly focus on genotyping data and do not consider the effects of various phenotypes. To fill this gap, a hybrid feature selection model combining a support vector machine and a decision tree has been designed. The method is tested on melanoma. We were able to select phenotypic features such as moles and dysplastic nevi, as well as SNPs that map to specific genes such as CAMK1D. On the melanoma dataset, the proposed hybrid model achieves a sensitivity of 79.07% and an area under the ROC curve of 0.81.
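The two performance measures quoted above, sensitivity and area under the ROC curve, can be illustrated with a minimal sketch. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only; they are not the study's data or evaluation code.

```python
def sensitivity(y_true, y_pred):
    """True positive rate TP / (TP + FN), where label 1 marks a case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive samples")
    return tp / (tp + fn)


def roc_auc(y_true, scores):
    """AUC as the fraction of (case, control) pairs in which the case
    receives the higher score; tied pairs count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 cases (1) and 4 controls (0) with classifier scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]
# Assumed decision threshold of 0.5 turns scores into predicted labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(sensitivity(y_true, y_pred))
print(roc_auc(y_true, scores))
```

Sensitivity depends on the chosen threshold, whereas the AUC summarizes ranking quality over all thresholds, which is why the two figures are reported together.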


A Prostate Cancer Model Build by a Novel SVM-ID3 Hybrid Feature Selection Method Using Both Genotyping and Phenotype Data from dbGaP
Yucebas, Sait Can; Aydın Son, Yeşim (2014-03-20)
Through Genome Wide Association Studies (GWAS), many Single Nucleotide Polymorphism (SNP)-complex disease relations can be investigated. The output of GWAS can be large and high dimensional, and the relations between SNPs, phenotypes, and diseases are most likely nonlinear. To handle high-volume, high-dimensional data and to find the nonlinear relations, we have utilized data mining approaches, and a hybrid feature selection model of support vector machine and decision tree has be...
A multi-layered graphical model of the relation among SNPS, GENES, and pathways based on subgraph search
Ersoy, Gökhan; Aydın Son, Yeşim; Can, Tolga; Department of Bioinformatics (2015)
The analysis of Single Nucleotide Polymorphisms (SNPs) through Genome Wide Association Studies (GWAS) presents great potential for describing disease loci and gaining insight into the underlying etiology of diseases. The recently described combined p-value approach allows identification of associations at the gene and pathway level. Integrated programs like METU-SNP produce simple lists of either SNP id/gene id/pathway title and their p-values and significance status or SNP id/disease id/pathway information. In...
A photogrammetry based method for determination of 3D morphological indices of coarse aggregates
Öztürk, Hande Işık (Elsevier BV, 2020-11-30)
Over the last half-century, various image-based methods have been developed to quantify the morphological indices of aggregates. The majority of these methods rely on two-dimensional (2D) imaging for practical reasons, whereas a limited number of these use three-dimensional (3D) imaging. The focus of this study is to develop a practical and inexpensive 3D photogrammetry based method. In this study, 3D morphological indices and shape features of 57 aggregates from five sources are determined by the photogram...
A novel model-based method for feature extraction from protein sequences for classification
Sarac, Omer Sinan; Atalay, Mehmet Volkan; Atalay, Rengül (2006-01-01)
Representation of amino-acid sequences constitutes the key point in classification of proteins into functional or structural classes. The representation should contain the biologically meaningful information hidden in the primary sequence of the protein. Conserved or similar subsequences are strong indicators of functional and structural similarity. In this study we present a feature mapping that takes into account the models of the subsequences of protein sequences. An expectation-maximization algorithm al...
A Hybrid geo-activity recommendation system using advanced feature combination and semantic activity similarity
Sattari, Masoud; Toroslu, İsmail Hakkı; Department of Computer Engineering (2013)
In this study, a new method for analyzing and representing the discriminative information, distributed in functional Magnetic Resonance Imaging (fMRI) data, is proposed. For this purpose, a local mesh with varying size is formed around each voxel, called the seed voxel. The relationships among each seed voxel and its neighbors are estimated using a linear regression equation by minimizing the expectation of the squared error. This squared error coming from linear regression is used to calculate various info...
