On initial population generation in feature subset selection
Date
2019-12-15
Author
Deniz, Ayca
Kiziloz, Hakan Ezgi
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The performance of evolutionary algorithms depends on many factors, such as population size, number of generations, and crossover or mutation probability. Generating the initial population is one of the important steps in evolutionary algorithms. A poor initial population may unnecessarily increase the number of searches, or it may cause the algorithm to converge to local optima. In this study, we aim to find a promising method for generating the initial population in the Feature Subset Selection (FSS) domain. FSS is not considered an expert system by itself, yet it constitutes a significant step in many expert systems. It eliminates redundancy in data, which decreases training time and improves solution quality. To achieve our goal, we compare a total of five different initial population generation methods: Information Gain Ranking (IGR), a greedy approach, and three types of random approaches. We evaluate these methods using a specialized Teaching Learning Based Optimization search algorithm (MTLBO-MD) and three supervised learning classifiers: Logistic Regression, Support Vector Machines, and Extreme Learning Machine. In our experiments, we employ 12 publicly available datasets, mostly obtained from the well-known UCI Machine Learning Repository. According to their feature sizes and instance counts, we manually classify these datasets as small, medium, or large-sized. Experimental results indicate that all tested methods achieve similar solutions on small-sized datasets. For medium-sized and large-sized datasets, however, the IGR method provides a better starting point in terms of execution time and learning performance. Finally, when compared with other studies in the literature, the IGR method proves to be a viable option for initial population generation.
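The abstract does not spell out how the IGR-based population is constructed, so the following is only a minimal sketch of the general idea: rank features by information gain, then seed a population of binary feature masks in which higher-ranked features are more likely to be selected. The function names, the linear rank-to-probability mapping, and the `low`/`high` parameters are illustrative assumptions and are not taken from the paper. A uniform random generator is included as a baseline for comparison.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Information gain of one discrete feature column with respect to the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def random_population(n_features, pop_size, rng):
    """Baseline: every feature is included with probability 0.5."""
    return [[int(rng.random() < 0.5) for _ in range(n_features)] for _ in range(pop_size)]


def igr_population(gains, pop_size, rng, low=0.1, high=0.9):
    """Rank-biased population: higher-ranked features are selected more often.

    The linear mapping from rank to inclusion probability (high down to low)
    is an assumption made for illustration, not the scheme used in the paper.
    """
    n = len(gains)
    order = sorted(range(n), key=lambda j: gains[j], reverse=True)
    prob = [0.0] * n
    for rank, j in enumerate(order):
        prob[j] = high - (high - low) * rank / max(n - 1, 1)
    return [[int(rng.random() < prob[j]) for j in range(n)] for _ in range(pop_size)]


if __name__ == "__main__":
    # Toy data: feature 0 determines the label, feature 1 is noise, feature 2 is constant.
    X = [[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]]
    y = [0, 0, 1, 1]
    gains = [information_gain(col, y) for col in zip(*X)]
    rng = random.Random(0)
    for individual in igr_population(gains, pop_size=5, rng=rng):
        print(individual)
```

Each individual is a 0/1 mask over the features, which a search algorithm could then refine. Continuous features would need to be discretized before `information_gain` is applied, since this sketch groups by exact feature value.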
Subject Keywords
General Engineering, Artificial Intelligence, Computer Science Applications
URI
https://hdl.handle.net/11511/65237
Journal
EXPERT SYSTEMS WITH APPLICATIONS
DOI
https://doi.org/10.1016/j.eswa.2019.06.063
Collections
Department of Computer Engineering, Article
Citation Formats
A. Deniz and H. E. Kiziloz, “On initial population generation in feature subset selection,” EXPERT SYSTEMS WITH APPLICATIONS, pp. 11–21, 2019, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/65237.