Recent Advances in Optimization Models for Data Mining: Clustering, Feature Selection and Classification

2008-09-01
Fan, Y-j
İyigün, Cem
Chaovalitwongse, W. A.
Data mining aims at finding interesting, useful or profitable information in very large databases. The enormous increase in the size of available scientific and commercial databases (data avalanche) as well as the continuing and exponential growth in performance of present day computers make data mining a very active field. In many cases, the burgeoning volume of data sets has grown so large that it threatens to overwhelm rather than enlighten scientists. Therefore, traditional methods are revised and streamlined, complemented by many new methods to address challenging new problems. Mathematical Programming plays a key role in this endeavor. It helps us to formulate precise objectives (e.g., a clustering criterion or a measure of discrimination) as well as the constraints imposed on the solution (e.g., find a partition, a covering or a hierarchy in clustering). It also provides powerful mathematical tools to build highly performing exact or approximate algorithms. This book is based on lectures presented at the workshop on "Data Mining and Mathematical Programming" (October 10-13, 2006, Montreal) and will be a valuable scientific source of information to faculty, students, and researchers in optimization, data analysis and data mining, as well as people working in computer science, engineering and applied mathematics.

Suggestions

An Effective approach for comparison of association rule mining algorithms based on controlled data, statistical inference and multiple criteria
Azadiamin, Sanam; Köksal, Gülser; Department of Industrial Engineering (2016)
Association rules are an important class of data mining results, helpful in handling large amounts of data and extracting useful association information from them. Many algorithms have been developed for finding interesting association rules, and others for rule reduction. All of the proposed methods have strengths and weaknesses, so their usefulness depends on the application area. In the literature, there exist several comparison studies trying to find the best a...
Using operational data for decision making: a feasibility study in rail maintenance
Marsh, William; Nur, Khalid; Yet, Barbaros; Majumdar, Arnab (2016-05-01)
In many organisations, large databases are created as part of the business operation: the promise of ‘big data’ is to extract information from these databases to make smarter decisions. We explore the feasibility of this approach for better decision-making for maintenance, specifically for rail infrastructure. We argue that the data should be used within a Bayesian framework with the aim of inferring the underlying state of the system so we can predict future failures and improve decision-making. Within thi...
Data mining analysis of economic indicators of countries
Güngör, Erdem; Yozgatlıgil, Ceylan; Department of Statistics (2020-8)
Data mining is becoming an increasingly popular way to reveal the hidden information within big data. In this study, we apply data mining techniques to the economic indicators of countries. Four data mining techniques are applied to the dataset. Homogeneous groups of countries with similar economic characteristics are obtained by a clustering algorithm. After clustering, we turn to association rule mining to investigate the most exported produ...
Data mining in deductive databases using query flocks
Toroslu, İsmail Hakkı (Elsevier BV, 2005-04-01)
Data mining can be defined as a process for finding trends and patterns in large data. An important technique for extracting useful information, such as regularities, from usually historical data is called association rule mining. Most research on data mining concentrates on the traditional relational data model. On the other hand, the query flocks technique, which extends the concept of association rule mining with a 'generate-and-test' model for different kinds of patterns, can also be applied to deduct...
An application of the minimal spanning tree approach to the cluster stability problem
Volkovich, Z.; Barzily, Z.; Weber, Gerhard Wilhelm; Toledano-Kitai, D.; Avros, R. (Springer Science and Business Media LLC, 2012-03-01)
Among the areas of data and text mining employed today in OR, science, economy and technology, clustering theory serves as a preprocessing step in data analysis. An important component of clustering theory is determining the true number of clusters. This problem has not been satisfactorily solved. In our paper, this problem is addressed by the cluster stability approach. For several possible numbers of clusters, we estimate the stability of the partitions obtained from clustering of samp...
Citation Formats
Y.-j. Fan, C. İyigün, and W. A. Chaovalitwongse, Recent Advances in Optimization Models for Data Mining: Clustering, Feature Selection and Classification. 2008, p. 95.