A mathematical contribution of statistical learning and continuous optimization using infinite and semi-infinite programming to computational statistics

Download
2009
Özöğür-Akyüz, Süreyya
A subfield of artificial intelligence, machine learning (ML), is concerned with the development of algorithms that allow computers to “learn”. ML is the process of training a system with large number of examples, extracting rules and finding patterns in order to make predictions on new data points (examples). The most common machine learning schemes are supervised, semi-supervised, unsupervised and reinforcement learning. These schemes apply to natural language processing, search engines, medical diagnosis, bioinformatics, detecting credit fraud, stock market analysis, classification of DNA sequences, speech and hand writing recognition in computer vision, to encounter just a few. In this thesis, we focus on Support Vector Machines (SVMs) which is one of the most powerful methods currently in machine learning. As a first motivation, we develop a model selection tool induced into SVM in order to solve a particular problem of computational biology which is prediction of eukaryotic pro-peptide cleavage site applied on the real data collected from NCBI data bank. Based on our biological example, a generalized model selection method is employed as a generalization for all kinds of learning problems. In ML algorithms, one of the crucial issues is the representation of the data. Discrete geometric structures and, especially, linear separability of the data play an important role in ML. If the data is not linearly separable, a kernel function transforms the nonlinear data into a higher-dimensional space in which the nonlinear data are linearly separable. As the data become heterogeneous and large-scale, single kernel methods become insufficient to classify nonlinear data. Convex combinations of kernels were developed to classify this kind of data [8]. Nevertheless, selection of the finite combinations of kernels are limited up to a finite choice. In order to overcome this discrepancy, we propose a novel method of “infinite” kernel combinations for learning problems with the help of infinite and semi-infinite programming regarding all elements in kernel space. This will provide to study variations of combinations of kernels when considering heterogeneous data in real-world applications. Combination of kernels can be done, e.g., along a homotopy parameter or a more specific parameter. Looking at all infinitesimally fine convex combinations of the kernels from the infinite kernel set, the margin is maximized subject to an infinite number of constraints with a compact index set and an additional (Riemann-Stieltjes) integral constraint due to the combinations. After a parametrization in the space of probability measures, it becomes semi-infinite. We analyze the regularity conditions which satisfy the Reduction Ansatz and discuss the type of distribution functions within the structure of the constraints and our bilevel optimization problem. Finally, we adapted well known numerical methods of semiinfinite programming to our new kernel machine. We improved the discretization method for our specific model and proposed two new algorithms. We proved the convergence of the numerical methods and we analyzed the conditions and assumptions of these convergence theorems such as optimality and convergence.

Suggestions

A hypercomputational approach to the agent causation theory of free will
Mersin, Serhan; Sayan, Erdinç; Department of Cognitive Sciences (2006)
Hypercomputation, which is the general concept embracing all machinery capable of carrying out more tasks than Turing Machines and beyond the Turing Limit, has implications for various fields including mathematics, physics, computer science and philosophy. Regarding its philosophical aspects, it is necessary to reveal the position of hypercomputation relative to the classical computational theory of mind in order to clarify and broaden the scope of hypercomputation so that it encompasses some phenomena whic...
Comparison of rough multi layer perceptron and rough radial basis function networks using fuzzy attributes
Vural, Hülya; Alpaslan, Ferda Nur; Department of Computer Engineering (2004)
The hybridization of soft computing methods of Radial Basis Function (RBF) neural networks, Multi Layer Perceptron (MLP) neural networks with back-propagation learning, fuzzy sets and rough sets are studied in the scope of this thesis. Conventional MLP, conventional RBF, fuzzy MLP, fuzzy RBF, rough fuzzy MLP, and rough fuzzy RBF networks are compared. In the fuzzy neural networks implemented in this thesis, the input data and the desired outputs are given fuzzy membership values as the fuzzy properties أlow...
A pattern classification approach for boosting with genetic algorithms
Yalabık, Ismet; Yarman Vural, Fatoş Tunay; Üçoluk, Göktürk; Şehitoğlu, Onur Tolga (2007-11-09)
Ensemble learning is a multiple-classifier machine learning approach which produces collections and ensembles statistical classifiers to build up more accurate classifier than the individual classifiers. Bagging, boosting and voting methods are the basic examples of ensemble learning. In this study, a novel boosting technique targeting to solve partial problems of AdaBoost, a well-known boosting algorithm, is proposed. The proposed system finds an elegant way of boosting a bunch of classifiers successively ...
Evolving aggregation behaviors for swarm robotics systems: a systematic case study
Bahçeci, Erkin; Şahin, Erol; Department of Computer Engineering (2005)
Evolutionary methods are shown to be useful in developing behaviors in robotics. Interest in the use of evolution in swarm robotics is also on the rise. However, when one attempts to use artificial evolution to develop behaviors for a swarm robotic system, he is faced with decisions to be made regarding some parameters of fitness evaluations and of the genetic algorithm. In this thesis, aggregation behavior is chosen as a case, where performance and scalability of aggregation behaviors of perceptron control...
A simulation tool for mc6811
Sarıkan (Tuncer), Nazlı; Güran, Hasan; Department of Electrical and Electronics Engineering (2004)
The aim of this thesis study is to develop a simulator for an 8-bit microcontroller and the written document of this thesis study analyses the process of devoloping a software for simulating an 8 bit microcontroller, MC68HC11. In this simulator study a file processing including the parsing of the assembler code and the compilation of the parsed instructions is studied. Also all the instruction execution process containing the cycle and instruction execution and the interrupt routine execution is observed th...
Citation Formats
S. Özöğür-Akyüz, “A mathematical contribution of statistical learning and continuous optimization using infinite and semi-infinite programming to computational statistics,” Ph.D. - Doctoral Program, Middle East Technical University, 2009.