Learning customized and optimized lists of rules with mathematical programming

2018-12-01
Rudin, Cynthia
Ertekin Bolelli, Şeyda
We introduce a mathematical programming approach to building rule lists, which are a type of interpretable, nonlinear, and logical machine learning classifier involving IF-THEN rules. Unlike traditional decision tree algorithms like CART and C5.0, this method does not use greedy splitting and pruning. Instead, it aims to fully optimize a combination of accuracy and sparsity, obeying user-defined constraints. This method is useful for producing non-black-box predictive models, and has the benefit of a clear user-defined tradeoff between training accuracy and sparsity. The flexible framework of mathematical programming allows users to create customized models with a provable guarantee of optimality. The software reviewed as part of this submission was given the DOI (Digital Object Identifier) https://doi.org/10.5281/zenodo.1344142.
MATHEMATICAL PROGRAMMING COMPUTATION
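
The abstract describes a rule list as an ordered sequence of IF-THEN rules chosen to optimize a combination of accuracy and sparsity, rather than built greedily. The sketch below illustrates that idea only on a toy scale. It is not the authors' formulation: the paper uses mathematical programming, whereas this sketch substitutes exhaustive enumeration, which is feasible only for a handful of candidate rules. The data, the candidate rules, the function names, and the objective form (training accuracy minus c times the number of rules) are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical toy data: each row is (features, label).
DATA = [
    ({"fever": 1, "cough": 1}, 1),
    ({"fever": 1, "cough": 0}, 1),
    ({"fever": 0, "cough": 1}, 0),
    ({"fever": 0, "cough": 0}, 0),
    ({"fever": 0, "cough": 1}, 0),
    ({"fever": 1, "cough": 1}, 1),
]

# Hypothetical candidate rules: (name, IF-condition, THEN-label).
RULES = [
    ("fever", lambda x: x["fever"] == 1, 1),
    ("cough", lambda x: x["cough"] == 1, 0),
    ("fever and cough", lambda x: x["fever"] == 1 and x["cough"] == 1, 1),
]


def predict(rule_list, default, x):
    """Return the label of the first rule whose condition holds, else the default."""
    for _, condition, label in rule_list:
        if condition(x):
            return label
    return default


def objective(rule_list, default, data, c):
    """Assumed objective: training accuracy minus c times the list length."""
    correct = sum(predict(rule_list, default, x) == y for x, y in data)
    return correct / len(data) - c * len(rule_list)


def best_rule_list(rules, data, c):
    """Enumerate every ordered subset of rules and every default label.

    This brute-force search stands in for the mathematical program; it
    returns a globally best list for the assumed objective on tiny inputs.
    """
    best = None
    for k in range(len(rules) + 1):
        for rule_list in permutations(rules, k):
            for default in (0, 1):
                score = objective(rule_list, default, data, c)
                if best is None or score > best[0]:
                    best = (score, rule_list, default)
    return best


if __name__ == "__main__":
    score, rule_list, default = best_rule_list(RULES, DATA, c=0.05)
    for name, _, label in rule_list:
        print(f"IF {name} THEN predict {label}")
    print(f"ELSE predict {default}  (objective = {score:.2f})")
```

The parameter c plays the role of the user-defined tradeoff mentioned in the abstract: with a small c the search keeps the single rule that classifies the toy data perfectly, while a large enough c makes the empty list (default label only) the best choice.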

Suggestions

Formalizing the specification and execution of workflows using the event calculus
Çiçekli, Fehime Nihan (Elsevier BV, 2006-08-03)
The event calculus is a logic programming formalism for representing events and their effects, especially in database applications. This paper proposes the event calculus as a logic-based methodology for the specification and execution of workflows. It is shown that the control flow graph of a workflow specification can be expressed as a set of logical formulas, and that the event calculus can be used to specify the role of a workflow manager through a set of rules for the execution dependencies of activities. The...
Improving search result clustering by integrating semantic information from Wikipedia
Çallı, Çağatay; Üçoluk, Göktürk; Şehitoğlu, Onur Tolga; Department of Computer Engineering (2010)
Suffix Tree Clustering (STC) is a search result clustering (SRC) algorithm focused on generating overlapping clusters with meaningful labels in linear time. It showed the feasibility of SRC, but over time subsequent studies introduced description-first algorithms that generate better labels and achieve higher precision. Still, STC remained the fastest SRC algorithm, and later studies addressed various problems of STC. In this thesis, semantic relations between cluster labels and documents ar...
Modeling and analyzing finite state automata in the finite field F_2
Reger, J.; Schmidt, Klaus Verner (Elsevier BV, 2004-06-29)
A method for determining multilinear state space models for general finite state automata is presented. The obtained model resides on F_2, the finite field of characteristic 2 with the operations addition and multiplication, both carried out modulo 2. It is functionally complete in the sense that it is capable of describing all finite state automata, including non-deterministic and partially defined automata. For those cases in which the model over F_2 is linear, means for a complete analysis of the cyclic ...
Computational representation of protein sequences for homology detection and classification
Oğul, Hasan; Mumcuoğlu, Ünal Erkan; Department of Information Systems (2006)
Machine learning techniques have been widely used for classification problems in computational biology. They require that the input be a collection of fixed-length feature vectors. Since proteins are of varying lengths, there is a need for a means of representing protein sequences by a fixed number of features. This thesis introduces three novel methods for this purpose: n-peptide compositions with reduced alphabets, pairwise similarity scores by maximal unique matches, and pairwise similarity scores by...
Improved probabilistic decoding of interleaved Reed-Solomon codes and folded Hermitian codes
Özbudak, Ferruh (Elsevier BV, 2014-02-06)
The probabilistic simultaneous polynomial reconstruction algorithm of Bleichenbacher, Kiayias, and Yung is extended to polynomials whose degrees are allowed to be distinct. Specifically, for a finite field $F$, positive integers $n$, $r$, $t$, and distinct elements $z_1, z_2, \ldots, z_n \in F$, we present a probabilistic algorithm which can recover polynomials $p_1, p_2, \ldots, p_r \in F[x]$ of degree less than $k_1, k_2, \ldots, k_r$, respectively, for a given instance $\langle y_{i,1}, \ldots, y_{i,r} \rangle^{n}$...
Citation Formats
C. Rudin and Ş. Ertekin Bolelli, “Learning customized and optimized lists of rules with mathematical programming,” MATHEMATICAL PROGRAMMING COMPUTATION, pp. 659–702, 2018, Accessed: 2020. [Online]. Available: https://hdl.handle.net/11511/35666.