A new contribution to nonlinear robust regression and classification with mars and its applications to data mining for quality control in manufacturing

2008
Yerlikaya, Fatma
Multivariate adaptive regression splines (MARS) is a modern methodology from statistical learning that is important in both classification and regression, with an increasing number of applications in many areas of science, economy and technology. MARS is very useful for high-dimensional problems and shows great promise for fitting nonlinear multivariate functions. The MARS technique does not impose any particular class of relationship between the predictor variables and the outcome variable of interest. In other words, a special advantage of MARS lies in its ability to estimate the contribution of the basis functions so that both the additive and the interaction effects of the predictors are allowed to determine the response variable. The function fitted by MARS is continuous, whereas the one fitted by classification and regression trees (CART) is not. MARS thus becomes an alternative to CART. The MARS algorithm for estimating the model function consists of two complementary algorithms: the forward and the backward stepwise algorithms. In the forward step, the model is built by adding basis functions until a maximum level of complexity is reached. The backward stepwise algorithm then removes the least significant basis functions from the model. In this study, we propose not to use the backward stepwise algorithm. Instead, we construct a penalized residual sum of squares (PRSS) for MARS as a Tikhonov regularization problem, which is also known as ridge regression. We treat this problem using continuous optimization techniques, which we expect to become an important complementary technology and an alternative to the backward stepwise algorithm. In particular, we apply the elegant framework of conic quadratic programming, an area of convex optimization that is very well structured, thereby resembling linear programming and, hence, permitting the use of interior point methods.
The boundaries of this optimization problem are determined by a multiobjective optimization approach, which provides us with many alternative solutions. Based on these theoretical and algorithmic studies, this MSc thesis also contains applications to the data investigated in a TÜBİTAK project on quality control. In these applications, MARS and our new method are compared.
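
The idea of replacing backward pruning with a penalty can be illustrated with a small sketch. It is not the thesis's implementation: the hinge-function basis, the fixed knots, the identity penalty matrix `L`, the toy data and the function names `hinge_basis` and `tikhonov_fit` are all assumptions made for illustration, and the penalized problem is solved here in closed form (ordinary ridge regression) rather than by conic quadratic programming.

```python
import numpy as np


def hinge_basis(x, knots):
    """Design matrix with an intercept and a pair of one-dimensional
    MARS-style hinge functions max(0, x - t), max(0, t - x) per knot t."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x)]
    for t in knots:
        cols.append(np.maximum(0.0, x - t))
        cols.append(np.maximum(0.0, t - x))
    return np.column_stack(cols)


def tikhonov_fit(B, y, lam, L=None):
    """Minimize ||y - B @ theta||^2 + lam * ||L @ theta||^2.

    L defaults to the identity (an assumption of this sketch); any other
    penalty matrix with matching column count can be passed instead.
    """
    if L is None:
        L = np.eye(B.shape[1])
    A = B.T @ B + lam * (L.T @ L)
    return np.linalg.solve(A, B.T @ y)


# Toy data (hypothetical): a noisy absolute-value curve.
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 200)
y = np.abs(x) + 0.05 * rng.standard_normal(x.size)

# Keep every basis function from the "forward" stage; no pruning step.
B = hinge_basis(x, knots=[-1.0, 0.0, 1.0])

theta_small = tikhonov_fit(B, y, lam=1e-3)  # weak penalty
theta_large = tikhonov_fit(B, y, lam=1e3)   # strong penalty

rss_small = float(np.sum((y - B @ theta_small) ** 2))
rss_large = float(np.sum((y - B @ theta_large) ** 2))
```

Increasing `lam` shrinks the coefficient vector at the cost of a larger residual sum of squares, which is the trade-off that the penalty term controls in place of removing basis functions one by one.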

Suggestions

Continuous optimization applied in MARS for modern applications in finance, science and technology
Taylan, Pakize; Weber, Gerhard Wilhelm; Yerlikaya, Fatma (2008-05-23)
Multivariate adaptive regression splines (MARS) is a tool from statistics, important in classification and regression, with applicability in many areas of finance, science and technology. It is very useful in high dimensions and shows great promise for fitting nonlinear multivariate functions. The MARS algorithm for estimating the model function consists of two subalgorithms. We propose not to use the second one (the backward stepwise algorithm); instead, we construct a penalized residual sum of squares for a ...
A Bayesian Approach to Learning Scoring Systems
Ertekin Bolelli, Şeyda (2015-12-01)
We present a Bayesian method for building scoring systems, which are linear models with coefficients that have very few significant digits. Usually the construction of scoring systems involves manual effort: humans invent the full scoring system without using data, or they choose how logistic regression coefficients should be scaled and rounded to produce a scoring system. These kinds of heuristics lead to suboptimal solutions. Our approach is different in that humans need only specify the prior over what the ...
A new approach to multivariate adaptive regression splines by using Tikhonov regularization and continuous optimization
Taylan, Pakize; Weber, Gerhard Wilhelm; Ozkurt, Fatma Yerlikaya (2010-12-01)
This paper introduces a model-based approach to the important data mining tool multivariate adaptive regression splines (MARS), which was originally organized in a more model-free way. Indeed, MARS is a modern methodology from statistical learning that is important in both classification and regression, with an increasing number of applications in many areas of science, economy and technology. It is very useful for high-dimensional problems and shows great promise for fitting nonlinear multivar...
Uncertainty models for vector based functional curves and assessing the reliability of G-Band
Kurtar, Ahmet Kürşat; Düzgün, H. Şebnem; Department of Geodetic and Geographical Information Technologies (2006)
This study is about uncertainty modelling for vector features in geographic information systems (GIS). It has two main objectives, both concerning the band models used for uncertainty modelling. The first is to assess the accuracy of the G-Band model, which is the latest and most complex uncertainty handling model for vector features. Simulations and tests are applied to assess the reliability and accuracy of G-Band by comparing it with Chrisman’s epsilon band model, which is the most frequently used b...
Configuration of Neural Networks for the Analysis of Seasonal Time Series
Taşkaya Temizel, Tuğba (2005-08-25)
Time series often exhibit periodical patterns that can be analysed by conventional statistical techniques. These techniques rely upon an appropriate choice of model parameters that are often difficult to determine. Whilst neural networks also require an appropriate parameter configuration, they offer a way in which non-linear patterns may be modelled. However, evidence from a limited number of experiments has been used to argue that periodical patterns cannot be modelled using such networks. In this paper, ...
Citation Formats
F. Yerlikaya, “A new contribution to nonlinear robust regression and classification with mars and its applications to data mining for quality control in manufacturing,” M.S. - Master of Science, Middle East Technical University, 2008.