A new contribution to nonlinear robust regression and classification with MARS and its applications to data mining for quality control in manufacturing


Tezin Türü: Yüksek Lisans

Tezin Yürütüldüğü Kurum: Orta Doğu Teknik Üniversitesi, Fen Edebiyat Fakültesi, İstatistik Bölümü, Türkiye

Tezin Onay Tarihi: 2008

Öğrenci: FATMA YERLİKAYA

Danışman: İNCİ BATMAZ

Özet:

Multivariate adaptive regression spline (MARS) denotes a modern methodology from statistical learning which is very important in both classification and regression, with an increasing number of applications in many areas of science, economy and technology. MARS is very useful for high dimensional problems and shows a great promise for fitting nonlinear multivariate functions. MARS technique does not impose any particular class of relationship between the predictor variables and outcome variable of interest. In other words, a special advantage of MARS lies in its ability to estimate the contribution of the basis functions so that both the additive and interaction effects of the predictors are allowed to determine the response variable. The function fitted by MARS is continuous, whereas the one fitted by classical classification methods (CART) is not. Herewith, MARS becomes an alternative to CART. The MARS algorithm for estimating the model function consists of two complementary algorithms: the forward and backward stepwise algorithms. In the first step, the model is built by adding basis functions until a maximum level of complexity is reached. On the other hand, the backward stepwise algorithm is began by removing the least significant basis functions from the model. In this study, we propose not to use the backward stepwise algorithm. Instead, we construct a penalized residual sum of squares (PRSS) for MARS as a Tikhonov regularization problem, which is also known as ridge regression. We treat this problem using continuous optimization techniques which we consider to become an important complementary technology and alternative to the concept of the backward stepwise algorithm. In particular, we apply the elegant framework of conic quadratic programming which is an area of convex optimization that is very well-structured, herewith, resembling linear programming and, hence, permitting the use of interior point methods. The boundaries of this optimization problem are determined by the multiobjective optimization approach which provides us many alternative solutions. Based on these theoretical and algorithmical studies, this MSc thesis work also contains applications on the data investigated in a TÜBİTAK project on quality control. By these applications, MARS and our new method are compared.