Efficient adaptive regression spline algorithms based on mapping approach with a case study on finance

Koc E. K., İyigün C., Batmaz İ., Weber G.

JOURNAL OF GLOBAL OPTIMIZATION, vol.60, no.1, pp.103-120, 2014 (Peer-Reviewed Journal)

  • Publication Type: Article
  • Volume: 60 Issue: 1
  • Publication Date: 2014
  • Doi Number: 10.1007/s10898-014-0211-1
  • Journal Indexes: Science Citation Index Expanded, Scopus
  • Page Numbers: pp.103-120
  • Keywords: Regression splines, Optimization, Multivariate adaptive regression splines (MARS), CMARS, Self organizing maps, Data mining, Interest rate estimation, Generalized additive models, Continuous optimization, Multivariate


Multivariate adaptive regression splines (MARS) has become a popular data mining (DM) tool owing to its flexible model-building strategy for high-dimensional data. Compared with other well-known methods, it performs better in many areas such as finance, informatics, technology and science. Many studies have sought to improve its performance. For this purpose, an alternative backward stepwise algorithm was proposed through the Conic-MARS (CMARS) method, which treats a penalized residual sum of squares for MARS as a Tikhonov regularization problem. Additionally, S-FMARS introduced a time-efficient procedure by modifying the forward step of MARS via a mapping approach. Inspired by the advantages of MARS, CMARS and S-FMARS, this study proposes two hybrid methods that aim to provide time-efficient DM tools without degrading performance, especially for large datasets. The resulting methods, called SMARS and SCMARS, are tested against several performance criteria, such as accuracy, complexity, stability and robustness, using simulated and real-life datasets. As a DM application, the hybrid methods are also applied to an important problem in finance: predicting the interest rates offered by a Turkish bank to its customers. The results show that the proposed hybrid methods, being the most time efficient while remaining competitive in performance, can be considered powerful choices, particularly for large datasets.
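
To make the penalized residual sum of squares idea concrete, the following is a minimal Python sketch, not the algorithm of the paper. It builds MARS-style hinge (truncated linear) basis functions at fixed knots and fits the coefficients by a closed-form Tikhonov-regularized least squares. The function names `hinge_basis` and `tikhonov_fit`, the knot locations, the identity penalty matrix and the synthetic data are all assumptions made for illustration. In particular, the sketch does not perform an adaptive forward search, does not use the mapping approach of S-FMARS, and does not use the conic formulation that CMARS relies on.

```python
import numpy as np


def hinge_basis(x, knots):
    """Design matrix of MARS-style hinge functions for a 1-D input.

    Columns: an intercept, then max(0, x - t) and max(0, t - x) per knot t.
    """
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x)]
    for t in knots:
        cols.append(np.maximum(0.0, x - t))
        cols.append(np.maximum(0.0, t - x))
    return np.column_stack(cols)


def tikhonov_fit(B, y, L, lam):
    """Minimise ||y - B @ theta||^2 + lam * ||L @ theta||^2 in closed form."""
    A = B.T @ B + lam * (L.T @ L)
    return np.linalg.solve(A, B.T @ y)


# Synthetic data (assumption): a noisy absolute-value curve.
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 200)
y = np.abs(x) + 0.05 * rng.standard_normal(x.size)

# Hypothetical fixed knots; real MARS selects knots adaptively.
knots = [-1.0, 0.0, 1.0]
B = hinge_basis(x, knots)

# Identity penalty on all coefficients except the intercept (assumption).
L = np.eye(B.shape[1])
L[0, 0] = 0.0

theta_small = tikhonov_fit(B, y, L, lam=1e-3)  # weak regularization
theta_large = tikhonov_fit(B, y, L, lam=1e3)   # strong regularization

mse_small = float(np.mean((B @ theta_small - y) ** 2))
print("MSE with weak penalty:", mse_small)
print("Penalized coefficient norm (weak):", np.linalg.norm(theta_small[1:]))
print("Penalized coefficient norm (strong):", np.linalg.norm(theta_large[1:]))
```

Increasing `lam` shrinks the penalized coefficients and trades accuracy for a simpler model, which is the accuracy-versus-complexity balance that a penalized residual sum of squares is meant to control.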