Optimization Approaches for Classification and Feature Selection Using Overlapping Hyperboxes


Thesis Type: Doctorate

Institution Where the Thesis Was Conducted: Orta Doğu Teknik Üniversitesi (Middle East Technical University), Faculty of Engineering, Department of Industrial Engineering, Turkey

Thesis Approval Date: 2019

Student: DERYA AKBULUT

Co-Supervisors: CEM İYİGÜN, NUR EVİN ÖZDEMİREL

Abstract:

In this thesis, an optimization approach is proposed for the binary classification problem. A mixed integer programming (MIP) formulation is used to generate hyperboxes as classifiers. Each hyperbox is defined by lower and upper bounds on the feature values, and hyperboxes are allowed to overlap in order to balance misclassification against overfitting. For the test phase, distance-based heuristic algorithms are also developed to classify the samples that the hyperboxes leave unclassified, namely those falling in overlap regions and those not covered by any hyperbox. A matheuristic, namely Hyperbox Classification for Binary classes (HCB), is developed based on the MIP formulation. In each iteration of the HCB algorithm, a fixed number of hyperboxes is generated using the MIP model, and the number of unclassified samples is reduced by a hyperbox trimming algorithm. Although HCB controls the number of hyperboxes in a greedy manner, it provides an overall hyperbox configuration with no misclassification at the end of the training phase. HCB is extended to HCB-f by adding a feature selection capability. Starting with a single feature, HCB-f iteratively inserts features and hyperboxes into the model. When the algorithm terminates, only the inserted features are used for classification; these constitute the selected features.
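
The test-phase idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's method: the abstract does not specify the distance-based heuristics, so the two tie-breaking rules below (nearest box center for overlap samples, nearest box surface for uncovered samples) are assumptions chosen only for illustration. The function names `in_box`, `distance_to_box`, and `classify`, and the `(lower, upper, label)` box representation, are likewise hypothetical, and "overlap" is taken here to mean coverage by boxes of both classes.

```python
import numpy as np


def in_box(x, lower, upper):
    """True if x satisfies lower <= x <= upper on every feature."""
    return bool(np.all((x >= lower) & (x <= upper)))


def distance_to_box(x, lower, upper):
    """Euclidean distance from x to the closest point of the box (0 if inside)."""
    gap = np.maximum(0.0, np.maximum(lower - x, x - upper))
    return float(np.linalg.norm(gap))


def classify(x, boxes):
    """Assign a class label to x given a list of (lower, upper, label) boxes.

    Assumed rules, for illustration only:
      * covered by boxes of a single class -> that class
      * covered by boxes of both classes   -> class of the covering box
                                              whose center is nearest
      * covered by no box                  -> class of the box whose
                                              surface is nearest
    """
    x = np.asarray(x, dtype=float)
    covering = [b for b in boxes if in_box(x, b[0], b[1])]
    labels = {label for _, _, label in covering}

    if len(labels) == 1:
        return labels.pop()

    if covering:
        # Overlap sample: compare distances to the centers of covering boxes.
        nearest = min(
            covering,
            key=lambda b: np.linalg.norm(x - (b[0] + b[1]) / 2.0),
        )
    else:
        # Uncovered sample: compare distances to the box surfaces.
        nearest = min(boxes, key=lambda b: distance_to_box(x, b[0], b[1]))
    return nearest[2]
```

With two overlapping boxes, one per class, `classify` returns the box label directly for a sample covered by a single class and falls back to the assumed distance rules otherwise. In the thesis, the boxes themselves come from the MIP model and the trimming step, which this sketch does not attempt to reproduce.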