Distributed database design with integer linear programming and evolutionary hybrid algorithms


Tezin Türü: Doktora

Tezin Yürütüldüğü Kurum: Orta Doğu Teknik Üniversitesi, Fen Bilimleri Enstitüsü, Fen Bilimleri Enstitüsü, Türkiye

Tezin Onay Tarihi: 2013

Öğrenci: UMUT TOSUN

Danışman: AHMET COŞAR

Özet:

The communication costs of remote access and retrieval of table fragments required in the execution of distributed database queries, are the major factors determining the quality of a distributed database design. Data allocation algorithms try to minimize these costs by dividing database tables into horizontal fragments, then assigning each fragment at or near the database sites they are needed more frequently. In this thesis, we propose efficient optimization algorithms for centralized and distributed databases based on integer linear programming and the well known Quadratic Assignment Problem (QAP). In the context of distributed database design, we have formulated an optimization problem where site capacities, table replicas, communication link capacities and table fragmentation are handled. Integer linear programming is a powerful tool to represent all of these optimization parameters easily while existing models in the literature ignore one or more of these important and practical aspects. The proposed model is based on integer linear programming and for the first time it handles all of these criteria in a consistent formulation. The QAP is a difficult and important problem studied in the domain of combinatorial optimization. It is possible to solve QAP instances with 10-20 facilities using exhaustive parallel algorithms within a few days on a cluster machine. We solve large QAP instances using a variety of genetic algorithm crossover operators for these problems and experimentally verified their performance using benchmark QAP instances from QAPLIB library. We propose and experimentally evaluate the island parallel QAP-IPGA algorithm. Then, we propose a new Parallel Hybrid Genetic Tabu Search based algorithm (PHGETS) which has an intelligent diversification phase executed on a seed solution obtained using the genetic algorithm. PHGETS consistently achieves results on average within 0.05 % of the best solutions given in QAPLIB for all problem instances that were considered.