Robust heuristic algorithms for exploiting the common tasks of relational cloud database queries

2015-05-01
Dokeroglu, Tansel
Bayir, Murat Ali
Coşar, Ahmet
Cloud computing enables a conventional relational database system's hardware to be adjusted dynamically according to query workload, performance and deadline constraints. One can rent a large amount of resources for a short duration in order to run complex queries efficiently on large-scale data with virtual machine clusters. Complex queries usually contain common subexpressions, either in a single query or among multiple queries that are submitted as a batch. The common subexpressions scan the same relations, compute the same tasks (join, sort, etc.), and/or ship the same data among virtual computers. The total time spent on the queries can be reduced by executing these common tasks only once. In this study, we build and use efficient sets of query execution plans to reduce the total execution time. This is an NP-Hard problem; therefore, a set of robust heuristic algorithms (Branch-and-Bound, Genetic, Hill Climbing, and Hybrid Genetic-Hill Climbing) is proposed to find (near-) optimal query execution plans and maximize the benefits. The optimization time of each algorithm for identifying the query execution plans and the quality of these plans are analyzed by extensive experiments.
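The idea of executing common tasks only once can be illustrated with a minimal sketch. It is not the authors' implementation: the task names, the costs, the alternative plans, and the helper functions `total_cost` and `hill_climb` are all hypothetical, and the cost model simply assumes that a task shared by several chosen plans is paid for once. The sketch picks one plan per query and applies a basic hill-climbing search over those choices.

```python
import random

# Hypothetical task costs in arbitrary units (assumed, not from the paper).
TASK_COST = {
    "scan_A": 10, "scan_B": 12, "scan_C": 8,
    "join_AB": 20, "join_AC": 18, "join_BC": 25, "sort_A": 5,
}

# Each query has alternative execution plans; a plan is a set of task names.
PLANS = {
    "q1": [{"scan_A", "scan_B", "join_AB"}, {"scan_A", "scan_C", "join_AC"}],
    "q2": [{"scan_B", "scan_C", "join_BC"}, {"scan_A", "scan_C", "join_AC"}],
    "q3": [{"scan_A", "sort_A"}, {"scan_B", "scan_C", "join_BC"}],
}


def total_cost(choice):
    """Cost of a plan set when every distinct task is executed only once."""
    tasks = set()
    for query, idx in choice.items():
        tasks |= PLANS[query][idx]
    return sum(TASK_COST[t] for t in tasks)


def hill_climb(seed=0):
    """Start from a random plan per query and accept strictly better neighbors."""
    rng = random.Random(seed)
    choice = {q: rng.randrange(len(alts)) for q, alts in PLANS.items()}
    best = total_cost(choice)
    improved = True
    while improved:
        improved = False
        for query in PLANS:
            for idx in range(len(PLANS[query])):
                candidate = {**choice, query: idx}  # change one query's plan
                cost = total_cost(candidate)
                if cost < best:
                    choice, best, improved = candidate, cost, True
    return choice, best


if __name__ == "__main__":
    plan_set, cost = hill_climb(seed=0)
    print(plan_set, cost)
```

In this toy instance, choosing the plans with indices 1, 1 and 0 for q1, q2 and q3 costs 41 units with sharing, whereas running the same three plans independently would cost 36 + 36 + 15 = 87. A plain hill climber can stall in a local optimum depending on its starting point, which is one plausible motivation for combining it with a population-based search.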
APPLIED SOFT COMPUTING

Suggestions

Improving the performance of Hadoop/Hive by sharing scan and computation tasks
Özal, Serkan; Toroslu, İsmail Hakkı; Doğaç, Asuman; Department of Computer Engineering (2013)
MapReduce is a popular model of executing time-consuming analytical queries as a batch of tasks on large-scale data. During simultaneous execution of multiple queries, many opportunities can arise for sharing scan and/or computation tasks. Executing common tasks only once can reduce the total execution time of all queries remarkably. Therefore, we propose to use Multiple Query Optimization (MQO) techniques to improve the overall performance of Hadoop Hive, an open source SQL-based distributed warehouse sy...
EXTENSION OF AN OPEN SOURCE RESOURCE MANAGEMENT TOOL FOR HETEROGENEOUS CLOUD DATA CENTERS: IMPLEMENTATION AND EVALUATION
Doğan, Taha; Schmidt, Şenan Ece; Department of Electrical and Electronics Engineering (2022-2-11)
Cloud Computing is enabled by the virtualization of computing resources to realize users' requests of virtual machines (VMs) and data processing in the scope of Infrastructure as a Service (IaaS) and Software as a Service (SaaS) respectively. The current heterogeneous cloud data centers incorporate hardware accelerators in addition to the conventional servers to offer these services more efficiently. It is an important research problem to allocate heterogeneous physical computing resources to a mixture of ...
An Evolutionary Genetic Algorithm for Optimization of Distributed Database Queries
Sevinc, Ender; Coşar, Ahmet (2011-05-01)
High-performance low-cost PC hardware and high-speed LAN/WAN technologies make distributed database (DDB) systems an attractive research area where query optimization and DDB design are the two important and related problems. Since dynamic programming is not feasible for optimizing queries in a DDB, we propose a new genetic algorithm (GA)-based query optimizer (new genetic algorithm (NGA)) and compare its performance with random and optimal (exhaustive) algorithms. We perform experiments on a synthetic data...
Multiobjective relational data warehouse design for the cloud
Dökeroğlu, Tansel; Coşar, Ahmet; Department of Computer Engineering (2014)
Conventional distributed Data Warehouse (DW) design techniques seek to assign data tables/fragments to a given static database hardware setting optimally. However, it is now possible to use elastic virtual resources provided by the Cloud environment, and thus achieve reductions in both the execution time and the monetary cost of a DW system within predefined budget and response time constraints. Finding an optimal assignment plan for database tables to machines for this design problem is NP-Hard. Therefore, robu...
Designing cloud data warehouses using multiobjective evolutionary algorithms
Dökeroǧlu, Tansel; Sert, Seyyit Alper; Çinar, M. Serkan; Coşar, Ahmet (2014-01-01)
DataBase as a Service (DBaaS) providers need to improve their existing capabilities in data management and balance the efficient usage of virtual resources among multiple users with varying needs. However, there is still no existing method that addresses both the optimization of the total ownership price and the performance of the queries of a Cloud data warehouse by taking into account the alternative virtual resource allocation and query execution plans. Our proposed method tunes the virtual resources of a ...
Citation Formats
T. Dokeroglu, M. A. Bayir, and A. Coşar, “Robust heuristic algorithms for exploiting the common tasks of relational cloud database queries,” APPLIED SOFT COMPUTING, pp. 72–82, 2015. [Online]. Available: https://hdl.handle.net/11511/31553.