A C++ distributed database select - project - join query processor on a HPC cluster

2012
Ceran, Erhan
High-performance computer clusters have become popular because they are more scalable, affordable, and reliable than their centralized counterparts. Database management systems (DBMSs) are particularly suitable for distributed architectures; however, distributed DBMSs are still not widely used because of design difficulties. In this study, we aim to help overcome these difficulties by implementing a simulation testbed for a distributed query plan processor. The testbed runs on our departmental HPC cluster and can perform select, project, and join operations. A data generation module has also been implemented; it preserves the primary key and foreign key constraints in the database schema. The testbed can measure, simulate, and estimate the response time of a given query execution plan using specified communication network parameters. Extensive experiments are performed to show the correctness of the produced results. The estimated execution time costs are also compared with the actual run times obtained from the testbed to verify the proposed estimation functions. In this way, we confirm that these estimation functions can be used in distributed database query optimization and distributed database design tools.
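For readers unfamiliar with the three operators named in the abstract, the following is a minimal single-process C++ sketch of select, project, and an equi-join, followed by a simple transfer-time estimate. It is not the thesis's implementation. The `Row` and `Table` types, the function names, the hash-join strategy, and the "latency plus size over bandwidth" cost formula are all assumptions made for illustration; the abstract does not state which join algorithm or estimation functions the testbed uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical row and table types (integer columns only); not from the thesis.
using Row = std::vector<int>;
using Table = std::vector<Row>;

// Select: keep the rows that satisfy a predicate.
Table select_rows(const Table& in, const std::function<bool(const Row&)>& pred) {
    Table out;
    for (const Row& r : in)
        if (pred(r)) out.push_back(r);
    return out;
}

// Project: keep only the listed column positions, in the given order.
Table project_cols(const Table& in, const std::vector<std::size_t>& cols) {
    Table out;
    out.reserve(in.size());
    for (const Row& r : in) {
        Row p;
        p.reserve(cols.size());
        for (std::size_t c : cols) p.push_back(r.at(c));
        out.push_back(p);
    }
    return out;
}

// Join: equi-join using a hash index built on the left input.
// Each output row is the left row followed by the matching right row.
Table hash_join(const Table& left, std::size_t lcol,
                const Table& right, std::size_t rcol) {
    std::unordered_multimap<int, const Row*> index;
    for (const Row& l : left) index.emplace(l.at(lcol), &l);

    Table out;
    for (const Row& r : right) {
        auto range = index.equal_range(r.at(rcol));
        for (auto it = range.first; it != range.second; ++it) {
            Row joined = *it->second;
            joined.insert(joined.end(), r.begin(), r.end());
            out.push_back(joined);
        }
    }
    return out;
}

// Assumed cost model: fixed latency plus payload size divided by bandwidth.
double transfer_seconds(const Table& t, double latency_s, double bytes_per_s) {
    assert(bytes_per_s > 0.0);
    std::size_t bytes = 0;
    for (const Row& r : t) bytes += r.size() * sizeof(int);
    return latency_s + static_cast<double>(bytes) / bytes_per_s;
}
```

A query plan in this sketch is a composition of the calls: for example, `select_rows` on one table, `hash_join` of the result with a second table, then `project_cols` on the joined rows. In a distributed setting, `transfer_seconds` would be applied to each intermediate table that has to move between nodes, which is why pushing selections and projections ahead of the join tends to lower the estimated cost.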

Suggestions

An Evolutionary Genetic Algorithm for Optimization of Distributed Database Queries
Sevinc, Ender; Coşar, Ahmet (2011-05-01)
High-performance low-cost PC hardware and high-speed LAN/WAN technologies make distributed database (DDB) systems an attractive research area where query optimization and DDB design are the two important and related problems. Since dynamic programming is not feasible for optimizing queries in a DDB, we propose a new genetic algorithm (GA)-based query optimizer, the new genetic algorithm (NGA), and compare its performance with random and optimal (exhaustive) algorithms. We perform experiments on a synthetic data...
A Joint resource allocation system for cloud computing
Dikbayır, Hüseyin Seçkin; Bazlamaçcı, Cüneyt Fehmi; Department of Electrical and Electronics Engineering (2014)
Cloud computing is a new trend in computing, where resources such as servers, storage devices and software applications are provided to customers over the Internet. It is typically based on a pay-per-use model similar to renting a car or taking a taxi in our daily life. The primary purpose of a cloud system is to utilize available resources effectively to provide an economic benefit to customers. To succeed in this, jobs initiated by consumers are allocated to a set of virtual machines (VM) that run in big ...
Computational platform for predicting lifetime system reliability profiles for different structure types in a network
Akgül, Ferhat (2004-01-01)
This paper presents a computational platform for predicting the lifetime system reliability profiles for different structure types located in an existing network. The computational platform has the capability to incorporate time-variant live load and resistance models. Following a review of the theoretical basis, the overall architecture of the computational platform is described. Finally, numerical examples of three existing bridges (i.e., a steel, a prestressed concrete, and a hybrid steel-concrete bridge...
Improving the performance of Hadoop/Hive by sharing scan and computation tasks
Özal, Serkan; Toroslu, İsmail Hakkı; Doğaç, Asuman; Department of Computer Engineering (2013)
MapReduce is a popular model of executing time-consuming analytical queries as a batch of tasks on large scale data. During simultaneous execution of multiple queries, many opportunities can arise for sharing scan and/or computation tasks. Executing common tasks only once can reduce the total execution time of all queries remarkably. Therefore, we propose to use Multiple Query Optimization (MQO) techniques to improve the overall performance of Hadoop Hive, an open source SQL-based distributed warehouse sy...
A unification model and tool support for software functional size measurement methods
Efe, Pınar; Demirörs, Onur; Department of Information Systems (2006)
Software size estimation/measurement has been the objective of a lot of research in the software engineering community due to the need for reliable size estimates. FSM methods have become widely used in software project management to measure the functional size of software since their first publication in the late 1970s. Although all FSM methods measure the functional size by quantifying the FURs, each method defines its own measurement process and metric. Therefore, a piece of software has several functional sizes ...
Citation Formats
E. Ceran, “A C++ distributed database select - project - join query processor on a HPC cluster,” M.S. - Master of Science, Middle East Technical University, 2012.