Improving Hadoop Hive Query Response Times Through Efficient Virtual Resource Allocation
Date
2015-10-28
Author
Dokeroglu, Tansel
Cinar, Muhammet Serkan
Sert, Seyyit Alper
Coşar, Ahmet
Yazıcı, Adnan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The performance of MapReduce-based Cloud data warehouses depends mainly on the virtual hardware resources allocated. Most of the time, these resources are set to values selected by the Cloud service providers. However, setting virtual resources such as the number of CPUs, the size of RAM, and the network bandwidth in accordance with the workload demands of a query will improve the response time when querying large data on an optimized system. In this study, we carried out a set of experiments with a well-known MapReduce SQL translator, Hadoop Hive, on the TPC-H decision support benchmark database in order to analyze the performance sensitivity of the queries under different virtual resource settings. Our results provide valuable hints for decision makers who design efficient MapReduce-based data warehouses on the Cloud.
Subject Keywords
Hadoop, Hive, Virtual resource allocation, Multi-objective query optimization
URI
https://hdl.handle.net/11511/31138
DOI
https://doi.org/10.1007/978-3-319-26154-6_17
Collections
Graduate School of Natural and Applied Sciences, Conference / Seminar
Suggestions
Optimal dynamic resource allocation for heterogenous cloud data centers
Ekici, Nazım Umut; Güran Schmidt, Şenan.; Department of Electrical and Electronics Engineering (2019)
Today's data centers are mostly cloud-based with virtualized servers to provide on-demand scalability and flexibility of the available resources such as CPU, memory, data storage and network bandwidth. Heterogeneous cloud data centers (CDCs) offer hardware accelerators in addition to these standard cloud server resources. A cloud data center provider may provide Infrastructure as a Service and Platform as a Service (IPaaS), where the user gets a virtual machine (VM) with processing, memory, storage and netw...
Improving data freshness in random access channels
Atabay, Doğa Can; Uysal, Elif; Department of Electrical and Electronics Engineering (2019)
The conventional network performance metrics such as throughput and delay do not accurately reflect the needs of some applications. Age of information (AoI) is a newly proposed metric that indicates the freshness of information from the receiver’s perspective. In this work, a network of multiple transmitter devices continuously updating a central station over an error-free multiaccess channel is studied. The average AoI expressions are derived for Round-Robin, Slotted ALOHA, and a proposed random access str...
Generalized resource management for heterogeneous cloud data centers
Erol, Ahmet; Güran Schmidt, Şenan Ece.; Department of Electrical and Electronics Engineering (2019)
OpenStack is a widely used management tool for cloud computing which is designed to work on servers and allocate standard computing resources such as CPU, memory or disk. The current trend for integrating different hardware accelerators such as FPGAs and GPUs in the cloud requires managing these heterogeneous resources. In this thesis, we propose a generalization for OpenStack Nova project which extends the relevant data structures to include these new resources. More importantly, we present a new lightweig...
Design-objective space exploration and multi-objective optimization of initial structural design alternatives via machine learning
Yetkin, Ozan; Sorguç, Arzu; Department of Architecture (2020-9)
Increasing implementations of digital workflows within design processes generate exponentially growing data in each phase. Therefore, decision making within a design space with growing complexity is expected to be a great challenge for designers in the future. Hence, this research aimed to seek the potentials of complex relations between data within design space and objective space of structural design problems for proposing a novel approach to augment capabilities of digital tools by artificial intelligenc...
Improving the performance of Hadoop/Hive by sharing scan and computation tasks
Özal, Serkan; Toroslu, İsmail Hakkı; Doğaç, Asuman; Department of Computer Engineering (2013)
MapReduce is a popular model for executing time-consuming analytical queries as a batch of tasks on large-scale data. During simultaneous execution of multiple queries, many opportunities can arise for sharing scan and/or computation tasks. Executing common tasks only once can reduce the total execution time of all queries remarkably. Therefore, we propose to use Multiple Query Optimization (MQO) techniques to improve the overall performance of Hadoop Hive, an open source SQL-based distributed warehouse sy...
Citation (IEEE)
T. Dokeroglu, M. S. Cinar, S. A. Sert, A. Coşar, and A. Yazıcı, “Improving Hadoop Hive Query Response Times Through Efficient Virtual Resource Allocation,” 2015, vol. 400, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/31138.