PARALLEL COMPUTING IN STATISTICAL METHODS

2022-08-17
Oltulu, Orçun
Cost-efficient data collection and storage methods enable scientists, companies, and even regular computer users to obtain high-dimensional data sets faster and more cheaply. Even though personal computers are becoming more powerful and efficient, some algorithms, tasks, and problems still require too much computational power and time to run on a personal computer. Over the past few decades, parallelization has been increasingly used in statistical computing, and researchers have put significant effort into converting or adapting known statistical methods and algorithms to run in parallel. The main reasons for the transition to parallel methods are the rapid growth in the size and volume of data and accelerating hardware development. In this study, we applied parallelization techniques to statistical algorithms such as linear regression models, non-parametric regression models, and the measurement error kernel regression operator (MEKRO) algorithm for variable selection in non-parametric regression models. Simulation studies were conducted for each algorithm, and accuracy measures and elapsed times were recorded to assess whether the parallel versions offer significant efficiency gains while maintaining accuracy as high as that of their sequential counterparts. Overall, the simulation results show that parallelization offers great potential for time savings with negligible or no change in accuracy.
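As a rough illustration of the kind of comparison the abstract describes, the sketch below fits a simple linear regression twice: once sequentially and once by splitting the data into chunks, computing per-chunk sufficient statistics concurrently, and combining them. This is not the thesis's implementation. The function names, the chunking scheme, the simulated data, and the choice of Python with a thread pool are all assumptions made for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def chunk_stats(chunk):
    """Sufficient statistics of one chunk of (x, y) pairs."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return n, sx, sy, sxx, sxy


def fit_from_stats(stats):
    """Closed-form least-squares intercept and slope from the statistics."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_sequential(data):
    """Baseline: one pass over the whole data set."""
    return fit_from_stats(chunk_stats(data))


def fit_parallel(data, n_workers=4):
    """Split-and-combine fit: per-chunk statistics are computed by a pool
    of workers and then summed. The chunk count is a hypothetical choice."""
    size = -(-len(data) // n_workers)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(chunk_stats, chunks))
    # Element-wise sum of the per-chunk tuples gives the global statistics.
    total = tuple(sum(col) for col in zip(*partial))
    return fit_from_stats(total)


def make_data(n=10_000, seed=0):
    """Simulated data from y = 2 + 3x + noise (illustrative values only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        data.append((x, 2.0 + 3.0 * x + rng.gauss(0.0, 1.0)))
    return data


if __name__ == "__main__":
    data = make_data()
    print("sequential:", fit_sequential(data))
    print("parallel:  ", fit_parallel(data))
```

Because the sufficient statistics are additive, the two fits agree up to floating-point rounding, which mirrors the abstract's point that accuracy need not change. In CPython, threads will not speed up this pure-Python arithmetic; a process pool or a compiled numerical backend would be needed for real time savings, and elapsed times could be compared by wrapping each call with `time.perf_counter()`.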

Citation Formats
O. Oltulu, “PARALLEL COMPUTING IN STATISTICAL METHODS,” M.S. - Master of Science, Middle East Technical University, 2022.