PARALLEL COMPUTING IN STATISTICAL METHODS

2022-8-17
Oltulu, Orçun
Cost-efficient data collection and storage methods enable scientists, companies, and even regular computer users to obtain high-dimensional data sets faster and more cheaply. Even though personal computers are becoming more powerful and efficient, some algorithms, tasks, and problems still require too much computational power and time to run on a personal computer. Over the past few decades, parallelization has seen growing use in statistical computing, and researchers have put significant effort into converting or adapting known statistical methods and algorithms to run in parallel. The main reasons for the transition to parallel methods are the rapid growth in the size and volume of data and accelerating hardware development. In this study, we applied parallelization techniques to statistical algorithms such as linear regression models, non-parametric regression models, and the measurement error kernel regression operator (MEKRO) algorithm for variable selection in non-parametric regression models. Simulation studies were conducted for each algorithm, and accuracy measures and elapsed times were recorded to assess whether the parallel versions offer significant efficiency gains while keeping accuracy as high as that of their sequential versions. The overall simulation results show that parallelization of these algorithms offers great potential for time savings with negligible or no change in accuracy.
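As a rough illustration of the kind of comparison the abstract describes, the sketch below fits a simple linear regression by summing sufficient statistics over chunks of the data. This is not the thesis's implementation: the chunked-sums approach, the one-predictor model, and every name here (`chunk_stats`, `fit_from_stats`, `fit_parallel`, `make_data`, the `map_fn` parameter) are assumptions chosen for illustration. Because the per-chunk sums are independent of one another, they can be computed by separate workers and then added together.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def chunk_stats(chunk):
    """Return (n, sum x, sum y, sum x*x, sum x*y) for a list of (x, y) pairs."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return n, sx, sy, sxx, sxy


def fit_from_stats(stats):
    """Combine per-chunk sums and solve the least-squares normal equations."""
    n, sx, sy, sxx, sxy = (sum(col) for col in zip(*stats))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def split(data, k):
    """Split data into at most k contiguous chunks of near-equal size."""
    size = -(-len(data) // k)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def fit_sequential(data):
    """Fit y = intercept + slope * x in a single pass over all the data."""
    return fit_from_stats([chunk_stats(data)])


def fit_parallel(data, workers=4, map_fn=None):
    """Fit the same model from per-chunk sums.

    map_fn is a hypothetical hook: pass the builtin map (or any
    map-like callable) to control how chunks are dispatched. If it
    is None, a process pool with `workers` processes is used.
    """
    chunks = split(data, workers)
    if map_fn is not None:
        return fit_from_stats(list(map_fn(chunk_stats, chunks)))
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return fit_from_stats(list(pool.map(chunk_stats, chunks)))


def make_data(n, seed=0):
    """Simulate y = 2 + 3x + standard normal noise (assumed toy model)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        data.append((x, 2.0 + 3.0 * x + rng.gauss(0, 1)))
    return data
```

The sequential and chunked fits should agree up to floating-point rounding, since both solve the same normal equations from the same totals. To compare elapsed times, each call could be wrapped with `time.perf_counter()`. On platforms that start worker processes by spawning, `fit_parallel(data)` should be called under an `if __name__ == "__main__":` guard, and for small data sets the cost of starting processes may outweigh any speedup.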

Citation Formats
O. Oltulu, “PARALLEL COMPUTING IN STATISTICAL METHODS,” M.S. - Master of Science, Middle East Technical University, 2022.