Fostering Undergraduate Data Science
Date
2020-01-01
Author
Gökalp Yavuz, Fulya
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Data Science is one of the newest interdisciplinary areas. It is transforming our lives unexpectedly fast. This transformation is also happening in our learning styles and practicing habits. We advocate an approach to data science training that uses several types of computational tools, including R, bash, awk, regular expressions, SQL, and XPath, often used in tandem. We discuss ways for undergraduate mentees to learn about data science topics at an early point in their training. We give some intuition for researchers, professors, and practitioners about how to effectively embed real-life examples into data science learning environments. As a result, we have a unified program built on a foundation of team-oriented, data-driven projects.
Subject Keywords
Statistics, Probability and Uncertainty; Statistics and Probability; General Mathematics
URI
https://hdl.handle.net/11511/43893
Journal
AMERICAN STATISTICIAN
DOI
https://doi.org/10.1080/00031305.2017.1407360
Collections
Department of Statistics, Article
Suggestions
Non-subjective priors for wrapped Cauchy distributions
Ghosh, Malay; Zhong, Xiaolong; SenGupta, Ashis; Zhang, Ruoyang (Elsevier BV, 2019-10-01)
Circular data can arise from many sources, such as image processing, protein structure, and geological data, just to name a few. Wrapped stable family of distributions constitute one of the most widely used class of distributions for the analysis of such data. Wrapped Cauchy distribution is a member of this family and it is the only one known to have a single term explicit pdf compared to the infinite series representations for all the others. We develop in this paper reference priors and probability matchi...
Extended lasso-type MARS (LMARS) model in the description of biological network
Agraz, Melih; Purutçuoğlu Gazi, Vilda (Informa UK Limited, 2019-01-02)
The multivariate adaptive regression splines (MARS) model is one of the well-known, additive non-parametric models that can deal with highly correlated and nonlinear datasets successfully. From our previous analyses, we have seen that lasso-type MARS (LMARS) can be a strong alternative of the Gaussian graphical model (GGM) which is a well-known probabilistic method to describe the steady-state behaviour of the complex biological systems via the lasso regression. In this study, we extend our original LMARS m...
Mutual information model selection algorithm for time series
Akca, Elif; Yozgatlıgil, Ceylan (Informa UK Limited, 2020-09-01)
Time series model selection has been widely studied in recent years. It is of importance to select the best model among candidate models proposed for a series in terms of explaining the procedure that governs the series and providing the most accurate forecast for the future observations. In this study, it is aimed to create an algorithm for order selection in Box-Jenkins models that combines penalized natural logarithm of mutual information among the original series and predictions coming from each candida...
A new outlier detection method based on convex optimization: application to diagnosis of Parkinson's disease
TAYLAN, PAKİZE; Yerlikaya-Ozkurt, Fatma; Bilgic Ucak, Burcu; Weber, Gerhard Wilhelm (Informa UK Limited, 2020-12-01)
Neuroscience is a combination of different scientific disciplines which investigate the nervous system for understanding of the biological basis. Recently, applications to the diagnosis of neurodegenerative diseases like Parkinson's disease have become very promising by considering different statistical regression models. However, well-known statistical regression models may give misleading results for the diagnosis of the neurodegenerative diseases when experimental data contain outlier observations that l...
MARS as an alternative approach of Gaussian graphical model for biochemical networks
AYYILDIZ DEMİRCİ, EZGİ; Agraz, Melih; Purutçuoğlu Gazi, Vilda (Informa UK Limited, 2017-01-01)
The Gaussian graphical model (GGM) is one of the well-known modelling approaches to describe biological networks under the steady-state condition via the precision matrix of data. In literature there are different methods to infer model parameters based on GGM. The neighbourhood selection with the lasso regression and the graphical lasso method are the most common techniques among these alternative estimation methods. But they can be computationally demanding when the system's dimension increases. Here, we ...
Citation Formats
F. Gökalp Yavuz, “Fostering Undergraduate Data Science,” AMERICAN STATISTICIAN, pp. 8–16, 2020, Accessed: 00, 2020. [Online]. Available: https://hdl.handle.net/11511/43893.