Predicting tennis match outcome: a machine learning approach using the SRP-CRISP-DM framework
Date
2023-12-07
Author
Ünal, Toyan
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Machine learning methods have proven effective in forecasting tennis match results. However, because these methods are empirical, the choice of datasets, models, feature sets, and hyperparameters significantly affects outcomes. In this thesis, we employed the Sports Result Prediction Cross-Industry Standard Process for Data Mining (SRP-CRISP-DM) experimental framework to address this uncertainty. This approach ensures that results are both replicable and reproducible across diverse datasets and sports types. Our study covers 14 years of men's singles tennis match data, from 2009 to 2022, with the data from 2021 and 2022 held out as the test set. We applied six advanced feature extraction techniques, three machine learning models, and two feature selection methods, using 10-fold time-based cross-validation coupled with hyperparameter tuning. After training and tuning, the Extreme Gradient Boosting model proved the most effective, achieving the lowest Brier score (0.1913) and an accuracy of 70.5% on the test set. The feature with the highest predictive power was the average win ratio implied by the bookmakers' betting odds, which played a pivotal role in forecasting match outcomes.
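The two quantities the abstract leans on, an odds-implied win probability and the Brier score, can be illustrated with a minimal sketch. It is not the thesis's code. The function names are hypothetical, decimal odds are assumed, and removing the bookmaker margin by simple normalisation is an assumption, since the abstract does not say how the implied win ratios were computed or averaged across bookmakers.

```python
from typing import Sequence


def implied_win_probability(odds_a: float, odds_b: float) -> float:
    """Probability that player A wins, derived from decimal odds.

    Assumption: the bookmaker margin is removed by normalising the two
    inverse odds so that they sum to 1.
    """
    if odds_a <= 1.0 or odds_b <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    raw_a = 1.0 / odds_a
    raw_b = 1.0 / odds_b
    return raw_a / (raw_a + raw_b)


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probabilities and 0/1 outcomes.

    Lower is better; a constant forecast of 0.5 scores 0.25.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)
```

For scale, the reported test-set Brier score of 0.1913 sits below the 0.25 that an uninformative forecast of 0.5 for every match would obtain.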
Subject Keywords
Sports analytics, Tennis match outcome prediction, SRP-CRISP-DM, Machine learning, Feature extraction
URI
https://hdl.handle.net/11511/106394
Collections
Graduate School of Informatics, Thesis
T. Ünal, “Predicting tennis match outcome: a machine learning approach using the SRP-CRISP-DM framework,” M.S. - Master of Science, Middle East Technical University, 2023.