Learning to predict soccer results from relational data with gradient boosted trees
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F19%3A00321375" target="_blank" >RIV/68407700:21230/19:00321375 - isvavai.cz</a>
Result on the web
<a href="https://doi.org/10.1007/s10994-018-5704-6" target="_blank" >https://doi.org/10.1007/s10994-018-5704-6</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s10994-018-5704-6" target="_blank" >10.1007/s10994-018-5704-6</a>
Alternative languages
Result language
English
Title in original language
Learning to predict soccer results from relational data with gradient boosted trees
Result description in original language
We describe our winning solution to the 2017 Soccer Prediction Challenge, organized in conjunction with the MLJ special issue on Machine Learning for Soccer. The goal of the challenge was to predict the outcomes of future matches, within a selected time frame, from different leagues around the world. A dataset of over 200,000 past match outcomes was provided to the contestants. We experimented with both relational and feature-based methods to learn predictive models from the provided data. We employed relevant latent variables computable from the data, namely so-called pi-ratings and a rating based on the PageRank method. A method based on manually constructed features and the gradient boosted tree algorithm performed best on both the validation set and the challenge test set. We also discuss the validity of the assumption, underlying the RPS measure of prediction quality, that probability predictions on the three ordinal match outcomes should be monotone.
Title in English
Learning to predict soccer results from relational data with gradient boosted trees
Result description in English
We describe our winning solution to the 2017 Soccer Prediction Challenge, organized in conjunction with the MLJ special issue on Machine Learning for Soccer. The goal of the challenge was to predict the outcomes of future matches, within a selected time frame, from different leagues around the world. A dataset of over 200,000 past match outcomes was provided to the contestants. We experimented with both relational and feature-based methods to learn predictive models from the provided data. We employed relevant latent variables computable from the data, namely so-called pi-ratings and a rating based on the PageRank method. A method based on manually constructed features and the gradient boosted tree algorithm performed best on both the validation set and the challenge test set. We also discuss the validity of the assumption, underlying the RPS measure of prediction quality, that probability predictions on the three ordinal match outcomes should be monotone.
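The RPS measure mentioned in the description can be illustrated with a minimal sketch. It assumes the commonly used definition of the ranked probability score (the mean squared difference between cumulative predicted and cumulative observed distributions) and an outcome ordering of home win, draw, away win. The function name `ranked_probability_score` and that ordering are illustrative assumptions, not taken from the record.

```python
from typing import Sequence


def ranked_probability_score(probs: Sequence[float], outcome_index: int) -> float:
    """Ranked probability score for a single match (lower is better).

    probs: predicted probabilities over ordered outcomes; the assumed
           order here is (home win, draw, away win).
    outcome_index: position of the outcome that actually occurred.
    """
    r = len(probs)
    cum_pred = 0.0  # cumulative predicted probability up to outcome i
    cum_obs = 0.0   # cumulative observed indicator up to outcome i
    total = 0.0
    # The last cumulative term is always 1 - 1 = 0, so it is skipped.
    for i in range(r - 1):
        cum_pred += probs[i]
        cum_obs += 1.0 if i == outcome_index else 0.0
        total += (cum_pred - cum_obs) ** 2
    return total / (r - 1)


if __name__ == "__main__":
    # Home win occurred (index 0); compare three hypothetical forecasts.
    print(ranked_probability_score([0.5, 0.3, 0.2], 0))
    print(ranked_probability_score([0.0, 1.0, 0.0], 0))
    print(ranked_probability_score([0.0, 0.0, 1.0], 0))
```

Because the score works on cumulative distributions, a forecast that puts all its mass on a draw is penalized less than one that puts all its mass on an away win when the home team actually wins. This is the ordinal treatment of outcomes that the description refers to.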
Classification
Type
J<sub>imp</sub> - Article in a periodical indexed in the Web of Science database
CEP field
—
OECD FORD field
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
The result was created during the realization of more than one project. More information is available in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Name of the periodical
Machine Learning
ISSN
0885-6125
e-ISSN
1573-0565
Volume of the periodical
108
Issue of the periodical within the volume
1
Country of the publisher of the periodical
NL - Netherlands
Number of pages
19
Pages from-to
29-47
UT WoS code of the article
000458551700003
EID of the result in the Scopus database
2-s2.0-85046456104