ALTERNATIVE SELECTION FUNCTIONS FOR INFORMATION SET MONTE CARLO TREE SEARCH

Identifikátory výsledku

Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F14%3A00223915" target="_blank" >RIV/68407700:21230/14:00223915 - isvavai.cz</a>
Výsledek na webu
<a href="https://ojs.cvut.cz/ojs/index.php/ap/article/view/2230/0" target="_blank" >https://ojs.cvut.cz/ojs/index.php/ap/article/view/2230/0</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.14311/AP.2014.54.0333" target="_blank" >10.14311/AP.2014.54.0333</a>

Alternativní jazyky

Jazyk výsledku
angličtina
Název v původním jazyce
ALTERNATIVE SELECTION FUNCTIONS FOR INFORMATION SET MONTE CARLO TREE SEARCH
Popis výsledku v původním jazyce
We evaluate the performance of various selection methods for the Monte Carlo Tree Search algorithm in two-player zero-sum extensive-form games with imperfect information. We compare the standard Upper Confident Bounds applied to Trees (UCT) along with the less common Exponential Weights for Exploration and Exploitation (Exp3) and novel Regret matching (RM) selection in two distinct imperfect information games: Imperfect Information Goofspiel and Phantom Tic-Tac-Toe. We show that UCT after initial fast convergence towards a Nash equilibrium computes increasingly worse strategies after some point in time. This is not the case with Exp3 and RM, which also show superior performance in head-to-head matches.
Název v anglickém jazyce
ALTERNATIVE SELECTION FUNCTIONS FOR INFORMATION SET MONTE CARLO TREE SEARCH
Popis výsledku anglicky
We evaluate the performance of various selection methods for the Monte Carlo Tree Search algorithm in two-player zero-sum extensive-form games with imperfect information. We compare the standard Upper Confident Bounds applied to Trees (UCT) along with the less common Exponential Weights for Exploration and Exploitation (Exp3) and novel Regret matching (RM) selection in two distinct imperfect information games: Imperfect Information Goofspiel and Phantom Tic-Tac-Toe. We show that UCT after initial fast convergence towards a Nash equilibrium computes increasingly worse strategies after some point in time. This is not the case with Exp3 and RM, which also show superior performance in head-to-head matches.

Klasifikace

Druh
J<sub>x</sub> - Nezařazeno - Článek v odborném periodiku (Jimp, Jsc a Jost)
CEP obor
IN - Informatika
OECD FORD obor
—

Návaznosti výsledku

Projekt
<a href="/cs/project/GAP202%2F12%2F2054" target="_blank" >GAP202/12/2054: Bezpečnostní hry v extenzivní formě</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Ostatní

Rok uplatnění
2014
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Údaje specifické pro druh výsledku

Název periodika
Acta Polytechnica
ISSN
1210-2709
e-ISSN
—
Svazek periodika
45
Číslo periodika v rámci svazku
5
Stát vydavatele periodika
CZ - Česká republika
Počet stran výsledku
8
Strana od-do
333-340
Kód UT WoS článku
—
EID výsledku v databázi Scopus
—

Podobné výsledky(10)

Monte Carlo Tree Search in Simultaneous Move Games with Applications to Goofspiel Convergence of Monte Carlo Tree Search in Simultaneous Move Games Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization

Co hledáte?

Rychlé hledání

Chytré vyhledávání

ALTERNATIVE SELECTION FUNCTIONS FOR INFORMATION SET MONTE CARLO TREE SEARCH

Identifikátory výsledku

Alternativní jazyky

Klasifikace

Návaznosti výsledku

Ostatní

Údaje specifické pro druh výsledku

Podobné výsledky(10)

Co hledáte?

Rychlé hledání

Chytré vyhledávání

Popis výsledku

Identifikátory výsledku

Identifikátory výsledku

Alternativní jazyky

Alternativní jazyky

Klasifikace

Klasifikace

Návaznosti výsledku

Návaznosti výsledku

Ostatní

Ostatní

Údaje specifické pro druh výsledku

Údaje specifické pro druh výsledku

Podobné výsledky(10)