Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21230%2F14%3A00225029" target="_blank" >RIV/68407700:21230/14:00225029 - isvavai.cz</a>
Výsledek na webu
<a href="http://aaai.org/ocs/index.php/WS/AAAIW14/paper/view/8820/8353" target="_blank" >http://aaai.org/ocs/index.php/WS/AAAIW14/paper/view/8820/8353</a>
DOI - Digital Object Identifier
—
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization
Popis výsledku v původním jazyce
Online search in games has always been a core interest of artificial intelligence. Advances made in search for perfect information games (such as Chess, Checkers, Go, and Backgammon) have led to AI capable of defeating the world's top human experts. Search in imperfect information games (such as Poker, Bridge, and Skat) is significantly more challenging due to the complexities introduced by hidden information. In this paper, we present Online Outcome Sampling (OOS), the first imperfect information search algorithm that is guaranteed to converge to an equilibrium strategy in two-player zero-sum games. We show that OOS avoids common problems encountered by existing search algorithms and we experimentally evaluate its convergence rate and practical performance against benchmark strategies in Liar's Dice and a variant of Goofspiel. We show that unlike with Information Set Monte Carlo Tree Search (ISMCTS) the exploitability of the strategies produced by OOS decreases as the amount of search
Název v anglickém jazyce
Search in Imperfect Information Games Using Online Monte Carlo Counterfactual Regret Minimization
Popis výsledku anglicky
Online search in games has always been a core interest of artificial intelligence. Advances made in search for perfect information games (such as Chess, Checkers, Go, and Backgammon) have led to AI capable of defeating the world's top human experts. Search in imperfect information games (such as Poker, Bridge, and Skat) is significantly more challenging due to the complexities introduced by hidden information. In this paper, we present Online Outcome Sampling (OOS), the first imperfect information search algorithm that is guaranteed to converge to an equilibrium strategy in two-player zero-sum games. We show that OOS avoids common problems encountered by existing search algorithms and we experimentally evaluate its convergence rate and practical performance against benchmark strategies in Liar's Dice and a variant of Goofspiel. We show that unlike with Information Set Monte Carlo Tree Search (ISMCTS) the exploitability of the strategies produced by OOS decreases as the amount of search
Klasifikace
Druh
O - Ostatní výsledky
CEP obor
IN - Informatika
OECD FORD obor
—
Návaznosti výsledku
Projekt
<a href="/cs/project/GAP202%2F12%2F2054" target="_blank" >GAP202/12/2054: Bezpečnostní hry v extenzivní formě</a><br>
Návaznosti
P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)
Ostatní
Rok uplatnění
2014
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů