- This event has passed.
AMII AI Seminars: Two-player, Zero-sum games
March 26 @ 12:00 pm - 1:00 pm MDT
Free
Dustin Morrill, an Amii researcher at the University of Alberta, presents “Hindsight Rationality and Efficient Deviation Types in Extensive-Form Games”. In this talk, he proposes fielding learning algorithms that ensure strong performance in hindsight relative to “deviations” (pre-defined behaviour modifications) in two-player, zero-sum games.
Abstract: A successful approach to playing two-player, zero-sum games has been to deploy a static artifact resembling a Nash equilibrium, which has led work in artificial intelligence to focus on computing such artifacts. This approach is less sound and has been less successful in multi-player, general-sum games. We suggest instead to field learning algorithms that ensure strong performance in hindsight relative to “deviations”, i.e., pre-defined behavior modifications. A society of such “hindsight rational” agents converges toward mediated equilibrium, a traditional notion of equilibrium based on average correlated play rather than factored behavior, in contrast to Nash equilibrium. We re-examine deviation types and mediated equilibria in extensive-form games to gain a more complete understanding and resolve past misconceptions. We introduce a new deviation type that has implicitly formed the basis for the counterfactual regret minimization (CFR) algorithm. We generalize CFR to the extensive-form regret minimization (EFR) algorithm that is hindsight rational for any given deviation type within a broad and natural class. This class contains powerful new deviation types that are efficient to use in games with moderate lengths. We present an empirical analysis of EFR’s performance with different deviation types in common benchmark games, showing that stronger deviation types typically impart better performance, even in two-player, zero-sum games.
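For readers new to the idea of "strong performance in hindsight", here is a minimal sketch that is not taken from the talk. It runs regret matching in self-play on rock-paper-scissors and measures regret only against the simplest deviation type: switching to one fixed action throughout (external regret). CFR applies this kind of update at each decision point of an extensive-form game; the richer deviation types that EFR supports are not shown. The names `PAYOFF`, `regret_matching`, and `self_play`, and the choice of game, are illustrative assumptions.

```python
import numpy as np

# Row player's payoffs in rock-paper-scissors (rows/columns: rock, paper, scissors).
# The game is zero-sum, so the column player receives the negation.
PAYOFF = np.array([[0, -1, 1],
                   [1, 0, -1],
                   [-1, 1, 0]], dtype=float)


def regret_matching(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No positive regret: fall back to the uniform strategy.
    return np.full(len(cum_regret), 1.0 / len(cum_regret))


def self_play(iterations=10000):
    """Return each player's average strategy after regret-matching self-play."""
    # Asymmetric starting regrets (an arbitrary choice) so play does not
    # simply sit at the uniform strategy from the first step.
    regrets = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
    strategy_sums = [np.zeros(3), np.zeros(3)]
    for _ in range(iterations):
        row = regret_matching(regrets[0])
        col = regret_matching(regrets[1])
        # Expected value of each pure action against the opponent's current strategy.
        row_values = PAYOFF @ col
        col_values = -(row @ PAYOFF)
        # External regret: what always playing a fixed action would have
        # gained, in hindsight, over the strategy actually played.
        regrets[0] += row_values - row @ row_values
        regrets[1] += col_values - col @ col_values
        strategy_sums[0] += row
        strategy_sums[1] += col
    return [s / iterations for s in strategy_sums]


if __name__ == "__main__":
    avg_row, avg_col = self_play()
    print("average row strategy:", avg_row)
    print("average column strategy:", avg_col)
```

In a two-player, zero-sum game such as this one, low average regret for both players implies that their average strategies are close to a Nash equilibrium, which here is uniform play over the three actions.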
Presenter Bio: Dustin is a Ph.D. candidate at the University of Alberta and the Alberta Machine Intelligence Institute (Amii) working with Professor Michael Bowling. He works on multi-agent learning and scalable, dependable learning algorithms. He is a coauthor of [DeepStack] and he created [Cepheus’s public match interface]. He completed a B.Sc. and M.Sc. in computing science at the University of Alberta, where his M.Sc. was also supervised by Michael Bowling. As an undergraduate, he worked with the Computer Poker Research Group (CPRG) to create an [open-source web interface to play against poker bots] and to develop the 1st-place 3-player Kuhn poker entry in the 2014 Annual Computer Poker Competition (ACPC).