Identifying and tracking switching, non-stationary opponents: a Bayesian approach

Hernandez-Leal, P; Taylor, ME; Rosman, Benjamin S; Sucar, LE; Munoz de Cote, E

dc.contributor.author	Hernandez-Leal, P
dc.contributor.author	Taylor, ME
dc.contributor.author	Rosman, Benjamin S
dc.contributor.author	Sucar, LE
dc.contributor.author	Munoz de Cote, E
dc.date.accessioned	2017-05-17T07:18:30Z
dc.date.available	2017-05-17T07:18:30Z
dc.date.issued	2016-02
dc.identifier.citation	Hernandez-Leal, P., Taylor, M.E., Rosman, B.S., Sucar, L.E. and Munoz de Cote, E. 2016. Identifying and tracking switching, non-stationary opponents: a Bayesian approach. Workshop on Multiagent Interaction without Prior Coordination (MIPC) at AAAI-16, 13 February 2016, Phoenix, Arizona USA, p. 560-566	en_US
dc.identifier.uri	https://www.aaai.org/ocs/index.php/WS/AAAIW16/paper/view/12584/12424
dc.identifier.uri	http://mipc.inf.ed.ac.uk/2016/papers/mipc2016_hernandezleal_etal.pdf
dc.identifier.uri	http://hdl.handle.net/10204/9091
dc.description	Workshop on Multiagent Interaction without Prior Coordination (MIPC) at AAAI-16, 13 February 2016, Phoenix, Arizona USA	en_US
dc.description.abstract	In many situations, agents are required to use a set of strategies (behaviors) and switch among them during the course of an interaction. This work focuses on the problem of recognizing the strategy used by an agent within a small number of interactions. We propose using a Bayesian framework to address this problem. Bayesian policy reuse (BPR) has been empirically shown to be efficient at correctly detecting the best policy to use from a library in sequential decision tasks. In this paper we extend BPR to adversarial settings, in particular, to opponents that switch from one stationary strategy to another. Our proposed extension enables learning new models in an online fashion when the learning agent detects that the current policies are not performing optimally. Experiments presented in repeated games show that our approach is capable of efficiently detecting opponent strategies and reacting quickly to behavior switches, thereby yielding better performance than state-of-the-art approaches in terms of average rewards.	en_US
dc.language.iso	en	en_US
dc.publisher	Association for the Advancement of Artificial Intelligence (AAAI)	en_US
dc.relation.ispartofseries	Worklist;16648
dc.subject	Policy reuse	en_US
dc.subject	Non-stationary opponents	en_US
dc.subject	Repeated games	en_US
dc.title	Identifying and tracking switching, non-stationary opponents: a Bayesian approach	en_US
dc.type	Conference Presentation	en_US
dc.identifier.apacitation	Hernandez-Leal, P., Taylor, M., Rosman, B. S., Sucar, L., & Munoz de Cote, E. (2016). Identifying and tracking switching, non-stationary opponents: a Bayesian approach. Association for the Advancement of Artificial Intelligence (AAAI). http://hdl.handle.net/10204/9091	en_ZA
dc.identifier.chicagocitation	Hernandez-Leal, P, ME Taylor, Benjamin S Rosman, LE Sucar, and E Munoz de Cote. "Identifying and tracking switching, non-stationary opponents: a Bayesian approach." (2016): http://hdl.handle.net/10204/9091	en_ZA
dc.identifier.vancouvercitation	Hernandez-Leal P, Taylor M, Rosman BS, Sucar L, Munoz de Cote E, Identifying and tracking switching, non-stationary opponents: a Bayesian approach; Association for the Advancement of Artificial Intelligence (AAAI); 2016. http://hdl.handle.net/10204/9091 .	en_ZA
dc.identifier.ris	TY - Conference Presentation AU - Hernandez-Leal, P AU - Taylor, ME AU - Rosman, Benjamin S AU - Sucar, LE AU - Munoz de Cote, E AB - In many situations, agents are required to use a set of strategies (behaviors) and switch among them during the course of an interaction. This work focuses on the problem of recognizing the strategy used by an agent within a small number of interactions. We propose using a Bayesian framework to address this problem. Bayesian policy reuse (BPR) has been empirically shown to be efficient at correctly detecting the best policy to use from a library in sequential decision tasks. In this paper we extend BPR to adversarial settings, in particular, to opponents that switch from one stationary strategy to another. Our proposed extension enables learning new models in an online fashion when the learning agent detects that the current policies are not performing optimally. Experiments presented in repeated games show that our approach is capable of efficiently detecting opponent strategies and reacting quickly to behavior switches, thereby yielding better performance than state-of-the-art approaches in terms of average rewards. DA - 2016-02 DB - ResearchSpace DP - CSIR KW - Policy reuse KW - Non-stationary opponents KW - Repeated games LK - https://researchspace.csir.co.za PY - 2016 T1 - Identifying and tracking switching, non-stationary opponents: a Bayesian approach TI - Identifying and tracking switching, non-stationary opponents: a Bayesian approach UR - http://hdl.handle.net/10204/9091 ER -	en_ZA

Files in this item

Name: Hernandez-Leal_20 ...

Size: 464.0Kb

Format: PDF

Description: Paper

This item appears in the following Collection(s)

Conference Publications
