【Title】The Restless Bandits Problem with Partially Observable Markov Decision Processes
【Speaker】Dr. Mimi Zhang, Trinity College Dublin
【Host】Dr. Yanfu Li
【Time】10:00-11:00am, April 12th, 2018 (Thursday)
【Venue】Room N412, Shunde Building, Tsinghua University
【Abstract】The restless bandits (RBs) problem can be interpreted as a system of independent Markov decision processes, with an additional constraint on the actions available to the system. In many problem domains, however, the decision maker has limited sensing capabilities, which prevent the exact Markovian state from being recovered from observations. The current work generalizes the RBs framework to the case in which the RBs' states are not perfectly known, so that their dynamics are modelled by partially observable Markov chains. The generalized framework is called the partially observable RBs problem. Two heuristic policies are proposed, based on two types of indices that are straightforward to calculate. In particular, one index is obtained by approximating the value function, and the other is obtained from an approximate linear program. The indexability of the partially observable RBs problem with respect to the two indices is investigated, and sufficient conditions are given. This work also introduces a particular partially observable RBs problem for which Whittle indexability always holds. The performance of the proposed heuristics is evaluated in a systematic computational study, which shows that both perform strongly.
 
【Short Bio】Mimi Zhang is an assistant professor in the School of Computer Science and Statistics at Trinity College Dublin (TCD). Dr. Zhang holds a B.Sc. in statistics from the University of Science and Technology of China and a Ph.D. in engineering management from City University of Hong Kong. Before joining TCD, she was a research associate at the University of Strathclyde and Imperial College London. Her main research areas include Markov decision processes, stochastic modelling, multivariate modelling, and data mining.