MDPs with Unawareness





Abstract: Markov decision processes (MDPs) are widely used for modeling decision-making problems in robotics, automated control, and economics. Traditional MDPs assume that the decision maker (DM) knows all states and actions. However, this may not be true in many situations of interest. We define a new framework, MDPs with unawareness (MDPUs), to deal with the possibility that a DM may not be aware of all possible actions. We provide a complete characterization of when a DM can learn to play near-optimally in an MDPU, and give an algorithm that learns to play near-optimally when it is possible to do so, as efficiently as possible. In particular, we characterize when a near-optimal solution can be found in polynomial time.



Authors: Joseph Y. Halpern, Nan Rong, Ashutosh Saxena

Source: https://arxiv.org/






