Journal of the Operations Research Society of China ›› 2025, Vol. 13 ›› Issue (2): 484-514. doi: 10.1007/s40305-023-00457-5


Double-Factored Decision Theory for Markov Decision Processes with Multiple Scenarios of the Parameters

Cheng-Jun Hou   

  1. Software Research and Data Science, Amazon Robotics, North Reading, MA 01864, USA
  • Received: 2022-03-16  Revised: 2022-12-20  Online: 2025-06-30  Published: 2025-07-07
  • Contact: Cheng-Jun Hou  E-mail: chengjun.hou@gmail.com
  • Supported by:
    This research was originally supported by the (United States) National Science Foundation (No. 1409214).

Abstract: This article proposes a double-factored decision theory for Markov decision processes whose parameters admit multiple scenarios. We introduce the scenario belief to describe the probability distribution over scenarios in the system, and the scenario expectation to formulate the expected total discounted reward of a policy. We establish a new framework, called the double-factored Markov decision process (DFMDP), in which the physical state and the scenario belief are shown to be the two factors that jointly serve as a sufficient statistic for the history of the decision process. Four classes of policies for finite horizon DFMDPs are studied, and it is shown that there exists a double-factored Markovian deterministic policy that is optimal among all policies. We also formulate infinite horizon DFMDPs and present their optimality equation. An exact solution method, called double-factored backward induction, is proposed for finite horizon DFMDPs. It is used to find optimal policies for numerical examples, and the resulting policies are compared with those derived from other methods in the related literature.
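As an illustrative sketch only (not the paper's implementation), the double-factored backward induction described in the abstract can be mocked up on a toy instance: the scenario belief is updated by Bayes' rule after each observed transition, and values are computed by recursion over (physical state, scenario belief) pairs. All state/action/scenario names and numerical parameters below are hypothetical.

```python
# Toy DFMDP sketch: 2 physical states, 2 actions, 2 parameter scenarios.
# All numbers are illustrative assumptions, not taken from the article.
S = [0, 1]          # physical states
A = [0, 1]          # actions
K = 2               # number of scenarios
gamma = 0.9         # discount factor
T = 3               # finite horizon

# P[k][s][a][s2]: transition probability s -> s2 under action a in scenario k
P = [
    [[[0.9, 0.1], [0.2, 0.8]], [[0.7, 0.3], [0.4, 0.6]]],
    [[[0.5, 0.5], [0.6, 0.4]], [[0.1, 0.9], [0.8, 0.2]]],
]
# R[k][s][a]: immediate reward under scenario k
R = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.5, 1.5], [2.0, 0.0]],
]

def belief_update(b, s, a, s2):
    """Bayes update of the scenario belief after observing s --a--> s2."""
    w = [b[k] * P[k][s][a][s2] for k in range(K)]
    z = sum(w)
    return tuple(x / z for x in w) if z > 0 else b

def backward_induction(t, s, b):
    """Optimal value and action at double-factored state (s, b), t stages to go."""
    if t == 0:
        return 0.0, None
    best_v, best_a = float("-inf"), None
    for a in A:
        # scenario expectation of the immediate reward
        v = sum(b[k] * R[k][s][a] for k in range(K))
        # scenario-averaged transition, then recurse on the updated belief
        for s2 in S:
            p = sum(b[k] * P[k][s][a][s2] for k in range(K))
            if p > 0:
                v2, _ = backward_induction(t - 1, s2, belief_update(b, s, a, s2))
                v += gamma * p * v2
        if v > best_v:
            best_v, best_a = v, a
    return best_v, best_a

v, a = backward_induction(T, 0, (0.5, 0.5))  # uniform initial scenario belief
print(v, a)
```

The key structural point mirrored here is that the recursion is indexed by the pair (physical state, scenario belief) rather than the physical state alone, so the belief reached along each sample path carries the history information.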

Key words: Dynamic programming, Markov decision process, Parameter uncertainty, Multiple scenarios of the parameters, Double-factored Markov decision process
