Risk-sensitive partially observable Markov decision processes as fully observable multivariate utility optimization problems

We provide a new algorithm for solving risk-sensitive partially observable Markov decision processes when the risk is modeled by a utility function and both the state space and the observation space are finite. The algorithm is based on the observation that the change of measure and the subsequent introduction of the information space, which are used for exponential utility functions, can in fact be extended to sums of exponentials if one introduces an extra vector parameter that tracks the expected accumulated cost corresponding to each exponential. Since every increasing function can be approximated by sums of exponentials on finite intervals, the method can be applied to essentially any utility function, with its complexity depending on the number of exponentials used.
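The role of the sum-of-exponentials form can be illustrated with a minimal sketch. It is not the algorithm of the report, only the linearity fact behind it: if U(x) = sum_i w_i * exp(lam_i * x), then E[U(C)] splits into one exponential moment per term, so one tracked quantity per exponential is enough. The function names, weights, rates and cost distribution below are hypothetical values chosen for illustration.

```python
import math

def sum_exp_utility(weights, rates):
    """Return U(x) = sum_i w_i * exp(lam_i * x)."""
    def U(x):
        return sum(w * math.exp(lam * x) for w, lam in zip(weights, rates))
    return U

def expected_utility_direct(U, outcomes):
    """E[U(C)] for a finite distribution given as (probability, cost) pairs."""
    return sum(p * U(c) for p, c in outcomes)

def expected_utility_by_terms(weights, rates, outcomes):
    """Same expectation, computed one exponential at a time."""
    # One exponential moment E[exp(lam_i * C)] per term of the utility.
    moments = [sum(p * math.exp(lam * c) for p, c in outcomes) for lam in rates]
    return sum(w * m for w, m in zip(weights, moments)), moments

# Hypothetical example data (assumptions, not taken from the report).
weights = [0.5, 1.5]
rates = [0.2, 1.0]
outcomes = [(0.3, 0.0), (0.5, 1.0), (0.2, 2.5)]  # (probability, accumulated cost)

U = sum_exp_utility(weights, rates)
direct = expected_utility_direct(U, outcomes)
by_terms, moments = expected_utility_by_terms(weights, rates, outcomes)
```

Both computations agree up to floating-point error, and the list `moments` has one entry per exponential, which mirrors why the number of exponentials drives the cost of the approach.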

Keywords

Markov decision processes, partial observability, risk sensitivity, utility function, sums of exponentials

Publication Type

Report

Version

publishedVersion

URI

https://oa.tib.eu/renate/handle/123456789/33318
https://doi.org/10.34657/32386

Collections

Mathematik
WIAS Preprints

License

This document may be downloaded, read, stored and printed for your own use within the limits of § 53 UrhG but it may not be distributed via the internet or passed on to external parties.
