I am trying to implement the value iteration algorithm for a Markov Decision Process in Python. I have one implementation, but it is giving me many repeated values for the utilities. My transition matrix is quite sparse, and I suspect this is causing the problem, but I am not sure whether that assumption is correct. How should I fix this? The code may be pretty rough; I am very new to value iteration, so please help me identify problems with it. The reference code is this:
http://carlo-hamalainen.net/stuff/mdpnotes/. I have used the ipod_mdp.py code file. Here is the link to the snippet of my implementation:
http://stackoverflow.com/questions/27899682/repeating-utility-values-in-value-iteration-markov-decision-process

Thank you very much!
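To clarify what I am trying to do, here is a minimal sketch of the textbook value-iteration update (the Bellman optimality backup). The tiny 2-state, 2-action MDP below is made up purely for illustration; it is not my actual data and not the linked ipod_mdp.py code:

```python
GAMMA = 0.9      # discount factor
EPSILON = 1e-6   # convergence threshold

# P[s][a] is a list of (next_state, probability) pairs (sparse-friendly:
# only nonzero transitions are stored); R[s][a] is the immediate reward.
# This toy MDP is invented for illustration.
P = {
    0: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)],           1: [(1, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 2.0, 1: 0.0},
}

def value_iteration(P, R, gamma=GAMMA, eps=EPSILON):
    """Iterate V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    until the largest per-state change falls below eps."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

V = value_iteration(P, R)
print(V)
```

Note that with a sparse transition matrix, states that share the same reachable successors and rewards can legitimately converge to identical utilities, so some repetition is not necessarily a bug.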