costo - discounted stochastic control (Howard algorithm) (maxplus)
Solves the equation (1+a)V=D(HV) of the discounted stochastic control problem defined by
- the maxplus decision matrix D, whose entry in row i and column j is the gain obtained when decision j is taken in state i,
- the standard (not maxplus) rectangular stochastic matrix H, whose row j gives the transition probabilities to the next state once the control has chosen decision j.
The product HV is the ordinary matrix-vector product; D is then applied in the maxplus sense, that is, as a maximum over decisions of gain plus expected value.
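As a rough illustration of the right-hand side D(HV), here is a minimal plain-Python sketch. It is not the toolbox code: the name `bellman_rhs` and the list-of-lists layout are assumptions made for this example, and `float("-inf")` stands in for the maxplus zero (a forbidden decision).

```python
NEG_INF = float("-inf")  # assumed stand-in for the maxplus zero (forbidden decision)

def bellman_rhs(D, H, V):
    """Evaluate D(HV): ordinary product H*V, then maxplus product by D.

    D[i][j] : gain for taking decision j in state i (NEG_INF if forbidden)
    H[j][k] : probability of moving to state k once decision j is chosen
    V[k]    : current value of state k
    """
    # Ordinary matrix-vector product: expected value after each decision.
    HV = [sum(h * x for h, x in zip(row, V)) for row in H]
    # Maxplus matrix-vector product: best gain plus expected value per state.
    return [max(d + w for d, w in zip(row, HV)) for row in D]

# Data of the example on this page: 3 states, 2 decisions.
D = [[1, 1], [0, 0], [3, 3]]
H = [[1, 0, 0], [0, 0.5, 0.5]]
print(bellman_rhs(D, H, [1, 2, 4]))
```

With V = [1, 2, 4] the expected values are HV = [1, 3], so every state prefers the second decision.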
Steps are numbered from 0, and the gain of step n-1 is discounted by the factor 1/(1+a)^n.
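The discounting convention can be checked on a fixed sequence of gains. The following sketch is only an illustration (the helper `discounted_value` is a hypothetical name, not part of the toolbox): it weights the gain of step n by 1/(1+a)^(n+1) and verifies the one-step relation (1+a)V = first gain + value of the remaining steps.

```python
def discounted_value(gains, a):
    """Discounted sum of a gain sequence; step n (from 0) is weighted by 1/(1+a)**(n+1)."""
    return sum(g / (1.0 + a) ** (n + 1) for n, g in enumerate(gains))

a = 1.0
gains = [2.0, 4.0]
V = discounted_value(gains, a)          # 2/2 + 4/4
tail = discounted_value(gains[1:], a)   # value of the steps after the first one
# One-step relation behind the equation (1+a)V = D(HV):
print((1.0 + a) * V, gains[0] + tail)
```

For a constant gain g over a long horizon, a*V computed this way approaches g, which gives some intuition for why the product of a and V stays bounded for small a.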
The Howard algorithm is used.
costo returns v=aV rather than V, because aV remains finite when a goes to zero.
[v,p]=costo(D,H,0) returns in v the limit of aV as a goes to zero, which is also the optimal average cost per unit of time.
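For readers who want to see the idea behind the Howard algorithm (policy iteration), here is a minimal plain-Python sketch for the case a > 0 only. It is an independent illustration under stated assumptions, not the implementation used by costo: the names `solve_linear` and `howard_discounted` are hypothetical, decisions are numbered from 0, and the limit case a = 0 (average cost) is not handled. That the second returned value corresponds to the p returned by costo is also an assumption.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def howard_discounted(D, H, a, max_iter=100):
    """Policy iteration for (1+a)V = D(HV) with a > 0; returns (a*V, policy)."""
    n, m = len(D), len(H)
    # Initial policy: decision with the best immediate gain in each state.
    policy = [max(range(m), key=lambda j: D[i][j]) for i in range(n)]
    for _ in range(max_iter):
        # Policy evaluation: solve ((1+a)I - H_policy) V = gains of the policy.
        A = [[(1.0 + a) * (i == k) - H[policy[i]][k] for k in range(n)]
             for i in range(n)]
        g = [D[i][policy[i]] for i in range(n)]
        V = solve_linear(A, g)
        # Policy improvement: switch only when a decision is strictly better.
        HV = [sum(H[j][k] * V[k] for k in range(n)) for j in range(m)]
        new_policy = []
        for i in range(n):
            best = policy[i]
            for j in range(m):
                if D[i][j] + HV[j] > D[i][best] + HV[best] + 1e-9:
                    best = j
            new_policy.append(best)
        if new_policy == policy:
            break
        policy = new_policy
    return [a * x for x in V], policy

# Data of the example on this page, written as plain lists.
D = [[1, 1], [0, 0], [3, 3]]
H = [[1, 0, 0], [0, 0.5, 0.5]]
v, p = howard_discounted(D, H, 0.001)
print(v, p)
```

On this data the sketch selects the second decision in every state, and each component of v is close to 1.5, the average gain of alternating at random between the states with gains 0 and 3.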
D=sparse(#([1 1;0 0;3 3]));
H=sparse([1 0 0; 0 1/2 1/2 ]);
[v,p]=costo(D,H,0.001)
[v,p]=costo(D,H,0)
See also: karp, howard, semihoward