Controlled stochastic process
A stochastic process whose probabilistic characteristics may be changed (controlled) in the course of its evolution in pursuance of some objective, normally the minimization (maximization) of a functional (the control objective) representing the quality of the control. Various types of controlled processes arise, depending on how the process is specified or on the nature of the control objective. The greatest progress has been made in the theory of controlled jump (or stepwise) Markov processes and controlled diffusion processes when the complete evolution of the process is observed by the controller. A corresponding theory has also been developed in the case of partial observations (incomplete data).
Controlled jump Markov process.
This is a controlled stochastic process with continuous time and piecewise-constant trajectories whose infinitesimal characteristics are influenced by the choice of control. For the construction of such a process the following are usually specified (cf. [1], [2]): 1) a Borel set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260101.png" /> of states; 2) a Borel set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260102.png" /> of controls, a set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260103.png" /> of controls admissible when the process is in state <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260104.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260105.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260106.png" /> (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260107.png" /> denotes the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260108.png" />-algebra of Borel subsets of the Borel set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c0260109.png" />), and in some cases a measurable selector <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601010.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601011.png" />; 3) a jump measure in the form of a transition function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601012.png" /> defined for <img 
align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601013.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601014.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601015.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601016.png" />, such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601017.png" /> is a Borel function in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601018.png" /> for each <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601019.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601020.png" /> is countably additive in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601021.png" /> for each <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601022.png" />; moreover <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601023.png" /> is bounded, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601024.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601025.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601026.png" />. 
Roughly speaking, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601027.png" /> is the probability that the process jumps into the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601028.png" /> in the time interval <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601029.png" /> when <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601030.png" /> and control action <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601031.png" /> is applied.
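As an informal illustration of how such a jump measure drives the process (this sketch is not part of the article's formal construction, and the states, controls and rates below are hypothetical), a finite model under a fixed stationary strategy can be simulated by drawing exponential holding times whose intensity depends on the chosen control:

```python
import random

# Hypothetical finite model (not from the article): states {0, 1},
# controls {"a", "b"}.  RATES[x][u] is the total jump intensity out of
# state x under control u; JUMP_PROBS[x][u][y] is the probability of
# landing in y given that a jump occurs.
RATES = {0: {"a": 1.0, "b": 2.0}, 1: {"a": 0.5, "b": 1.5}}
JUMP_PROBS = {0: {"a": {1: 1.0}, "b": {1: 1.0}},
              1: {"a": {0: 1.0}, "b": {0: 1.0}}}

def simulate(strategy, x0, horizon, rng):
    """Simulate a right-continuous piecewise-constant trajectory up to
    `horizon` under a stationary strategy (a map state -> control)."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        u = strategy(x)
        dt = rng.expovariate(RATES[x][u])   # control-dependent holding time
        if t + dt > horizon:
            break
        t += dt
        targets, probs = zip(*JUMP_PROBS[x][u].items())
        x = rng.choices(targets, weights=probs)[0]  # next state after the jump
        path.append((t, x))
    return path
```

For example, `simulate(lambda x: "a", 0, 10.0, random.Random(0))` returns the jump times and states of one trajectory.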
Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601032.png" /> be the space of all piecewise-constant right-continuous functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601033.png" /> with values in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601034.png" />, let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601035.png" /> (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601036.png" />) be the minimal <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601037.png" />-algebra in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601038.png" /> with respect to which the function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601039.png" /> is measurable for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601040.png" /> (for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601041.png" />) and let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601042.png" />. 
Any function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601043.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601044.png" /> with values <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601045.png" /> that is progressively measurable relative to the family <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601047.png" /> is called a (natural) strategy (or control). From the definition it follows that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601048.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601049.png" />. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601050.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601051.png" /> is a Borel function on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601052.png" /> with the property <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601053.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601054.png" /> is said to be a Markov strategy or Markov control, and if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601055.png" />, it is a stationary strategy or stationary control. 
The classes of natural, Markov and stationary strategies are denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601056.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601057.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601058.png" />, respectively. In view of the possibility of making a measurable selection from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601059.png" />, the class <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601060.png" /> (and hence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601061.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601062.png" />) is not empty. 
If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601063.png" /> is bounded, then for any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601064.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601065.png" /> one can construct a unique probability measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601066.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601067.png" /> such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601068.png" /> and for any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601069.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601070.png" />,
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601071.png" /> | (1a) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601072.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601073.png" /> | (1b) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601074.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601075.png" /> is the first jump time after <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601076.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601077.png" /> is the minimal <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601078.png" />-algebra in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601079.png" /> containing <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601080.png" /> and relative to which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601081.png" /> is measurable, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601082.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601083.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601084.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601085.png" />. The stochastic process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601086.png" /> is a controlled jump Markov process. 
The control Markov property of a controlled jump Markov process means that from a known "present" <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601087.png" />, the "past" <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601088.png" /> enters the right-hand sides of (1a)–(1b) only through the strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601089.png" />. For an arbitrary strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601090.png" /> the process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601091.png" /> is, in general, not Markovian, but if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601092.png" />, then one has a Markov process, while if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601093.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601094.png" /> does not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601095.png" />, one has a homogeneous Markov process with jump measure equal to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601096.png" />. Control of the process consists in selecting a strategy from one of these classes.
A typical control problem is the maximization of a functional
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601097.png" /> | (2) |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601098.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c02601099.png" /> are bounded Borel functions on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010100.png" /> and on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010101.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010102.png" /> is a fixed number. By defining suitable functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010103.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010104.png" /> and introducing fictitious states, a wide class of functionals (containing terms of the form <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010105.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010106.png" /> is a jump moment, and allowing termination of the process) can be reduced to the form (2). The value function is defined as the function
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010107.png" /> | (3) |
A strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010108.png" /> is called <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010111.png" />-optimal if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010112.png" /> for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010113.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010118.png" />-a.e. <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010119.png" />-optimal if this is true for almost-all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010120.png" /> relative to the measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010121.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010122.png" />. A <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010123.png" />-optimal strategy is called optimal. 
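For a given strategy the functional (2) can be estimated by Monte Carlo over simulated trajectories. The following sketch does this for a hypothetical two-state model under a stationary strategy; all rates, rewards and names are illustrative assumptions, not taken from the article.

```python
import random

# Monte Carlo estimate of a reward functional of the form (2),
#   E[ integral_0^T r(X_t, u_t) dt + g(X_T) ],
# for a hypothetical two-state model under a stationary strategy.
RATES = {0: {"a": 1.0}, 1: {"a": 0.5}}   # jump intensity out of x under u
NEXT = {0: 1, 1: 0}                       # deterministic jump target
r = lambda x, u: 1.0 if x == 1 else 0.0   # running reward rate
g = lambda x: 0.0                         # terminal reward

def estimate_functional(strategy, x0, T, n_paths, rng):
    total = 0.0
    for _ in range(n_paths):
        t, x, acc = 0.0, x0, 0.0
        while True:
            u = strategy(x)
            dt = rng.expovariate(RATES[x][u])
            if t + dt >= T:
                acc += r(x, u) * (T - t)  # reward accrued up to the horizon
                break
            acc += r(x, u) * dt           # reward over the holding time
            t, x = t + dt, NEXT[x]
        total += acc + g(x)
    return total / n_paths
```

Here the functional reduces to the expected time spent in state 1 over the horizon; more paths give a tighter estimate at the usual Monte Carlo rate.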
Now, in the model described above the interval <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010124.png" /> is shortened to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010125.png" />, and the symbols <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010126.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010127.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010128.png" /> are used in the sense in which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010129.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010130.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010131.png" /> were used before. Considering the jumps of the process as the succession of steps in a controlled discrete-time Markov chain one can establish the existence of a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010132.png" />-a.e. <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010133.png" />-optimal strategy in the class <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010134.png" /> and obtain the measurability of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010135.png" /> in the form: <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010136.png" /> is an analytic set. 
This allows one to apply the ideas of dynamic programming and to derive the relation
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010137.png" /> | (4) |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010138.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010139.png" /> (a variant of Bellman's principle). For <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010140.png" /> one obtains from (4) and (1a)–(1b) the Bellman equation
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010141.png" /> | (5) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010142.png" /> |
The value function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010143.png" /> is the unique bounded function on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010144.png" /> that is absolutely continuous in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010145.png" /> and satisfies (5) together with the condition <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010146.png" />. Equation (5) may be solved by the method of successive approximations. For <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010147.png" /> it follows from the Kolmogorov equation for the Markov process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010148.png" /> that if the supremum in (5) is attained by a measurable function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010149.png" />, then the Markov strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010150.png" /> is optimal.
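In a finite model the Bellman equation becomes a system of ordinary differential equations in the time variable, one per state, which can be integrated backwards from the terminal condition. The sketch below approximates such a system by a crude explicit Euler scheme on a hypothetical two-state, two-control model; the rates, rewards and names are illustrative assumptions, not taken from the article.

```python
# Hypothetical finite model (not from the article): states 0..1, controls 0..1.
# Q[x][u][y] = jump intensity from x to y under control u (off-diagonal only);
# R[x][u] = running reward rate; G[x] = terminal reward.
Q = [[[0.0, 1.0], [0.0, 2.0]],
     [[0.5, 0.0], [1.5, 0.0]]]
R = [[1.0, 0.0], [0.0, 2.0]]
G = [0.0, 0.0]

def bellman_backward(T=1.0, steps=1000):
    """Integrate the finite-model analogue of the Bellman equation,
       -dv/dt(x) = max_u [ R[x][u] + sum_y Q[x][u][y] * (v(y) - v(x)) ],
    backwards from the terminal condition v(T, x) = G[x] by explicit Euler."""
    n = len(G)
    dt = T / steps
    v = list(G)
    for _ in range(steps):
        rhs = [max(R[x][u] + sum(Q[x][u][y] * (v[y] - v[x])
                                 for y in range(n) if y != x)
                   for u in range(len(R[x])))
               for x in range(n)]
        v = [v[x] + dt * rhs[x] for x in range(n)]  # step from t down to t - dt
    return v  # approximate value function at time 0
```

The control attaining the maximum in the bracket at each `(t, x)` yields a Markov strategy, matching the optimality statement above.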
In this way the existence of optimal Markov strategies in semi-continuous models (in which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010151.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010152.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010153.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010154.png" /> satisfy the compactness and continuity conditions of the definition) is established, in particular for finite models (with finite <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010155.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010156.png" />). In arbitrary Borel models one can conclude the existence of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010157.png" />-a.e. <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010158.png" />-optimal Markov strategies for any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010159.png" /> by using a measurable selection theorem (cf. Selection theorems). In countable models one obtains Markov <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010160.png" />-optimal strategies. 
The results can partly be extended to the case when <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010161.png" /> and the functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010162.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010163.png" /> are unbounded, but in general the sufficiency of Markov strategies, i.e. optimality of Markov strategies in the class <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010164.png" />, has not been proved.
For homogeneous models, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010165.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010166.png" /> do not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010167.png" />, one considers along with (2) the functionals
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010168.png" /> | (6) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010169.png" /> | (7) |
and moreover poses the question of sufficiency for the class <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010170.png" />. If <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010171.png" /> and the Borel function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010172.png" /> is bounded, then equation (5) for the functional (6) becomes
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010173.png" /> | (8) |
This equation coincides with Bellman's equation for the analogous discrete-time problem and has a unique bounded solution. If the supremum in (8) is attained for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010174.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010175.png" />, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010176.png" /> is optimal. Results on the existence of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010177.png" />-optimal strategies in the class <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010178.png" />, analogous to those mentioned above, can also be obtained. For the criterion (7) complete results have been obtained only for finite models and for special classes of ergodic controlled jump Markov processes, as in the analogous discrete-time cases: one can choose <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010179.png" /> and a function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010180.png" /> in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010181.png" /> such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010182.png" /> is optimal for the criterion (2) simultaneously for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010183.png" />, and hence optimal for the criterion (7).
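Since the discounted equation has the same form as the Bellman equation of a discrete-time problem, its unique bounded solution can be found by iterating the corresponding contraction mapping, and a stationary strategy attaining the maximum is optimal. A minimal sketch on a hypothetical two-state, two-action model (all numbers and names are illustrative assumptions):

```python
# Hypothetical discounted discrete-time model (not from the article):
# P[x][u][y] = transition probability, R[x][u] = one-step reward,
# BETA = discount factor in (0, 1).
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0], [0.0, 2.0]]
BETA = 0.9

def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate v <- Tv with (Tv)(x) = max_u [ R[x][u] + BETA * sum_y P[x][u][y] v(y) ].
    T is a BETA-contraction, so v converges to the unique bounded fixed point."""
    n = len(R)
    v = [0.0] * n
    for _ in range(max_iter):
        w = [max(R[x][u] + BETA * sum(P[x][u][y] * v[y] for y in range(n))
                 for u in range(len(R[x])))
             for x in range(n)]
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w
        v = w
    return v

def greedy_strategy(v):
    """A stationary strategy attaining the maximum in the fixed-point equation."""
    n = len(R)
    return [max(range(len(R[x])),
                key=lambda u: R[x][u] + BETA * sum(P[x][u][y] * v[y]
                                                   for y in range(n)))
            for x in range(n)]
```

Successive approximation converges geometrically at rate `BETA`, which is why the discounted case is the best understood.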
Controlled diffusion process.
This is a continuous controlled random process in a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010184.png" />-dimensional Euclidean space <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010185.png" />, admitting a stochastic differential with respect to a certain Wiener process which enters exogenously. The theory of controlled diffusion processes arose as a generalization of the theory of controlled deterministic systems represented by equations of the form <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010186.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010187.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010188.png" /> is the state of the system and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010189.png" /> is the control parameter.
For a formal description of controlled diffusion processes one uses the language of Itô stochastic differential equations. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010190.png" /> be a complete probability space and let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010191.png" /> be an increasing family of complete <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010192.png" />-algebras <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010193.png" /> contained in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010194.png" />. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010195.png" /> be a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010196.png" />-dimensional Wiener process relative to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010197.png" />, defined on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010198.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010199.png" /> (i.e. 
a process for which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010200.png" /> is the <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010201.png" />-dimensional continuous standard Wiener process for each <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010202.png" />, the processes <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010203.png" /> are independent, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010204.png" /> is <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010205.png" />-measurable for each <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010206.png" />, and for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010207.png" /> the random variables <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010208.png" /> are independent of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010209.png" />). Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010210.png" /> be a separable metric space. 
For <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010211.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010212.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010213.png" /> two functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010214.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010215.png" /> are assumed to be given, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010216.png" /> is a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010217.png" />-dimensional matrix and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010218.png" /> is a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010219.png" />-dimensional vector. 
Assume that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010220.png" /> are Borel functions in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010221.png" /> satisfying a Lipschitz condition for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010222.png" /> with constants not depending on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010223.png" /> and such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010224.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010225.png" /> are bounded. An arbitrary process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010226.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010227.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010228.png" />, progressively measurable relative to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010229.png" /> and taking values in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010230.png" />, is called a strategy (or control); <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010231.png" /> denotes the set of all strategies. 
For every <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010232.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010233.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010234.png" />, there exists a unique solution of the Itô stochastic differential equation
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010235.png" /> | (9) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010236.png" /> |
(Itô's theorem). This solution, denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010237.png" />, is called a controlled diffusion process (controlled process of diffusion type); it is controlled by selection of the strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010238.png" />. Besides strategies in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010239.png" /> one can consider other classes of strategies. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010240.png" /> be the space of continuous functions on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010241.png" /> with values in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010242.png" />. The semi-axis <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010243.png" /> may be interpreted as the set of values of the time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010244.png" />. Elements of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010245.png" /> are denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010246.png" />. 
Further, let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010247.png" /> be the smallest <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010248.png" />-algebra of subsets of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010249.png" /> relative to which the coordinate functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010250.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010251.png" /> in the space <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010252.png" /> are measurable. A function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010253.png" /> with values in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010254.png" /> is called a natural strategy, or natural control, admissible at the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010255.png" /> if it is progressively measurable relative to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010256.png" /> and if for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010257.png" /> there exists at least one solution of the equation (9) that is progressively measurable relative to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010258.png" />. 
The set of all natural strategies admissible at <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010259.png" /> is denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010260.png" />; its subset consisting of all natural strategies of the form <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010261.png" /> is denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010262.png" /> and is called the set of Markov strategies, or Markov controls, admissible at the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010263.png" />. One can say that a natural strategy defines a control at the moment of time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010264.png" /> on the basis of observations of the process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010265.png" /> on the time interval <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010266.png" />, while a Markov strategy defines a control on the basis of observations of the process only at the moment of time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010267.png" />. For <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010268.png" /> (even for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010269.png" />) the solution of (9) need not be unique. 
Therefore, for every <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010270.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010271.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010272.png" />, one arbitrarily fixes some solution of (9) and denotes it by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010273.png" />.
Then, using the formula <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010274.png" />, one defines an imbedding <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010275.png" /> for which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010276.png" /> (a.e.).
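The construction of a controlled process of this type can be illustrated numerically. The following is a minimal sketch (assuming a one-dimensional state, with the concrete strategy and coefficient functions invented for illustration and not taken from the article) of simulating a solution of an equation of type (9) under a Markov strategy by the Euler–Maruyama scheme:

```python
import numpy as np

def euler_maruyama(alpha, b, sigma, x0, T=1.0, n=1000, seed=0):
    """Simulate dX = b(alpha(t, X), X) dt + sigma(alpha(t, X), X) dW
    under a Markov strategy alpha by the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        t = k * dt
        a = alpha(t, x[k])                 # Markov control: depends on (t, x_t) only
        dw = rng.normal(0.0, np.sqrt(dt))  # Wiener increment over [t, t + dt]
        x[k + 1] = x[k] + b(a, x[k]) * dt + sigma(a, x[k]) * dw
    return x

# hypothetical example: the strategy pushes the state toward the origin
path = euler_maruyama(alpha=lambda t, x: -np.sign(x),
                      b=lambda a, x: a,
                      sigma=lambda a, x: 1.0,
                      x0=2.0)
```

A natural (non-Markov) strategy would instead be a functional of the whole observed path up to time `t`; the sketch above covers only the Markov case.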
The aim of the control is normally to maximize or minimize the expectation of some functional of the trajectory <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010277.png" />. A general formulation is as follows. On <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010278.png" /> let Borel functions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010279.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010280.png" /> be defined, and let a Borel function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010281.png" /> be defined on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010282.png" />. For <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010283.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010284.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010285.png" /> one denotes by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010286.png" /> the first exit time of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010287.png" /> from <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010288.png" />, and puts
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010289.png" /> | (10) |
where the indices <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010290.png" /> attached to the expectation sign indicate that they are to be inserted under the expectation sign as needed. This leads to the problem of determining a strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010291.png" /> maximizing <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010292.png" />, and of determining the value function
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010293.png" /> | (11) |
A strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010294.png" /> for which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010295.png" /> is called <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010298.png" />-optimal for the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010299.png" />. An optimal strategy is a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010300.png" />-optimal strategy. If in (11) the set <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010301.png" /> is replaced by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010302.png" /> (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010303.png" />), then the corresponding least upper bound is denoted by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010304.png" /> (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010305.png" />). Since one has the inclusion <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010306.png" />, it follows that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010307.png" />. Under reasonably wide assumptions (cf. 
[3]) it is known that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010308.png" /> (this is so if, e.g., <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010309.png" /> are continuous in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010310.png" />, continuous in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010311.png" /> uniformly in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010312.png" /> for every <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010313.png" /> and if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010314.png" /> are bounded in absolute value by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010315.png" /> for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010316.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010317.png" /> do not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010318.png" />). The question of the equality <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010319.png" /> in general situations is still open. A formal application of the ideas of dynamic programming reduces this to the so-called Bellman principle:
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010320.png" /> | (12) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010321.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010322.png" /> are arbitrarily defined stopping times (cf. Markov moment) not exceeding <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010323.png" />. If in (12) one replaces <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010324.png" /> by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010325.png" />, and applies Itô's formula to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010326.png" />, then after some non-rigorous arguments one arrives at the Bellman equation:
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010327.png" /> | (13) |
where
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010328.png" /> | (14) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010329.png" /> |
and where the indices <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010330.png" /> are assumed to be summed from 1 to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010331.png" />; the matrix <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010332.png" /> is defined by
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010333.png" /> |
Bellman's equation plays a central role in the theory of controlled diffusion processes, since it often turns out that a sufficiently "good" solution of it, equal to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010334.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010335.png" />, is the value function, while if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010336.png" /> for every <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010337.png" /> realizes the least upper bound in (13) and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010338.png" /> is a Markov strategy admissible at <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010339.png" />, then the strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010340.png" /> is optimal at the point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010341.png" />. Thus one can sometimes show that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010342.png" />.
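When a sufficiently "good" solution of the Bellman equation is available only numerically, on a grid, the maximizing control of (13) can still be extracted pointwise. The sketch below (one-dimensional state; the grid, the finite control set and the coefficient functions `b`, `sigma`, `f` are all illustrative assumptions) approximates the derivatives of the candidate value function by finite differences and picks, at each grid point, a control realizing the least upper bound:

```python
import numpy as np

def extract_markov_strategy(v, xs, controls, b, sigma, f):
    """At each grid point pick the control maximizing the expression under
    the sup in the Bellman equation, with v_x and v_{xx} approximated by
    finite differences.  A sketch only: v is a 1-D array of values of a
    candidate value function at a fixed time on the grid xs."""
    dx = xs[1] - xs[0]
    vx = np.gradient(v, dx)       # first derivative of v
    vxx = np.gradient(vx, dx)     # second derivative of v
    best = np.empty(len(xs))
    for i, x in enumerate(xs):
        # candidates: (sigma^2/2) v_xx + b v_x + f, one per control
        vals = [0.5 * sigma(a, x) ** 2 * vxx[i] + b(a, x) * vx[i] + f(a, x)
                for a in controls]
        best[i] = controls[int(np.argmax(vals))]
    return best

# hypothetical example: v(x) = x^2, drift b(a, x) = a, unit diffusion,
# zero running reward; the extracted control is sign(x) away from the origin
xs = np.linspace(-2.0, 2.0, 41)
strategy = extract_markov_strategy(xs ** 2, xs, [-1.0, 1.0],
                                   b=lambda a, x: a,
                                   sigma=lambda a, x: 1.0,
                                   f=lambda a, x: 0.0)
```

If the resulting function of `(t, x)` is an admissible Markov strategy, the statement quoted above says it is optimal at the corresponding point.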
A rigorous proof of such results meets with serious difficulties, connected with the non-linear character of equation (13), which in general is a non-linear degenerate parabolic equation. The simplest case is that in which (13) is a non-degenerate quasi-linear equation (the matrix <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010343.png" /> does not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010344.png" /> and is uniformly non-degenerate in <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010345.png" />). Here, under certain additional restrictions on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010346.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010347.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010348.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010349.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010350.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010351.png" /> one can make use of results from the theory of quasi-linear parabolic equations to prove the solvability of (13) in Hölder classes of functions and to give a method for constructing <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010352.png" />-optimal strategies, based on a solution of (13). An analogous approach can be used (cf. 
[3]) in the one-dimensional case when <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010353.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010354.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010355.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010356.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010357.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010358.png" /> are bounded and do not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010359.png" />, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010360.png" /> is uniformly bounded away from zero. In this case (13) reduces to a second-order quasi-linear equation on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010361.png" />, such that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010362.png" /> and (13) can be solved for its highest derivative <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010363.png" />. 
Methods of the theory of differential equations help in the study of (13) even if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010364.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010365.png" /> is a two-dimensional domain, and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010366.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010367.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010368.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010369.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010370.png" /> do not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010371.png" /> (cf. [3]). Here, as in previous cases, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010372.png" /> is allowed to depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010373.png" />. It is relevant also to mention the case of the Hamilton–Jacobi equation <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010374.png" />, which may be studied by methods of the theory of differential equations (cf. [5]).
By methods of the theory of stochastic processes one can show that the value function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010375.png" /> satisfies equation (13) in more general cases under certain types of smoothness assumptions on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010376.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010377.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010378.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010379.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010380.png" /> if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010381.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010382.png" /> (cf. [3]).
Along with problems of controlled motion, one can also consider optimal stopping of the controlled processes for one or two persons, e.g. maximization over <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010383.png" /> and an arbitrary stopping time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010384.png" /> of a value functional of the form:
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010385.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010386.png" /> |
Related to the theory of controlled diffusion processes are controlled partially-observable processes and problems of control of stochastic processes, in which the control is realizable by the selection of a measure on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010387.png" /> from a given class of measures, corresponding to processes of diffusion type (cf. [3], [4], [6], [7], [8]).
References
| [1] | I.I. Gikhman, A.V. Skorokhod, "Controlled stochastic processes" , Springer (1977) (Translated from Russian) |
| [2] | A.A. Yushkevich, "Controlled jump Markov models" Theory Probab. Appl. , 25 (1980) pp. 244–266; Teor. Veroyatnost. i Primenen. , 25 (1980) pp. 247–270 |
| [3] | N.V. Krylov, "Controlled diffusion processes" , Springer (1980) (Translated from Russian) |
| [4] | W.H. Fleming, R.W. Rishel, "Deterministic and stochastic optimal control" , Springer (1975) |
| [5] | S.N. Kruzhkov, "Generalized solutions of the Hamilton–Jacobi equations of eikonal type. I. Statement of the problem, existence, uniqueness and stability theorems, some properties of the solutions" Mat. Sb. , 98 : 3 (1975) pp. 450–493 (In Russian) |
| [6] | R.S. Liptser, A.N. Shiryaev, "Statistics of random processes" , 1–2 , Springer (1977–1978) (Translated from Russian) |
| [7] | W.M. Wonham, "On the separation theorem of stochastic control" SIAM J. Control , 6 (1968) pp. 312–326 |
| [8] | M.H.A. Davis, "The separation principle in stochastic control via Girsanov solutions" SIAM J. Control and Optimization , 14 (1976) pp. 176–188 |
Comments
The Bellman equations mentioned above (equations (5), (13)) are in this form sometimes called Bellman–Hamilton–Jacobi equations.
A controlled diffusion process is also defined as a controlled random process, in some Euclidean space, whose measure admits a Radon–Nikodým derivative with respect to the measure of a certain Wiener process that is independent of the control.
There are many important topics in the theory of controlled stochastic processes other than those mentioned above. Controlled stepwise (jump) processes are of limited interest due to the lack of important applications. The following comments are intended to put the subject in a wider perspective and to point out some recent technical innovations.
Controlled processes in discrete time.
These are normally specified by a state transition equation
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010388.png" /> | (a1) |
Here <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010389.png" /> is the state at time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010390.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010391.png" /> is the control and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010392.png" /> is a given sequence of independent, identically distributed random variables with common distribution function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010393.png" />. If the initial state <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010394.png" /> is independent of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010395.png" /> and the control <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010396.png" /> is Markovian, i.e. <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010397.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010398.png" />, then the process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010399.png" /> defined by
(a1) is Markovian. The control objective is normally to minimize a cost function such as
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010400.png" /> |
The number <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010401.png" /> of stages may be finite or infinite; <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010402.png" /> is the discount factor. The one-stage cost with terminal cost <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010403.png" /> is
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010404.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010405.png" /> |
Define
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010406.png" /> |
then the principle of dynamic programming indicates that
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010407.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010408.png" />, etc., and that the optimal control <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010409.png" /> is the value such that
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010410.png" /> |
In the infinite horizon case (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010411.png" />) one expects that if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010412.png" />, then
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010413.png" /> |
and that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010414.png" /> will satisfy Bellman's functional equation
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010415.png" /> |
The general theory of discrete-time control concerns conditions under which results of the type above can be rigorously substantiated. Generally, contraction (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010416.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010417.png" />) or monotonicity (<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010418.png" />) conditions are required. <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010419.png" /> is not necessarily measurable if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010420.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010421.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010422.png" /> are merely Borel functions. However, if these functions are lower semi-analytic, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010423.png" /> is lower semi-analytic and existence of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010424.png" />-optimal universally measurable policies can be proved. [a1], [a2] are excellent references for this theory.
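Under the contraction condition, Bellman's functional equation can be solved by successive approximation. The following toy sketch illustrates this for a finite state and control space (the two-state "keep/repair" model, its cost vectors and transition matrices are invented for illustration):

```python
import numpy as np

def value_iteration(cost, P, beta=0.9, tol=1e-10, max_iter=10_000):
    """Successive approximation V_{k+1} = T V_k for the Bellman operator
    (T V)(x) = min_u [ c(x, u) + beta * sum_y P_u(x, y) V(y) ],
    a contraction with modulus beta < 1 on bounded functions.
    cost[u] : one-stage cost vector, P[u] : transition matrix for control u."""
    V = np.zeros(P[0].shape[0])
    for _ in range(max_iter):
        Q = np.array([cost[u] + beta * P[u] @ V for u in range(len(P))])
        V_new = Q.min(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    Q = np.array([cost[u] + beta * P[u] @ V for u in range(len(P))])
    return V, Q.argmin(axis=0)

# hypothetical two-state machine: state 0 = "good", state 1 = "bad";
# control 0 = "keep" (free while good, costly while bad), control 1 = "repair"
cost = [np.array([0.0, 2.0]), np.array([1.0, 1.0])]
P = [np.array([[0.9, 0.1], [0.0, 1.0]]),   # keep: good may degrade, bad stays bad
     np.array([[1.0, 0.0], [1.0, 0.0]])]   # repair: always returns to good
V, policy = value_iteration(cost, P)
```

In this toy model the computed policy keeps the machine while it is good and repairs it once it is bad, and the limit `V` satisfies Bellman's functional equation to within the tolerance.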
Viscosity solutions of the Bellman equations.
Return to the controlled diffusion problem (9), (10) and write the Bellman equation (a1) as
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010425.png" /> | (a2) |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010426.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010427.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010428.png" /> coincides with the left-hand side of (13). As pointed out in the main article, it is a difficult matter to decide in which sense, if any, the value function, defined by (12), satisfies (a2). The concept of viscosity solutions of the Bellman equation, introduced for first-order equations in [a3], provides an answer to this question. A function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010429.png" /> is a viscosity solution of (a2) if for all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010430.png" />,
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010431.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010432.png" /> |
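Since the two displayed conditions are rendered only as images, the standard form of the definition may be worth restating; the version below follows the usual convention of [a3] (sign conventions vary between authors), writing (a2) abstractly as $F(x, v, Dv, D^{2}v) = 0$:

```latex
% Standard definition of a viscosity solution (after [a3]);
% sign conventions differ between authors.
\text{A function } v \in C(\mathbf{R}^{n}) \text{ is a viscosity solution if,
for every } \varphi \in C^{2}(\mathbf{R}^{n}):
\begin{aligned}
F\bigl(x_{0}, v(x_{0}), D\varphi(x_{0}), D^{2}\varphi(x_{0})\bigr) &\le 0
  &&\text{at each local maximum } x_{0} \text{ of } v - \varphi,\\
F\bigl(x_{0}, v(x_{0}), D\varphi(x_{0}), D^{2}\varphi(x_{0})\bigr) &\ge 0
  &&\text{at each local minimum } x_{0} \text{ of } v - \varphi.
\end{aligned}
```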
Note that any <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010433.png" /> solution of (a2) is a viscosity solution and that if a viscosity solution is <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010434.png" /> at some point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010435.png" />, then (a2) is satisfied at <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010436.png" />. It is possible to show in great generality that if the value function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010437.png" /> of (12) is continuous, then it is a viscosity solution of (a1). [a4a], [a4b] can be consulted for a proof of this result, conditions under which <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010438.png" /> is continuous, and other results on uniqueness and regularity of viscosity solutions.
A probabilistic approach.
The theory of controlled diffusion is intimately connected with partial differential equations. However the most general results on existence of optimal controls and on stochastic maximum principles (see below) can be obtained by purely probabilistic methods. This is described below for the diffusion model (9) where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010439.png" /> does not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010440.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010441.png" /> is uniformly positive definite. In this case a weak solution of (9) can be defined for any feedback control <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010442.png" />; denote by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010443.png" /> the set of such controls and by <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010444.png" /> the expectation with respect to the sample space measure for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010445.png" /> when <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010446.png" />. Suppose that the pay-off (see also Gain function) to be maximized is
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010447.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010448.png" /> is a fixed time. Define
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010449.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010450.png" />. Let a scalar process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010451.png" /> be defined by
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010452.png" /> |
Thus, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010453.png" /> is the maximal expected total pay-off given the control chosen and the evolution of the process up to time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010454.png" />. It is possible to show that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010455.png" /> is always a supermartingale (cf. Martingale) and that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010456.png" /> is a martingale if and only if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010457.png" /> is optimal. <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010458.png" /> has the Doob–Meyer decomposition <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010459.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010460.png" /> is a martingale and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010461.png" /> is a continuous increasing process. Thus <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010462.png" /> is optimal if and only if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010463.png" />. By the martingale representation theorem (cf. Martingale), <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010464.png" /> can always be written in the form
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010465.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010466.png" /> is the Wiener process appearing in the weak solution of (9) with control <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010467.png" />. It is easily shown that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010468.png" /> does not depend on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010469.png" /> and that the relation between <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010470.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010471.png" /> for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010472.png" /> is
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010473.png" /> |
where
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010474.png" /> |
This immediately gives a maximum principle: if <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010475.png" /> is optimal, then <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010476.png" />; but <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010477.png" /> is increasing, so it must be the case that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010478.png" /> a.e., which implies that
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010479.png" /> | (a3) |
One also gets an existence theorem: Since <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010480.png" /> is the same for all controls one can construct an optimal control <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010481.png" /> by taking
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010482.png" /> |
Similar techniques can be applied to very general classes of controlled stochastic differential systems (not just controlled diffusion) and to optimal stopping and impulse control problems (see below). General references are [a5], [a6]. Some of this theory has also been developed using methods of non-standard analysis [a7].
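The supermartingale/martingale characterization of optimality has an elementary discrete-time analogue that can be checked numerically. In the sketch below (a hypothetical finite-horizon, two-state, two-action chain, not the diffusion model (9)), the value process is computed by backward induction; the conditional expectation of the next value never exceeds the current value under any action, with equality precisely for maximizing actions:

```python
import numpy as np

# Hypothetical finite-horizon controlled chain (illustrative data only).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],    # P[a, x, y] = transition prob.
              [[0.5, 0.5], [0.7, 0.3]]])
g = np.array([1.0, 3.0])                   # terminal pay-off g(x_T)
T = 3

# Backward dynamic programming: V[t, x] = max_a E[V_{t+1} | x_t = x, a].
V = np.zeros((T + 1, 2))
V[T] = g
for t in range(T - 1, -1, -1):
    V[t] = (P @ V[t + 1]).max(axis=0)

# gaps[t, a, x] = V[t, x] - E[V_{t+1} | x_t = x, a]; the discrete analogue of
# the supermartingale property is gaps >= 0, with equality (martingale)
# exactly for the maximizing action.
gaps = np.array([[V[t] - P[a] @ V[t + 1] for a in range(2)]
                 for t in range(T)])
```

The increasing process in the Doob–Meyer decomposition corresponds here to the accumulated positive gaps along a trajectory; it vanishes if and only if maximizing actions are used throughout.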
Stochastic maximum principle.
The necessary condition (a3) is not as it stands a true maximum principle because the "adjoint variable" <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010483.png" /> is only implicitly characterized. It is shown in [a8] and elsewhere that under wide conditions <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010484.png" /> is given by
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010485.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010486.png" /> is an optimal control and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010487.png" /> is the fundamental solution of the linearized or derivative system corresponding to (9) with control <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010488.png" />, i.e. it satisfies
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010489.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010490.png" /> |
This gives the stochastic maximum principle in a form which is directly analogous to the Pontryagin maximum principle of deterministic optimal control theory.
Impulse control.
In many important applications, control is not exercised continuously, but rather a sequence of "interventions" is made at isolated instants of time. The theory of impulse control is a mathematical formulation of this kind of problem. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010491.png" /> be a homogeneous Markov process on a state space <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010492.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010493.png" /> (the set of right-continuous <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010494.png" />-valued functions having limits from the left). Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010495.png" /> be the corresponding semi-group: <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010496.png" />. Informally, a controlled process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010497.png" /> is defined as follows. A strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010498.png" /> is a sequence <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010499.png" /> of random times <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010500.png" /> and states <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010501.png" />, with <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010502.png" /> strictly increasing.
<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010503.png" /> starts at some fixed point <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010504.png" /> and follows a realization of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010505.png" /> up to time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010506.png" />. At <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010507.png" /> the position of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010508.png" /> is moved to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010509.png" />, then follows a realization of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010510.png" /> starting at <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010511.png" /> up to time <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010512.png" />; etc. A filtered probability space <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010513.png" /> carrying <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010514.png" /> is constructed in such a way that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010515.png" /> is adapted to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010516.png" /> (cf. 
Optional random process) and for each <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010517.png" />, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010518.png" /> is a <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010519.png" /> stopping time and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010520.png" /> is <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010521.png" />-measurable. It is convenient to formulate the optimization problem in terms of minimizing a cost function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010522.png" />, which generally takes the form
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010523.png" /> |
Suppose that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010524.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010525.png" />; this rules out strategies having more than a finite number of interventions in bounded time intervals. The value function is
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010526.png" /> |
Define the operator <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010527.png" /> by
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010528.png" /> |
When <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010529.png" /> is compact and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010530.png" /> is a Feller process it is possible to show [a9] that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010531.png" /> is continuous and that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010532.png" /> is the largest continuous function satisfying
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010533.png" /> | (a4) |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010534.png" /> | (a5) |
The optimal strategy <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010535.png" /> is:
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010536.png" /> |
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010537.png" /> |
Thus, the state space <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010538.png" /> divides into a continuation set, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010539.png" />, and an intervention set, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010540.png" />. Further, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010541.png" /> is the unique solution of
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010542.png" /> | (a6) |
where the infimum is taken over the set of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010543.png" /> stopping times <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010544.png" />. This shows the close connection between impulse control and optimal stopping: (a6) is an optimal stopping problem for the process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010545.png" /> with implicit obstacle <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010546.png" />. Similar results are obtained for right processes in [a6], [a10]; the measurability properties here are more delicate. There is also a well-developed analytic theory of impulse control. Assuming <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010547.png" />, where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010548.png" /> is the differential generator of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010549.png" />, one obtains from (a5)
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010550.png" /> | (a7) |
Further, equality holds in at least one of (a4), (a7) at each <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010551.png" />, i.e.
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010552.png" /> | (a8) |
Equations (a4), (a7), (a8) characterize <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010553.png" /> and have been extensively studied for diffusion processes (i.e. when <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010554.png" /> is a second-order differential operator) using the method of quasi-variational inequalities [a11]. Existence and regularity properties are obtained.
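In a finite-state, discounted discrete-time analogue, the system (a4), (a7), (a8) collapses to the fixed-point relation u = min(f + βPu, Mu) with (Mu)(x) = K + min_y u(y), which can be solved by successive approximation. A sketch under these assumptions (all data hypothetical; the article's setting is continuous time):

```python
import numpy as np

# Hypothetical discrete-state impulse control problem: at each step either
# continue (pay f(x), move by P, discount beta) or intervene (pay fixed cost
# K and jump instantaneously to the cheapest state).
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.4, 0.6]])            # uncontrolled transition matrix
f = np.array([0.2, 1.0, 5.0])              # running cost f(x)
K = 1.5                                    # fixed intervention cost
beta = 0.9                                 # discount factor

u = np.zeros(3)
for _ in range(1000):
    Mu = K + u.min()                       # (Mu)(x) = K + min_y u(y)
    u = np.minimum(f + beta * (P @ u), Mu) # QVI fixed-point iteration

# Intervention set: states where u = Mu, i.e. acting at once is optimal.
intervention_set = np.isclose(u, K + u.min())
```

The converged u splits the state space into a continuation set, where u < Mu, and an intervention set, where u = Mu, mirroring (a8).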
Control of applied non-diffusion models.
Many applied problems in operations research — for example in queueing systems or inventory control — involve optimization of non-diffusion stochastic models. These are generalizations of the jump process described in the main article allowing for non-constant trajectories between jumps and for various sorts of boundary behaviour. There have been various attempts to create a unified theory for such problems: piecewise-deterministic Markov processes [a12], Markov decision drift processes [a13], [a14]. Both continuous and impulse control are studied, as well as discretization methods and computational techniques.
Control of partially-observed processes.
This subject is still far from completely understood, despite important recent advances. It is closely related to the theory of non-linear filtering. Consider a controlled diffusion as in (9), where control must be based on observations of a scalar process <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010555.png" /> given by
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010556.png" /> | (a9) |
(<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010557.png" /> is another independent Wiener process and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010558.png" /> is, say, bounded), with a pay-off functional of the form
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010559.png" /> |
to be maximized. This problem can be formulated in the following way. Let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010560.png" /> be independent Wiener processes on some probability space <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010561.png" /> and let <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010562.png" /> be the natural filtration of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010563.png" />. The admissible controls <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010564.png" /> are all <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010565.png" />-valued processes <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010566.png" /> adapted to <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010567.png" />. Under standard conditions (9) has a unique strong solution for <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010568.png" />. Now define a measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010569.png" /> on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010570.png" /> by
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010571.png" /> |
By Girsanov's theorem, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010572.png" /> is a probability measure and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010573.png" /> is a Wiener process under the measure <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010574.png" />. Thus <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010575.png" /> satisfy (9), (a9) on <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010576.png" />. For any function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010577.png" />, put <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010578.png" />. According to the Kallianpur–Striebel formula, <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010579.png" /> where 1 denotes the function <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010580.png" /> and
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010581.png" /> |
<img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010582.png" /> can be thought of as a non-normalized conditional distribution of <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010583.png" /> given <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010584.png" />; it satisfies the Zakai equation
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010585.png" /> | (a10) |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010586.png" /> is given by (14) with <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010587.png" />. It follows from the properties of conditional mathematical expectation that <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010588.png" /> can be expressed in the form
| <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010589.png" /> |
where <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010590.png" /> and <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010591.png" />. This shows that the partially-observed problem (9), (a9) is equivalent to a problem (a10),
with complete observations on the probability space <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010592.png" /> where the controlled process is the measure-valued diffusion <img align="absmiddle" border="0" src="https://www.encyclopediaofmath.org/legacyimages/c/c026/c026010/c026010593.png" />. The question of existence of optimal controls has been extensively studied. It seems that optimal controls do exist, but only if some form of randomization is introduced; see [a7], [a15], [a16]. In addition, maximum principles have been obtained [a17], [a18] and some preliminary study of the Bellman equation undertaken [a19].
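The reduction to a completely-observed problem driven by the unnormalized conditional distribution has a simple finite-state counterpart: for a hidden Markov chain the analogue of the Zakai equation (a10) is a linear recursion for the unnormalized filter, normalized at the end as in the Kallianpur–Striebel formula. A sketch under these assumptions (all model data illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hidden 2-state Markov chain observed through dy = h(x) dt + dxi
# (discrete-time Euler analogue of (a9); illustrative data only).
P = np.array([[0.95, 0.05], [0.10, 0.90]])  # signal transition matrix
h = np.array([-1.0, 1.0])                    # observation drift h(x)
dt, n = 0.01, 2000

x = 0
rho = np.array([0.5, 0.5])                   # unnormalized cond. distribution
for _ in range(n):
    x = rng.choice(2, p=P[x])                # hidden state step
    dy = h[x] * dt + np.sqrt(dt) * rng.standard_normal()
    # Prediction step, then multiplication by the Girsanov-type observation
    # likelihood: the linear (Zakai-type) recursion for rho.
    rho = (P.T @ rho) * np.exp(h * dy - 0.5 * h**2 * dt)
    rho /= rho.sum()                         # rescaling for numerics only

pi = rho / rho.sum()                         # Kallianpur–Striebel normalization
```

In the article's setting the same recursion becomes the stochastic partial differential equation (a10), and pi plays the role of the measure-valued state of the equivalent completely-observed problem.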
References
| [a1] | D.P. Bertsekas, S.E. Shreve, "Stochastic optimal control: the discrete-time case" , Acad. Press (1978) |
| [a2] | E.B. Dynkin, A.A. Yushkevich, "Controlled Markov processes" , Springer (1979) |
| [a3] | M.G. Crandall, P.L. Lions, "Viscosity solutions of Hamilton–Jacobi equations" Trans. Amer. Math. Soc. , 277 (1983) pp. 1–42 |
| [a4a] | P.L. Lions, "Optimal control of diffusion processes and Hamilton–Jacobi–Bellman equations Part I" Comm. Partial Differential Eq. , 8 (1983) pp. 1101–1134 |
| [a4b] | P.L. Lions, "Optimal control of diffusion processes and Hamilton–Jacobi–Bellman equations Part II" Comm. Partial Differential Eq. , 8 (1983) pp. 1229–1276 |
| [a5] | R.J. Elliott, "Stochastic calculus and applications" , Springer (1982) |
| [a6] | N. El Karoui, "Les aspects probabilistes du contrôle stochastique" , Lect. notes in math. , 876 , Springer (1980) |
| [a7] | S. Albeverio, J.E. Fenstad, R. Høegh-Krohn, T. Lindstrøm, "Nonstandard methods in stochastic analysis and mathematical physics" , Acad. Press (1986) |
| [a8] | U.G. Haussmann, "A stochastic maximum principle for optimal control of diffusions" , Pitman (1986) |
| [a9] | M. Robin, "Contrôle impulsionel des processus de Markov" , Univ. Paris IX (1978) (Thèse d'Etat) |
| [a10] | J.P. Lepeltier, B. Marchal, "Théorie générale du contrôle impulsionnel Markovien" SIAM. J. Control and Optimization , 22 (1984) pp. 645–665 |
| [a11] | A. Bensoussan, J.L. Lions, "Impulse control and quasi-variational inequalities" , Gauthier-Villars (1984) |
| [a12] | M.H.A. Davis, "Piecewise-deterministic Markov processes: a general class of non-diffusion stochastic models" J. Royal Statist. Soc. (B) , 46 (1984) pp. 353–388 |
| [a13] | F.A. van der Duyn Schouten, "Markov decision drift processes" , CWI , Amsterdam (1983) |
| [a14] | A.A. Yushkevich, "Continuous-time Markov decision processes with intervention" Stochastics , 9 (1983) pp. 235–274 |
| [a15] | W.H. Fleming, E. Pardoux, "Optimal control for partially-observed diffusions" SIAM J. Control and Optimization , 20 (1982) pp. 261–285 |
| [a16] | V.S. Borkar, "Existence of optimal controls for partially-observed diffusions" Stochastics , 11 (1983) pp. 103–141 |
| [a17] | A. Bensoussan, "Maximum principle and dynamic programming approaches of the optimal control of partially-observed diffusions" Stochastics , 9 (1983) pp. 169–222 |
| [a18] | U.G. Haussmann, "The maximum principle for optimal control of diffusions with partial information" SIAM J. Control and Optimization , 25 (1987) pp. 341–361 |
| [a19] | V.E. Beneš, I. Karatzas, "Filtering of diffusions controlled through their conditional measures" Stochastics , 13 (1984) pp. 1–23 |
| [a20] | D.P. Bertsekas, "Dynamic programming and stochastic control" , Acad. Press (1976) |
| [a21] | H.J. Kushner, "Stochastic stability and control" , Acad. Press (1967) |
| [a22] | C. Striebel, "Optimal control of discrete time stochastic systems" , Lect. notes in econom. and math. systems , 110 , Springer (1975) |
| [a23] | P.L. Lions, "On the Hamilton–Jacobi–Bellmann equations" Acta Appl. Math. , 1 (1983) pp. 17–41 |
| [a24] | M. Robin, "Long-term average cost control problems for continuous time Markov processes. A survey" Acta Appl. Math. , 1 (1983) pp. 281–299 |
