Efficient approximation of high-frequency pure jump problems.
Control problems are omnipresent in industrial applications of online decision-making. In many of these applications, time evolves in a naturally discrete manner. While discrete-time problems are well studied and relatively easy to solve by Dynamic Programming, the high frequencies at which real-world systems evolve often make this approach infeasible. This difficulty is generally circumvented by a time-aggregation approximation, such as a continuous-time limit. A standard example of this situation is in financial markets: at the microscopic scale there are only order books, which evolve only when the asset is traded; control problems, however, are often formulated on a (continuous-time) diffusion process. In this project, the goal is to assess quantitatively the benefits of such changes of scale in terms of the control problem. More precisely, my co-authors and I focus on two questions: i. can the error that the change of scale induces on the value of the control problem be quantified, with an explicit rate in terms of the frequency? ii. is the limiting problem genuinely easier to solve, both analytically and numerically?
Below I will detail in what precise sense the answer to both questions is yes, but first I would like to highlight the practical implications of these answers. From i. one can determine whether the error in the value of the control problem is of an acceptable order for the application at hand, and therefore decide via ii. whether to invest the time and energy to solve the real problem or its approximation.
In what follows we consider that the evolution times follow a Poisson process, although it could be interesting to look at extensions to other cases. Let $\Omega:=\Db([0,T];\Rb^d)$ be the Skorohod space on $[0,T]$, for $T>0$ possibly infinite, and consider a positive finite random measure $N$ on $\Rb^{d}\times[0,T]$ and a measure $\Pb$ on $\Omega$ such that $N$ is a $\Pb$-Poisson random measure with compensator $\eta\nu(\de e)\de t$, for $\eta>0$ and $\nu$ a probability measure on $\Rb^{d}$. For ease of notation, we set $N_{t}:=N(\Rb^{d},[0,t])$ for $t\ge 0$.
Let \(\Fb=(\Fc_{s})_{s\ge 0}\) be the $\Pb$-augmentation of the raw filtration generated by the random measure $N$. Given a compact subset $\Ab$ of $\Rb$, let $\Ac$ be the collection of $\Fb$-predictable processes with values in $\Ab$. We will work on the filtered probability space $(\Omega,\Fc, \Fb, \Pb)$, where $\Fc=\Fc_{T}$.
For some $x\in\Rb^d$, $\alpha\in\Ac$, and a bounded measurable map $b$ (from $[0,T]\x\Rb^d \x \Ab\x \Rb^d$ into $\Rb^d$), let $X^{x,\alpha}$ denote the solution to the SDE
\[\begin{align} X_\cdot^{x,\alpha} = x + \int_0^\cdot\int_{\Rb^{d}} b(s,X_{s-}^{x,\alpha}, \alpha_s, e) N(\de e,\de s)\,.\label{eq: def pure jump process}\end{align}\]We consider some instantaneous gain $r:[0,T]\x\Rb^d\x\Ab \to \Rb$ obtained at each jump of the system. We can now define a cost functional for all $(t,x)\in[0,T]\x\Rb^d$ by
\[ J(t,x;\alpha):= \Eb\left[\int_t^T r(s, X_s^{t,x,\alpha},\alpha_s)\de N_s \right]\,,\qquad \alpha\in\Ac^t\,, \]where $X^{t,x,\alpha}$ denotes the solution of \eqref{eq: def pure jump process} started from $x$ at time $t$ instead of $0$, and $\Ac^t$, $\Fc^t$, and $\Fb^t$ are the corresponding analogues of $\Ac$, $\Fc$, and $\Fb$ respectively.
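To make these objects concrete, here is a minimal Monte Carlo sketch, written for illustration only (it is not taken from the articles of this project), which simulates the controlled pure-jump dynamics \eqref{eq: def pure jump process} and estimates $J(0,x;\alpha)$ for a Markovian feedback control. The names `simulate_gain`, `a_feedback`, `sample_mark` and the toy coefficients at the end are hypothetical placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_gain(x0, T, eta, b, r, a_feedback, sample_mark, n_paths=2_000):
    """Monte Carlo estimate of J(0, x0; alpha) = E[ sum over jump times s of r(s, X_s, a_s) ]
    for the pure-jump dynamics X_t = x0 + int b(s, X_{s-}, a_s, e) N(de, ds),
    where N has jump intensity eta and marks e ~ nu."""
    total = 0.0
    for _ in range(n_paths):
        x, gain = np.asarray(x0, dtype=float), 0.0
        n_jumps = rng.poisson(eta * T)                    # number of jumps on [0, T]
        for s in np.sort(rng.uniform(0.0, T, n_jumps)):   # jump times, in increasing order
            a = a_feedback(s, x)                          # control chosen before the jump
            e = sample_mark(rng)                          # mark drawn from nu
            x = x + b(s, x, a, e)                         # state update at the jump
            gain += r(s, x, a)                            # gain r(s, X_s, a_s) collected at the jump
        total += gain
    return total / n_paths

# Toy one-dimensional example: b(s,x,a,e) = a*e, r(s,x,a) = -x^2, nu = N(0,1), constant control a = 0.1.
J_hat = simulate_gain(x0=[0.0], T=1.0, eta=50.0,
                      b=lambda s, x, a, e: a * e,
                      r=lambda s, x, a: -float(x[0]) ** 2,
                      a_feedback=lambda s, x: 0.1,
                      sample_mark=lambda rng: rng.standard_normal())
```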
A control problem is the maximisation of a gain functional ($J$) dependent on a process ($X^{x,\alpha}$) over its driving input process ($\alpha$). Thus, control problems are a specific subcategory of functional optimisation problems. The control problem associated with $J$ is formally given by the value function $V$ defined, for any $(t,x)\in [0,T]\x \Rb^d$, as
\[ V(t,x):=\sup_{\alpha \in \Ac^{t}} J(t,x;\alpha)\,.\]
The interconnection at the infinitesimal scale between $X^{x,\alpha}$ and $\alpha$ is the key source of complexity in control theory. The usual method involves recasting the optimisation problem as the solution of a PDE. Indeed, assuming $(b,r)$ is a continuous and bounded map, $V$ is the unique (bounded) viscosity solution of the associated Hamilton-Jacobi-Bellman equation
\[\begin{align} &0=\partial_{t}V + \eta \sup_{a\in \Ab}\left\{ \int [V(\cdot,\cdot+b(\cdot,a,e))-V]\nu(\de e)+r(\cdot,a)\right\} \mbox{ on }[0,T)\times\Rb^d \notag \\ &0=V(T,\cdot)\mbox{ on $\Rb^d$ }\notag\,. \end{align}\]This equation is integro-differential, which makes it rather unpleasant to solve numerically. The real issue in high-frequency applications is the multiplicative factor $\eta$, the frequency of jumps, in front of the non-local integral term. This scaling implies that errors in the numerical integration are amplified by $\eta$, so the quadrature must be very fine, and increasingly so as $\eta\to+\infty$.
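To illustrate where the cost comes from, here is a minimal sketch of my own of an explicit backward scheme for this integro-differential HJB equation, in dimension one, with a finite control set, a quadrature rule `(marks, weights)` for $\nu$, and vectorised placeholder coefficients `b` and `r`.

```python
import numpy as np

def solve_pure_jump_hjb(T, eta, b, r, marks, weights, controls, x_grid, n_steps=None):
    """Explicit backward scheme for
        0 = dV/dt + eta * sup_a { int [V(t, x + b(x,a,e)) - V(t,x)] nu(de) + r(t,x,a) },
    with V(T, .) = 0, on an increasing 1-d grid, using linear interpolation in space.
    (marks, weights) is a quadrature rule for the probability measure nu."""
    if n_steps is None:
        # Monotonicity/stability of the explicit step requires dt <= 1/eta,
        # so the number of time steps grows (at least) linearly with the jump frequency.
        n_steps = int(np.ceil(eta * T)) + 1
    dt = T / n_steps
    V = np.zeros_like(x_grid)                              # terminal condition V(T, .) = 0
    for n in reversed(range(n_steps)):
        t = n * dt
        best = np.full_like(x_grid, -np.inf)
        for a in controls:
            # quadrature of the non-local term int [V(x + b(x,a,e)) - V(x)] nu(de);
            # values outside the grid are clamped to the boundary by np.interp
            nonlocal_term = np.zeros_like(x_grid)
            for e, w in zip(marks, weights):
                nonlocal_term += w * (np.interp(x_grid + b(x_grid, a, e), x_grid, V) - V)
            best = np.maximum(best, nonlocal_term + r(t, x_grid, a))
        V = V + dt * eta * best                            # explicit backward Euler step
    return V                                               # approximation of V(0, .)
```

Note that the quadrature error on the non-local term enters the computed value multiplied by a factor proportional to $\eta$, so both the time discretisation and the quadrature must be refined as $\eta$ grows.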
However, there are certain instances where these difficulties can be circumvented. In many applications, a diffusion limit of the problem is solved efficiently instead, as an approximation of the original one.
This particular regime consists of problems with the following scaling. Let $\ve\in(0,1)$ and \[ \eta=\eta_\ve := \ve^{-1}\,,\; b=b_\ve:=\ve b_1 + \ve^{\frac12}b_2\,,\] where $b_2$ is centred under $\nu$, i.e. $\int b_2(\cdot,e)\nu(\de e)=0$, so that no diverging drift appears in the limit. We are studying the limit as $\ve\to0^+$; the relevant renormalisation of the cost functional is therefore \[ J_\ve(t,x;\alpha):= \frac1{\eta_\ve} \Eb\left[\int_t^T r(s, X_s^{t,x,\alpha},\alpha_s)\de N_s \right]\,,\] which is strategically equivalent to the original problem $J$.
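To see heuristically where the limit coefficients come from, freeze the state $x$ and the control $a$ and look at an increment of $X^{x,\alpha}$ over a small time window of length $\Delta$: it is a sum of roughly $\eta_\ve\Delta=\Delta/\ve$ i.i.d. jumps of size $b_\ve(x,a,e)$, so that, using the centring of $b_2$,
\[ \Eb\big[X_{t+\Delta}-X_{t}\big]\approx \eta_\ve\Delta\int b_\ve(x,a,e)\,\nu(\de e)=\Delta\int b_{1}(x,a,e)\,\nu(\de e)\,,\qquad \mathrm{Cov}\big[X_{t+\Delta}-X_{t}\big]\approx \eta_\ve\Delta\int b_\ve b_\ve^\top(x,a,e)\,\nu(\de e)\xrightarrow[\ve\to0^+]{}\Delta\int b_{2}b_{2}^\top(x,a,e)\,\nu(\de e)\,. \]
These are exactly the drift and squared diffusion coefficients appearing in the limit below, and the central limit theorem suggests Gaussian fluctuations at this scale.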
By Donsker's invariance principle, the process $X^{x,\alpha}$ defined by \eqref{eq: def pure jump process} with $b=b_\ve$, $\eta=\eta_\ve$ converges weakly to $\bar X^{x,\bar\alpha}$, the solution of the Itô SDE \[ \bar X^{x,\bar\alpha}=x+\int_{0}^{\cdot} \mu(\bar X^{x,\bar\alpha}_s, \bar\alpha_s)\de s + \int_0^\cdot \sigma(\bar X^{x,\bar\alpha}_s, \bar\alpha_s)\de W_s\]
with coefficients \[\mu(x,a):= \int b_{1}(x,a,e) \nu(\de e),\; \sigma(x,a):= \left( \int b_2b_2^\top(x,a,e) \nu(\de e)\right)^{\frac12}.\] The solution to this SDE exists and is unique under suitable assumptions on $b_1$ and $b_2$, such as Lipschitz continuity in the state variable and uniform ellipticity of $\sigma$.
Here $W$ is a $d$-dimensional Wiener process, and one has to change the probability space under consideration to $(\Omega,\bar\Fc,\bar\Fb,\bar\Pb)$, using the standard construction for $\bar X^{x,\bar\alpha}$. Likewise one defines $\bar\Ac$ and $\bar\Ac^t$, analogous to $\Ac$ and $\Ac^t$. Then, given some regularity of $r$, we can define the diffusion control problem \[ \bar V(t,x):=\sup_{\bar\alpha \in \bar \Ac^{t}} \Eb\left[\int_{t}^{T}r(s, \bar X^{t,x, \bar\alpha}_s, \bar\alpha_s) \de s\right]\,, \] for any $(t,x)\in [0,T]\x \Rb^d$.
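For comparison with the Monte Carlo sketch above, here is an equally minimal Euler-Maruyama estimate of the diffusion gain for a fixed feedback control, with a scalar state for simplicity; `mu`, `sigma`, `r` and `a_feedback` are again placeholders to be supplied.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_diffusion_gain(x0, T, mu, sigma, r, a_feedback, n_steps=200, n_paths=2_000):
    """Euler-Maruyama / Monte Carlo estimate of E[ int_0^T r(s, Xbar_s, abar_s) ds ]
    for dXbar = mu(Xbar, a) ds + sigma(Xbar, a) dW, with a scalar state."""
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        x, gain = float(x0), 0.0
        for n in range(n_steps):
            s = n * dt
            a = a_feedback(s, x)
            gain += r(s, x, a) * dt                  # running gain, left-endpoint rule
            x += mu(x, a) * dt + sigma(x, a) * np.sqrt(dt) * rng.standard_normal()
        total += gain
    return total / n_paths
```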
As in the pure-jump case, $\bar V$ is the unique (bounded) viscosity solution of the PDE \[\begin{align} &0=\partial_{t} \bar V +\sup_{\bar a\in \Ab}\left\{\mu(\cdot,\bar a)\cdot\partial_{x} \bar V +\frac12 \Tr[\sigma\sigma^\top(\cdot,\bar a) \partial^{2}_{xx}\bar V] +r(\cdot,\bar a)\right\}\mbox{ on } [0, T)\x \Rb^d,\label{eq: PDE bar V}\\ &0=\bar V(T,\cdot)\; \mbox{on } \Rb^d. \notag \end{align}\]
The diffusive PDE \eqref{eq: PDE bar V} is still non-linear, but it no longer involves an integral term, and it has been studied at great length. This advantage is both analytic (regularity results and methods are readily found in the literature) and numerical (there is a plethora of well-studied schemes for \eqref{eq: PDE bar V}).
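As an illustration of the numerical side, here is a minimal explicit upwind sketch (again my own, with placeholder vectorised coefficients and a finite control set) for a one-dimensional instance of \eqref{eq: PDE bar V}. The contrast with the pure-jump scheme above is that the time-step restriction now involves the mesh and the coefficients $(\mu,\sigma)$, not the jump frequency $\eta$.

```python
import numpy as np

def solve_diffusion_hjb(T, mu, sigma, r, controls, x_grid, cfl=0.9):
    """Explicit monotone (upwind) scheme for the one-dimensional limit HJB
        0 = dV/dt + sup_a { mu(x,a) dV/dx + 0.5 sigma(x,a)^2 d2V/dx2 + r(t,x,a) },
    with V(T, .) = 0. The stability constraint depends on the mesh and on (mu, sigma),
    not on the jump frequency eta."""
    dx = x_grid[1] - x_grid[0]
    # rough CFL bound over the grid and the control set
    bound = max(np.max(sigma(x_grid, a) ** 2 / dx ** 2 + np.abs(mu(x_grid, a)) / dx)
                for a in controls)
    n_steps = int(np.ceil(T * bound / cfl)) + 1
    dt = T / n_steps
    V = np.zeros_like(x_grid)                         # terminal condition V(T, .) = 0
    for n in reversed(range(n_steps)):
        t = n * dt
        Vp = np.pad(V, 1, mode="edge")                # crude Neumann-type boundary treatment
        d2 = (Vp[2:] - 2 * Vp[1:-1] + Vp[:-2]) / dx ** 2
        fwd = (Vp[2:] - Vp[1:-1]) / dx                # forward difference
        bwd = (Vp[1:-1] - Vp[:-2]) / dx               # backward difference
        best = np.full_like(x_grid, -np.inf)
        for a in controls:
            m = mu(x_grid, a)
            drift = np.where(m >= 0, m * fwd, m * bwd)   # upwinding keeps the scheme monotone
            best = np.maximum(best, drift + 0.5 * sigma(x_grid, a) ** 2 * d2 + r(t, x_grid, a))
        V = V + dt * best                             # explicit backward Euler step
    return V                                          # approximation of bar V(0, .)
```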
Nevertheless, it is non-trivial to establish that, in this general framework, the values of the control problems converge as $\ve\to0^+$, and the rate of convergence is not known a priori. Such approximation certificates would provide a rigorous grounding for the use of the numerical resolution of the simpler equation \eqref{eq: PDE bar V} in high-frequency control problems.
To quantify the approximation error we employ a mixed analytic-probabilistic method: the PDE and the integro-differential equation are used in analytic arguments to represent the error, and the probabilistic formulation is then used to bound it. More precisely, we introduce the Taylor remainder
\[\dre:=\mu\cdot\partial_{x} \bar V +\frac12 \Tr[\sigma\sigma^\top \partial_{xx}^{2}\bar V] - \frac1\ve\int \left[\bar V(\cdot,\cdot+b_\ve(\cdot,e))-\bar V\right]\nu(\de e)\,. \]
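To see why this remainder is small, here is a formal Taylor expansion (a heuristic sketch, assuming $\bar V(t,\cdot)$ is $C^{2,\delta}$ uniformly in time for some $\delta\in(0,1]$, that $b_1$ and $b_2$ are bounded, and using the centring of $b_2$ under $\nu$):
\[\begin{align} \frac1\ve\int \left[\bar V(\cdot,\cdot+b_\ve)-\bar V\right]\nu(\de e) &= \frac1\ve\int \left[\partial_x\bar V\cdot b_\ve + \tfrac12 b_\ve^\top\partial^{2}_{xx}\bar V\, b_\ve + O\big(\vert b_\ve\vert^{2+\delta}\big)\right]\nu(\de e)\notag\\ &= \partial_x\bar V\cdot\int b_1\,\nu(\de e) + \ve^{-\frac12}\,\partial_x\bar V\cdot\int b_2\,\nu(\de e) + \tfrac12\Tr\Big[\Big(\int b_2b_2^\top\,\nu(\de e)\Big)\partial^{2}_{xx}\bar V\Big] + O\big(\ve^{\frac\delta2}\big)\notag\\ &= \mu\cdot\partial_x\bar V + \tfrac12\Tr\big[\sigma\sigma^\top\partial^{2}_{xx}\bar V\big] + O\big(\ve^{\frac\delta2}\big)\,,\notag \end{align}\]
where the potentially diverging $\ve^{-\frac12}$ term vanishes because $b_2$ is centred. Formally, this gives $\Vert\dre\Vert_\infty=O(\ve^{\frac\delta2})$; making it rigorous is precisely the role of the regularity analysis mentioned below.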
The analysis rests upon the following observation: if $\bar V$ is a solution to equation \eqref{eq: PDE bar V}, then it is a solution to a pure-jump HJB equation with modified source term, namely:
\[ 0=\partial_{t} \bar V +\sup_{\bar a\in \Ab}\left( \eta_\ve\int \left[\bar V(\cdot,\cdot+b_\ve(\cdot,\bar a,e))-\bar V\right]\nu(\de e)+(r+\dre)(\cdot,\bar a)\right),\]
on $[0,T)\x \Rb^d$. Indeed, for each fixed $\bar a$, the definition of $\dre$ gives $\eta_\ve\int [\bar V(\cdot,\cdot+b_\ve(\cdot,\bar a,e))-\bar V]\nu(\de e)+\dre(\cdot,\bar a)=\mu(\cdot,\bar a)\cdot\partial_{x}\bar V+\frac12\Tr[\sigma\sigma^\top(\cdot,\bar a)\partial^{2}_{xx}\bar V]$, so the two suprema coincide. From here, the probabilistic interpretation of the two equations gives us
\[\begin{align} \vert \bar V - V_\ve \vert(t,x) \le \sup_{\alpha\in \Ac^{t}}\frac1{\eta_\ve}\Eb\left[ \int_t^T \vert \dre \vert (s, X^{t,x, \alpha}_s, \alpha_s)\de N_s \right]\,,\label{eq: probabilistic representation of delta r epsilon} \end{align}\]where $V_\ve$ denotes the value function associated with $J_\ve$, and it remains to bound $\Vert\dre\Vert_\infty$. This is achieved through a regularity analysis of solutions of \eqref{eq: PDE bar V}, see Proposition 1 below. Details can be found in the various articles attached to this project.
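Heuristically, since $N$ has compensator $\eta_\ve\nu(\de e)\de t$, the right-hand side of \eqref{eq: probabilistic representation of delta r epsilon} is controlled by $\Vert\dre\Vert_\infty$; combined with the formal expansion above, this already gives the flavour of the final estimate:
\[ \vert \bar V - V_\ve \vert(t,x)\le \frac1{\eta_\ve}\,\Vert\dre\Vert_\infty\,\Eb\big[N_T-N_t\big] =(T-t)\,\Vert\dre\Vert_\infty\lesssim\ve^{\frac\delta2}\,, \]
under the regularity assumed in that expansion.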
We can then conclude on the quality of the approximation; a representative statement from the attached articles is given in Theorem 1.
Theorem 1 represents only the essential first step in this line of work. It serves to illustrate the motivation behind the work: controlling pure-jump processes in the high-frequency regime is costly, and any analysis is complicated by the non-local nature of the HJB equation. We characterise the convergence rate of the control problems along a particular small-jump limit regime as the frequency of jumps $\eta\to+\infty$. This gives a principled motivation for the use of this diffusive limit approximation in practice. It also gives principled numerical schemes "for free". From an analytic standpoint, we have essentially reduced the analysis to the question of the regularity of the limit problem.
There are many technical points which I haven't highlighted here for the sake of brevity, and many interesting directions beyond this first result. For instance, I presented here only a finite-horizon control problem.
In the context of another project page, I present the extension to the reinforcement learning problem, i.e. where $b$ and $r$ are unknown a priori and are learned from sequential interaction.
From a more involved analytic perspective, I have written up some thoughts under another project page about a generalisation of this approach which is also elaborated on in this blog post.
First, some matters of generalisation. The lynchpin of this method is the approximation of the generator $\Lc$ of the pure-jump process by the generator $\bar\Lc$ of the limiting diffusion, insofar as they are applied to $\bar V$; these are the only terms that differ between the two HJB equations (see the display below). This discussion is the topic of this project page. Another generalisation would of course involve the study of different control problems, such as optimal stopping, singular control, etc.
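To fix notation, since the generators are not written out explicitly above: for a smooth test function $\varphi$ and a frozen control $a$,
\[ \Lc^{a}\varphi := \eta_\ve\int \big[\varphi(\cdot+b_\ve(\cdot,a,e))-\varphi\big]\nu(\de e)\,,\qquad \bar\Lc^{a}\varphi := \mu(\cdot,a)\cdot\partial_{x}\varphi + \frac12\Tr\big[\sigma\sigma^\top(\cdot,a)\,\partial^{2}_{xx}\varphi\big]\,, \]
so that $\dre(\cdot,a)=(\bar\Lc^{a}-\Lc^{a})\bar V$: the whole error analysis is a statement about how well $\Lc^{a}$ approximates $\bar\Lc^{a}$ on regular functions.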
Second, there is the interesting point of iterated error correction. Suppose the rescaled error term $\ve^{-\frac\gamma2}\dre$ is regular, and consider the equation \[\begin{align} 0=\partial_{t} \bar v +\sup_{\bar a\in \Ab}\left( \eta_\ve\int \left[\bar v(\cdot,\cdot+b_\ve(\cdot,\bar a,e))-\bar v\right]\nu(\de e)+\ve^{-\frac\gamma2}\dre(\cdot,\bar a)\right), \label{eq: HJB error corrections} \end{align}\] which corresponds, up to the rescaling by $\ve^{-\frac\gamma2}$, to the control problem appearing on the right-hand side of \eqref{eq: probabilistic representation of delta r epsilon}. This suggests that it admits a diffusion limit too, and that we can iterate the approximation procedure to "correct" the error. That is, if the solution $\bar v$ to \eqref{eq: HJB error corrections} remains bounded and regular, the approximation $\bar V + \ve^{\frac\gamma2}\bar v$ will improve the rate. In particular, if the solution $\bar v$ is $\delta\gamma$-Hölder, then \[ \Vert V_\ve - (\bar V + \ve^{\frac\gamma2}\bar v)\Vert_\infty \lesssim \ve^{\frac{\gamma+\delta\gamma}{2}}\,.\] In theory, if the resulting error is smooth enough, we could repeat the process. However, at each step we must assume the regularity of the solution to a non-linear PIDE. The question here is whether we can reduce these assumptions to something depending only on $b_1$, $b_2$, and $r$, and do so in a meaningful way. A good place to start this study would probably be the linear autonomous equation arising from $\vert\Ab\vert=1$.
Finally, from a practical perspective, there is a question regarding the learning of coefficients, see also here, and especially the possibility of estimating $\ve$ in real systems. There might be some connections here to the estimation of Hurst coefficients in fractional Brownian motions.
Application to Reinforcement Learning.
A more general analytic framework for process approximations in control.