OCP Transformation
Optimal Control Problem
Let:
\begin{aligned} \bm{x}(t) & \in \Bbb{R}^n \qquad & \text{(state variables)} \\ \bm{\lambda}(t) & \in \Bbb{R}^n \qquad & \text{(costates or adjoint variables)} \\ \bm{u}(t) & \in \mathcal{U} \subseteq \Bbb{R}^m \qquad & \text{(control variables)} \end{aligned}
Problem Statement
We aim to solve the following Optimal Control Problem:
Objective: Minimize the performance index
\underbrace{ \underbrace{\Phi(\bm{x}(a),\bm{x}(b))}_{\textcolor{red}{\text{Mayer term}}} + \int_a^b \underbrace{L(\bm{x},\bm{u},t)}_{\textcolor{blue}{\text{Lagrange term}}} \, \mathrm{d}t }_{\textcolor{DarkGreen}{\text{Bolza form}}}
Subject to:
- System dynamics:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
- Control constraints:
\bm{u}(t) \in \mathcal{U}
Hamiltonian
The Hamiltonian H and the auxiliary boundary function B are defined as follows:
\begin{aligned} H(\bm{x}, \bm{\lambda}, \bm{u}, t) & = L(\bm{x}, \bm{u}, t) + \bm{\lambda} \cdot \bm{f}(\bm{x}, \bm{u}, t) \\ B(\bm{x}_a, \bm{x}_b, \bm{\omega}) & = \Phi(\bm{x}_a, \bm{x}_b) + \bm{\omega} \cdot \bm{b}(\bm{x}_a, \bm{x}_b) \end{aligned}
Where:
\begin{aligned} \bm{\lambda} \cdot \bm{f}(\bm{x}, \bm{u}, t) &= \sum_{k=1}^n \lambda_k f_k(\bm{x}, \bm{u}, t) \\ \bm{\omega} \cdot \bm{b}(\bm{x}_a, \bm{x}_b) &= \sum_{k=1}^p \omega_k b_k(\bm{x}_a, \bm{x}_b) \end{aligned}
Boundary Value Problem (BVP)
The candidate for the constrained optimal solution must satisfy the following Boundary Value Problem (BVP):
\left\{ \begin{aligned} \bm{x}^\prime & = \frac{\partial H}{\partial \bm{\lambda}}(\bm{x}, \bm{\lambda}, \bm{u}, t) = \bm{f}(\bm{x}, \bm{u}, t) \quad & \text{(state equation)} \\[1em] \bm{\lambda}^\prime & = -\frac{\partial H}{\partial \bm{x}}(\bm{x}, \bm{\lambda}, \bm{u}, t) \quad & \text{(adjoint equation)} \\[2em] \bm{0} & = \frac{\partial B}{\partial \bm{\omega}}(\bm{x}_a, \bm{x}_b, \bm{\omega}) = \bm{b}(\bm{x}_a, \bm{x}_b) \quad & \text{(boundary condition)} \\[2em] \bm{0} & = \frac{\partial B}{\partial \bm{x}_a}(\bm{x}(a), \bm{x}(b), \bm{\omega}) - \bm{\lambda}(a) \quad & \text{(additional boundary condition)} \\[1em] \bm{0} & = \frac{\partial B}{\partial \bm{x}_b}(\bm{x}(a), \bm{x}(b), \bm{\omega}) + \bm{\lambda}(b) \quad & \text{(additional boundary condition)} \\[2em] \bm{u}(t) & = \mathop{\textrm{argmin}}\limits_{\bm{v} \in \mathcal{U}} H(\bm{x}(t), \bm{\lambda}(t), \bm{v}, t) \quad & \text{(control equation)} \end{aligned} \right.
Remark 1 (On Boundary Conditions). From:
\begin{aligned} \bm{0} & = \frac{\partial B}{\partial \bm{x}_a}(\bm{x}(a), \bm{x}(b), \bm{\omega}) - \bm{\lambda}(a), \\ \bm{0} & = \frac{\partial B}{\partial \bm{x}_b}(\bm{x}(a), \bm{x}(b), \bm{\omega}) + \bm{\lambda}(b), \end{aligned}
it is often possible to eliminate the multiplier \bm{\omega} and express the boundary conditions in terms of \bm{x} and \bm{\lambda} alone.
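As a small worked instance (assuming the simplest boundary function: a fixed initial state \bm{x}(a) = \bm{x}_A and a free final state), the elimination is explicit:
\begin{aligned} \bm{b}(\bm{x}_a, \bm{x}_b) &= \bm{x}_a - \bm{x}_A, \qquad B(\bm{x}_a, \bm{x}_b, \bm{\omega}) = \Phi(\bm{x}_a, \bm{x}_b) + \bm{\omega} \cdot (\bm{x}_a - \bm{x}_A) \\ \bm{0} &= \frac{\partial B}{\partial \bm{x}_a} - \bm{\lambda}(a) \;\Longrightarrow\; \bm{\omega} = \bm{\lambda}(a) - \frac{\partial \Phi}{\partial \bm{x}_a}(\bm{x}(a), \bm{x}(b)) \quad \text{(fixes } \bm{\omega}\text{)} \\ \bm{0} &= \frac{\partial B}{\partial \bm{x}_b} + \bm{\lambda}(b) \;\Longrightarrow\; \bm{\lambda}(b) = -\frac{\partial \Phi}{\partial \bm{x}_b}(\bm{x}(a), \bm{x}(b)) \end{aligned}
The remaining conditions, \bm{x}(a) = \bm{x}_A and \bm{\lambda}(b) = -\partial \Phi/\partial \bm{x}_b, no longer involve \bm{\omega}.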
Lagrange to Mayer Transformation
Consider the following Optimal Control Problem (OCP):
Objective
Minimize the integral of the running cost L(\bm{x}, \bm{u}, t):
\text{minimize} \quad \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- System dynamics (state evolution):
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
Introducing an Auxiliary Variable
To transform the problem from Lagrange form to Mayer form, introduce a new auxiliary variable z to represent the accumulation of the running cost L(\bm{x}, \bm{u}, t):
z' = L(\bm{x}, \bm{u}, t), \qquad z(a) = 0
Reformulating the Objective
The integral of L(\bm{x}, \bm{u}, t) can be rewritten using the new variable z:
\int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \int_a^b z' \, \mathrm{d}t = z(b) - z(a) = z(b)
Thus, the original objective of minimizing the integral can now be expressed as minimizing z(b):
Transformed Objective
\text{minimize} \quad z(b)
Transformed System Dynamics
The system dynamics now include both the original state \bm{x} and the new variable z:
\left\{ \begin{aligned} \bm{x}' &= \bm{f}(\bm{x}, \bm{u}, t) \\ z' &= L(\bm{x}, \bm{u}, t) \end{aligned} \right.
Transformed Boundary Conditions
The boundary conditions now include the original condition on \bm{x} and the initial condition for z:
\left\{ \begin{aligned} \bm{b}(\bm{x}(a), \bm{x}(b)) &= \bm{0} \\ z(a) &= 0 \end{aligned} \right.
Reformulating the Problem Using New Variables
We can define the augmented state \bm{w} to combine \bm{x} and z, and correspondingly define the new system dynamics and boundary conditions:
\begin{aligned} \bm{w} &= \begin{pmatrix} \bm{x} \\ z \end{pmatrix}, \quad & \bm{F}(\bm{w}, \bm{u}, t) &= \begin{pmatrix} \bm{f}(\bm{x}, \bm{u}, t) \\ L(\bm{x}, \bm{u}, t) \end{pmatrix} \end{aligned}
The new objective becomes:
\Phi(\bm{w}(a), \bm{w}(b)) = z(b)
The boundary conditions are expressed as:
\bm{c}(\bm{w}(a), \bm{w}(b)) = \begin{pmatrix} \bm{b}(\bm{x}(a), \bm{x}(b)) \\ z(a) \end{pmatrix}
Final Transformed Problem
The transformed OCP now takes the Mayer form:
Objective
\text{minimize} \quad \Phi(\bm{w}(a), \bm{w}(b))
Subject to:
- Augmented system dynamics:
\bm{w}' = \bm{F}(\bm{w}, \bm{u}, t)
- Augmented boundary conditions:
\bm{c}(\bm{w}(a), \bm{w}(b)) = \bm{0}
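As a quick numerical sanity check of this augmentation (a minimal sketch, not from the text: the scalar dynamics x' = u, running cost L = x^2, fixed control u ≡ 1, and the helper names f, L, F, rk4 are all hypothetical), one can integrate the augmented system and verify that z(b) reproduces the accumulated running cost:

```python
import numpy as np

# Hypothetical scalar example: dynamics x' = u, running cost L = x^2.
def f(x, u, t):
    return u

def L(x, u, t):
    return x**2

# Augmented dynamics F(w, u, t) = (f, L): w = (x, z), z accumulates the running cost.
def F(w, u, t):
    x, z = w
    return np.array([f(x, u, t), L(x, u, t)])

def rk4(F, w0, u, a, b, n=1000):
    """Integrate w' = F(w, u(t), t) on [a, b] with n classical RK4 steps."""
    h = (b - a) / n
    w = np.array(w0, dtype=float)
    t = a
    for _ in range(n):
        k1 = F(w, u(t), t)
        k2 = F(w + 0.5 * h * k1, u(t + 0.5 * h), t + 0.5 * h)
        k3 = F(w + 0.5 * h * k2, u(t + 0.5 * h), t + 0.5 * h)
        k4 = F(w + h * k3, u(t + h), t + h)
        w = w + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return w

u = lambda t: 1.0                       # fixed control for the check
w_b = rk4(F, [0.0, 0.0], u, 0.0, 1.0)   # w(a) = (x(a), z(a)) = (0, 0)
# With x(0) = 0 and u = 1, x(t) = t, so z(1) = integral of t^2 = 1/3.
print(w_b[1])                            # ~ 0.3333
```

The Mayer objective of the transformed problem is then simply the last component of the augmented state at the final time, z(b).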
Mayer to Lagrange Transformation
Consider the following Optimal Control Problem (OCP) given in Mayer form.
Objective
Minimize the terminal cost \Phi(\bm{x}(a), \bm{x}(b)):
\text{minimize} \quad \Phi(\bm{x}(a), \bm{x}(b))
Subject to:
- System dynamics (state evolution):
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
Introducing an Auxiliary Function
To transform this problem from Mayer form to Lagrange form (which involves a running cost), we introduce a new auxiliary function g(t) that reconstructs the Mayer terminal cost \Phi(\bm{x}(a), \bm{x}(b)) as a cumulative function of t. This allows us to spread the cost over the time horizon and thus define a running cost function.
Define g(t) as:
g(t) = \Phi(\bm{x}(a), \bm{x}(t)) + \frac{t - a}{b - a} \Phi(\bm{x}(a), \bm{x}(a))
Here, \bm{x}(t) is the state at time t, so the first term tracks the terminal cost along the evolving trajectory. The second term, \frac{t - a}{b - a} \Phi(\bm{x}(a), \bm{x}(a)), grows linearly from 0 at t = a to \Phi(\bm{x}(a), \bm{x}(a)) at t = b; it compensates for the fact that g(a) = \Phi(\bm{x}(a), \bm{x}(a)) \neq 0 in general, so that the difference g(b) - g(a) reproduces the terminal cost exactly.
Deriving the Running Cost
Next, we differentiate g(t) with respect to time t to obtain a running cost.
First, compute g'(t):
g'(t) = \frac{\partial \Phi(\bm{x}(a), \bm{x}(t))}{\partial \bm{x}_b} \bm{x}'(t) + \frac{\Phi(\bm{x}(a), \bm{x}(a))}{b - a}
Since \bm{x}'(t) = \bm{f}(\bm{x}, \bm{u}, t) (from the system dynamics), we substitute this into the expression:
g'(t) = \frac{\partial \Phi(\bm{x}(a), \bm{x}(t))}{\partial \bm{x}_b} \bm{f}(\bm{x}, \bm{u}, t) + \frac{\Phi(\bm{x}(a), \bm{x}(a))}{b - a}
Integrating the Running Cost
By integrating g'(t) over the time interval [a, b], we can reconstruct the original Mayer cost \Phi(\bm{x}(a), \bm{x}(b)):
\int_a^b g'(t) \, \mathrm{d}t = g(b) - g(a)
Substituting the values of g(b) and g(a), we get:
g(b) - g(a) = \Phi(\bm{x}(a), \bm{x}(b)) + \cancel{\Phi(\bm{x}(a), \bm{x}(a))} - \cancel{\Phi(\bm{x}(a), \bm{x}(a))}
Thus, the integral of g'(t) gives us the original Mayer cost:
\int_a^b g'(t) \, \mathrm{d}t = \Phi(\bm{x}(a), \bm{x}(b))
Defining the Lagrange Running Cost
Now, we can define the running cost L(\bm{x}, \bm{u}, t) for the Lagrange formulation of the problem. From the expression for g'(t), we set:
L(\bm{x}, \bm{u}, t) = \frac{\partial \Phi(\bm{x}(a), \bm{x}(t))}{\partial \bm{x}_b} \bm{f}(\bm{x}, \bm{u}, t) + \frac{\Phi(\bm{x}(a), \bm{x}(a))}{b - a}
This running cost L(\bm{x}, \bm{u}, t) captures the instantaneous rate of change of the terminal cost along the trajectory, plus the constant rate \Phi(\bm{x}(a), \bm{x}(a))/(b - a). Note that it depends on the initial state \bm{x}(a) and explicitly on t; this is admissible, though it makes the resulting Lagrange problem slightly nonstandard.
Final Transformed Problem (Lagrange Form)
The problem is now expressed in Lagrange form:
Objective
Minimize the integral of the running cost L(\bm{x}, \bm{u}, t):
\text{minimize} \quad \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- System dynamics:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
By introducing the auxiliary function g(t) and deriving the running cost L(\bm{x}, \bm{u}, t), we have successfully transformed the Mayer problem into an equivalent Lagrange problem.
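The identity \int_a^b L \, \mathrm{d}t = \Phi(\bm{x}(a), \bm{x}(b)) can be checked numerically on a toy instance (hypothetical, not from the text): take \Phi(x_a, x_b) = x_b^2, scalar dynamics x' = u with u ≡ 1 on [a, b] = [0, 1], and x(a) = 0.5, so the trajectory is x(t) = 0.5 + t in closed form.

```python
import numpy as np

# Hypothetical toy instance: Phi(xa, xb) = xb^2, x' = u = 1, [a, b] = [0, 1].
a, b, x0 = 0.0, 1.0, 0.5
x = lambda t: x0 + t                      # closed-form trajectory for u = 1

# Running cost from the construction:
#   L(x, u, t) = dPhi/dx_b(x(a), x(t)) * f(x, u, t) + Phi(x(a), x(a)) / (b - a)
# Here dPhi/dx_b = 2 x(t) and f = 1.
L = lambda t: 2.0 * x(t) * 1.0 + x0**2 / (b - a)

# Composite trapezoidal quadrature of L over [a, b] (exact for this linear integrand)
ts = np.linspace(a, b, 100001)
ys = np.array([L(t) for t in ts])
integral = float(np.sum((ys[1:] + ys[:-1]) * np.diff(ts)) / 2.0)

Phi_ab = x(b) ** 2                        # original Mayer cost Phi(x(a), x(b)) = 2.25
print(integral, Phi_ab)                   # the two values agree
```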
Remark 2. In practice, the Mayer, Lagrange, and Bolza forms of the objective are equivalent: each can be converted into the others by the transformations above.
Constant Parameters to Constant Functions
Consider the following Optimal Control Problem (OCP), where the performance index depends also on constant parameters \bm{\mu} \in \mathbb{R}^q.
Objective
Minimize the performance index, which is a combination of the terminal cost \Phi and the integral of the running cost L:
\Phi(\bm{x}(a), \bm{x}(b), \bm{\mu}) + \int_a^b L(\bm{x}, \bm{u}, \bm{\mu}, t) \, \mathrm{d}t
Subject to:
- System dynamics:
The system evolves according to the following differential equation, which depends on both the state \bm{x} and the control \bm{u}, as well as the constant parameters \bm{\mu}:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, \bm{\mu}, t)
- Boundary conditions:
The system must satisfy specific boundary conditions at the initial and terminal time points:
\bm{b}(\bm{x}(a), \bm{x}(b), \bm{\mu}) = \bm{0}
- Control constraints:
The control \bm{u}(t) must remain within a predefined admissible set \mathcal{U}:
\bm{u}(t) \in \mathcal{U}
Here, \bm{\mu} \in \mathbb{R}^q are constant parameters that remain fixed over time. However, we can transform the problem by introducing constant functions that represent these parameters.
Introducing Constant Functions
To rewrite the problem in a form suitable for certain optimization techniques, we introduce constant functions for each component of the parameter vector \bm{\mu}:
\mu_k(t) \equiv \mu_k, \qquad k = 1, 2, \ldots, q
Each \mu_k(t) is a constant function that satisfies the differential equation:
\mu_k'(t) = 0 \quad \text{for all} \quad t \in [a, b]
In vector form, this is written as:
\bm{\mu}'(t) = \bm{0}
Reformulated Problem
By introducing these constant functions, we can now treat the parameters \bm{\mu} as state variables that evolve according to the trivial dynamic equation \bm{\mu}'(t) = \bm{0}. This allows us to apply standard optimal control techniques to problems involving constant parameters. The reformulated problem is as follows:
Objective
Minimize the modified performance index:
\Phi(\bm{x}(a), \bm{x}(b), \bm{\mu}(a)) + \int_a^b L(\bm{x}(t), \bm{u}(t), \bm{\mu}(t), t) \, \mathrm{d}t
Subject to:
- Extended system dynamics:
\left\{ \begin{aligned} \bm{x}' &= \bm{f}(\bm{x}, \bm{u}, \bm{\mu}, t) \\ \bm{\mu}' &= \bm{0} \end{aligned} \right.
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b), \bm{\mu}(a)) = \bm{0}
- Control constraints:
\bm{u}(t) \in \mathcal{U}
Remark 3. By introducing the constant functions \mu_k(t), we have transformed the original OCP with fixed parameters \bm{\mu} into an equivalent OCP where these parameters are treated as constant functions governed by the trivial dynamic equation \bm{\mu}'(t) = \bm{0}. This reformulation allows the use of standard optimal control methods to solve problems with constant parameters.
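The extension is easy to exercise numerically (a minimal sketch, not from the text: the dynamics x' = mu * x and the helper names F, rk4 are hypothetical). The parameter mu is appended to the state with the trivial dynamics mu' = 0, and an integrator confirms that it stays constant while the state evolves:

```python
import numpy as np

# Hypothetical sketch: treat a constant parameter mu as an extra state with mu' = 0.
# Example dynamics x' = mu * x, so the augmented state is w = (x, mu).
def F(w):
    x, mu = w
    return np.array([mu * x, 0.0])        # mu' = 0 keeps the parameter constant

def rk4(F, w0, T, n=1000):
    """Integrate the autonomous system w' = F(w) on [0, T] with n RK4 steps."""
    h = T / n
    w = np.array(w0, dtype=float)
    for _ in range(n):
        k1 = F(w)
        k2 = F(w + 0.5 * h * k1)
        k3 = F(w + 0.5 * h * k2)
        k4 = F(w + h * k3)
        w = w + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return w

w_end = rk4(F, [1.0, 2.0], 1.0)           # x(0) = 1, mu = 2
print(w_end)                              # x(1) ~ e^2 ~ 7.389, mu unchanged at 2.0
```

With this device, a solver that only handles states and controls can also optimize over mu: the boundary conditions pin mu(a) wherever the original problem fixes the parameter, or leave it free when mu is itself a decision variable.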
Integral Constraints
Consider the following Optimal Control Problem (OCP), where the performance index is influenced by certain integral constraints.
Objective
Minimize the performance index, which consists of a terminal cost and the integral of a running cost:
\text{Minimize} \quad \Phi(\bm{x}(a), \bm{x}(b)) + \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- System Dynamics:
The evolution of the state variables is governed by the following differential equation:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Integral Constraints:
The problem must satisfy specific integral constraints over the time interval [a, b]:
\int_a^b \bm{g}(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \bm{0}
- Boundary Conditions:
The system must adhere to specified boundary conditions at both the initial and terminal time points:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
- Control Constraints:
The control input \bm{u}(t) is required to remain within a predefined admissible set \mathcal{U}:
\bm{u}(t) \in \mathcal{U}
Introducing Auxiliary Functions
To effectively incorporate the integral constraints into our OCP, we introduce auxiliary functions \bm{z}(t) that capture the cumulative behavior of the integral constraints over time. We define \bm{z}(t) as follows:
\bm{z}(t) = \int_a^t \bm{g}(\bm{x}(s), \bm{u}(s), s) \, \mathrm{d}s
This function represents the accumulated effect of the constraint \bm{g} from the starting time a up to any time t. The derivative of \bm{z}(t) is given by:
\bm{z}'(t) = \bm{g}(\bm{x}(t), \bm{u}(t), t)
Reformulated Problem
By introducing the auxiliary function \bm{z}(t), we can reformulate the OCP as follows:
Objective
The objective remains to minimize the modified performance index:
\text{Minimize} \quad \Phi(\bm{x}(a), \bm{x}(b)) + \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- Extended System Dynamics:
The dynamics of the state variables and the auxiliary function can now be described as:
\left\{ \begin{aligned} \bm{x}' &= \bm{f}(\bm{x}, \bm{u}, t) \\ \bm{z}' &= \bm{g}(\bm{x}, \bm{u}, t) \end{aligned} \right.
- Integral Constraints:
The integral constraints become boundary conditions on the auxiliary function: imposing \bm{z}(a) = \bm{0} and \bm{z}(b) = \bm{0} is equivalent to \int_a^b \bm{g}(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \bm{0}.
- Boundary Conditions:
The original boundary conditions are augmented with those on \bm{z}:
\left\{ \begin{aligned} \bm{b}(\bm{x}(a), \bm{x}(b)) &= \bm{0} \\ \bm{z}(a) &= \bm{0} \\ \bm{z}(b) &= \bm{0} \end{aligned} \right.
- Control Constraints:
The control constraints continue to apply:
\bm{u}(t) \in \mathcal{U}
In summary, the auxiliary function \bm{z}(t) turns the integral constraints into additional dynamics and boundary conditions, so that the problem fits the standard OCP template.
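A small numerical illustration (hypothetical data, not from the text): along the trajectory x(t) = t on [0, 1], the constraint function g(x, u, t) = x - 1/2 satisfies \int_0^1 g \, \mathrm{d}t = 0, and accumulating z' = g from z(0) = 0 indeed lands near z(1) = 0:

```python
import numpy as np

# Hypothetical sketch: recast the integral constraint (integral of g = 0) as a state z
# with z' = g, z(a) = 0, and terminal condition z(b) = 0.
# Toy data: trajectory x(t) = t on [0, 1] and g(x, u, t) = x - 1/2.
ts = np.linspace(0.0, 1.0, 100001)
g_vals = ts - 0.5                         # g evaluated along the trajectory

# Forward-Euler accumulation of z' = g starting from z(0) = 0
h = ts[1] - ts[0]
z = 0.0
for gv in g_vals[:-1]:
    z += h * gv
print(z)                                  # ~ 0: z(b) = 0 encodes the integral constraint
```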
Free-Time
In some optimal control problems, the time horizon T is not fixed but instead is an unknown variable that must be determined as part of the solution. This setup leads to what is known as the free-time problem, which can be formalized as:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\bm{x}(0), \bm{x}(T)\big) + \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t, \\ \text{Subject to:} \quad & \bm{x}' = \bm{f}(\bm{x}, \bm{u}, t), \\ & \bm{b}\big(\bm{x}(0), \bm{x}(T)\big) = \bm{0}, \end{aligned}
where:
- \bm{x}(t) is the state variable,
- \bm{u}(t) is the control input,
- T is the free (unknown) time horizon,
- \phi(\bm{x}(0), \bm{x}(T)) is the Mayer (terminal) cost,
- L(\bm{x}, \bm{u}, t) is the Lagrange (running) cost,
- \bm{f}(\bm{x}, \bm{u}, t) defines the system dynamics,
- \bm{b}(\bm{x}(0), \bm{x}(T)) = \bm{0} represents the boundary conditions.
Normalizing Time
To handle the unknown time horizon T, a common approach is to normalize the time interval to a fixed domain, say [0, 1]. This is achieved by introducing a change of variables:
t = sT, \quad s \in [0, 1],
where s is a normalized time variable. Under this transformation:
- The state variable becomes \tilde{\bm{x}}(s) = \bm{x}(sT), so its derivative transforms as: \tilde{\bm{x}}'(s) = \frac{\mathrm{d}\tilde{\bm{x}}}{\mathrm{d}s} = \frac{\mathrm{d}\bm{x}}{\mathrm{d}t} \frac{\mathrm{d}t}{\mathrm{d}s} = \bm{x}'(sT)\,T.
- The control variable becomes \tilde{\bm{u}}(s) = \bm{u}(sT).
Substituting into the system dynamics, \bm{x}'(t) = \bm{f}(\bm{x}, \bm{u}, t) transforms to: \tilde{\bm{x}}'(s) = T \bm{f}\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT\big).
Similarly, the running cost \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t becomes: \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \int_0^1 L\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT\big) T \, \mathrm{d}s.
The Mayer cost \phi(\bm{x}(0), \bm{x}(T)) remains unchanged under the transformation.
Reformulated Problem
Using the normalized variables and the change of variable t = sT, the free-time problem can be rewritten as:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) + \int_0^1 L\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT}\big) {\color{red}T \, \mathrm{d}s}, \\ \text{Subject to:} \quad & \tilde{\bm{x}}'(s) = {\color{red}T} \bm{f}\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT}\big), \\ & \bm{b}\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) = \bm{0}. \end{aligned}
Here, T is now treated as an additional optimization variable alongside the state \tilde{\bm{x}}(s) and the control \tilde{\bm{u}}(s). The problem is now defined over the fixed interval [0, 1], which simplifies the numerical treatment.
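The bookkeeping of the normalization can be verified on a toy problem (hypothetical, not from the text): with dynamics x' = f = 1 and running cost L(x, u, t) = t, and the free time fixed at T = 2 for the check, the normalized integrals over s ∈ [0, 1] reproduce the original integrals over t ∈ [0, T]:

```python
import numpy as np

# Hypothetical check of the normalization t = s*T: dynamics x' = 1, cost L = t, T = 2.
T = 2.0

def trapezoid(y, x):
    """Composite trapezoidal rule (exact for the linear integrands below)."""
    y, x = np.asarray(y), np.asarray(x)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Original formulation: integrate over t in [0, T]
ts = np.linspace(0.0, T, 1001)
cost_orig = trapezoid(ts, ts)                     # integral of t dt = T^2 / 2
x_T = trapezoid(np.ones_like(ts), ts)             # integral of f dt = x(T) - x(0) = T

# Normalized formulation: integrate over s in [0, 1] with the extra factor T
ss = np.linspace(0.0, 1.0, 1001)
cost_norm = trapezoid(ss * T * T, ss)             # integral of L(sT) * T ds
x_tilde_1 = trapezoid(T * np.ones_like(ss), ss)   # integral of T * f ds

print(cost_orig, cost_norm)    # both equal T^2 / 2 = 2.0
print(x_T, x_tilde_1)          # both equal T = 2.0
```

In an actual solve, T would of course not be fixed but enter the transcription as a decision variable multiplying the dynamics and the running cost, exactly as highlighted in red above.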
Free-Time 2: T as a Constant Function
Let:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\bm{x}(0), \bm{x}(T)\big) + \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t, \\ \text{Subject to:} \quad & \bm{x}' = \bm{f}(\bm{x}, \bm{u}, t), \\ & \bm{b}\big(\bm{x}(0), \bm{x}(T)\big) = \bm{0}. \end{aligned}
Normalizing Time
Introduce a change of variables:
t = sT(s), \quad s \in [0, 1], \qquad T'(s)=0
where s is a normalized time variable and T(s) is a constant function (T'(s) = 0), treated as an additional state variable in the spirit of the section on constant parameters. Under this transformation:
- The state variable becomes \tilde{\bm{x}}(s) = \bm{x}(sT(s)), so its derivative transforms as: \begin{aligned} \tilde{\bm{x}}'(s) &= \frac{\mathrm{d}\tilde{\bm{x}}}{\mathrm{d}s} = \frac{\mathrm{d}\bm{x}}{\mathrm{d}t} \frac{\mathrm{d}t}{\mathrm{d}s} \\ & = \bm{x}'(sT(s))\,(sT(s))' \\ & = \bm{x}'(sT(s))\,(T(s) + sT'(s)) \\ & = \bm{x}'(sT(s))\,T(s) \end{aligned}
- The control variable becomes \tilde{\bm{u}}(s) = \bm{u}(sT(s)).
Substituting into the system dynamics, \bm{x}'(t) = \bm{f}(\bm{x}, \bm{u}, t) transforms to: \tilde{\bm{x}}'(s) = T(s) \bm{f}\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT(s)\big).
Similarly, the running cost \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t becomes: \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \int_0^1 L\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT(s)\big) T(s) \, \mathrm{d}s.
The Mayer cost \phi(\bm{x}(0), \bm{x}(T)) remains unchanged under the transformation.
Reformulated Problem
Using the normalized variables and the change of variable t = sT(s), the free-time problem can be rewritten as:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) + \int_0^1 L\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT(s)}\big) {\color{red}T(s) \, \mathrm{d}s}, \\ \text{Subject to:} \quad & \tilde{\bm{x}}'(s) = {\color{red}T(s)} \bm{f}\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT(s)}\big), \\ & \bm{b}\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) = \bm{0}. \end{aligned}