OCP Transformation
Optimal Control Problem
Let:
\begin{aligned} \bm{x}(t) & \in \Bbb{R}^n \qquad & \text{(state variables)} \\ \bm{\lambda}(t) & \in \Bbb{R}^n \qquad & \text{(costates or adjoint variables)} \\ \bm{u}(t) & \in \mathcal{U} \subseteq \Bbb{R}^m \qquad & \text{(control variables)} \end{aligned}
Problem Statement
We aim to solve the following Optimal Control Problem:
Objective: Minimize the performance index
\underbrace{ \underbrace{\Phi(\bm{x}(a),\bm{x}(b))}_{\textcolor{red}{\text{Mayer term}}} + \int_a^b \underbrace{L(\bm{x},\bm{u},t)}_{\textcolor{blue}{\text{Lagrange term}}} \, \mathrm{d}t }_{\textcolor{DarkGreen}{\text{Bolza form}}}
Subject to:
- System dynamics:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
- Control constraints:
\bm{u}(t) \in \mathcal{U}
Hamiltonian
The Hamiltonian H and the auxiliary boundary function B are defined as follows:
\begin{aligned} H(\bm{x}, \bm{\lambda}, \bm{u}, t) & = L(\bm{x}, \bm{u}, t) + \bm{\lambda} \cdot \bm{f}(\bm{x}, \bm{u}, t) \\ B(\bm{x}_a, \bm{x}_b, \bm{\omega}) & = \Phi(\bm{x}_a, \bm{x}_b) + \bm{\omega} \cdot \bm{b}(\bm{x}_a, \bm{x}_b) \end{aligned}
Where:
\begin{aligned} \bm{\lambda} \cdot \bm{f}(\bm{x}, \bm{u}, t) &= \sum_{k=1}^n \lambda_k f_k(\bm{x}, \bm{u}, t) \\ \bm{\omega} \cdot \bm{b}(\bm{x}_a, \bm{x}_b) &= \sum_{k=1}^p \omega_k b_k(\bm{x}_a, \bm{x}_b) \end{aligned}
Boundary Value Problem (BVP)
The candidate for the constrained optimal solution must satisfy the following Boundary Value Problem (BVP):
\left\{ \begin{aligned} \bm{x}^\prime & = \frac{\partial H}{\partial \bm{\lambda}}(\bm{x}, \bm{\lambda}, \bm{u}, t) = \bm{f}(\bm{x}, \bm{u}, t) \quad & \text{(state equation)} \\[1em] \bm{\lambda}^\prime & = -\frac{\partial H}{\partial \bm{x}}(\bm{x}, \bm{\lambda}, \bm{u}, t) \quad & \text{(adjoint equation)} \\[2em] \bm{0} & = \frac{\partial B}{\partial \bm{\omega}}(\bm{x}_a, \bm{x}_b, \bm{\omega}) = \bm{b}(\bm{x}_a, \bm{x}_b) \quad & \text{(boundary condition)} \\[2em] \bm{0} & = \frac{\partial B}{\partial \bm{x}_a}(\bm{x}(a), \bm{x}(b), \bm{\omega}) - \bm{\lambda}(a) \quad & \text{(additional boundary condition)} \\[1em] \bm{0} & = \frac{\partial B}{\partial \bm{x}_b}(\bm{x}(a), \bm{x}(b), \bm{\omega}) + \bm{\lambda}(b) \quad & \text{(additional boundary condition)} \\[2em] \bm{u}(t) & = \mathop{\textrm{argmin}}\limits_{\bm{v} \in \mathcal{U}} H(\bm{x}(t), \bm{\lambda}(t), \bm{v}, t) \quad & \text{(control equation)} \end{aligned} \right.
Remark 1 (On Boundary Conditions). From:
\begin{aligned} \bm{0} & = \frac{\partial B}{\partial \bm{x}_a}(\bm{x}(a), \bm{x}(b), \bm{\omega}) - \bm{\lambda}(a), \\ \bm{0} & = \frac{\partial B}{\partial \bm{x}_b}(\bm{x}(a), \bm{x}(b), \bm{\omega}) + \bm{\lambda}(b), \end{aligned}
it is often possible to eliminate the multiplier \bm{\omega} and express the boundary conditions in terms of \bm{x} and \bm{\lambda} alone.
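As a small worked instance (assuming the simplest boundary function: a fixed initial state \bm{x}(a) = \bm{x}_A and a free final state), the elimination is explicit:
\begin{aligned} \bm{b}(\bm{x}_a, \bm{x}_b) &= \bm{x}_a - \bm{x}_A, \qquad B(\bm{x}_a, \bm{x}_b, \bm{\omega}) = \Phi(\bm{x}_a, \bm{x}_b) + \bm{\omega} \cdot (\bm{x}_a - \bm{x}_A) \\ \bm{0} &= \frac{\partial B}{\partial \bm{x}_a} - \bm{\lambda}(a) \;\Longrightarrow\; \bm{\omega} = \bm{\lambda}(a) - \frac{\partial \Phi}{\partial \bm{x}_a}(\bm{x}(a), \bm{x}(b)) \quad \text{(fixes } \bm{\omega}\text{)} \\ \bm{0} &= \frac{\partial B}{\partial \bm{x}_b} + \bm{\lambda}(b) \;\Longrightarrow\; \bm{\lambda}(b) = -\frac{\partial \Phi}{\partial \bm{x}_b}(\bm{x}(a), \bm{x}(b)) \end{aligned}
The remaining conditions, \bm{x}(a) = \bm{x}_A and \bm{\lambda}(b) = -\partial \Phi/\partial \bm{x}_b, no longer involve \bm{\omega}.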
Lagrange to Mayer Transformation
Consider the following Optimal Control Problem (OCP):
Objective
Minimize the integral of the running cost L(\bm{x}, \bm{u}, t):
\text{minimize} \quad \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- System dynamics (state evolution):
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
Introducing an Auxiliary Variable
To transform the problem from Lagrange form to Mayer form, introduce a new auxiliary variable z to represent the accumulation of the running cost L(\bm{x}, \bm{u}, t):
z' = L(\bm{x}, \bm{u}, t), \qquad z(a) = 0
Reformulating the Objective
The integral of L(\bm{x}, \bm{u}, t) can be rewritten using the new variable z:
\int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \int_a^b z' \, \mathrm{d}t = z(b) - z(a) = z(b)
Thus, the original objective of minimizing the integral can now be expressed as minimizing z(b):
Transformed Objective
\text{minimize} \quad z(b)
Transformed System Dynamics
The system dynamics now include both the original state \bm{x} and the new variable z:
\left\{ \begin{aligned} \bm{x}' &= \bm{f}(\bm{x}, \bm{u}, t) \\ z' &= L(\bm{x}, \bm{u}, t) \end{aligned} \right.
Transformed Boundary Conditions
The boundary conditions now include the original condition on \bm{x} and the initial condition for z:
\left\{ \begin{aligned} \bm{b}(\bm{x}(a), \bm{x}(b)) &= \bm{0} \\ z(a) &= 0 \end{aligned} \right.
Reformulating the Problem Using New Variables
We can define the augmented state \bm{w} to combine \bm{x} and z, and correspondingly define the new system dynamics and boundary conditions:
\begin{aligned} \bm{w} &= \begin{pmatrix} \bm{x} \\ z \end{pmatrix}, \quad & \bm{F}(\bm{w}, \bm{u}, t) &= \begin{pmatrix} \bm{f}(\bm{x}, \bm{u}, t) \\ L(\bm{x}, \bm{u}, t) \end{pmatrix} \end{aligned}
The new objective becomes:
\Phi(\bm{w}(a), \bm{w}(b)) = z(b)
The boundary conditions are expressed as:
\bm{c}(\bm{w}(a), \bm{w}(b)) = \begin{pmatrix} \bm{b}(\bm{x}(a), \bm{x}(b)) \\ z(a) \end{pmatrix}
Final Transformed Problem
The transformed OCP now takes the Mayer form:
Objective
\text{minimize} \quad \Phi(\bm{w}(a), \bm{w}(b))
Subject to:
- Augmented system dynamics:
\bm{w}' = \bm{F}(\bm{w}, \bm{u}, t)
- Augmented boundary conditions:
\bm{c}(\bm{w}(a), \bm{w}(b)) = \bm{0}
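As a quick numerical sanity check of this augmentation (a minimal sketch, not from the text: the scalar dynamics x' = u, running cost L = x^2, fixed control u ≡ 1, and the helper names f, L, F, rk4 are all hypothetical), one can integrate the augmented system and verify that z(b) reproduces the accumulated running cost:

```python
import numpy as np

# Hypothetical scalar example: dynamics x' = u, running cost L = x^2.
def f(x, u, t):
    return u

def L(x, u, t):
    return x**2

# Augmented dynamics F(w, u, t) = (f, L): w = (x, z), z accumulates the running cost.
def F(w, u, t):
    x, z = w
    return np.array([f(x, u, t), L(x, u, t)])

def rk4(F, w0, u, a, b, n=1000):
    """Integrate w' = F(w, u(t), t) on [a, b] with n classical RK4 steps."""
    h = (b - a) / n
    w = np.array(w0, dtype=float)
    t = a
    for _ in range(n):
        k1 = F(w, u(t), t)
        k2 = F(w + 0.5 * h * k1, u(t + 0.5 * h), t + 0.5 * h)
        k3 = F(w + 0.5 * h * k2, u(t + 0.5 * h), t + 0.5 * h)
        k4 = F(w + h * k3, u(t + h), t + h)
        w = w + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return w

u = lambda t: 1.0                       # fixed control for the check
w_b = rk4(F, [0.0, 0.0], u, 0.0, 1.0)   # w(a) = (x(a), z(a)) = (0, 0)
# With x(0) = 0 and u = 1, x(t) = t, so z(1) = integral of t^2 = 1/3.
print(w_b[1])                            # ~ 0.3333
```

The Mayer objective of the transformed problem is then simply the last component of the augmented state at the final time, z(b).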
Mayer to Lagrange Transformation
Consider the following Optimal Control Problem (OCP) given in Mayer form.
Objective
Minimize the terminal cost \Phi(\bm{x}(a), \bm{x}(b)):
\text{minimize} \quad \Phi(\bm{x}(a), \bm{x}(b))
Subject to:
- System dynamics (state evolution):
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
Introducing an Auxiliary Function
To transform this problem from Mayer form to Lagrange form (which involves a running cost), we introduce a new auxiliary function g(t) that reconstructs the Mayer terminal cost \Phi(\bm{x}(a), \bm{x}(b)) as a cumulative function of t. This allows us to spread the cost over the time horizon and thus define a running cost function.
Define g(t) as:
g(t) = \Phi(\bm{x}(a), \bm{x}(t)) + \frac{t - a}{b - a} \Phi(\bm{x}(a), \bm{x}(a))
Here, \bm{x}(t) is the state at time t, so the first term tracks the terminal cost along the evolving trajectory. The second term, \frac{t - a}{b - a} \Phi(\bm{x}(a), \bm{x}(a)), grows linearly from 0 at t = a to \Phi(\bm{x}(a), \bm{x}(a)) at t = b; it compensates for the fact that g(a) = \Phi(\bm{x}(a), \bm{x}(a)) \neq 0 in general, so that the difference g(b) - g(a) reproduces the terminal cost exactly.
Deriving the Running Cost
Next, we differentiate g(t) with respect to time t to obtain a running cost.
First, compute g'(t):
g'(t) = \frac{\partial \Phi(\bm{x}(a), \bm{x}(t))}{\partial \bm{x}_b} \bm{x}'(t) + \frac{\Phi(\bm{x}(a), \bm{x}(a))}{b - a}
Since \bm{x}'(t) = \bm{f}(\bm{x}, \bm{u}, t) (from the system dynamics), we substitute this into the expression:
g'(t) = \frac{\partial \Phi(\bm{x}(a), \bm{x}(t))}{\partial \bm{x}_b} \bm{f}(\bm{x}, \bm{u}, t) + \frac{\Phi(\bm{x}(a), \bm{x}(a))}{b - a}
Integrating the Running Cost
By integrating g'(t) over the time interval [a, b], we can reconstruct the original Mayer cost \Phi(\bm{x}(a), \bm{x}(b)):
\int_a^b g'(t) \, \mathrm{d}t = g(b) - g(a)
Substituting the values of g(b) and g(a), we get:
g(b) - g(a) = \Phi(\bm{x}(a), \bm{x}(b)) + \cancel{\Phi(\bm{x}(a), \bm{x}(a))} - \cancel{\Phi(\bm{x}(a), \bm{x}(a))}
Thus, the integral of g'(t) gives us the original Mayer cost:
\int_a^b g'(t) \, \mathrm{d}t = \Phi(\bm{x}(a), \bm{x}(b))
Defining the Lagrange Running Cost
Now, we can define the running cost L(\bm{x}, \bm{u}, t) for the Lagrange formulation of the problem. From the expression for g'(t), we set:
L(\bm{x}, \bm{u}, t) = \frac{\partial \Phi(\bm{x}(a), \bm{x}(t))}{\partial \bm{x}_b} \bm{f}(\bm{x}, \bm{u}, t) + \frac{\Phi(\bm{x}(a), \bm{x}(a))}{b - a}
This running cost L(\bm{x}, \bm{u}, t) captures the instantaneous rate of change of the terminal cost along the trajectory, plus the constant rate \Phi(\bm{x}(a), \bm{x}(a))/(b - a). Note that it depends on the initial state \bm{x}(a) and explicitly on t; this is admissible, though it makes the resulting Lagrange problem slightly nonstandard.
Final Transformed Problem (Lagrange Form)
The problem is now expressed in Lagrange form:
Objective
Minimize the integral of the running cost L(\bm{x}, \bm{u}, t):
\text{minimize} \quad \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- System dynamics:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
By introducing the auxiliary function g(t) and deriving the running cost L(\bm{x}, \bm{u}, t), we have successfully transformed the Mayer problem into an equivalent Lagrange problem.
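The identity \int_a^b L \, \mathrm{d}t = \Phi(\bm{x}(a), \bm{x}(b)) can be checked numerically on a toy instance (hypothetical, not from the text): take \Phi(x_a, x_b) = x_b^2, scalar dynamics x' = u with u ≡ 1 on [a, b] = [0, 1], and x(a) = 0.5, so the trajectory is x(t) = 0.5 + t in closed form.

```python
import numpy as np

# Hypothetical toy instance: Phi(xa, xb) = xb^2, x' = u = 1, [a, b] = [0, 1].
a, b, x0 = 0.0, 1.0, 0.5
x = lambda t: x0 + t                      # closed-form trajectory for u = 1

# Running cost from the construction:
#   L(x, u, t) = dPhi/dx_b(x(a), x(t)) * f(x, u, t) + Phi(x(a), x(a)) / (b - a)
# Here dPhi/dx_b = 2 x(t) and f = 1.
L = lambda t: 2.0 * x(t) * 1.0 + x0**2 / (b - a)

# Composite trapezoidal quadrature of L over [a, b] (exact for this linear integrand)
ts = np.linspace(a, b, 100001)
ys = np.array([L(t) for t in ts])
integral = float(np.sum((ys[1:] + ys[:-1]) * np.diff(ts)) / 2.0)

Phi_ab = x(b) ** 2                        # original Mayer cost Phi(x(a), x(b)) = 2.25
print(integral, Phi_ab)                   # the two values agree
```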
Remark 2. In practice, the Mayer, Lagrange, and Bolza forms of the objective are equivalent: each can be converted into the others by the transformations above.
Constant Parameters to Constant Functions
Consider the following Optimal Control Problem (OCP), where the performance index depends also on constant parameters \bm{\mu} \in \mathbb{R}^q.
Objective
Minimize the performance index, which is a combination of the terminal cost \Phi and the integral of the running cost L:
\Phi(\bm{x}(a), \bm{x}(b), \bm{\mu}) + \int_a^b L(\bm{x}, \bm{u}, \bm{\mu}, t) \, \mathrm{d}t
Subject to:
- System dynamics:
The system evolves according to the following differential equation, which depends on both the state \bm{x} and the control \bm{u}, as well as the constant parameters \bm{\mu}:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, \bm{\mu}, t)
- Boundary conditions:
The system must satisfy specific boundary conditions at the initial and terminal time points:
\bm{b}(\bm{x}(a), \bm{x}(b), \bm{\mu}) = \bm{0}
- Control constraints:
The control \bm{u}(t) must remain within a predefined admissible set \mathcal{U}:
\bm{u}(t) \in \mathcal{U}
Here, \bm{\mu} \in \mathbb{R}^q are constant parameters that remain fixed over time. However, we can transform the problem by introducing constant functions that represent these parameters.
Introducing Constant Functions
To rewrite the problem in a form suitable for certain optimization techniques, we introduce constant functions for each component of the parameter vector \bm{\mu}:
\mu_k(t) \equiv \mu_k, \qquad k = 1, 2, \ldots, q
Each \mu_k(t) is a constant function that satisfies the differential equation:
\mu_k'(t) = 0 \quad \text{for all} \quad t \in [a, b]
In vector form, this is written as:
\bm{\mu}'(t) = \bm{0}
Reformulated Problem
By introducing these constant functions, we can now treat the parameters \bm{\mu} as state variables that evolve according to the trivial dynamic equation \bm{\mu}'(t) = \bm{0}. This allows us to apply standard optimal control techniques to problems involving constant parameters. The reformulated problem is as follows:
Objective
Minimize the modified performance index:
\Phi(\bm{x}(a), \bm{x}(b), \bm{\mu}(a)) + \int_a^b L(\bm{x}(t), \bm{u}(t), \bm{\mu}(t), t) \, \mathrm{d}t
Subject to:
- Extended system dynamics:
\left\{ \begin{aligned} \bm{x}' &= \bm{f}(\bm{x}, \bm{u}, \bm{\mu}, t) \\ \bm{\mu}' &= \bm{0} \end{aligned} \right.
- Boundary conditions:
\bm{b}(\bm{x}(a), \bm{x}(b), \bm{\mu}(a)) = \bm{0}
- Control constraints:
\bm{u}(t) \in \mathcal{U}
Remark 3. By introducing the constant functions \mu_k(t), we have transformed the original OCP with fixed parameters \bm{\mu} into an equivalent OCP where these parameters are treated as constant functions governed by the trivial dynamic equation \bm{\mu}'(t) = \bm{0}. This reformulation allows the use of standard optimal control methods to solve problems with constant parameters.
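The extension is easy to exercise numerically (a minimal sketch, not from the text: the dynamics x' = mu * x and the helper names F, rk4 are hypothetical). The parameter mu is appended to the state with the trivial dynamics mu' = 0, and an integrator confirms that it stays constant while the state evolves:

```python
import numpy as np

# Hypothetical sketch: treat a constant parameter mu as an extra state with mu' = 0.
# Example dynamics x' = mu * x, so the augmented state is w = (x, mu).
def F(w):
    x, mu = w
    return np.array([mu * x, 0.0])        # mu' = 0 keeps the parameter constant

def rk4(F, w0, T, n=1000):
    """Integrate the autonomous system w' = F(w) on [0, T] with n RK4 steps."""
    h = T / n
    w = np.array(w0, dtype=float)
    for _ in range(n):
        k1 = F(w)
        k2 = F(w + 0.5 * h * k1)
        k3 = F(w + 0.5 * h * k2)
        k4 = F(w + h * k3)
        w = w + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return w

w_end = rk4(F, [1.0, 2.0], 1.0)           # x(0) = 1, mu = 2
print(w_end)                              # x(1) ~ e^2 ~ 7.389, mu unchanged at 2.0
```

With this device, a solver that only handles states and controls can also optimize over mu: the boundary conditions pin mu(a) wherever the original problem fixes the parameter, or leave it free when mu is itself a decision variable.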
Integral Constraints
Consider the following Optimal Control Problem (OCP), where the performance index is influenced by certain integral constraints.
Objective
Minimize the performance index, which consists of a terminal cost and the integral of a running cost:
\text{Minimize} \quad \Phi(\bm{x}(a), \bm{x}(b)) + \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- System Dynamics:
The evolution of the state variables is governed by the following differential equation:
\bm{x}' = \bm{f}(\bm{x}, \bm{u}, t)
- Integral Constraints:
The problem must satisfy specific integral constraints over the time interval [a, b]:
\int_a^b \bm{g}(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \bm{0}
- Boundary Conditions:
The system must adhere to specified boundary conditions at both the initial and terminal time points:
\bm{b}(\bm{x}(a), \bm{x}(b)) = \bm{0}
- Control Constraints:
The control input \bm{u}(t) is required to remain within a predefined admissible set \mathcal{U}:
\bm{u}(t) \in \mathcal{U}
Introducing Auxiliary Functions
To effectively incorporate the integral constraints into our OCP, we introduce auxiliary functions \bm{z}(t) that capture the cumulative behavior of the integral constraints over time. We define \bm{z}(t) as follows:
\bm{z}(t) = \int_a^t \bm{g}(\bm{x}(s), \bm{u}(s), s) \, \mathrm{d}s
This function represents the accumulated effect of the constraint \bm{g} from the starting time a up to any time t. The derivative of \bm{z}(t) is given by:
\bm{z}'(t) = \bm{g}(\bm{x}(t), \bm{u}(t), t)
Reformulated Problem
By introducing the auxiliary function \bm{z}(t), we can reformulate the OCP as follows:
Objective
The objective remains to minimize the modified performance index:
\text{Minimize} \quad \Phi(\bm{x}(a), \bm{x}(b)) + \int_a^b L(\bm{x}, \bm{u}, t) \, \mathrm{d}t
Subject to:
- Extended System Dynamics:
The dynamics of the state variables and the auxiliary function can now be described as:
\left\{ \begin{aligned} \bm{x}' &= \bm{f}(\bm{x}, \bm{u}, t) \\ \bm{z}' &= \bm{g}(\bm{x}, \bm{u}, t) \end{aligned} \right.
- Integral Constraints:
The integral constraints become boundary conditions on the auxiliary function: imposing \bm{z}(a) = \bm{0} and \bm{z}(b) = \bm{0} is equivalent to \int_a^b \bm{g}(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \bm{0}.
- Boundary Conditions:
The original boundary conditions are augmented with those on \bm{z}:
\left\{ \begin{aligned} \bm{b}(\bm{x}(a), \bm{x}(b)) &= \bm{0} \\ \bm{z}(a) &= \bm{0} \\ \bm{z}(b) &= \bm{0} \end{aligned} \right.
- Control Constraints:
The control constraints continue to apply:
\bm{u}(t) \in \mathcal{U}
In summary, the auxiliary function \bm{z}(t) turns the integral constraints into additional dynamics and boundary conditions, so that the problem fits the standard OCP template.
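A small numerical illustration (hypothetical data, not from the text): along the trajectory x(t) = t on [0, 1], the constraint function g(x, u, t) = x - 1/2 satisfies \int_0^1 g \, \mathrm{d}t = 0, and accumulating z' = g from z(0) = 0 indeed lands near z(1) = 0:

```python
import numpy as np

# Hypothetical sketch: recast the integral constraint (integral of g = 0) as a state z
# with z' = g, z(a) = 0, and terminal condition z(b) = 0.
# Toy data: trajectory x(t) = t on [0, 1] and g(x, u, t) = x - 1/2.
ts = np.linspace(0.0, 1.0, 100001)
g_vals = ts - 0.5                         # g evaluated along the trajectory

# Forward-Euler accumulation of z' = g starting from z(0) = 0
h = ts[1] - ts[0]
z = 0.0
for gv in g_vals[:-1]:
    z += h * gv
print(z)                                  # ~ 0: z(b) = 0 encodes the integral constraint
```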
Free-Time
In some optimal control problems, the time horizon T is not fixed but instead is an unknown variable that must be determined as part of the solution. This setup leads to what is known as the free-time problem, which can be formalized as:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\bm{x}(0), \bm{x}(T)\big) + \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t, \\ \text{Subject to:} \quad & \bm{x}' = \bm{f}(\bm{x}, \bm{u}, t), \\ & \bm{b}\big(\bm{x}(0), \bm{x}(T)\big) = \bm{0}, \end{aligned}
where:
- \bm{x}(t) is the state variable,
- \bm{u}(t) is the control input,
- T is the free (unknown) time horizon,
- \phi(\bm{x}(0), \bm{x}(T)) is the Mayer (terminal) cost,
- L(\bm{x}, \bm{u}, t) is the Lagrange (running) cost,
- \bm{f}(\bm{x}, \bm{u}, t) defines the system dynamics,
- \bm{b}(\bm{x}(0), \bm{x}(T)) = \bm{0} represents the boundary conditions.
Normalizing Time
To handle the unknown time horizon T, a common approach is to normalize the time interval to a fixed domain, say [0, 1]. This is achieved by introducing a change of variables:
t = sT, \quad s \in [0, 1],
where s is a normalized time variable. Under this transformation:
- The state variable becomes \tilde{\bm{x}}(s) = \bm{x}(sT), so its derivative transforms as: \tilde{\bm{x}}'(s) = \frac{\mathrm{d}\tilde{\bm{x}}}{\mathrm{d}s} = \frac{\mathrm{d}\bm{x}}{\mathrm{d}t} \frac{\mathrm{d}t}{\mathrm{d}s} = \bm{x}'(sT)\,T.
- The control variable becomes \tilde{\bm{u}}(s) = \bm{u}(sT).
Substituting into the system dynamics, \bm{x}'(t) = \bm{f}(\bm{x}, \bm{u}, t) transforms to: \tilde{\bm{x}}'(s) = T \bm{f}\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT\big).
Similarly, the running cost \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t becomes: \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \int_0^1 L\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT\big) T \, \mathrm{d}s.
The Mayer cost \phi(\bm{x}(0), \bm{x}(T)) remains unchanged under the transformation.
Reformulated Problem
Using the normalized variables and the change of variable t = sT, the free-time problem can be rewritten as:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) + \int_0^1 L\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT}\big) {\color{red}T \, \mathrm{d}s}, \\ \text{Subject to:} \quad & \tilde{\bm{x}}'(s) = {\color{red}T} \bm{f}\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT}\big), \\ & \bm{b}\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) = \bm{0}. \end{aligned}
Here, T is now treated as an additional optimization variable alongside the state \tilde{\bm{x}}(s) and the control \tilde{\bm{u}}(s). The problem is now defined over the fixed interval [0, 1], which simplifies the numerical treatment.
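The bookkeeping of the normalization can be verified on a toy problem (hypothetical, not from the text): with dynamics x' = f = 1 and running cost L(x, u, t) = t, and the free time fixed at T = 2 for the check, the normalized integrals over s ∈ [0, 1] reproduce the original integrals over t ∈ [0, T]:

```python
import numpy as np

# Hypothetical check of the normalization t = s*T: dynamics x' = 1, cost L = t, T = 2.
T = 2.0

def trapezoid(y, x):
    """Composite trapezoidal rule (exact for the linear integrands below)."""
    y, x = np.asarray(y), np.asarray(x)
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Original formulation: integrate over t in [0, T]
ts = np.linspace(0.0, T, 1001)
cost_orig = trapezoid(ts, ts)                     # integral of t dt = T^2 / 2
x_T = trapezoid(np.ones_like(ts), ts)             # integral of f dt = x(T) - x(0) = T

# Normalized formulation: integrate over s in [0, 1] with the extra factor T
ss = np.linspace(0.0, 1.0, 1001)
cost_norm = trapezoid(ss * T * T, ss)             # integral of L(sT) * T ds
x_tilde_1 = trapezoid(T * np.ones_like(ss), ss)   # integral of T * f ds

print(cost_orig, cost_norm)    # both equal T^2 / 2 = 2.0
print(x_T, x_tilde_1)          # both equal T = 2.0
```

In an actual solve, T would of course not be fixed but enter the transcription as a decision variable multiplying the dynamics and the running cost, exactly as highlighted in red above.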
Free-Time 2: T as a Constant Function
Let:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\bm{x}(0), \bm{x}(T)\big) + \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t, \\ \text{Subject to:} \quad & \bm{x}' = \bm{f}(\bm{x}, \bm{u}, t), \\ & \bm{b}\big(\bm{x}(0), \bm{x}(T)\big) = \bm{0}. \end{aligned}
Normalizing Time
Introduce a change of variables:
t = sT(s), \quad s \in [0, 1], \qquad T'(s)=0
where s is a normalized time variable and T(s) is a constant function (T'(s) = 0), treated as an additional state variable in the spirit of the section on constant parameters. Under this transformation:
- The state variable becomes \tilde{\bm{x}}(s) = \bm{x}(sT(s)), so its derivative transforms as: \begin{aligned} \tilde{\bm{x}}'(s) &= \frac{\mathrm{d}\tilde{\bm{x}}}{\mathrm{d}s} = \frac{\mathrm{d}\bm{x}}{\mathrm{d}t} \frac{\mathrm{d}t}{\mathrm{d}s} \\ & = \bm{x}'(sT(s))\,(sT(s))' \\ & = \bm{x}'(sT(s))\,(T(s) + sT'(s)) \\ & = \bm{x}'(sT(s))\,T(s) \end{aligned}
- The control variable becomes \tilde{\bm{u}}(s) = \bm{u}(sT(s)).
Substituting into the system dynamics, \bm{x}'(t) = \bm{f}(\bm{x}, \bm{u}, t) transforms to: \tilde{\bm{x}}'(s) = T(s) \bm{f}\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT(s)\big).
Similarly, the running cost \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t becomes: \int_0^T L(\bm{x}, \bm{u}, t) \, \mathrm{d}t = \int_0^1 L\big(\tilde{\bm{x}}(s), \tilde{\bm{u}}(s), sT(s)\big) T(s) \, \mathrm{d}s.
The Mayer cost \phi(\bm{x}(0), \bm{x}(T)) remains unchanged under the transformation.
Reformulated Problem
Using the normalized variables and the change of variable t = sT(s), the free-time problem can be rewritten as:
\begin{aligned} \text{Minimize:} \quad & \phi\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) + \int_0^1 L\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT(s)}\big) {\color{red}T(s) \, \mathrm{d}s}, \\ \text{Subject to:} \quad & \tilde{\bm{x}}'(s) = {\color{red}T(s)} \bm{f}\big(\tilde{\bm{x}}, \tilde{\bm{u}}, {\color{red}sT(s)}\big), \\ & \bm{b}\big(\tilde{\bm{x}}(0), \tilde{\bm{x}}(1)\big) = \bm{0}. \end{aligned}