Minimization of a Functional with generic boundary conditions

Authors

Enrico Bertolazzi, University of Trento, Department of Industrial Engineering

Matteo Dalle Vedove, University of Trento, Department of Industrial Engineering

Extended definition

Considering now the functional \mathcal{B}(x) defined as

\mathcal{B}(x) = \int_a^b L (x(t),x'(t),x''(t),t)\, \mathrm{d}t \tag{1}

the problem is to minimize \mathcal{B}(x) over all functions x\in\Bbb{V}, where the function space is defined as

\Bbb{V} = \left\{ x \textrm{ such that } \quad \begin{aligned} & x\in C^4([a,b]) \\ & x(a) = x_a, x'(a) = x_a', x(b) = x_b, x'(b) = x_b' \end{aligned} \right\}

and the admissible directions (variations) \delta x \in\Bbb{D} belong to the space

\Bbb{D} = \left\{ \delta x \textrm{ such that } \begin{aligned} & \delta x\in C^\infty([a,b]) \\ & \delta x(a) = \delta x'(a) = \delta x(b) = \delta x'(b) = 0 \end{aligned} \right\}

Computing the directional derivative of the functional \mathcal{B} (skipping the intermediate steps, which are analogous to the cases already studied), we obtain:

\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\alpha}\Big|_{\alpha = 0} \mathcal B(x+\alpha\, \delta x) & = \frac{\mathrm{d}}{\mathrm{d}\alpha}\Big|_{\alpha = 0} \int_a^b L \big( x + \alpha\, \delta x, x' + \alpha\, \delta x', x''+\alpha\, \delta x'',t\big) \, \mathrm{d}t \\[1em] & = \int_a^b \left( \frac{\partial L}{\partial x}(x,x',x'',t)\delta x + \frac{\partial L}{\partial x'}\delta x' + \frac{\partial L}{\partial x''}\delta x'' \right) \, \mathrm{d}t \end{aligned}

To eliminate the terms involving \delta x' and \delta x'' we integrate by parts, using the following product-rule identities:

\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial x'}\delta x \right) & = \left( \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'} \right)\delta x + \frac{\partial L}{\partial x'}\delta x' \\[1em] \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial x''}\delta x' \right) & = \left(\frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''}\right)\delta x' + \frac{\partial L}{\partial x''}\delta x'' \end{aligned}

With these identities the derivative becomes

\begin{aligned} \delta \mathcal{B} & = \int_a^b \frac{\partial L}{\partial x}\delta x + \cancel{\frac{\mathrm{d}}{\mathrm{d}t} \left(\frac{\partial L}{\partial x'}\delta x \right)} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'}\delta x \\ &\qquad + \cancel{\frac{\mathrm{d}}{\mathrm{d}t}\left( \frac{\partial L}{\partial x''}\delta x' \right) } - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''}\delta x' \, \mathrm{d}t \\ & = \int_a^b \left[\left( \frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'}\right)\delta x - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''}\delta x' \right]\, \mathrm{d}t \end{aligned}

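The terms struck out in the first line vanish because, once integrated, each of them reduces to a boundary term that is zero for every \delta x\in\Bbb{D} (the second one because \delta x'(a) = \delta x'(b) = 0). For the first one: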

\int_a^b \frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\partial L}{\partial x'}\delta x\right) \, \mathrm{d}t = \left[ \frac{\partial L}{\partial x'}\delta x \right]_a^b \xrightarrow{\delta x(a) = \delta x(b) = 0} 0

To complete the derivation we need one last integration by parts, based on the identity

\frac{\mathrm{d}}{\mathrm{d}t} \left( \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''}\delta x \right) = \frac{\mathrm{d}^2}{\mathrm{d}t^2} \frac{\partial L}{\partial x''}\delta x + \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''}\delta x'

\implies \qquad \delta\mathcal B(x) = \int_a^b \underbrace{\left( \frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'} + \frac{\mathrm{d}^2}{\mathrm{d}t^2} \frac{\partial L}{\partial x''}\right)}_{=f} \delta x\, \mathrm{d}t \tag{2}

By the fundamental lemma of the calculus of variations, the function x that minimizes the functional \mathcal{B} must satisfy the differential equation

\frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'} + \frac{\mathrm{d}^2}{\mathrm{d}t^2} \frac{\partial L}{\partial x''} = 0

subject to the boundary conditions defined initially.

Example 1 (minimization of a functional with the Euler-Lagrange method) Let’s consider the problem

\begin{aligned} \textrm{minimize:}\qquad & \int_a^b \Big( \big(x''\big)^2 - x^2 + t \Big)\,\mathrm{d}t \\ \textrm{with:}\qquad & x(a) = 0,\quad x'(a) = 1 \\ & x(b) = 1,\quad x'(b) = 2 \end{aligned}

In this case the integrand of the functional is

L(x,x',x'',t) = (x'')^2 - x^2+t

and all four boundary conditions are prescribed. The variation of the functional

\mathcal{A}(x) = \int_a^b L (x,x',x'',t)\, \mathrm{d}t

then follows from Equation 2:

\delta \mathcal{A} = \int_a^b \left( -2x - 0 + \frac{\mathrm{d}^2}{\mathrm{d}t^2}\big(2x''\big) \right) \delta x\, \mathrm{d}t

By the fundamental lemma of the calculus of variations, the term in brackets must vanish identically; together with the boundary conditions, this gives the boundary value problem that determines the solution:

\begin{cases} x^{\prime\prime\prime\prime} - x = 0 \\ x(a) = 0, x'(a) = 1 \\ x(b) = 1, x'(b) = 2 \end{cases}

The differential equation is of order 4 and comes with 4 boundary conditions, so the solution can be computed.
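As an illustration, the boundary value problem of Example 1 can be solved in closed form once the interval is fixed. The following Python sketch assumes a = 0 and b = 1 (the example leaves a and b generic), writes the general solution of x'''' - x = 0 as a combination of e^t, e^{-t}, \cos t, \sin t, and finds the four coefficients from the boundary conditions. The helper names are illustrative and not part of the original notes.

```python
import math

def basis(t):
    """Values and first derivatives of e^t, e^-t, cos t, sin t at time t."""
    values = [math.exp(t), math.exp(-t), math.cos(t), math.sin(t)]
    derivs = [math.exp(t), -math.exp(-t), -math.sin(t), math.cos(t)]
    return values, derivs

def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (m[i][n] - tail) / m[i][i]
    return sol

# Assumed interval (not fixed in the example): a = 0, b = 1.
a, b = 0.0, 1.0
values_a, derivs_a = basis(a)
values_b, derivs_b = basis(b)

# Rows: x(a) = 0, x'(a) = 1, x(b) = 1, x'(b) = 2.
coeffs = solve_linear([values_a, derivs_a, values_b, derivs_b],
                      [0.0, 1.0, 1.0, 2.0])

def x(t):
    """Stationary function x(t) = sum_i coeffs[i] * basis_i(t)."""
    return sum(c * v for c, v in zip(coeffs, basis(t)[0]))

def dx(t):
    """First derivative of the stationary function."""
    return sum(c * d for c, d in zip(coeffs, basis(t)[1]))

print("coefficients of e^t, e^-t, cos t, sin t:", coeffs)
print("x(a), x'(a), x(b), x'(b):", x(a), dx(a), x(b), dx(b))
```

Every basis function satisfies x'''' = x, so only the boundary conditions need to be imposed; the printed endpoint values should reproduce 0, 1, 1, 2 up to rounding.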

Boundary conditions

So far we considered minimization problems in which all boundary values are prescribed, but this is not always the case. Equation 2 was derived under the assumption that the variation and its first derivative vanish at both endpoints (\delta x = \delta x' = 0 at t = a and t = b). When this does not hold, the general expression of the first variation is

\begin{aligned} \delta \mathcal{B}(x;\delta x) &= \int_a^b \left( \frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'} + \frac{\mathrm{d}^2}{\mathrm{d}t^2} \frac{\partial L}{\partial x''} \right) \delta x\, \mathrm{d}t \\ & + \left[ \left( \frac{\partial L}{\partial x'}- \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''}\right) \delta x \right]_a^b + \left[ \frac{\partial L}{\partial x''}\, \delta x' \right]_a^b \\ &= \int_a^b (A)\delta x\, \mathrm{d}t + (B) \delta x(b) - (C)\delta x(a) + (D) \delta x'(b) - (E) \delta x'(a) \end{aligned}

where

\begin{aligned} (A) &= \frac{\partial L}{\partial x}(x,x',x'',t) - \frac{\mathrm{d}}{\mathrm{d}t}\frac{\partial L}{\partial x'}(x,x',x'',t) + \frac{\mathrm{d}^2}{\mathrm{d}t^2}\frac{\partial L}{\partial x''}(x,x',x'',t) \\[1em] (B) &= \left. \left( \frac{\partial L}{\partial x'} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''} \right) \right|_{t=b} \\[1em] (C) &= \left. \left( \frac{\partial L}{\partial x'} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x''} \right) \right|_{t=a} \\[1em] (D) &= \left. \frac{\partial L}{\partial x''} \right|_{t=b} \\[1em] (E) &= \left. \frac{\partial L}{\partial x''} \right|_{t=a} \end{aligned}

Suppose now that the only prescribed boundary conditions are

x(a) = x_a \qquad x'(b) = x_b'

This is reflected in the space of admissible variations \delta x\in\Bbb{D}, which becomes

\widetilde{\Bbb{D}} = \big\{ \delta x\in C^{\infty}([a,b]) \textrm{ such that } \delta x(a) = \delta x'(b) = 0 \big\}

In general, the space \widetilde{\Bbb{D}} of the variations imposes conditions on the variation (and its derivative) only at the points where boundary conditions are prescribed. In the case considered here, the minimizer must satisfy

\delta\mathcal{B}(x;\delta x) = \int_a^b (A) \delta x\, \mathrm{d}t \quad + (B) \delta x(b) - (E)\delta x'(a) = 0 \tag{3}

for all \delta x \in \widetilde{\Bbb{D}}. The variation has 2 boundary degrees of freedom, which we denote \delta x(b) = \delta_{xb} and \delta x'(a) = \delta_{xa}' (both real-valued parameters). The simplest variation \delta x(t) depending on these parameters is a cubic polynomial:

\begin{aligned} \delta x(t) = \delta_{xa}' (t-a) & + \frac{ 2\delta_{xa}'(a-b) + 3 \delta_{xb} }{(a-b)^2} \big( t-a \big)^2 \\ & + \frac{ \delta_{xa}'(a-b) + 2 \delta_{xb} }{(a-b)^3} \big(t-a\big)^3 \end{aligned}
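The four endpoint properties of this cubic can be checked numerically. The short Python sketch below evaluates the polynomial and its derivative for arbitrarily chosen values of a, b, \delta_{xb} and \delta_{xa}' (these sample values and the function names are assumptions made only for the check).

```python
def cubic_coefficients(a, b, d_xb, d_xa_prime):
    """Coefficients of (t-a)^2 and (t-a)^3 in the cubic variation."""
    c2 = (2.0 * d_xa_prime * (a - b) + 3.0 * d_xb) / (a - b) ** 2
    c3 = (d_xa_prime * (a - b) + 2.0 * d_xb) / (a - b) ** 3
    return c2, c3

def delta_x(t, a, b, d_xb, d_xa_prime):
    """Variation delta x(t) built from the two free boundary parameters."""
    c2, c3 = cubic_coefficients(a, b, d_xb, d_xa_prime)
    s = t - a
    return d_xa_prime * s + c2 * s ** 2 + c3 * s ** 3

def delta_x_prime(t, a, b, d_xb, d_xa_prime):
    """Derivative of the variation with respect to t."""
    c2, c3 = cubic_coefficients(a, b, d_xb, d_xa_prime)
    s = t - a
    return d_xa_prime + 2.0 * c2 * s + 3.0 * c3 * s ** 2

# Arbitrary sample values, chosen only to exercise the formulas.
a, b = 1.0, 3.0
d_xb, d_xa_prime = 0.7, -1.3

print("delta x(a)  =", delta_x(a, a, b, d_xb, d_xa_prime))        # constrained to 0
print("delta x'(a) =", delta_x_prime(a, a, b, d_xb, d_xa_prime))  # free: d_xa_prime
print("delta x(b)  =", delta_x(b, a, b, d_xb, d_xa_prime))        # free: d_xb
print("delta x'(b) =", delta_x_prime(b, a, b, d_xb, d_xa_prime))  # constrained to 0
```

The two constrained values vanish for any choice of the parameters, so the polynomial indeed belongs to \widetilde{\Bbb{D}}, while the two free values reproduce the chosen parameters.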

Since the Gateaux derivative \delta \mathcal{B}(x;\delta x) must vanish for every function \delta x(t) in its domain, we can choose the particular variation with parameters \delta_{xb} = 1 and \delta_{xa}' = 0, that is

\delta x(t) = \frac 3{(a-b)^2}\big(t-a\big)^2 + \frac 2{(a-b)^3} \big(t-a\big)^3

Since \delta x'(a) = 0 for this variation, the term (E) drops out of Equation 3; as (A) = 0 already follows from the fundamental lemma, a vanishing derivative requires (B)=0. Similarly, we can build a particular polynomial by choosing

\delta_{xb} = \delta x(b) = 0 \quad \textrm{and}\quad \delta_{xa}' = 1

which removes the term multiplying (B), so that a vanishing derivative requires (E)=0.

The equations associated with the terms (B) and (E) are called transversality conditions; they supply the conditions that are missing when the prescribed boundary conditions are not enough (indeed, these expressions are evaluated at points on the boundary of the domain). The solution of the original problem is then determined by

\begin{cases} \dfrac{\partial L }{\partial x} - \dfrac{\mathrm{d}}{\mathrm{d}t} \dfrac{\partial L }{\partial x'} + \dfrac{\mathrm{d}^2}{\mathrm{d}t^2} \dfrac{\partial L }{\partial x''} = 0 \\[1em] \dfrac{\partial L }{\partial x'} \Big|_b - \dfrac{\mathrm{d}}{\mathrm{d}t} \dfrac{\partial L }{\partial x''} \Big|_b = 0\\[1em] \dfrac{\partial L }{\partial x''} \Big|_a = 0 \\[1em] x(a) = x_a, \ x'(b) = x_b' \end{cases}

Example 2 (minimization with fewer boundary conditions) Let’s consider again the problem of minimizing the functional \mathcal A of Example 1, reported here

\textrm{minimize:}\qquad \int_a^b \Big( \big(x''\big)^2 - x^2 + t \Big)\,\mathrm{d}t

where the boundary conditions in this case are only

x(a) = 0 \qquad x'(b) = 2

Here the boundary conditions are not enough, and the situation is the one covered by the theory just described: the space of the variations \delta x is

\Bbb{D} = \Big\{ \delta x\in C^\infty([a,b]) \textrm{ such that } \delta x(a) =\delta x'(b) = 0 \Big\}

Nothing is prescribed for the values \delta x(b) and \delta x'(a), which are therefore free to take any value in \Bbb{R}; this yields the following two transversality conditions:

\begin{aligned} \dfrac{\partial L }{\partial x'} \Big|_b - \dfrac{\mathrm{d}}{\mathrm{d}t} \dfrac{\partial L }{\partial x''} \Big|_b & = 0-2x'''\Big|_b = -2x'''(b) = 0 \\ \dfrac{\partial L }{\partial x''} \Big|_a & = 2x''\Big|_a = 2x''(a) = 0 \end{aligned}

Together with the Euler-Lagrange equation and the prescribed boundary conditions, these give the following boundary value problem, whose solution is the candidate minimizer of the functional:

\begin{cases} x'''' - x = 0 \\ x'''(b) = 0 \\ x''(a) = 0 \\ x(a) = 0, x'(b) = 2 \end{cases}
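For a concrete interval this system can be solved by hand. Assuming a = 0 and b = 1 (an assumption, since the example keeps a and b generic), the conditions x(0) = 0 and x''(0) = 0 leave only the \sinh t and \sin t components, and the two conditions at t = 1 give x(t) = \sinh t / \cosh 1 + \sin t / \cos 1. The Python sketch below only verifies this hand-derived candidate against the four conditions.

```python
import math

COSH1 = math.cosh(1.0)
COS1 = math.cos(1.0)

# Candidate solution for a = 0, b = 1 and its derivatives.
# Both sinh and sin satisfy x'''' = x, so the ODE holds by construction.
def x(t):
    return math.sinh(t) / COSH1 + math.sin(t) / COS1

def dx(t):
    return math.cosh(t) / COSH1 + math.cos(t) / COS1

def d2x(t):
    return math.sinh(t) / COSH1 - math.sin(t) / COS1

def d3x(t):
    return math.cosh(t) / COSH1 - math.cos(t) / COS1

residuals = {
    "x(0) = 0": x(0.0),
    "x''(0) = 0 (transversality)": d2x(0.0),
    "x'''(1) = 0 (transversality)": d3x(1.0),
    "x'(1) = 2": dx(1.0) - 2.0,
}
for name, value in residuals.items():
    print(f"{name}: residual {value:.2e}")
```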

Example 3 (minimization of a functional) Let’s consider the problem

\begin{aligned} \textrm{minimize:}& \quad \mathcal{A}(y) = \int_0^1 \left( \frac{\big(y'(x)\big)^2}{2} + y(x)y'(x) + y(x) \right)\, \mathrm{d}x \\ \textrm{with:}& \quad y(1) = 1 \end{aligned}

In this case the lagrangian of the problem is defined as

L(y,y',x) = \dfrac{(y')^2}{2} + y y' + y

The first variation of this functional, set to zero, reads

\delta \mathcal{A}(y;\delta y) = \int_0^1 \left( \frac{\partial L}{\partial y} - \frac{\mathrm{d}}{\mathrm{d}x} \frac{\partial L}{\partial y'} \right)\delta y\, \mathrm{d}x + \left[ \frac{\partial L}{\partial y'}\delta y\right] \Big|_{x=0}^{x=1} = 0

To compute the derivative formally we first need to define the space of the variations \delta y which, with only one boundary condition, is

\Bbb{D} = \Big\{ \delta y \in C^\infty([0,1]) \textrm{ with } \delta y(1) = 0 \Big\}

At this point the Gateaux derivative can be computed as

\begin{aligned} \delta \mathcal{A}(y;\delta y) & = \int_0^1 \underbrace{\left( y' + 1 - (y'' + y') \right) }_{=f} \delta y \, \mathrm{d}x + \Big[ \big(y'+y\big) \delta y \Big]_{x=0}^{x=1} \\ & = \int_0^1 \big(1-y''\big) \delta y\, \mathrm{d}x - \underbrace{\big(y'(0)+y(0)\big)\delta y(0)}_{\textrm{transv. cond.}} \end{aligned}

By the fundamental lemma, the factor 1 - y'' inside the integral must vanish, which gives the differential equation of the problem, while the term y'+y evaluated at x = 0 gives the transversality condition that supplies the missing boundary condition. The resulting boundary value problem is:

\left\{ \begin{aligned} y''(x) &= 1 \\ y'(0) + y(0) &= 0 \\ y(1) &= 1 \end{aligned} \right.

Integrating the differential equation twice gives the general solution, whose coefficients are to be fixed by the remaining conditions:

y(x) = \dfrac{1}{2} x^2+ c_1x + c_2

Substituting the general solution into the transversality condition and the boundary condition gives the equations for the parameters:

\begin{cases} c_1 + c_2 = 0 \\ \dfrac{1}{2} + c_1+c_2 = 1 \end{cases}

This linear system has no solution (the first equation requires c_1 + c_2 = 0 while the second requires c_1 + c_2 = 1/2), so the first variation cannot vanish for any admissible y and the functional \mathcal{A} cannot be minimized.
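The conclusion of Example 3 can also be checked numerically. The Python sketch below evaluates \mathcal{A} on the family of straight lines y_c(x) = c + (1-c)x, all of which satisfy y(1) = 1; the family and the function name are illustrative choices, not taken from the notes. A direct integration gives \mathcal{A}(y_c) = (3-c)/2, which decreases without bound as c grows, consistent with the absence of a minimizer.

```python
def functional_on_line(c, n=100):
    """Midpoint-rule value of A(y) for y(x) = c + (1 - c) x on [0, 1]."""
    h = 1.0 / n
    slope = 1.0 - c                      # y'(x), constant along the line
    total = 0.0
    for k in range(n):
        x_mid = (k + 0.5) * h
        y = c + slope * x_mid
        total += h * (0.5 * slope ** 2 + y * slope + y)
    return total

for c in (0.0, 10.0, 100.0):
    print(f"c = {c:6.1f}  A(y_c) = {functional_on_line(c):10.4f}"
          f"  closed form (3 - c)/2 = {(3.0 - c) / 2.0:10.4f}")
```

The integrand is linear in x along each line, so the midpoint rule is exact here up to rounding.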

Functional minimization with generic constraints

Let’s consider the problem of minimizing a functional \mathcal{F} subject to an equality constraint on the boundary values, as follows:

\begin{aligned} \textrm{minimize:}& \qquad \mathcal F(z) = \int_a^b L (z,z',t)\, \mathrm{d}t \\ \textrm{subject to:}& \qquad w\big(z(a),z(b) \big) = 0 \end{aligned} \tag{4}

We look for the solution z(t) in the function space defined as

\Bbb{V} = \Big\{ z \in C^2([a,b]) \textrm{ such that } w\big(z(a),z(b)\big) = 0 \Big\}

In this case the function space is in general not linear (two functions z_1,z_2 can both satisfy the constraint w = 0 while their sum z_1 + z_2 does not), and it is harder to define the space of directions \Bbb{D} along which the derivatives can be computed.

Discretization approach

One way to solve this problem is to discretize it: the domain [a,b] is subdivided into n subintervals (here equally spaced) of length h = \frac{b-a}{n}; the t axis is thus discretized in the values

t_k = t_0 + k h = a + k \frac {b-a}n

With this definition we denote the values of the function z at the grid points by z(t_k)=z_k (note that t_0 = a and t_n = b). Using the midpoint quadrature rule to integrate, we can approximate the functional as

\mathcal{F}(z) = \int_a^b L (z,z',t)\, \mathrm{d}t \approx h \sum_{k=1}^{n} L \left( z_{k-\frac 1 2}, z_{k-\frac 1 2}', t_{k-\frac 1 2} \right)

where

z_{k-\tfrac{1}{2}} = \frac{z_k + z_{k-1}}{2} \quad z'_{k-\tfrac{1}{2}} = \frac{z_k - z_{k-1}}{h} \quad t_{k-\tfrac{1}{2}} = t_k - \frac{h}{2}
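The midpoint approximation above translates almost directly into code. The following Python sketch is a minimal illustration; the function name `discretized_functional` and the sample integrand are assumptions made for the example.

```python
def discretized_functional(L, z, a, b):
    """Midpoint-rule approximation of the integral of L(z, z', t) over [a, b].

    z is the list of grid values z_0, ..., z_n on an equally spaced grid.
    """
    n = len(z) - 1
    h = (b - a) / n
    total = 0.0
    for k in range(1, n + 1):
        z_mid = 0.5 * (z[k] + z[k - 1])      # z_{k-1/2}
        dz_mid = (z[k] - z[k - 1]) / h       # z'_{k-1/2}
        t_mid = a + (k - 0.5) * h            # t_{k-1/2}
        total += L(z_mid, dz_mid, t_mid)
    return h * total

# Sample integrand (an assumption for the demo): L = (z')^2 + z.
def sample_L(z, dz, t):
    return dz ** 2 + z

n = 10
grid_values = [k / n for k in range(n + 1)]   # z(t) = t sampled on [0, 1]
approx = discretized_functional(sample_L, grid_values, 0.0, 1.0)
print("midpoint approximation:", approx)      # exact integral of 1 + t is 3/2
```

For z(t) = t the integrand 1 + t is linear in t, so the midpoint rule reproduces the exact value 3/2 up to rounding.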

With this, the initial constrained minimization problem (Equation 4) can be discretized, obtaining the form

\begin{aligned} \textrm{minimize:}& \qquad f(\bm{z}) = h \sum_{k=1}^{n} L\left( z_{k-\tfrac{1}{2}}, z'_{k-\tfrac{1}{2}}, t_{k-\tfrac{1}{2}} \right) \\ \textrm{subject to:}& \qquad w\big(z_0,z_n \big) = 0 \end{aligned} \tag{5}

where \bm{z} = \big(z_0,z_1,\dots,z_n\big)^t is the vector of all the discretized values of the function z(t). This formulation is a finite-dimensional minimization problem with an equality constraint, so we can use the method of Lagrange multipliers by defining

\mathscr{L}(\bm{z},\lambda) = f(\bm{z}) - \lambda w(z_0,z_n)

To solve this kind of problem we need to find the stationary points of the lagrangian \mathscr{L} just built, which means solving the non-linear system given by the equations

\frac{\partial\mathscr{L}}{\partial z_i} = 0 \qquad \forall i = 0,1,\dots,n \qquad \frac{\partial\mathscr{L}}{\partial\lambda} = 0

Starting with i=0, the derivative of the lagrangian is

\begin{aligned} \frac{\partial\mathscr{L}}{\partial z_0} &= \frac{\partial}{\partial z_0} \left(h \sum_{k=1}^{n} L\left(\frac{z_k + z_{k-1}}{2}, \frac{z_k - z_{k-1}}{h}, t_{k-\tfrac{1}{2}}\right) - \lambda w(z_0,z_n) \right) \\ &= \frac{\partial}{\partial z_0} \Bigg(h L\Big( \underbrace{\color{blue} \frac{z_1 + z_0}{2}, \frac{z_1 - z_0}{h}, t_{\tfrac{1}{2}}}_{\color{blue}[1]} \Big) - \lambda w(z_0,z_n) \Bigg) \\ & = h \frac{\partial L([1])}{\partial z} \frac{\partial}{\partial z_0} \left( \frac{z_1 + z_0}{2} \right) + h \frac{\partial L([1])}{\partial z'} \frac{\partial}{\partial z_0} \left( \frac{z_1 - z_0}{h} \right) - \lambda \frac{\partial w(z_0,z_n)}{\partial z_0} \\ & = \frac h 2 \frac{\partial L({\color{blue}[1]})}{\partial z} - \frac{\partial L({\color{blue}[1]})}{\partial z'} - \lambda\frac{\partial w(z_0,z_n)}{\partial z_0} \end{aligned}

Note that in passing from the first to the second line only the term with k=1 depends on z_0, so only that part of the summation is kept.

For all the other values i\neq 0,n, the expression of the derivative \partial\mathscr{L}/\partial z_i is more involved (two elements of the summation contribute), so we introduce the shorthand notation

\frac{\partial L}{\partial z} \Big|_{k +\tfrac{1}{2}} := \frac{\partial}{\partial z}{L\left(\frac{z_{k+1}+z_k}2,\frac{z_{k+1}-z_k}h, t_{k+\tfrac{1}{2}}\right)}

With this, doing the steps as previously shown, we can compute the partial derivatives as

\begin{aligned} \frac{\partial\mathscr{L}}{\partial z_k} &= \frac{\partial}{\partial z_k} \left(h \sum_{j=1}^{n} L\left(\frac{z_j + z_{j-1}}{2}, \frac{z_j - z_{j-1}}{h}, t_{j-\tfrac{1}{2}}\right) - \lambda w(z_0,z_n) \right) \\ & = \frac{h}{2}\left( \frac{\partial L}{\partial z}\Big|_{k-\tfrac{1}{2}} + \frac{\partial L}{\partial z}\Big|_{k+\tfrac{1}{2}}\right) + \frac{\partial L}{\partial z'}\Big|_{k-\tfrac{1}{2}} - \frac{\partial L}{\partial z'}\Big|_{k+\tfrac{1}{2}} \\ & = h \left(\frac{ \dfrac{\partial L}{\partial z}\Big|_{k-\tfrac{1}{2}} + \dfrac{\partial L}{\partial z}\Big|_{k+\tfrac{1}{2}}}{2} - \frac{ \dfrac{\partial L}{\partial z'}\Big|_{k+\tfrac{1}{2}} - \dfrac{\partial L}{\partial z'}\Big|_{k-\tfrac{1}{2}}}{h} \right) \end{aligned}

The last partial derivative (computed for k=n) is instead

\frac{\partial\mathscr{L}}{\partial z_n} =\frac{h}{2} \frac{\partial L}{\partial z}\Big|_{n-\tfrac{1}{2}} + \frac{\partial L}{\partial z'}\Big|_{n-\tfrac{1}{2}} - \lambda \frac{\partial w(z_0,z_n)}{\partial z_n}
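The three derivative formulas (for i = 0, for interior indices and for i = n) can be sanity-checked against finite differences. The Python sketch below does this for the part of \mathscr{L} that comes from f(\bm{z}), leaving out the multiplier term; the integrand L = (z')^2/2 + t z^2 and all function names are assumptions chosen only for the check.

```python
import math

# Sample integrand (an assumption for the check) and its partial derivatives.
def L(z, dz, t):
    return 0.5 * dz ** 2 + t * z ** 2

def L_z(z, dz, t):
    return 2.0 * t * z

def L_dz(z, dz, t):
    return dz

def midpoint(z, k, a, h):
    """Arguments of L on the interval [t_{k-1}, t_k], i.e. at index k - 1/2."""
    return 0.5 * (z[k] + z[k - 1]), (z[k] - z[k - 1]) / h, a + (k - 0.5) * h

def f(z, a, b):
    """Discretized functional f(z) = h * sum_k L at the midpoints."""
    n = len(z) - 1
    h = (b - a) / n
    return h * sum(L(*midpoint(z, k, a, h)) for k in range(1, n + 1))

def gradient(z, a, b):
    """Analytic partial derivatives of f with respect to z_0, ..., z_n."""
    n = len(z) - 1
    h = (b - a) / n
    grad = []
    for k in range(n + 1):
        g = 0.0
        if k >= 1:       # interval ending at t_k, index k - 1/2
            args = midpoint(z, k, a, h)
            g += 0.5 * h * L_z(*args) + L_dz(*args)
        if k <= n - 1:   # interval starting at t_k, index k + 1/2
            args = midpoint(z, k + 1, a, h)
            g += 0.5 * h * L_z(*args) - L_dz(*args)
        grad.append(g)
    return grad

def finite_difference(z, a, b, eps=1e-6):
    """Central finite-difference approximation of the same derivatives."""
    grad = []
    for k in range(len(z)):
        z_plus, z_minus = list(z), list(z)
        z_plus[k] += eps
        z_minus[k] -= eps
        grad.append((f(z_plus, a, b) - f(z_minus, a, b)) / (2.0 * eps))
    return grad

a, b = 0.0, 1.0
z = [math.sin(k) for k in range(9)]          # arbitrary test vector, n = 8
max_error = max(abs(g - d) for g, d in
                zip(gradient(z, a, b), finite_difference(z, a, b)))
print("largest mismatch between the two gradients:", max_error)
```

The endpoint entries of `gradient` correspond to the formulas for z_0 and z_n without the \lambda\,\partial w/\partial z term, which would simply be subtracted for a given constraint w.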

With all these derivatives computed, the non-linear system of equations expressing the first-order necessary conditions for a stationary point of the lagrangian \mathscr{L} is

\begin{cases} \dfrac{ \dfrac{\partial L}{\partial z}\Big|_{k-\tfrac{1}{2}} + \dfrac{\partial L}{\partial z}\Big|_{k+\tfrac{1}{2}}}{2} - \dfrac{ \dfrac{\partial L}{\partial z'}\Big|_{k+\tfrac{1}{2}} - \dfrac{\partial L}{\partial z'}\Big|_{k-\tfrac{1}{2}}}{h} = 0 \quad & : A \quad (k = 1,\dots,n-1) \\[1em] \dfrac{h}{2} \dfrac{\partial L}{\partial z}\Big|_{\tfrac{1}{2}} - \dfrac{\partial L}{\partial z'}\Big|_{\tfrac{1}{2}} - \lambda \dfrac{\partial w(z_0,z_n)}{\partial z_0} = 0&:B\\[1em] \dfrac{h}{2} \dfrac{\partial L}{\partial z}\Big|_{n-\tfrac{1}{2}} + \dfrac{\partial L}{\partial z'}\Big|_{n-\tfrac{1}{2}} - \lambda \dfrac{\partial w(z_0,z_n)}{\partial z_n} = 0 &:C \\[1em] w\big(z_0,z_n\big) = 0 &:D \end{cases}

Taking the limit h\to 0 (so that the discretization approaches the continuous problem), the two terms of equation A can be recognized as

\begin{aligned} \dfrac{ \dfrac{\partial L}{\partial z}\Big|_{k-\tfrac{1}{2}} + \dfrac{\partial L}{\partial z}\Big|_{k+\tfrac{1}{2}}}{2} &\approx \dfrac{\partial L}{\partial z}\big(z_k,z_k',t_k\big) \\ \dfrac{ \dfrac{\partial L}{\partial z'}\Big|_{k+\tfrac{1}{2}} - \dfrac{\partial L}{\partial z'}\Big|_{k-\tfrac{1}{2}}}{h} &\approx \dfrac{\mathrm{d}}{\mathrm{d}t} \left( \dfrac{\partial L}{\partial z'}\big(z_k,z_k',t_k\big) \right) \end{aligned}

Similarly, equations B and C contain terms analogous to the previous ones, plus a contribution due to the Lagrange multiplier \lambda. In the limit h\to 0 the term multiplied by h/2 vanishes, t_{\tfrac{1}{2}} tends to a and t_{n-\tfrac{1}{2}} tends to b, so we can rewrite the system of non-linear equations as

\begin{cases} \dfrac{\partial L}{\partial z}(z,z',t) - \dfrac{\mathrm{d}}{\mathrm{d}t} \dfrac{\partial L}{\partial z'}(z,z',t) = 0 \\[1em] -\dfrac{\partial L}{\partial z'}(z(a),z'(a),a) -\lambda \dfrac{\partial w(z(a),z(b))}{\partial z(a)} = 0 \\[1em] \dfrac{\partial L}{\partial z'}(z(b),z'(b),b) -\lambda \dfrac{\partial w(z(a),z(b))}{\partial z(b)}= 0 \\[1em] w\big(z(a),z(b)\big) = 0 \end{cases}

This is the general formulation of the initial problem of minimizing a functional

\mathcal{F}(z) = \int_a^b L(z,z^\prime,t) \, \mathrm{d}t

subject to an equality constraint w.

Heuristic formulation

The same result can also be obtained with a heuristic formulation. Given the problem of minimizing a functional \mathcal{F} subject to an equality constraint w (as in Equation 4), we define a new functional \mathcal{L}, the lagrangian of the system:

\mathcal{L}(z,\lambda) = \int_a^b L (z,z',t)\, \mathrm{d}t - \lambda w\big(z(a),z(b)\big)

We can now compute the variation of this functional, that is, its Gateaux derivative \delta\mathcal{L}:

\begin{aligned} \delta\mathcal{L}(z,\lambda;\delta z,\delta\lambda) &= \frac{\mathrm{d}} {\mathrm{d}\alpha}\Big|_{\alpha=0} \mathcal{L}\big(z + \alpha\, \delta z, \lambda + \alpha\,\delta \lambda\big) \\[1em] &= \delta \int_a^b L(z,z',t)\, \mathrm{d}t - \delta \lambda\,w\big(z(a),z(b)\big) - \lambda \, \delta w\big(z(a),z(b)\big) \\[1em] &=\int_a^b \left( \frac{\partial L}{\partial z} \delta z + \frac{\partial L}{\partial z'} \delta z' \right) \, \mathrm{d}t - \delta \lambda\,w\big(z(a),z(b)\big) \\ & - \lambda \left( \frac{\partial w(z(a),z(b))}{\partial z(a)}\delta z(a) + \frac{\partial w(z(a),z(b))}{\partial z(b)}\delta z(b) \right) \\[1em] &= \int_a^b \left( \frac{\partial L}{\partial z} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial z'} \right)\delta z\, \mathrm{d}t + \left[\frac{\partial L}{\partial z'}\delta z \right]\Big|_a^b - \delta \lambda\,w\big(z(a),z(b)\big) \\ & - \lambda\left( \frac{\partial w}{\partial z(a)} \delta {z(a)} + \frac{\partial w}{\partial z(b)}\delta {z(b)}\right) \end{aligned}

By expanding the boundary term and collecting the terms that multiply \delta z(a) and \delta z(b), the variation of the functional reduces to the form

\delta \mathcal{L}(z,\lambda;\delta z,\delta\lambda) = \int_a^b (A)\,\delta z\, \mathrm{d}t - \delta \lambda\, w\big(z(a),z(b)\big) + (B) \delta {z(a)} + (C) \delta {z(b)}

where

\begin{aligned} (A) &= \frac{\partial L}{\partial z}(z,z^\prime,t) - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial z'}(z,z^\prime,t) \\ (B) &= -\frac{\partial L}{\partial z'} (z(a),z^\prime(a),a) - \lambda \frac{\partial w}{\partial z(a)}(z(a),z(b)) \\ (C) &=\frac{\partial L}{\partial z'}(z(b),z^\prime(b),b) - \lambda \frac{\partial w}{\partial z(b)}(z(a),z(b)) \end{aligned}

We start by noting that the variation \delta\mathcal{L}(z,\lambda;\delta z,\delta\lambda) must be equal to zero for all variations \delta z \in C^\infty([a,b]) and for all constants \delta \lambda \in \mathbb{R}.

First, consider the case where \delta \lambda = 0, \delta z(a) = 0, and \delta z(b) = 0. In this scenario, we can apply the fundamental lemma of the calculus of variations, which implies that the term (A) must equal zero.

Next, let’s examine the case where \delta \lambda \neq 0 while maintaining \delta z(a) = 0 and \delta z(b) = 0. Under these conditions, we arrive at the initial equality constraint:

w(z(a), z(b)) = 0.

Now, consider the situation where \delta \lambda = 0, \delta z(a) \neq 0, and \delta z(b) = 0. In this case, we can conclude that the term (B) must also equal zero. Similarly, if we have \delta z(b) \neq 0 while keeping \delta \lambda = 0 and \delta z(a) = 0, we find that the term (C) must be zero as well.

This means that the resulting system of non-linear equations that solves the problem can be expressed as

\begin{cases} \dfrac{\partial L}{\partial z}(z,z',t) - \dfrac{\mathrm{d}}{\mathrm{d}t} \dfrac{\partial L}{\partial z'}(z,z',t) = 0 \\[1em] w\big(z(a),z(b)\big) = 0 \\[1em] \fcolorbox{green}{LightYellow}{$\color{MidnightBlue} \dfrac{\partial L}{\partial z'}(z(a),z'(a),a) + \lambda \dfrac{\partial w\big(z(a),z(b)\big)}{\partial z(a)} = 0 $} \\[1em] \fcolorbox{green}{LightYellow}{$\color{MidnightBlue} \dfrac{\partial L}{\partial z'}(z(b),z'(b),b)- \lambda \dfrac{\partial w\big(z(a),z(b)\big)}{\partial z(b)} = 0$} \end{cases} \tag{6}

We can observe that the resulting system is equivalent to the one obtained by pushing the limit h\to 0 with the discretized version previously performed.

Verification

Considering the common case described by the problem

\begin{aligned} \textrm{minimize:}& \qquad \mathcal F(x) = \int_a^b L (x,x',t)\, \mathrm{d}t \\ \textrm{subject to:}& \qquad x(a) = x_a, \ x(b) = x_b \end{aligned}

then the solution can still be obtained using the method just shown. Indeed, building the Lagrangian functional

\mathcal{L}(x,\lambda_1,\lambda_2) = \int_a^b L (x,x',t)\, \mathrm{d}t - \lambda_1\big(x(a)-x_a\big) -\lambda_2 \big(x(b) - x_b \big)

and substituting it into the result of Equation 6, we get the following system of differential equations:

\begin{cases} \dfrac{\partial L}{\partial x}(x,x',t) - \dfrac{\mathrm{d}}{\mathrm{d}t} \dfrac{\partial L }{\partial x'}(x,x',t) = 0 \\[1em] x(a) = x_a, \quad x(b) = x_b\\[1em] \fcolorbox{green}{LightYellow}{$\color{MidnightBlue} -\lambda_1 - \dfrac{\partial L}{\partial x'}(x(a),x'(a),a)= 0 $} \\[1em] \fcolorbox{green}{LightYellow}{$\color{MidnightBlue} -\lambda_2 + \dfrac{\partial L}{\partial x'}(x(b),x'(b),b) = 0 $} \end{cases}

The first two lines are the equations that were already present in the previous statement of the problem (without the equality constraints), while the last two, which depend on \lambda_i, give no additional information because they can always be satisfied (there always exists a value of \lambda_i that matches the value of \partial L/\partial x' at the corresponding endpoint, up to sign). In general, however, the multipliers \lambda_i can appear in the other equations as well, and then these last equations are needed to determine the stationary solution of the lagrangian \mathcal{L}.

Example 4 (computation of first variation) Given the functional

\mathcal{F}(x) = x(0) + \int_0^1 \Big( tx^2 + \big(x'\big)^2 \Big)\, \mathrm{d}t

its first variation can be computed considering that

L(x,x',t) = tx^2 + \big(x'\big)^2

so that

\frac{\partial L}{\partial x}(x,x',t) = 2tx \qquad \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'}(x,x',t) = \frac{\mathrm{d}}{\mathrm{d}t}(2x') = 2x''

To compute the complete variation it is necessary to keep the boundary term produced by the integration by parts (no boundary conditions are prescribed), namely

\left[ \frac{\partial L}{\partial x'}\delta x \right]_0^1

With that said, the overall first variation can be computed as

\begin{aligned} \delta \mathcal{F}(x;\delta x) & = \delta {x(0)} + 2\int_0^1 \big(tx-x''\big) \delta x\, \mathrm{d}t + 2x'(1) \delta {x(1)} - 2x'(0) \delta {x(0)} \\ & = 2\int_0^1 \big(tx-x''\big) \delta x\, \mathrm{d}t + 2x'(1) \delta {x(1)} + \big( 1 - 2x'(0) \big) \delta {x(0)} \end{aligned}

and, by the fundamental lemma of the calculus of variations, the resulting BVP is

\left\{\begin{aligned} tx-x'' &=0\\ x'(1) &=0 \\ 2x'(0) &=1 \end{aligned}\right.
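Since this BVP is linear, it can be solved numerically with a simple shooting approach. The Python sketch below is one possible way to do it (not a method prescribed by the notes): it integrates x'' = t x with a hand-written classical Runge-Kutta scheme from two sets of initial data and combines them so that x'(1) = 0; the step count and the function name `integrate` are arbitrary choices.

```python
def integrate(x0, v0, n=2000):
    """Integrate x' = v, v' = t x on [0, 1] with classical RK4.

    Returns the pair (x(1), x'(1)) for the initial data x(0) = x0, x'(0) = v0.
    """
    h = 1.0 / n
    t, x, v = 0.0, x0, v0
    for _ in range(n):
        k1x, k1v = v, t * x
        k2x, k2v = v + 0.5 * h * k1v, (t + 0.5 * h) * (x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, (t + 0.5 * h) * (x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, (t + h) * (x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t += h
    return x, v

# By linearity x = s * u + p, where
#   u solves the ODE with u(0) = 1, u'(0) = 0,
#   p solves the ODE with p(0) = 0, p'(0) = 1/2 (from 2 x'(0) = 1).
_, du_end = integrate(1.0, 0.0)
_, dp_end = integrate(0.0, 0.5)

# Choose the unknown initial value s = x(0) so that x'(1) = 0.
s = -dp_end / du_end
x_end, dx_end = integrate(s, 0.5)

print("x(0)  =", s)
print("x(1)  =", x_end)
print("x'(1) =", dx_end, "(should vanish up to rounding)")
```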

Example 5 (boundary value problem) Given the problem

\begin{aligned} \textrm{minimize:} & \qquad \mathcal{F}(x) = x(0) + \int_0^1 x^2 + \big(x'-t\big)^2\, \mathrm{d}t \\ \textrm{subject to:} & \qquad \int_0^1 x\, \mathrm{d}t = 0 \\ & \qquad x(1) = 2 \end{aligned}

the resulting boundary value problem can be derived from the lagrangian \mathcal{L} of the system, that is

\begin{aligned} \mathcal{L}(x,\lambda,\mu) & = \mathcal{F}(x) -\lambda \int_0^1 x\, \mathrm{d}t - \mu\big(x(1)-2\big) \\ & = \int_0^1 \underbrace{x^2 + \big(x'-t\big)^2 - \lambda x}_{L}\, \mathrm{d}t + x(0) - \mu\big(x(1)-2\big) \end{aligned}

From here we can compute the first variation of \mathcal{L}:

\begin{aligned} \delta \mathcal{L} &= \int_0^1 \left( \frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'} \right) \delta x\, \mathrm{d}t + \left[ \frac{\partial L}{\partial x'}\delta x \right]_0^1 - \int_0^1 x\, \delta\lambda \, \mathrm{d}t \\ & + \delta {x(0)} - \mu \delta {x(1)} - \big( x(1) - 2 \big) \delta \mu \\ &= \int_0^1 \big( -2x'' + 2x +2 -\lambda \big) \delta x\, \mathrm{d}t \\ & + 2\big(x'(1) - 1\big)\delta {x(1)} - 2 x'(0) \delta {x(0)} \\ & - \int_0^1 x \delta \lambda\, \mathrm{d}t + \delta {x(0)} - \mu \delta {x(1)} - \big(x(1)-2\big) \delta\mu \end{aligned}

The resulting boundary value problem is obtained by setting to zero the terms that multiply each variation \delta \cdot, and so:

\begin{cases} \delta x: \qquad & 2 x'' - 2x - 2 + \lambda = 0 \\ \delta {x(0)} : & 2x'(0) = 1 \\ \delta {x(1)} : & \cancel{2x'(1) - 2 = \mu} \qquad \textrm{: trivially satisfied by the choice of } \mu \\ \delta \lambda: & \int_0^1 x\, \mathrm{d}t = 0 \\ \delta \mu: & x(1) = 2 \\ \end{cases}
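This boundary value problem can be solved in closed form. Writing the differential equation as x'' - x = 1 - \lambda/2, its general solution is x(t) = A\cosh t + B\sinh t + c with c = \lambda/2 - 1. The Python sketch below (with illustrative variable names) determines A, B, c from the three conditions on x, recovers \lambda and \mu, and checks the result.

```python
import math

ch, sh = math.cosh(1.0), math.sinh(1.0)

# x(t) = A cosh t + B sinh t + c, with c = lambda / 2 - 1.
B = 0.5                                   # from 2 x'(0) = 1

# Remaining conditions, linear in A and c:
#   x(1) = 2            ->  A ch + B sh + c       = 2
#   integral of x = 0   ->  A sh + B (ch - 1) + c = 0
# Subtracting the second from the first eliminates c.
A = (2.0 - B * (sh - ch + 1.0)) / (ch - sh)
c = 2.0 - A * ch - B * sh

lam = 2.0 * (c + 1.0)                     # multiplier of the integral constraint
mu = 2.0 * (A * sh + B * ch) - 2.0        # from 2 x'(1) - 2 = mu

def x(t):
    return A * math.cosh(t) + B * math.sinh(t) + c

def d2x(t):
    return A * math.cosh(t) + B * math.sinh(t)

# Independent check of the integral constraint with the midpoint rule.
n = 1000
integral = sum(x((k + 0.5) / n) for k in range(n)) / n

print("A, B, c     =", A, B, c)
print("lambda, mu  =", lam, mu)
print("x(1)        =", x(1.0))
print("integral    =", integral)
print("ODE residual at t = 0.3:", 2.0 * d2x(0.3) - 2.0 * x(0.3) - 2.0 + lam)
```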

Example 6 (exam simulation) Given the problem

\begin{aligned} \textrm{minimize:} \qquad &\mathcal{F}(x,y,z) = y(1) + \int_0^1 x^2 + z^2 +x'y' +z'^2 \, \mathrm{d}t \\ \textrm{subject to:} \qquad &x(1) = 2 \end{aligned}

the resulting boundary value problem can be stated by determining the first variation of the lagrangian

\mathcal{L}(x,y,z,\lambda) = y(1) + \int_0^1 \underbrace{x^2 + z^2 +x'y' +z'^2}_{\color{blue}=L(x,y,z,x',y',z')}\, \mathrm{d}t - \lambda\big(x(1) - 2\big)

that results in

\begin{aligned} \delta\mathcal{L}(x,y,z,\lambda) &=\delta {y(1)} + \int_0^1 \left( \frac{\partial L}{\partial x} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial x'} \right) \delta x\, \mathrm{d}t + \left[\frac{\partial L}{\partial x'}\delta x \right]_0^1 \\ & + \int_0^1 \left( \frac{\partial L}{\partial y} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial y'}\right) \delta y\, \mathrm{d}t + \left[ \frac{\partial L}{\partial y'}\delta y \right]_0^1 \\ & + \int_0^1 \left(\frac{\partial L}{\partial z} - \frac{\mathrm{d}}{\mathrm{d}t} \frac{\partial L}{\partial z'}\right) \delta z\, \mathrm{d}t + \left[ \frac{\partial L}{\partial z'}\delta z \right]_0^1 \\ & -\lambda \delta {x(1)} - \big(x(1)-2\big) \delta\lambda \\[1em] &=\delta {y(1)} + \int_0^1 \big( 2x - y'' \big) \delta x\, \mathrm{d}t + y'(1)\delta {x(1)} - y'(0)\delta {x(0)} \\ & + \int_0^1 -x'' \delta y \, \mathrm{d}t + x'(1) \delta {y(1)} - x'(0) \delta {y(0)} \\ & + \int_0^1 \big(2z - 2z''\big)\delta z\, \mathrm{d}t \\ & + 2z'(1) \delta {z(1)} - 2z'(0) \delta {z(0)} - \lambda \delta {x(1)} - \big(x(1)-2\big) \delta \lambda \end{aligned}

Setting to zero the terms associated with each variation \delta \cdot gives the following boundary value problem, whose solution is the candidate minimizer of the functional:

\begin{cases} 2x - y'' = 0 & : \delta x \\ x'' = 0 & : \delta y \\ z - z'' = 0 & : \delta z \\ y'(0) = 0 & : \delta {x(0)} \\ \cancel{y'(1) - \lambda = 0} & : \delta {x(1)} \textrm{, trivially solved in } \lambda \\ x'(0) = 0 & : \delta {y(0)} \\ 1 + x'(1) = 0 & : \delta {y(1)} \\ z'(0) = 0 & : \delta {z(0)} \\ z'(1) = 0 & : \delta {z(1)} \\ x(1) = 2 & : \delta\lambda \end{cases}
