Exercises on KKT
Example 1: Detailed Example of Constrained Minimization
In this example, we will minimize the function
\begin{aligned} \textrm{minimize:}\quad & f(x, y) = x + y, \\ \textrm{subject to the constraint:}\quad & h(x, y) = x^2 + y^2 - 1 = 0, \end{aligned}
where h(x, y) = 0 describes a unit circle. We will approach this problem using two methods:
- Parametrization with x = \cos(t) and y = \sin(t).
- Lagrange Multipliers.
Parametric Solution
The constraint h(x, y) = x^2 + y^2 - 1 represents a unit circle. We can parametrize the constraint as:
x(t) = \cos(t), \quad y(t) = \sin(t), \quad t \in [0, 2\pi].
Substituting the parametrization into the objective function yields:
f(\cos(t), \sin(t)) = \cos(t) + \sin(t).
To find the critical points, we differentiate this function of t and set the derivative to zero.
Critical Points
The maximum and minimum values of f can be found by solving for t:
- The function f(\cos(t), \sin(t)) achieves its extrema when:
\frac{\mathrm{d}}{\mathrm{d}t} (\cos(t) + \sin(t)) = -\sin(t) + \cos(t) = 0.
This gives us:
\sin(t) = \cos(t) \quad \Rightarrow \quad t = \frac{\pi}{4} + k\pi, \, k \in \mathbb{Z}.
Function Values
Evaluating the function at these points:
- At t = \dfrac{5\pi}{4} (minimum):
\left. \begin{aligned} x &= \cos\left(\frac{5\pi}{4}\right) = -\frac{\sqrt{2}}{2}, \\ y &= \sin\left(\frac{5\pi}{4}\right) = -\frac{\sqrt{2}}{2}, \end{aligned} \right\} \quad \implies\quad f(x, y) = -\sqrt{2}.
- At t = \dfrac{\pi}{4} (maximum):
\left. \begin{aligned} x &= \cos\left(\frac{\pi}{4}\right) = \frac{\sqrt{2}}{2}, \\ y &= \sin\left(\frac{\pi}{4}\right) = \frac{\sqrt{2}}{2}, \end{aligned} \right\} \quad\implies\quad f(x, y) = \sqrt{2}.
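Before plotting, we can verify these values numerically. A minimal base-R check (no extra packages) that scans the parametrization on a fine grid of t:

```r
# f restricted to the unit circle: f(cos(t), sin(t)) = cos(t) + sin(t)
f_param <- function(t) cos(t) + sin(t)

t_grid <- seq(0, 2 * pi, length.out = 100001)
vals   <- f_param(t_grid)

t_grid[which.min(vals)] / pi  # ~1.25, i.e. t = 5*pi/4 (minimum)
min(vals)                     # ~ -1.4142 = -sqrt(2)
t_grid[which.max(vals)] / pi  # ~0.25,  i.e. t = pi/4  (maximum)
max(vals)                     # ~  1.4142 =  sqrt(2)
```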
Visualization
Let’s visualize the unit circle and the level sets of the objective function.
Code
# R code to plot the level sets, constraint, normals, and gradient field
library(ggplot2)
library(dplyr)
library(latex2exp)

# Generate data for the unit circle
theta <- seq(0, 2 * pi, length.out = 100)
x_circle <- cos(theta)
y_circle <- sin(theta)

# Create a coarser grid for the level sets and fewer arrows
x_range <- seq(-1.5, 1.5, length.out = 10)
y_range <- seq(-1.5, 1.5, length.out = 10)
grid <- expand.grid(x = x_range, y = y_range)

# Calculate f(x, y) for the level sets
grid$f <- with(grid, x + y)

# Gradient of f(x, y) = x + y is constant: [1, 1]
grid$fx <- 1  # gradient in the x direction
grid$fy <- 1  # gradient in the y direction

# Normal vectors to the circle at selected points
circle_normals <- data.frame(
  x = cos(seq(0, 2 * pi, length.out = 10)),
  y = sin(seq(0, 2 * pi, length.out = 10))
) %>%
  mutate(nx = x, ny = y)  # the normal at (x, y) on the unit circle is (x, y) itself

# Plot the constraint, level sets, gradient field, and normals
ggplot() +
  # Circle with border
  geom_polygon(aes(x = x_circle, y = y_circle),
               fill = "lightblue", color = "black", alpha = 0.5) +
  # Contour lines for f(x, y) = x + y
  geom_contour(data = grid, aes(x = x, y = y, z = f), bins = 10, color = "blue") +
  # Gradient field for f(x, y) = x + y
  geom_segment(data = grid,
               aes(x = x, y = y, xend = x + 0.15 * fx, yend = y + 0.15 * fy),
               arrow = arrow(length = unit(0.2, "cm")), color = "orange") +
  # Normals to the circle
  geom_segment(data = circle_normals,
               aes(x = x, y = y, xend = x + 0.2 * nx, yend = y + 0.2 * ny),
               arrow = arrow(length = unit(0.2, "cm")), color = "darkgreen") +
  # Minimum and maximum points
  geom_point(aes(x = -sqrt(2)/2, y = -sqrt(2)/2), color = "red", size = 3) +   # minimum
  geom_point(aes(x = sqrt(2)/2, y = sqrt(2)/2), color = "green", size = 3) +   # maximum
  # Labels and limits
  labs(title = TeX("Gradient for $f(x,y)=x+y$ and constraint $h(x,y)=x^2+y^2-1$"),
       x = "x", y = "y") +
  xlim(-1.5, 1.5) + ylim(-1.5, 1.5) +
  coord_fixed() +  # fix the aspect ratio to 1:1
  theme_minimal() +
  theme(plot.title = element_text(hjust = 0.5, size = 14))  # centered title
Lagrange Multipliers Method
Now, we will apply the method of Lagrange multipliers to our constrained optimization problem. We define the Lagrangian as follows:
\mathcal{L}(x, y, \lambda) = f(x, y) - \lambda h(x, y) = x + y - \lambda (x^2 + y^2 - 1).
First-Order Conditions
To find the extrema, we take the gradients of the Lagrangian with respect to x, y, and \lambda. This leads to the first-order conditions:
For x: \nabla_x \mathcal{L} = 1 - 2\lambda x = 0 \quad \Rightarrow \quad \lambda = \frac{1}{2x}.
For y: \nabla_y \mathcal{L} = 1 - 2\lambda y = 0 \quad \Rightarrow \quad \lambda = \frac{1}{2y}.
The constraint condition: h(x, y) = x^2 + y^2 - 1 = 0.
Solving the nonlinear system
Setting the two expressions for \lambda equal gives us:
\frac{1}{2x} = \frac{1}{2y} \quad \Rightarrow \quad x = y.
Next, substituting x = y into the constraint:
2x^2 = 1 \quad \Rightarrow \quad x = \pm \frac{1}{\sqrt{2}}, \quad y = x,
so the two candidate points are \pm\left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right).
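Before moving to the second-order analysis, a small base-R sketch confirms that both candidates satisfy the first-order system (all residuals vanish up to rounding):

```r
# Residuals of the Lagrange system: dL/dx, dL/dy, and the constraint h
kkt_residual <- function(x, y, lambda) {
  c(1 - 2 * lambda * x,   # dL/dx = 0
    1 - 2 * lambda * y,   # dL/dy = 0
    x^2 + y^2 - 1)        # h = 0
}

s <- 1 / sqrt(2)
kkt_residual( s,  s, lambda =  s)  # candidate maximum: c(0, 0, 0)
kkt_residual(-s, -s, lambda = -s)  # candidate minimum: c(0, 0, 0)
```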
Second-Order Conditions
To determine whether the critical points are minima or maxima, we need to evaluate the second-order conditions.
Second-Order Necessary Condition: The Hessian of the Lagrangian, \nabla_{(x,y)}^2 \mathcal{L}(x, y, \lambda), must be positive semidefinite on the null space of the constraint gradients.
Second-Order Sufficient Condition: The Hessian of the Lagrangian, \nabla_{(x,y)}^2 \mathcal{L}(x, y, \lambda), must be positive definite on the null space of the constraint gradients.
Computing the Hessian
The Hessian matrix of the Lagrangian with respect to x and y is given by:
\nabla^2 \mathcal{L}(x, y, \lambda) = \begin{bmatrix} \dfrac{\partial^2 \mathcal{L}}{\partial x^2} & \dfrac{\partial^2 \mathcal{L}}{\partial x \partial y} \\[1em] \dfrac{\partial^2 \mathcal{L}}{\partial y \partial x} & \dfrac{\partial^2 \mathcal{L}}{\partial y^2} \end{bmatrix} = \begin{bmatrix} -2\lambda & 0 \\ 0 & -2\lambda \end{bmatrix}.
Kernel of the Constraint
The kernel of the constraint
h(x, y) = 0
consists of the vectors \bm{z} that satisfy
\nabla h(x, y) \cdot \bm{z} = 0.
The gradient of the constraint is:
\nabla h(x, y) = [2x, 2y]
and it is easy to verify that the kernel of the constraint gradient is spanned by the vector \bm{z} = \begin{pmatrix} -y \\ x \end{pmatrix}.
Check qualification
The gradient of the active constraint, \nabla h(x, y), consists of a single nonzero row. It is therefore full rank and satisfies the LICQ (linear independence constraint qualification).
Evaluating the Hessian at Critical Points
At the point (x^\star,y^\star)=\left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right) the multiplier is \lambda = \frac{1}{2x^\star} = -\frac{1}{\sqrt{2}}. The kernel of \nabla h(x^\star,y^\star) is spanned by \bm{z} = t \begin{pmatrix} y^\star\\-x^\star\end{pmatrix} = \frac{t}{\sqrt{2}} \begin{pmatrix} -1\\1\end{pmatrix} \in\ker(\nabla h(x^\star,y^\star)).
Thus the Hessian and the second-order condition become: \begin{aligned} \nabla^2 \mathcal{L} & = -2\lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix}, \qquad \bm{z}^\top \nabla^2 \mathcal{L} \bm{z}\quad\implies \\[1em] & \frac{t^2}{2} \begin{bmatrix} -1 & 1 \end{bmatrix} \begin{bmatrix} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \sqrt{2}\,t^2 > 0, \end{aligned} for all t\neq 0. Therefore, this point is a local minimum.
At the point (x^\star,y^\star)=\left(\frac{1}{\sqrt{2}},\frac{1}{\sqrt{2}}\right) the multiplier is \lambda = \frac{1}{2x^\star} = \frac{1}{\sqrt{2}}, and the kernel is spanned by \bm{z} = \frac{t}{\sqrt{2}} \begin{pmatrix} 1\\-1\end{pmatrix}.
Thus the Hessian and the second-order condition become: \begin{aligned} \nabla^2 \mathcal{L} & = -2\lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} -\sqrt{2} & 0 \\ 0 & -\sqrt{2} \end{bmatrix}, \qquad \bm{z}^\top \nabla^2 \mathcal{L} \bm{z}\quad\implies \\[1em] & \frac{t^2}{2} \begin{bmatrix} 1 & -1 \end{bmatrix} \begin{bmatrix} -\sqrt{2} & 0 \\ 0 & -\sqrt{2} \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = -\sqrt{2}\,t^2 < 0,\qquad \forall t\neq 0. \end{aligned} Therefore, this point is a local maximum.
Conclusion
Both methods yield the same results:
- The minimum occurs at \left(-\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right) with f(x, y) = -\sqrt{2}.
- The maximum occurs at \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right) with f(x, y) = \sqrt{2}.
Example 2: Detailed example of KKT
In this example, we will minimize the function
\begin{aligned} \textrm{minimize:}\quad & f(x, y, z ) = x^2 + y + z, \\ \textrm{equality constraint:}\quad & x+y^2+z^2=1, \\ \textrm{and inequalities:}\quad & z\leq 1, \quad x \geq 0, \end{aligned}
Rewriting in Standard Form
We can rewrite the problem in a more standard form:
\begin{aligned} \textrm{minimize:}\quad & f(x, y, z ) = x^2 + y + z, \\ \textrm{equality constraint:}\quad & h(x,y,z) = x+y^2+z^2-1 = 0, \\ \textrm{and inequalities:}\quad & g_1(x,y,z) = 1-z\geq 0, \\ & g_2(x,y,z) = x \geq 0, \end{aligned}
Here, h(x, y, z) = 0 describes a paraboloid of revolution in \mathbb{R}^3 (equivalently, x = 1 - y^2 - z^2). We will now solve this problem using the KKT (Karush-Kuhn-Tucker) conditions.
Lagrangian Function
We define the Lagrangian function as:
\mathcal{L}(x,y,z,\lambda,\mu_1,\mu_2)= f(x,y,z) - \lambda h(x,y,z) - \mu_1 g_1(x,y,z) - \mu_2 g_2(x,y,z)
Substituting the expressions for f, h, g_1, and g_2:
\mathcal{L}(x,y,z,\lambda,\mu_1,\mu_2) = x^2 + y + z - \lambda(x+y^2+z^2-1) - \mu_1( 1-z) - \mu_2x
First-Order Conditions
To satisfy the first-order conditions, we compute the partial derivatives of \mathcal{L}:
\left\{\begin{aligned} \dfrac{\partial\mathcal{L}}{\partial x} &= 2x-\lambda-\mu_2 = 0\\ \dfrac{\partial\mathcal{L}}{\partial y} &= 1-2\lambda y = 0\\ \dfrac{\partial\mathcal{L}}{\partial z} &= 1-2\lambda z+\mu_1 = 0 \end{aligned}\right.
The KKT conditions also require the equality constraint and the complementary-slackness conditions to hold, together with the sign conditions \mu_1, \mu_2 \geq 0: \left\{\begin{aligned} x+y^2+z^2-1 & = 0\\ \mu_1(1-z) & = 0\\ \mu_2 x & = 0 \end{aligned}\right.
We now proceed by cases on the values of \mu_1 and \mu_2.
Case 1: \mu_1=\mu_2=0
In this case, the first-order conditions simplify to:
\left\{\begin{aligned} & 2x-\lambda = 0\\ & 1-2\lambda y = 0\\ & 1-2\lambda z = 0\\ & x+y^2+z^2-1 = 0\\ \end{aligned}\right. \implies \left\{\begin{aligned} &x=\lambda/2\\ &y=1/(2\lambda)\\ &z=1/(2\lambda)\\ & \lambda/2+2/(2\lambda)^2-1 = 0 \end{aligned}\right. \implies \lambda^3-2\lambda^2+1=0. Solving the cubic (e.g. with Maple), the three real roots are \lambda = 1, \; \dfrac{1+\sqrt{5}}{2}, \; \dfrac{1-\sqrt{5}}{2}.
From these values, we compute the corresponding x, y, and z:
\mu_1=\mu_2=0,\quad \left\{\begin{aligned} &x=y=z=1/2, \quad& \lambda&=1 \\ &x = (1+\sqrt{5})/4,\quad& y &= z = 1/(\sqrt{5}+1), \quad \lambda = (\sqrt{5}+1)/2 \\ &{\color{red}x = (1-\sqrt{5})/4<0},\quad& y &= z = 1/(1-\sqrt{5}), \quad \lambda = (1-\sqrt{5})/2 \end{aligned}\right. The last solution must be discarded, since it violates x \geq 0.
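The roots of the cubic can also be confirmed numerically in base R; polyroot() takes the coefficients in increasing powers:

```r
# lambda^3 - 2*lambda^2 + 1 = 0, coefficients of 1, lambda, lambda^2, lambda^3
polyroot(c(1, 0, -2, 1))
# 1, (1 - sqrt(5))/2 ~ -0.618, (1 + sqrt(5))/2 ~ 1.618 (all real)
```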
Case 2: \mu_1=0, and \mu_2>0
In this case, x = 0, and the system becomes:
\left\{\begin{aligned} &2x-\lambda-\mu_2 = 0\\ &1-2\lambda y = 0\\ &1-2\lambda z = 0 \\ &x+y^2+z^2-1 = 0\\ &x = 0 \end{aligned}\right. \implies \left\{\begin{aligned} &\lambda=-\mu_2\\ &y=1/(2\lambda)=-1/(2\mu_2)\\ &z=1/(2\lambda)=-1/(2\mu_2)\\ &2/(2\mu_2)^2-1 = 0\\ &x = 0 \end{aligned}\right. \implies \mu_2^2=\dfrac{1}{2}. The positive root is \mu_2=1/\sqrt{2}, so that x=0, \quad y = z = -\dfrac{1}{2\mu_2} = -\dfrac{\sqrt{2}}{2},\quad \lambda=-\dfrac{\sqrt{2}}{2},\quad \mu_1=0,\quad \mu_2=\dfrac{\sqrt{2}}{2}.
Case 3: \mu_2=0, and \mu_1>0
\left\{\begin{aligned} & 2x-\lambda = 0\\ & 1-2\lambda y = 0\\ & 1-2\lambda z+\mu_1 = 0 \\ &x+y^2+z^2-1= 0\\ &z=1 \end{aligned}\right. \implies \left\{\begin{aligned} & 2x-\lambda = 0\\ & 1-2\lambda y = 0\\ & 1-2\lambda +\mu_1 = 0 \\ &x+y^2= 0\\ &z=1 \end{aligned}\right. \implies \left\{\begin{aligned} & x=\lambda/2\\ & y=1/(2\lambda)\\ & \mu_1 = 2\lambda-1 \\ &x+y^2= 0\\ &z=1 \end{aligned}\right.
so that \lambda solves
\dfrac{\lambda}{2}+\dfrac{1}{4\lambda^2}=0\;\implies\; 2\lambda^3+1=0\;\implies\; \lambda=-2^{-1/3},
and
{\color{red}x = -2^{-4/3}< 0},\quad y=-2^{-2/3}, \quad z=1, \quad \lambda=-2^{-1/3}, \quad {\color{red}\mu_1 = 2\lambda-1 = -2^{2/3}-1<0}, \quad \mu_2=0.
The solution must be discarded: it violates both x \geq 0 and \mu_1 \geq 0.
Case 4: \mu_2>0 and \mu_1>0
\left\{\begin{aligned} & 2x-\lambda-\mu_2 = 0\\ & 1-2\lambda y = 0\\ & 1-2\lambda z+\mu_1 = 0\\ &x+y^2+z^2-1 = 0\\ &z=1\\ &x=0 \end{aligned}\right. \implies \left\{\begin{aligned} & -\lambda-\mu_2 = 0\\ & 1-2\lambda y = 0\\ & 1-2\lambda +\mu_1 = 0\\ &y^2 = 0\\ &z=1\\ &x=0 \end{aligned}\right. \implies 1-2\lambda \cdot 0 = 0, \quad \textrm{no solution}
All the solutions

| \bm{x} | \bm{y} | \bm{z} | \bm{\lambda} | \bm{\mu_1} | \bm{\mu_2} |
|---|---|---|---|---|---|
| \dfrac{1}{2} | \dfrac{1}{2} | \dfrac{1}{2} | 1 | 0 | 0 |
| \dfrac{1+\sqrt{5}}{4} | \dfrac{1}{\sqrt{5}+1} | \dfrac{1}{\sqrt{5}+1} | \dfrac{\sqrt{5}+1}{2} | 0 | 0 |
| 0 | -\dfrac{\sqrt{2}}{2} | -\dfrac{\sqrt{2}}{2} | -\dfrac{\sqrt{2}}{2} | 0 | \dfrac{\sqrt{2}}{2} |
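Before the second-order analysis, it is informative to compare the objective at the three candidates (a quick base-R check); the third candidate already has the smallest value:

```r
f <- function(x, y, z) x^2 + y + z

f(1/2, 1/2, 1/2)                                      # 1.25
f((1 + sqrt(5))/4, 1/(sqrt(5) + 1), 1/(sqrt(5) + 1))  # ~ 1.2725
f(0, -sqrt(2)/2, -sqrt(2)/2)                          # ~ -1.4142
```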
Study of solution 1: x=y=z=\dfrac{1}{2}, \lambda=1, \mu_1=\mu_2=0
Compute active constraints
\begin{aligned} g_1(x,y,z) & = 1-z = 1-\dfrac{1}{2}=\dfrac{1}{2} > 0, \qquad & \textrm{\color{DarkGreen}(not-active)} \\ g_2(x,y,z) & = x = \dfrac{1}{2}> 0, \qquad & \textrm{\color{DarkGreen}(not-active)} \end{aligned}
Gradient of active constraints
\nabla h(x,y,z) = \nabla (x+y^2+z^2-1) = [1,2y,2z] = [1,1,1]
Find the kernel of the active constraint:
[1,1,1]\begin{pmatrix} z_1\\z_2\\z_3\end{pmatrix} = z_1+z_2+z_3 = 0 \implies \begin{pmatrix} z_1\\z_2\\z_3\end{pmatrix} = \begin{pmatrix} -\alpha-\beta\\\alpha\\\beta\end{pmatrix} = \underbrace{ \begin{pmatrix} -1 & -1 \\ 1 & 0 \\ 0 & 1\end{pmatrix} }_{\bm{K}^a} \begin{pmatrix} \alpha \\ \beta\end{pmatrix}
Check qualification
The gradient of the active constraint, \nabla h(x, y, z), consists of a single nonzero row. It is therefore full rank and satisfies the LICQ (linear independence constraint qualification).
Compute the Hessian and the reduced Hessian
\nabla^2_{(x,y,z)}\mathcal{L} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & -2\lambda & 0 \\ 0 & 0 & -2\lambda \end{pmatrix} \implies \begin{pmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{pmatrix}
Reduced Hessian: \begin{aligned} \bm{K}^{a,\top}\nabla^2_{(x,y,z)}\mathcal{L}\,\bm{K}^a &= \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{pmatrix} \begin{pmatrix} -1 & -1 \\ 1 & 0 \\ 0 & 1\end{pmatrix} \\ &= \begin{pmatrix} -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} -2 & -2 \\ -2 & 0 \\ 0 & -2\end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix} \end{aligned}
Use Sylvester's criterion to check positivity:
- \bm{H}_1 = (0): the first leading minor vanishes, so positive definiteness already fails; we inspect the next minor.
- \det\begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix}=-4<0 [NO]

A negative second leading minor rules out positive semidefiniteness as well (the eigenvalues are \pm 2), so the reduced Hessian is indefinite: the point is not a minimum.
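The same conclusion follows in R from the eigenvalues of the reduced Hessian (a base-R sketch):

```r
K  <- matrix(c(-1, 1, 0,
               -1, 0, 1), nrow = 3)  # columns span ker(grad h) at (1/2, 1/2, 1/2)
H  <- diag(c(2, -2, -2))             # Hessian of the Lagrangian at lambda = 1
Hr <- t(K) %*% H %*% K               # reduced Hessian: [[0, 2], [2, 0]]
eigen(Hr)$values                     # 2 and -2: indefinite, so not a minimum
```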
Study of solution 2: x=\frac{\sqrt{5}+1}{4}, y=z=\frac{\sqrt{5}-1}{4}, \lambda=\frac{\sqrt{5}+1}{2}, \mu_1=\mu_2=0
Compute active constraints
\begin{aligned} g_1(x,y,z) & = 1-z = 1-\frac{\sqrt{5}-1}{4}=\frac{5-\sqrt{5}}{4}> 0, \qquad & \textrm{\color{DarkGreen}(not-active)} \\ g_2(x,y,z) & = x = \frac{\sqrt{5}+1}{4}> 0, \qquad & \textrm{\color{DarkGreen}(not-active)} \end{aligned}
Gradient of active constraints
\nabla h(x,y,z) = \begin{pmatrix}1 & 2y & 2z \end{pmatrix} = \begin{pmatrix} 1 & \dfrac{\sqrt{5}-1}{2} & \dfrac{\sqrt{5}-1}{2}\end{pmatrix}
Find the kernel of the active constraint:
\begin{aligned} \begin{pmatrix} 1 & \dfrac{\sqrt{5}-1}{2} & \dfrac{\sqrt{5}-1}{2}\end{pmatrix} \begin{pmatrix} z_1\\z_2\\z_3\end{pmatrix} & = z_1+\dfrac{\sqrt{5}-1}{2}(z_2+z_3) = 0 \\ & \Downarrow \\ \begin{pmatrix} z_1\\z_2\\z_3\end{pmatrix} & = \begin{pmatrix} \dfrac{1-\sqrt{5}}{2}(\alpha+\beta)\\ \alpha\\\beta\end{pmatrix} = \underbrace{ \begin{pmatrix} \dfrac{1-\sqrt{5}}{2} & \dfrac{1-\sqrt{5}}{2} \\ 1 & 0 \\ 0 & 1\end{pmatrix} }_{\bm{K}^a} \begin{pmatrix} \alpha \\ \beta\end{pmatrix} \end{aligned}
Check qualification
The gradient of the active constraint, \nabla h(x, y, z), consists of a single nonzero row. It is therefore full rank and satisfies the LICQ (linear independence constraint qualification).
Compute the Hessian and the reduced Hessian
\nabla^2_{(x,y,z)}\mathcal{L} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & -2\lambda & 0 \\ 0 & 0 & -2\lambda \end{pmatrix} \implies \begin{pmatrix} 2 & 0 & 0 \\ 0 & -(\sqrt{5}+1) & 0 \\ 0 & 0 & -(\sqrt{5}+1) \end{pmatrix}
Reduced Hessian, using the kernel basis \bm{K}^a computed above and writing c = \dfrac{1-\sqrt{5}}{2} (so that 2c^2 = 3-\sqrt{5}): \begin{aligned} \bm{K}^{a,\top}\nabla^2_{(x,y,z)}\mathcal{L}\,\bm{K}^a &= \begin{pmatrix} c & 1 & 0 \\ c & 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 \\ 0 & -(\sqrt{5}+1) & 0 \\ 0 & 0 & -(\sqrt{5}+1) \end{pmatrix} \begin{pmatrix} c & c \\ 1 & 0 \\ 0 & 1\end{pmatrix} \\ &= \begin{pmatrix} 2c^2-(\sqrt{5}+1) & 2c^2 \\ 2c^2 & 2c^2-(\sqrt{5}+1) \end{pmatrix} = \begin{pmatrix} 2-2\sqrt{5} & 3-\sqrt{5} \\ 3-\sqrt{5} & 2-2\sqrt{5} \end{pmatrix} \end{aligned}
Use Sylvester's criterion to check positivity:
- \bm{H}_1 = (2-2\sqrt{5}), determinant 2-2\sqrt{5}<0 [NO]

The first leading minor is already negative, so the reduced Hessian is not positive semidefinite: the point is not a minimum.
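Since the kernel basis here differs from that of solution 1, it is worth re-checking the reduced Hessian numerically (base R):

```r
c_ <- (1 - sqrt(5)) / 2
K  <- matrix(c(c_, 1, 0,
               c_, 0, 1), nrow = 3)  # columns span ker(grad h) at this point
H  <- diag(c(2, -(sqrt(5) + 1), -(sqrt(5) + 1)))
t(K) %*% H %*% K                     # [[2-2*sqrt(5), 3-sqrt(5)], [3-sqrt(5), 2-2*sqrt(5)]]
eigen(t(K) %*% H %*% K)$values       # both eigenvalues negative: not a minimum
```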
Study of solution 3: x=0, y=z=\lambda=-\frac{\sqrt{2}}{2}, \mu_1=0, \mu_2=\frac{\sqrt{2}}{2}
Compute active constraints
\begin{aligned} g_1(x,y,z) & = 1-z = 1+\frac{\sqrt{2}}{2}=\frac{2+\sqrt{2}}{2}> 0, \qquad & \textrm{\color{DarkGreen}(not-active)} \\ g_2(x,y,z) & = x = 0, \qquad & \textrm{\color{blue}(active)} \end{aligned}
Gradient of active constraints
\begin{pmatrix} \nabla h(x,y,z) \\ \nabla g_2(x,y,z) \end{pmatrix} = \begin{pmatrix} 1 & 2y & 2z \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & -\sqrt{2} & -\sqrt{2} \\ 1 & 0 & 0 \end{pmatrix}
Find the kernel of the active constraints:
\begin{aligned} \begin{pmatrix} 1 & -\sqrt{2} & -\sqrt{2} \\ 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} z_1\\z_2\\z_3\end{pmatrix} & = \begin{pmatrix} z_1-\sqrt{2}(z_2+z_3) \\ z_1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \\ & \Downarrow \\ \begin{pmatrix} z_1\\z_2\\z_3\end{pmatrix} & = \begin{pmatrix}0\\\alpha\\-\alpha\end{pmatrix} = \underbrace{ \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} }_{\bm{K}^a} \alpha \end{aligned} \tag{1}
Check qualification
The gradient matrix of the active constraints in Equation 1 has two linearly independent rows; hence it is full rank and satisfies the LICQ (linear independence constraint qualification).
Compute the Hessian and the reduced Hessian
\nabla^2_{(x,y,z)}\mathcal{L} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & -2\lambda & 0 \\ 0 & 0 & -2\lambda \end{pmatrix} \implies \begin{pmatrix} 2 & 0 & 0 \\ 0 & \sqrt{2} & 0 \\ 0 & 0 & \sqrt{2} \end{pmatrix}
Reduced Hessian: \begin{aligned} \bm{K}^{a,\top}\nabla^2_{(x,y,z)}\mathcal{L}\,\bm{K}^a &= \begin{pmatrix} 0 & 1 & -1 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 \\ 0 & \sqrt{2} & 0 \\ 0 & 0 & \sqrt{2} \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} \\ &= \begin{pmatrix} 0 & 1 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ \sqrt{2} \\ -\sqrt{2} \end{pmatrix} = 2\sqrt{2} > 0 \end{aligned}
Use Sylvester's criterion to check positivity:
- \bm{H}_1 = (2\sqrt{2}) > 0 [OK]

The reduced Hessian is positive definite, so the point is a local minimum.
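As an independent numerical check, we can hand the whole problem to a general constrained solver. A minimal sketch, assuming the alabama package is installed (its auglag() expects heq(x) = 0 and hin(x) >= 0); it should converge to the candidate found above:

```r
library(alabama)

fn  <- function(p) p[1]^2 + p[2] + p[3]        # objective f(x, y, z)
heq <- function(p) p[1] + p[2]^2 + p[3]^2 - 1  # equality constraint h = 0
hin <- function(p) c(1 - p[3], p[1])           # inequalities g1 >= 0, g2 >= 0

sol <- auglag(par = c(0.1, -0.5, -0.5), fn = fn, heq = heq, hin = hin)
sol$par  # expected: approximately (0, -0.7071, -0.7071)
```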
Example 3
Consider the problem
\begin{aligned} \textrm{minimize:}\quad & f(x, y) = y \\ \textrm{inequalities:}\quad & g_1(x,y) = x^2+y^2-1 \geq 0, \\ & g_2(x,y) = 1-x^2-\frac{1}{4}y^2 \geq 0. \end{aligned}
Code
# Define the functions for the implicit curves
library(ggplot2)

g1 <- function(x, y) { x^2 + y^2 - 1 }        # Circle: x^2 + y^2 = 1
g2 <- function(x, y) { 1 - x^2 - (y / 2)^2 }  # Ellipse: x^2 + (y/2)^2 = 1

# Create a grid of points
x_vals <- seq(-2, 2, length.out = 300)
y_vals <- seq(-2, 2, length.out = 300)

# Generate the data for g1 and g2
z_g1 <- outer(x_vals, y_vals, g1)
z_g2 <- outer(x_vals, y_vals, g2)

# Create a data frame for plotting with ggplot2
df <- expand.grid(x = x_vals, y = y_vals)
df$g1 <- as.vector(z_g1)
df$g2 <- as.vector(z_g2)

# Define the grid for the gradient vectors
vector_grid <- expand.grid(x = seq(-2, 2, length.out = 15),
                           y = seq(-2, 2, length.out = 15))
# The gradient of f(x, y) = y is (0, 1) at every point
vector_grid$u <- 0  # x-component of the gradient
vector_grid$v <- 1  # y-component of the gradient

# Plot using ggplot2
ggplot(df, aes(x = x, y = y)) +
  # Fill the first curve (circle) with semi-transparent red
  geom_contour_filled(aes(z = g1), breaks = c(-Inf, 0), fill = "red", alpha = 0.5) +
  # Fill the second curve (ellipse) with semi-transparent blue
  geom_contour_filled(aes(z = g2), breaks = c(-Inf, 0), fill = "blue", alpha = 0.5) +
  # Add contour lines for both curves
  geom_contour(aes(z = g1), breaks = 0, color = "red", linewidth = 1) +
  geom_contour(aes(z = g2), breaks = 0, color = "blue", linewidth = 1) +
  # Add the gradient field
  geom_segment(data = vector_grid,
               aes(x = x, y = y, xend = x + u * 0.2, yend = y + v * 0.2),
               arrow = arrow(length = unit(0.1, "inches")),
               color = "gray50",
               linewidth = 0.7) +
  # Mark the two candidate points (0, -2) and (0, 1)
  geom_point(aes(x = 0, y = -2), color = "red", size = 3) +
  geom_point(aes(x = 0, y = 1), color = "blue", size = 3) +
  coord_fixed() +  # fix the aspect ratio
  theme_minimal() +
  labs(title = "Circle and ellipse with gradient field of f(x, y) = y", x = "x", y = "y")
Lagrangian Function
We define the Lagrangian function as follows:
\mathcal{L}(x,y,\mu_1,\mu_2) = y - \mu_1(x^2 + y^2 - 1) - \mu_2\left(1 - x^2 - \frac{1}{4}y^2\right)
First-Order Conditions
The first-order conditions derived from the Lagrangian are:
\begin{aligned} \frac{\partial \mathcal{L}}{\partial x} &= 2x(\mu_2 - \mu_1) = 0, \\ \frac{\partial \mathcal{L}}{\partial y} &= 1 + y\left(\frac{1}{2}\mu_2 - 2\mu_1\right) = 0, \\ 0 &= \mu_1(x^2 + y^2 - 1), \\ 0 &= \mu_2\left(1 - x^2 - \frac{1}{4}y^2\right). \end{aligned}
Solutions of First-Order Conditions
The solutions to the first-order conditions yield the following critical points:
\begin{aligned} x &= 0, & y &= 1, & \mu_1 &= \frac{1}{2}, & \mu_2 &= 0, \\ x &= 0, & y &= -2, & \mu_1 &= 0, & \mu_2 &= 1. \end{aligned}
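A quick residual check of the first-order system at both candidates (base R; every entry should vanish):

```r
kkt_residual <- function(x, y, m1, m2) {
  c(2 * x * (m2 - m1),            # dL/dx = 0
    1 + y * (m2 / 2 - 2 * m1),    # dL/dy = 0
    m1 * (x^2 + y^2 - 1),         # complementarity for g1
    m2 * (1 - x^2 - y^2 / 4))     # complementarity for g2
}

kkt_residual(0,  1, m1 = 1/2, m2 = 0)  # c(0, 0, 0, 0)
kkt_residual(0, -2, m1 = 0,   m2 = 1)  # c(0, 0, 0, 0)
```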
Analysis of First Candidate
Compute Active Constraints
We evaluate the constraints at the first candidate (x, y) = (0, 1):
\begin{aligned} g_1(x,y) &= x^2 + y^2 - 1 = 0^2 + 1^2 - 1 = 0, & \text{(active)} \\ g_2(x,y) &= 1 - x^2 - \frac{1}{4}y^2 = 1 - 0^2 - \frac{1}{4}1^2 = \frac{3}{4} > 0, & \text{(not-active)} \end{aligned}
Gradient of Active Constraints
The gradient of the active constraint g_1 is computed as follows:
\nabla g_1(x,y) = [2x, 2y] = [0, 2].
Check qualification
The gradient of the active constraint, \nabla g_1(x,y) = [0, 2], is a single nonzero row; hence it is full rank and satisfies the LICQ (linear independence constraint qualification).
Kernel of the Gradient of Active Constraints
The kernel of the gradient can be expressed as:
[0, 2] \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = 2z_2 = 0 \implies \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} \alpha \\ 0 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{\bm{K}^a} \alpha.
Reduced Hessian
The reduced Hessian is given by:
\begin{aligned} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) &= \begin{pmatrix} 2(\mu_2 - \mu_1) & 0 \\ 0 & \frac{1}{2}(\mu_2 - 4\mu_1) \end{pmatrix} \\ & \implies \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}. \end{aligned}
Evaluating the quadratic form yields:
\begin{aligned} \bm{K}^{a,\top} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) \bm{K}^a &= \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = -1. \end{aligned}
Since the reduced Hessian is not positive semidefinite (the quadratic form is negative), this point is not a minimum.
Analysis of Second Candidate
Compute Active Constraints
Next, we evaluate the constraints at the second candidate (x, y) = (0, -2):
\begin{aligned} g_1(x,y) &= x^2 + y^2 - 1 = 0^2 + (-2)^2 - 1 = 3 > 0, & \text{(not-active)} \\ g_2(x,y) &= 1 - x^2 - \frac{1}{4}y^2 = 1 - 0^2 - \frac{1}{4}(-2)^2 = 0, & \text{(active)} \end{aligned}
Gradient of Active Constraints
The gradient of the active constraint g_2 is computed as:
\nabla g_2(x,y) = \left[-2x, -\frac{1}{2}y\right] = [0, 1].
Check qualification
The gradient of the active constraint, \nabla g_2(x,y) = [0, 1], is a single nonzero row; hence it is full rank and satisfies the LICQ (linear independence constraint qualification).
Kernel of the Gradient of Active Constraints
The kernel of the gradient can be expressed as:
[0, 1] \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = z_2 = 0 \implies \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} \alpha \\ 0 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{\bm{K}^a} \alpha.
Reduced Hessian
The reduced Hessian is given by:
\begin{aligned} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) &= \begin{pmatrix} 2(\mu_2 - \mu_1) & 0 \\ 0 & \frac{1}{2}(\mu_2 - 4\mu_1) \end{pmatrix} \\ & \implies \begin{pmatrix} 2 & 0 \\ 0 & \frac{1}{2} \end{pmatrix}. \end{aligned}
Evaluating the quadratic form yields:
\begin{aligned} \bm{K}^{a,\top} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) \bm{K}^a &= \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & \frac{1}{2} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = 2. \end{aligned}
Since the reduced Hessian is positive definite, this point is a minimum.
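A brute-force scan of the feasible set (outside the circle, inside the ellipse) confirms that y is smallest at (0, -2); a base-R sketch:

```r
# Sample a grid, keep the feasible points (g1 >= 0 and g2 >= 0), minimize y
pts  <- expand.grid(x = seq(-2, 2, length.out = 401),
                    y = seq(-2, 2, length.out = 401))
feas <- subset(pts, x^2 + y^2 - 1 >= 0 & 1 - x^2 - y^2 / 4 >= 0)
feas[which.min(feas$y), ]  # approximately x = 0, y = -2
```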
Example 4
Consider the problem
\begin{aligned} \textrm{minimize:}\quad & f(x, y) = x \\ \textrm{inequalities:}\quad & g_1(x,y) = x^2+y^2-1 \geq 0, \\ & g_2(x,y) = 1-x^2-\frac{1}{4}y^2 \geq 0. \end{aligned}
Code
# Define the functions for the implicit curves
library(ggplot2)

g1 <- function(x, y) { x^2 + y^2 - 1 }        # Circle: x^2 + y^2 = 1
g2 <- function(x, y) { 1 - x^2 - (y / 2)^2 }  # Ellipse: x^2 + (y/2)^2 = 1

# Create a grid of points
x_vals <- seq(-2, 2, length.out = 300)
y_vals <- seq(-2, 2, length.out = 300)

# Generate the data for g1 and g2
z_g1 <- outer(x_vals, y_vals, g1)
z_g2 <- outer(x_vals, y_vals, g2)

# Create a data frame for plotting with ggplot2
df <- expand.grid(x = x_vals, y = y_vals)
df$g1 <- as.vector(z_g1)
df$g2 <- as.vector(z_g2)

# Define the grid for the gradient vectors
vector_grid <- expand.grid(x = seq(-2, 2, length.out = 15),
                           y = seq(-2, 2, length.out = 15))
# The gradient of f(x, y) = x is (1, 0) at every point
vector_grid$u <- 1  # x-component of the gradient
vector_grid$v <- 0  # y-component of the gradient

# Plot using ggplot2
ggplot(df, aes(x = x, y = y)) +
  # Fill the first curve (circle) with semi-transparent red
  geom_contour_filled(aes(z = g1), breaks = c(-Inf, 0), fill = "red", alpha = 0.5) +
  # Fill the second curve (ellipse) with semi-transparent blue
  geom_contour_filled(aes(z = g2), breaks = c(-Inf, 0), fill = "blue", alpha = 0.5) +
  # Add contour lines for both curves
  geom_contour(aes(z = g1), breaks = 0, color = "red", linewidth = 1) +
  geom_contour(aes(z = g2), breaks = 0, color = "blue", linewidth = 1) +
  # Add the gradient field
  geom_segment(data = vector_grid,
               aes(x = x, y = y, xend = x + u * 0.2, yend = y + v * 0.2),
               arrow = arrow(length = unit(0.1, "inches")),
               color = "gray50",
               linewidth = 0.7) +
  # Mark the two candidate points (-1, 0) and (1, 0)
  geom_point(aes(x = -1, y = 0), color = "red", size = 3) +
  geom_point(aes(x = 1, y = 0), color = "blue", size = 3) +
  coord_fixed() +  # fix the aspect ratio
  theme_minimal() +
  labs(title = "Circle and ellipse with gradient field of f(x, y) = x", x = "x", y = "y")
Lagrangian Function
We define the Lagrangian function as follows:
\mathcal{L}(x,y,\mu_1,\mu_2) = x - \mu_1(x^2 + y^2 - 1) - \mu_2\left(1 - x^2 - \frac{1}{4}y^2\right)
First-Order Conditions
The first-order conditions derived from the Lagrangian are:
\begin{aligned} \frac{\partial \mathcal{L}}{\partial x} &= 1+2x(\mu_2 - \mu_1) = 0, \\ \frac{\partial \mathcal{L}}{\partial y} &= y\left(\frac{1}{2}\mu_2 - 2\mu_1\right) = 0, \\ 0 &= \mu_1(x^2 + y^2 - 1), \\ 0 &= \mu_2\left(1 - x^2 - \frac{1}{4}y^2\right). \end{aligned}
Solutions of First-Order Conditions
Case y=0: \begin{aligned} 0 &= 1+2x(\mu_2 - \mu_1), \\ 0 &= \mu_1(x^2 - 1), \\ 0 &= \mu_2(1 - x^2). \end{aligned}
If x^2 \neq 1, the last two equations force \mu_1=\mu_2=0, and then the first equation gives 1=0, which is impossible. Hence x=1 or x=-1.
- Case x=1 solving 1+2(\mu_2 - \mu_1)=0 we have \mu_1=1/2+\mu_2
- Case x=-1 solving 1-2(\mu_2 - \mu_1)=0 we have \mu_2=1/2+\mu_1
Thus we have two families of solutions:
- x=1, y=0, \mu_1=1/2+\mu_2 with \mu_1\geq 0
- x=-1, y=0, \mu_2=1/2+\mu_1 with \mu_2\geq 0
Case y\neq 0: in this case \mu_2=4\mu_1 and \left\{ \begin{aligned} 0 &= 1+6x\mu_1, \\ 0 &= \mu_1(x^2 + y^2 - 1), \\ 0 &= 4\mu_1\left(1 - x^2 - \frac{1}{4}y^2\right), \end{aligned}\right. \implies \mu_1\neq 0\implies \left\{ \begin{aligned} 0 &= x^2 + y^2 - 1, \\ 0 &= 1 - x^2 - \frac{1}{4}y^2, \end{aligned}\right. but subtracting the two equations gives \frac{3}{4}y^2 = 0, i.e. y=0, contradicting y\neq 0. Thus this case yields no solution.
The solutions to the first-order conditions yield the following critical points:
\begin{aligned} x &= 1, & y &= 0, & \mu_1 &= \frac{1}{2}+\mu_2, \\ x &= -1, & y &= 0, & \mu_2 &= \frac{1}{2}+\mu_1 \end{aligned}
Analysis of First Candidate
Compute Active Constraints
We evaluate the constraints at the first candidate (x, y) = (1, 0):
\begin{aligned} g_1(x,y) &= x^2 + y^2 - 1 = 1^2 + 0^2 - 1 = 0, & \text{(active)} \\ g_2(x,y) &= 1 - x^2 - \frac{1}{4}y^2 = 1 - 1^2 - \frac{1}{4}0^2 = 0, & \text{(active)} \end{aligned}
Gradient of Active Constraints
The gradients of the active constraints g_1 and g_2 are computed as follows:
\begin{pmatrix} \nabla g_1(x,y) \\ \nabla g_2(x,y) \end{pmatrix} = \begin{pmatrix} 2x & 2y \\ -2x & -y/2 \end{pmatrix} \;\implies\; \begin{pmatrix} 2 & 0 \\ -2 & 0 \end{pmatrix} \tag{2}
Check qualification
The gradient matrix of the active constraints in Equation 2 is not full rank, so the LICQ (linear independence constraint qualification) does not hold.
Remark. However, by choosing the solution with \mu_2 = 0, we can drop the second constraint and still obtain the same candidate point: with only g_1 active the gradient [2, 0] is full rank, the constraint qualification holds, and the analysis of this point is unchanged.
Kernel of the Gradient of Active Constraints
The kernel of the gradient can be expressed as:
\begin{aligned} \begin{pmatrix} 2 & 0 \\ -2 & 0 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} & = \begin{pmatrix} 2z_1 \\ -2z_1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \\ & \Downarrow \\ \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} & = \begin{pmatrix} 0 \\ \alpha \end{pmatrix} = \underbrace{\begin{pmatrix} 0 \\ 1 \end{pmatrix}}_{\bm{K}^a} \alpha. \end{aligned}
Reduced Hessian
The reduced Hessian (with \mu_2=0 and \mu_1=1/2) is given by:
\begin{aligned} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) &= \begin{pmatrix} 2(\mu_2 - \mu_1) & 0 \\ 0 & \frac{1}{2}(\mu_2 - 4\mu_1) \end{pmatrix} \\ & \implies \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}. \end{aligned}
Evaluating the quadratic form yields:
\begin{aligned} \bm{K}^{a,\top} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) \bm{K}^a &= \begin{pmatrix} 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = -1. \end{aligned}
Since the reduced Hessian is not positive semidefinite, this point is not a minimum.
Analysis of Second Candidate
Compute Active Constraints
Next, we evaluate the constraints at the second candidate (x, y) = (-1,0):
\begin{aligned} g_1(x,y) &= x^2 + y^2 - 1 = (-1)^2 + 0^2 - 1 = 0, & \text{(active)} \\ g_2(x,y) &= 1 - x^2 - \frac{1}{4}y^2 = 1 - (-1)^2 - \frac{1}{4}(0)^2 = 0, & \text{(active)} \end{aligned}
Gradient of Active Constraints
The gradients of the active constraints g_1 and g_2 are computed as:
\begin{pmatrix} \nabla g_1(x,y) \\ \nabla g_2(x,y) \end{pmatrix} = \begin{pmatrix} 2x & 2y \\ -2x & -y/2 \end{pmatrix} \;\implies\; \begin{pmatrix} -2 & 0 \\ 2 & 0 \end{pmatrix}
Kernel of the Gradient of Active Constraints
The kernel of the gradient can be expressed as:
\begin{aligned} \begin{pmatrix} -2 & 0 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} & = \begin{pmatrix} -2z_1 \\ 2z_1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \\ & \Downarrow \\ \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} & = \begin{pmatrix} 0 \\ \alpha \end{pmatrix} = \underbrace{\begin{pmatrix} 0 \\ 1 \end{pmatrix}}_{\bm{K}^a} \alpha. \end{aligned} \tag{3}
Check qualification
The gradient matrix of the active constraints in Equation 3 is not full rank, so the LICQ (linear independence constraint qualification) does not hold.
Remark. However, by choosing the solution with \mu_1 = 0, we can drop the first constraint and still obtain the same candidate point: with only g_2 active the gradient [2, 0] is full rank, the constraint qualification holds, and the analysis of this point is unchanged.
Reduced Hessian
The reduced Hessian (with \mu_1=0 and \mu_2=1/2) is given by: \begin{aligned} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) &= \begin{pmatrix} 2(\mu_2 - \mu_1) & 0 \\ 0 & \frac{1}{2}(\mu_2 - 4\mu_1) \end{pmatrix} \\ & \implies \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{4} \end{pmatrix}. \end{aligned}
Evaluating the quadratic form yields:
\begin{aligned} \bm{K}^{a,\top} \nabla^2_{(x,y)} \mathcal{L}(x,y,\mu_1,\mu_2) \bm{K}^a &= \begin{pmatrix} 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{4} \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{4}. \end{aligned}
Since the reduced Hessian is positive definite, this point is a minimum.
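The same brute-force scan of the feasible region, now minimizing x, lands on the second candidate (-1, 0):

```r
pts  <- expand.grid(x = seq(-2, 2, length.out = 401),
                    y = seq(-2, 2, length.out = 401))
feas <- subset(pts, x^2 + y^2 - 1 >= 0 & 1 - x^2 - y^2 / 4 >= 0)
feas[which.min(feas$x), ]  # approximately x = -1, y = 0
```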
Example 5
Consider the problem
\begin{aligned} \textrm{minimize:}\quad & f(x,y) = 6 + 2y-x^2 \\ \textrm{equality:}\quad & h(x,y) = y - x^2 = 0 \end{aligned}
Lagrangian function
\mathcal{L}(x,y,\lambda) = 6 + 2y-x^2 - \lambda(y - x^2)
First-Order Conditions
\begin{aligned} \frac{\partial\mathcal{L}}{\partial x} & = 2x(\lambda-1) = 0\\ \frac{\partial\mathcal{L}}{\partial y} & = 2 - \lambda = 0\\ 0 & = y - x^2 \end{aligned}
Solution of the First-Order Conditions
- \lambda = 2
- x=0
- y=0
Analysis of the Candidate
Gradient of the constraints
\nabla (y - x^2) = \begin{pmatrix} -2x & 1 \end{pmatrix},\quad \implies\quad \begin{pmatrix} 0 & 1 \end{pmatrix}
Check Qualification
The single gradient is nonzero, hence linearly independent: the constraint qualification (LICQ) holds.
Kernel of the Gradient of Active Constraints
\begin{pmatrix} 0 & 1 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = z_2 = 0,\quad \implies\quad \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} = \begin{pmatrix} \alpha \\ 0 \end{pmatrix} = \underbrace{\begin{pmatrix} 1 \\ 0 \end{pmatrix}}_{\bm{K}^a} \alpha
Reduced Hessian
\nabla^2_{(x,y)}\mathcal{L}(x,y,\lambda) = \begin{pmatrix} 2(\lambda - 1) & 0 \\ 0 & 0 \end{pmatrix} \quad\implies\quad \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}
\bm{K}^{a,\top} \nabla^2_{(x,y)}\mathcal{L}(0,0,2) \bm{K}^{a} = \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = (2)
The reduced Hessian is positive definite, so the point is a minimum.
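As a sanity check, the constraint can be eliminated directly: substituting y = x^2 into the objective gives
f(x, x^2) = 6 + 2x^2 - x^2 = 6 + x^2,
which is minimized at x = 0 (hence y = 0) with value f = 6, so the point is in fact the global minimum.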