Bad SDPs and beginner mistakes

Beginners in optimization and semidefinite programming often overestimate what semidefinite programming solvers are capable of, and what the L in LMI stands for. Sometimes completely unrealistically complicated models are derived and they are bending backwards to do the impossible, while sometimes trivial improvements can be made to allow them to be solved.

Unnecessary nonlinear terms

The absolutely most common mistake is the introduction of a single variable nonlinearly which destroys both convexity and linearity. A too literate reading of a theorem and lack of understanding what L stands for often leads to models similiar to this

P = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
sdpvar gamma
MyLMI = [P >= 0, [A'*P + P*A P*B;B'*P -gamma^2] <= 0];
optimize(MyLMI, gamma)

For comical effect, we have named the obviously nonlinear constraint MyLMI to indicate the lack of understanding here. L in LMI stands for linear, and the squared gamma obviously destroys that here.

When sending this model to optimize it will either lead to a diagnostic message saying there is no solver available, or you might get the global nonlinear solver BMIBNB started (and most likely fail). This is completely unnecessary though.

A completely equivalent form, as gamma only is used in squared form, is to introduce a new variable which replaces the squared term.

P = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
sdpvar t
MyLMI = [P >= 0, [A'*P + P*A P*B;B'*P -t] <= 0];
optimize(MyLMI, sqrt(t))

This is still essentially as bad though, as we now try to minimize a concave objective and this cannot be reformulated as a linear semidefinite program. However, the objective is monotonic so an equivalent linear problem can be solved, and the original variable can be recovered.

P = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
sdpvar t
MyLMI = [P >= 0, [A'*P + P*A P*B;B'*P -t] <= 0];
optimize(MyLMI, t)
gamma = sqrt(value(t))

A Schur complement away

The most commonly trick used in control-related semidefinite programs is the Schur complement. Consider the following model were we try to find a minimum-norm state-feedback matrix to stabilize a given system with a predefined Lyapunov function

K = sdpvar(1,2);
A = [-1 2;2 -3];
B = [1;1];
P = [33 -56;-56 97];
MyLMI = [ (A + B*K)'*P*(A + B*K) - P <= 0];
optimize(MyLMI,norm(K))

Once again we indicate how confused we are giving the quadratic semidefinite constraint a name indicating we think it is linear. The correct model is only a Schur complement away though.

K = sdpvar(1,2);
A = [-1 2;2 -3];
B = [1;1];
P = [33 -56;-56 97];
MyLMI = [ [P (A + B*K)';(A + B*K) inv(P)] >= 0];
optimize(MyLMI,norm(K))

Congruence and variable change

An even more complicated model would arise if we wanted to optimize over both \(K\) and \(P\) in the previous example (now only looking for a feasible solution, since the minimum-norm state-feedback problem cannot be expressed using a convex semidefinite model)

The most naive model would be the following

K = sdpvar(1,2);
P = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
MyLMI = [ (A + B*K)'*P*(A + B*K) - P <= 0];
optimize(MyLMI)

The initial model is cubic in the decision variables. After reading the previous section, our hero might develop the following model

K = sdpvar(1,2);
P = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
MyLMI = [ [P (A + B*K)';(A + B*K) inv(P)] >= 0];
optimize(MyLMI)

The cubic terms have been eliminated, but this model is arguably even worse, as it involves an inverse. The trick here is to employ a congruence transformation with a block-diagonal matrix with blocks \( P^{-1} \) and \( I \), or alternatively view the original constraint as \( P^{-1}(A + BK)^T P (A + BK)P^{-1} - P^{-1}PP^{-1} \preceq 0\). This leads to

K = sdpvar(1,2);
P = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
MyLMI = [ [inv(P) inv(P)*(A + B*K)';(A + B*K)*inv(P) inv(P)] >= 0];
optimize(MyLMI)

Even worse, but now we perform an invertible variable change using \( Q = P^{-1}, Y = KP^{-1} \) and the result is linear

Y = sdpvar(1,2);
Q = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
MyLMI = [ [Q (A*Q + B*Y)';(A*Q + B*Y) Q] >= 0];
optimize(MyLMI)
K = value(Y)*inv(value(Q))

Impossible Schur complement

Beginners who have learned the magic of Schur complements to linearize \(A - B^TC^{-1}B \succeq, C\succeq 0\) sometimes start bending backwards to massage the nonconvex model \(A + B^TC^{-1}B \succeq, C\succeq 0\) into a convex form, for instance by writing it as \(A - B^T(-C^{-1})B \succeq, C\succeq 0\). It simply does not work as it is a nonconvex constraint. The flaw here is that we forgot that the correct form after changing the sign would be \(A - B^T(-C^{-1})B \succeq 0, -C\succeq 0\), which is very different from the initial constraint.

That is not a variable change

Another typical mistake is to fail to understand the difference between a variable change, and simply assigning expressions to temporary variables. The state-feedback design problem above is not fixed by simply assigning the variables to some other expression. This is just a re-arrangement of code by introducing a temporary expression holder, the resulting model is precisely the same.

K = sdpvar(1,2);
P = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
Q = inv(P);
Y = K*inv(P);
MyLMI = [ [Q (A*Q + B*Y)';(A*Q + B*Y) Q] >= 0];
optimize(MyLMI)

Similarily, the following model does indeed introduce new variables, but the old ones are kept and add nonlinear equalities to the model rendering the problem intractable.

K = sdpvar(1,2);
P = sdpvar(2);
Y = sdpvar(1,2);
Q = sdpvar(2);
A = [-1 2;2 -3];
B = [1;1];
Q = inv(P);
Y = K*inv(P);
MyLMI = [ [Q (A*Q + B*Y)';(A*Q + B*Y) Q] >= 0, Q == inv(P), Y == K*inv(P)];
optimize(MyLMI)

Solving homogenuous problems

A very common situation is formulation of models which are homogenuous, which means that if \(x\) is a solution, then so is \(tx\) for any \(t\geq 0\). These models typically only make sense when there is a strict inequality somewhere, and since that is not supported, chaos ensues.

Consider proving stability of a linear system \(\dot{x}=Ax\) by finding a Lyapunov matrix satisfying \(A^TP+PA \prec 0, P \succ 0\).

A = eye(2);

P = sdpvar(2);
optimize([A'*P+P*A<=0, P>=0])

The solver does not complain and will not tell us the problem is infeasible (since it is feasible!) and thus proves stability of an obviously unstable system. The problem is that we have replaced the strict inequality with a non-strict (as we have to do that in practice). This is dangerous on homogenuous models.

Since the problem is homogenuous the scale of the variables are arbitrary, so we can replace \(P \succ 0 \) with \( P\succeq I\)

optimize([A'*P+P*A <=0, P>=eye(2)])

This will lead to infeasibility in the solver, as expected.

Johan Löfberg