
Wednesday, November 6, 2013

Complexity of solving conic quadratic problems

First, a clarification: conic quadratic optimization and second-order cone optimization are the same thing. I prefer the name conic quadratic optimization, though.

Some version of the following questions is frequently asked on the internet: What is the computational complexity of solving conic quadratic problems? What is the complexity of the algorithms implemented in MOSEK, SeDuMi or SDPT3?

To the best of my knowledge, almost all open-source and commercial software employs a primal-dual interior-point algorithm, using for instance the so-called Nesterov-Todd scaling.

A conic quadratic problem can be stated in the form
\[
\begin{array}{lccl}
\mbox{min} & \sum_{j=1}^d (c^j)^T x^j & \\
\mbox{st} & \sum_{j=1}^d A^j x^j & = & b \\
& x^j \in K^j & \\
\end{array}
\]
where \(K^j\) is an \(n^j\)-dimensional quadratic cone. Moreover, I will use \(A = [A^1,\ldots, A^d]\) and \(n=\sum_j n^j\). Note that \(d \leq n\). First observe that the problem cannot be solved exactly on a computer using floating-point numbers, since the solution might be irrational. This is in contrast to linear problems, which always have a rational solution if the data is rational.
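As a tiny illustration of this (my own example), consider
\[
\begin{array}{lc}
\mbox{min} & t \\
\mbox{st} & t \geq ||(1,1)||
\end{array}
\]
which has rational data but the irrational optimal value \(t^* = \sqrt{2}\).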

Using for instance a primal-dual interior-point algorithm, the problem can be solved to \(\varepsilon\) accuracy in \(O(\sqrt{d} \ln(\varepsilon^{-1}))\) interior-point iterations, where \(\varepsilon\) is the accepted duality gap. The most famous variant having that iteration complexity is based on Nesterov and Todd's beautiful work on symmetric cones.

Each iteration requires the solution of a linear system with a coefficient matrix of the form
\[
\left[ \begin{array}{cc} H & A^T \\ A & 0 \end{array} \right] \quad (*)
\]
where \(H\) is a scaling matrix determined by the current iterate.
This is the most expensive operation; it can be done in \(O(n^3)\) operations using Gaussian elimination, so we end up with the total complexity \(O(n^{3.5}\ln(\varepsilon^{-1}))\).
That is the theoretical result. In practice the algorithms usually work much better, because they normally finish in something like 10 to 100 iterations and rarely employ more than 200 iterations. In fact, if an algorithm requires more than 200 iterations, then typically numerical issues prevent the software from solving the problem.
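To make the dominant per-iteration cost concrete, here is a minimal NumPy sketch that assembles and solves one dense system of the form (*). The matrices H and A below are random stand-ins of my own choosing; in a real solver H has block structure and the system is solved far more cleverly.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 400, 100                   # number of variables and equality constraints
    A = rng.standard_normal((m, n))
    H = rng.standard_normal((n, n))
    H = H @ H.T + np.eye(n)           # positive definite stand-in for the scaling matrix

    # Assemble the KKT-type matrix (*) and solve one Newton-type system.
    K = np.block([[H, A.T], [A, np.zeros((m, m))]])
    rhs = rng.standard_normal(n + m)
    step = np.linalg.solve(K, rhs)    # dense Gaussian elimination, O((n+m)^3)
    print("residual:", np.linalg.norm(K @ step - rhs))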

Finally, a typical conic quadratic problem is sparse, and that implies the linear system mentioned above can be solved much faster when the sparsity is exploited. Figuring out how to solve the linear equation system (*) in the lowest complexity when exploiting sparsity is NP-hard, and therefore optimization software only employs various heuristics, such as the minimum degree ordering, that help cut the cost of each iteration. If you want to know more, then read my Mathematical Programming publication mentioned below. One important fact is that it is impossible to predict the per-iteration cost without knowing the problem structure and then doing a complicated analysis of it. I.e. the per-iteration cost is not a simple function of the number of constraints and variables unless \(A\) is completely dense.
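To illustrate how much the ordering heuristic matters, here is a small SciPy sketch of my own. The matrix is a random stand-in for (*), and the point is only that a fill-reducing ordering (here COLAMD) typically yields much sparser factors than the natural ordering:

    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    # A sparse random matrix as a stand-in for the KKT matrix (*).
    n = 1000
    M = sp.random(n, n, density=0.005, format="csc", random_state=0)
    M = (M + M.T + 10 * sp.identity(n)).tocsc()   # symmetrize and make it nonsingular

    # Factor once with the natural ordering, once with a fill-reducing ordering.
    lu_nat = spla.splu(M, permc_spec="NATURAL")
    lu_amd = spla.splu(M, permc_spec="COLAMD")
    print("fill-in, natural ordering:", lu_nat.L.nnz + lu_nat.U.nnz)
    print("fill-in, COLAMD ordering: ", lu_amd.L.nnz + lu_amd.U.nnz)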

To summarize: in practice, primal-dual interior-point algorithms solve a conic quadratic problem at a cost of less than 200 times that of solving the linear equation system (*).

So can the best proven polynomial complexity bound be established for software like MOSEK? In general the answer is no, because the software employs a bunch of tricks that speed up the practical performance but unfortunately destroy the theoretical complexity proof. In fact, it is commonly accepted that if the algorithm is implemented strictly as the theory suggests, then it will be hopelessly slow.

I have spent a lot of time on implementing interior-point methods, as documented by the Mathematical Programming publication, and my view is that the practical implementations are very close to the theory.

Wednesday, June 26, 2013

Slow convergence in conic quadratic optimization

The recent paper by Gould, Orban, and Robinson in Mathematical Programming Computation studies the QP
\[
\begin{array}{lc}
\mbox{min} & x_1^2 \\
\mbox{st} & x_1 \geq 0
\end{array}
\]
because the problem does not have a strictly complementary solution. This fact leads interior-point methods to converge slowly.

An interesting observation is that it can be formulated as the following conic quadratic optimization problem:
\[
\begin{array}{lc}
\mbox{min} & x_2 \\
\mbox{st} & x_2 \geq ||x_1|| \\
 & x_1 \geq 0
\end{array}
\]
An optimal primal solution is \(x_1 = 0\) and \(x_2=0\). Observe that Vanderbei studies the same problem in his working paper (Section 2.2) on solving SOCPs with LOQO.
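For readers who want to reproduce the experiment, a minimal sketch using CVXPY with MOSEK as the backend could look as follows. This is my own reconstruction, and CVXPY may transform the problem before it reaches the solver, so the resulting log need not match the one below, which was produced by calling MOSEK directly:

    import cvxpy as cp

    x1 = cp.Variable(1, nonneg=True)   # x_1 >= 0
    x2 = cp.Variable()
    prob = cp.Problem(cp.Minimize(x2), [cp.SOC(x2, x1)])  # x_2 >= ||x_1||
    prob.solve(solver=cp.MOSEK, verbose=True)             # prints the interior-point log
    print(x1.value, x2.value)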

If I solve the problem with MOSEK I obtain

ITE PFEAS    DFEAS    GFEAS    PRSTATUS   POBJ              DOBJ              MU       TIME  
0   1.0e+000 1.0e+000 2.0e+000 0.00e+000  1.000000000e+000  0.000000000e+000  1.0e+000 0.00  
1   1.6e-001 1.6e-001 3.2e-001 1.00e+000  2.404340377e-001  0.000000000e+000  1.6e-001 0.00  
2   8.2e-003 8.2e-003 1.6e-002 1.12e+000  1.181687831e-002  0.000000000e+000  8.2e-003 0.00  
3   4.1e-004 4.1e-004 8.2e-004 1.01e+000  5.891864452e-004  0.000000000e+000  4.1e-004 0.00  
4   2.1e-005 2.1e-005 4.1e-005 1.00e+000  2.945495827e-005  0.000000000e+000  2.1e-005 0.00  
5   2.6e-010 2.6e-010 5.2e-010 1.00e+000  4.144019664e-010  0.000000000e+000  2.6e-010 0.00  

As can be seen from the MU column (the average complementarity gap), MOSEK converges quickly. In fact there is quadratic convergence in the last iteration. Now the dual problem is
\[
\begin{array}{lc}
\mbox{max} & 0 \\
\mbox{st} & s_3 = 1 \\
 & s_2 \geq ||s_1|| \\
 & s_1 \geq 0
\end{array}
\]
and an optimal dual solution is \(s_1=0\), \(s_2=1\) and \(s_3=1\). In this case the dual solution is in the interior, and hence the solutions are strictly complementary. This is consistent with the fast convergence of the interior-point optimizer in MOSEK.

An equivalent conic quadratic problem, using a rotated quadratic cone, can be stated as follows:
\[
\begin{array}{lc}
\mbox{min} & x_2 \\
\mbox{st} & 2 x_3 x_2 \geq ||x_1||^2 \\
 & x_3 = 1 \\
 & x_1 \geq 0
\end{array}
\]
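As a side remark (my own addition), a rotated quadratic cone constraint can always be mapped back to a standard quadratic cone via the identity
\[
2 x_3 x_2 \geq ||x_1||^2, \; x_2, x_3 \geq 0
\quad \Longleftrightarrow \quad
x_2 + x_3 \geq ||(\sqrt{2} x_1, x_2 - x_3)||,
\]
so the two formulations describe exactly the same set; the difference lies in how the optimizer behaves on them.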
Now solving that problem with MOSEK produces
ITE PFEAS    DFEAS    GFEAS    PRSTATUS   POBJ              DOBJ              MU       TIME  
0   1.0e+000 1.0e+000 1.7e+000 0.00e+000  7.071067812e-001  0.000000000e+000  1.0e+000 0.00  
1   1.4e-001 1.4e-001 2.4e-001 6.09e-001  1.517397994e-001  -4.749764644e-002 1.4e-001 0.00  
2   2.2e-002 2.2e-002 3.7e-002 1.21e+000  2.120093451e-002  -6.971441263e-003 2.2e-002 0.00  
3   5.3e-003 5.3e-003 9.0e-003 1.14e+000  4.674957163e-003  -1.445168902e-003 5.3e-003 0.00  
4   1.5e-003 1.5e-003 2.6e-003 1.07e+000  1.234899144e-003  -3.403939914e-004 1.5e-003 0.00  
5   4.4e-004 4.4e-004 7.4e-004 1.04e+000  3.331026029e-004  -7.962820209e-005 4.4e-004 0.00  
6   1.2e-004 1.2e-004 2.1e-004 1.02e+000  8.925963682e-005  -1.860468131e-005 1.2e-004 0.00  
7   3.4e-005 3.4e-005 5.8e-005 1.01e+000  2.373850359e-005  -4.389268847e-006 3.4e-005 0.00  
8   9.2e-006 9.2e-006 1.6e-005 1.01e+000  6.273175074e-006  -1.049389687e-006 9.2e-006 0.00  
9   2.4e-006 2.4e-006 4.2e-006 1.00e+000  1.649538759e-006  -2.545236451e-007 2.4e-006 0.00  
10  6.5e-007 6.5e-007 1.1e-006 1.00e+000  4.321347177e-007  -6.259486109e-008 6.5e-007 0.00  
11  1.7e-007 1.7e-007 2.9e-007 1.00e+000  1.128991865e-007  -1.558439042e-008 1.7e-007 0.00  
12  4.5e-008 4.5e-008 7.7e-008 1.00e+000  2.943872208e-008  -3.919657806e-009 4.5e-008 0.00  
13  1.2e-008 1.2e-008 2.0e-008 1.00e+000  7.665414689e-009  -9.952574739e-010 1.2e-008 0.00  

It is seen that the interior-point optimizer now converges at a much slower rate and takes many more iterations. It is easy to verify that there is no strictly complementary solution in this case.

This gives rise to some comments. First, there may be several ways of representing a set, and they are not all equally effective. Second, maybe we should think about improving the code for problems without a strictly complementary solution, as Gould et al. suggest.




Thursday, September 8, 2011

Conference in Honor of Etienne Loute

Yesterday, I was at a conference honoring Etienne Loute, where I presented the talk "Convex Optimization: Conic Versus Functional Form". The message of the talk is that if you can formulate an optimization problem as a conic quadratic optimization problem (aka SOCP), then there are many good reasons to prefer this form over other forms. Some reasons are:
  • Conic quadratic problems are convex by construction.
  • They are almost as simple as LPs to deal with in software.
  • Duality theory is almost as simple as the linear case.
  • The (Nesterov-Todd) primal-dual algorithm for conic quadratic problems is extremely good.

I had feared that the distinguished audience would consider my talk too simple. However, the speaker before me was Bob Fourer of AMPL. His talk was about checking convexity of general optimization problems formulated in AMPL, and it had two parts. The first part was about checking convexity, and the second part was about how, in some cases, optimization problems in functional form can automatically be converted to conic quadratic form. So the two talks complemented each other very well, and that made me feel better about my talk.

Finally, I would like to mention that Yurii Nesterov was one of the speakers. He must have enjoyed the day, since two speakers were talking about his baby: optimization over symmetric cones.

Thursday, November 4, 2010

Which cones are needed to represent almost all convex optimization problems?

Recently I visited Professor Yinyu Ye at Stanford University, whom I worked with during my Ph.D. studies and with whom I did some nice work. Now we hope we might have some new ideas to explore; more about that later. During my visit I also talked to Stephen Boyd and Jacob Mattingley. Jacob also gave me a presentation of his interesting Ph.D. work, which he defended while I was at Stanford.

During a lunch with Yinyu, Stephen and some other Stanford guys, Stephen said something like: "Almost all convex optimization problems can be formulated using a combination of linear, quadratic, semidefinite and exponential cones." It is a view I share. If it is true, it has an important practical ramification that I will return to shortly. Stephen is aware that his statement is hard to prove, but I think it is true in the same sense that almost all large-scale LPs are sparse: not universally true, but true in practice.

Please note that any convex problem can be formulated in conic form, but Stephen's postulate is that only very few cone types are required. Moreover, currently it is only the exponential cone we do not know how to deal with in practice, because it is a so-called nonsymmetric cone.

Occasionally at MOSEK support we get a question like: "I have a nonlinear model. How do I solve it with MOSEK?" My first reply is always: "Is your model convex?", because MOSEK only deals with convex models. If the user knows what convexity is, then the user will usually reply yes. Maybe the user will add that it looks complicated to solve general convex models with MOSEK. That is particularly true if you do not use a modeling language like AMPL or GAMS. Here Stephen Boyd's observation comes in handy, because if your model does not include exponential terms (or logarithmic terms) or semidefinite terms, then most likely the problem can be formulated as a conic quadratic optimization problem. That is fortunate because
  • conic quadratic optimization problems are as easy as linear problems to deal with in software. The user does not have to bother with gradients and Hessians, for instance.
  • the optimizer for conic quadratic optimization problems is much more robust than the optimizer for general convex optimization problems.
Recently, after I came back from Stanford, a user wrote to MOSEK support: "I have this complicated convex problem that cannot be formulated as a conic quadratic optimization problem. How should I solve it with MOSEK?" After some back and forth I got him to reveal that it essentially contained nonlinear constraints of the type


      max(x,0)^2 <= t

Now that is in fact conic quadratic representable as follows:

     s^2 <= t
     x   <= s


So it turned out that the complicated convex model is conic quadratic representable, contrary to the initial statement.
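As a quick numerical sanity check of the reformulation, here is a small sketch of my own using CVXPY with its default solver (not the customer's actual model). Minimizing t under the two constraints should return max(x,0)^2 for any fixed x:

    import cvxpy as cp

    # Check that  s^2 <= t,  x <= s  represents  max(x, 0)^2 <= t.
    for x_val in [-2.0, 0.5, 3.0]:
        s, t = cp.Variable(), cp.Variable()
        prob = cp.Problem(cp.Minimize(t), [cp.square(s) <= t, x_val <= s])
        prob.solve()
        print(x_val, prob.value, max(x_val, 0.0) ** 2)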


The upshot of Stephen Boyd's statement is that if your model is convex and does not include exponential or logarithmic terms, then it is very likely that it can be represented as a conic quadratic or a semidefinite problem. I think that is a very useful observation in practice.

PS. Usually, when reformulating a problem as a conic quadratic optimization problem, the number of constraints and variables is expanded. Some think that is inefficient. However, it should be noted that the structure introduced is very sparse and hence does not hurt performance much. There are even some cases (think QPs with a low-rank dense Hessian) where the conic quadratic representation is bigger but requires much less space when stored sparsely.

Friday, April 16, 2010

Is x raised to the power 5/3 conic quadratic representable?

The set


   x^(5/3) <= t,   x, t >= 0

can be represented by

x^2 <= 2st,   s, t >= 0,
u   =  x,
v   =  s,
z   =  v,
z^2 <= 2fg,   f, g >= 0,
f   =  0.5,
4g  =  h,
h^2 <= 2uv,   u, v >= 0.

I will leave it to the reader to verify that it is correct.
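If you want a hint, one way the verification can go is the following sketch of mine. Using \(u = x\) and \(z = v = s\),
\[
s^2 = z^2 \leq 2fg = g \quad \mbox{and} \quad h = 4g \geq 4s^2,
\]
so
\[
16 s^4 \leq h^2 \leq 2uv = 2xs \quad \Rightarrow \quad s \leq \frac{1}{2} x^{1/3},
\]
and therefore
\[
x^2 \leq 2st \leq t x^{1/3} \quad \Rightarrow \quad x^{5/3} \leq t.
\]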

I have come across this particular set again and again in financial applications. I suppose it has something to do with the modeling of transaction costs.

An obvious question is why one would replace a simple constraint with something that looks quite a bit more complicated. The answer is that, both in theory and in practice, conic quadratic optimization problems are easier to solve than general convex problems.

Friday, March 19, 2010

A conic quadratic representable set

Assume you have the problem

min v
st.   |x|^3 / y^2 <= v, y>=0

A customer asked me if that can be formulated as a conic quadratic problem. Indeed it can, and my solution is

z^2 <= 2ty
2t  <= a
a^2 <= 2uv
z   =  2u
x   <= z
-x  <= z
0   <= y, t, u, v
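For completeness, here is a quick sketch of my own of why this works. Since \(z = 2u\),
\[
a \geq 2t \geq \frac{z^2}{y} \quad \mbox{and} \quad a^2 \leq 2uv = zv,
\]
so
\[
\frac{z^4}{y^2} \leq zv \quad \Rightarrow \quad \frac{z^3}{y^2} \leq v,
\]
and since \(|x| \leq z\) we obtain \(|x|^3/y^2 \leq z^3/y^2 \leq v\).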