Monday, May 23, 2011

A report from SIAM Optimization 2011

Last week I was at the SIAM Optimization 2011 conference in Darmstadt. It was a very nice conference for the following reasons:
  • It was very easy to get there since Darmstadt is close to the major airport in Frankfurt.
  • The conference center Darmstadtium in the center of Darmstadt was excellent.
  • The hotel was excellent and only 2 mins walk from the conference center.
  • The food was fairly cheap and very good. The same was true for the beer.
  • The scientific program was excellent but also very packed. I did not experience any no-shows.
Some of the major topics at the conference were MINLP (mixed-integer nonlinear optimization) and SDP (semi-definite optimization). These two topics came together in Jon Lee's plenary talk on the last day, because it seems that SDP will play an important role in MINLP. In fact, he concluded his talk by saying: "We need powerful SDP software". Since we plan to support SDP at MOSEK, this was a nice conclusion for us.

One of the most interesting talks I saw was a tutorial by my friend Etienne de Klerk. He discussed a preprocessing technique that can be used in semi-definite optimization to reduce the computational complexity for some problems. The idea is that a big semi-definite variable can be decomposed into a number of smaller semi-definite variables, given that some transformations are performed on the problem data.

Another major topic at the conference was sparse optimization, where sparse means that the solution is sparse. There was a plenary talk about this topic by Steve Wright and a tutorial by Michael Friedlander. Michael presented, among other things, a framework that can be used to understand and evaluate the various algorithms suggested for solving optimization problems with structured sparse solutions. This topic was also addressed in a talk about robust support vector machines by Laurent El Ghaoui, which I liked quite a bit.

Finally, a couple of MOSEK guys gave a presentation in a session. The other speakers in that session were Joachim Lofberg, the author of YALMIP, and Christian Bliek of IBM. Christian talked about the conic interior-point optimizer in CPLEX. One slightly surprising announcement Christian made was that CPLEX will employ the homogeneous model in its interior-point optimizer as the default from version 12.3. In MOSEK the homogeneous model has always been the default.


I have definitely overlooked or forgotten something important from the conference, but with 4 days of conference from early morning to late evening, that is unavoidable, unfortunately.

Wednesday, April 27, 2011

Formulating linear programs is hard!

When I was younger I taught linear programming (LP) to business students. One lesson I learned is that formulating an LP is much harder than it appears when reading a standard textbook. Recently I came across the paper "Formulating Integer Linear Programs: A Rogues' Gallery", which provides many ideas that can be useful when formulating LPs.

Wednesday, April 20, 2011

In need of an optimal basis improvement procedure in linear programming.

One of our MOSEK customers is solving an LP with the primal simplex optimizer. Next she does a sensitivity analysis, but that fails because the optimal basis is singular. This should not happen in an ideal world, but it can happen for at least three reasons:
  • A bug may cause the optimal basis to be singular.
  • The primal simplex optimizer works on a presolved and scaled problem, whereas the sensitivity analysis is performed on the original problem. The basis might be well-conditioned in the presolved and scaled "space" but not in the original space.
  • The simplex optimizer usually starts with a nicely conditioned basis. In each iteration an LU representation of the basis is updated using rank-1 updates, and unless the rank-1 update signals that the basis has become ill-conditioned, the iterations are simply continued. Using an idea by John Tomlin it is possible in some cases to discover during the rank-1 update that the basis has become ill-conditioned. Hence, it might very well be that an ill-conditioned basis goes undetected, particularly if it becomes ill-conditioned in the last simplex iteration.
Since most real-world LPs have multiple optimal basic solutions, looking for the best-conditioned (near-)optimal basis might be very useful before doing sensitivity analysis or even a hot start. Finding the best-conditioned optimal basis is most likely not computationally feasible, but maybe it can be done in an approximate way.
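A minimal sketch of the kind of check I have in mind, using numpy (the function name and the data are made up; this is not what MOSEK does internally):

  import numpy as np

  def basis_condition(A, basic_idx):
      """Estimate the conditioning of the basis B = A[:, basic_idx] in the
      original (unpresolved, unscaled) space before attempting sensitivity
      analysis or a hot start."""
      B = A[:, basic_idx]
      # 2-norm condition number; numpy returns inf if B is exactly singular.
      return np.linalg.cond(B)

  # Hypothetical usage: A is the original constraint matrix and basic_idx are
  # the indices of the basic variables reported by the simplex optimizer.
  A = np.array([[1.0, 1.0, 0.0],
                [1.0, 1.0 + 1e-12, 1.0]])
  print(basis_condition(A, [0, 1]))   # huge number: this basis is nearly singular

If the condition number is huge, one could at least refuse to do the sensitivity analysis, or try to move to a different (near-)optimal basis before proceeding.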

At ISMP 2009 I saw a talk about cutting plane methods. There was some relation between the quality of the cuts generated and the conditioning of the basis.

Btw. the John Tomlin article I am referring to is "An Accuracy Test for Updating Triangular Factors", Mathematical Programming Study 4, pp. 142-145 (1975).

Wednesday, March 2, 2011

Is it safe to move lower bounds to zero?

Assume we have the problem

  min c'x
  st.   A x = b        (P0)
         x >= l

where l is large in absolute value, say l_j = -1000 for all j (l is short for lower bounds). The dual problem is


   max b'y + l's
   st.    A'y + s = c  (D0)
           s >= 0

 It is very common to transform the problem to

  min c'x + c'l
  st.   A x = b- A l  (P1)
         x >= 0

for efficiency reasons. Indeed all interior-point optimizers will do that.

The dual problem is

   max (b - A l)'y + c'l
   st.    A'y + s = c  (D1)
           s >= 0


Let us say we solve (P1) and (D1). Moreover,

   A'y + s =  c

holds only approximately, which will definitely be the case for interior-point methods. To be precise, we have

   A'y + s =  c + e

holds exactly, where 0 < ||e|| << 1.


This implies that if (y,s) from (D1) is reported as an optimal solution to (D0), then there can be a big error in the dual objective value, because b'y + l's = (b - A l)'y + c'l + l'e, and the term l'e can be large when l is large in absolute value. Note that this is not the case if l = 0. Now if we instead report (y, c - A'y) as the optimal dual solution to (D0), then the objective value will be correct, but s >= 0 might be violated.

The question is then which dual solution to report. I guess the answer is that it depends on your priorities.

I will leave it as an exercise for the interested reader to construct a small example demonstrating this, since I just spent all day figuring out that this was happening on an instance with 100K variables.
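Here is a small numeric sketch of the kind of example I have in mind. The data and the residual e are made up; the point is only that a residual of size 1e-8 in (D1) shows up as an error of size |l| times 1e-8 in the (D0) objective.

  import numpy as np

  # A tiny instance of (P0): min c'x  s.t.  A x = b, x >= l, with large lower bounds.
  A = np.array([[1.0, 1.0]])
  b = np.array([1.0])
  c = np.array([1.0, 2.0])
  l = np.array([-1000.0, -1000.0])

  # Suppose the optimizer solved (P1)/(D1) and returned the optimal y = 1, but with
  # a small residual e in the dual constraint, i.e. A'y + s = c + e holds exactly.
  y = np.array([1.0])
  e = np.array([1e-8, 1e-8])
  s = c + e - A.T @ y                # s >= 0, so (y, s) looks perfectly fine

  obj_D1 = (b - A @ l) @ y + c @ l   # dual objective of (D1): -999.0
  obj_D0 = b @ y + l @ s             # the same (y, s) plugged into the (D0) objective
  print(obj_D1 - obj_D0)             # = -l'e, about 2e-5: the 1e-8 residual blown up by |l|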


       

Thursday, November 4, 2010

Which cones are needed to represent almost all convex optimization problems?

Recently I visited Professor Yinyu Ye at Stanford University, whom I worked with during my Ph.D. studies, where we did some nice work together. Now we hope we might have some new ideas to explore; more about that later. During my visit I also talked to Stephen Boyd and Jacob Mattingley. Jacob also gave me a presentation of his interesting Ph.D. work, which he defended while I was at Stanford.

During a lunch with Yinyu, Stephen and some other Stanford guys, Stephen said something like: "Almost all convex optimization problems can be formulated using a combination of linear, quadratic, semi-definite and exponential cones". It is a view I share. Given that it is true, it has an important practical ramification that I will return to shortly. Stephen is aware that his statement is hard to prove, but I think it is true in the same sense as "almost all large-scale LPs are sparse": it is not universally true, but it is true in practice.

Please note that any convex problem can be formulated in conic form, but Stephen's postulate is that only very few cone types are required. Moreover, currently it is only the exponential cone that we do not know how to deal with in practice, because it is a so-called nonsymmetric cone.

Occasionally at MOSEK support we get a question like: "I have a nonlinear model. How do I solve it with MOSEK?" My first reply is always: "Is your model convex?", because MOSEK only deals with convex models. If the user knows what convexity is, then the user will usually reply yes. Maybe the user will add that it looks complicated to solve general convex models with MOSEK. That is particularly true if you do not use a modeling language like AMPL or GAMS. Here Stephen Boyd's observation comes in handy, because if your model does not include exponential terms (or logarithmic terms) or semi-definite terms, then most likely the problem can be formulated as a conic quadratic optimization problem. That is fortunate because
  • conic quadratic optimization problems are as easy as linear problems to deal with software-wise. The user does not have to bother with gradients and Hessians, for instance.
  • the optimizer for conic quadratic optimization problems is much more robust than the optimizer for general convex optimization problems.
Recently, after I came back from Stanford, a user wrote to MOSEK support: "I have this complicated convex problem that cannot be formulated as a conic quadratic optimization problem. How should I solve it with MOSEK?" After some back and forth I got him to reveal that the model essentially had nonlinear constraints of the type


      max(x,0)^2 <= t

Now that is in fact conic quadratic representable as follows

     s^2 <= t
     x    <= s


So, it turned out that the complicated convex model is conic quadratic representable, contrary to the initial statement.
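For what it is worth, here is how the reformulation can be written in a modeling tool such as CVXPY. This is only a sketch with made-up data; it is of course not the user's actual model:

  import cvxpy as cp

  # Epigraph model of max(x, 0)^2 <= t using the auxiliary variable s from above.
  x = cp.Variable()
  t = cp.Variable()
  s = cp.Variable()

  constraints = [cp.square(s) <= t,   # s^2 <= t, handled by the solver as a quadratic cone
                 x <= s,
                 x >= 0.7]            # made-up data so the solution is nontrivial
  prob = cp.Problem(cp.Minimize(t), constraints)
  prob.solve()
  print(t.value)                      # roughly 0.49 = max(0.7, 0)^2

The solver only ever sees linear and conic quadratic constraints, so no gradients or Hessians are needed.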


The upshot of Stephen Boyd's statement is that if your model is convex and does not include exponential or logarithmic terms, then it is very likely it can be represented as a conic quadratic or a semi-definite problem. I think that is a very useful observation in practice.

PS. Usually when reformulating a problem as a conic quadratic optimization problem, the number of constraints and variables is expanded. Some think that is inefficient. However, it should be noted that the structure introduced is very sparse and hence does not hurt performance much. There are even some cases (think QPs with a low-rank dense Hessian) where the conic quadratic representation is bigger but requires much less space when stored sparsely.

Wednesday, August 25, 2010

Exploiting the Farkas certificate in Dantzig-Wolfe decomposition

Paul Rubin asks in the comments of his blog post how to exploit a Farkas certificate in Dantzig-Wolfe decomposition. I will answer that question here since it is very simple and not well-known. It is also fun to answer.

So let us assume you want to solve

min c_1' x_1 + c_2' x_2
st.  A_1 x_1 + A_2 x_2 = b
     B_1 x_1           = f_1              (P)
     B_2 x_2           = f_2
    x_1, x_2 >= 0

In DW the master problem looks like

min h_1' z_1 + h_2' z_2
st.  H_1 z_1 + H_2 z_2 = b      [y]
     e' z_1            = 1      [u_1]
              e' z_2   = 1      [u_2]
     z_1, z_2 >= 0

e is the vector of all ones. y, u_1 and u_2 are dual variables associated with the constraints.        

So in DW you have a master problem, and I will *NOT* assume that it is feasible. On the contrary, I will assume it is infeasible and that the LP optimizer provides a Farkas certificate of infeasibility denoted y, u_1 and u_2, i.e.

b'y + u_1 + u_2 > 0,
H_1'y + e u_1 <= 0,
H_2'y + e u_2 <= 0

So if you do DW, you will have to add columns to the master problem that invalidate the infeasibility certificate of the master problem. Such columns are found by solving the subproblems

min (0 - A_j' y)' x_j - u_j
st.  B_j x_j = f_j
     x_j >= 0

An interpretation of the subproblem is that we find a new column for the master problem that invalidates the Farkas certificate of the master problem as much as possible.

Now it is easy to verify that

  • If the objective value of a subproblem is negative, then the optimal solution can be used to generate a new (useful) column for the master problem.
  • If all the subproblem objective values are nonnegative, then the problem (P) is infeasible. Hint: can you specify a Farkas certificate for (P)?
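To make the procedure concrete, here is a minimal sketch of the pricing step for one block j, using scipy's LP solver. The function name, calling convention and data layout are my own; it is meant to illustrate the idea above, not to be a complete DW implementation.

  import numpy as np
  from scipy.optimize import linprog

  def farkas_pricing(A_j, B_j, f_j, c_j, y, u_j):
      """Solve  min (0 - A_j'y)'x_j - u_j  s.t.  B_j x_j = f_j, x_j >= 0,
      where (y, u_j) is part of the Farkas certificate of the infeasible master.
      Returns (new master column, its true cost) if the certificate can be
      invalidated, and None otherwise."""
      res = linprog(c=-(A_j.T @ y), A_eq=B_j, b_eq=f_j)   # default bounds give x_j >= 0
      if res.status != 0:          # subproblem infeasible or unbounded: not handled here
          return None
      if res.fun - u_j < 0:        # negative objective => the column cuts off the certificate
          x_j = res.x
          return A_j @ x_j, c_j @ x_j
      return None

One would call farkas_pricing for every block j, add any returned columns to the master problem and re-solve it; if no block returns a column, then (P) is infeasible as stated above.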

Wednesday, July 21, 2010

A small exercise in conic quadratic optimization.

Assume we have the small conic quadratic problem

min c'x

st. a'x = b
     x_1 = l_1
     x_2 = l_2
     x_1 >= sqrt(x_2^2 + x_3^2)

c and a are arbitrary 3-dimensional vectors. l_1 and l_2 are scalar constants.

The above problem is in fact a 1-dimensional LP. The exercise is:
  1. Write down the 1-dimensional LP.
  2. State the corresponding dual problem.
  3. Show how to recover the dual solution of the conic problem from the dual solution of the LP.
  4. What if l_1 - |l_2| = 0? Does it give rise to problems?
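If you want to check your answers numerically, here is a small CVXPY sketch of the conic problem with made-up data for a, c, b, l_1 and l_2:

  import cvxpy as cp
  import numpy as np

  # Made-up data; any values with l_1 > |l_2| and a feasible b will do.
  a = np.array([1.0, 2.0, 3.0])
  c = np.array([1.0, 1.0, 1.0])
  b, l1, l2 = 5.0, 2.0, 1.0

  x = cp.Variable(3)
  constraints = [a @ x == b,
                 x[0] == l1,
                 x[1] == l2,
                 cp.SOC(x[0], x[1:])]        # x_1 >= sqrt(x_2^2 + x_3^2)
  prob = cp.Problem(cp.Minimize(c @ x), constraints)
  prob.solve()
  print(prob.value)
  print([con.dual_value for con in constraints])   # compare with your hand-derived duals

Setting l_1 = |l_2| lets you experiment with question 4.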