tag:blogger.com,1999:blog-19529646432374115792018-08-30T17:17:22.938+02:00Erling's blogThis blog is about my work at MOSEK ApS where I am the CEO, a computer programmer and tea maker.Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.comBlogger35125tag:blogger.com,1999:blog-1952964643237411579.post-83205130584215032112017-08-24T09:42:00.001+02:002017-08-24T09:44:02.027+02:00Matrix ordering and graph partitioningSince I spend a lot of time solving sparse linear equation systems, I am also a user of sparse matrix reordering methods. My claim to fame is that I have implemented approximate minimum degree myself and it is used in MOSEK.<br /><br />Below I summarize some interesting links to graph partitioning software:<br /><br /><ul><li><a href="http://glaros.dtc.umn.edu/gkhome/metis/metis/overview">Metis</a> (the most famous).</li><li><a href="http://www.staff.science.uu.nl/~bisse101/Mondriaan/">Mondriaan</a> (I do not know much about this).</li><li><a href="http://www.netlib.org/linalg/spooles/spooles.2.2.html">Spooles</a> (not that active).</li><li><a href="http://epubs.siam.org/doi/abs/10.1137/S0895479896299081?journalCode=sjmael">Multisection</a>, a generalization of bisection.</li></ul><div><br /></div><div><br /></div>Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-37589947640730825562017-08-24T09:39:00.002+02:002017-08-24T09:43:12.777+02:00My Git links and tipsAt my company we have started using the source control system Git instead of Perforce. Here I have collected some Git-related links.<br /><br /><br /><ul><li><a href="https://www.sbf5.com/~cduan/technical/git/">Understanding Git conceptually</a></li></ul>Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-54241153836205573512017-01-09T14:16:00.001+01:002017-01-09T15:22:47.312+01:00Installing TexLive 2016 on Windows Server 2016My first attempt at installing TexLive 2016 on Windows Server 2016 failed. However, the following worked:<br /><br /><ul><li>Download <a href="http://mirror.ctan.org/systems/texlive/tlnet/install-tl-windows.exe" style="background-color: white; color: green;">install-tl-windows.exe</a><span style="background-color: white;"> from </span><a href="http://tug.org/texlive/acquire-netinstall.html">http://tug.org/texlive/acquire-netinstall.html</a></li><li>set PATH=c:\windows\system32;%PATH%</li><li>Run the script install-tl-windows.bat</li></ul><div>The reason is that cmd.exe has to be on the PATH.</div>Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-90184130194058305192016-04-12T10:44:00.001+02:002016-04-12T10:44:14.289+02:00Error handling in BLAS librariesIt is very common to use a BLAS library to perform linear algebra operations such as dense matrix times dense matrix multiplication, which can be performed using the dgemm function. The advantages of BLAS are<br /><br /><ul><li>it is a well-defined standard.</li><li>and hardware vendors such as Intel supply tuned versions.</li></ul><br />Now at MOSEK, my employer, we use the Intel MKL library that includes a BLAS implementation. It really helps us deliver good floating point performance. Indeed we use a sequential version of Intel MKL but call it from potentially many threads using <a href="http://clikplus.org/">Cilk Plus</a>. This works well due to the well-designed BLAS interface. 
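The calling pattern is worth making concrete. Below is a minimal sketch, in Python rather than C, of many application threads each invoking a sequential BLAS-backed matrix product; numpy's matrix multiply stands in for a direct dgemm call, and the matrix sizes and thread count are invented for illustration.

```python
# Sketch: several application-level threads each calling a sequential
# BLAS-backed matrix product. Because dgemm keeps no global state,
# concurrent calls from different threads do not interfere.
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def multiply(pair):
    a, b = pair
    # On 2-D float64 arrays this dispatches to BLAS dgemm.
    return a @ b

rng = np.random.default_rng(0)
pairs = [(rng.standard_normal((50, 50)), rng.standard_normal((50, 50)))
         for _ in range(8)]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(multiply, pairs))

# Each threaded result matches the single-threaded computation.
for (a, b), c in zip(pairs, results):
    assert np.allclose(c, a @ b)
```

Because each call keeps its state on its own stack, no coordination between the threads is needed; that statelessness is the property relied on here.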
However, there is one rotten apple in the basket and that is error handling.<br /><br />Here I will summarize why the error handling in the BLAS standard is awful from my perspective.<br /><br />First of all, why can errors occur in the dgemm operation at all, assuming the dimensions of the matrices are correct and ignoring issues with NaNs and the like? Well, in order to obtain good performance the dgemm function may allocate additional memory to store smallish matrices that fit into the cache. I.e. the library uses a blocked version to improve the performance.<br /><br />Oh wait, that means it can run out of memory, and then what? The BLAS standard error handling is to print a message to stderr or something along those lines.<br /><br />Recall that dgemm is embedded deep inside MOSEK which might be embedded deep inside a third party program. This implies an error message printed to stderr does not make sense to the user. Also the user would NOT like us to terminate the application with a fatal error. Rather we want to know that an out-of-memory situation happened and terminate gracefully, or do something to lower the space requirement, e.g. use fewer threads.<br /><br />What is the solution to this problem? The only solution offered is to replace a function named xerbla that gets called when an error happens. The idea is that the function can set a global flag indicating an error happened. This might be a reasonable solution if the program is single threaded. Now instead assume you use a single threaded dgemm (from say Intel MKL) but call it from many threads. Then first of all you have to introduce a lock (a mutex) around the global error flag, leading to performance issues. Next it is hard to figure out which of the dgemm calls failed. Hence, you have to fail them all. What a pain.<br /><br />Why is the error handling so primitive in BLAS libraries? I think the reasons are:<br /><br /><ul><li>BLAS is an old Fortran-based standard. 
</li><li>For many years BLAS routines would not allocate storage. Hence, dgemm would never fail unless the dimensions were wrong.</li><li>BLAS was proposed by academics, who do not care so much about error handling. I mean, if you run out of memory you just buy a bigger supercomputer and rerun your computations.</li></ul><div>If BLAS had been invented today it would most likely have been designed in C, and then all functions would have returned an error code. I know dealing with error codes is a pain too, but that would have made error reporting much easier for those who want to do it properly.</div><br /><br />Here are some links with background information:<br /><br /><ul><li><a href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/297947">2009 Intel MKL forum post</a> (the last two posts are the most interesting)</li><li><a href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/277562">2012 Intel MKL forum post</a> (by me, on my problems with error handling)</li><li><a href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/595289">2015 Intel MKL forum post</a> (some insight into why things are as they are)</li><li><a href="https://groups.google.com/forum/#!topic/blis-devel/3p75EFRCUQk">BLIS discussion</a> (a newer BLAS library that does not focus on error handling either)</li></ul>Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-54439215279014695532015-05-22T13:33:00.002+02:002015-05-22T13:33:24.082+02:00Nonsymmetric semidefinite optimization problems.<div style="color: #222222; font-family: arial, sans-serif; font-size: small;">In semidefinite optimization we optimize over a matrix variable that must be symmetric and positive semidefinite.</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;">Assume we want to relax the assumption about symmetry. Is that an important generalization? The answer is no, for the following reason: </div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"> (X+X')/2 is PSD</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;">implies</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"> X </div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;">is PSD. 
Observe</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"> X = (X+X')/2+(X-X')/2</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;">and</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"> y'( (X-X')/2) y = 0 for all y,</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;">implying y'Xy = y'((X+X')/2)y and hence X is PSD.</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;">Note (X-X')' = -(X-X'), i.e. X-X' is skew symmetric, which is why its quadratic form vanishes.</div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;"><br /></div><div style="color: #222222; font-family: arial, sans-serif; font-size: small;">Hence, any nonsymmetric semidefinite optimization problem can easily be posed as a standard symmetric semidefinite optimization problem.</div>Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com3tag:blogger.com,1999:blog-1952964643237411579.post-26186694525648742632015-02-18T08:36:00.000+01:002016-12-27T20:01:41.552+01:00Multi threaded programmingI found the talk <a href="https://www.youtube.com/watch?v=4OCUEgSNIAY&feature=youtu.be">Plain Threads are the GOTO of todays computing</a> by Hartmut Kaiser very interesting because I have been working on improving the multithreaded code in MOSEK recently. I am also thinking about how MOSEK should deal with all the cores in the CPUs of the future. 
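To make the contrast with plain threads concrete, here is a minimal fork-join sketch in Python; this is a generic illustration of the task style (not any particular library mentioned in this post): work is forked into tasks and joined by combining the results, with no thread management by hand.

```python
# Fork-join sketch: fork work into chunk-sized tasks, join by combining
# the partial results. This mimics the spawn/sync style of Cilk-like
# libraries without managing any threads by hand.
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(xs, chunks=4):
    n = len(xs)
    bounds = [(i * n // chunks, (i + 1) * n // chunks) for i in range(chunks)]
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        # Fork: one task per chunk of the input.
        futures = [pool.submit(lambda lo, hi: sum(xs[lo:hi]), lo, hi)
                   for lo, hi in bounds]
        # Join: wait for every task and combine the partial sums.
        return sum(f.result() for f in futures)

print(parallel_sum(list(range(1000))))  # 499500
```

Cilk-style spawn/sync and similar task libraries provide this same fork-join structure with far lower overhead than a thread pool in an interpreted language; the point is the shape of the code, not the speed.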
I agree with Hartmut that something other than plain threads is needed.<br /><br />Here are some potential replacements: <br /><ul><li><a href="http://upc.lbl.gov/">Berkeley Unified Parallel C</a></li><li><a href="https://github.com/yosefk/checkedthreads">CheckedThreads</a> Source with a liberal license. Cilk-like. Both C and C++.</li><li><a href="http://clikplus.org/">Cilk Plus</a> Supported by Intel C, gcc and Clang.</li><li><a href="http://stellar.cct.lsu.edu/tag/hpx/">HPX</a> C++ only.</li><li><a href="https://github.com/libocca/occa">Occa</a>. A unified approach to multithreaded languages. <a href="http://arxiv.org/abs/1403.0968">See also</a>.</li><li><a href="http://openmp.org/">OpenMP</a> Supported by many compilers.</li><li><a href="http://www.coin-or.org/projects/PFunc.xml">Pfunc</a> Cilk-like features. Liberal license. Can be called from C but is C++.</li><li><a href="http://starpu.gforge.inria.fr/">Starpu</a></li><li><a href="https://www.threadingbuildingblocks.org/">Threaded building blocks</a>. C++ only.</li><li><a href="https://www.sics.se/~kff/wool/">Wool</a> A pure C implementation that provides something very close to Cilk. </li><li><a href="http://kaapi.gforge.inria.fr/#!index.md">XKAAPI</a></li></ul>Previously I have used OpenMP, but I really dislike it. In my opinion it is somewhat ugly and you can feel the Fortran origins. Recently I have played with Cilk, which is very simple.<br /><br /><b>Checkedthreads</b> seems like a good option if a simple C-only tool can do the job. I plan to try it at MOSEK.<br /><br /><b>Pfunc</b> seems very powerful and there is a Ph.D. thesis about its design. The project might be semi-dead though.<br /><br /><b>Wool</b> also seems very interesting. It is plain C and the author claims the spawn overhead is very low. 
There is also an older <a href="http://soda.swedish-ict.se/3869/1/wool-comparison-final.pdf">comparison</a> with other task libraries.<br /><br />Btw, I am biased towards tools that do not require C++ because currently MOSEK is plain C code. Adding a dependency on a C++ runtime library adds headaches.<br /><br />Some common terminology when working on parallelism:<br /><ul><li><a href="http://en.wikipedia.org/wiki/Fork%E2%80%93join_model">Fork-join</a> parallel model</li><li><a href="http://dreamrunner.org/wiki/public_html/Books%20Review/Pthreads%20Programming/Pthreads%20Programming.html">Pthreads programming</a> and tutorial</li></ul>Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-70659011379011141092013-11-06T14:58:00.001+01:002014-11-18T12:43:47.564+01:00Complexity of solving conic quadratic problemsFirst a clarification: <b>conic quadratic optimization</b> and <b>second order cone optimization</b> are the same thing. I prefer the name conic quadratic optimization though.<br /><br />Frequently it is asked on the internet what the computational complexity of solving conic quadratic problems is, or the related question: what is the complexity of the algorithms implemented in MOSEK, SeDuMi or SDPT3?<br /><br />Here are some typical questions:<br /><ul><li><a href="http://ask.cvxr.com/question/1274/time-complexity-of-sloving-an-socp-problem/">CVX forum 1</a> </li><li><a href="http://ask.cvxr.com/question/1258/time-complexity-of-an-socp-formulation/">CVX forum 2</a></li></ul><div>To the best of my knowledge almost all open source and commercial software employ a primal-dual interior-point algorithm using for instance the so-called Nesterov-Todd scaling. 
<br /><br />A conic quadratic problem can be stated in the form</div>\[<br />\begin{array}{lccl} <br />\mbox{min} & \sum_{j=1}^d (c^j)^T x^j & \\ <br />\mbox{st} & \sum_{j=1}^d A^j x^j & = & b \\ <br />& x^j \in K^j & \\ <br />\end{array} <br />\]<br />where \(K^j\) is an \(n^j\) dimensional quadratic cone. Moreover, I will use \(A = [A^1,\ldots, A^d ]\) and \(n=\sum_j n^j\). Note that \(d \leq n\). First observe the problem cannot be solved exactly on a computer using floating point numbers since the solution might be irrational. This is in contrast to linear problems, which always have a rational solution if the data is rational. <br /><div><br /></div><div>Using for instance the primal-dual interior point algorithm the problem can be solved to \(\varepsilon\) accuracy in \(O(\sqrt{d} \ln(\varepsilon^{-1}))\) interior-point iterations, where \(\varepsilon\) is the accepted duality gap. The most famous variant having that iteration complexity is based on Nesterov and Todd's beautiful work on symmetric cones.</div><div><br /></div>Each iteration requires the solution of a linear system with the coefficient matrix <br />\[ \label{neweq} \left [ \begin{array}{cc} H & A^T \\ A & 0 \\ \end{array} \right ] \mbox{ (*)} \] <br />This is the most expensive operation, and it can be done in \(O(n^3)\) complexity using Gaussian elimination, so we end up with the overall complexity \(O(n^{3.5}\ln(\varepsilon^{-1}))\). <br /><div>That is the theoretical result. In practice the algorithms usually work much better because they normally finish in something like 10 to 100 iterations and rarely employ more than 200 iterations. In fact if the algorithm requires more than 200 iterations then typically numerical issues prevent the software from solving the problem.<br /><br /></div><div>Finally, the typical conic quadratic problem is sparse, and that implies the linear system mentioned above can be solved much faster when the sparsity is exploited. 
Figuring out how to solve the linear equation system (*) with the lowest complexity when exploiting sparsity is NP-hard, and therefore optimization software only employs various heuristics, such as the minimum degree ordering, that help cut the cost of an iteration. If you want to know more then read my Mathematical Programming publication mentioned below. One <b>important fact</b> is that it is impossible to predict the cost of an iteration without knowing the problem structure and then doing a complicated analysis of it. I.e. the cost is not a simple function of the number of constraints and variables unless A is completely dense.<br /><br />To summarize, in practice primal-dual interior-point algorithms solve a conic quadratic problem in less than 200 times the cost of solving the linear equation system (*).</div><br /><div>So can the best proven polynomial complexity bound be proven for software like MOSEK? In general the answer is no, because the software employs a bunch of tricks that speed up the practical performance but unfortunately destroy the theoretical complexity proof. In fact, it is commonly accepted that if the algorithm is implemented strictly as theory suggests then it will be hopelessly slow.</div><div><br /></div><div>I have spent a lot of time implementing interior-point methods, as documented by the <a href="http://dblp.uni-trier.de/db/journals/mp/mp95.html#AndersenRT03">Mathematical Programming publication</a>, and my view is that the practical implementations are very close to theory. </div>Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com1tag:blogger.com,1999:blog-1952964643237411579.post-68421101693434071492013-11-06T08:58:00.002+01:002013-11-13T12:22:01.527+01:00Why I hate OpenMPAt MOSEK we use <a href="http://www.openmp.org/">OpenMP</a> to parallelize our code. Well, in fact we used to use OpenMP exclusively. 
However, we have now moved to use pthreads or native Windows threads.<br /><br />This seems like a very bad idea, because the normal recommendation is to use OpenMP, Cilk, Intel Threading Building Blocks or the like. But I will now explain why I ignore this recommendation and in fact think it is a bit naive.<br /><br />At <a href="http://mosek.com/">MOSEK</a> we build a shared library (a DLL on Windows) that we sell to our customers. This library is called from the user's program or programs like MATLAB, R etc. We have absolutely no control over the calling application. The calling application may be linked with OpenMP or another threading library. Our users expect our library to work almost independently of what they do.<br /><br />However, OpenMP is built on the implicit assumption that you are an academic who has all the source code when building the application, and that if an application is linked with OpenMP then the same version of OpenMP is used throughout the application. This definitely does not hold for us.<br /><br />We have to ask ourselves what happens if the user builds their application linking with MOSEK using gcc OpenMP while we build our library with Intel OpenMP. I think the answer is that the final application crashes.<br /><br />I recently reported a crash in the Intel MKL BLAS library when called from MATLAB that seems to be caused by an OpenMP issue. See the <a href="http://software.intel.com/en-us/forums/topic/486842">report</a>. The answer from Intel is: are you sure you use the same OpenMP as MATLAB? This illustrates very well what I hate about OpenMP: it does not work unconditionally. I mean, if you link with pthreads then it works, because pthreads is integrated into the operating system.<br /><br />So my conclusion is: if you are a commercial library vendor then do not use OpenMP, unless you want to get into trouble.Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com2tag:blogger.com,1999:blog-1952964643237411579.post-60604748639919116262013-06-26T13:03:00.001+02:002013-06-26T13:32:34.282+02:00Slow convergence in conic quadratic optimizationThe recent paper by Gould, Orban, and Robinson in <a href="http://mpc.zib.de/index.php/MPC/article/view/107">Mathematical Programming Computation</a> studies the QP <br />\[<br />\begin{array}{lc}<br />\mbox{min} & x_1^2 \\<br />\mbox{st} & x_1 \geq 0<br />\end{array}<br />\]<br />because the problem does not have a strictly complementary solution. This fact leads interior-point methods to converge slowly.<br /><br />An interesting observation is that it can be formulated as the following conic quadratic optimization problem:<br />\[<br />\begin{array}{lc}<br />\mbox{min} & x_2 \\<br />\mbox{st} & x_2 \geq ||x_1|| \\<br /> & x_1 \geq 0 <br />\end{array}<br />\]<br />An optimal primal solution is \(x_1 = 0\) and \(x_2=0\). Observe that Vanderbei studies the same problem in his <a href="http://www.orfe.princeton.edu/~rvdb/ps/socp.pdf">working paper</a> (Section 2.2) on solving SOCPs with LOQO.<br /><br />If I solve the problem with <a href="http://mosek.com/">MOSEK</a> I obtain<br /><br /><pre>ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME <br />0 1.0e+000 1.0e+000 2.0e+000 0.00e+000 1.000000000e+000 0.000000000e+000 1.0e+000 0.00 <br />1 1.6e-001 1.6e-001 3.2e-001 1.00e+000 2.404340377e-001 0.000000000e+000 1.6e-001 0.00 <br />2 8.2e-003 8.2e-003 1.6e-002 1.12e+000 1.181687831e-002 0.000000000e+000 8.2e-003 0.00 <br />3 4.1e-004 4.1e-004 8.2e-004 1.01e+000 5.891864452e-004 0.000000000e+000 4.1e-004 0.00 <br />4 2.1e-005 2.1e-005 4.1e-005 1.00e+000 2.945495827e-005 0.000000000e+000 2.1e-005 0.00 <br />5 2.6e-010 2.6e-010 5.2e-010 1.00e+000 4.144019664e-010 0.000000000e+000 2.6e-010 0.00 <br /></pre><br />As can be seen from the mu column (average complementarity gap), MOSEK converges quickly. 
In fact there is quadratic convergence in the last iteration. Now the dual problem is<br />\[<br />\begin{array}{lc}<br />\mbox{max} & 0 \\<br />\mbox{st} & s_3 = 1 \\<br /> & s_2 \geq ||s_1|| \\<br /> & s_1 \geq 0 <br />\end{array}<br />\]<br />and an optimal dual solution is \(s_1=0.0\), \(s_2=1.0\) and \(s_3=1\). In this case the dual solution is in the interior, and hence the solutions are strictly complementary. This is also indicated by the interior-point optimizer in MOSEK, which converges quickly.<br /><br />An equivalent conic quadratic problem using a rotated quadratic cone can be stated as follows<br />\[<br />\begin{array}{lc}<br />\mbox{min} & x_2 \\<br />\mbox{st} & 2 x_3 x_2 \geq ||x_1||^2 \\<br /> & x_3 =1 \\<br /> & x_1 \geq 0 <br />\end{array}<br />\]<br />Now solving that problem with MOSEK produces<br /><pre>ITE PFEAS DFEAS GFEAS PRSTATUS POBJ DOBJ MU TIME <br />0 1.0e+000 1.0e+000 1.7e+000 0.00e+000 7.071067812e-001 0.000000000e+000 1.0e+000 0.00 <br />1 1.4e-001 1.4e-001 2.4e-001 6.09e-001 1.517397994e-001 -4.749764644e-002 1.4e-001 0.00 <br />2 2.2e-002 2.2e-002 3.7e-002 1.21e+000 2.120093451e-002 -6.971441263e-003 2.2e-002 0.00 <br />3 5.3e-003 5.3e-003 9.0e-003 1.14e+000 4.674957163e-003 -1.445168902e-003 5.3e-003 0.00 <br />4 1.5e-003 1.5e-003 2.6e-003 1.07e+000 1.234899144e-003 -3.403939914e-004 1.5e-003 0.00 <br />5 4.4e-004 4.4e-004 7.4e-004 1.04e+000 3.331026029e-004 -7.962820209e-005 4.4e-004 0.00 <br />6 1.2e-004 1.2e-004 2.1e-004 1.02e+000 8.925963682e-005 -1.860468131e-005 1.2e-004 0.00 <br />7 3.4e-005 3.4e-005 5.8e-005 1.01e+000 2.373850359e-005 -4.389268847e-006 3.4e-005 0.00 <br />8 9.2e-006 9.2e-006 1.6e-005 1.01e+000 6.273175074e-006 -1.049389687e-006 9.2e-006 0.00 <br />9 2.4e-006 2.4e-006 4.2e-006 1.00e+000 1.649538759e-006 -2.545236451e-007 2.4e-006 0.00 <br />10 6.5e-007 6.5e-007 1.1e-006 1.00e+000 4.321347177e-007 -6.259486109e-008 6.5e-007 0.00 <br />11 1.7e-007 1.7e-007 2.9e-007 1.00e+000 1.128991865e-007 
-1.558439042e-008 1.7e-007 0.00 <br />12 4.5e-008 4.5e-008 7.7e-008 1.00e+000 2.943872208e-008 -3.919657806e-009 4.5e-008 0.00 <br />13 1.2e-008 1.2e-008 2.0e-008 1.00e+000 7.665414689e-009 -9.952574739e-010 1.2e-008 0.00 <br /></pre><br />It is seen that the interior-point optimizer now converges at a much slower rate and takes many more iterations. It is easy to verify that there is no strictly complementary solution in this case.<br /><br />This gives rise to some comments. There might be several ways of representing a set, and they are not all equally effective. Secondly, maybe we should think about improving the code for problems without a strictly complementary solution, as Gould et al. suggest.<br /><br /><br /><br /><br />Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-56816186218292294582012-05-25T13:53:00.003+02:002012-05-25T13:53:50.461+02:00An example of a badly scaled model with an easy fixA scaled-down version of a problem we just received at MOSEK is<br /><br />min s^2<br />st. Ax - b = s (P)<br /> x >= 0 <br /><br />Assume for simplicity A is a matrix with 1 row. Now in this case the elements in A are very large, i.e. of the order of magnitude 10^7. This implies the problem (P) is badly scaled because the row contains a lot of elements of size 10^7 and one of size 1.<br /><br />Now an equally good model is<br /><br /><br />min s^2<br />st. Ax - b = (||A||_inf) s (P')<br /> x >= 0 <br /><br /><br />which is nicely scaled. An optimal solution to (P') is clearly also optimal to (P), after rescaling s!<br /><br />Note this line of thinking could be relevant anywhere penalty variables are employed. Many optimization problems contain penalty variables.<br /><br /><br /><br />Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-73349191904744488562011-12-09T07:41:00.001+01:002012-10-01T13:56:48.359+02:00An observation about ns1687037<br />Hans Mittelmann tests various optimization software packages on his <a href="http://plato.asu.edu/bench.html">benchmark website</a>. One of the test problems is <span class="Apple-style-span" style="white-space: pre;"><span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;">ns1687037</span><span class="Apple-style-span" style="font-family: inherit;">. An inspection of that problem reveals many constraint blocks of the form</span></span><br /><br />R0002624: 5.015e+004 C0024008 ....+ C0025749 = 5.0307748e+007<br />R0002625: 0e+000 <= - C0025749 + C0025750<br />R0002626: 0e+000 <= C0025749 + C0025750<br />R0002627: 1.9877659e-008 C0025750 - C0025751 = 0e+000<br /> C0025749 is free<br /><br />These constraints imply<br /><br /> -C0025750 <= C0025749 <= C0025750<br />1.9877659e-008 C0025750 = C0025751<br /><br />and hence<br /><br />1.9877659e-008 abs( C0025749) <= C0025751<br /><br />In other words, variable C0025749 is an elastic/penalty variable for constraint R0002624. Moreover, we see variable C0025751 equals a tiny multiple (1.9877659e-8) of the elastic variable in constraint R0002624. Now the right-hand side of constraint R0002624 is 10^7, and hence it is more or less the rounding error that is penalized. That makes me wonder if that model provides reliable results or does what the author wants!<br /><div><br /></div>Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com1tag:blogger.com,1999:blog-1952964643237411579.post-19169752874464963362011-11-23T09:21:00.001+01:002011-11-24T08:56:28.808+01:00A follow-up on lower bounds on fill-inThis is a follow-up on my previous post on lower bounds on the fill-in when doing a sparse Cholesky factorization.<br /><br />I posted my question to the NA-NET and below is a commented version of the replies I got.<br /><br />Sivan Toledo pointed out the paper:<br /><br /><table><tbody><tr><td><ul><li>A Polynomial Approximation Algorithm for the Minimum Fill-In Problem</li><li>A. Natanzon, R. Shamir and R. Sharan</li><li> <a href="http://siamdl.aip.org/getpdf/servlet/GetPDFServlet?filetype=pdf&id=SMJCAT000030000004001067000001&idtype=cvips" target="_blank"><i>SIAM Journal on Computing</i></a>, Vol. 30, No. 4, pp. 1067-1079 (2000)</li></ul></td></tr><tr><td><br /></td></tr></tbody></table>and mentioned there might be a paper by Phil Klein. Most likely it is the report:<br /><ul><li>Approximating fill in solving sparse linear systems </li><li>A. Agrawal, P. Klein and R. Ravi</li><li>Download link: <a href="ftp://ftp.cs.brown.edu/pub/techreports/91/cs91-19.pdf">ftp://ftp.cs.brown.edu/pub/techreports/91/cs91-19.pdf</a>.</li></ul>and the paper<br /><ul><li><span style="font-family: Georgia,Times New Roman,Times,serif;">Cutting down on fill using nested dissection: provably good elimination orderings</span> </li><li>Ajit Agrawal, Philip N. Klein, and R. Ravi<span style="font-family: Georgia,Times New Roman,Times,serif;"> </span></li><li><span style="font-family: Georgia,Times New Roman,Times,serif;">Graph Theory and Sparse Matrix Computation, edited by A. George, J. Gilbert, and J. W. H. Liu, volume 56 in the IMA Volumes in Mathematics and its Applications, Springer-Verlag (1993), pp. 
31-55.</span></li></ul>Related to the above work is the Ph.D. thesis by Wee-Liang Heng with the title "Approximately Optimal Elimination Orderings for Sparse Matrices". I have a printed copy somewhere but cannot locate it now.<br /><br />The above-mentioned works provide algorithms for computing a good symmetric permutation. However, to the best of my knowledge they are not used in practice, but they are definitely something that I should check out.<br /><br />Esmond Ng replied and said it is hard to come up with lower bounds for general matrices, but mentioned bounds can be obtained for special matrices. The relevant papers are<br /><br /><ul><li> Complexity Bounds for Regular Finite Difference and Finite Element Grids</li><li>Hoffman, Alan J.; Martin, Michael S.; Rose, Donald J.</li><li> SIAM Journal on Numerical Analysis, vol. 10, no. 2, pp. 364-369.</li></ul><div>and</div><div></div><div><ul><li>Nested Dissection of a Regular Finite Element Mesh </li><li>A. George </li><li>SIAM J. Numer. Anal. 10, pp. 345-363.</li></ul></div><div>I am aware of this work, but I was mainly looking for information about general matrices since the matrices I encounter in MOSEK almost never have grid structure. MOSEK is an optimization package and hence the underlying applications are mostly from economics and planning, which give rise to very different matrices than those coming from physics applications. </div><div></div><div>Jeff Ovall pointed to a different line of research in his reply:</div><div></div><div><div><i>If you are willing to relax your notion of fill-in a bit, then I may be able to point you in a helpful direction. Instead of thinking of fill-in in terms of putting non-zeros where there used to be zeros, one can also think of it as (slightly) increasing the rank of low-rank blocks. 
For example, the inverse</i></div><div><i>of the tridiagonal matrix with stencil (-1,2,-1) has no zero entries, but the rank of any off-diagonal block is precisely one, so the "fill-in" in this sense is small (requiring only O(n log n) storage instead of n^2). Hierarchical matrices (Hackbusch, Grasedyck, Boerm, Bebendorf, ... ) exploit this notion of low-rank fill-in not only to compress the "important" information in a matrix, but also to maintain a compressed format while performing factorizations (LU, Cholesky). From the algebraic point of view, it is the Rank-Nullity Theorem which implies that inverses of sparse matrices will have large blocks which are of low rank. If the LU-factorization is thought of block-wise, then this theorem also has something to say about the ranks of the blocks which appear, though it is not obvious to me how to get sharp lower-bounds. The paper: </i></div><ul><li><i><span style="font-family: arial;">The interplay of ranks of submatrices</span></i></li><li><i><span style="font-family: arial;"><span class="il">Strang</span>, G. & Nguyen, T.</span></i></li><li><i><span style="font-family: arial;"><i>SIAM Rev., </i><b>2004</b>, 46, 637-646 (electronic)</span></i> </li></ul></div><div></div><div></div><div>To summarize, there does not seem to be any good lower bound on the minimal amount of fill-in possible when computing a sparse Cholesky factorization of a symmetrically permuted matrix.</div><br /><br /><br /><br />Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com4tag:blogger.com,1999:blog-1952964643237411579.post-20457037422638923682011-11-17T07:30:00.001+01:002011-11-23T09:20:31.744+01:00Lower bounds on the minimal fill-in when factorizing a positive symmetric matrix. 
Any help out there?<span style="font-family: inherit; font-size: small;">When computing a Cholesky factorization of a positive definite symmetric matrix A, it is well-known that a suitable symmetric reordering helps keep the number of fill-ins and the total number of flops down.<br /><br />Finding the optimal ordering is NP-hard, but good orderings can be obtained with the minimum degree and nested dissection (aka graph partitioning based) algorithms. Those algorithms provide an upper bound on the minimal amount of fill-in. However, I wonder whether there is any way to compute a nontrivial lower bound on the minimal amount of fill-in?<br /><br />I have been searching the literature but have not found any good reference. Do you have any suggestions? The problem sounds hard, I know.<br /><br />Why is a lower bound important? Well, it would help evaluate the quality of the ordering algorithms that we have implemented in MOSEK (www.mosek.com). Moreover, the minimum degree ordering is usually computationally cheap whereas nested dissection is expensive. A lower bound could help me determine when the minimum degree ordering potentially could be improved by using nested dissection.</span><br /><span style="font-family: inherit; font-size: small;"><br /></span><span style="font-family: inherit; font-size: small;"></span><br /><span style="font-family: inherit; font-size: small;"><br /></span><span style="font-family: inherit; font-size: small;"></span>Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-64816984729040568692011-09-08T13:40:00.000+02:002011-09-08T13:40:53.782+02:00Conference in Honor of Etienne LouteYesterday, I was at a conference <a href="http://www.uclouvain.be/en-365494.html">honoring</a> Etienne Loute where I presented the talk "<a href="http://mosek.com/fileadmin/talks/func_versus_conic.pdf">Convex Optimization: Conic Versus Functional Form</a>". The message of the talk is that if you can formulate an optimization problem as a conic quadratic optimization problem (aka SOCP), then there are many good reasons to prefer this form over other forms. Some reasons are:<br /><ul><li>Conic quadratic problems are convex by construction.</li><li>They are almost as simple as LPs to deal with in software.</li><li>Duality theory is almost as simple as in the linear case.</li><li>The (Nesterov-Todd) primal-dual algorithm for conic quadratic problems is extremely good.</li></ul><br />I had feared that the distinguished audience would consider my talk too simple. However, the speaker before me was Bob Fourer of AMPL. His talk was about checking convexity of general optimization problems formulated in AMPL, and it had two parts. The first part was about checking convexity, and the second part was about how, in some cases, optimization problems in functional form can automatically be converted to a conic quadratic optimization problem. So the two talks complemented each other very well, and that made me feel better about my talk.<br /><br />Finally, I would like to mention that Yurii Nesterov was one of the speakers. He must have enjoyed the day, since two speakers were talking about his baby named optimization over symmetric cones.Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com2tag:blogger.com,1999:blog-1952964643237411579.post-75329827758661733082011-05-23T11:53:00.000+02:002011-05-23T11:53:32.097+02:00A report from SIAM Optimization 2011Last week I was at the SIAM Optimization 2011 conference in Darmstadt. It was a very nice conference for the following reasons:<br /><ul><li>It was very easy to get there since Darmstadt is close to the major airport in Frankfurt.</li><li>The conference center Darmstadtium in the center of Darmstadt was excellent.</li><li>The hotel was excellent and only a 2 minute walk from the conference center.</li><li>The food was fairly cheap and very good. The same was true for the beer.</li><li>The scientific program was excellent but also very packed. I did not experience any no-shows.</li></ul>Some of the major topics at the conference were MINLP (mixed-integer nonlinear optimization) and SDP (semi-definite optimization). These two topics kind of came together in the plenary talk of Jon Lee on the last day, because it seems that SDP will play an important role in MINLP. In fact he concluded his talk by saying: "We need powerful SDP software". Since we plan to support SDP at MOSEK, this was a nice conclusion for us.<br /><br />One of the most interesting talks I saw was a tutorial by my friend Etienne de Klerk. He discussed a preprocessing technique that can be used in semi-definite optimization to reduce the computational complexity for some problems. The idea is that a big semi-definite variable can be decomposed into a number of smaller semi-definite variables, given that some transformations are performed on the problem data.<br /><br />Another major topic at the conference was sparse optimization, where by sparse is meant that the solution is sparse. There was a plenary talk about this topic by Steve Wright and a tutorial by Michael Friedlander. 
Michael presented, among other things, a framework that can be used to understand and evaluate the various algorithms suggested for solving optimization problems with structured sparse solutions. This topic was also addressed in a talk about robust support vector machines by Laurent El Ghaoui, which I liked quite a bit.<br /><br />Finally, a couple of MOSEK guys gave a presentation in a session. The other speakers in that session were Joachim Lofberg, the author of YALMIP, and Christian Bliek of IBM. Christian talked about the conic interior-point optimizer in CPLEX. One slightly surprising announcement Christian made was that CPLEX will employ the homogeneous model in their interior-point optimizer as the default from version 12.3. In MOSEK the homogeneous model has always been the default.<br /><br /><br />I have definitely overlooked or forgotten something important from the conference, but with a 4 day conference from early morning to late evening that is unavoidable, unfortunately.Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-14729729193004603752011-04-27T08:20:00.000+02:002011-04-27T08:20:45.104+02:00Formulating linear programs is hard!When I was younger I taught linear programming (LP) to business students. One lesson I learned is that formulating an LP is much harder than it appears when reading a standard textbook. Recently I came across the paper <a href="http://faculty.nps.edu/gbrown/docs/Brown_Dell_INFORMS_Transactions_on_Education_January2007.pdf">"Formulating Integer Linear Programs: A Rogues' Gallery"</a> which provides many ideas that can be useful when formulating LPs.Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-29291913559355767812011-04-20T14:57:00.001+02:002011-04-20T14:59:44.033+02:00In need of an optimal basis improvement procedure in linear programming.One of our MOSEK customers is solving an LP with the primal simplex optimizer. Next she does a sensitivity analysis, but that fails because the optimal basis is singular. This should not happen in an ideal world but can happen for at least three reasons:<br /><ul><li>A bug may cause the optimal basis to be singular.</li><li>The primal simplex optimizer works on a presolved and scaled problem, whereas the sensitivity analysis is performed on the original problem. The basis might be well-conditioned in the presolved and scaled "space" but not in the original space.</li><li>The simplex optimizer usually starts with a nicely conditioned basis. In each iteration an LU representation of the basis is updated using rank-1 updates. Using an idea by <a href="http://www.thetomlins.org/">John Tomlin</a> it is possible in some cases to discover during the rank-1 update that the basis has become ill-conditioned, in which case the basis is refactorized before the iterations are continued. Hence, it might very well be that an ill-conditioned basis is not discovered, particularly if it becomes ill-conditioned in the last simplex iteration.</li></ul>Since most real-world LPs have multiple optimal basic solutions, looking for the best conditioned (near) optimal basis might be very useful before doing sensitivity analysis or even hotstart. Finding the best conditioned optimal basis is most likely not computationally feasible, but maybe it can be done in an approximate way.<br /><br />At ISMP 2009 I saw a talk about cutting plane methods. There was some relation between the quality of the cuts generated and the conditioning of the basis.<br /><br />Btw. 
the John Tomlin article I am referring to is "An Accuracy Test for Updating Triangular Factors", <i>Mathematical Programming Study 4</i>, pp. 142-145 (1975).Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com2tag:blogger.com,1999:blog-1952964643237411579.post-13421252082819970672011-03-02T14:39:00.000+01:002011-03-02T14:39:33.189+01:00Is it safe to move lower bounds to zero?Assume we have the problem<br /><br /> min c'x<br /> st. A x = b (P0)<br /> x >= l<br /><br />where l is large in absolute value, say l_j = -1000 for all j. (l is short for lower bounds.) The dual problem is<br /><br /><br /> max b'y + l's <br /> st. A'y + s = c (D0) <br /> s >= 0<br /><br /> It is very common to transform the problem to <br /><br /> min c'x + c'l<br /> st. A x = b - A l (P1)<br /> x >= 0<br /><br />for efficiency reasons. Indeed all interior-point optimizers will do that. <br /><br />The dual problem is<br /><br /> max (b - A l)'y + c'l<br /> st. A'y + s = c (D1)<br /> s >= 0<br /><br /><br />Let us say we solve (P1) and (D1). Moreover,<br /><br /> A'y + s = c <br /><br />holds only approximately, which will definitely be the case for interior-point methods. To be precise,<br /><br /> A'y + s = c + e<br /><br />holds exactly, where 0 < ||e|| << 1.<br /><br /><br />This implies that if (y,s) from (D1) is reported as an optimal solution to (D0), then there can be a big error in the dual objective value. Note that this is not the case if l = 0. Now if instead we report (y, c - A'y) as the optimal dual solution to (D0), then the objective value will be correct, but then s >= 0 might be violated.<br /><br />The question is which dual solution to report. I guess the answer is that it depends on your priorities. <br /><br />I will leave it as an exercise to the interested reader to construct a small example demonstrating this, since I just spent all day figuring out that this was happening on an instance with 100K variables.<br /><br /><br /> Erling D. 
Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com1tag:blogger.com,1999:blog-1952964643237411579.post-1013140039950036222010-11-04T14:43:00.000+01:002010-11-04T14:43:14.804+01:00Which cones are needed to represent almost all convex optimization problems?Recently I visited my professor Yinyu Ye at Stanford University. I worked with him during my Ph.D. studies, where we did some nice work together. Now we hope we might have some new ideas to explore. More about that later. During my visit I also talked to <a href="http://www.stanford.edu/%7Eboyd/">Stephen Boyd</a> and Jacob Mattingley. Jacob also gave me a presentation of his <a href="http://stanford.edu/%7Eboyd/papers/code_gen_impl.html">interesting Ph.D. work</a>, which he defended while I was at Stanford.<br /><br />During a lunch with Yinyu, Stephen and some other Stanford guys, Stephen said something like: "Almost all convex optimization problems can be formulated using a combination of linear, quadratic, semi-definite and the exponential cones". It is a view I share. Given that it is true, it has an important practical ramification that I will return to shortly. Stephen is aware that his statement is hard to prove, but I think it is true in the same sense as "almost all large scale LPs are sparse": it is not universally true, but it is true in practice.<br /><br />Please note that any convex problem can be formulated in conic form, but Stephen postulates that only very few cone types are required. Moreover, currently it is only the exponential cone we do not know how to deal with in practice, because it is a so-called nonsymmetric cone.<br /><br />Occasionally at MOSEK support we get a question like: "I have a nonlinear model. How do I solve it with MOSEK?" My first reply is always: "Is your model convex?" because MOSEK only deals with convex models. If the user knows what convexity is, then the user will usually reply yes. 
Maybe the user will add that it looks complicated to solve general convex models with MOSEK. That is particularly true if you do not use a modeling language like AMPL or GAMS. Here Stephen Boyd's observation comes in handy, because if your model does not include exponential terms (or logarithmic terms) or semi-definite terms, then most likely the problem can be formulated as a conic quadratic optimization problem. That is fortunate because<br /><ul><li>conic quadratic optimization problems are as easy as linear problems to deal with software-wise. The user does not have to bother with gradients and Hessians, for instance.</li><li>the optimizer for conic quadratic optimization problems is much more robust than the optimizer for general convex optimization problems.</li></ul>Recently, after I came back from Stanford, a user wrote to MOSEK support: "I have this complicated convex problem that cannot be formulated as a conic quadratic optimization problem. How should I solve it with MOSEK?" After some back and forth I got him to reveal that it essentially had nonlinear functions of the type<br /><br /><br /> max(x,0)^2 <= t<br /><br />Now that is in fact conic quadratic representable as follows<br /><br /> s^2 <= t<br /> x <= s<br /><br /><br />So it turned out that the complicated convex model is conic quadratic representable, contrary to the initial statement. <br /><br /><br />The upshot of Stephen Boyd's statement is that if your model is convex and does not include exponential or logarithmic terms, then it is very likely that it can be represented as a conic quadratic or a semi-definite problem. I think that is a very useful observation in practice.<br /><br />PS. Usually when reformulating a problem as a conic quadratic optimization problem, the number of constraints and variables is expanded. Some think that is inefficient. However, it should be noted that the structure introduced is very sparse and hence does not hurt performance much. 
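As a sanity check of the max(x,0)^2 reformulation above, here is a small sketch of mine (not part of the original support exchange): for any fixed x, the smallest t satisfying s^2 <= t and x <= s for some s is exactly max(x,0)^2, because the best choice of the auxiliary variable is s = max(x,0).

```python
def smallest_t(x):
    # The smallest s with s >= x whose square is minimal is s = max(x, 0):
    # for x <= 0 we may take s = 0, for x > 0 we are forced to take s >= x.
    s = max(x, 0.0)
    # The smallest t with s^2 <= t.
    return s * s

# The conic reformulation reproduces max(x, 0)^2 for any sign of x.
for x in (-3.0, -0.5, 0.0, 0.7, 2.0):
    assert abs(smallest_t(x) - max(x, 0.0) ** 2) < 1e-12
```

Note that s does not even need an explicit sign constraint: for negative x the optimizer is free to pick s = 0, which already gives t = 0.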
There are even some cases (think QPs with a low-rank dense Hessian) where the conic quadratic representation is bigger but requires much less space when stored sparsely.Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com1tag:blogger.com,1999:blog-1952964643237411579.post-49884659816714379612010-08-25T09:08:00.003+02:002010-11-03T14:26:03.641+01:00Exploiting the Farkas certificate in Dantzig-Wolfe decomposition<div style="margin: 0px;">Paul Rubin asks in the comments of his <a href="http://orinanobworld.blogspot.com/2010/07/infeasible-lps-and-farkas-certificates.">blog post</a> how to exploit a Farkas certificate in Dantzig-Wolfe decomposition. I will answer that question here since it is very simple and not well-known. It is also fun to answer.</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">So let us assume you solve</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">min c_1' x_1 + c_2' x_2</div><div style="margin: 0px;">st. A_1 x_1 + A_2 x_2 = b</div><div style="margin: 0px;"> B_1 x_1 = f_1 (P)</div><div style="margin: 0px;"> B_2 x_2 = f_2</div><div style="margin: 0px;"> x_1, x_2 >= 0</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">In DW the master problem looks like</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">min h_1' z_1 + h_2' z_2</div><div style="margin: 0px;">st H_1 z_1 + H_2 z_2 = b [y]</div><div style="margin: 0px;"> e' z_1 = 1 [u_1]</div><div style="margin: 0px;"> e' z_2 = 1 [u_2]</div><div style="margin: 0px;"> z_1, z_2 >= 0</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">e is the vector of all ones. y, u_1 and u_2 are the dual variables associated with the constraints. </div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">So in DW you have a master problem, and I will *NOT* assume that it is feasible. 
On the other hand I will assume</div><div style="margin: 0px;">it is infeasible and that the LP optimizer provides a Farkas certificate of infeasibility denoted y, u_1 and u_2, i.e.</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">b'y + u_1 + u_2 > 0,</div><div style="margin: 0px;">H_1'y + e u_1 <= 0,</div><div style="margin: 0px;">H_2'y + e u_2 <= 0</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">So if you do DW you will have to add columns to the master problem that invalidate the infeasibility certificate of the master problem. Such columns are found using the subproblems</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">min (0 - A_j'y)' x_j - u_j</div><div style="margin: 0px;">st. B_j x_j = f_j</div><div style="margin: 0px;"> x_j >= 0</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">An interpretation of the subproblem is that we find a new column for the master problem that invalidates the Farkas certificate of the master problem as much as possible.</div><div style="margin: 0px;"><br /></div><div style="margin: 0px;">Now it is easy to verify that</div><div style="margin: 0px;"><br /></div><ul><li>If the objective value of a subproblem is negative, then the optimal solution can be used to generate a new (useful) column for the master problem.</li><li>If all the subproblem objective values are nonnegative, then the problem (P) is infeasible. Hint: can you specify a Farkas certificate for (P)?</li></ul>Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com4tag:blogger.com,1999:blog-1952964643237411579.post-51913639950644155442010-07-21T10:07:00.000+02:002010-07-21T10:07:29.233+02:00A small exercise in conic quadratic optimization.Assume we have the small conic quadratic problem<br /><br />min c'x<br /><br />st. ax=b<br /> x_1 = l_1<br /> x_2 = l_2<br /> x_1 >= sqrt(x_2^2 + x_3^2)<br /><br />c and a are arbitrary 3 dimensional vectors. 
l_1 and l_2 are scalar constants.<br /><br />The above problem is in fact a 1 dimensional LP. The exercise is: <br /><ol><li>Write down the 1 dimensional LP.</li><li>State the corresponding dual problem.</li><li>Show how to recover the dual solution to the conic problem from the dual solution to the LP.</li><li>What if l_1 - |l_2| = 0? Does it give rise to problems?</li></ol>Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-91196320982710555402010-04-20T11:40:00.000+02:002010-04-20T11:40:38.402+02:00A question about the conic representation of the power function.My previous blog item discussed how to represent the set<br /><br />x^(5/3) <= t, x >= 0<br /><br />using linear and quadratic cones. Now the book by Ben-Tal and Nemirovski shows how to represent the set<br /><br /><br />x^(p/q) <= t, x >= 0 (1)<br /><br />using conic quadratic cones, where p and q are integers and p >= q >= 1, which is much more general than my example of course. Now the method suggested by Ben-Tal and Nemirovski is not optimal in a complexity sense. Hence, I do not think it leads to the minimal number of quadratic cones. Therefore, a natural question is: what is the optimal conic quadratic representation of (1)?Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com2tag:blogger.com,1999:blog-1952964643237411579.post-91896026329173606912010-04-16T12:36:00.000+02:002010-04-16T12:36:16.114+02:00Is x raised to the power 5/3 conic quadratic representable?The set<br /><br /><br /> x^(5/3) <= t, x,t >= 0 <br /><br />can be represented by<br /><br />x^2 <= 2 s t, s,t >= 0,<br />u = x,<br />v = s, <br />z = v, <br />z^2 <= 2fg, f,g >= 0, <br />f = 0.5,<br />4 g = h, <br />h^2 <= 2uv, u,v >= 0.<br /><br />I will leave it to the reader to verify that it is correct.<br /><br />This particular set has come up again and again in financial applications. 
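To convince yourself that the x^(5/3) construction above is tight, it helps to eliminate the auxiliary variables by hand; the following is my own sketch (assuming x > 0): the cone chain forces s^3 <= x/8, and plugging the largest such s into x^2 <= 2st gives exactly t >= x^(5/3).

```python
def min_t_via_cones(x):
    # Chain the rotated cones from the construction above (x > 0 assumed):
    #   z = v = s and z^2 <= 2*f*g with f = 0.5  =>  s^2 <= g
    #   h = 4*g and h^2 <= 2*u*v with u = x, v = s  =>  16*g^2 <= 2*x*s
    # Combining: s^4 <= g^2 <= x*s/8, i.e. s^3 <= x/8.
    s = (x / 8.0) ** (1.0 / 3.0)   # largest feasible s
    return x * x / (2.0 * s)       # smallest t with x^2 <= 2*s*t

# The minimal t coincides with x^(5/3).
for x in (0.5, 1.0, 8.0, 10.0):
    assert abs(min_t_via_cones(x) - x ** (5.0 / 3.0)) < 1e-9
```

For example, x = 8 gives s = 1 and t = 32 = 8^(5/3), so the representation introduces no slack.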
I suppose it has something to do with the modeling of transaction costs.<br /><br />An obvious question is why replace a simple constraint by something that looks quite a bit more complicated. However, both in theory and in practice conic quadratic optimization problems are easier to solve than general convex problems.Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-88180202939238302232010-03-19T11:08:00.001+01:002010-03-23T12:33:44.869+01:00A conic quadratic representable set.Assume you have the problem<br /><br />min v<br />st. |x|^3 / y^2 <= v, y >= 0<br /><br />A customer asked me if that can be formulated as a conic quadratic problem. Indeed it can, and my solution is<br /><br />z^2 <= 2ty<br />2t <= a<br />a^2 <= 2uv<br />z = 2u<br />x <= z<br />-x <= z<br />0 <= y,t,u,v<br /> Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0tag:blogger.com,1999:blog-1952964643237411579.post-11633315159276891852010-03-19T09:50:00.001+01:002010-03-19T11:17:10.020+01:00Effective cuts for mixed integer goal programsAssume you have the problem<br /><br />min c'x + ||Fx-f||_1<br />st. A x = b<br /> x binary<br /><br />This is a sort of goal programming model I have seen frequently. Now the typical LP formulation is<br /><br /><br />min c'x + e'(s+t)<br />st. F x - f = s - t<br /> A x = b<br /> s,t >= 0 <br /> x binary<br /><br />Note this model includes continuous variables too. At least for an application I have seen:<br /><br /><ol><li>It is easy to find a feasible solution.</li><li>There is a huge gap between the LP relaxation and the feasible solution.</li><li>It is hard to close the gap: a huge number of nodes is required and the standard cuts work poorly.</li></ol>This leads to the question: what are good cuts for this type of problem?Erling D. Andersenhttp://www.blogger.com/profile/07306894197500659436noreply@blogger.com0
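As a footnote to the 1-norm reformulation in the goal programming post above, a small sketch of mine: splitting the residual Fx - f into s - t with s, t >= 0 puts the positive part of each component in s and the negative part in t at an LP optimum, so the linear term e'(s + t) really does equal ||Fx - f||_1.

```python
def l1_via_split(residual):
    # Split each residual component r_i = s_i - t_i with s_i, t_i >= 0.
    # At an LP optimum s_i * t_i = 0, so s_i = max(r_i, 0), t_i = max(-r_i, 0).
    s = [max(r, 0.0) for r in residual]
    t = [max(-r, 0.0) for r in residual]
    # The linear objective term e'(s + t) equals the 1-norm of the residual.
    return sum(si + ti for si, ti in zip(s, t))

r = [1.5, -2.0, 0.0, 3.0]
assert abs(l1_via_split(r) - sum(abs(x) for x in r)) < 1e-12
```

The split is exact only because the objective pushes s and t down; if some other constraint kept both positive on the same component, e'(s + t) would overestimate the 1-norm.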